From davidk at ci.uchicago.edu Thu Jan 2 11:55:52 2014
From: davidk at ci.uchicago.edu (David Kelly)
Date: Thu, 2 Jan 2014 11:55:52 -0600
Subject: [Swift-devel] Are XML keys case sensitive in sites.xml?
Message-ID:

Hello,

Are XML keys case sensitive in sites.xml?

I'm testing with the following config:

16
sandyb
10000
1
1
1

/scratch/midway/davidkelly999/work

When I use:
10000

Things work as expected and I get 16 active tasks.

When I use a lowercase "initialscore":
10000

I only get 7 active tasks instead of 16.

Maybe it is only keys in the karajan namespace that are case sensitive? jobspernode and jobsPerNode both seem to work, for example.

Thanks,
David

From hategan at mcs.anl.gov Thu Jan 2 13:47:51 2014
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Thu, 02 Jan 2014 11:47:51 -0800
Subject: [Swift-devel] Are XML keys case sensitive in sites.xml?
In-Reply-To:
References:
Message-ID: <1388692071.9854.0.camel@echo>

In general, yes, they are. Some sub-systems may however choose to handle them in a case insensitive way.

Mihael

On Thu, 2014-01-02 at 11:55 -0600, David Kelly wrote:
> Hello,
>
> Are XML keys case sensitive in sites.xml?
>
> I'm testing with the following config:
>
> 16
> sandyb
> 10000
> 1
> 1
> 1
>
> /scratch/midway/davidkelly999/work
>
> When I use:
> 10000
>
> Things work as expected and I get 16 active tasks.
>
> When I use a lowercase "initialscore":
> 10000
>
> I only get 7 active tasks instead of 16.
>
> Maybe it is only keys in the karajan namespace that are case sensitive?
> jobspernode and jobsPerNode both seem to work, for example.
>
> Thanks,
> David

From wilde at mcs.anl.gov Tue Jan 7 18:40:39 2014
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Tue, 7 Jan 2014 18:40:39 -0600 (CST)
Subject: [Swift-devel] Please look at hung run on Beagle
In-Reply-To: <2035640493.47319321.1389141296044.JavaMail.root@mcs.anl.gov>
Message-ID: <1319688631.47319504.1389141639575.JavaMail.root@mcs.anl.gov>

Hi Mihael and/or David,

Can you look at this run on beagle and provide a diagnosis?

-rw-r--r-- 1 mattshax ci-users 33307656 Jan 7 18:20 /lustre/beagle/mattshax/swifthome.20140107/sweep8-20140107-1812-obopd7ad.log

It's an EnergyPlus run by Matthew of SOM.

The progress ticker shows:

login1$ grep -i progresstick *ad.log
2014-01-07 18:12:26,570-0600 INFO RuntimeStats$ProgressTicker
2014-01-07 18:12:33,600-0600 INFO RuntimeStats$ProgressTicker Initializing:3
2014-01-07 18:12:34,605-0600 INFO RuntimeStats$ProgressTicker Initializing:7297 Selecting site:1803
2014-01-07 18:12:38,556-0600 INFO RuntimeStats$ProgressTicker Selecting site:9097 Submitting:3
2014-01-07 18:12:43,585-0600 INFO RuntimeStats$ProgressTicker Submitting:9099 Submitted:1
2014-01-07 18:12:44,580-0600 INFO RuntimeStats$ProgressTicker Submitting:7635 Submitted:1465
2014-01-07 18:12:45,580-0600 INFO RuntimeStats$ProgressTicker Submitting:1014 Submitted:8086
2014-01-07 18:12:56,570-0600 INFO RuntimeStats$ProgressTicker Submitted:9100
2014-01-07 18:13:26,571-0600 INFO RuntimeStats$ProgressTicker Submitted:9100
...
2014-01-07 18:19:26,573-0600 INFO RuntimeStats$ProgressTicker Submitted:9100
2014-01-07 18:19:56,573-0600 INFO RuntimeStats$ProgressTicker Submitted:9100
(at which time it was killed)

Beagle had abundant (300+) free nodes, and many PBS jobs started for the run. It seems, though, that workers started timing out around 18:14. I can't tell whether any workers were getting any work started.

This has happened several times (on 0.94.1). I will try to get this app moved to 0.95RC as soon as possible, but for now, Matthew is making good progress with the scripts as-is (modulo these timeout situations).

He thought, from earlier debugging, that the timeouts were due to actual app failures (e.g. caused by bad app config files), but I can't see how that could be happening.

Any assessment or diagnosis of this situation would be appreciated.

Thanks,

- Mike

--
Michael Wilde
Mathematics and Computer Science | Computation Institute
Argonne National Laboratory | The University of Chicago

From hategan at mcs.anl.gov Tue Jan 7 18:46:27 2014
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Tue, 07 Jan 2014 16:46:27 -0800
Subject: [Swift-devel] Please look at hung run on Beagle
In-Reply-To: <1319688631.47319504.1389141639575.JavaMail.root@mcs.anl.gov>
References: <1319688631.47319504.1389141639575.JavaMail.root@mcs.anl.gov>
Message-ID: <1389141987.6131.1.camel@echo>

Lorenzo and his group were running some lustre intensive jobs, so lustre was rather unresponsive. If this happened in the past day or two, I would try again.

If not, then a jstack on the java process (both swift and coaster service if separate) might shed some light on the issue.

Mihael

On Tue, 2014-01-07 at 18:40 -0600, Michael Wilde wrote:
> Hi Mihael and/or David,
>
> Can you look at this run on beagle and provide a diagnosis?
>
> -rw-r--r-- 1 mattshax ci-users 33307656 Jan 7 18:20
> /lustre/beagle/mattshax/swifthome.20140107/sweep8-20140107-1812-obopd7ad.log
>
> It's an EnergyPlus run by Matthew of SOM.
>
> Thanks,
>
> - Mike
>

From davidk at ci.uchicago.edu Tue Jan 7 19:11:00 2014
From: davidk at ci.uchicago.edu (David Kelly)
Date: Tue, 7 Jan 2014 19:11:00 -0600
Subject: [Swift-devel] Please look at hung run on Beagle
In-Reply-To: <1389141987.6131.1.camel@echo>
References: <1319688631.47319504.1389141639575.JavaMail.root@mcs.anl.gov> <1389141987.6131.1.camel@echo>
Message-ID:

I think there might be an issue with the sites.xml formatting there too. It looks like sites.xml has multiple and tags

On Tue, Jan 7, 2014 at 6:46 PM, Mihael Hategan wrote:
> Lorenzo and his group were running some lustre intensive jobs, so lustre
> was rather unresponsive. If this happened in the past day or two, I
> would try again.
>
> If not, then a jstack on the java process (both swift and coaster
> service if separate) might shed some light on the issue.
>
> Mihael
>
> On Tue, 2014-01-07 at 18:40 -0600, Michael Wilde wrote:
> > Hi Mihael and/or David,
> >
> > Can you look at this run on beagle and provide a diagnosis?
> >
> > -rw-r--r-- 1 mattshax ci-users 33307656 Jan 7 18:20
> > /lustre/beagle/mattshax/swifthome.20140107/sweep8-20140107-1812-obopd7ad.log
> >
> > It's an EnergyPlus run by Matthew of SOM.
> >
> > Thanks,
> >
> > - Mike
> >

From wilde at mcs.anl.gov Tue Jan 7 21:11:38 2014
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Tue, 7 Jan 2014 21:11:38 -0600 (CST)
Subject: [Swift-devel] Please look at hung run on Beagle
In-Reply-To: <1389141987.6131.1.camel@echo>
Message-ID: <257373770.47327551.1389150698978.JavaMail.root@mcs.anl.gov>

> Lorenzo and his group were running some lustre intensive jobs, so
> lustre was rather unresponsive.

Indeed, lustre responsiveness has been an issue for the SOM jobs, mainly stretching their walltimes unexpectedly long. But we're working around that.

> If this happened in the past day or two, I
> would try again.
>
> If not, then a jstack on the java process (both swift and coaster
> service if separate) might shed some light on the issue.

Matthew just found the problem. He'd been trying to fit tasks of maxwalltime 59:00 into 1-hour maxtime slots, which after the 1-minute deduction, apparently no longer fit.

This is an old problem that I think we have a ticket on: it results in essentially a livelock. The coaster workers repeatedly time out because they get no work, but the scheduler can never give them work because the only available tasks don't fit in the slots.

Did we discuss how we should remedy this? It seems a message stating that no sufficiently large slots exist should be generated, which would immediately tell the user why the run is not progressing.
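
A rough sketch of the kind of up-front check such a message would need is below. It is not the coaster scheduler's actual logic: the function names, the 60-second reserve, and the strict comparison are all assumptions, chosen only to reproduce the behavior described above, where a 59:00 task does not fit a 1-hour slot.

```python
RESERVE_S = 60  # assumption: the 1-minute deduction mentioned above


def parse_walltime(text):
    """Convert 'HH:MM:SS' or 'MM:SS' to seconds."""
    seconds = 0
    for part in text.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def usable_slot_time(slot_maxtime_s, reserve_s=RESERVE_S):
    """Time left in a slot for tasks after the reserve is deducted."""
    return slot_maxtime_s - reserve_s


def oversized_tasks(task_walltimes_s, slot_maxtime_s, reserve_s=RESERVE_S):
    """Return the walltimes that can never be placed in a slot.

    Assumption: a task must be strictly shorter than the usable slot time.
    """
    limit = usable_slot_time(slot_maxtime_s, reserve_s)
    return [w for w in task_walltimes_s if w >= limit]


def slot_warning(task_walltimes_s, slot_maxtime_s, reserve_s=RESERVE_S):
    """Build a user-facing message, or return None if every task fits."""
    stuck = oversized_tasks(task_walltimes_s, slot_maxtime_s, reserve_s)
    if not stuck:
        return None
    limit = usable_slot_time(slot_maxtime_s, reserve_s)
    return (
        "No sufficiently large slots exist for %d of %d tasks: "
        "largest maxwalltime is %ds, usable slot time is %ds"
        % (len(stuck), len(task_walltimes_s), max(stuck), limit)
    )


if __name__ == "__main__":
    # Three 59-minute tasks against 1-hour slots, as in the run above.
    tasks = [parse_walltime("59:00")] * 3
    print(slot_warning(tasks, parse_walltime("01:00:00")))
```

Under these assumptions, raising maxtime or lowering maxwalltime until the task is shorter than the usable slot time makes the warning go away.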
- Mike

From marialemos72 at gmail.com Sat Jan 11 12:31:03 2014
From: marialemos72 at gmail.com (ML)
Date: Sat, 11 Jan 2014 18:31:03 +0000
Subject: [Swift-devel] CISTI'2014: List of Workshops
Message-ID: <20140111183108.147977CC089@mailrelay.anl.gov>

********************************** WORKSHOPS *******************************************
CISTI'2014 - 9th Iberian Conference on Information Systems and Technologies
Barcelona, Spain, June 18 - 21, 2014
http://www.aisti.eu/cisti2014/index.php/en/workshops
****************************************************************************************

List of Workshops to be held in the CISTI'2014 context:

- ARWC 2014 - 1st Workshop on Augmented Reality and Wearable Computing
- ASDACS 2014 - 1st Workshop on Applied Statistics and Data Analysis using Computer Science
- IoT 2014 - 1st Workshop on Internet of Things
- SGaMePlay 2014 - 4th Iberian Workshop on Serious Games and Meaningful Play
- TICAMES 2014 - 2nd Workshop on Information and Communication Technology in Higher Education: Learning Mathematics
- WICTA 2014 - 1st Workshop on ICT for Audit
- WISA 2014 - 6th Workshop on Intelligent Systems and Applications
- WLA 2014 - 1st Workshop on Learning Analytics
- WNIS 2014 - 1st Workshop on Networks, Information and Society

Detailed information about these workshops is available at http://www.aisti.eu/cisti2014/index.php/en/workshops

Best regards,

CISTI'2014 Team
http://www.aisti.eu/cisti2014/index.php/en

From hategan at mcs.anl.gov Fri Jan 17 16:32:21 2014
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Fri, 17 Jan 2014 14:32:21 -0800
Subject: [Swift-devel] provider staging stuff
Message-ID: <1389997941.11994.17.camel@echo>

This email is to mention a few ideas regarding the issue of site-selectable provider staging.

Currently, provider staging can only be enabled globally. Somewhere in swift.k there is a conditional import of either swift-int.k or swift-int-staging.k (or the more recent swift-int-wrapperstaging.k).
The nice thing about the global switch is that it is evaluated at karajan-compile time, and the import and compilation of the relevant swift-int*.k is done before anything is executed.

I want to replace that by:

1. eliminating all swift-int*.k files
2. moving that logic into Java code

Swift needs staging, that's clear. However, it doesn't care how it's done as long as the files are there when the job is started and they are moved back when the job is done.

Some providers support staging. There are two levels of support for this. One is the basic level that GRAM does, and the other is the improved version that coasters and the local provider do, which support conditional staging based on job status and lenient staging (i.e. stage a file only if it's there).

Of course, some providers don't support staging at all. However, this is an implementation level issue. Out of a non-staging provider and a file op provider, one can build a combined provider that does support file staging. This is a pretty old idea, before swift, that never got implemented because... I don't remember.

I think that doing this would be a good move since it would probably simplify some swift code (I always thought swift-int.k was convenient for prototyping, but ugly), would probably reduce memory consumption, would make some of the logic usable outside of swift/k and, of course, enable finer control over how staging is done.

Mihael

From davidkelly at uchicago.edu Mon Jan 20 11:06:45 2014
From: davidkelly at uchicago.edu (David Kelly)
Date: Mon, 20 Jan 2014 11:06:45 -0600
Subject: [Swift-devel] Adding versions in bugzilla
Message-ID:

Hello,

Does anyone know how to add swift releases to the list of "versions" in bugzilla? I'd like to add a 0.95 version there, but I don't see any options to do that (and not sure if I have the permissions to do that or not).

Thanks,
David

From iraicu at cs.iit.edu Tue Jan 21 04:08:31 2014
From: iraicu at cs.iit.edu (Ioan Raicu)
Date: Tue, 21 Jan 2014 04:08:31 -0600
Subject: [Swift-devel] CFP: 5th Workshop on Scientific Cloud Computing (ScienceCloud) @ ACM HPDC 2014
Message-ID: <52DE471F.2010005@cs.iit.edu>

Call for Papers: 5th Workshop on Scientific Cloud Computing (ScienceCloud)
June 23/24, 2014. Vancouver, Canada
(http://datasys.cs.iit.edu/events/ScienceCloud2014/)
Co-Located with HPDC 2014

-------------------------------------------------------------------------------
IMPORTANT DATES

Paper Submission: March 1, 2014
Acceptance Notification: April 4, 2014
Final Papers: April 11, 2014
Workshop: June 23/24, 2014

-------------------------------------------------------------------------------
OVERVIEW

Computational and Data-Driven Sciences have become the third and fourth pillar of scientific discovery in addition to experimental and theoretical sciences. Scientific Computing has already begun to change how science is done, enabling scientific breakthroughs through new kinds of experiments that would have been impossible only a decade ago. Today's "Big Data" science is generating datasets that are increasing exponentially in both complexity and volume, making their analysis, archival, and sharing one of the grand challenges of the 21st century. The support for data intensive computing is critical to advance modern science as storage systems have exposed a widening gap between their capacity and their bandwidth by more than 10-fold over the last decade. There is a growing need for advanced techniques to manipulate, visualize and interpret large datasets. Scientific Computing is the key to solving "grand challenges"
in many domains and providing breakthroughs in new knowledge, and it comes in many shapes and forms: high-performance computing (HPC) which is heavily focused on compute-intensive applications; high-throughput computing (HTC) which focuses on using many computing resources over long periods of time to accomplish its computational tasks; many-task computing (MTC) which aims to bridge the gap between HPC and HTC by focusing on using many resources over short periods of time; and data-intensive computing which is heavily focused on data distribution, data-parallel execution, and harnessing data locality by scheduling of computations close to the data.

The 5th workshop on Scientific Cloud Computing (ScienceCloud) will provide the scientific community a dedicated forum for discussing new research, development, and deployment efforts in running these kinds of scientific computing workloads on Cloud Computing infrastructures. The ScienceCloud workshop will focus on the use of cloud-based technologies to meet new compute-intensive and data-intensive scientific challenges that are not well served by the current supercomputers, grids and HPC clusters.

The workshop will aim to address questions such as:

What architectural changes to the current cloud frameworks (hardware, operating systems, networking and/or programming models) are needed to support science?

Dynamic information derived from remote instruments and coupled simulation, and sensor ensembles that stream data for real-time analysis are important emerging techniques in scientific and cyber-physical engineering systems. How can cloud technologies enable and adapt to these new scientific approaches dealing with dynamism?

How are scientists using clouds? Are there scientific HPC/HTC/MTC workloads that are suitable candidates to take advantage of emerging cloud computing resources with high efficiency?

Commercial public clouds provide easy access to cloud infrastructure for scientists.
What are the gaps in commercial cloud offerings and how can they be adapted for running existing and novel eScience applications?

What benefits exist by adopting the cloud model, over clusters, grids, or supercomputers?

What factors are limiting cloud use or would make clouds more usable/efficient?

This workshop encourages interaction and cross-pollination between those developing applications, algorithms, software, hardware and networking, emphasizing scientific computing for such cloud platforms. We believe the workshop will be an excellent place to help the community define the current state, determine future goals, and define architectures and services for future science clouds.

-------------------------------------------------------------------------------
WORKSHOP SCOPE

We invite the submission of original work that is related to the topics below. The papers can be either short (4 pages) position papers, or long (8 pages) research papers. Topics of interest include (in the context of Cloud Computing):

Scientific application case studies on Cloud infrastructure
Performance evaluation of Cloud environments and technologies
Fault tolerance and reliability in cloud systems
Data-intensive workloads and tools on Clouds
Use of programming models such as Map-Reduce and its implementations
Storage cloud architectures
I/O and Data management in the Cloud
Workflow and resource management in the Cloud
Use of cloud technologies (e.g., NoSQL databases) for scientific applications
Data streaming and dynamic applications on Clouds
Dynamic resource provisioning
Many-Task Computing in the Cloud
Application of cloud concepts in HPC environments or vice versa
High performance parallel file systems in virtual environments
Virtualized high performance I/O network interconnects
Virtualization
Distributed Operating Systems
Many-core computing and accelerators (e.g.
GPUs, MIC) in the Cloud
Cloud security

-------------------------------------------------------------------------------
SUBMISSION INSTRUCTIONS

Authors are invited to submit papers with unpublished, original work of not more than 8 pages of double column text using single spaced 10 point size on 8.5 x 11 inch pages (including all text, figures, and references), as per ACM 8.5 x 11 manuscript guidelines (document templates can be found at http://www.acm.org/sigs/publications/proceedings-templates). A 250 word abstract and the final paper in PDF format must be submitted online at https://cmt.research.microsoft.com/SCIENCECLOUD2014/ before the deadline. Papers will be peer-reviewed, and accepted papers will be published in the workshop proceedings as part of the ACM digital library. Notifications of the paper decisions will be sent out by April 4th, 2014. Submission implies the willingness of at least one of the authors to register and present the paper.

-------------------------------------------------------------------------------
JOURNAL SPECIAL ISSUE: IEEE Transactions on Cloud Computing

Authors of selected excellent work will be invited to submit extended versions of their workshop papers to the Special Issue on Scientific Cloud Computing in the IEEE Transactions on Cloud Computing (http://datasys.cs.iit.edu/events/ScienceCloud2014-TCC/).

-------------------------------------------------------------------------------
GENERAL CHAIRS

- Ioan Raicu, Illinois Institute of Technology & Argonne National Laboratory, USA
- Kate Keahey, University of Chicago & Argonne National Laboratory, USA

PROGRAM COMMITTEE CHAIRS

- Kyle Chard, University of Chicago, USA
- Bogdan Nicolae, IBM Research, Ireland

STEERING COMMITTEE

- Ian Foster, University of Chicago & Argonne National Laboratory, USA
- Pete Beckman, University of Chicago & Argonne National Laboratory, USA
- Carole Goble, University of Manchester, UK
- Dennis Gannon, Microsoft Research, USA
- Robert Grossman, University of Chicago, USA
- Ed Lazowska, University of Washington & Computing Community Consortium, USA
- David O'Hallaron, Carnegie Mellon University & Intel Labs, USA
- Jack Dongarra, University of Tennessee, USA
- Geoffrey Fox, Indiana University, USA
- Yogesh Simmhan, University of Southern California, USA
- Gabriel Antoniu, INRIA, France
- Lavanya Ramakrishnan, Lawrence Berkeley National Lab, USA

PROGRAM COMMITTEE

- Samer Al-Kiswany, University of British Columbia
- Roger Barga, Microsoft Research
- Simon Caton, Karlsruhe Institute of Technology
- Ake Edlund, Royal Institute of Technology
- Chathura Herath, Indiana University
- Neil Chue Hong, University of Edinburgh
- Shantenu Jha, Rutgers
- Carl Kesselman, University of Southern California
- Thilo Kielmann, Vrije University
- Shiyong Lu, Wayne State University
- Wei Lu, Microsoft Research
- David Martin, Argonne National Laboratory
- Gabriel Mateescu, EURAC Research, Italy
- Paolo Missier, University of Manchester
- Ruben Montero, Universidad Complutense de Madrid
- Reagan Moore, University of North Carolina
- Pasquale Pagano, ISTI
- Beth Plale, Indiana University
- Omer Rana, Cardiff University
- Matei Ripeanu, University of British Columbia
- Josh Simons, VMWare
- Douglas Thain, University of Notre Dame
- Johan Tordsson, Umeå University
- Zhifeng Yun, Louisiana State University
- Yong Zhao,
University of Electronic Science and Technology of China

--
=================================================================
Ioan Raicu, Ph.D.
Assistant Professor, Illinois Institute of Technology (IIT)
Guest Research Faculty, Argonne National Laboratory (ANL)
=================================================================
Data-Intensive Distributed Systems Laboratory, CS/IIT
Distributed Systems Laboratory, MCS/ANL
=================================================================
Editor: IEEE TCC, Springer Cluster, Springer JoCCASA
Chair: IEEE/ACM MTAGS, ACM ScienceCloud
=================================================================
Cell: 1-847-722-0876
Office: 1-312-567-5704
Email: iraicu at cs.iit.edu
Web: http://www.cs.iit.edu/~iraicu/
Web: http://datasys.cs.iit.edu/
LinkedIn: http://www.linkedin.com/in/ioanraicu
Google: http://scholar.google.com/citations?user=jE73HYAAAAAJ
=================================================================
=================================================================

From wilde at mcs.anl.gov Tue Jan 21 15:17:21 2014
From: wilde at mcs.anl.gov (Wilde, Michael J.)
Date: Tue, 21 Jan 2014 21:17:21 +0000
Subject: [Swift-devel] Action items for Swift client memory issues
Message-ID: <85C85E44DD880E498CEA5A501B27954BEA02A9@DITKA.anl.gov>

Here's what I understand we're doing about these issues:

- Mihael to enable/add memory logging
- David to set up config for running swift command on worker nodes
- Yadu to add memory usage tests and tests for the above config
  (Tests based initially on pattern posted in ticket 23674)

To do:

- decide what if anything is needed about the retry/recover aspects of ticket 23674

David has already posted a temporary config for swift-command-on-worker to the ticket, for the users to try.

- Mike

From ketan at mcs.anl.gov Wed Jan 22 14:54:00 2014
From: ketan at mcs.anl.gov (Ketan Maheshwari)
Date: Wed, 22 Jan 2014 14:54:00 -0600
Subject: [Swift-devel] staging time to sites from log
Message-ID:

Hi,

Is it possible to find staging in and out times per job with Swift logs with provider staging?

Speaking with David about this, he mentioned it is not currently possible with provider staging but there may be some options in log4j that could achieve it.

Does anyone know about any such options or any indirect methods?

Thanks,
Ketan

From davidkelly at uchicago.edu Wed Jan 22 15:09:36 2014
From: davidkelly at uchicago.edu (David Kelly)
Date: Wed, 22 Jan 2014 15:09:36 -0600
Subject: [Swift-devel] staging time to sites from log
In-Reply-To:
References:
Message-ID:

I was just looking more closely through a log I just created that used provider staging. I think there is some info about timings there that looks like this:

2014-01-22 14:11:50,249-0600 INFO PutFileHandler Handler(tag: 7644, PUT) source: ./hostname-run000/jobs/a/hostname-azak6dll/hostfile.out
2014-01-22 14:11:50,249-0600 INFO PutFileHandler Handler(tag: 7644, PUT) destination: proxy://u2086f356-143bb93f2a2--7fff-u2086f356-143bb93f2a2--8000S//home/davidk/test/./hostfile.out
2014-01-22 14:11:50,251-0600 INFO PutFileCommand Sending Command(tag: 10, PUT) (t) on cpipe://1
2014-01-22 14:11:50,251-0600 INFO PutFileCommand Command(tag: 10, PUT) (t) sending data
2014-01-22 14:11:50,251-0600 INFO PutFileHandler Handler(tag: 7644, PUT) -> 10
2014-01-22 14:11:50,251-0600 INFO PutFileHandler Handler(tag: 10, PUT) source: ./hostname-run000/jobs/a/hostname-azak6dll/hostfile.out
2014-01-22 14:11:50,251-0600 INFO PutFileHandler Handler(tag: 10, PUT) destination: file://localhost//home/davidk/test/hostfile.out
2014-01-22 14:11:50,252-0600 INFO PutFileHandler Handler(tag: 10, PUT) Transfer done
2014-01-22 14:11:50,252-0600 INFO RequestHandler
Handler(tag: 10, PUT) unregistering (send)
2014-01-22 14:11:50,252-0600 INFO PutFileHandler Handler(tag: 7644, PUT) Transfer done
2014-01-22 14:11:50,253-0600 INFO RequestHandler Handler(tag: 7644, PUT) unregistering (send)

Maybe you could keep track of the tag number and record the difference from "sending data" to "Transfer done" to get an estimate. I'm not sure how to correlate the file transfers to a jobid though.

On Wed, Jan 22, 2014 at 2:54 PM, Ketan Maheshwari wrote:
> Hi,
>
> Is it possible to find staging in and out times per job with Swift logs
> with provider staging?
>
> Speaking with David about this, he mentioned it is not currently possible
> with provider staging but there may be some options in log4j that could
> achieve it.
>
> Does anyone know about any such options or any indirect methods?
>
> Thanks,
> Ketan

From ketan at mcs.anl.gov Wed Jan 22 15:39:36 2014
From: ketan at mcs.anl.gov (Ketan Maheshwari)
Date: Wed, 22 Jan 2014 15:39:36 -0600
Subject: [Swift-devel] staging time to sites from log
In-Reply-To:
References:
Message-ID:

In the log I have from Swift trunk swift-r7228 cog-r3817, I do not see 'sending data' messages.

On Wed, Jan 22, 2014 at 3:09 PM, David Kelly wrote:
> I was just looking more closely through a log I just created that used
> provider staging.
>> >> Thanks, >> Ketan >> >> _______________________________________________ >> Swift-devel mailing list >> Swift-devel at ci.uchicago.edu >> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel >> >> > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From davidkelly at uchicago.edu Wed Jan 22 15:51:18 2014 From: davidkelly at uchicago.edu (David Kelly) Date: Wed, 22 Jan 2014 15:51:18 -0600 Subject: [Swift-devel] staging time to sites from log In-Reply-To: References: Message-ID: My suspicion is that since you're using ssh-cl, the relevant lines related to staging in and staging out are getting buried in ~/.globus/coasters/coasters.log on the SSH site. In the test I did was running with local:condor staging in and out directly to OSG nodes. On Wed, Jan 22, 2014 at 3:39 PM, Ketan Maheshwari wrote: > In the log I have from Swift trunk swift-r7228 cog-r3817, I do not see > 'sending data' messages. > > > On Wed, Jan 22, 2014 at 3:09 PM, David Kelly wrote: > >> I was just looking more closely through a log I just created that used >> provider staging. 
I think there is some info about timings there that look >> like this: >> >> 2014-01-22 14:11:50,249-0600 INFO PutFileHandler Handler(tag: 7644, PUT) >> source: ./hostname-run000/jobs/a/hostname-azak6dll/hostfile.out >> 2014-01-22 14:11:50,249-0600 INFO PutFileHandler Handler(tag: 7644, PUT) >> destination: >> proxy://u2086f356-143bb93f2a2--7fff-u2086f356-143bb93f2a2--8000S//home/davidk/test/./hostfile.out >> 2014-01-22 14:11:50,251-0600 INFO PutFileCommand Sending Command(tag: >> 10, PUT) (t) on cpipe://1 >> 2014-01-22 14:11:50,251-0600 INFO PutFileCommand Command(tag: 10, PUT) >> (t) sending data >> 2014-01-22 14:11:50,251-0600 INFO PutFileHandler Handler(tag: 7644, PUT) >> -> 10 >> 2014-01-22 14:11:50,251-0600 INFO PutFileHandler Handler(tag: 10, PUT) >> source: ./hostname-run000/jobs/a/hostname-azak6dll/hostfile.out >> 2014-01-22 14:11:50,251-0600 INFO PutFileHandler Handler(tag: 10, PUT) >> destination: file://localhost//home/davidk/test/hostfile.out >> 2014-01-22 14:11:50,252-0600 INFO PutFileHandler Handler(tag: 10, PUT) >> Transfer done >> 2014-01-22 14:11:50,252-0600 INFO RequestHandler Handler(tag: 10, PUT) >> unregistering (send) >> 2014-01-22 14:11:50,252-0600 INFO PutFileHandler Handler(tag: 7644, PUT) >> Transfer done >> 2014-01-22 14:11:50,253-0600 INFO RequestHandler Handler(tag: 7644, PUT) >> unregistering (send) >> >> Maybe you could keep track of the tag number and record the difference >> from "sending data" to "Transfer done" to get an estimate. I'm not sure how >> to correlate the file transfers to a jobid though. >> >> >> On Wed, Jan 22, 2014 at 2:54 PM, Ketan Maheshwari wrote: >> >>> Hi, >>> >>> Is it possible to find staging in and out times per job with Swift logs >>> with provider staging? >>> >>> Speaking with David about this, he mentioned it is not currently >>> possible with provider staging but there may be some options in log4j that >>> could achieve it. >>> >>> Does anyone know about any such options or any indirect methods? 
>>> >>> Thanks, >>> Ketan >>> >>> _______________________________________________ >>> Swift-devel mailing list >>> Swift-devel at ci.uchicago.edu >>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel >>> >>> >> >> _______________________________________________ >> Swift-devel mailing list >> Swift-devel at ci.uchicago.edu >> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wilde at mcs.anl.gov Fri Jan 24 14:11:38 2014 From: wilde at mcs.anl.gov (Wilde, Michael J.) Date: Fri, 24 Jan 2014 20:11:38 +0000 Subject: [Swift-devel] FW: Provenance Presentation Message-ID: <85C85E44DD880E498CEA5A501B27954BEA14D1@DITKA.anl.gov> - Mike -- Michael Wilde Mathematics and Computer Science Computation Institute Argonne National Laboratory The University of Chicago ________________________________________ From: Luiz Gadelha [lgadelha at lncc.br] Sent: Friday, January 24, 2014 1:41 PM To: Wilde, Michael J. Subject: Presentation -- Luiz Gadelha http://www.lncc.br/~lgadelha -------------- next part -------------- A non-text attachment was scrubbed... Name: swift-provenance.pdf Type: application/pdf Size: 122214 bytes Desc: swift-provenance.pdf URL: From wilde at mcs.anl.gov Tue Jan 28 15:29:19 2014 From: wilde at mcs.anl.gov (Wilde, Michael J.) Date: Tue, 28 Jan 2014 21:29:19 +0000 Subject: [Swift-devel] Progress on Swift RAM usage problem? In-Reply-To: References: <85C85E44DD880E498CEA5A501B27954BEA237C@DITKA.anl.gov>, Message-ID: <85C85E44DD880E498CEA5A501B27954BEA2486@DITKA.anl.gov> From: David Kelly [davidkelly at uchicago.edu] Sent: Tuesday, January 28, 2014 2:47 PM ... I don't have too many updates on Sheri's problem. I was able to run the older standalone example I had on Geyser and did not see any issues with excessive amounts of resident memory being used. ... I think the failure was exceeding the Java heap size, not an RSS problem, right? 
I think we might be better off shifting the way we approach this problem. It's difficult to run these apps, and to run them in the same way the users do. There's also a long delay getting responses. I think we'd be better off focusing on adding comprehensive memory tests to the test suite, measuring, plotting, and then documenting solutions/strategies in the user guide. It will take some time, but I think it's the best approach since everything would be under our own control, and it would provide solutions for all users.

That sounds good, while we are waiting for debugging info from users. But we should still strive to reproduce problems that users are encountering, and to give them code updates with additional debugging hooks or possible remedies to test.

- Mike

On Tue, Jan 28, 2014 at 12:36 PM, Wilde, Michael J. wrote:

Yadu, David, can you send updates on this to Swift devel, and let's talk this afternoon at 3PM to discuss?

Thanks,

- Mike

From lgadelha at lncc.br Tue Jan 28 17:31:02 2014
From: lgadelha at lncc.br (Luiz Gadelha)
Date: Tue, 28 Jan 2014 17:31:02 -0600
Subject: [Swift-devel] FW: Provenance Presentation
In-Reply-To: <85C85E44DD880E498CEA5A501B27954BEA14D1@DITKA.anl.gov>
References: <85C85E44DD880E498CEA5A501B27954BEA14D1@DITKA.anl.gov>
Message-ID: <52E83DB6.9080702@lncc.br>

Dear all,

I uploaded a tutorial in asciidoc to SVN. An HTML version can be accessed at:

http://www.lncc.br/~lgadelha/swift-provenance/walkthrough.html

It has a basic overview of the database and some example queries. A diagram with the complete database schema can be accessed at:

http://www.lncc.br/~lgadelha/swift-provenance/provdb.svg

Regards,
Luiz

On 01/24/2014 02:11 PM, Wilde, Michael J.
wrote:
>
> - Mike
> --
> Michael Wilde
> Mathematics and Computer Science Computation Institute
> Argonne National Laboratory The University of Chicago
>
> ________________________________________
> From: Luiz Gadelha [lgadelha at lncc.br]
> Sent: Friday, January 24, 2014 1:41 PM
> To: Wilde, Michael J.
> Subject: Presentation
>
> --
> Luiz Gadelha
> http://www.lncc.br/~lgadelha

--
Luiz Gadelha
http://www.lncc.br/~lgadelha

From wilde at mcs.anl.gov Thu Jan 30 19:02:37 2014
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Thu, 30 Jan 2014 19:02:37 -0600
Subject: [Swift-devel] Tracking swift heap overflow culprit
In-Reply-To: <1391124161.10180.7.camel@echo>
References: <1391124161.10180.7.camel@echo>
Message-ID:

(Moving this to my MCS email addr and to swift-devel)

Mihael, what we are trying to do here is not (initially) change anything in Swift memory usage.

We just want to understand the costs in memory of normal Swift operations, e.g., call a function with N args and M returns; map a file; create an array of 1000 1MB strings; etc.

Then, for any program execution, we want to be able to trace - at some useful level of granularity - the consumption of Java memory caused by these normal Swift activities.

For example, if a user writes a function that is going to create - and hold - 10MB of memory, due to its local variables, then having 10,000 of those active at once would consume - and hold - 100GB of RAM.

My suspicion is that this is exactly what e.g. Sheri's code is doing. And I further suspect that once we identify what procedures are using most of the memory in what way, then we can tune the user code to use much less memory.

We can - by experiment - develop a cost table for common Swift operations. But Sheri's scripts are the most complex Swift scripts that exist.
Each has a few thousand lines of Swift code. Without some automated memory usage stats that correlate memory consumption to source code, it will be hard to find the culprits.

So the question is not how to make Swift use less memory (although that's always desirable), but rather first just to create the tools to know how much a given program run uses for what.

Can you suggest affordable ways to get this info?

Thanks,

- Mike

On Thu, Jan 30, 2014 at 5:22 PM, Mihael Hategan wrote:

> Weeeellll,
>
> Stuff eats memory. Some stuff can be made to take less memory, some
> stuff can be made to take less memory at the expense of performance, and
> some stuff just needs to be there. And then once in a while there's
> stuff that doesn't need to be there at all.
>
> I somewhat routinely have to deal with the first two. It's not an easy
> problem, because it only becomes obvious what eats memory at large
> scales when you actually have a large scale run, and that's difficult to
> analyze both because of technicalities (such as it takes lots of ram to
> analyze things) and because it's hard to distinguish signal from noise
> when there's a lot of stuff. But, again, that's something I generally
> keep in mind with every commit.
>
> It is, however, mostly attributable to design choices. We sacrificed
> scalability for convenience initially, because juggling with concurrency
> was difficult, and the scales we were looking at were generally pretty
> small. Things change though.
>
> There's the last possibility also. And that is that we have a situation
> that doesn't normally occur and shouldn't occur that is probably a bug
> and that happened this once. If that's the case we should find and fix
> that. So, is that the case?
> > Mihael > > > On Thu, 2014-01-30 at 17:06 -0600, Yadu Nand wrote: > > Hi Mike, Mihael > > > > I talked to Mihael about the RAM issue and he said that having heap > > dumps can help but he wasn't sure if that alone is sufficient to pin > > point what is using memory excessively. > > > > Here's what I did : > > * Force the apps to dump the heap and analyse it offline with jhat. > > I've used jhat on one such dump from a memory stress test. If the > > dump is very large > > the user could just start jhat which starts a webserver on port 7000 > > which we can access. > > > > Here's one of the dump analysis from jhat : > > http://swift.rcc.uchicago.edu:7000/histo/ > > http://swift.rcc.uchicago.edu:7000/showInstanceCounts/includePlatform/ > > > > * jmap can be used to get maps of the jvm while it is running : Here's > > a snap on a stress run with 10^6 + 1 ints held in a swift array: > > > > [yadunand at midway001 data_stress]$ jmap -histo:live 31135 | head -n 10 > > > > num #instances #bytes class name > > ---------------------------------------------- > > 1: 1000001 56000056 org.griphyn.vdl.mapping.DataNode > > 2: 1030601 32979232 java.util.HashMap$Entry > > 3: 2014652 32234432 java.lang.Integer > > 4: 1000015 24000360 org.griphyn.vdl.type.impl.FieldImpl > > 5: 14672 4903992 [Ljava.util.HashMap$Entry; > > 6: 29872 4149184 > > 7: 29872 3831968 > > > > These together with the live heap tracking commit from Mihael should > > be able to give a better picture of what is going on with the user's > > run. This again would require the user to run an extra script. > > > > As for Sheri's case there was a core dump, and if we could get her to > > run jhat on her side, I think that would open up some extra detail > > into what is consuming the memory. > > > > Please let me know if this might be something that is worth a shot. > > > > Thanks, > > Yadu > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From hategan at mcs.anl.gov Thu Jan 30 19:33:11 2014 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Thu, 30 Jan 2014 17:33:11 -0800 Subject: [Swift-devel] Tracking swift heap overflow culprit In-Reply-To: References: <1391124161.10180.7.camel@echo> Message-ID: <1391131991.11650.11.camel@echo> On Thu, 2014-01-30 at 19:02 -0600, Michael Wilde wrote: > (Moving this to my MCS email addr and to swift-devel) > > Mihael, what we are trying to do here is not (initially) change anything in > Swift memory usage. I understand. I, however, have tried and I believe it to be a worthwhile task to continuously identify places where things can be improved. > > We just want to understand the costs in memory of normal Swift operations, > eg, call a function with N args and M returns; map a file; create an array > of 1000 1MB strings; etc. Right. I understand. I think the best way of doing that is to trace memory use for different sizes of the problem and try to fit a line to the results. Essentially we need a model there. We can start with models for simple things, such as the ones you describe. It's possible that these can be combined to get estimates for more complex problems, but I am not entirely sure. > > Then, for any program execution, we want to be able to trace - at some > useful level of granularity - the consumption of Java memory caused by > these normal Swift activities. > > For example, if a user writes a function that is going to create - and hold > - 10MB of memory, due to its local variables, then having 10,000 of those > active at once would consume - and hold - 100GB of RAM. > > My suspicion is that this is exactly what e.g. Sheri's code is doing. Any > I further suspect that once we identify what procedures are using most of > the memory in what way, then we can tune the user code to use much less > memory, Maybe. In my experience profiling swift memory usage over the years is that there are many many things that contribute just a little, but it adds up. 
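
A minimal sketch of the line-fitting idea above, in Python (not part of Swift; the names `fit_line` and `predict` are made up for illustration). It assumes you have already collected pairs of (problem size, live heap bytes) from your own runs, for example by totalling `jmap -histo:live` output at several array lengths. No such measurements are included here.

```python
def fit_line(sizes, heap_bytes):
    """Ordinary least-squares fit of heap_bytes ~ fixed + per_item * size.

    sizes      -- problem sizes (e.g. number of array elements) for each run
    heap_bytes -- live heap measured for the corresponding run, in bytes
    Returns (fixed, per_item): the constant overhead and the cost per item.
    """
    n = len(sizes)
    if n < 2 or n != len(heap_bytes):
        raise ValueError("need at least two (size, bytes) pairs")
    mean_x = sum(sizes) / n
    mean_y = sum(heap_bytes) / n
    sxx = sum((x - mean_x) ** 2 for x in sizes)
    if sxx == 0:
        raise ValueError("problem sizes must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, heap_bytes))
    per_item = sxy / sxx
    fixed = mean_y - per_item * mean_x
    return fixed, per_item


def predict(fixed, per_item, size):
    """Estimate heap use in bytes for a given problem size from the fitted model."""
    return fixed + per_item * size


if __name__ == "__main__":
    # The example quoted above: 10 MB held per call, 10,000 calls active at once,
    # no fixed overhead assumed.
    total = predict(0, 10 * 10**6, 10_000)
    print(f"{total / 10**9:.0f} GB")  # prints "100 GB"
```

Whether per-operation models like this can simply be added together for a complex script is an open question, as noted above; the fit only describes the one operation that was measured.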
> > We can - by experiment - develop a cost table for common Swift operations. > But Sheri's code is the most complex Swift scripts that exist. Each has a > few K lines of swift code. WIthout some auomated memory usage stats that > correlate mem consumption to source code, it will be hard to find the > culprits. I would love something like that. At the expense of sounding pessimistic, I also think that it is a hard problem. I think a good first step in developing such a tool would be to look at existing memory profilers and learn a bit from them. > > So the question is not how to make Swift use less memor (although thats > always desirable), but rather first just to create the tools to know how > much a give program run uses for what. > > Can you suggest affordable ways to get this info? I believe that some of this can be done with the tracing in the log (if tracing is enabled). It can be used to count how many times a procedure is started, what variables are being allocated in that procedure, and so on. It won't give us information on how much memory each variable takes (that might be tricky without some jvm-level profiling tools), but it may be a start. 
Mihael

From hategan at mcs.anl.gov Thu Jan 30 21:53:10 2014
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Thu, 30 Jan 2014 19:53:10 -0800
Subject: [Swift-devel] Tracking swift heap overflow culprit
In-Reply-To:
References: <1391124161.10180.7.camel@echo>
Message-ID: <1391140390.13028.8.camel@echo>

On Thu, 2014-01-30 at 19:02 -0600, Michael Wilde wrote:
> > >
> > >
> > > [yadunand at midway001 data_stress]$ jmap -histo:live 31135 | head -n 10
> > >
> > >  num     #instances         #bytes  class name
> > > ----------------------------------------------
> > >    1:       1000001       56000056  org.griphyn.vdl.mapping.DataNode
> > >    2:       1030601       32979232  java.util.HashMap$Entry
> > >    3:       2014652       32234432  java.lang.Integer
> > >    4:       1000015       24000360  org.griphyn.vdl.type.impl.FieldImpl
> > >    5:         14672        4903992  [Ljava.util.HashMap$Entry;
> > >    6:         29872        4149184
> > >    7:         29872        3831968

Some specific comments:

1. You need a DataNode for each swift piece of data.

2. Arrays are sparse and implemented as a map, so for each element of an array you will have a HashMap$Entry.

3. A little too many. I've updated some code to try to avoid instantiating Integer objects if not necessary (using Integer.valueOf will typically use existing objects for small numbers). Unfortunately, if you have 1,000,000 different integer values, you will end up with 1,000,000 integers. One can probably implement a version of DataNode that is specialized for primitive values to eliminate that part.

4. Unfortunately array indices are treated as "fields" (i.e. a[1] is treated as a.1). For structures, field instances are cached, but for arrays, since they have different indices, not so much.

5. Every HashMap has an array of HashMap$Entry objects.

6&7. constant space stuff.

Mihael

From davidkelly at uchicago.edu Fri Jan 31 15:07:42 2014
From: davidkelly at uchicago.edu (David Kelly)
Date: Fri, 31 Jan 2014 15:07:42 -0600
Subject: [Swift-devel] Heap plots
Message-ID:

I wrote a script for plotting java heap statistics over time.
This will only work with recent versions of trunk, but I thought it might be useful for debugging memory issues. It's in SVN at https://svn.ci.uchicago.edu/svn/vdl2/SwiftApps/heap-plot. Attached is a test plot.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: heap-plot.png
Type: image/png
Size: 7464 bytes
Desc: not available
URL:

From wilde at mcs.anl.gov Fri Jan 31 20:40:19 2014
From: wilde at mcs.anl.gov (Wilde, Michael J.)
Date: Sat, 1 Feb 2014 02:40:19 +0000
Subject: [Swift-devel] Heap plots
In-Reply-To:
References:
Message-ID: <85C85E44DD880E498CEA5A501B27954BEA44AE@DITKA.anl.gov>

Why is the blue line such a small fraction of the green line?

Is all the space in between objects waiting to be garbage collected by Java?

- Mike
--
Michael Wilde
Mathematics and Computer Science Computation Institute
Argonne National Laboratory The University of Chicago

________________________________
From: swift-devel-bounces at ci.uchicago.edu [swift-devel-bounces at ci.uchicago.edu] on behalf of David Kelly [davidkelly at uchicago.edu]
Sent: Friday, January 31, 2014 3:07 PM
To: swift-devel
Subject: [Swift-devel] Heap plots

I wrote a script for plotting java heap statistics over time. This will only work with recent versions of trunk, but thought it might be useful for debugging memory issues. It's in SVN at https://svn.ci.uchicago.edu/svn/vdl2/SwiftApps/heap-plot. Attached a test plot.

From hategan at mcs.anl.gov Fri Jan 31 21:15:03 2014
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Fri, 31 Jan 2014 19:15:03 -0800
Subject: [Swift-devel] Heap plots
In-Reply-To: <85C85E44DD880E498CEA5A501B27954BEA44AE@DITKA.anl.gov>
References: <85C85E44DD880E498CEA5A501B27954BEA44AE@DITKA.anl.gov>
Message-ID: <1391224503.27683.10.camel@echo>

The same reason why the green line is decreasing over time.
The green line is determined by the JVM on startup, but can be controlled with -Xms.

The basic algorithm for the green line is that on every garbage collection round, if the blue line is close to the green line, the green line is increased. If it's very small compared to the green line, the green line is decreased.

The green line is how much the JVM mallocs for the heap. It knows very little about the program's requirements in the beginning, so it's a guess. If it turns out that very little of it is used, it releases some percentage of it on each collection.

There should be some flags to control some of these aspects, but I can't find them in the man page.

Mihael

On Sat, 2014-02-01 at 02:40 +0000, Wilde, Michael J. wrote:
> Why is the blue line such a small fraction of the green line?
>
> Is all the space in between objects waiting to be garbage collected by Jave?
>
> - Mike
> --
> Michael Wilde
> Mathematics and Computer Science Computation Institute
> Argonne National Laboratory The University of Chicago

From peronja at fnal.gov Fri Jan 3 14:01:07 2014
From: peronja at fnal.gov (Edit Peronja)
Date: Fri, 03 Jan 2014 20:01:07 -0000
Subject: [Swift-devel] question about exceptions
Message-ID: <15BBCC69-11FD-4E90-86D6-E777683C6D63@fnal.gov>

Hi,

Is there a way for Swift to catch exceptions and report them (like a try/catch block)?

Thanks,
Edit

Edit Peronja
I2U2 - Application Developer
peronja at fnal.gov
630-840-4165