[Swift-commit] r5440 - branches/release-0.93/docs/siteguide

ketan at ci.uchicago.edu ketan at ci.uchicago.edu
Sun Dec 18 17:26:30 CST 2011


Author: ketan
Date: 2011-12-18 17:26:30 -0600 (Sun, 18 Dec 2011)
New Revision: 5440

Modified:
   branches/release-0.93/docs/siteguide/beagle
Log:
 

Modified: branches/release-0.93/docs/siteguide/beagle
===================================================================
--- branches/release-0.93/docs/siteguide/beagle	2011-12-18 20:55:45 UTC (rev 5439)
+++ branches/release-0.93/docs/siteguide/beagle	2011-12-18 23:26:30 UTC (rev 5440)
@@ -178,12 +178,13 @@
 * Command not found: Swift is installed on Beagle as a module. If you see the following error message:
 
 -----
-If 'swift' is not a typo you can run the following command to lookup the package that contains the binary:
+If 'swift' is not a typo you can run the following command to lookup the 
+package that contains the binary:
     command-not-found swift
 -bash: swift: command not found
 -----
 
-The most likely cause is the module is not loaded. Do the following to load the Swift module:
+The most likely cause is that the Swift module is not loaded. Do the following to load it:
 
 -----
 $ module load swift
@@ -198,10 +199,22 @@
 
 It is likely that it is set to a path where the compute nodes can not write, e.g. your /home directory. The remedy for this error is to set your workdirectory to the /lustre path where swift could write from compute nodes.
 
-----
+-----
 <workdirectory >/lustre/beagle/ketan/swift.workdir</workdirectory>
-----
+-----
 
+* Out of heap space: this is a typical error when running a large number of tasks in parallel from a submit host such as the Beagle login nodes.
+
+-----
+java.lang.OutOfMemoryError: Java heap space
+-----
+
+A simple solution is to increase the Java heap space available to Swift by setting the following environment variable when invoking swift:
+
+-----
+SWIFT_HEAP_MAX=5000M swift -config cf -tc.file tc -sites.file sites.xml catsn.swift -n=10000
+-----
+
 * If the error message does not give much clue, one can go about the following approaches to find more help:
  - Search for the particular error message on the swift mailing list archive from here: http://www.ci.uchicago.edu/swift/wwwdev/support/index.php
  - Subscribe to the swift-user lists and post your questions here: https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user
@@ -224,10 +237,12 @@
   freeMB=$(free -m | grep cache: | awk '{print $4}')
   if [ $freeMB -lt $lowmem ]; then
     if [ $i = $maxtries ]; then
-      echo "$host $(date) freeMB = $freeMB below yellow mark $lowmem after $maxtries $startsleep sec pauses. Exiting." >>$oomlog
+      echo "$host $(date) freeMB = $freeMB below yellow mark $lowmem after $maxtries \
+       $startsleep sec pauses. Exiting." >>$oomlog
       exit 7
     else
-      echo "$host $(date) freeMB = $freeMB below yellow mark $lowmem on try $i. Sleeping $startsleep sec." >>$oomlog
+      echo "$host $(date) freeMB = $freeMB below yellow mark $lowmem on try $i. Sleeping \
+      $startsleep sec." >>$oomlog
       sleep $startsleep
     fi
   else



