[Swift-commit] r5440 - branches/release-0.93/docs/siteguide
ketan at ci.uchicago.edu
Sun Dec 18 17:26:30 CST 2011
Author: ketan
Date: 2011-12-18 17:26:30 -0600 (Sun, 18 Dec 2011)
New Revision: 5440
Modified:
branches/release-0.93/docs/siteguide/beagle
Log:
Modified: branches/release-0.93/docs/siteguide/beagle
===================================================================
--- branches/release-0.93/docs/siteguide/beagle 2011-12-18 20:55:45 UTC (rev 5439)
+++ branches/release-0.93/docs/siteguide/beagle 2011-12-18 23:26:30 UTC (rev 5440)
@@ -178,12 +178,13 @@
* Command not found: Swift is installed on Beagle as a module. If you see the following error message:
-----
-If 'swift' is not a typo you can run the following command to lookup the package that contains the binary:
+If 'swift' is not a typo you can run the following command to lookup the
+package that contains the binary:
command-not-found swift
-bash: swift: command not found
-----
-The most likely cause is the module is not loaded. Do the following to load the Swift module:
+The most likely cause is that the Swift module is not loaded. Load it as follows:
-----
$ module load swift
@@ -198,10 +199,22 @@
It is likely set to a path where the compute nodes cannot write, e.g. your /home directory. The remedy is to set your workdirectory to a /lustre path where Swift can write from the compute nodes.
-----
+-----
<workdirectory >/lustre/beagle/ketan/swift.workdir</workdirectory>
-----
+-----
+* Out of heap space: this error is typical when running a large number of tasks in parallel from a submit host such as the Beagle login nodes.
+
+-----
+java.lang.OutOfMemoryError: Java heap space
+-----
+
+Increasing the Java heap space usually resolves this. Set the heap size available to Swift through the following environment variable:
+
+-----
+SWIFT_HEAP_MAX=5000M swift -config cf -tc.file tc -sites.file sites.xml catsn.swift -n=10000
+-----
+
* If the error message does not give much of a clue, try the following approaches to find more help:
- Search for the particular error message on the swift mailing list archive from here: http://www.ci.uchicago.edu/swift/wwwdev/support/index.php
- Subscribe to the swift-user lists and post your questions here: https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user
@@ -224,10 +237,12 @@
freeMB=$(free -m | grep cache: | awk '{print $4}')
if [ $freeMB -lt $lowmem ]; then
if [ $i = $maxtries ]; then
- echo "$host $(date) freeMB = $freeMB below yellow mark $lowmem after $maxtries $startsleep sec pauses. Exiting." >>$oomlog
+ echo "$host $(date) freeMB = $freeMB below yellow mark $lowmem after $maxtries \
+ $startsleep sec pauses. Exiting." >>$oomlog
exit 7
else
- echo "$host $(date) freeMB = $freeMB below yellow mark $lowmem on try $i. Sleeping $startsleep sec." >>$oomlog
+ echo "$host $(date) freeMB = $freeMB below yellow mark $lowmem on try $i. Sleeping \
+ $startsleep sec." >>$oomlog
sleep $startsleep
fi
else
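For readers who want to try the low-memory check from the last hunk outside the guide, here is a minimal standalone sketch. It is not the script from the siteguide: the function names free_mb and check_memory, the default values, and the log path are assumptions made for illustration. It also reads the "Mem:" line of free -m rather than the "cache:" line used in the hunk, since newer versions of free no longer print that line.

```shell
#!/bin/bash
# Assumed defaults; the real values are set outside the hunk shown above.
lowmem=${lowmem:-1000}          # threshold in MB
maxtries=${maxtries:-3}         # number of checks before giving up
startsleep=${startsleep:-10}    # seconds to pause between checks
oomlog=${oomlog:-/tmp/oom.log}  # hypothetical log location
host=${HOSTNAME:-$(uname -n)}

# Print free memory in MB (hypothetical helper; reads the "Mem:" line).
free_mb() {
  free -m | awk '/^Mem:/ {print $4}'
}

# Return 0 once free memory is at or above $lowmem.
# Return 7 if it is still below the threshold on the last try.
check_memory() {
  local i freeMB
  for i in $(seq 1 "$maxtries"); do
    freeMB=$(free_mb)
    if [ "$freeMB" -lt "$lowmem" ]; then
      if [ "$i" -eq "$maxtries" ]; then
        echo "$host $(date) freeMB = $freeMB below yellow mark $lowmem after $maxtries $startsleep sec pauses. Exiting." >>"$oomlog"
        return 7
      fi
      echo "$host $(date) freeMB = $freeMB below yellow mark $lowmem on try $i. Sleeping $startsleep sec." >>"$oomlog"
      sleep "$startsleep"
    else
      return 0
    fi
  done
}
```

A caller could run check_memory before launching a task and skip the launch on a non-zero return. Each log message is kept on one line here because a backslash-newline inside double quotes carries the next line's leading whitespace into the logged text.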