[Swift-commit] r5961 - SwiftApps/ParVis/HiRAMTools
wilde at ci.uchicago.edu
Wed Oct 10 09:43:26 CDT 2012
Author: wilde
Date: 2012-10-10 09:43:26 -0500 (Wed, 10 Oct 2012)
New Revision: 5961
Added:
SwiftApps/ParVis/HiRAMTools/STATUS
Modified:
SwiftApps/ParVis/HiRAMTools/README
Log:
Updates from work in July 2012.
Modified: SwiftApps/ParVis/HiRAMTools/README
===================================================================
--- SwiftApps/ParVis/HiRAMTools/README 2012-10-10 14:42:25 UTC (rev 5960)
+++ SwiftApps/ParVis/HiRAMTools/README 2012-10-10 14:43:26 UTC (rev 5961)
@@ -13,17 +13,21 @@
runall.sh # called by the user, runs the scripts below
- makeyearly_realization.sh
- makeyearly.swift
+ combine_realization.sh
+ combine.swift
+ combine.sh (app, specified in tc)
+
+ makeyearly_realization.sh
+ makeyearly.swift
makeyearly-cdo.sh (app, specified in tc)
- combine_realization.sh
- combine.swift
- combine.chk.sh (app, specified in tc)
+ runpfrepps.sh # Called by the user, runs the scripts below
+ pfrepps.swift
+ genpfrepps.sh
+ runscript.sh
-To process a set of realizations (for example, to combine them), perform these
-steps:
+To combine or annualize a set of realizations, perform these steps:
1. Add Swift and a recent Sun Java to your path. For now we will use a Swift
and Java version maintained by Swift team member Jon Monette:
@@ -57,28 +61,83 @@
script=makeyearly_realization.sh # Script to run for each realization
-5. Place the list of realizations to process in a file named "real.todo" and
- create and empty file "real.done". For example, to process 2 realizations,
- Climo_001 and Climo_023):
+5. Place the list of *full pathnames* of the realizations to process in a file
+ named "real.todo" and create an empty file "real.done". For example, to
+ process 3 realizations, e.g. en3rc16Ic1, ... do:
cat >real.todo
- Climo_001
- Climo_023
+ /intrepid-fs0/users/lzamboni/persistent/en3rc16Ic1
+ /intrepid-fs0/users/lzamboni/persistent/en3rth12Ic2
+ /intrepid-fs0/users/lzamboni/persistent/en3rth8Ic2
^D
- >real.done # create an empty real.done file
+ >real.done # IMPORTANT! create an empty real.done file
6. run:
- ./runall.combine_realizations.sh >& runall.out
+ $HOME/HiRAMTools/runall.sh >& runall.out
-Then look for the most recently created run directory, and do:
+Then cd to the most recently created run directory, and do:
+
+ cd run042 # for example
+ tail -f swift.out
+===
+
+To run pfrep on a set of realizations, do:
+
+1. Create a list of realizations in a file, starting with a header line, in
+ the following format:
+
+--- file pflist --- (next line is first line of file):
+path id
+/full/path/of/realization/history/dir realizationID
+etc
+---
+
+For example, file "pflist" contains:
+
+--- Next line is first line of file. "---" is not in file:
+path id
+/intrepid-fs0/users/lzamboni/persistent/yearly-nco/many-en1/run001/run001/ en1eo14Ic2
+/intrepid-fs0/users/lzamboni/persistent/yearly-nco/many-en1/run002/run003/ en1eo14Ic3
+/intrepid-fs0/users/lzamboni/persistent/yearly-nco/many-en1/run003/run004/ en1eo14Ic4
+/intrepid-fs0/users/lzamboni/persistent/yearly-nco/many-en1/run004/run005/ en1eo16Ic1
+---
+
+2. Run the script:
+
+ runpfrepps.sh pflistFile outputDir
+
+ $HOME/HiRAMTools/runpfrepps.sh pflist /intrepid-fs0/users/wilde/persistent/LZ/pfrepps.2012.0630
+
+3. Watch the status
+
+ cd run012
tail -f swift.out
+4. Manual, local execution of pfrepp atmos average scripts (generated from step 3, above)
+cd /intrepid-fs0/users/wilde/persistent/LZ/pfrepps
+
+# start scripts here ???
+
+# show script progress:
+
+for s in $(cat set02); do echo "$s: "; tail -15 $s.out; done
+
+
+
OPEN ISSUES:
+- error handling in the leaf scripts is highly suspect: are errors correctly
+ getting caught???
+
- We see straggler scripts on Eureka nodes. Seems to be doing very slow IO on
-shared disk for unexplained reasons. Ticket is open with ALCF support, being
-investigated by Andrew Cherry.
+ shared disk for unexplained reasons. Ticket is open with ALCF support, being
+ investigated by Andrew Cherry.
+
+- If /scratch is not available, we switch to running entirely on fs0
+
+- Naming conventions for outdir now used in combine need to be added to
+ makeyearly
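
Steps 5 and 6 of the README text in the diff above can be scripted. The following is a minimal sketch, not part of HiRAMTools: the function name make_todo is hypothetical, and it assumes only what the README states, namely that real.todo holds one full pathname per line and that real.done must exist and be empty before runall.sh is started.

```shell
#!/bin/sh
# Hypothetical helper (not part of HiRAMTools): write real.todo from the
# realization directories given as arguments and create an empty real.done.
make_todo() {
    # At least one realization is required.
    [ "$#" -gt 0 ] || { echo "usage: make_todo /full/path ..." >&2; return 1; }

    # Reject bare names such as Climo_001; the README asks for full pathnames.
    # Validation happens before any file is written.
    for r in "$@"; do
        case "$r" in
            /*) ;;
            *) echo "make_todo: not a full pathname: $r" >&2; return 1 ;;
        esac
    done

    printf '%s\n' "$@" > real.todo   # one realization per line
    : > real.done                    # IMPORTANT: must exist and be empty
}
```

A call such as make_todo /intrepid-fs0/users/lzamboni/persistent/en3rc16Ic1 would then be followed by the runall.sh invocation from step 6.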
Added: SwiftApps/ParVis/HiRAMTools/STATUS
===================================================================
--- SwiftApps/ParVis/HiRAMTools/STATUS (rev 0)
+++ SwiftApps/ParVis/HiRAMTools/STATUS 2012-10-10 14:43:26 UTC (rev 5961)
@@ -0,0 +1,273 @@
+From LZ: here are the versions we know work:
+
+ nco-4.0.9 for yearly reorganization
+
+ nco-3.9.2 for pfrepp and for combine (combine might not use nco at all, but
+ this is our solution today)
+
+ when we know makeyearly, combine and pfrepp work, we may want to test whether
+ the following version works for all cases (as ALCF Support suggests)
+
+ 3.9.9-udunits2
+
+----
+
+In: /intrepid-fs0/users/lzamboni/persistent/yearly-nco/many-en1
+
+run001/run001/ run001 en1eo14Ic2
+run002/run003/ run002 en1eo14Ic3
+run003/run004/ run003 en1eo14Ic4
+run004/run005/ run004 en1eo16Ic1
+run005/run006/ run005 en1eo16Ic2
+run006/run007/ run006 en1eo16Ic3
+run007/run008/ run007 en1eo16Ic4
+run008/run009/ run008 en1eo8Ic1
+run009/run010/ run009 en1eo8Ic2
+run010/run011/ run010 en1eo8Ic3
+run011/run012/ run011 en1eo8Ic4
+
+run001/run001/ en1eo14Ic2
+run002/run003/ en1eo14Ic3
+run003/run004/ en1eo14Ic4
+run004/run005/ en1eo16Ic1
+run005/run006/ en1eo16Ic2
+run006/run007/ en1eo16Ic3
+run007/run008/ en1eo16Ic4
+run008/run009/ en1eo8Ic1
+run009/run010/ en1eo8Ic2
+run010/run011/ en1eo8Ic3
+run011/run012/ en1eo8Ic4
+
+run001/run001/
+run002/run003/
+run003/run004/
+run004/run005/
+run005/run006/
+run006/run007/
+run007/run008/
+run008/run009/
+run009/run010/
+run010/run011/
+run011/run012/
+
+740 files per combined dir
+740 * 2.5 = 1850 files per annualized dir
+
+Short yearlies:
+
+/intrepid-fs0/users/lzamboni/persistent/yearly-nco/many-en1/run001/run001/ 1540
+/intrepid-fs0/users/lzamboni/persistent/yearly-nco/many-en1/run002/run003/ 1405
+/intrepid-fs0/users/lzamboni/persistent/yearly-nco/many-en1/run003/run004/ 1844
+/intrepid-fs0/users/lzamboni/persistent/yearly-nco/many-en1/run011/run012/ 1151
+
+cdir=/intrepid-fs0/users/wilde/persistent/LZ/combines.toann.2012.0706
+mkdir -p $cdir
+ln -s /intrepid-fs0/users/lzamboni/persistent/combined/many-en1/run001 $cdir/en1eo14Ic2
+ln -s /intrepid-fs0/users/lzamboni/persistent/combined/many-en1/run002 $cdir/en1eo14Ic3
+ln -s /intrepid-fs0/users/lzamboni/persistent/combined/many-en1/run003 $cdir/en1eo14Ic4
+ln -s /intrepid-fs0/users/lzamboni/persistent/combined/many-en1/run011 $cdir/en1eo8Ic4
+
+
+eur$ for d in $cdir/*; do (echo -n $d: ; find $d/19????01 -type f | wc -l ); done
+/intrepid-fs0/users/wilde/persistent/LZ/combines.toann.2012.0706/en1eo14Ic2:740
+/intrepid-fs0/users/wilde/persistent/LZ/combines.toann.2012.0706/en1eo14Ic3:740
+/intrepid-fs0/users/wilde/persistent/LZ/combines.toann.2012.0706/en1eo14Ic4:740
+/intrepid-fs0/users/wilde/persistent/LZ/combines.toann.2012.0706/en1eo8Ic4:740
+eur$
+
+
+/intrepid-fs0/users/wilde/persistent/LZ/combines.toann.2012.0706/en1eo14Ic2
+/intrepid-fs0/users/wilde/persistent/LZ/combines.toann.2012.0706/en1eo14Ic3
+/intrepid-fs0/users/wilde/persistent/LZ/combines.toann.2012.0706/en1eo14Ic4
+/intrepid-fs0/users/wilde/persistent/LZ/combines.toann.2012.0706/en1eo8Ic4
+
+
+==== Clean Tracker
+
+en1 realization IDs:
+
+en1eo12Ic1
+en1eo12Ic2
+en1eo12Ic3
+en1eo12Ic4
+en1eo14Ic1
+en1eo14Ic2
+en1eo14Ic3
+en1eo14Ic4
+en1eo16Ic1
+en1eo16Ic2
+en1eo16Ic3
+en1eo16Ic4
+en1eo8Ic1
+en1eo8Ic2
+en1eo8Ic3
+en1eo8Ic4
+
+en1 HiRAM output:
+
+/intrepid-fs0/users/lzamboni/persistent/en1eo12Ic1
+/intrepid-fs0/users/lzamboni/persistent/en1eo12Ic2
+/intrepid-fs0/users/lzamboni/persistent/en1eo12Ic3
+/intrepid-fs0/users/lzamboni/persistent/en1eo12Ic4
+/intrepid-fs0/users/lzamboni/persistent/en1eo14Ic1
+/intrepid-fs0/users/lzamboni/persistent/en1eo14Ic2
+/intrepid-fs0/users/lzamboni/persistent/en1eo14Ic3
+/intrepid-fs0/users/lzamboni/persistent/en1eo14Ic4
+/intrepid-fs0/users/lzamboni/persistent/en1eo16Ic1
+/intrepid-fs0/users/lzamboni/persistent/en1eo16Ic2
+/intrepid-fs0/users/lzamboni/persistent/en1eo16Ic3
+/intrepid-fs0/users/lzamboni/persistent/en1eo16Ic4
+/intrepid-fs0/users/lzamboni/persistent/en1eo8Ic1
+/intrepid-fs0/users/lzamboni/persistent/en1eo8Ic2
+/intrepid-fs0/users/lzamboni/persistent/en1eo8Ic3
+/intrepid-fs0/users/lzamboni/persistent/en1eo8Ic4
+
+Combined output:
+
+/intrepid-fs0/users/lzamboni/persistent/combined/en1eo12Ic1/run004 en1eo12Ic1
+/intrepid-fs0/users/lzamboni/persistent/combined/en1eo12Ic2/run001 en1eo12Ic2
+/intrepid-fs0/users/lzamboni/persistent/combined/en1eo12Ic3/run001 en1eo12Ic3
+/intrepid-fs0/users/lzamboni/persistent/combined/run001 en1eo12Ic4 uncertain realID
+/intrepid-fs0/users/lzamboni/persistent/combined/run002 en1eo14Ic1 uncertain realID
+/intrepid-fs0/users/lzamboni/persistent/combined/many-en1/run001 en1eo14Ic2 BAD
+/intrepid-fs0/users/wilde/persistent/LZ/combined.2012.0707/en1eo14Ic2 en1eo14Ic2
+/intrepid-fs0/users/lzamboni/persistent/combined/many-en1/run002 en1eo14Ic3 BAD
+/intrepid-fs0/users/wilde/persistent/LZ/combined.2012.0707/en1eo14Ic3 en1eo14Ic3
+/intrepid-fs0/users/lzamboni/persistent/combined/many-en1/run003 en1eo14Ic4 BAD
+/intrepid-fs0/users/wilde/persistent/LZ/combined.2012.0707/en1eo14Ic4 en1eo14Ic4
+/intrepid-fs0/users/lzamboni/persistent/combined/many-en1/run004 en1eo16Ic1
+/intrepid-fs0/users/lzamboni/persistent/combined/many-en1/run005 en1eo16Ic2
+/intrepid-fs0/users/lzamboni/persistent/combined/many-en1/run006 en1eo16Ic3
+/intrepid-fs0/users/lzamboni/persistent/combined/many-en1/run007 en1eo16Ic4
+/intrepid-fs0/users/lzamboni/persistent/combined/many-en1/run008 en1eo8Ic1
+/intrepid-fs0/users/lzamboni/persistent/combined/many-en1/run009 en1eo8Ic2
+/intrepid-fs0/users/lzamboni/persistent/combined/many-en1/run010 en1eo8Ic3
+/intrepid-fs0/users/lzamboni/persistent/combined/many-en1/run011 en1eo8Ic4 BAD
+/intrepid-fs0/users/wilde/persistent/LZ/combined.2012.0707/en1eo8Ic4 en1eo8Ic4
+
+Yearly output:
+
+/intrepid-fs0/users/lzamboni/persistent/yearly-nco/combined/en1eo12Ic1/run001 en1eo12Ic1 MISSING
+/intrepid-fs0/users/lzamboni/persistent/yearly-nco/combined/en1eo12Ic2/run002 en1eo12Ic2 MISSING
+COMPLETED - ignore en1eo12Ic3 IGNORE
+/intrepid-fs0/users/lzamboni/persistent/yearly-nco/combined/run001/run003 en1eo12Ic4 SHORT by half
+/intrepid-fs0/users/lzamboni/persistent/yearly-nco/combined/run002/run004 en1eo14Ic1
+/intrepid-fs0/users/lzamboni/persistent/yearly-nco/many-en1/run001/run001/ en1eo14Ic2 BAD
+/intrepid-fs0/users/wilde/persistent/LZ/yearly.2012.0708/en1eo14Ic2/run005/run001 en1eo14Ic2
+/intrepid-fs0/users/lzamboni/persistent/yearly-nco/many-en1/run002/run003/ en1eo14Ic3 BAD
+/intrepid-fs0/users/wilde/persistent/LZ/yearly.2012.0708/en1eo14Ic3/run006/run002 en1eo14Ic3
+/intrepid-fs0/users/lzamboni/persistent/yearly-nco/many-en1/run003/run004/ en1eo14Ic4 BAD
+/intrepid-fs0/users/wilde/persistent/LZ/yearly.2012.0708/en1eo14Ic4/run007/run004 en1eo14Ic4
+/intrepid-fs0/users/lzamboni/persistent/yearly-nco/many-en1/run004/run005/ en1eo16Ic1
+/intrepid-fs0/users/lzamboni/persistent/yearly-nco/many-en1/run005/run006/ en1eo16Ic2
+/intrepid-fs0/users/lzamboni/persistent/yearly-nco/many-en1/run006/run007/ en1eo16Ic3
+/intrepid-fs0/users/lzamboni/persistent/yearly-nco/many-en1/run007/run008/ en1eo16Ic4
+/intrepid-fs0/users/lzamboni/persistent/yearly-nco/many-en1/run008/run009/ en1eo8Ic1
+/intrepid-fs0/users/lzamboni/persistent/yearly-nco/many-en1/run009/run010/ en1eo8Ic2
+/intrepid-fs0/users/lzamboni/persistent/yearly-nco/many-en1/run010/run011/ en1eo8Ic3
+/intrepid-fs0/users/lzamboni/persistent/yearly-nco/many-en1/run011/run012/ en1eo8Ic4 BAD
+/intrepid-fs0/users/wilde/persistent/LZ/yearly.2012.0708/en1eo8Ic4/run008/run004 en1eo8Ic4 running
+
+Atmos average output from pfrepp:
+
+ en1eo12Ic1
+ en1eo12Ic2
+COMPLETED - IGNORE -------------------------------------- en1eo12Ic3
+ en1eo12Ic4
+/intrepid-fs0/users/wilde/persistent/LZ/pfrepps/en1eo14Ic1 en1eo14Ic1
+/intrepid-fs0/users/wilde/persistent/LZ/pfrepps/en1eo14Ic2 en1eo14Ic2
+/intrepid-fs0/users/wilde/persistent/LZ/pfrepps/en1eo14Ic3 en1eo14Ic3
+ en1eo14Ic4
+/intrepid-fs0/users/wilde/persistent/LZ/pfrepps/en1eo16Ic1 en1eo16Ic1
+/intrepid-fs0/users/wilde/persistent/LZ/pfrepps/en1eo16Ic2 en1eo16Ic2
+/intrepid-fs0/users/wilde/persistent/LZ/pfrepps/en1eo16Ic3 en1eo16Ic3
+/intrepid-fs0/users/wilde/persistent/LZ/pfrepps/en1eo16Ic4 en1eo16Ic4
+/intrepid-fs0/users/wilde/persistent/LZ/pfrepps/en1eo8Ic1 en1eo8Ic1
+/intrepid-fs0/users/wilde/persistent/LZ/pfrepps/en1eo8Ic2 en1eo8Ic2
+/intrepid-fs0/users/wilde/persistent/LZ/pfrepps/en1eo8Ic3 en1eo8Ic3
+ en1eo8Ic4
+
+
+
+
+
+en1eo8Ic1 - not sure why this was omitted from set02???
+
+
+en1eo16Ic1 - completed atmos avg as part of 3/8 steps that completed in a full-run attempt
+en1eo12Ic3 - Laura is running analysis on this reali, so that one must have been pfrepp'ed by her
+
+Set03 to do:
+
+/intrepid-fs0/users/lzamboni/persistent/yearly-nco/combined/run002/run004 en1eo14Ic1
+/intrepid-fs0/users/wilde/persistent/LZ/yearly.2012.0708/en1eo14Ic2/run005/run001 en1eo14Ic2
+/intrepid-fs0/users/wilde/persistent/LZ/yearly.2012.0708/en1eo14Ic3/run006/run002 en1eo14Ic3
+/intrepid-fs0/users/lzamboni/persistent/yearly-nco/many-en1/run008/run009/ en1eo8Ic1
+
+set04:
+
+/intrepid-fs0/users/wilde/persistent/LZ/yearly.2012.0708/en1eo14Ic4/run007/run004 en1eo14Ic4
+/intrepid-fs0/users/wilde/persistent/LZ/yearly.2012.0708/en1eo8Ic4/run008/run004 en1eo8Ic4
+
+set05 (tentative):
+
+/intrepid-fs0/users/lzamboni/persistent/yearly-nco/combined/en1eo12Ic1/run001 en1eo12Ic1 MISSING
+/intrepid-fs0/users/lzamboni/persistent/yearly-nco/combined/en1eo12Ic2/run002 en1eo12Ic2 MISSING
+/intrepid-fs0/users/lzamboni/persistent/yearly-nco/combined/run001/run003 en1eo12Ic4 SHORT by half - Rerun?
+
+
+============= TO FIX ==============
+
+[x] first of all, if you need to run, make sure you're using the following version
+of NCO (which is a package that handles netCDF files)
+
+nco-4.0.9 for yearly reorganization
+
+nco-3.9.2 for pfrepp and for combine (combine might not use nco at all, but
+this is our solution today)
+
+when we know makeyearly, combine and pfrepp work, we may want to test whether
+the following version works for all cases (as ALCF Support suggests)
+
+3.9.9-udunits2
+
+---
+
+You can refer to my working dir /home/lzamboni/SWIFT/output/en1eo12Ic3 to
+track these changes (don't look at either en1eo12Ic1 or en1eo12Ic2):
+
+[no] 1) I renamed combine_realization.sh as combine_realizations.sh (note the "s"
+before .sh) since this is what the script expects.
+
+[x] fixed in README: 2) what is referred to as runall.combine_realizations.sh in the README (see in
+6.) is actually runall.sh so I changed that.
+
+3) On real.todo I added the full path of the input files and not only the name
+ of the realization. (what in the README is Climo_001 needs to be
+ /intrepid-fs0/users/lzamboni/persistent/Climo_001/history instead; what
+ changes between different realizations is only the dirname between
+ persistent and history)
+
+[x] used combine.chk.sh version, renamed it to combine.sh: 4) what is referred to as combine.sh is actually combine.chk.sh
+
+[x] .chk version uses $USER: 5) on this combine.chk.sh I changed "wilde" to "lzamboni"
+
+6) this is the part that needs adjustments: in runall.config I changed outdir
+(which is where the output goes)
+
+if you look at the dir I pointed you to, you'll see that in that case
+outdir=/intrepid-fs0/users/lzamboni/persistent/combined/en1eo12Ic3 which
+creates "en1eo12Ic3" under /intrepid-fs0/users/lzamboni/persistent/combined/
+
+as is, I cannot process 2 realizations (say en1eo12Ic4 and en1eo14Ic1) and
+have the output in separate directories.
+
+At the moment I am running runall.sh to process 2 realizations (en1eo12Ic4 and
+en1eo14Ic1). I set outdir=/intrepid-fs0/users/lzamboni/persistent/combined/ in
+runall.config
+
+Even if it completes without problems, I would like to be able to specify just
+the name of the realization (real_name) and obtain the output in
+/intrepid-fs0/users/lzamboni/persistent/combined/real_name. The input data
+are always in /intrepid-fs0/users/lzamboni/persistent/real_name/history.
+