pnetcdf with CDO

Charlie Zender zender at uci.edu
Thu Feb 27 01:01:14 CST 2014


Hi Jialin,

Rob's comment about whole workflow parallelism is insightful.
Both CDO and NCO are command-oriented, so their workflows are scripts
of independent commands that often produce excess intermediate files.
Though no one asked, I'll tell you that we implemented workflow
parallelism in NCO via SWAMP. SWAMP parallelizes NCO workflows by
compiling a restricted subset of Bash commands and NCO into basic
blocks and optimizes their scheduling. This is documented in

Wang, D. L., C. S. Zender, and S. F. Jenks (2009), Efficient Clustered
Server-side Data Analysis Workflows using SWAMP, Earth Sci. Inform.,
2(3), 141-155, doi:10.1007/s12145-009-0021-z.
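
As a rough illustration of the scheduling idea (a minimal sketch, not
SWAMP's actual implementation), a script can be treated as a list of
commands with known input and output files, then grouped into stages
whose members have no file dependencies on one another. The workflow,
the file names, and the parallel_stages helper are all hypothetical,
and the sketch tracks only read-after-write dependencies:

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical workflow: (command line, input files, output files).
# The command strings are illustrative and are never executed here.
WORKFLOW = [
    ("ncra in1.nc avg1.nc", {"in1.nc"}, {"avg1.nc"}),
    ("ncra in2.nc avg2.nc", {"in2.nc"}, {"avg2.nc"}),
    ("ncbo avg1.nc avg2.nc diff.nc", {"avg1.nc", "avg2.nc"}, {"diff.nc"}),
]


def parallel_stages(workflow):
    """Group commands into stages that can each run concurrently.

    A command depends on the most recent earlier command that wrote
    one of its input files (read-after-write only; other hazards are
    ignored in this sketch).
    """
    last_writer = {}  # file name -> index of the command that last wrote it
    deps = {}         # command index -> indices it must wait for
    for idx, (_, inputs, outputs) in enumerate(workflow):
        deps[idx] = {last_writer[f] for f in inputs if f in last_writer}
        for f in outputs:
            last_writer[f] = idx

    sorter = TopologicalSorter(deps)
    sorter.prepare()
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # all commands runnable now
        stages.append([workflow[i][0] for i in ready])
        sorter.done(*ready)
    return stages


if __name__ == "__main__":
    for number, stage in enumerate(parallel_stages(WORKFLOW), start=1):
        print(f"stage {number}: {stage}")
```

In this toy workflow the two ncra commands fall into the first stage
and could run at the same time, while ncbo waits for both outputs.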

SWAMP's approach could work on most shell workflows, and is somewhat
independent of the parallelism (e.g., OpenMP or MPI) in the underlying
commands (NCO/CDO) themselves. Once again, detecting and avoiding
unnecessary intermediate writes is crucial to workflow parallelism.
SWAMP does not completely solve that problem, yet it reduces it enough
to work well with the client/server model and with multinode machines
locally.
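
To make "intermediate writes" concrete, here is a minimal sketch
(again hypothetical, not how SWAMP does it) that flags files a script
writes and later reads but that the user never asked for. Such files
are candidates for keeping in memory or on fast local storage rather
than writing to a shared filesystem. The workflow, the file names, and
the intermediate_files helper are assumptions made for illustration:

```python
# Hypothetical workflow: (command line, input files, output files).
# The command strings are illustrative and are never executed here.
WORKFLOW = [
    ("ncra in1.nc avg1.nc", {"in1.nc"}, {"avg1.nc"}),
    ("ncra in2.nc avg2.nc", {"in2.nc"}, {"avg2.nc"}),
    ("ncbo avg1.nc avg2.nc diff.nc", {"avg1.nc", "avg2.nc"}, {"diff.nc"}),
]


def intermediate_files(workflow, wanted):
    """Return files written by one command and read by a later one,
    excluding the files the user actually wants as final results."""
    written = set()
    reread = set()
    for _, inputs, outputs in workflow:
        # Only count reads of files produced earlier in the script.
        reread |= inputs & written
        written |= outputs
    return reread - set(wanted)


if __name__ == "__main__":
    print(sorted(intermediate_files(WORKFLOW, wanted={"diff.nc"})))
```

For the toy workflow, avg1.nc and avg2.nc are flagged as intermediate,
while the original inputs and diff.nc are not.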

If you are interested in adding PnetCDF support to NCO, in addition to
or instead of CDO, I would welcome your contribution and will be happy
to provide guidance. Contact me offline if so.

Good luck,
cz
-- 
Charlie Zender, Earth System Sci. & Computer Sci.
University of California, Irvine 949-891-2429 )'(

