problems writing vars with pnetcdf
Katie Antypas
kantypas at flash.uchicago.edu
Thu Dec 2 16:20:23 CST 2004
Hi All,
I'm not sure if this list gets much traffic, but here goes. I'm having a
problem writing out data in parallel in the particular case where a given
processor has zero elements to write.
Let me explain a little better. Take a very simple case: a one-dimensional
array that we want to write in parallel. We define a dimension, say
'dim_num_particles', and define a variable, say 'particles', with a unique
id. Each processor then writes its portion of the particles into the
'particles' variable with the correct starting position and count. As long
as each processor has at least one particle to write we have absolutely no
problems, but quite often in our code some processors have zero particles
for a given checkpoint file and thus have nothing to write. This is where
we hang.
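
To make the pattern concrete, here is a stripped-down C sketch of what we
do (the file name, data values, and the MPI_Exscan offset computation are
made up for illustration, not our actual code):

    #include <mpi.h>
    #include <pnetcdf.h>

    int main(int argc, char **argv)
    {
        int rank, nprocs;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

        /* particles owned by this rank (fake values; assumes < 64) */
        MPI_Offset my_count = rank + 1;
        MPI_Offset my_start = 0, total = 0;
        double buf[64];
        for (MPI_Offset i = 0; i < my_count; i++)
            buf[i] = (double)rank;

        /* prefix sum gives each rank its starting index */
        MPI_Exscan(&my_count, &my_start, 1, MPI_OFFSET, MPI_SUM,
                   MPI_COMM_WORLD);
        if (rank == 0) my_start = 0;  /* Exscan leaves rank 0 undefined */
        MPI_Allreduce(&my_count, &total, 1, MPI_OFFSET, MPI_SUM,
                      MPI_COMM_WORLD);

        /* define the dimension and variable (error checks omitted) */
        int ncid, dimid, varid;
        ncmpi_create(MPI_COMM_WORLD, "chk.nc", NC_CLOBBER,
                     MPI_INFO_NULL, &ncid);
        ncmpi_def_dim(ncid, "dim_num_particles", total, &dimid);
        ncmpi_def_var(ncid, "particles", NC_DOUBLE, 1, &dimid, &varid);
        ncmpi_enddef(ncid);

        /* collective write: every rank makes this call */
        ncmpi_put_vara_double_all(ncid, varid, &my_start, &my_count, buf);

        ncmpi_close(ncid);
        MPI_Finalize();
        return 0;
    }

In this toy example every rank has at least one particle, so everything
works; the trouble starts when some rank's my_count is zero.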
I've tried a couple of different hacks to get around this:
* The first was to write a zero-length array, with count = 0 and the
offset (starting point) = 'dim_num_particles', but that returned an
error from the put_vars calls (sketched after this list). Every other
offset I chose returned an error as well, which is understandable.
* The second thing I tried was to skip the write entirely if there were
zero particles on a proc, but that hung. After talking to some people
here, they thought this made sense too, because the procs would no
longer all be making the same call, a problem we've also seen hang HDF5.
I can also do a really ugly hack by increasing dim_num_particles to
leave extra room, so that a proc with zero particles can write out a
dummy value. The problem is that this messes up our offsets when we need
to read the checkpoint file back in.
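
For what it's worth, the two failed attempts look roughly like this (again
just a sketch; ncid, varid, dim_num_particles, and the my_* variables are
from the setup above):

    /* Hack 1: zero-count write from an empty rank.  The call is made,
       but whatever start we pass, the put returns an error. */
    MPI_Offset start = dim_num_particles;  /* also tried other offsets */
    MPI_Offset count = 0;
    int err = ncmpi_put_vara_double_all(ncid, varid, &start, &count, buf);
    /* err != NC_NOERR no matter what start we pick */

    /* Hack 2: skip the call entirely on empty ranks.  Since the call
       is collective, the ranks that do write wait forever -> hang. */
    if (my_count > 0)
        ncmpi_put_vara_double_all(ncid, varid, &my_start, &my_count, buf);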
Has anyone else seen this problem, or know of a fix for it?
Thanks,
Katie
____________________________
Katie Antypas
ASC Flash Center
University of Chicago
kantypas at flash.uchicago.edu