problem with def_var

Robert Latham robl at mcs.anl.gov
Thu May 17 11:43:49 CDT 2007


On Thu, May 10, 2007 at 11:27:35AM -0600, Jim Edwards wrote:
> I'm getting a rather cryptic message on initialization that I think means
> that one of def_var, def_dim, or put_att has a conflict from one MPI task to
> the next:
> 
> NC definations on multiprocesses conflict.
> NOTE: Definitions across all process
> 
> The problem is that I get this message at enddef, so I can't tell
> which of the 50 or so calls prior to that is really the problem.  Is there
> any way to move this message closer to the point of origin?

Hi Jim
We don't have any way to do that now: processes don't communicate
their header information until enddef for performance reasons.  

Having a way to debug this sort of problem would be useful.  We could
have an optional debugging mode that did some sort of header
comparison after every def_var, def_dim, put_att, etc.  Since that
would be a collective call, processes would hang if they were doing
the wrong thing.  
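
To make the idea concrete, here is a rough sketch of what such a
comparison could look like.  It is not how the library works today:
def_var_fingerprint and fingerprints_agree are hypothetical names, and
the per-process values are passed in as a plain array so the sketch
runs without MPI.  In a real build each process would contribute its
own fingerprint and the min and max would come from a collective such
as MPI_Allreduce with MPI_MIN and MPI_MAX.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FNV_OFFSET 14695981039346656037ULL
#define FNV_PRIME  1099511628211ULL

/* 64-bit FNV-1a over a byte buffer, continuing from hash value h. */
uint64_t fnv1a(uint64_t h, const void *data, size_t len)
{
    const unsigned char *p = data;
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= FNV_PRIME;
    }
    return h;
}

/* Hypothetical helper: fingerprint the arguments of one def_var-style
 * call so that processes can compare a single integer. */
uint64_t def_var_fingerprint(const char *name, int type, int ndims,
                             const int *dimids)
{
    assert(name != NULL);
    uint64_t h = FNV_OFFSET;
    h = fnv1a(h, name, strlen(name));
    h = fnv1a(h, &type, sizeof type);
    h = fnv1a(h, &ndims, sizeof ndims);
    h = fnv1a(h, dimids, (size_t)ndims * sizeof *dimids);
    return h;
}

/* All processes made the same call iff the smallest and largest
 * fingerprints are equal.  With MPI, lo and hi would be the results
 * of two reductions instead of this loop over an array. */
int fingerprints_agree(const uint64_t *per_rank, size_t nranks)
{
    if (nranks == 0)
        return 1;
    uint64_t lo = per_rank[0], hi = per_rank[0];
    for (size_t r = 1; r < nranks; r++) {
        if (per_rank[r] < lo) lo = per_rank[r];
        if (per_rank[r] > hi) hi = per_rank[r];
    }
    return lo == hi;
}
```

A mismatch reported right after the offending call would name the
culprit directly, at the cost of one extra collective per definition.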

Since I'm not going to be able to implement this debugging mode for a
bit, another short-term approach would be to do a binary search
through the 50 calls, inserting enddef/redef calls after the first 25.
If the newly placed enddef doesn't trigger the message, then the bad
call is in the second half and you've narrowed your search by half.
Repeat until you find the bunk call.  Not the most attractive
solution, but it will probably get you on your way fastest.
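
The search itself can be sketched in a few lines of C.  This is only
an illustration: find_first_bad_call and the prefix_fails callback are
hypothetical names, and the callback stands in for "make the first N
definition calls, call enddef, and see whether the conflict message
shows up".  It assumes that once a prefix of the calls triggers the
message, every longer prefix does too.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: returns nonzero if running the first `count`
 * definition calls followed by enddef produces the conflict message. */
typedef int (*prefix_fails_fn)(size_t count, void *ctx);

/* Returns the 1-based index of the first call whose inclusion makes
 * enddef fail, or 0 if all `ncalls` calls pass. */
size_t find_first_bad_call(size_t ncalls, prefix_fails_fn prefix_fails,
                           void *ctx)
{
    size_t lo = 0;        /* a prefix of length lo is known to pass */
    size_t hi = ncalls;   /* a prefix of length hi is known to fail */

    if (ncalls == 0 || !prefix_fails(ncalls, ctx))
        return 0;

    while (hi - lo > 1) {
        size_t mid = lo + (hi - lo) / 2;
        if (prefix_fails(mid, ctx))
            hi = mid;     /* bad call is at or before mid */
        else
            lo = mid;     /* bad call is after mid */
    }
    return hi;
}

/* Stand-in for a real run: pretends call number *ctx is the bad one. */
int fake_prefix_fails(size_t count, void *ctx)
{
    size_t bad = *(const size_t *)ctx;
    assert(bad > 0);
    return count >= bad;
}
```

With 50 calls this takes about six trial runs after the initial one
that confirms the failure, rather than up to 50.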

==rob

-- 
Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Lab, IL USA                 B29D F333 664A 4280 315B



