[MOAB-dev] r3273 - in MOAB/trunk: . tools/dagmc

Fri Nov 6 08:48:22 CST 2009

Sorry for the radio silence; some clarifications below.

Jed Brown wrote:
> kraftche at cae.wisc.edu wrote:
>> Author: kraftche
>> Date: 2009-11-03 13:06:17 -0600 (Tue, 03 Nov 2009)
>> New Revision: 3273
>>
>> Modified:
>>    MOAB/trunk/FileOptions.cpp
>>    MOAB/trunk/FileOptions.hpp
>>    MOAB/trunk/MBCore.cpp
>>    MOAB/trunk/tools/dagmc/test_geom.cc
>> Log:
>> return error code if unrecognized options are passed to load_file or write_file
> 
> 
> As you probably know, this breaks a ton of parallel tests.
> 

Yeah, we knew it probably would.  We'd discussed sending a warning, but in the end that didn't happen.

Without this change, there was no way to know whether the options were getting processed correctly.  This led to 
difficulty knowing whether an option had been entered incorrectly or the code had misbehaved.  For example, because the requirements 
for delimiters in option strings are/were different between iMesh and MOAB, our iMesh implementation removed the first 
character of e.g. the "PARALLEL_PARTITION" option, resulting in MOAB seeing it as "ARALLEL_PARTITION" and never 
processing it (and the app never knowing that).  We've since modified option processing in MOAB to match iMesh, much as 
it pains us, but that's a separate issue.
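
To make the failure mode concrete, here is a minimal sketch of the idea behind the change.  It is not the actual 
FileOptions code: OptionList, load_file_sketch, the Status values, the ';' separator and the SOME_TAG value are all 
made up for illustration.  Each option is marked as seen when the reader asks for it, and anything left unmarked at 
the end is reported instead of silently ignored.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical status codes for this sketch (not MOAB's MBErrorCode values).
enum Status { STATUS_OK, STATUS_UNHANDLED_OPTION };

// Hypothetical option container: remembers which options were looked up.
class OptionList {
public:
    // Split "NAME=value;NAME2;..." on the separator character.
    explicit OptionList(const std::string& text, char sep = ';') {
        std::istringstream in(text);
        std::string item;
        while (std::getline(in, item, sep)) {
            if (item.empty()) continue;
            const std::string::size_type eq = item.find('=');
            Entry e;
            e.name = item.substr(0, eq);
            e.value = (eq == std::string::npos) ? std::string() : item.substr(eq + 1);
            e.seen = false;
            entries_.push_back(e);
        }
    }

    // Look up an option by name; a hit marks it as recognized.
    bool get(const std::string& name, std::string& value) {
        for (Entry& e : entries_) {
            if (e.name == name) { e.seen = true; value = e.value; return true; }
        }
        return false;
    }

    // Report the first option that nobody asked for, if there is one.
    Status check_all_seen(std::string& offender) const {
        for (const Entry& e : entries_) {
            if (!e.seen) { offender = e.name; return STATUS_UNHANDLED_OPTION; }
        }
        return STATUS_OK;
    }

private:
    struct Entry { std::string name, value; bool seen; };
    std::vector<Entry> entries_;
};

// Stand-in for a reader: consume the options it knows, then fail on leftovers.
Status load_file_sketch(const std::string& option_text,
                        std::string& partition, std::string& offender) {
    OptionList opts(option_text);
    opts.get("PARALLEL_PARTITION", partition);  // optional, so the result is ignored
    return opts.check_all_seen(offender);
}
```

With this in place, passing "ARALLEL_PARTITION=SOME_TAG" comes back as an unhandled option naming the offending 
string, so the application finds out right away that its first character was eaten somewhere upstream.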

> 
> When checking that it wasn't just a local problem, it occurred to me
> that MOAB is obnoxiously difficult to debug.  This is largely due to the
> fact that there is no obvious point to attach a debugger to see when an
> error first occurs.  Instead, user code gets a helpful MB_FAILURE, and
> *my* error handler lets me easily have the debugger at this point.  But
> finding the first place that the error occurs is really slow because I
> have to deal with it one stack frame at a time, either by stepping or
> bisecting at each frame.  (GDB's new "reverse-debugging" can help, but
> it's finicky.)  In contrast, when PETSc reports an error, I get a full
> stack trace (without the debugger) and I can immediately have a debugger
> on the line where the error first occurred (literally 1-5 seconds from
> "look, there's an error" to a debugger on the relevant line, even in
> parallel).
> 
> Having MOAB print its own stack trace would take significant work, but
> allowing a user-defined error handler is simple (it would touch a lot of
> code, but not in an interesting way) and would help a lot.  The main
> difficulty is that MBErrorCode is also used to return non-error
> conditions, therefore the error handler cannot just dump core or attach
> debuggers, it needs to first decide whether the error was intentional.
> 
> Jed
> 

We've very recently talked about how to do different types of error handling.  This would also allow us to turn off 
output, e.g. for a very large parallel run.  Shoring up the parallel support and scalability is a bigger priority at 
the moment, but I think better error handling will go in shortly afterwards.
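
For the sake of discussion, a user-defined handler along the lines Jed describes might look roughly like the sketch 
below.  Nothing here exists in MOAB today: ErrorCode, set_error_handler, raise_error, the RAISE macro and the two 
example functions are hypothetical names.  The one design point it tries to show is that the hook fires at the place 
the error is first raised, and that codes used for ordinary non-error results (NOT_FOUND here) bypass the handler.

```cpp
#include <vector>

// Hypothetical codes for this sketch; the real MBErrorCode has more values.
enum ErrorCode { SUCCESS = 0, NOT_FOUND, FAILURE };

// User-supplied callback, invoked where an error is first raised.
typedef void (*ErrorHandler)(ErrorCode code, const char* file, int line, void* context);

namespace detail {
// Function-local statics hold the currently installed handler and its context.
inline ErrorHandler& handler() { static ErrorHandler h = nullptr; return h; }
inline void*& context() { static void* c = nullptr; return c; }
}

// Install a handler; passing nullptr turns handling (and any output) off.
inline void set_error_handler(ErrorHandler h, void* ctx) {
    detail::handler() = h;
    detail::context() = ctx;
}

// Call the handler for real errors only, then hand the code back to the caller.
inline ErrorCode raise_error(ErrorCode code, const char* file, int line) {
    const bool is_real_error = (code != SUCCESS && code != NOT_FOUND);
    if (is_real_error && detail::handler())
        detail::handler()(code, file, line, detail::context());
    return code;
}

// Used at the point of failure so the handler sees the original file and line.
#define RAISE(code) return raise_error((code), __FILE__, __LINE__)

// Example library functions (hypothetical).
ErrorCode find_tag(bool exists) {
    if (!exists) RAISE(NOT_FOUND);  // expected outcome, handler is not called
    return SUCCESS;
}

ErrorCode parse_options(bool valid) {
    if (!valid) RAISE(FAILURE);     // real error, handler is called here
    return SUCCESS;
}

// Example handler: records codes; a real one might abort or attach a debugger.
void record_error(ErrorCode code, const char*, int, void* context) {
    static_cast<std::vector<ErrorCode>*>(context)->push_back(code);
}
```

An application would call set_error_handler(record_error, &log) with its own std::vector<ErrorCode> log, or install a 
handler that aborts so a debugger stops on the originating line.  Installing no handler gives the quiet behavior we 
would want for a very large parallel run.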

- tim

-- 
================================================================
"You will keep in perfect peace him whose mind is
   steadfast, because he trusts in you."               Isaiah 26:3

              Tim Tautges            Argonne National Laboratory
          (tautges at mcs.anl.gov)      (telecommuting from UW-Madison)
          phone: (608) 263-8485      1500 Engineering Dr.
            fax: (608) 263-4499      Madison, WI 53706