[petsc-dev] errors galore related to Barry's change to PetscError

Sat May 8 17:42:33 CDT 2010

On May 8, 2010, at 4:50 PM, Jed Brown wrote:

> On Sat, 8 May 2010 16:18:34 -0500, Barry Smith <bsmith at mcs.anl.gov> wrote:
>> 
>>  Sorry, and Jed thanks for fixing.
>> 
>>  Why there needs to be separate C++ functions indicates to me a
>>  problem with the design. We should fix up the C one so that a C++
>>  one is not needed that is separate.
> 
> It writes to a std::ostringstream.  I don't see that as a great
> usability advantage, but it is different behavior.  Also, I'm unclear
> about what PetscTraceBackErrorHandler(Cxx) is supposed to do when it is
> done.  As currently written, root writes the message and everyone else
> waits 10 seconds, then abort()s.  Seems to me that non-root should wait,
> then everyone should call MPI_Abort().  I don't know when it would be
> acceptable to call abort() in parallel.

    Now that a comm is passed into the error handler, how we use it is still preliminary and a work in progress. It will likely evolve as we figure out what to do.

The reason I don't have non-root call MPI_Abort() (even after waiting) is that MPI_Abort() will trigger all the other processes to abort. If the root is "late" getting to the error, it will receive the abort triggered by the non-root MPI_Abort() and never execute the traceback, hence no error message; bad news. At least I think this might happen.

An alternative to what I have done is to have non-root wait a while and then return with the usual traceback. Thus, under normal circumstances it will receive the abort from root before printing the traceback, so we will get one nice traceback from root. (Will it?) Under strange circumstances where root for some reason doesn't get to the error, we will get the current behavior where everyone else prints the traceback, so we do get a useful error message (not perfect, because there are several error messages, but much better than no messages).
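
To make the two cases above concrete, here is a minimal sketch of the decision logic only. It is a model, not PETSc or MPI code: the names Outcome and error_outcome are hypothetical, and the flag root_aborts_during_wait stands in for something a real handler cannot know in advance (whether root reaches the error before the non-root wait expires).

```cpp
#include <cassert>

// Hypothetical model of the proposed policy; none of these names are PETSc or MPI API.
enum class Outcome {
    RootPrintsAndAborts,   // rank 0: print the traceback, then abort the job
    KilledBeforePrinting,  // non-root: root's abort arrived during the wait
    PrintsAndReturns       // non-root: wait expired, fall back to its own traceback
};

// rank: this process's rank in the comm passed to the error handler.
// root_aborts_during_wait: assumption for the sketch; true if rank 0 reaches
// the error and aborts before this process has finished waiting.
inline Outcome error_outcome(int rank, bool root_aborts_during_wait) {
    assert(rank >= 0);
    if (rank == 0) return Outcome::RootPrintsAndAborts;
    return root_aborts_during_wait ? Outcome::KilledBeforePrinting
                                   : Outcome::PrintsAndReturns;
}
```

In the normal case every non-root rank ends in KilledBeforePrinting and only root's traceback appears; if root never gets to the error, the non-root ranks end in PrintsAndReturns and each prints its own traceback.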

  I'll leave the rest of your comments to another time when I can refresh myself as to why we have these obtuse and badly documented cases. (They are clearly badly documented if you couldn't find cogent explanations for them).

   Barry

> 
> As for C++, why does C support need to be optional?  In other words,
> what value is there in not defining PETSC_USE_EXTERN_CXX?  Seems to only
> make error messages more obtuse and create a different ABI for no real
> benefit.  As far as I know, there is nothing wrong with having mangled
> and unmangled symbols with the same name, as in
> 
>  extern "C" PetscErrorCode VecNorm(Vec,NormType,PetscReal*);
>  #if defined __cplusplus
>  static inline PetscErrorCode VecNorm(Vec x,PetscReal *nrm) {return VecNorm(x,NORM_2,nrm);}
>  #endif
> 
> 
> And why are things like petsclog.hh wrapped in
> 
>  #if defined(PETSC_CLANGUAGE_CXX) && !defined(PETSC_USE_EXTERN_CXX)
> 
> instead of
> 
>  #if defined(PETSC_CLANGUAGE_CXX) && defined(__cplusplus)
> 
> 
> Jed
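
As a self-contained illustration of the quoted point about mangled and unmangled symbols sharing a name, the sketch below defines one function with C linkage and a C++-only overload that forwards to it. DemoNormType and DemoVecNorm are hypothetical stand-ins chosen so the example compiles without PETSc; they are not the PETSc declarations.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-ins for NormType and VecNorm; not the PETSc declarations.
enum DemoNormType { DEMO_NORM_1, DEMO_NORM_2 };

// Unmangled symbol: C linkage, so it could be called from C as well as C++.
extern "C" int DemoVecNorm(const double *x, int n, DemoNormType type, double *nrm) {
    assert(n >= 0);
    double acc = 0.0;
    for (int i = 0; i < n; ++i)
        acc += (type == DEMO_NORM_1) ? std::fabs(x[i]) : x[i] * x[i];
    *nrm = (type == DEMO_NORM_1) ? acc : std::sqrt(acc);
    return 0;  // 0 means success in this sketch
}

// C++-only convenience overload (mangled symbol) that defaults to the 2-norm.
// nrm is already a pointer, so it is forwarded as-is.
static inline int DemoVecNorm(const double *x, int n, double *nrm) {
    return DemoVecNorm(x, n, DEMO_NORM_2, nrm);
}
```

C++ allows at most one function of a given name to have C linkage, but any number of additional overloads with C++ linkage, which is why the two definitions coexist here.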