[petsc-dev] adding comm argument to PetscError() and friends

Barry Smith bsmith at mcs.anl.gov
Fri May 7 13:24:17 CDT 2010


On May 7, 2010, at 1:04 PM, Matthew Knepley wrote:

> On Thu, May 6, 2010 at 10:16 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
> 
>   I'd like to add an MPI_Comm as the first argument to PetscError() and friends.
> 
>   In this way, if the same error is known over all the communicator ranks it can print just one nice error message and stack instead of spewing out many of the same messages all over the place.
> 
>   Does anyone object to this?
> 
> I am just worried that it will introduce deadlocks. If an error occurs on only one process
> and not another (like a NaN), but we use the entire communicator, we can get deadlock
> on the error message which will be very confusing.

   The idea is that by default we would pass in PETSC_COMM_SELF. Only when we KNOW 100% that ALL ranks in a communicator WILL FOR ABSOLUTE sure generate the same error would we pass the entire comm to SETERRQ(), for example if the user has set an invalid PC type. So a process generating a NaN would use only PETSC_COMM_SELF in the SETERRQ().
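
   A minimal sketch of that call-site choice, written as plain C with no MPI or PETSc calls. Everything prefixed sketch_/SKETCH_ (the two-valued communicator enum, the error codes, the macro, the accepted type names) is a hypothetical stand-in for illustration, not the actual PETSc interface:

```c
#include <assert.h>
#include <math.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins; these are NOT PETSc or MPI types. */
typedef enum { SKETCH_COMM_SELF, SKETCH_COMM_WORLD } SketchComm;

/* Arbitrary nonzero error codes chosen for this sketch. */
enum { SKETCH_ERR_UNKNOWN_TYPE = 1, SKETCH_ERR_FP = 2 };

/* Records the communicator of the most recent error, for inspection. */
SketchComm sketch_last_comm = SKETCH_COMM_SELF;

/* Hypothetical error routine with the communicator as first argument. */
int sketch_error(SketchComm comm, int line, const char *func,
                 int code, const char *msg)
{
    assert(code != 0); /* a code of 0 would mean "no error" */
    sketch_last_comm = comm;
    fprintf(stderr, "[%s] %s() line %d: %s\n",
            comm == SKETCH_COMM_SELF ? "self" : "world", func, line, msg);
    return code;
}

/* Report the error and return its code from the calling function. */
#define SKETCH_SETERRQ(comm, code, msg) \
    return sketch_error((comm), __LINE__, __func__, (code), (msg))

/* Every rank sees the same bad option string, so the error is known
 * to occur on all ranks: pass the whole communicator. */
int sketch_set_pc_type(const char *type)
{
    if (strcmp(type, "jacobi") != 0 && strcmp(type, "ilu") != 0)
        SKETCH_SETERRQ(SKETCH_COMM_WORLD, SKETCH_ERR_UNKNOWN_TYPE,
                       "unknown PC type");
    return 0;
}

/* A NaN may show up on one rank only: report on the local rank alone. */
int sketch_check_value(double x)
{
    if (isnan(x))
        SKETCH_SETERRQ(SKETCH_COMM_SELF, SKETCH_ERR_FP, "NaN encountered");
    return 0;
}
```

   The point of the sketch is only that the communicator is chosen per call site: the whole comm where the error is guaranteed to be identical everywhere, the self comm otherwise.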

   You are right that there is a chance when totally bizarre shit happens that rank 1 of a comm calls SETERRQ() but rank 0 does not; then no appropriate error message will be printed. I don't see a way to totally avoid this chance. So we can

1) ignore this chance and make the change and see what happens
2) leave things the same as they are now.

   Even if it turns out we cannot have only rank 0 of the SETERRQ() print the message because bizarre shit happens too often, I think conceptually it is the right thing to do to pass in the MPI_Comm over which the error happens. So I'd like to make the change, and we can always take out the control over printing by rank 0 if it is a problem (i.e. the default error handlers could ignore the comm).
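
   The trade-off can be sketched as a single-process simulation of several ranks, again in plain C with no MPI calls. The names (DemoComm, demo_should_print, demo_count_messages, the honor_comm flag) are hypothetical, and the printing rule is an assumption about how a rank-0-only handler might behave, not the actual PETSc handler:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for a communicator; NOT an MPI type. */
typedef enum { DEMO_COMM_SELF, DEMO_COMM_WORLD } DemoComm;

/* Decide whether a rank that raised the error prints the message.
 * honor_comm == 0 models a handler that ignores the communicator
 * and lets every raising rank print. */
int demo_should_print(DemoComm comm, int rank, int honor_comm)
{
    if (!honor_comm || comm == DEMO_COMM_SELF) return 1;
    return rank == 0; /* whole comm: only rank 0 prints */
}

/* Simulate nranks ranks; raised[r] != 0 means rank r hit the error.
 * Returns how many messages get printed in total. */
int demo_count_messages(DemoComm comm, const int *raised, int nranks,
                        int honor_comm)
{
    int printed = 0;
    assert(nranks > 0);
    for (int r = 0; r < nranks; r++) {
        if (raised[r] && demo_should_print(comm, r, honor_comm)) {
            printf("rank %d: error message\n", r);
            printed++;
        }
    }
    return printed;
}
```

   With four simulated ranks that all raise the error, the whole comm yields one message and the self comm yields four. If only rank 1 raises on the whole comm, nothing is printed, which is the failure mode described above; with honor_comm set to 0 that same case prints one message again.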

   I'm going to try this and see if I can work out the kinks before pushing.

> Is there a nice MPI way of checking
> whether everyone is present, and if not then just use the current method?

   No, absolutely not, since that would require communicating with every rank, including ones that may not be there.

   Barry

> 
>    Matt
>  
>   It does mean that for each SETERRQXXX() we call we need to select the correct comm that is passed in. I will do all that; worst case, just use MPI_COMM_SELF for some and get the same effect as today.
> 
>   Barry
> -- 
> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
> -- Norbert Wiener
