warn message in log summary

Matthew Knepley knepley at gmail.com
Thu Dec 6 21:46:12 CST 2007


On Dec 6, 2007 9:00 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>
>     When we started PETSc 2 I thought of PETSc errors as ALWAYS being
> catastrophic: that is, the program could NOT continue running.
>
>    Later we started to play with possibly recovering from some errors
> and I added the crude PetscException mechanism. For example
> I used it to allow the user in their SNES FormFunction to indicate
> that the input vector was not in the domain; SNES would catch this
> and allow the program to continue.
>
>    I am very nervous about mixing a catastrophic error handling system
> WITH an exception system. I'd like to go back to the model:
> "once SETERRQ() is called ANYWHERE there is no possibility of
> continuing the program." This means that all "exceptions" have to
> be handled on a case-by-case basis directly in the code. For
> example I just added SNESSetFunctionDomainError() to
> replace the previous use of SETERRQ(PETSC_ERR_ARG_DOMAIN).
> The custom code that "handles" each such case is then required to
> clean up properly, e.g. by calling PetscLogEventEnd().
>
>    Comments?

I guess I have the opposite opinion. I think it is inevitable that PETSc
is rewritten at some point in the future. At that point, we would replace
the current, imperfect exception system with a better one. This way we
can preserve a good design. If we go the other way, all that code will
have to be rethought instead of just rewritten.

  Matt
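
For concreteness, the case-by-case model Barry describes might look
roughly like the sketch below. It is plain C with no PETSc calls, and
every name in it (Solver, solver_set_domain_error, form_function,
evaluate_with_backtracking) is hypothetical. The step-halving response
is only an assumed illustration of "handling directly in the code", not
what SNES actually does. The point is that a nonzero return code stays
fatal, while the recoverable condition travels through a flag.

```c
#include <stdbool.h>

/* Hypothetical solver context; not a PETSc type. */
typedef struct {
  bool domain_error; /* set by the user callback, cleared by the solver */
} Solver;

/* Hypothetical analogue of a "set domain error" call: record the
   condition on the solver instead of raising an error code. */
static void solver_set_domain_error(Solver *s) { s->domain_error = true; }

/* User residual f(x) = 1/x - 1, defined here only for x > 0.
   A return value of 0 means "no fatal error". */
static int form_function(Solver *s, double x, double *f)
{
  if (x <= 0.0) { solver_set_domain_error(s); *f = 0.0; return 0; }
  *f = 1.0 / x - 1.0;
  return 0;
}

/* Solver side: a nonzero return code is fatal and propagates at once;
   the domain flag is handled locally by halving the step. */
static int evaluate_with_backtracking(Solver *s, double x, double step,
                                      double *x_new, double *f)
{
  int tries;
  for (tries = 0; tries < 30; tries++) {
    int ierr;
    s->domain_error = false;
    ierr = form_function(s, x + step, f);
    if (ierr) return ierr;                      /* fatal: no recovery */
    if (!s->domain_error) { *x_new = x + step; return 0; }
    step *= 0.5;                                /* recoverable: retry */
  }
  return 1; /* no admissible point found: treated as fatal */
}
```

Starting from x = 1 with a trial step of -4, the first three trial
points fall outside the domain and the fourth, x = 0.5, is accepted.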

>    Barry
>
>
>
>
>
> On Nov 26, 2007, at 4:19 PM, Lisandro Dalcin wrote:
>
> > On 11/26/07, Barry Smith <bsmith at mcs.anl.gov> wrote:
> >>   I've looked long and hard for a PETSc bug that would cause this
> >> problem.
> >> No luck. It seems to happen mostly (only?) on certain machines.
> >
> > Oops! I've found a possible source of the problem, at least in my
> > case! Those negative times I got, of order -1e9, in fact originated
> > from premature returns due to CHKERRQ macros.
> >
> > As I was doing Python unit testing, I was making calls that generate
> > errors and catching the exceptions, in order to check that the error
> > was correctly set.
> >
> > However, this way of using PETSc is not safe at all: in general PETSc
> > does not always recover correctly after an error, and this seems to be
> > especially true for the logging machinery.
> >
> > After browsing the code and hacking PetscLogPrintSummary(), I added a
> > check (eventInfo[event].depth == 0) in order to skip reductions of
> > time values for 'unterminated' events. This worked as expected: the
> > event info did not show up and the warning was not generated...
> >
> > Could this be a possible 'fix' for this issue??
> >
> > Richard... Are you completely sure the negative timings you were
> > getting are not related to an error being silenced because of a
> > missing CHKERRQ macro???
>
> >
> >
> >
> >> On Nov 26, 2007, at 11:03 AM, Lisandro Dalcin wrote:
> >>
> >>> I even get consistent time deltas using 'gettimeofday' on my box!!
> >>> Perhaps PETSc has some bug somewhere?? What do you think??
> >>>
> >>> On 11/26/07, Richard Tran Mills <rmills at ornl.gov> wrote:
> >>>> Lisandro,
> >>>>
> >>>> Unfortunately, I see the same negative timings problem on the Cray
> >>>> XT3/4
> >>>> systems when I configure PETSc to use MPI_Wtime() for all its
> >>>> timings.  So
> >>>> that doesn't necessarily fix anything...
> >>>>
> >>>> --Richard
> >>>>
> >>>> Lisandro Dalcin wrote:
> >>>>
> >>>>> Perhaps PETSc should use MPI_Wtime as the default timer. If a
> >>>>> better one
> >>>>> is available, then use it. But then MPIUNI also has to provide a
> >>>>> useful default implementation.
> >>>>>
> >>>>> Running a simple test like this (MPICH2):
> >>>>>
> >>>>> #include <stdio.h>
> >>>>> #include <mpi.h>
> >>>>>
> >>>>> int main(void)
> >>>>> {
> >>>>>   int i;
> >>>>>   double t0[100],t1[100];
> >>>>>   MPI_Init(NULL,NULL);
> >>>>>   /* take back-to-back timestamps */
> >>>>>   for (i=0; i<100; i++) {
> >>>>>     t0[i] = MPI_Wtime();
> >>>>>     t1[i] = MPI_Wtime();
> >>>>>   }
> >>>>>   /* print each pair and its difference */
> >>>>>   for (i=0; i<100; i++) {
> >>>>>     printf("t0=%e, t1=%e, dt=%e\n",t0[i],t1[i],t1[i]-t0[i]);
> >>>>>   }
> >>>>>   MPI_Finalize();
> >>>>>   return 0;
> >>>>> }
> >>>>>
> >>>>> on the SAME box where I get the PETSc warning, it consistently
> >>>>> gives me
> >>>>> positive time deltas of the order of MPI_Wtick()...
> >>>>
> >>>>
> >>>
> >>>
> >>> --
> >>> Lisandro Dalcín
> >>> ---------------
> >>> Centro Internacional de Métodos Computacionales en Ingeniería
> >>> (CIMEC)
> >>> Instituto de Desarrollo Tecnológico para la Industria Química
> >>> (INTEC)
> >>> Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
> >>> PTLC - Güemes 3450, (3000) Santa Fe, Argentina
> >>> Tel/Fax: +54-(0)342-451.1594
> >>>
> >>
> >>
> >
> >
> >
>
>
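
The mechanism Lisandro reports above can be illustrated with a small
stand-alone sketch. It is not PETSc code: EventInfo, event_begin,
event_end, guarded_work and event_is_reportable are hypothetical names,
and the sketch assumes the logger accumulates elapsed time by
subtracting the clock reading at "begin" and adding it back at "end".
Under that assumption, an early return that skips the "end" call leaves
the accumulator at minus the start time, which would be of order -1e9
if the clock counts seconds since the Unix epoch.

```c
/* Hypothetical stand-in for one logged event; field names are
   illustrative only. */
typedef struct {
  int    depth; /* begin calls not yet matched by an end call */
  double time;  /* accumulated elapsed time in seconds */
} EventInfo;

/* Assumed accumulation scheme: subtract the clock at begin, add it
   back at end, so a balanced pair contributes (end - begin). */
static void event_begin(EventInfo *e, double now) { e->depth++; e->time -= now; }
static void event_end(EventInfo *e, double now)   { e->depth--; e->time += now; }

/* Mimics a routine that bails out through a CHKERRQ-style early
   return between the begin and end calls. */
static int guarded_work(EventInfo *e, double start, double stop, int fail)
{
  event_begin(e, start);
  if (fail) return 1;  /* early return: event_end() is never reached */
  event_end(e, stop);
  return 0;
}

/* Summary-time guard in the spirit of the depth == 0 check: report
   only events whose begin/end calls are balanced. */
static int event_is_reportable(const EventInfo *e) { return e->depth == 0; }
```

With a start time of 1.0e9 and a stop time half a second later, the
balanced event ends with time 0.5 and depth 0, while the interrupted
one is left with time -1.0e9 and depth 1, so the guard skips it.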



-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener



