[petsc-users] SNESSetConvergenceTest

Dharmendar Reddy dharmareddy84 at gmail.com
Thu Jan 23 00:47:37 CST 2014


Hello,
        I think, i found the cause of the crashes. It is due to _cb
defined in zsnesf.c and a particular usage pattern.

Here is how it is:
The code has snes object wrapped inside a fortran data type (Solver_t)

Say i have two solver objects:
type(solver_t) :: solver1
type(solver_t) :: solver2

I first use solver1 to solve an equation where i do not use DM

In the snes in Solver1 , i set function first and then jacobian.

Now, ths static struct _cb in zsnsef has:
_cb  = {function = 1000, test = 0, destroy = 0, jacobian = 1001,
monitor = 0, mondestroy = 0, gs = 0}


Now i call the second solver with DM, where is set the function and
jacobain via DMSetFxn/Jac

Then i call snessetconvergenceset

_cb not takes the value :

_cb  = {function = 1000, test = 1002, destroy = 0, jacobian = 1001,
monitor = 0, mondestroy = 0, gs = 0}

but the snes hdr has num_fortrancallback[0] = 1

leading to an error: at line  262 in src/sys/objects/inherit.c
  if (PetscUnlikely(cid >=
PETSC_SMALLEST_FORTRAN_CALLBACK+obj->num_fortrancallback[cbtype]))
SETERRQ(obj->comm,PETSC_ERR_ARG_CORRUPT,"Fortran callback not set on
this object");


There is another scenario where my code hangs, i noted it as case (2)
[line 968 in solver.F90] in the comments of the attached test case.

i have attahced the logs for each of the three cases considered.

why cannot _cb reside inside the snes object  ?

Thanks
Reddy

On Wed, Jan 22, 2014 at 8:20 PM, Dharmendar Reddy
<dharmareddy84 at gmail.com> wrote:
> Hello,
>          I will try to f-blas-pack. later today.
>
> I did some more digging of the code:
>
> When i look at the snes object...I find the follwoing:  I can see that
> the ctx and residual and jacobian functions are set right. Where as
> the fortran callbacks in the snes seem to be wrong..even though snes
> says it has one fortran callback..am i reading this right ?
>
> (idb) print (*snes).dm.dmsnes.fortrancallback[1][0]
> $95 = {func = 0x7e9e7e <functionresidualwithdm>, ctx = 0x2b591f8}
> (idb) print (*snes).dm.dmsnes.fortrancallback[1][1]
> $96 = {func = 0x7e9bb8 <functionjacobianwithdm>, ctx = 0x2b591f8}
> (idb) print (*snes).hdr
> $97 = {classid = 1211230, bops = 0x53bc4d0, comm = -2080374784, type =
> 0, flops = 0, time = 0, mem = 1340, memchild
> ren = 0, id = 67, refct = 1, tag = 1073741734, qlist = 0x0, olist =
> 0x0, class_name = 0x2b592d32b9fc "SNES", descri
> ption = 0x2b592d32ba04 "Nonlinear solver", mansec = 0x2b592d32b9fc
> "SNES", type_name = 0x5416040 "newtonls", parent
>  = 0x0, parentid = 0, name = 0x0, prefix = 0x54146b0 "ddm_", tablevel
> = 0, cpp = 0x0, state = 0, int_idmax = 0, int
> star_idmax = 0, intcomposedstate = 0x0, intstarcomposedstate = 0x0,
> intcomposeddata = 0x0, intstarcomposeddata = 0x
> 0, real_idmax = 0, realstar_idmax = 0, realcomposedstate = 0x0,
> realstarcomposedstate = 0x0, realcomposeddata = 0x0
> , realstarcomposeddata = 0x0, scalar_idmax = 0, scalarstar_idmax = 0,
> scalarcomposedstate = 0x0, scalarstarcomposed
> state = 0x0, scalarcomposeddata = 0x0, scalarstarcomposeddata = 0x0,
> fortran_func_pointers = 0x0, num_fortran_func_
> pointers = 0, *fortrancallback = {0x54cd5e0, 0x0}, num_fortrancallback
> = {1, 0}, python_context = 0x0, python_destr
> oy = 0x0, noptionhandler = 0, (*optionhandler)(PetscObject, void *) =
> {0x0, 0x0, 0x0, 0x0, 0x0}, (*optiondestroy)(P
> etscObject, void *) = {0x0, 0x0, 0x0, 0x0, 0x0}, *optionctx = {0x0,
> 0x0, 0x0, 0x0, 0x0}, precision = PETSC_PRECISIO
> N_DOUBLE, optionsprinted = PETSC_TRUE}
> (idb) print (*snes).hdr.fortrancallback[0][0]
> $98 = {func = 0x0, ctx = 0x0}
>
> The call which sets the convergence test function is : You can see
> that the cctx object has same address as the context set in
> (*snes).dm.dmsnes.fortrancallback
>
> Breakpoint 1, SOLVERUTILS_M::solve_solver (this=0x2b591f8,
> ierr=-1070882422) at /home1/00924/Reddy135/projects/ut
> gds/src/solver/Solver.F90:348
> 348            call SNESSetConvergenceTest(this%snes,
> SNESConvergenceTest_snes, this,PETSC_NULL_FUNCTION, ierr);
> CHKERRQ(ierr)
> (idb) c
> Continuing.
>
> Breakpoint 4, snessetconvergencetest_ (snes=0x2b59210, func=0x7ea300
> <snesconvergencetest_snes>, cctx=0x2b591f8,
> destroy=0x0, ierr=0x7fffcd0c04f4) at
> /home1/00924/Reddy135/LocalApps/petsc/src/snes/interface/ftn-custom/zsnesf.c
> :254
>
>
> The stack trace is..
>
> (idb) bt
> #0  0x00002b592ca05849 in SNESSolve_NEWTONLS (snes=0x53bba60) at
> /home1/00924/Reddy135/LocalApps/petsc/src/snes/imp
> ls/ls/ls.c:197
> #1  0x00002b592caaf980 in SNESSolve (snes=0x53bba60, b=0x0,
> x=0x53ddc00) at /home1/00924/Reddy135/LocalApps/petsc/s
> rc/snes/interface/snes.c:3809
> #2  0x00002b592cae975b in snessolve_ (snes=0x2b59210, b=0xe2f780,
> x=0x2b59258, __ierr=0x7fffcd0c04f4) at /home1/009
> 24/Reddy135/LocalApps/petsc/src/snes/interface/ftn-custom/zsnesf.c:171
> #3  0x00000000007ea68f in SOLVERUTILS_M::solve_solver (this=0x2b591f8,
> ierr=0) at /home1/00924/Reddy135/projects/ut
> gds/src/solver/Solver.F90:350
> #4  0x00000000005b99d9 in FEMLIB_M::solve2_vp (this=0x2b58d60,
> solinit={...}) at /home1/00924/Reddy135/projects/utg
> ds/src/fem/VariationalProblemBoundProcedures.F90:793
> #5  0x0000000000430277 in UTGDS_M::runsimulation_ddm
> (this=0x7fffcd0c11f0) at /home1/00924/Reddy135/projects/utgds/
> src/devicesimulator/runDDM.F90:39
> #6  0x000000000042623e in UTGDS_M::runsimulation_device
> (this=0x7fffcd0c11f0) at /home1/00924/Reddy135/projects/utg
> ds/src/devicesimulator/UTGDSmod.F90:267
> #7  0x000000000040840f in oneddgfetpoissonschrod () at
> /home1/00924/Reddy135/projects/utgds/test/testDDM/RTDPoisTes
> t.F90:13
> #8  0x0000000000407e4c in main () in
> /home1/00924/Reddy135/projects/utgds/test/testDDM/PoisTest
> #9  0x000000390881ecdd in __libc_start_main () in /lib64/libc-2.12.so
>
> Thanks
> Reddy
>
> On Wed, Jan 22, 2014 at 5:45 PM, Matthew Knepley <knepley at gmail.com> wrote:
>> On Wed, Jan 22, 2014 at 5:37 PM, Dharmendar Reddy <dharmareddy84 at gmail.com>
>> wrote:
>>>
>>> On Wed, Jan 22, 2014 at 5:33 PM, Matthew Knepley <knepley at gmail.com>
>>> wrote:
>>> >
>>> > On Wed, Jan 22, 2014 at 4:50 PM, Dharmendar Reddy
>>> > <dharmareddy84 at gmail.com>
>>> > wrote:
>>> >>
>>> >> I take back my earlier email...
>>> >
>>> >
>>> > Do you have the threadcomm stuff turned on?
>>>
>>> I do not know what this means. Is related to compile time flags for my
>>> fortran code ? I link to mkl which is threaded. But otherwise, i
>>> guess, the answer is no.
>>
>>
>> The weird business looks like its in pthreads, so could you reconfigure
>> without MKL?
>>
>>   $PETSC_ARCH/conf/reconfigure-$PETSC_ARCH.py --download-f-blas-lapack
>> --PETSC_ARCH=arch-testing
>>
>> It would not surprise me if this was the problem.
>>
>>    Matt
>>
>>>
>>>
>>> >
>>> >    Matt
>>> >
>>> >>
>>> >> I see the error  irrespective of valgrind options. I did a clean
>>> >> compile to test it few times now.
>>> >> See the attached valgrind.log
>>> >>
>>> >> Now the difference between, ex5f90.F and my code is that i use
>>> >> DMSNESSetFunctionLocal in the case where the code fails.
>>> >>
>>> >> I have also attached the solver.F90 which wraps the calls to snes, Now
>>> >> the solver has to interface (1) for equations which are handled with a
>>> >> DM and (2) for equation with out DM , here i use SNESSetFunction, and
>>> >> this part of the code runs fine with the test case in the code which
>>> >> can be compiled with -DTestSolver and standard petsc makefile
>>> >>
>>> >> The case with DM is linked deeply into my code, i am trying to get to
>>> >> a test case for that.
>>> >>
>>> >> thanks
>>> >> Reddy
>>> >>
>>> >> On Wed, Jan 22, 2014 at 4:30 PM, Barry Smith <bsmith at mcs.anl.gov>
>>> >> wrote:
>>> >> >
>>> >> >    The thing to use in valgrind is -q --tool=memcheck
>>> >> >
>>> >> >     This checks for corrupted and illegally accessed memory,
>>> >> > uninitialized memory etc  It shows the exact lines that memory is
>>> >> > used
>>> >> > incorrectly so you can fix your code.
>>> >> >
>>> >> >    Barry
>>> >> >
>>> >> > On Jan 22, 2014, at 4:02 PM, Dharmendar Reddy
>>> >> > <dharmareddy84 at gmail.com>
>>> >> > wrote:
>>> >> >
>>> >> >> Hello,
>>> >> >>        The code works fine if i run with valgrind --leak-check=yes
>>> >> >> exename args
>>> >> >> it fails if i run  valgrind exename args
>>> >> >>
>>> >> >> Any suggestions on what to look for my code?
>>> >> >>
>>> >> >> Thanks
>>> >> >> Reddy
>>> >> >>
>>> >> >> On Wed, Jan 22, 2014 at 3:31 PM, Dharmendar Reddy
>>> >> >> <dharmareddy84 at gmail.com> wrote:
>>> >> >>> Hello,
>>> >> >>>        I tried using the SNESSetConvergenceTest in ex5f90.F using
>>> >> >>> my
>>> >> >>> subroutine. Works as expected. I will try to run my code through
>>> >> >>> valgrind.
>>> >> >>>
>>> >> >>> Thanks
>>> >> >>> Reddy
>>> >> >>>
>>> >> >>> On Wed, Jan 22, 2014 at 2:33 PM, Dharmendar Reddy
>>> >> >>> <dharmareddy84 at gmail.com> wrote:
>>> >> >>>> Hello,
>>> >> >>>>         I will also try to create a stand alone test case.
>>> >> >>>>
>>> >> >>>> Thanks
>>> >> >>>> Reddy
>>> >> >>>>
>>> >> >>>> On Wed, Jan 22, 2014 at 1:53 PM, Matthew Knepley
>>> >> >>>> <knepley at gmail.com>
>>> >> >>>> wrote:
>>> >> >>>>> On Wed, Jan 22, 2014 at 1:52 PM, Barry Smith <bsmith at mcs.anl.gov>
>>> >> >>>>> wrote:
>>> >> >>>>>>
>>> >> >>>>>>
>>> >> >>>>>>  We need more information to track this down. Could you make a
>>> >> >>>>>> simple
>>> >> >>>>>> standalone code that reproduces the problem? Just creates a
>>> >> >>>>>> this,
>>> >> >>>>>> creates a
>>> >> >>>>>> SNES, sets the function, sets the test code and then crashes
>>> >> >>>>>> when
>>> >> >>>>>> runs.
>>> >> >>>>>>
>>> >> >>>>>>   But since it is not crashing in the SNESConvergenceTest_snes
>>> >> >>>>>> routine I
>>> >> >>>>>> suspect it may be memory corruption. Have you run your code
>>> >> >>>>>> under
>>> >> >>>>>> valgrind?
>>> >> >>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
>>> >> >>>>>
>>> >> >>>>>
>>> >> >>>>> I am putting his convergence test into ex5f right now. This can
>>> >> >>>>> be
>>> >> >>>>> valgrinded. I will send an update when it runs.
>>> >> >>>>>
>>> >> >>>>>  Matt
>>> >> >>>>>
>>> >> >>>>>>
>>> >> >>>>>>
>>> >> >>>>>>   Barry
>>> >> >>>>>>
>>> >> >>>>>> On Jan 20, 2014, at 7:14 PM, Dharmendar Reddy
>>> >> >>>>>> <dharmareddy84 at gmail.com>
>>> >> >>>>>> wrote:
>>> >> >>>>>>
>>> >> >>>>>>> Hello,
>>> >> >>>>>>>        I am getting a segmentation fault when i use
>>> >> >>>>>>> SNESSetConvergenceTest from Fortran.
>>> >> >>>>>>>
>>> >> >>>>>>> I use the following:
>>> >> >>>>>>>
>>> >> >>>>>>> call SNESSetConvergenceTest(this%snes,
>>> >> >>>>>>> SNESConvergenceTest_snes,
>>> >> >>>>>>> this,PETSC_NULL_FUNCTION, ierr)   (Line 1)
>>> >> >>>>>>>
>>> >> >>>>>>> where the soubtoutine is
>>> >> >>>>>>>
>>> >> >>>>>>> subroutine
>>> >> >>>>>>> SNESConvergenceTest_snes(snes,it,xnorm,gnorm,fnorm,reason,this,
>>> >> >>>>>>> ierr)
>>> >> >>>>>>> implicit none
>>> >> >>>>>>> #include "finclude/petsc.h"
>>> >> >>>>>>> ! IO Variables
>>> >> >>>>>>> SNES :: snes
>>> >> >>>>>>> integer :: it
>>> >> >>>>>>> PetscReal :: xnorm
>>> >> >>>>>>> PetscReal :: gnorm
>>> >> >>>>>>> PetscReal :: fnorm
>>> >> >>>>>>> SNESConvergedReason :: reason
>>> >> >>>>>>> type(Solver_t) :: this
>>> >> >>>>>>> PetscErrorCode :: ierr
>>> >> >>>>>>> ! Local Variables
>>> >> >>>>>>> Vec :: X, dX
>>> >> >>>>>>> Vec :: F, W, G
>>> >> >>>>>>> real(WP) :: lambda
>>> >> >>>>>>> print*, 'Calling custom test',it,xnorm,fnorm,gnorm
>>> >> >>>>>>> !call SNESLineSearchGetLambda(this%linesearch, lambda, ierr)
>>> >> >>>>>>> !call SNESLineSearchGetVecs(this%linesearch, X, F, dX, W, G,
>>> >> >>>>>>> ierr)
>>> >> >>>>>>> !call this%SNESConvergenceTest(it, fnorm, lambda, X, dX,
>>> >> >>>>>>> reason,
>>> >> >>>>>>> ierr)
>>> >> >>>>>>> reason = SNES_DIVERGED_FUNCTION_COUNT
>>> >> >>>>>>> end subroutine SNESConvergenceTest_snes
>>> >> >>>>>>>
>>> >> >>>>>>> Code works fine if  the call at Line 1 is commented i.e, if i
>>> >> >>>>>>> donot
>>> >> >>>>>>> set the convergencetest
>>> >> >>>>>>>
>>> >> >>>>>>> I have attached the stack trace when the code fails
>>> >> >>>>>>> (traceFail.log)
>>> >> >>>>>>> and stack trace of the code which runs well (tracePass.log).
>>> >> >>>>>>>
>>> >> >>>>>>>
>>> >> >>>>>>>
>>> >> >>>>>>> Thanks
>>> >> >>>>>>> Reddy
>>> >> >>>>>>> <traceFail.log><tracePass.log>
>>> >> >>>>>>
>>> >> >>>>>
>>> >> >>>>>
>>> >> >>>>>
>>> >> >>>>> --
>>> >> >>>>> What most experimenters take for granted before they begin their
>>> >> >>>>> experiments
>>> >> >>>>> is infinitely more interesting than any results to which their
>>> >> >>>>> experiments
>>> >> >>>>> lead.
>>> >> >>>>> -- Norbert Wiener
>>> >> >>>>
>>> >> >>>>
>>> >> >>>>
>>> >> >>>> --
>>> >> >>>> -----------------------------------------------------
>>> >> >>>> Dharmendar Reddy Palle
>>> >> >>>> Graduate Student
>>> >> >>>> Microelectronics Research center,
>>> >> >>>> University of Texas at Austin,
>>> >> >>>> 10100 Burnet Road, Bldg. 160
>>> >> >>>> MER 2.608F, TX 78758-4445
>>> >> >>>> e-mail: dharmareddy84 at gmail.com
>>> >> >>>> Phone: +1-512-350-9082
>>> >> >>>> United States of America.
>>> >> >>>> Homepage: https://webspace.utexas.edu/~dpr342
>>> >> >>>
>>> >> >>>
>>> >> >>>
>>> >> >>> --
>>> >> >>> -----------------------------------------------------
>>> >> >>> Dharmendar Reddy Palle
>>> >> >>> Graduate Student
>>> >> >>> Microelectronics Research center,
>>> >> >>> University of Texas at Austin,
>>> >> >>> 10100 Burnet Road, Bldg. 160
>>> >> >>> MER 2.608F, TX 78758-4445
>>> >> >>> e-mail: dharmareddy84 at gmail.com
>>> >> >>> Phone: +1-512-350-9082
>>> >> >>> United States of America.
>>> >> >>> Homepage: https://webspace.utexas.edu/~dpr342
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >> --
>>> >> >> -----------------------------------------------------
>>> >> >> Dharmendar Reddy Palle
>>> >> >> Graduate Student
>>> >> >> Microelectronics Research center,
>>> >> >> University of Texas at Austin,
>>> >> >> 10100 Burnet Road, Bldg. 160
>>> >> >> MER 2.608F, TX 78758-4445
>>> >> >> e-mail: dharmareddy84 at gmail.com
>>> >> >> Phone: +1-512-350-9082
>>> >> >> United States of America.
>>> >> >> Homepage: https://webspace.utexas.edu/~dpr342
>>> >> >
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> -----------------------------------------------------
>>> >> Dharmendar Reddy Palle
>>> >> Graduate Student
>>> >> Microelectronics Research center,
>>> >> University of Texas at Austin,
>>> >> 10100 Burnet Road, Bldg. 160
>>> >> MER 2.608F, TX 78758-4445
>>> >> e-mail: dharmareddy84 at gmail.com
>>> >> Phone: +1-512-350-9082
>>> >> United States of America.
>>> >> Homepage: https://webspace.utexas.edu/~dpr342
>>> >
>>> >
>>> >
>>> >
>>> > --
>>> > What most experimenters take for granted before they begin their
>>> > experiments
>>> > is infinitely more interesting than any results to which their
>>> > experiments
>>> > lead.
>>> > -- Norbert Wiener
>>>
>>>
>>>
>>> --
>>> -----------------------------------------------------
>>> Dharmendar Reddy Palle
>>> Graduate Student
>>> Microelectronics Research center,
>>> University of Texas at Austin,
>>> 10100 Burnet Road, Bldg. 160
>>> MER 2.608F, TX 78758-4445
>>> e-mail: dharmareddy84 at gmail.com
>>> Phone: +1-512-350-9082
>>> United States of America.
>>> Homepage: https://webspace.utexas.edu/~dpr342
>>
>>
>>
>>
>> --
>> What most experimenters take for granted before they begin their experiments
>> is infinitely more interesting than any results to which their experiments
>> lead.
>> -- Norbert Wiener
>
>
>
> --
> -----------------------------------------------------
> Dharmendar Reddy Palle
> Graduate Student
> Microelectronics Research center,
> University of Texas at Austin,
> 10100 Burnet Road, Bldg. 160
> MER 2.608F, TX 78758-4445
> e-mail: dharmareddy84 at gmail.com
> Phone: +1-512-350-9082
> United States of America.
> Homepage: https://webspace.utexas.edu/~dpr342



-- 
-----------------------------------------------------
Dharmendar Reddy Palle
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Error1.log
Type: application/octet-stream
Size: 2303 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20140123/97975415/attachment-0003.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Error2.log
Type: application/octet-stream
Size: 3447 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20140123/97975415/attachment-0004.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: output.log
Type: application/octet-stream
Size: 330 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20140123/97975415/attachment-0005.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Solver.F90
Type: text/x-fortran
Size: 32693 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20140123/97975415/attachment-0001.bin>


More information about the petsc-users mailing list