SNES Problem

Barry Smith bsmith at mcs.anl.gov
Mon Mar 20 11:30:30 CST 2006


   Nisse,

    Is there any way you can bundle everything up so you could send
it to me and I could easily compile it and make? This approach to
trying to discover the problem is taking too long.

    Barry


On Mon, 20 Mar 2006, Nils Erik Svangård wrote:

> Barry,
> I have now rewritten the code
>
> do 1:i
> do 1:j
>  do 1:k
>   l= old_numbering
>   m++ newnumbering
>   xx(1,m)=RO(l)
>   xx(2,m)=RU(l)
>   ...
> end dos
>
> The vectors are created by running, where m is the number of
> iterations in the do loop times 7 because we have 7 diffs to solve
> (DOF = 7).
>
> m=m*7
> call VecCreateSeq(PETSC_COMM_SELF,m,x,ierr)
>
> could this be a source of trouble, solving 7 diffs at the same time?
>
> Petsc still supply get_DRO with strange values, RK crashes directly
> but beuler does about the double amount of iterations but still doesnt
> to complete a timestep.
>
> /nisse
>
>
> On 3/20/06, Barry Smith <bsmith at mcs.anl.gov> wrote:
>>
>>    Nisse,
>>
>>     Yes. You need to make sure that all the entries computed
>> in your FormFunction() are correctly F_i(x) for 0 <= i < globalvectorlength!!
>> The "zeros" will destroy the Newton convergence since the Jacobian
>> will be singular, plus if the "zeros" are actually random "garbage"
>> because the memory has not be initialized that will really screw
>> things up.
>>
>>    Barry
>>
>>
>> On Mon, 20 Mar 2006, Nils Erik Svangård wrote:
>>
>>> Hi again,
>>> I thought that I would comment on the diff from last being 0.90 all
>>> the time. That is because a error in calculation (by me).
>>>
>>> do IMESH=1,NMESH
>>>     NMPL=NMP(IMESH)
>>>     NIL = NI(IMESH)
>>>     NJL = NJ(IMESH)
>>>     NKL = NK(IMESH)
>>>       do K=1,NKL-1
>>>         do J=1,NJL-1
>>>           do I=1,NIL-1
>>>              L=I+NIL*(J-1)+NIL*NJL*(K-1)+NMPL
>>>              count=count+1
>>>              DRO(L)=magic()
>>>           end do
>>>         end do
>>>      end do
>>>  end do
>>>
>>> count after a run is 8019 and L is 8899, so the vector containg the
>>> solution also have a couple of zeros in the middle, could this effect
>>> the SNES solver?
>>>
>>> /nisse
>>>
>>>
>>> On 3/20/06, Nils Erik Svangård <nilserik at gmail.com> wrote:
>>>> Hi,
>>>> The timestep doesnt change dramaticly just before the values get
>>>> strange. The term diff from last is:
>>>> a=new_value(L)/old_value(L)+a
>>>> Diff from last=a/L
>>>> To me it seems as something petsc does supplies strange values.
>>>> The output berfore crashing:
>>>>
>>>>
>>>>  Iterering:   667.000
>>>>  H2T
>>>>  Timestep:   1.000000000000000E-007
>>>>  H1T
>>>>  Diff from last:   0.901112485562141
>>>>  AUXVR
>>>>  ROMIN,ROMAX=   1.22541  1.22657
>>>>  UMIN ,UMAX =  -4.016098E-02  0.278343
>>>>  VMIN ,VMAX =  -1.853829E-03  3.289763E-02
>>>>  WMIN ,WMAX =  -1.852430E-03  1.855088E-03
>>>>  PMIN ,PMAX =   101283.  101416.
>>>>  QMIN ,QMAX =   0.815285  0.816229
>>>>  EMIN ,EMAX =   4.07643  4.08175
>>>>  V3
>>>>  Hello CONV4
>>>>  RUS:   12.2556
>>>>  V4
>>>>  DIFF3
>>>>  F3
>>>>  Iterering:   668.000
>>>>  H2T
>>>>  Timestep:   1.000000000000000E-007
>>>>  H1T
>>>>  Diff from last:   0.901112485562141
>>>>  AUXVR
>>>>  ROMIN,ROMAX=   1.22541  1.22657
>>>>  UMIN ,UMAX =  -4.016098E-02  0.278343
>>>>  VMIN ,VMAX =  -1.853829E-03  3.289763E-02
>>>>  WMIN ,WMAX =  -1.852430E-03  1.855088E-03
>>>>  PMIN ,PMAX =   101283.  101416.
>>>>  QMIN ,QMAX =   0.815285  0.816229
>>>>  EMIN ,EMAX =   4.07643  4.08175
>>>>  V3
>>>>  Hello CONV4
>>>>  RUS:   12.2556
>>>>  V4
>>>>  DIFF3
>>>>  F3
>>>>  Iterering:   669.000
>>>>  H2T
>>>>  Timestep:   1.000000000000000E-007
>>>>  H1T
>>>>  Diff from last:   1.785767662774350E-006
>>>>  AUXVR
>>>> --------------------------------------------------------------------------
>>>> Petsc Release Version 2.3.0, Patch 32, April, 26, 2005
>>>> See docs/changes/index.html for recent updates.
>>>> See docs/faq.html for hints about trouble shooting.
>>>> See docs/index.html for manual pages.
>>>> -----------------------------------------------------------------------
>>>> ./withrk on a linux-gnu named sethnx004.vac.com by yy26539 Mon Mar 20
>>>> 10:17:33 2006
>>>> Libraries linked from /home/yy26539/work/NISSE/petsc-2.3.0/lib/linux-gnu
>>>> Configure run at Thu Oct 13 08:23:56 2005
>>>> Configure options --with-cc=gcc --with-fc="f77 -N109"
>>>> --download-mpich=1 --download-mpich-pm=gforker
>>>> --download-f-blas-lapack=1 --with-shared=0
>>>> -----------------------------------------------------------------------
>>>> [0]PETSC ERROR: Caught signal number 8 FPE: Floating Point
>>>> Exception,probably divide by zero
>>>> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
>>>> [0]PETSC ERROR: likely location of problem given in stack below
>>>> [0]PETSC ERROR: --------------- Stack Frames ---------------
>>>> [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
>>>> [0]PETSC ERROR:       INSTEAD the line number of the start of the function
>>>> [0]PETSC ERROR:       is given.
>>>> [0]PETSC ERROR: [0] SNESComputeFunction line 788 src/snes/interface/snes.c
>>>> [0]PETSC ERROR: [0] MatMult_MFFD line 235 src/snes/mf/snesmfj.c
>>>> [0]PETSC ERROR: [0] MatMult line 1368 src/mat/interface/matrix.c
>>>> [0]PETSC ERROR: [0] SNESLSCheckLocalMin_Private line 19 src/snes/impls/ls/ls.c
>>>> [0]PETSC ERROR: [0] SNESSolve line 1656 src/snes/interface/snes.c
>>>> [0]PETSC ERROR: --------------------------------------------
>>>> [0]PETSC ERROR: User provided function() line 0 in unknown directory
>>>> unknown file
>>>> [0]PETSC ERROR: Signal received!
>>>> [0]PETSC ERROR:  !
>>>> aborting job:
>>>> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0
>>>>
>>>>
>>>> /nisse
>>>>
>>>> On 3/18/06, Nils Erik Svangård <nilserik at gmail.com> wrote:
>>>>> On 3/18/06, Barry Smith <bsmith at mcs.anl.gov> wrote:
>>>>>>
>>>>>>    So you are saying that TS is feeding "reasonable" input
>>>>>> for a while? That basically matches (in scale) the values feed in
>>>>>> from the old rk code? Then SUDDENLY? it inputs values with a very
>>>>>> different size?
>>>>>
>>>>> Yes.
>>>>>
>>>>>> Does the TS time-step size? also change dramatically at
>>>>>> that point?
>>>>>>
>>>>>
>>>>> No, I dont think so. However I'm not 100% sure what the timestep
>>>>> hasent changed dramatically for the last iteration when it crashes
>>>>> (but 95% sure). I can check that on monday.
>>>>>
>>>>> /nisse
>>>>>
>>>>>>     Barry
>>>>>>
>>>>>>
>>>>>> On Sat, 18 Mar 2006, Nils Erik Svangård wrote:
>>>>>>
>>>>>>> On 3/17/06, Barry Smith <bsmith at mcs.anl.gov> wrote:
>>>>>>>>
>>>>>>>>    Based on my understanding. This is correct.
>>>>>>>
>>>>>>> I was almost hoping I had missed something fundamental.
>>>>>>>
>>>>>>>>
>>>>>>>>    Suggest you run a TINY problem with your "old" code
>>>>>>>> and the TS (and or SNES one). Print out everything. The
>>>>>>>> current solution the result from calling Get_DRO() and
>>>>>>>> compare the runs, when and why do they change? This will
>>>>>>>> help understand what is going on.
>>>>>>>
>>>>>>> Well, Get_DRO() is exactly the same in the old and new code, and they
>>>>>>> start with the same initial values. Get_DRO() is not producing strange
>>>>>>> values, it is crashing because it get strange values fed into it.
>>>>>>>
>>>>>>> A normal run for the formfunction f(in,out) in TS would look something
>>>>>>> like this if printed:
>>>>>>> in=25560         <- Start value
>>>>>>> out=0.0001     <-  du/dt
>>>>>>> in=25560         <- Value fed into formfunction after some TS magic
>>>>>>> out=0.0001      <- du/dt
>>>>>>> ...                      <- (a couple iterations)
>>>>>>> in=25451          <- Input value after a couple of iterations
>>>>>>> out=0.0001      <- du/dt
>>>>>>> in=0.23151       <- Value fed into formfunction decided by TS
>>>>>>> Petscerror divide by zero...
>>>>>>>
>>>>>>> The problem seems to be that TS decides that the input values to
>>>>>>> formfunction should be (in this case) 0.23151 which makes Get_DRO()
>>>>>>> crash because of  for example sqrt(in-1).
>>>>>>>
>>>>>>> I have no idea how to fix this. The values from Get_DRO() in TS match
>>>>>>> the values when running the old rk code.
>>>>>>>
>>>>>>> /nisse
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>     Barry
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, 17 Mar 2006, Nils Erik Svangård wrote:
>>>>>>>>
>>>>>>>>> Ok, I managed to compile and link rk.c to my fortran program, I forgot
>>>>>>>>> that in C you need a ;  in the end of every statement. (stupid mistake
>>>>>>>>> ;) )
>>>>>>>>>
>>>>>>>>> I have just added one line to see if it works.
>>>>>>>>>
>>>>>>>>>  /* computing new dt */
>>>>>>>>>     dt = dt * dt_fac;
>>>>>>>>>
>>>>>>>>>     /* Start Nisse stuff */
>>>>>>>>>        ierr = PetscPrintf(PETSC_COMM_WORLD,"Nisse prints dt: %f\n",dt);
>>>>>>>>>     /* End nisse stuff */
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>     if(ts->ptime+dt > ts->max_time){
>>>>>>>>>        dt = ts->max_time - ts->ptime;
>>>>>>>>>     }
>>>>>>>>>
>>>>>>>>> I just try to print the current timestep, however this is never
>>>>>>>>> printed. And I'm not really sure that it is the timestep that is
>>>>>>>>> causing the problems.
>>>>>>>>>
>>>>>>>>> I have used call TSGetTimeStep(ts,timestep,ierr) to monitor what
>>>>>>>>> timestep TS uses and it seem ok. However after the first iteration of
>>>>>>>>> FormFunction everything seems ok, but in start of the second iteration
>>>>>>>>> all values are really strange.
>>>>>>>>>
>>>>>>>>> I see the same thing when using SNES and my back euler implementation,
>>>>>>>>> it iterate many more times however, but all of a sudden the all "in"
>>>>>>>>> values are in the range 0.2-0.7 (for all 7 equations) and my code
>>>>>>>>> bombs because of the strange values.
>>>>>>>>>
>>>>>>>>> When using TS and  running with  -snes_mf -ts_type beuler -ksp_rtol
>>>>>>>>> 1.e-10 this is what printed just before producing strange values:
>>>>>>>>> KSP Object:
>>>>>>>>>  type: gmres
>>>>>>>>>    GMRES: restart=30, using Classical (unmodified) Gram-Schmidt
>>>>>>>>> Orthogonalization with no iterative refinement
>>>>>>>>>    GMRES: happy breakdown tolerance 1e-30
>>>>>>>>>  maximum iterations=10000, initial guess is zero
>>>>>>>>>  tolerances:  relative=1e-10, absolute=1e-50, divergence=10000
>>>>>>>>>  left preconditioning
>>>>>>>>> PC Object:
>>>>>>>>>  type: none
>>>>>>>>>  linear system matrix = precond matrix:
>>>>>>>>>  Matrix Object:
>>>>>>>>>    type=mffd, rows=70000, cols=70000
>>>>>>>>>        SNES matrix-free approximation:
>>>>>>>>>          err=1e-07 (relative error in function evaluation)
>>>>>>>>>          Using wp compute h routine
>>>>>>>>>          Computes normA
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> And just to make sure that I havent misunderstood how SNES and TS work:
>>>>>>>>> If the original 3-stage RK uses (my numbering):
>>>>>>>>> 1.  RO0(L)=RO(L)
>>>>>>>>>     Get_DRO(RO(L))
>>>>>>>>>     RO(L)=RO0(L)+CFL*DRO(L)
>>>>>>>>> 2.  RO0(L)=.5*(RO0(L)+RO(L))
>>>>>>>>>     Get_DRO(RO(L))
>>>>>>>>>     RO(L)=RO0(L)+.5*CFL*DRO(L)
>>>>>>>>> 3.  Get_DRO(RO(L))
>>>>>>>>>     RO(L)=RO0(L)+.5*CFL*DRO(L)
>>>>>>>>>
>>>>>>>>> Then this should be in TS which should return du/dt which is DRO:
>>>>>>>>>      RO(L)=xx(1,L)
>>>>>>>>>      Get_DRO(RO(L))
>>>>>>>>>      ff(1,L) = DRO(L)
>>>>>>>>>
>>>>>>>>> And in SNES with back euler:
>>>>>>>>>      (Old RO from previous iteration ORO(L)
>>>>>>>>>      RO(L)=xx(1,L)
>>>>>>>>>      Get_DRO(RO(L))
>>>>>>>>>      ff(1,L)= RO(L)-OLD(1,L)-TSF(L)*DRO(L)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> This became a long mail, I hope this shows if I missed something vital.
>>>>>>>>> /nisse
>>>>>>>>> On 3/16/06, Nils Erik Svangård <nilserik at gmail.com> wrote:
>>>>>>>>>> Barry,
>>>>>>>>>> the problem is making the objectfile, but I'll try again when I have
>>>>>>>>>> the code. I will check the makefile for the c-examples.
>>>>>>>>>> /nisse
>>>>>>>>>>
>>>>>>>>>> On 3/16/06, Barry Smith <bsmith at mcs.anl.gov> wrote:
>>>>>>>>>>>
>>>>>>>>>>>    Nisse,
>>>>>>>>>>>
>>>>>>>>>>>      Just list it in your makefile with all your other object
>>>>>>>>>>> files (that come from Fortran). Send the output if this fails.
>>>>>>>>>>>
>>>>>>>>>>>     Barry
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Thu, 16 Mar 2006, Nils Erik Svangård wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I havent managed to get rk.c compiled with changes. how do I compile
>>>>>>>>>>>> it in my working directory to get a object file. I just realised that
>>>>>>>>>>>> I probably forgot to link it against $TSLIB but should I need to that
>>>>>>>>>>>> when I dont do any linking, the linking is done when linking the
>>>>>>>>>>>> fortran and the c code?
>>>>>>>>>>>> Or what am I doing wrong (I not that good with C++ and linking).
>>>>>>>>>>>> /nisse
>>>>>>>>>>>>
>>>>>>>>>>>> On 3/15/06, Barry Smith <bsmith at mcs.anl.gov> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>    Both
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, 15 Mar 2006, Nils Erik Svangård wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>    Sorry, I forgot. Is the linear solver converging? If not, then that
>>>>>>>>>>>>>>> is the problem? Use a tolerance like -ksp_rtol 1.e-10 and see if the
>>>>>>>>>>>>>>> nonlinear solver converges.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'll try that when I have access to the code. I havent checked if the
>>>>>>>>>>>>>> linear solver converges is thera a -kspmonitor or -kspconvergedreason
>>>>>>>>>>>>>> I should use?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> /nisse
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Nils-Erik Svangård
>>>>>>>>>>>> E-Mail: nilserik at gmail.com
>>>>>>>>>>>> MSN: schweingaard at hotmail.com
>>>>>>>>>>>> Skype: schweingaard
>>>>>>>>>>>> Mobil: +46-(0)70-3612178
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Nils-Erik Svangård
>>>>>>>>>> E-Mail: nilserik at gmail.com
>>>>>>>>>> MSN: schweingaard at hotmail.com
>>>>>>>>>> Skype: schweingaard
>>>>>>>>>> Mobil: +46-(0)70-3612178
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Nils-Erik Svangård
>>>>>>>>> E-Mail: nilserik at gmail.com
>>>>>>>>> MSN: schweingaard at hotmail.com
>>>>>>>>> Skype: schweingaard
>>>>>>>>> Mobil: +46-(0)70-3612178
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Nils-Erik Svangård
>>>>>>> E-Mail: nilserik at gmail.com
>>>>>>> MSN: schweingaard at hotmail.com
>>>>>>> Skype: schweingaard
>>>>>>> Mobil: +46-(0)70-3612178
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Nils-Erik Svangård
>>>>> E-Mail: nilserik at gmail.com
>>>>> MSN: schweingaard at hotmail.com
>>>>> Skype: schweingaard
>>>>> Mobil: +46-(0)70-3612178
>>>>>
>>>>
>>>>
>>>> --
>>>> Nils-Erik Svangård
>>>> E-Mail: nilserik at gmail.com
>>>> MSN: schweingaard at hotmail.com
>>>> Skype: schweingaard
>>>> Mobil: +46-(0)70-3612178
>>>>
>>>
>>>
>>> --
>>> Nils-Erik Svangård
>>> E-Mail: nilserik at gmail.com
>>> MSN: schweingaard at hotmail.com
>>> Skype: schweingaard
>>> Mobil: +46-(0)70-3612178
>>>
>>>
>>
>
>
> --
> Nils-Erik Svangård
> E-Mail: nilserik at gmail.com
> MSN: schweingaard at hotmail.com
> Skype: schweingaard
> Mobil: +46-(0)70-3612178
>
>


More information about the petsc-users mailing list