SNES Problem

Nils Erik Svangård nilserik at gmail.com
Wed Mar 29 14:57:03 CST 2006


Barry,
You wrote this a couple of mails back:

  SNES works by computing p = -approxinv(J)*F(uold) and
then does a line search on unew = uold + lambda*p to get the
new u. First it uses a test value of 1 for lambda so it
tries to compute F(uold + p). It is possible that uold + p
has some "non-physical" values in it.

Does SNES do the same thing when -snes_mf is used?

/nisse

On 3/20/06, Barry Smith <bsmith at mcs.anl.gov> wrote:
>
>    Nisse,
>
>     Is there any way you can bundle everything up so you could send
> it to me and I could easily compile it and make? This approach to
> trying to discover the problem is taking too long.
>
>     Barry
>
>
> On Mon, 20 Mar 2006, Nils Erik Svangård wrote:
>
> > Barry,
> > I have now rewritten the code
> >
> > do 1:i
> > do 1:j
> >  do 1:k
> >   l= old_numbering
> >   m++ newnumbering
> >   xx(1,m)=RO(l)
> >   xx(2,m)=RU(l)
> >   ...
> > end dos
> >
> > The vectors are created by running, where m is the number of
> > iterations in the do loop times 7 because we have 7 diffs to solve
> > (DOF = 7).
> >
> > m=m*7
> > call VecCreateSeq(PETSC_COMM_SELF,m,x,ierr)
> >
> > could this be a source of trouble, solving 7 diffs at the same time?
> >
> > Petsc still supply get_DRO with strange values, RK crashes directly
> > but beuler does about the double amount of iterations but still doesnt
> > to complete a timestep.
> >
> > /nisse
> >
> >
> > On 3/20/06, Barry Smith <bsmith at mcs.anl.gov> wrote:
> >>
> >>    Nisse,
> >>
> >>     Yes. You need to make sure that all the entries computed
> >> in your FormFunction() are correctly F_i(x) for 0 <= i < globalvectorlength!!
> >> The "zeros" will destroy the Newton convergence since the Jacobian
> >> will be singular, plus if the "zeros" are actually random "garbage"
> >> because the memory has not be initialized that will really screw
> >> things up.
> >>
> >>    Barry
> >>
> >>
> >> On Mon, 20 Mar 2006, Nils Erik Svangård wrote:
> >>
> >>> Hi again,
> >>> I thought that I would comment on the diff from last being 0.90 all
> >>> the time. That is because a error in calculation (by me).
> >>>
> >>> do IMESH=1,NMESH
> >>>     NMPL=NMP(IMESH)
> >>>     NIL = NI(IMESH)
> >>>     NJL = NJ(IMESH)
> >>>     NKL = NK(IMESH)
> >>>       do K=1,NKL-1
> >>>         do J=1,NJL-1
> >>>           do I=1,NIL-1
> >>>              L=I+NIL*(J-1)+NIL*NJL*(K-1)+NMPL
> >>>              count=count+1
> >>>              DRO(L)=magic()
> >>>           end do
> >>>         end do
> >>>      end do
> >>>  end do
> >>>
> >>> count after a run is 8019 and L is 8899, so the vector containg the
> >>> solution also have a couple of zeros in the middle, could this effect
> >>> the SNES solver?
> >>>
> >>> /nisse
> >>>
> >>>
> >>> On 3/20/06, Nils Erik Svangård <nilserik at gmail.com> wrote:
> >>>> Hi,
> >>>> The timestep doesnt change dramaticly just before the values get
> >>>> strange. The term diff from last is:
> >>>> a=new_value(L)/old_value(L)+a
> >>>> Diff from last=a/L
> >>>> To me it seems as something petsc does supplies strange values.
> >>>> The output berfore crashing:
> >>>>
> >>>>
> >>>>  Iterering:   667.000
> >>>>  H2T
> >>>>  Timestep:   1.000000000000000E-007
> >>>>  H1T
> >>>>  Diff from last:   0.901112485562141
> >>>>  AUXVR
> >>>>  ROMIN,ROMAX=   1.22541  1.22657
> >>>>  UMIN ,UMAX =  -4.016098E-02  0.278343
> >>>>  VMIN ,VMAX =  -1.853829E-03  3.289763E-02
> >>>>  WMIN ,WMAX =  -1.852430E-03  1.855088E-03
> >>>>  PMIN ,PMAX =   101283.  101416.
> >>>>  QMIN ,QMAX =   0.815285  0.816229
> >>>>  EMIN ,EMAX =   4.07643  4.08175
> >>>>  V3
> >>>>  Hello CONV4
> >>>>  RUS:   12.2556
> >>>>  V4
> >>>>  DIFF3
> >>>>  F3
> >>>>  Iterering:   668.000
> >>>>  H2T
> >>>>  Timestep:   1.000000000000000E-007
> >>>>  H1T
> >>>>  Diff from last:   0.901112485562141
> >>>>  AUXVR
> >>>>  ROMIN,ROMAX=   1.22541  1.22657
> >>>>  UMIN ,UMAX =  -4.016098E-02  0.278343
> >>>>  VMIN ,VMAX =  -1.853829E-03  3.289763E-02
> >>>>  WMIN ,WMAX =  -1.852430E-03  1.855088E-03
> >>>>  PMIN ,PMAX =   101283.  101416.
> >>>>  QMIN ,QMAX =   0.815285  0.816229
> >>>>  EMIN ,EMAX =   4.07643  4.08175
> >>>>  V3
> >>>>  Hello CONV4
> >>>>  RUS:   12.2556
> >>>>  V4
> >>>>  DIFF3
> >>>>  F3
> >>>>  Iterering:   669.000
> >>>>  H2T
> >>>>  Timestep:   1.000000000000000E-007
> >>>>  H1T
> >>>>  Diff from last:   1.785767662774350E-006
> >>>>  AUXVR
> >>>> --------------------------------------------------------------------------
> >>>> Petsc Release Version 2.3.0, Patch 32, April, 26, 2005
> >>>> See docs/changes/index.html for recent updates.
> >>>> See docs/faq.html for hints about trouble shooting.
> >>>> See docs/index.html for manual pages.
> >>>> -----------------------------------------------------------------------
> >>>> ./withrk on a linux-gnu named sethnx004.vac.com by yy26539 Mon Mar 20
> >>>> 10:17:33 2006
> >>>> Libraries linked from /home/yy26539/work/NISSE/petsc-2.3.0/lib/linux-gnu
> >>>> Configure run at Thu Oct 13 08:23:56 2005
> >>>> Configure options --with-cc=gcc --with-fc="f77 -N109"
> >>>> --download-mpich=1 --download-mpich-pm=gforker
> >>>> --download-f-blas-lapack=1 --with-shared=0
> >>>> -----------------------------------------------------------------------
> >>>> [0]PETSC ERROR: Caught signal number 8 FPE: Floating Point
> >>>> Exception,probably divide by zero
> >>>> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> >>>> [0]PETSC ERROR: likely location of problem given in stack below
> >>>> [0]PETSC ERROR: --------------- Stack Frames ---------------
> >>>> [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
> >>>> [0]PETSC ERROR:       INSTEAD the line number of the start of the function
> >>>> [0]PETSC ERROR:       is given.
> >>>> [0]PETSC ERROR: [0] SNESComputeFunction line 788 src/snes/interface/snes.c
> >>>> [0]PETSC ERROR: [0] MatMult_MFFD line 235 src/snes/mf/snesmfj.c
> >>>> [0]PETSC ERROR: [0] MatMult line 1368 src/mat/interface/matrix.c
> >>>> [0]PETSC ERROR: [0] SNESLSCheckLocalMin_Private line 19 src/snes/impls/ls/ls.c
> >>>> [0]PETSC ERROR: [0] SNESSolve line 1656 src/snes/interface/snes.c
> >>>> [0]PETSC ERROR: --------------------------------------------
> >>>> [0]PETSC ERROR: User provided function() line 0 in unknown directory
> >>>> unknown file
> >>>> [0]PETSC ERROR: Signal received!
> >>>> [0]PETSC ERROR:  !
> >>>> aborting job:
> >>>> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0
> >>>>
> >>>>
> >>>> /nisse
> >>>>
> >>>> On 3/18/06, Nils Erik Svangård <nilserik at gmail.com> wrote:
> >>>>> On 3/18/06, Barry Smith <bsmith at mcs.anl.gov> wrote:
> >>>>>>
> >>>>>>    So you are saying that TS is feeding "reasonable" input
> >>>>>> for a while? That basically matches (in scale) the values feed in
> >>>>>> from the old rk code? Then SUDDENLY? it inputs values with a very
> >>>>>> different size?
> >>>>>
> >>>>> Yes.
> >>>>>
> >>>>>> Does the TS time-step size? also change dramatically at
> >>>>>> that point?
> >>>>>>
> >>>>>
> >>>>> No, I dont think so. However I'm not 100% sure what the timestep
> >>>>> hasent changed dramatically for the last iteration when it crashes
> >>>>> (but 95% sure). I can check that on monday.
> >>>>>
> >>>>> /nisse
> >>>>>
> >>>>>>     Barry
> >>>>>>
> >>>>>>
> >>>>>> On Sat, 18 Mar 2006, Nils Erik Svangård wrote:
> >>>>>>
> >>>>>>> On 3/17/06, Barry Smith <bsmith at mcs.anl.gov> wrote:
> >>>>>>>>
> >>>>>>>>    Based on my understanding. This is correct.
> >>>>>>>
> >>>>>>> I was almost hoping I had missed something fundamental.
> >>>>>>>
> >>>>>>>>
> >>>>>>>>    Suggest you run a TINY problem with your "old" code
> >>>>>>>> and the TS (and or SNES one). Print out everything. The
> >>>>>>>> current solution the result from calling Get_DRO() and
> >>>>>>>> compare the runs, when and why do they change? This will
> >>>>>>>> help understand what is going on.
> >>>>>>>
> >>>>>>> Well, Get_DRO() is exactly the same in the old and new code, and they
> >>>>>>> start with the same initial values. Get_DRO() is not producing strange
> >>>>>>> values, it is crashing because it get strange values fed into it.
> >>>>>>>
> >>>>>>> A normal run for the formfunction f(in,out) in TS would look something
> >>>>>>> like this if printed:
> >>>>>>> in=25560         <- Start value
> >>>>>>> out=0.0001     <-  du/dt
> >>>>>>> in=25560         <- Value fed into formfunction after some TS magic
> >>>>>>> out=0.0001      <- du/dt
> >>>>>>> ...                      <- (a couple iterations)
> >>>>>>> in=25451          <- Input value after a couple of iterations
> >>>>>>> out=0.0001      <- du/dt
> >>>>>>> in=0.23151       <- Value fed into formfunction decided by TS
> >>>>>>> Petscerror divide by zero...
> >>>>>>>
> >>>>>>> The problem seems to be that TS decides that the input values to
> >>>>>>> formfunction should be (in this case) 0.23151 which makes Get_DRO()
> >>>>>>> crash because of  for example sqrt(in-1).
> >>>>>>>
> >>>>>>> I have no idea how to fix this. The values from Get_DRO() in TS match
> >>>>>>> the values when running the old rk code.
> >>>>>>>
> >>>>>>> /nisse
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>>
> >>>>>>>>     Barry
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Fri, 17 Mar 2006, Nils Erik Svangård wrote:
> >>>>>>>>
> >>>>>>>>> Ok, I managed to compile and link rk.c to my fortran program, I forgot
> >>>>>>>>> that in C you need a ;  in the end of every statement. (stupid mistake
> >>>>>>>>> ;) )
> >>>>>>>>>
> >>>>>>>>> I have just added one line to see if it works.
> >>>>>>>>>
> >>>>>>>>>  /* computing new dt */
> >>>>>>>>>     dt = dt * dt_fac;
> >>>>>>>>>
> >>>>>>>>>     /* Start Nisse stuff */
> >>>>>>>>>        ierr = PetscPrintf(PETSC_COMM_WORLD,"Nisse prints dt: %f\n",dt);
> >>>>>>>>>     /* End nisse stuff */
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>     if(ts->ptime+dt > ts->max_time){
> >>>>>>>>>        dt = ts->max_time - ts->ptime;
> >>>>>>>>>     }
> >>>>>>>>>
> >>>>>>>>> I just try to print the current timestep, however this is never
> >>>>>>>>> printed. And I'm not really sure that it is the timestep that is
> >>>>>>>>> causing the problems.
> >>>>>>>>>
> >>>>>>>>> I have used call TSGetTimeStep(ts,timestep,ierr) to monitor what
> >>>>>>>>> timestep TS uses and it seem ok. However after the first iteration of
> >>>>>>>>> FormFunction everything seems ok, but in start of the second iteration
> >>>>>>>>> all values are really strange.
> >>>>>>>>>
> >>>>>>>>> I see the same thing when using SNES and my back euler implementation,
> >>>>>>>>> it iterate many more times however, but all of a sudden the all "in"
> >>>>>>>>> values are in the range 0.2-0.7 (for all 7 equations) and my code
> >>>>>>>>> bombs because of the strange values.
> >>>>>>>>>
> >>>>>>>>> When using TS and  running with  -snes_mf -ts_type beuler -ksp_rtol
> >>>>>>>>> 1.e-10 this is what printed just before producing strange values:
> >>>>>>>>> KSP Object:
> >>>>>>>>>  type: gmres
> >>>>>>>>>    GMRES: restart=30, using Classical (unmodified) Gram-Schmidt
> >>>>>>>>> Orthogonalization with no iterative refinement
> >>>>>>>>>    GMRES: happy breakdown tolerance 1e-30
> >>>>>>>>>  maximum iterations=10000, initial guess is zero
> >>>>>>>>>  tolerances:  relative=1e-10, absolute=1e-50, divergence=10000
> >>>>>>>>>  left preconditioning
> >>>>>>>>> PC Object:
> >>>>>>>>>  type: none
> >>>>>>>>>  linear system matrix = precond matrix:
> >>>>>>>>>  Matrix Object:
> >>>>>>>>>    type=mffd, rows=70000, cols=70000
> >>>>>>>>>        SNES matrix-free approximation:
> >>>>>>>>>          err=1e-07 (relative error in function evaluation)
> >>>>>>>>>          Using wp compute h routine
> >>>>>>>>>          Computes normA
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> And just to make sure that I havent misunderstood how SNES and TS work:
> >>>>>>>>> If the original 3-stage RK uses (my numbering):
> >>>>>>>>> 1.  RO0(L)=RO(L)
> >>>>>>>>>     Get_DRO(RO(L))
> >>>>>>>>>     RO(L)=RO0(L)+CFL*DRO(L)
> >>>>>>>>> 2.  RO0(L)=.5*(RO0(L)+RO(L))
> >>>>>>>>>     Get_DRO(RO(L))
> >>>>>>>>>     RO(L)=RO0(L)+.5*CFL*DRO(L)
> >>>>>>>>> 3.  Get_DRO(RO(L))
> >>>>>>>>>     RO(L)=RO0(L)+.5*CFL*DRO(L)
> >>>>>>>>>
> >>>>>>>>> Then this should be in TS which should return du/dt which is DRO:
> >>>>>>>>>      RO(L)=xx(1,L)
> >>>>>>>>>      Get_DRO(RO(L))
> >>>>>>>>>      ff(1,L) = DRO(L)
> >>>>>>>>>
> >>>>>>>>> And in SNES with back euler:
> >>>>>>>>>      (Old RO from previous iteration ORO(L)
> >>>>>>>>>      RO(L)=xx(1,L)
> >>>>>>>>>      Get_DRO(RO(L))
> >>>>>>>>>      ff(1,L)= RO(L)-OLD(1,L)-TSF(L)*DRO(L)
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> This became a long mail, I hope this shows if I missed something vital.
> >>>>>>>>> /nisse
> >>>>>>>>> On 3/16/06, Nils Erik Svangård <nilserik at gmail.com> wrote:
> >>>>>>>>>> Barry,
> >>>>>>>>>> the problem is making the objectfile, but I'll try again when I have
> >>>>>>>>>> the code. I will check the makefile for the c-examples.
> >>>>>>>>>> /nisse
> >>>>>>>>>>
> >>>>>>>>>> On 3/16/06, Barry Smith <bsmith at mcs.anl.gov> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>    Nisse,
> >>>>>>>>>>>
> >>>>>>>>>>>      Just list it in your makefile with all your other object
> >>>>>>>>>>> files (that come from Fortran). Send the output if this fails.
> >>>>>>>>>>>
> >>>>>>>>>>>     Barry
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On Thu, 16 Mar 2006, Nils Erik Svangård wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> I havent managed to get rk.c compiled with changes. how do I compile
> >>>>>>>>>>>> it in my working directory to get a object file. I just realised that
> >>>>>>>>>>>> I probably forgot to link it against $TSLIB but should I need to that
> >>>>>>>>>>>> when I dont do any linking, the linking is done when linking the
> >>>>>>>>>>>> fortran and the c code?
> >>>>>>>>>>>> Or what am I doing wrong (I not that good with C++ and linking).
> >>>>>>>>>>>> /nisse
> >>>>>>>>>>>>
> >>>>>>>>>>>> On 3/15/06, Barry Smith <bsmith at mcs.anl.gov> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>    Both
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Wed, 15 Mar 2006, Nils Erik Svangård wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>    Sorry, I forgot. Is the linear solver converging? If not, then that
> >>>>>>>>>>>>>>> is the problem? Use a tolerance like -ksp_rtol 1.e-10 and see if the
> >>>>>>>>>>>>>>> nonlinear solver converges.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I'll try that when I have access to the code. I havent checked if the
> >>>>>>>>>>>>>> linear solver converges is thera a -kspmonitor or -kspconvergedreason
> >>>>>>>>>>>>>> I should use?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> /nisse
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> --
> >>>>>>>>>>>> Nils-Erik Svangård
> >>>>>>>>>>>> E-Mail: nilserik at gmail.com
> >>>>>>>>>>>> MSN: schweingaard at hotmail.com
> >>>>>>>>>>>> Skype: schweingaard
> >>>>>>>>>>>> Mobil: +46-(0)70-3612178
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> --
> >>>>>>>>>> Nils-Erik Svangård
> >>>>>>>>>> E-Mail: nilserik at gmail.com
> >>>>>>>>>> MSN: schweingaard at hotmail.com
> >>>>>>>>>> Skype: schweingaard
> >>>>>>>>>> Mobil: +46-(0)70-3612178
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> --
> >>>>>>>>> Nils-Erik Svangård
> >>>>>>>>> E-Mail: nilserik at gmail.com
> >>>>>>>>> MSN: schweingaard at hotmail.com
> >>>>>>>>> Skype: schweingaard
> >>>>>>>>> Mobil: +46-(0)70-3612178
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> --
> >>>>>>> Nils-Erik Svangård
> >>>>>>> E-Mail: nilserik at gmail.com
> >>>>>>> MSN: schweingaard at hotmail.com
> >>>>>>> Skype: schweingaard
> >>>>>>> Mobil: +46-(0)70-3612178
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> Nils-Erik Svangård
> >>>>> E-Mail: nilserik at gmail.com
> >>>>> MSN: schweingaard at hotmail.com
> >>>>> Skype: schweingaard
> >>>>> Mobil: +46-(0)70-3612178
> >>>>>
> >>>>
> >>>>
> >>>> --
> >>>> Nils-Erik Svangård
> >>>> E-Mail: nilserik at gmail.com
> >>>> MSN: schweingaard at hotmail.com
> >>>> Skype: schweingaard
> >>>> Mobil: +46-(0)70-3612178
> >>>>
> >>>
> >>>
> >>> --
> >>> Nils-Erik Svangård
> >>> E-Mail: nilserik at gmail.com
> >>> MSN: schweingaard at hotmail.com
> >>> Skype: schweingaard
> >>> Mobil: +46-(0)70-3612178
> >>>
> >>>
> >>
> >
> >
> > --
> > Nils-Erik Svangård
> > E-Mail: nilserik at gmail.com
> > MSN: schweingaard at hotmail.com
> > Skype: schweingaard
> > Mobil: +46-(0)70-3612178
> >
> >
>


--
Nils-Erik Svangård
E-Mail: nilserik at gmail.com
MSN: schweingaard at hotmail.com
Skype: schweingaard
Mobil: +46-(0)70-3612178




More information about the petsc-users mailing list