SNES Problem
Nils Erik Svangård
nilserik at gmail.com
Mon Mar 20 11:38:38 CST 2006
I will investigate the possibility and get back to you off-list.
/nisse
On 3/20/06, Barry Smith <bsmith at mcs.anl.gov> wrote:
>
> Nisse,
>
> Is there any way you can bundle everything up so you could send
> it to me and I could easily compile it and make? This approach to
> trying to discover the problem is taking too long.
>
> Barry
>
>
> On Mon, 20 Mar 2006, Nils Erik Svangård wrote:
>
> > Barry,
> > I have now rewritten the code
> >
> > do 1:i
> > do 1:j
> > do 1:k
> > l= old_numbering
> > m++ newnumbering
> > xx(1,m)=RO(l)
> > xx(2,m)=RU(l)
> > ...
> > end dos
> >
> > The vectors are created as shown below, where m is the number of
> > iterations in the do loop times 7, because we have 7 equations to
> > solve (DOF = 7).
> >
> > m=m*7
> > call VecCreateSeq(PETSC_COMM_SELF,m,x,ierr)
> >
> > Could this be a source of trouble, solving 7 equations at the same time?
> >
> > PETSc still supplies Get_DRO with strange values. RK crashes immediately,
> > while beuler runs about twice as many iterations but still doesn't
> > complete a timestep.
> >
> > /nisse
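
The renumbering described in this message can be sketched in C as follows. This is a minimal illustration under assumptions, not the code from the thread: the function names `build_compact_map` and `pack_state`, the flat layout of the old arrays, and the grid sizes in the usage note are all hypothetical. It walks the same loop bounds as the Fortran loops quoted further down, gives each visited cell a consecutive new index m, and stores the 7 unknowns of each cell next to each other, in the spirit of xx(1,m), xx(2,m), ... The packed vector then has 7 times as many entries as there are cells, matching m=m*7.

```c
#include <assert.h>

#define DOF 7  /* unknowns per cell, as stated in the mail */

/* Fill old_of_new[m] with the old (1-based) index L of the m-th visited
   cell. Returns the number of cells, or -1 if capacity is too small.
   Name and signature are hypothetical. */
int build_compact_map(int nil, int njl, int nkl, int nmpl,
                      int *old_of_new, int capacity)
{
    int m = 0;
    for (int k = 1; k <= nkl - 1; ++k)
        for (int j = 1; j <= njl - 1; ++j)
            for (int i = 1; i <= nil - 1; ++i) {
                if (m >= capacity) return -1;
                old_of_new[m++] = i + nil * (j - 1) + nil * njl * (k - 1) + nmpl;
            }
    return m;
}

/* Copy DOF fields from the old numbering into an interlaced compact
   vector: xx[DOF*m + d] holds field d of cell m. The old data is assumed
   (for this sketch only) to store field d at old[d*old_len + (L-1)]. */
void pack_state(const int *old_of_new, int ncells,
                const double *old, int old_len, double *xx)
{
    for (int m = 0; m < ncells; ++m) {
        assert(old_of_new[m] >= 1 && old_of_new[m] <= old_len);
        for (int d = 0; d < DOF; ++d)
            xx[DOF * m + d] = old[d * old_len + (old_of_new[m] - 1)];
    }
}
```

For a single hypothetical block with NIL=4, NJL=3, NKL=3 and NMPL=0, `build_compact_map` visits 12 cells whose old indices run from 1 to 19, so the packed vector has 84 entries and contains no unused slots.
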
> >
> >
> > On 3/20/06, Barry Smith <bsmith at mcs.anl.gov> wrote:
> >>
> >> Nisse,
> >>
> >> Yes. You need to make sure that all the entries computed
> >> in your FormFunction() are correctly F_i(x) for 0 <= i < globalvectorlength!!
> >> The "zeros" will destroy the Newton convergence since the Jacobian
> >> will be singular, plus if the "zeros" are actually random "garbage"
> >> because the memory has not been initialized that will really screw
> >> things up.
> >>
> >> Barry
> >>
> >>
> >> On Mon, 20 Mar 2006, Nils Erik Svangård wrote:
> >>
> >>> Hi again,
> >>> I thought that I would comment on the "diff from last" being 0.90 all
> >>> the time. That is because of an error in the calculation (by me).
> >>>
> >>> do IMESH=1,NMESH
> >>> NMPL=NMP(IMESH)
> >>> NIL = NI(IMESH)
> >>> NJL = NJ(IMESH)
> >>> NKL = NK(IMESH)
> >>> do K=1,NKL-1
> >>> do J=1,NJL-1
> >>> do I=1,NIL-1
> >>> L=I+NIL*(J-1)+NIL*NJL*(K-1)+NMPL
> >>> count=count+1
> >>> DRO(L)=magic()
> >>> end do
> >>> end do
> >>> end do
> >>> end do
> >>>
> >>> count after a run is 8019 and L is 8899, so the vector containing the
> >>> solution also has a couple of zeros in the middle. Could this affect
> >>> the SNES solver?
> >>>
> >>> /nisse
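
The mismatch between count and L can be reproduced on a small case. The sketch below is an illustration only: `old_numbering_stats` and `mean_ratio` are hypothetical helper names, and the small grid sizes are made up. The first function repeats the loop bounds and index formula from this message and reports how many cells are visited and the largest L that is touched. The second shows what happens to an average of new/old ratios when the sum over the visited cells is divided by L instead of by count.

```c
#include <assert.h>

/* Number of loop iterations and largest old index L for one mesh block. */
typedef struct { int count; int max_l; } IndexStats;

/* Mirrors the Fortran loops: I, J, K run from 1 to N-1 (1-based) and
   L = I + NIL*(J-1) + NIL*NJL*(K-1) + NMPL. Hypothetical helper. */
IndexStats old_numbering_stats(int nil, int njl, int nkl, int nmpl)
{
    IndexStats s = {0, 0};
    for (int k = 1; k <= nkl - 1; ++k)
        for (int j = 1; j <= njl - 1; ++j)
            for (int i = 1; i <= nil - 1; ++i) {
                int l = i + nil * (j - 1) + nil * njl * (k - 1) + nmpl;
                s.count++;
                if (l > s.max_l) s.max_l = l;
            }
    return s;
}

/* Average obtained when `count` ratios, each assumed equal to `ratio`,
   are summed and then divided by `divisor`. */
double mean_ratio(int count, int divisor, double ratio)
{
    assert(divisor > 0);
    return (count * ratio) / divisor;
}
```

For a hypothetical block with NIL=4, NJL=3, NKL=3 and NMPL=0, the loops visit 12 cells while L reaches 19, so 7 slots are never written. With the numbers from this message, `mean_ratio(8019, 8899, 1.0)` is about 0.9011, which is consistent with the "Diff from last" value in the log quoted below if every ratio is close to 1.
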
> >>>
> >>>
> >>> On 3/20/06, Nils Erik Svangård <nilserik at gmail.com> wrote:
> >>>> Hi,
> >>>> The timestep doesn't change dramatically just before the values get
> >>>> strange. The term "diff from last" is:
> >>>> a=new_value(L)/old_value(L)+a
> >>>> Diff from last=a/L
> >>>> To me it seems as if something PETSc does supplies strange values.
> >>>> The output before crashing:
> >>>>
> >>>>
> >>>> Iterering: 667.000
> >>>> H2T
> >>>> Timestep: 1.000000000000000E-007
> >>>> H1T
> >>>> Diff from last: 0.901112485562141
> >>>> AUXVR
> >>>> ROMIN,ROMAX= 1.22541 1.22657
> >>>> UMIN ,UMAX = -4.016098E-02 0.278343
> >>>> VMIN ,VMAX = -1.853829E-03 3.289763E-02
> >>>> WMIN ,WMAX = -1.852430E-03 1.855088E-03
> >>>> PMIN ,PMAX = 101283. 101416.
> >>>> QMIN ,QMAX = 0.815285 0.816229
> >>>> EMIN ,EMAX = 4.07643 4.08175
> >>>> V3
> >>>> Hello CONV4
> >>>> RUS: 12.2556
> >>>> V4
> >>>> DIFF3
> >>>> F3
> >>>> Iterering: 668.000
> >>>> H2T
> >>>> Timestep: 1.000000000000000E-007
> >>>> H1T
> >>>> Diff from last: 0.901112485562141
> >>>> AUXVR
> >>>> ROMIN,ROMAX= 1.22541 1.22657
> >>>> UMIN ,UMAX = -4.016098E-02 0.278343
> >>>> VMIN ,VMAX = -1.853829E-03 3.289763E-02
> >>>> WMIN ,WMAX = -1.852430E-03 1.855088E-03
> >>>> PMIN ,PMAX = 101283. 101416.
> >>>> QMIN ,QMAX = 0.815285 0.816229
> >>>> EMIN ,EMAX = 4.07643 4.08175
> >>>> V3
> >>>> Hello CONV4
> >>>> RUS: 12.2556
> >>>> V4
> >>>> DIFF3
> >>>> F3
> >>>> Iterering: 669.000
> >>>> H2T
> >>>> Timestep: 1.000000000000000E-007
> >>>> H1T
> >>>> Diff from last: 1.785767662774350E-006
> >>>> AUXVR
> >>>> --------------------------------------------------------------------------
> >>>> Petsc Release Version 2.3.0, Patch 32, April, 26, 2005
> >>>> See docs/changes/index.html for recent updates.
> >>>> See docs/faq.html for hints about trouble shooting.
> >>>> See docs/index.html for manual pages.
> >>>> -----------------------------------------------------------------------
> >>>> ./withrk on a linux-gnu named sethnx004.vac.com by yy26539 Mon Mar 20
> >>>> 10:17:33 2006
> >>>> Libraries linked from /home/yy26539/work/NISSE/petsc-2.3.0/lib/linux-gnu
> >>>> Configure run at Thu Oct 13 08:23:56 2005
> >>>> Configure options --with-cc=gcc --with-fc="f77 -N109"
> >>>> --download-mpich=1 --download-mpich-pm=gforker
> >>>> --download-f-blas-lapack=1 --with-shared=0
> >>>> -----------------------------------------------------------------------
> >>>> [0]PETSC ERROR: Caught signal number 8 FPE: Floating Point
> >>>> Exception,probably divide by zero
> >>>> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> >>>> [0]PETSC ERROR: likely location of problem given in stack below
> >>>> [0]PETSC ERROR: --------------- Stack Frames ---------------
> >>>> [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
> >>>> [0]PETSC ERROR: INSTEAD the line number of the start of the function
> >>>> [0]PETSC ERROR: is given.
> >>>> [0]PETSC ERROR: [0] SNESComputeFunction line 788 src/snes/interface/snes.c
> >>>> [0]PETSC ERROR: [0] MatMult_MFFD line 235 src/snes/mf/snesmfj.c
> >>>> [0]PETSC ERROR: [0] MatMult line 1368 src/mat/interface/matrix.c
> >>>> [0]PETSC ERROR: [0] SNESLSCheckLocalMin_Private line 19 src/snes/impls/ls/ls.c
> >>>> [0]PETSC ERROR: [0] SNESSolve line 1656 src/snes/interface/snes.c
> >>>> [0]PETSC ERROR: --------------------------------------------
> >>>> [0]PETSC ERROR: User provided function() line 0 in unknown directory
> >>>> unknown file
> >>>> [0]PETSC ERROR: Signal received!
> >>>> [0]PETSC ERROR: !
> >>>> aborting job:
> >>>> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0
> >>>>
> >>>>
> >>>> /nisse
> >>>>
> >>>> On 3/18/06, Nils Erik Svangård <nilserik at gmail.com> wrote:
> >>>>> On 3/18/06, Barry Smith <bsmith at mcs.anl.gov> wrote:
> >>>>>>
> >>>>>> So you are saying that TS is feeding "reasonable" input
> >>>>>> for a while? That basically matches (in scale) the values fed in
> >>>>>> from the old rk code? Then SUDDENLY? it inputs values with a very
> >>>>>> different size?
> >>>>>
> >>>>> Yes.
> >>>>>
> >>>>>> Does the TS time-step size? also change dramatically at
> >>>>>> that point?
> >>>>>>
> >>>>>
> >>>>> No, I don't think so. However, I'm not 100% sure that the timestep
> >>>>> hasn't changed dramatically in the last iteration before it crashes
> >>>>> (but 95% sure). I can check that on Monday.
> >>>>>
> >>>>> /nisse
> >>>>>
> >>>>>> Barry
> >>>>>>
> >>>>>>
> >>>>>> On Sat, 18 Mar 2006, Nils Erik Svangård wrote:
> >>>>>>
> >>>>>>> On 3/17/06, Barry Smith <bsmith at mcs.anl.gov> wrote:
> >>>>>>>>
> >>>>>>>> Based on my understanding. This is correct.
> >>>>>>>
> >>>>>>> I was almost hoping I had missed something fundamental.
> >>>>>>>
> >>>>>>>>
> >>>>>>>> Suggest you run a TINY problem with your "old" code
> >>>>>>>> and the TS (and/or SNES) one. Print out everything: the
> >>>>>>>> current solution and the result from calling Get_DRO(). Then
> >>>>>>>> compare the runs: when and why do they change? This will
> >>>>>>>> help understand what is going on.
> >>>>>>>
> >>>>>>> Well, Get_DRO() is exactly the same in the old and new code, and they
> >>>>>>> start with the same initial values. Get_DRO() is not producing strange
> >>>>>>> values, it is crashing because it gets strange values fed into it.
> >>>>>>>
> >>>>>>> A normal run for the formfunction f(in,out) in TS would look something
> >>>>>>> like this if printed:
> >>>>>>> in=25560 <- Start value
> >>>>>>> out=0.0001 <- du/dt
> >>>>>>> in=25560 <- Value fed into formfunction after some TS magic
> >>>>>>> out=0.0001 <- du/dt
> >>>>>>> ... <- (a couple iterations)
> >>>>>>> in=25451 <- Input value after a couple of iterations
> >>>>>>> out=0.0001 <- du/dt
> >>>>>>> in=0.23151 <- Value fed into formfunction decided by TS
> >>>>>>> Petscerror divide by zero...
> >>>>>>>
> >>>>>>> The problem seems to be that TS decides that the input values to
> >>>>>>> formfunction should be (in this case) 0.23151 which makes Get_DRO()
> >>>>>>> crash because of for example sqrt(in-1).
> >>>>>>>
> >>>>>>> I have no idea how to fix this. The values from Get_DRO() in TS match
> >>>>>>> the values when running the old rk code.
> >>>>>>>
> >>>>>>> /nisse
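
One way to catch the bad input before Get_DRO() fails is to scan the vector handed to the form function and report the first entry outside the range the physics code can accept. The sketch below is only an illustration: `first_out_of_domain` is a hypothetical helper, and the lower bound of 1.0 in the usage note is chosen to match the sqrt(in-1) example in this message, not taken from the real code.

```c
#include <assert.h>

/* Return the index of the first entry below `lower`, or -1 if every
   entry is acceptable. With lower = 1.0 this flags values for which
   sqrt(in - 1) would be undefined. Hypothetical helper. */
int first_out_of_domain(const double *in, int n, double lower)
{
    assert(n >= 0);
    for (int i = 0; i < n; ++i)
        if (in[i] < lower)
            return i;
    return -1;
}
```

Called with the values 25560, 25451 and 0.23151 from the trace above and a lower bound of 1.0, the helper returns the index of the last value, so the offending iteration can be logged or stopped in a debugger before the floating point exception occurs.
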
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>>
> >>>>>>>> Barry
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Fri, 17 Mar 2006, Nils Erik Svangård wrote:
> >>>>>>>>
> >>>>>>>>> Ok, I managed to compile and link rk.c to my Fortran program. I forgot
> >>>>>>>>> that in C you need a ; at the end of every statement. (stupid mistake
> >>>>>>>>> ;) )
> >>>>>>>>>
> >>>>>>>>> I have just added one line to see if it works.
> >>>>>>>>>
> >>>>>>>>> /* computing new dt */
> >>>>>>>>> dt = dt * dt_fac;
> >>>>>>>>>
> >>>>>>>>> /* Start Nisse stuff */
> >>>>>>>>> ierr = PetscPrintf(PETSC_COMM_WORLD,"Nisse prints dt: %f\n",dt);
> >>>>>>>>> /* End nisse stuff */
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> if(ts->ptime+dt > ts->max_time){
> >>>>>>>>> dt = ts->max_time - ts->ptime;
> >>>>>>>>> }
> >>>>>>>>>
> >>>>>>>>> I am just trying to print the current timestep; however, this is
> >>>>>>>>> never printed. And I'm not really sure that it is the timestep that is
> >>>>>>>>> causing the problems.
> >>>>>>>>>
> >>>>>>>>> I have used call TSGetTimeStep(ts,timestep,ierr) to monitor what
> >>>>>>>>> timestep TS uses and it seems OK. After the first iteration of
> >>>>>>>>> FormFunction everything seems OK, but at the start of the second
> >>>>>>>>> iteration all values are really strange.
> >>>>>>>>>
> >>>>>>>>> I see the same thing when using SNES and my backward Euler
> >>>>>>>>> implementation. It iterates many more times, but all of a sudden all
> >>>>>>>>> the "in" values are in the range 0.2-0.7 (for all 7 equations) and my
> >>>>>>>>> code bombs because of the strange values.
> >>>>>>>>>
> >>>>>>>>> When using TS and running with -snes_mf -ts_type beuler -ksp_rtol
> >>>>>>>>> 1.e-10 this is what is printed just before producing strange values:
> >>>>>>>>> KSP Object:
> >>>>>>>>> type: gmres
> >>>>>>>>> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt
> >>>>>>>>> Orthogonalization with no iterative refinement
> >>>>>>>>> GMRES: happy breakdown tolerance 1e-30
> >>>>>>>>> maximum iterations=10000, initial guess is zero
> >>>>>>>>> tolerances: relative=1e-10, absolute=1e-50, divergence=10000
> >>>>>>>>> left preconditioning
> >>>>>>>>> PC Object:
> >>>>>>>>> type: none
> >>>>>>>>> linear system matrix = precond matrix:
> >>>>>>>>> Matrix Object:
> >>>>>>>>> type=mffd, rows=70000, cols=70000
> >>>>>>>>> SNES matrix-free approximation:
> >>>>>>>>> err=1e-07 (relative error in function evaluation)
> >>>>>>>>> Using wp compute h routine
> >>>>>>>>> Computes normA
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> And just to make sure that I haven't misunderstood how SNES and TS work:
> >>>>>>>>> If the original 3-stage RK uses (my numbering):
> >>>>>>>>> 1. RO0(L)=RO(L)
> >>>>>>>>> Get_DRO(RO(L))
> >>>>>>>>> RO(L)=RO0(L)+CFL*DRO(L)
> >>>>>>>>> 2. RO0(L)=.5*(RO0(L)+RO(L))
> >>>>>>>>> Get_DRO(RO(L))
> >>>>>>>>> RO(L)=RO0(L)+.5*CFL*DRO(L)
> >>>>>>>>> 3. Get_DRO(RO(L))
> >>>>>>>>> RO(L)=RO0(L)+.5*CFL*DRO(L)
> >>>>>>>>>
> >>>>>>>>> Then this should be in TS which should return du/dt which is DRO:
> >>>>>>>>> RO(L)=xx(1,L)
> >>>>>>>>> Get_DRO(RO(L))
> >>>>>>>>> ff(1,L) = DRO(L)
> >>>>>>>>>
> >>>>>>>>> And in SNES with backward Euler:
> >>>>>>>>> (Old RO from previous iteration: ORO(L))
> >>>>>>>>> RO(L)=xx(1,L)
> >>>>>>>>> Get_DRO(RO(L))
> >>>>>>>>> ff(1,L)= RO(L)-OLD(1,L)-TSF(L)*DRO(L)
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> This became a long mail, I hope this shows if I missed something vital.
> >>>>>>>>> /nisse
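
The three formulations listed in this message can be written out for a single scalar unknown. The sketch below is an illustration under stated assumptions, not the code from the thread: `get_dro` is a hypothetical stand-in for Get_DRO() that returns du/dt = -u, and the arguments `cfl` and `dt` play the roles of CFL and TSF(L). It shows the 3-stage update as numbered above, the right-hand side that TS expects (just du/dt), and the backward Euler residual that SNES drives to zero.

```c
#include <assert.h>

/* Hypothetical stand-in for Get_DRO(): du/dt = -u. */
double get_dro(double ro) { return -ro; }

/* One step of the 3-stage scheme, following the numbering in the mail. */
double rk3_step(double ro, double cfl)
{
    double ro0 = ro;                       /* 1. RO0 = RO            */
    ro = ro0 + cfl * get_dro(ro);
    ro0 = 0.5 * (ro0 + ro);                /* 2. RO0 = .5*(RO0+RO)   */
    ro = ro0 + 0.5 * cfl * get_dro(ro);
    ro = ro0 + 0.5 * cfl * get_dro(ro);    /* 3. reuse RO0 from 2.   */
    return ro;
}

/* TS form: the function only returns du/dt for the state it is given. */
double ts_rhs(double x) { return get_dro(x); }

/* SNES form with backward Euler: F(x) = x - old - dt*DRO(x). */
double beuler_residual(double x, double old, double dt)
{
    assert(dt > 0.0);
    return x - old - dt * get_dro(x);
}
```

For this stand-in right-hand side, the backward Euler residual vanishes at x = old/(1+dt), which gives a quick check that the sign convention of the residual is the intended one before moving to the full 7-equation system.
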
> >>>>>>>>> On 3/16/06, Nils Erik Svangård <nilserik at gmail.com> wrote:
> >>>>>>>>>> Barry,
> >>>>>>>>>> the problem is making the objectfile, but I'll try again when I have
> >>>>>>>>>> the code. I will check the makefile for the c-examples.
> >>>>>>>>>> /nisse
> >>>>>>>>>>
> >>>>>>>>>> On 3/16/06, Barry Smith <bsmith at mcs.anl.gov> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> Nisse,
> >>>>>>>>>>>
> >>>>>>>>>>> Just list it in your makefile with all your other object
> >>>>>>>>>>> files (that come from Fortran). Send the output if this fails.
> >>>>>>>>>>>
> >>>>>>>>>>> Barry
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On Thu, 16 Mar 2006, Nils Erik Svangård wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> I haven't managed to get rk.c compiled with changes. How do I compile
> >>>>>>>>>>>> it in my working directory to get an object file? I just realised that
> >>>>>>>>>>>> I probably forgot to link it against $TSLIB, but should I need to do that
> >>>>>>>>>>>> when I don't do any linking at that point, since the linking is done when
> >>>>>>>>>>>> linking the Fortran and the C code?
> >>>>>>>>>>>> Or what am I doing wrong? (I'm not that good with C++ and linking.)
> >>>>>>>>>>>> /nisse
> >>>>>>>>>>>>
> >>>>>>>>>>>> On 3/15/06, Barry Smith <bsmith at mcs.anl.gov> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Both
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Wed, 15 Mar 2006, Nils Erik Svangård wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Sorry, I forgot. Is the linear solver converging? If not, then that
> >>>>>>>>>>>>>>> is the problem? Use a tolerance like -ksp_rtol 1.e-10 and see if the
> >>>>>>>>>>>>>>> nonlinear solver converges.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I'll try that when I have access to the code. I haven't checked if the
> >>>>>>>>>>>>>> linear solver converges. Is there a -kspmonitor or -kspconvergedreason
> >>>>>>>>>>>>>> I should use?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> /nisse
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> --
> >>>>>>>>>>>> Nils-Erik Svangård
> >>>>>>>>>>>> E-Mail: nilserik at gmail.com
> >>>>>>>>>>>> MSN: schweingaard at hotmail.com
> >>>>>>>>>>>> Skype: schweingaard
> >>>>>>>>>>>> Mobil: +46-(0)70-3612178
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>>
> >>>
> >>>
> >>>
> >>>
> >>
> >
> >
> >
> >
>
--
Nils-Erik Svangård
E-Mail: nilserik at gmail.com
MSN: schweingaard at hotmail.com
Skype: schweingaard
Mobil: +46-(0)70-3612178