SNES Problem
Nils Erik Svangård
nilserik at gmail.com
Mon Mar 20 08:58:36 CST 2006
Barry,
I have now rewritten the code:
do 1:i
  do 1:j
    do 1:k
      l = old_numbering
      m = m + 1        ! new numbering
      xx(1,m) = RO(l)
      xx(2,m) = RU(l)
      ...
    end do
  end do
end do
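As a rough illustration of why the renumbering matters, here is a minimal C sketch, assuming a single mesh block and the old index formula L = I + NIL*(J-1) + NIL*NJL*(K-1) quoted further down in this thread (the NMPL offset is left out). The function names old_index and build_map are made up for this example and are not part of the real code.

```c
#include <assert.h>
#include <stddef.h>

/* Old 1-based numbering for one mesh block:
   L = I + NIL*(J-1) + NIL*NJL*(K-1)  (NMPL offset omitted in this sketch). */
size_t old_index(size_t i, size_t j, size_t k, size_t nil, size_t njl) {
    assert(i >= 1 && j >= 1 && k >= 1);
    return i + nil * (j - 1) + nil * njl * (k - 1);
}

/* Walk loops that run 1..N-1 in each direction and store in map[m] the old
   index of the m-th visited cell (m is 0-based here).
   Returns the number of cells visited, i.e. the size of the new numbering.
   The caller must provide room for (nil-1)*(njl-1)*(nkl-1) entries. */
size_t build_map(size_t nil, size_t njl, size_t nkl, size_t *map) {
    size_t m = 0;
    for (size_t k = 1; k + 1 <= nkl; ++k)
        for (size_t j = 1; j + 1 <= njl; ++j)
            for (size_t i = 1; i + 1 <= nil; ++i)
                map[m++] = old_index(i, j, k, nil, njl);
    return m;
}
```

With the hypothetical sizes NIL=4, NJL=3, NKL=2, the loops visit 6 cells but the largest old index is 7, so a vector indexed by the old numbering has an entry that is never written, while the new numbering has none.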
The vectors are then created with the call below, where m is the number of
iterations of the do loop times 7, because we have 7 differential equations
to solve (DOF = 7).
m=m*7
call VecCreateSeq(PETSC_COMM_SELF,m,x,ierr)
Could solving 7 equations at the same time be a source of trouble?
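To make the layout concrete, here is a small C sketch of how 7 unknowns per cell fit into one flat vector of length m*7. It assumes the unknowns of each cell are stored next to each other, which is what a Fortran array xx(7,ncell) gives in column-major order. The names vector_length and pack_index are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define DOF 7  /* unknowns per cell, as stated in the mail */

/* Length of the flat vector: one entry per unknown per cell. */
size_t vector_length(size_t ncell) { return ncell * DOF; }

/* 0-based position of unknown `dof` (0..DOF-1) of cell `m` (0-based).
   Corresponds to Fortran's xx(dof+1, m+1) for an array declared xx(DOF, ncell). */
size_t pack_index(size_t dof, size_t m) {
    assert(dof < DOF);
    return m * DOF + dof;
}
```

For 6 cells the vector has 42 entries and the last unknown of the last cell sits at position 41, so every position from 0 to 41 belongs to exactly one (dof, cell) pair.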
PETSc still supplies Get_DRO() with strange values. RK crashes immediately,
while beuler runs for about twice as many iterations but still fails to
complete a timestep.
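One way to check the point Barry makes in the reply quoted below (every entry of the function vector must be set) is to poison the output array before the residual routine runs and count what is left afterwards. This is only a sketch on a plain C array, not on a PETSc Vec, and the helper names poison and count_unset are invented for the example.

```c
#include <math.h>
#include <stddef.h>

/* Fill f with NaN so that entries the residual routine never writes stand out. */
void poison(double *f, size_t n) {
    for (size_t i = 0; i < n; ++i)
        f[i] = NAN;
}

/* Count entries that are still NaN after the residual routine has run.
   Note: an entry that the routine itself computed as NaN is counted too. */
size_t count_unset(const double *f, size_t n) {
    size_t unset = 0;
    for (size_t i = 0; i < n; ++i)
        if (isnan(f[i]))
            ++unset;
    return unset;
}
```

Calling poison on the array, then the residual routine, then count_unset should give 0 if the numbering has no gaps.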
/nisse
On 3/20/06, Barry Smith <bsmith at mcs.anl.gov> wrote:
>
> Nisse,
>
> Yes. You need to make sure that all the entries computed
> in your FormFunction() are correctly F_i(x) for 0 <= i < globalvectorlength!!
> The "zeros" will destroy the Newton convergence since the Jacobian
> will be singular, plus if the "zeros" are actually random "garbage"
> because the memory has not been initialized that will really screw
> things up.
>
> Barry
>
>
> On Mon, 20 Mar 2006, Nils Erik Svangård wrote:
>
> > Hi again,
> > I thought that I would comment on the diff from last being 0.90 all
> > the time. That was because of an error in the calculation (by me).
> >
> > do IMESH=1,NMESH
> > NMPL=NMP(IMESH)
> > NIL = NI(IMESH)
> > NJL = NJ(IMESH)
> > NKL = NK(IMESH)
> > do K=1,NKL-1
> > do J=1,NJL-1
> > do I=1,NIL-1
> > L=I+NIL*(J-1)+NIL*NJL*(K-1)+NMPL
> > count=count+1
> > DRO(L)=magic()
> > end do
> > end do
> > end do
> > end do
> >
> > After a run, count is 8019 and L is 8899, so the vector containing the
> > solution also has a number of zeros in the middle. Could this affect
> > the SNES solver?
> >
> > /nisse
> >
> >
> > On 3/20/06, Nils Erik Svangård <nilserik at gmail.com> wrote:
> >> Hi,
> >> The timestep doesn't change dramatically just before the values get
> >> strange. The term "diff from last" is computed as:
> >> a=new_value(L)/old_value(L)+a
> >> Diff from last=a/L
> >> To me it seems that something PETSc does supplies the strange values.
> >> The output before crashing:
> >>
> >>
> >> Iterering: 667.000
> >> H2T
> >> Timestep: 1.000000000000000E-007
> >> H1T
> >> Diff from last: 0.901112485562141
> >> AUXVR
> >> ROMIN,ROMAX= 1.22541 1.22657
> >> UMIN ,UMAX = -4.016098E-02 0.278343
> >> VMIN ,VMAX = -1.853829E-03 3.289763E-02
> >> WMIN ,WMAX = -1.852430E-03 1.855088E-03
> >> PMIN ,PMAX = 101283. 101416.
> >> QMIN ,QMAX = 0.815285 0.816229
> >> EMIN ,EMAX = 4.07643 4.08175
> >> V3
> >> Hello CONV4
> >> RUS: 12.2556
> >> V4
> >> DIFF3
> >> F3
> >> Iterering: 668.000
> >> H2T
> >> Timestep: 1.000000000000000E-007
> >> H1T
> >> Diff from last: 0.901112485562141
> >> AUXVR
> >> ROMIN,ROMAX= 1.22541 1.22657
> >> UMIN ,UMAX = -4.016098E-02 0.278343
> >> VMIN ,VMAX = -1.853829E-03 3.289763E-02
> >> WMIN ,WMAX = -1.852430E-03 1.855088E-03
> >> PMIN ,PMAX = 101283. 101416.
> >> QMIN ,QMAX = 0.815285 0.816229
> >> EMIN ,EMAX = 4.07643 4.08175
> >> V3
> >> Hello CONV4
> >> RUS: 12.2556
> >> V4
> >> DIFF3
> >> F3
> >> Iterering: 669.000
> >> H2T
> >> Timestep: 1.000000000000000E-007
> >> H1T
> >> Diff from last: 1.785767662774350E-006
> >> AUXVR
> >> --------------------------------------------------------------------------
> >> Petsc Release Version 2.3.0, Patch 32, April, 26, 2005
> >> See docs/changes/index.html for recent updates.
> >> See docs/faq.html for hints about trouble shooting.
> >> See docs/index.html for manual pages.
> >> -----------------------------------------------------------------------
> >> ./withrk on a linux-gnu named sethnx004.vac.com by yy26539 Mon Mar 20
> >> 10:17:33 2006
> >> Libraries linked from /home/yy26539/work/NISSE/petsc-2.3.0/lib/linux-gnu
> >> Configure run at Thu Oct 13 08:23:56 2005
> >> Configure options --with-cc=gcc --with-fc="f77 -N109"
> >> --download-mpich=1 --download-mpich-pm=gforker
> >> --download-f-blas-lapack=1 --with-shared=0
> >> -----------------------------------------------------------------------
> >> [0]PETSC ERROR: Caught signal number 8 FPE: Floating Point
> >> Exception,probably divide by zero
> >> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> >> [0]PETSC ERROR: likely location of problem given in stack below
> >> [0]PETSC ERROR: --------------- Stack Frames ---------------
> >> [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
> >> [0]PETSC ERROR: INSTEAD the line number of the start of the function
> >> [0]PETSC ERROR: is given.
> >> [0]PETSC ERROR: [0] SNESComputeFunction line 788 src/snes/interface/snes.c
> >> [0]PETSC ERROR: [0] MatMult_MFFD line 235 src/snes/mf/snesmfj.c
> >> [0]PETSC ERROR: [0] MatMult line 1368 src/mat/interface/matrix.c
> >> [0]PETSC ERROR: [0] SNESLSCheckLocalMin_Private line 19 src/snes/impls/ls/ls.c
> >> [0]PETSC ERROR: [0] SNESSolve line 1656 src/snes/interface/snes.c
> >> [0]PETSC ERROR: --------------------------------------------
> >> [0]PETSC ERROR: User provided function() line 0 in unknown directory
> >> unknown file
> >> [0]PETSC ERROR: Signal received!
> >> [0]PETSC ERROR: !
> >> aborting job:
> >> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0
> >>
> >>
> >> /nisse
> >>
> >> On 3/18/06, Nils Erik Svangård <nilserik at gmail.com> wrote:
> >>> On 3/18/06, Barry Smith <bsmith at mcs.anl.gov> wrote:
> >>>>
> >>>> So you are saying that TS is feeding "reasonable" input
> >>>> for a while? That basically matches (in scale) the values fed in
> >>>> from the old rk code? Then SUDDENLY it inputs values with a very
> >>>> different size?
> >>>
> >>> Yes.
> >>>
> >>>> Does the TS time-step size also change dramatically at
> >>>> that point?
> >>>>
> >>>
> >>> No, I don't think so. However, I'm not 100% sure that the timestep
> >>> hasn't changed dramatically in the last iteration before it crashes
> >>> (but 95% sure). I can check that on Monday.
> >>>
> >>> /nisse
> >>>
> >>>> Barry
> >>>>
> >>>>
> >>>> On Sat, 18 Mar 2006, Nils Erik Svangård wrote:
> >>>>
> >>>>> On 3/17/06, Barry Smith <bsmith at mcs.anl.gov> wrote:
> >>>>>>
> >>>>>> Based on my understanding, this is correct.
> >>>>>
> >>>>> I was almost hoping I had missed something fundamental.
> >>>>>
> >>>>>>
> >>>>>> Suggest you run a TINY problem with your "old" code
> >>>>>> and the TS (and/or SNES) one. Print out everything: the
> >>>>>> current solution and the result from calling Get_DRO(). Then
> >>>>>> compare the runs: when and why do they change? This will
> >>>>>> help understand what is going on.
> >>>>>
> >>>>> Well, Get_DRO() is exactly the same in the old and new code, and they
> >>>>> start with the same initial values. Get_DRO() is not producing strange
> >>>>> values; it is crashing because it gets strange values fed into it.
> >>>>>
> >>>>> A normal run for the formfunction f(in,out) in TS would look something
> >>>>> like this if printed:
> >>>>> in=25560 <- Start value
> >>>>> out=0.0001 <- du/dt
> >>>>> in=25560 <- Value fed into formfunction after some TS magic
> >>>>> out=0.0001 <- du/dt
> >>>>> ... <- (a couple iterations)
> >>>>> in=25451 <- Input value after a couple of iterations
> >>>>> out=0.0001 <- du/dt
> >>>>> in=0.23151 <- Value fed into formfunction decided by TS
> >>>>> Petscerror divide by zero...
> >>>>>
> >>>>> The problem seems to be that TS decides that the input values to
> >>>>> formfunction should be (in this case) 0.23151, which makes Get_DRO()
> >>>>> crash because of, for example, sqrt(in-1), whose argument is then negative.
> >>>>>
> >>>>> I have no idea how to fix this. The values from Get_DRO() in TS match
> >>>>> the values when running the old rk code.
> >>>>>
> >>>>> /nisse
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>>
> >>>>>> Barry
> >>>>>>
> >>>>>>
> >>>>>> On Fri, 17 Mar 2006, Nils Erik Svangård wrote:
> >>>>>>
> >>>>>>> OK, I managed to compile and link rk.c with my Fortran program. I forgot
> >>>>>>> that in C you need a ; at the end of every statement. (stupid mistake
> >>>>>>> ;) )
> >>>>>>>
> >>>>>>> I have just added one line to see if it works.
> >>>>>>>
> >>>>>>> /* computing new dt */
> >>>>>>> dt = dt * dt_fac;
> >>>>>>>
> >>>>>>> /* Start Nisse stuff: %g so that a dt of order 1e-7 does not print as 0.000000 */
> >>>>>>> ierr = PetscPrintf(PETSC_COMM_WORLD,"Nisse prints dt: %g\n",dt);CHKERRQ(ierr);
> >>>>>>> /* End nisse stuff */
> >>>>>>>
> >>>>>>>
> >>>>>>> if(ts->ptime+dt > ts->max_time){
> >>>>>>>   dt = ts->max_time - ts->ptime;
> >>>>>>> }
> >>>>>>>
> >>>>>>> I am just trying to print the current timestep; however, it is never
> >>>>>>> printed. And I'm not really sure that it is the timestep that is
> >>>>>>> causing the problems.
> >>>>>>>
> >>>>>>> I have used call TSGetTimeStep(ts,timestep,ierr) to monitor what
> >>>>>>> timestep TS uses, and it seems OK. After the first iteration of
> >>>>>>> FormFunction everything seems OK, but at the start of the second
> >>>>>>> iteration all values are really strange.
> >>>>>>>
> >>>>>>> I see the same thing when using SNES and my backward Euler implementation.
> >>>>>>> It iterates many more times, but all of a sudden all the "in"
> >>>>>>> values are in the range 0.2-0.7 (for all 7 equations) and my code
> >>>>>>> bombs because of the strange values.
> >>>>>>>
> >>>>>>> When using TS and running with -snes_mf -ts_type beuler -ksp_rtol
> >>>>>>> 1.e-10 this is what is printed just before producing strange values:
> >>>>>>> KSP Object:
> >>>>>>> type: gmres
> >>>>>>> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt
> >>>>>>> Orthogonalization with no iterative refinement
> >>>>>>> GMRES: happy breakdown tolerance 1e-30
> >>>>>>> maximum iterations=10000, initial guess is zero
> >>>>>>> tolerances: relative=1e-10, absolute=1e-50, divergence=10000
> >>>>>>> left preconditioning
> >>>>>>> PC Object:
> >>>>>>> type: none
> >>>>>>> linear system matrix = precond matrix:
> >>>>>>> Matrix Object:
> >>>>>>> type=mffd, rows=70000, cols=70000
> >>>>>>> SNES matrix-free approximation:
> >>>>>>> err=1e-07 (relative error in function evaluation)
> >>>>>>> Using wp compute h routine
> >>>>>>> Computes normA
> >>>>>>>
> >>>>>>>
> >>>>>>> And just to make sure that I haven't misunderstood how SNES and TS work:
> >>>>>>> If the original 3-stage RK uses (my numbering):
> >>>>>>> 1. RO0(L)=RO(L)
> >>>>>>> Get_DRO(RO(L))
> >>>>>>> RO(L)=RO0(L)+CFL*DRO(L)
> >>>>>>> 2. RO0(L)=.5*(RO0(L)+RO(L))
> >>>>>>> Get_DRO(RO(L))
> >>>>>>> RO(L)=RO0(L)+.5*CFL*DRO(L)
> >>>>>>> 3. Get_DRO(RO(L))
> >>>>>>> RO(L)=RO0(L)+.5*CFL*DRO(L)
> >>>>>>>
> >>>>>>> Then in TS the function should return du/dt, which is DRO:
> >>>>>>> RO(L)=xx(1,L)
> >>>>>>> Get_DRO(RO(L))
> >>>>>>> ff(1,L) = DRO(L)
> >>>>>>>
> >>>>>>> And in SNES with backward Euler
> >>>>>>> (old RO from the previous iteration is ORO(L)):
> >>>>>>> RO(L)=xx(1,L)
> >>>>>>> Get_DRO(RO(L))
> >>>>>>> ff(1,L)= RO(L)-OLD(1,L)-TSF(L)*DRO(L)
> >>>>>>>
> >>>>>>>
> >>>>>>> This became a long mail, I hope this shows if I missed something vital.
> >>>>>>> /nisse
> >>>>>>> On 3/16/06, Nils Erik Svangård <nilserik at gmail.com> wrote:
> >>>>>>>> Barry,
> >>>>>>>> the problem is making the object file, but I'll try again when I have
> >>>>>>>> the code. I will check the makefile for the C examples.
> >>>>>>>> /nisse
> >>>>>>>>
> >>>>>>>> On 3/16/06, Barry Smith <bsmith at mcs.anl.gov> wrote:
> >>>>>>>>>
> >>>>>>>>> Nisse,
> >>>>>>>>>
> >>>>>>>>> Just list it in your makefile with all your other object
> >>>>>>>>> files (that come from Fortran). Send the output if this fails.
> >>>>>>>>>
> >>>>>>>>> Barry
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Thu, 16 Mar 2006, Nils Erik Svangård wrote:
> >>>>>>>>>
> >>>>>>>>>> I haven't managed to get rk.c compiled with my changes. How do I compile
> >>>>>>>>>> it in my working directory to get an object file? I just realised that
> >>>>>>>>>> I probably forgot to link it against $TSLIB, but should I need to do that
> >>>>>>>>>> when I don't do any linking at that stage? Isn't the linking done when
> >>>>>>>>>> linking the Fortran and the C code?
> >>>>>>>>>> Or what am I doing wrong? (I'm not that good with C++ and linking.)
> >>>>>>>>>> /nisse
> >>>>>>>>>>
> >>>>>>>>>> On 3/15/06, Barry Smith <bsmith at mcs.anl.gov> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> Both
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On Wed, 15 Mar 2006, Nils Erik Svangård wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Sorry, I forgot. Is the linear solver converging? If not, then that
> >>>>>>>>>>>>> is the problem? Use a tolerance like -ksp_rtol 1.e-10 and see if the
> >>>>>>>>>>>>> nonlinear solver converges.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> I'll try that when I have access to the code. I haven't checked whether the
> >>>>>>>>>>>> linear solver converges. Is there a -ksp_monitor or -ksp_converged_reason
> >>>>>>>>>>>> option I should use?
> >>>>>>>>>>>>
> >>>>>>>>>>>> /nisse
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> --
> >>>>>>>>>> Nils-Erik Svangård
> >>>>>>>>>> E-Mail: nilserik at gmail.com
> >>>>>>>>>> MSN: schweingaard at hotmail.com
> >>>>>>>>>> Skype: schweingaard
> >>>>>>>>>> Mobil: +46-(0)70-3612178
--
Nils-Erik Svangård
E-Mail: nilserik at gmail.com
MSN: schweingaard at hotmail.com
Skype: schweingaard
Mobil: +46-(0)70-3612178