SNES Problem
Barry Smith
bsmith at mcs.anl.gov
Mon Mar 20 11:30:30 CST 2006
Nisse,
Is there any way you can bundle everything up so you could send
it to me and I could easily compile it and make? This approach to
trying to discover the problem is taking too long.
Barry
On Mon, 20 Mar 2006, Nils Erik Svangård wrote:
> Barry,
> I have now rewritten the code
>
> do 1:i
> do 1:j
> do 1:k
> l= old_numbering
> m++ newnumbering
> xx(1,m)=RO(l)
> xx(2,m)=RU(l)
> ...
> end dos
>
> The vectors are created by running, where m is the number of
> iterations in the do loop times 7 because we have 7 diffs to solve
> (DOF = 7).
>
> m=m*7
> call VecCreateSeq(PETSC_COMM_SELF,m,x,ierr)
>
> could this be a source of trouble, solving 7 diffs at the same time?
>
> Petsc still supply get_DRO with strange values, RK crashes directly
> but beuler does about the double amount of iterations but still doesnt
> to complete a timestep.
>
> /nisse
>
>
> On 3/20/06, Barry Smith <bsmith at mcs.anl.gov> wrote:
>>
>> Nisse,
>>
>> Yes. You need to make sure that all the entries computed
>> in your FormFunction() are correctly F_i(x) for 0 <= i < globalvectorlength!!
>> The "zeros" will destroy the Newton convergence since the Jacobian
>> will be singular, plus if the "zeros" are actually random "garbage"
>> because the memory has not be initialized that will really screw
>> things up.
>>
>> Barry
>>
>>
>> On Mon, 20 Mar 2006, Nils Erik Svangård wrote:
>>
>>> Hi again,
>>> I thought that I would comment on the diff from last being 0.90 all
>>> the time. That is because a error in calculation (by me).
>>>
>>> do IMESH=1,NMESH
>>> NMPL=NMP(IMESH)
>>> NIL = NI(IMESH)
>>> NJL = NJ(IMESH)
>>> NKL = NK(IMESH)
>>> do K=1,NKL-1
>>> do J=1,NJL-1
>>> do I=1,NIL-1
>>> L=I+NIL*(J-1)+NIL*NJL*(K-1)+NMPL
>>> count=count+1
>>> DRO(L)=magic()
>>> end do
>>> end do
>>> end do
>>> end do
>>>
>>> count after a run is 8019 and L is 8899, so the vector containg the
>>> solution also have a couple of zeros in the middle, could this effect
>>> the SNES solver?
>>>
>>> /nisse
>>>
>>>
>>> On 3/20/06, Nils Erik Svangård <nilserik at gmail.com> wrote:
>>>> Hi,
>>>> The timestep doesnt change dramaticly just before the values get
>>>> strange. The term diff from last is:
>>>> a=new_value(L)/old_value(L)+a
>>>> Diff from last=a/L
>>>> To me it seems as something petsc does supplies strange values.
>>>> The output berfore crashing:
>>>>
>>>>
>>>> Iterering: 667.000
>>>> H2T
>>>> Timestep: 1.000000000000000E-007
>>>> H1T
>>>> Diff from last: 0.901112485562141
>>>> AUXVR
>>>> ROMIN,ROMAX= 1.22541 1.22657
>>>> UMIN ,UMAX = -4.016098E-02 0.278343
>>>> VMIN ,VMAX = -1.853829E-03 3.289763E-02
>>>> WMIN ,WMAX = -1.852430E-03 1.855088E-03
>>>> PMIN ,PMAX = 101283. 101416.
>>>> QMIN ,QMAX = 0.815285 0.816229
>>>> EMIN ,EMAX = 4.07643 4.08175
>>>> V3
>>>> Hello CONV4
>>>> RUS: 12.2556
>>>> V4
>>>> DIFF3
>>>> F3
>>>> Iterering: 668.000
>>>> H2T
>>>> Timestep: 1.000000000000000E-007
>>>> H1T
>>>> Diff from last: 0.901112485562141
>>>> AUXVR
>>>> ROMIN,ROMAX= 1.22541 1.22657
>>>> UMIN ,UMAX = -4.016098E-02 0.278343
>>>> VMIN ,VMAX = -1.853829E-03 3.289763E-02
>>>> WMIN ,WMAX = -1.852430E-03 1.855088E-03
>>>> PMIN ,PMAX = 101283. 101416.
>>>> QMIN ,QMAX = 0.815285 0.816229
>>>> EMIN ,EMAX = 4.07643 4.08175
>>>> V3
>>>> Hello CONV4
>>>> RUS: 12.2556
>>>> V4
>>>> DIFF3
>>>> F3
>>>> Iterering: 669.000
>>>> H2T
>>>> Timestep: 1.000000000000000E-007
>>>> H1T
>>>> Diff from last: 1.785767662774350E-006
>>>> AUXVR
>>>> --------------------------------------------------------------------------
>>>> Petsc Release Version 2.3.0, Patch 32, April, 26, 2005
>>>> See docs/changes/index.html for recent updates.
>>>> See docs/faq.html for hints about trouble shooting.
>>>> See docs/index.html for manual pages.
>>>> -----------------------------------------------------------------------
>>>> ./withrk on a linux-gnu named sethnx004.vac.com by yy26539 Mon Mar 20
>>>> 10:17:33 2006
>>>> Libraries linked from /home/yy26539/work/NISSE/petsc-2.3.0/lib/linux-gnu
>>>> Configure run at Thu Oct 13 08:23:56 2005
>>>> Configure options --with-cc=gcc --with-fc="f77 -N109"
>>>> --download-mpich=1 --download-mpich-pm=gforker
>>>> --download-f-blas-lapack=1 --with-shared=0
>>>> -----------------------------------------------------------------------
>>>> [0]PETSC ERROR: Caught signal number 8 FPE: Floating Point
>>>> Exception,probably divide by zero
>>>> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
>>>> [0]PETSC ERROR: likely location of problem given in stack below
>>>> [0]PETSC ERROR: --------------- Stack Frames ---------------
>>>> [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
>>>> [0]PETSC ERROR: INSTEAD the line number of the start of the function
>>>> [0]PETSC ERROR: is given.
>>>> [0]PETSC ERROR: [0] SNESComputeFunction line 788 src/snes/interface/snes.c
>>>> [0]PETSC ERROR: [0] MatMult_MFFD line 235 src/snes/mf/snesmfj.c
>>>> [0]PETSC ERROR: [0] MatMult line 1368 src/mat/interface/matrix.c
>>>> [0]PETSC ERROR: [0] SNESLSCheckLocalMin_Private line 19 src/snes/impls/ls/ls.c
>>>> [0]PETSC ERROR: [0] SNESSolve line 1656 src/snes/interface/snes.c
>>>> [0]PETSC ERROR: --------------------------------------------
>>>> [0]PETSC ERROR: User provided function() line 0 in unknown directory
>>>> unknown file
>>>> [0]PETSC ERROR: Signal received!
>>>> [0]PETSC ERROR: !
>>>> aborting job:
>>>> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0
>>>>
>>>>
>>>> /nisse
>>>>
>>>> On 3/18/06, Nils Erik Svangård <nilserik at gmail.com> wrote:
>>>>> On 3/18/06, Barry Smith <bsmith at mcs.anl.gov> wrote:
>>>>>>
>>>>>> So you are saying that TS is feeding "reasonable" input
>>>>>> for a while? That basically matches (in scale) the values feed in
>>>>>> from the old rk code? Then SUDDENLY? it inputs values with a very
>>>>>> different size?
>>>>>
>>>>> Yes.
>>>>>
>>>>>> Does the TS time-step size? also change dramatically at
>>>>>> that point?
>>>>>>
>>>>>
>>>>> No, I dont think so. However I'm not 100% sure what the timestep
>>>>> hasent changed dramatically for the last iteration when it crashes
>>>>> (but 95% sure). I can check that on monday.
>>>>>
>>>>> /nisse
>>>>>
>>>>>> Barry
>>>>>>
>>>>>>
>>>>>> On Sat, 18 Mar 2006, Nils Erik Svangård wrote:
>>>>>>
>>>>>>> On 3/17/06, Barry Smith <bsmith at mcs.anl.gov> wrote:
>>>>>>>>
>>>>>>>> Based on my understanding. This is correct.
>>>>>>>
>>>>>>> I was almost hoping I had missed something fundamental.
>>>>>>>
>>>>>>>>
>>>>>>>> Suggest you run a TINY problem with your "old" code
>>>>>>>> and the TS (and or SNES one). Print out everything. The
>>>>>>>> current solution the result from calling Get_DRO() and
>>>>>>>> compare the runs, when and why do they change? This will
>>>>>>>> help understand what is going on.
>>>>>>>
>>>>>>> Well, Get_DRO() is exactly the same in the old and new code, and they
>>>>>>> start with the same initial values. Get_DRO() is not producing strange
>>>>>>> values, it is crashing because it get strange values fed into it.
>>>>>>>
>>>>>>> A normal run for the formfunction f(in,out) in TS would look something
>>>>>>> like this if printed:
>>>>>>> in=25560 <- Start value
>>>>>>> out=0.0001 <- du/dt
>>>>>>> in=25560 <- Value fed into formfunction after some TS magic
>>>>>>> out=0.0001 <- du/dt
>>>>>>> ... <- (a couple iterations)
>>>>>>> in=25451 <- Input value after a couple of iterations
>>>>>>> out=0.0001 <- du/dt
>>>>>>> in=0.23151 <- Value fed into formfunction decided by TS
>>>>>>> Petscerror divide by zero...
>>>>>>>
>>>>>>> The problem seems to be that TS decides that the input values to
>>>>>>> formfunction should be (in this case) 0.23151 which makes Get_DRO()
>>>>>>> crash because of for example sqrt(in-1).
>>>>>>>
>>>>>>> I have no idea how to fix this. The values from Get_DRO() in TS match
>>>>>>> the values when running the old rk code.
>>>>>>>
>>>>>>> /nisse
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Barry
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, 17 Mar 2006, Nils Erik Svangård wrote:
>>>>>>>>
>>>>>>>>> Ok, I managed to compile and link rk.c to my fortran program, I forgot
>>>>>>>>> that in C you need a ; in the end of every statement. (stupid mistake
>>>>>>>>> ;) )
>>>>>>>>>
>>>>>>>>> I have just added one line to see if it works.
>>>>>>>>>
>>>>>>>>> /* computing new dt */
>>>>>>>>> dt = dt * dt_fac;
>>>>>>>>>
>>>>>>>>> /* Start Nisse stuff */
>>>>>>>>> ierr = PetscPrintf(PETSC_COMM_WORLD,"Nisse prints dt: %f\n",dt);
>>>>>>>>> /* End nisse stuff */
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> if(ts->ptime+dt > ts->max_time){
>>>>>>>>> dt = ts->max_time - ts->ptime;
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> I just try to print the current timestep, however this is never
>>>>>>>>> printed. And I'm not really sure that it is the timestep that is
>>>>>>>>> causing the problems.
>>>>>>>>>
>>>>>>>>> I have used call TSGetTimeStep(ts,timestep,ierr) to monitor what
>>>>>>>>> timestep TS uses and it seem ok. However after the first iteration of
>>>>>>>>> FormFunction everything seems ok, but in start of the second iteration
>>>>>>>>> all values are really strange.
>>>>>>>>>
>>>>>>>>> I see the same thing when using SNES and my back euler implementation,
>>>>>>>>> it iterate many more times however, but all of a sudden the all "in"
>>>>>>>>> values are in the range 0.2-0.7 (for all 7 equations) and my code
>>>>>>>>> bombs because of the strange values.
>>>>>>>>>
>>>>>>>>> When using TS and running with -snes_mf -ts_type beuler -ksp_rtol
>>>>>>>>> 1.e-10 this is what printed just before producing strange values:
>>>>>>>>> KSP Object:
>>>>>>>>> type: gmres
>>>>>>>>> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt
>>>>>>>>> Orthogonalization with no iterative refinement
>>>>>>>>> GMRES: happy breakdown tolerance 1e-30
>>>>>>>>> maximum iterations=10000, initial guess is zero
>>>>>>>>> tolerances: relative=1e-10, absolute=1e-50, divergence=10000
>>>>>>>>> left preconditioning
>>>>>>>>> PC Object:
>>>>>>>>> type: none
>>>>>>>>> linear system matrix = precond matrix:
>>>>>>>>> Matrix Object:
>>>>>>>>> type=mffd, rows=70000, cols=70000
>>>>>>>>> SNES matrix-free approximation:
>>>>>>>>> err=1e-07 (relative error in function evaluation)
>>>>>>>>> Using wp compute h routine
>>>>>>>>> Computes normA
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> And just to make sure that I havent misunderstood how SNES and TS work:
>>>>>>>>> If the original 3-stage RK uses (my numbering):
>>>>>>>>> 1. RO0(L)=RO(L)
>>>>>>>>> Get_DRO(RO(L))
>>>>>>>>> RO(L)=RO0(L)+CFL*DRO(L)
>>>>>>>>> 2. RO0(L)=.5*(RO0(L)+RO(L))
>>>>>>>>> Get_DRO(RO(L))
>>>>>>>>> RO(L)=RO0(L)+.5*CFL*DRO(L)
>>>>>>>>> 3. Get_DRO(RO(L))
>>>>>>>>> RO(L)=RO0(L)+.5*CFL*DRO(L)
>>>>>>>>>
>>>>>>>>> Then this should be in TS which should return du/dt which is DRO:
>>>>>>>>> RO(L)=xx(1,L)
>>>>>>>>> Get_DRO(RO(L))
>>>>>>>>> ff(1,L) = DRO(L)
>>>>>>>>>
>>>>>>>>> And in SNES with back euler:
>>>>>>>>> (Old RO from previous iteration ORO(L)
>>>>>>>>> RO(L)=xx(1,L)
>>>>>>>>> Get_DRO(RO(L))
>>>>>>>>> ff(1,L)= RO(L)-OLD(1,L)-TSF(L)*DRO(L)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> This became a long mail, I hope this shows if I missed something vital.
>>>>>>>>> /nisse
>>>>>>>>> On 3/16/06, Nils Erik Svangård <nilserik at gmail.com> wrote:
>>>>>>>>>> Barry,
>>>>>>>>>> the problem is making the objectfile, but I'll try again when I have
>>>>>>>>>> the code. I will check the makefile for the c-examples.
>>>>>>>>>> /nisse
>>>>>>>>>>
>>>>>>>>>> On 3/16/06, Barry Smith <bsmith at mcs.anl.gov> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Nisse,
>>>>>>>>>>>
>>>>>>>>>>> Just list it in your makefile with all your other object
>>>>>>>>>>> files (that come from Fortran). Send the output if this fails.
>>>>>>>>>>>
>>>>>>>>>>> Barry
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Thu, 16 Mar 2006, Nils Erik Svangård wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I havent managed to get rk.c compiled with changes. how do I compile
>>>>>>>>>>>> it in my working directory to get a object file. I just realised that
>>>>>>>>>>>> I probably forgot to link it against $TSLIB but should I need to that
>>>>>>>>>>>> when I dont do any linking, the linking is done when linking the
>>>>>>>>>>>> fortran and the c code?
>>>>>>>>>>>> Or what am I doing wrong (I not that good with C++ and linking).
>>>>>>>>>>>> /nisse
>>>>>>>>>>>>
>>>>>>>>>>>> On 3/15/06, Barry Smith <bsmith at mcs.anl.gov> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Both
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, 15 Mar 2006, Nils Erik Svangård wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Sorry, I forgot. Is the linear solver converging? If not, then that
>>>>>>>>>>>>>>> is the problem? Use a tolerance like -ksp_rtol 1.e-10 and see if the
>>>>>>>>>>>>>>> nonlinear solver converges.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'll try that when I have access to the code. I havent checked if the
>>>>>>>>>>>>>> linear solver converges is thera a -kspmonitor or -kspconvergedreason
>>>>>>>>>>>>>> I should use?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> /nisse
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Nils-Erik Svangård
>>>>>>>>>>>> E-Mail: nilserik at gmail.com
>>>>>>>>>>>> MSN: schweingaard at hotmail.com
>>>>>>>>>>>> Skype: schweingaard
>>>>>>>>>>>> Mobil: +46-(0)70-3612178
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Nils-Erik Svangård
>>>>>>>>>> E-Mail: nilserik at gmail.com
>>>>>>>>>> MSN: schweingaard at hotmail.com
>>>>>>>>>> Skype: schweingaard
>>>>>>>>>> Mobil: +46-(0)70-3612178
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Nils-Erik Svangård
>>>>>>>>> E-Mail: nilserik at gmail.com
>>>>>>>>> MSN: schweingaard at hotmail.com
>>>>>>>>> Skype: schweingaard
>>>>>>>>> Mobil: +46-(0)70-3612178
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Nils-Erik Svangård
>>>>>>> E-Mail: nilserik at gmail.com
>>>>>>> MSN: schweingaard at hotmail.com
>>>>>>> Skype: schweingaard
>>>>>>> Mobil: +46-(0)70-3612178
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Nils-Erik Svangård
>>>>> E-Mail: nilserik at gmail.com
>>>>> MSN: schweingaard at hotmail.com
>>>>> Skype: schweingaard
>>>>> Mobil: +46-(0)70-3612178
>>>>>
>>>>
>>>>
>>>> --
>>>> Nils-Erik Svangård
>>>> E-Mail: nilserik at gmail.com
>>>> MSN: schweingaard at hotmail.com
>>>> Skype: schweingaard
>>>> Mobil: +46-(0)70-3612178
>>>>
>>>
>>>
>>> --
>>> Nils-Erik Svangård
>>> E-Mail: nilserik at gmail.com
>>> MSN: schweingaard at hotmail.com
>>> Skype: schweingaard
>>> Mobil: +46-(0)70-3612178
>>>
>>>
>>
>
>
> --
> Nils-Erik Svangård
> E-Mail: nilserik at gmail.com
> MSN: schweingaard at hotmail.com
> Skype: schweingaard
> Mobil: +46-(0)70-3612178
>
>
More information about the petsc-users
mailing list