[petsc-users] Snes behavior
Barry Smith
bsmith at mcs.anl.gov
Mon Jan 11 11:27:32 CST 2010
On Jan 11, 2010, at 9:32 AM, Ryan Yan wrote:
> Hi Barry,
> This is my case: In 1-d, for the interior pts the stencil width is
> 1, but for boundary pts the stencil width is 2.
>
> Should the stencil width in DACreate1d be 1 or 2?
2
>
> If I switch to stencil width 2, then I can get quadratic convergence.
Yes
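
To see why the width has to cover the boundary rows as well, here is a toy sketch in plain Python. It is not PETSc code: the residual, the helper names, and the coloring rule `j mod (2*width + 1)` are assumptions made up for illustration. It builds a finite-difference Jacobian by perturbing groups of columns at once, keeps only the entries the declared stencil width allows, and compares the result with a column-by-column reference.

```python
def residual(u):
    """Toy 1-D residual: 3-point interior rows, one-sided 3-point boundary rows."""
    n = len(u)
    f = [0.0] * n
    f[0] = -3.0 * u[0] + 4.0 * u[1] - u[2]              # reaches 2 cells away
    for i in range(1, n - 1):
        f[i] = u[i - 1] - 2.0 * u[i] + u[i + 1] + u[i] ** 2
    f[n - 1] = 3.0 * u[n - 1] - 4.0 * u[n - 2] + u[n - 3]  # reaches 2 cells away
    return f


def dense_fd_jacobian(u, h=1e-7):
    """Reference Jacobian: perturb one column at a time, keep every row."""
    n = len(u)
    f0 = residual(u)
    jac = [[0.0] * n for _ in range(n)]
    for j in range(n):
        up = list(u)
        up[j] += h
        fp = residual(up)
        for i in range(n):
            jac[i][j] = (fp[i] - f0[i]) / h
    return jac


def colored_fd_jacobian(u, width, h=1e-7):
    """Perturb all columns of one color together; keep rows within `width` of each column."""
    n = len(u)
    f0 = residual(u)
    jac = [[0.0] * n for _ in range(n)]
    ncolors = 2 * width + 1
    for color in range(ncolors):
        cols = range(color, n, ncolors)
        up = list(u)
        for j in cols:
            up[j] += h
        fp = residual(up)
        for j in cols:
            for i in range(max(0, j - width), min(n, j + width + 1)):
                jac[i][j] = (fp[i] - f0[i]) / h
    return jac


def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


u = [0.1 * (i + 1) for i in range(12)]
reference = dense_fd_jacobian(u)
err_width1 = max_abs_diff(colored_fd_jacobian(u, 1), reference)
err_width2 = max_abs_diff(colored_fd_jacobian(u, 2), reference)
print("declared width 1, max entry error:", err_width1)
print("declared width 2, max entry error:", err_width2)
```

In this toy problem, width 1 silently drops the two couplings from the boundary rows to the points two cells away, so the error is of order one. With width 2 the colored Jacobian agrees with the reference.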
>
> Thank you very much,
>
> Yan
>
> On Sun, Jan 10, 2010 at 11:22 PM, Barry Smith <bsmith at mcs.anl.gov>
> wrote:
>
> As I said in my previous mail, this means that the Jacobian being
> computed before was not accurate enough for some reason. This will
> happen if your function evaluation does NOT respect the stencil you
> provided when you created the DA: for example, if you told the DA to
> use a stencil width of 1 but somewhere your function evaluation
> grabs a value 2 cells away and uses it in the computation, say in
> some boundary condition. Or you told it the stencil was a star stencil
> but the function evaluation grabbed a value from a diagonal neighbor,
> which only a box stencil covers.
> There is a 99% probability this is your problem.
>
> If you cannot find the bug then send your function evaluation and
> how you form the DA to petsc-maint at mcs.anl.gov and we'll find it.
>
>
> Barry
>
>
>
> On Jan 10, 2010, at 9:13 PM, Ryan Yan wrote:
>
> Hi Barry,
> Here is the result for -snes_mf_operator:
>
> yy2250 at sci-m8n6 ~/test/1d $ srun -p sci-comp -N 2 -n 2
> ./HeatProfile1D -snes_mf_operator -dmmg_grid_sequence -pc_type bjacobi
> -snes_rtol 1e-15 -snes_monitor -ksp_rtol 1.e-12
> -ksp_converged_reason -snes_converged_reason > out.mf_operator &
>
>
> yy2250 at sci-m8n6 ~/test/1d $ cat out.mf_operator
> 0 SNES Function norm 1.411468156752e+08
> Linear solve converged due to CONVERGED_ITS iterations 1
> 1 SNES Function norm 1.396727866424e+08
> Linear solve converged due to CONVERGED_ITS iterations 1
> Nonlinear solve did not converge due to DIVERGED_LS_FAILURE
> 0 SNES Function norm 1.045985796073e+08
> Linear solve converged due to CONVERGED_RTOL iterations 44
> 1 SNES Function norm 7.912999174650e+07
> Linear solve converged due to CONVERGED_RTOL iterations 38
> 2 SNES Function norm 6.079225520436e+07
> Linear solve converged due to CONVERGED_RTOL iterations 66
> 3 SNES Function norm 4.610252725173e+07
> Linear solve converged due to CONVERGED_RTOL iterations 66
> 4 SNES Function norm 3.333587574896e+07
> Linear solve converged due to CONVERGED_RTOL iterations 46
> 5 SNES Function norm 2.326587775240e+07
> Linear solve converged due to CONVERGED_RTOL iterations 59
> 6 SNES Function norm 1.233942497170e+05
> Linear solve converged due to CONVERGED_RTOL iterations 38
> 7 SNES Function norm 2.309536978272e+03
> Linear solve converged due to CONVERGED_RTOL iterations 38
> 8 SNES Function norm 7.064055492974e-03
> Linear solve converged due to CONVERGED_RTOL iterations 38
> 9 SNES Function norm 3.419662119536e-06
> Nonlinear solve converged due to CONVERGED_PNORM_RELATIVE
> 0 SNES Function norm 3.420486720202e+09
> Linear solve converged due to CONVERGED_RTOL iterations 10
> 1 SNES Function norm 1.426355492941e+08
> Linear solve converged due to CONVERGED_RTOL iterations 36
> 2 SNES Function norm 1.429830907344e+06
> Linear solve converged due to CONVERGED_RTOL iterations 37
> 3 SNES Function norm 1.755275678702e+03
> Linear solve converged due to CONVERGED_RTOL iterations 37
> 4 SNES Function norm 1.216161531652e-03
> Linear solve converged due to CONVERGED_RTOL iterations 36
> 5 SNES Function norm 4.309776935609e-06
> Nonlinear solve converged due to CONVERGED_PNORM_RELATIVE
> Number of Newton iterations = 5
> Converged reason is 4
>
> This time the convergence is very aggressive, and it is quadratic
> from iteration 0 to iteration 4.
>
> The solution vector x from this run is the same as in the run without
> the option "-snes_mf_operator". Any hint as to why the convergence is
> now quadratic?
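
With -snes_mf_operator the Jacobian is applied by differencing the function itself, and the assembled matrix is used only to build the preconditioner. The sketch below illustrates the idea in plain Python. The residual, the helper names, and the fixed step `eps` are assumptions chosen for illustration; PETSc picks its differencing parameter in its own way. The point is that the differenced product sees every coupling the function evaluation uses, including the boundary rows that reach two cells away, whatever stencil was declared.

```python
def residual(u):
    """Toy 1-D residual with boundary rows that reach 2 cells away."""
    n = len(u)
    f = [0.0] * n
    f[0] = -3.0 * u[0] + 4.0 * u[1] - u[2]
    for i in range(1, n - 1):
        f[i] = u[i - 1] - 2.0 * u[i] + u[i + 1] + u[i] ** 2
    f[n - 1] = 3.0 * u[n - 1] - 4.0 * u[n - 2] + u[n - 3]
    return f


def matrix_free_jv(u, v, eps=1e-7):
    """Approximate J(u) v by (F(u + eps v) - F(u)) / eps, with no matrix assembled."""
    f0 = residual(u)
    fp = residual([ui + eps * vi for ui, vi in zip(u, v)])
    return [(a - b) / eps for a, b in zip(fp, f0)]


def exact_jv(u, v):
    """Hand-derived J(u) v for the toy residual, used only as a check."""
    n = len(u)
    jv = [0.0] * n
    jv[0] = -3.0 * v[0] + 4.0 * v[1] - v[2]
    for i in range(1, n - 1):
        jv[i] = v[i - 1] + (2.0 * u[i] - 2.0) * v[i] + v[i + 1]
    jv[n - 1] = 3.0 * v[n - 1] - 4.0 * v[n - 2] + v[n - 3]
    return jv


u = [0.1 * (i + 1) for i in range(8)]
v = [(-1.0) ** i for i in range(8)]
approx = matrix_free_jv(u, v)
exact = exact_jv(u, v)
max_error = max(abs(a - b) for a, b in zip(approx, exact))
print("max |J v (differenced) - J v (exact)| =", max_error)
```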
>
> Thank you very much,
>
> Yan
>
>
> On Sun, Jan 10, 2010 at 7:25 PM, Barry Smith <bsmith at mcs.anl.gov>
> wrote:
>
> On Jan 10, 2010, at 6:09 PM, Ryan Yan wrote:
>
> Hi Barry,
>
> I ran the code on another machine.
>
> yy2250 at sci-m8n6 ~/test/1d $ srun -p sci-comp -N 2 -n 2
> ./HeatProfile1D -snes_type test > out.snes_test
> [0]PETSC ERROR: SNESSolve() line 2221 in src/snes/interface/snes.c
> [1]PETSC ERROR: SNESSolve() line 2221 in src/snes/interface/snes.c
> [0]PETSC ERROR: DMMGSolveSNES() line 510 in src/snes/utils/damgsnes.c
> [0]PETSC ERROR: DMMGSolve() line 372 in src/snes/utils/damg.c
> [0]PETSC ERROR: main() line 270 in /home/vyan2000/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/NCprojectHeatProfile1D.c
> application called MPI_Abort(MPI_COMM_WORLD, 73) - process 0
> [1]PETSC ERROR: DMMGSolveSNES() line 510 in src/snes/utils/damgsnes.c
> [1]PETSC ERROR: DMMGSolve() line 372 in src/snes/utils/damg.c
> [1]PETSC ERROR: main() line 270 in /home/vyan2000/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/NCprojectHeatProfile1D.c
> In: PMI_Abort(73, application called MPI_Abort(MPI_COMM_WORLD, 73) - process 0)
> application called MPI_Abort(MPI_COMM_WORLD, 73) - process 1
> In: PMI_Abort(73, application called MPI_Abort(MPI_COMM_WORLD, 73) - process 1)
> srun: error: task 1: Exited with exit code 73
> srun: error: task 0: Exited with exit code 73
>
> I was using 3 grid levels. The error occurs during the SNES solve on
> the fine-level grid. The output in the file out.snes_test is:
> yy2250 at sci-m8n6 ~/test/1d $ cat out.snes_test
> Testing hand-coded Jacobian, if the ratio is
> O(1.e-8), the hand-coded Jacobian is probably correct.
> Run with -snes_test_display to show difference
> of hand-coded and finite difference Jacobian.
> Norm of matrix ratio 1.21314e-10 difference 1.41447
> Norm of matrix ratio 4.30115e-06 difference 1.41427
> Norm of matrix ratio 4.30108e-06 difference 1.41427
>
>
> Ok, Jacobians being used are probably ok.
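
When reading the numbers above, it may help to keep in mind that a relative measure can look small simply because the Jacobian has a large norm. The sketch below assumes the reported ratio is the Frobenius norm of the difference divided by the Frobenius norm of the Jacobian being tested; that is an assumption about the test, not something taken from the PETSc source. The matrices and the `scale` value are invented for illustration.

```python
import math


def frobenius_norm(m):
    return math.sqrt(sum(x * x for row in m for x in row))


def jacobian_check(tested, reference):
    """Return (ratio, difference) in the Frobenius norm for two dense matrices."""
    diff = [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(tested, reference)]
    difference = frobenius_norm(diff)
    ratio = difference / frobenius_norm(tested)
    return ratio, difference


n = 6
scale = 1.0e9  # hypothetical size of the dominant entries

# Reference Jacobian: large diagonal plus two O(1) boundary couplings.
reference = [[0.0] * n for _ in range(n)]
for i in range(n):
    reference[i][i] = scale
reference[0][2] = -1.0
reference[n - 1][n - 3] = 1.0

# Tested Jacobian: identical, except the two boundary couplings are missing.
tested = [row[:] for row in reference]
tested[0][2] = 0.0
tested[n - 1][n - 3] = 0.0

ratio, difference = jacobian_check(tested, reference)
print("ratio:", ratio)
print("difference:", difference)
```

In this made-up case two entries of size one are missing, yet the ratio is far below 1.e-8 because the diagonal dominates the norm. Looking at the absolute difference, or at the entries themselves with -snes_test_display, can be more telling for a badly scaled system.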
>
>
>
> For the -snes_mf_operator,
> yy2250 at sci-m8n6 ~/test/1d $ srun -p sci-comp -N 2 -n 2
> ./HeatProfile1D -snes_mf_operator
> the program is still running. I suspect that the program has hung.
>
> It's not hanging. It is just running. Use -snes_monitor -ksp_rtol
> 1.e-12 -ksp_converged_reason -snes_converged_reason and run again
>
> Barry
>
>
>
> Is there any useful information in the file "out.snes_test"?
>
> BTW, I did not hand-code any Jacobian, since I passed 0 for the
> Jacobian evaluation subroutine in the DMMGSetSNESLocal().
>
> Thank you very much,
>
> Yan
>
>
>
> On Sun, Jan 10, 2010 at 6:11 PM, Barry Smith <bsmith at mcs.anl.gov>
> wrote:
>
> That message ain't from PETSc. Something is likely killing the job.
>
> Barry
>
>
> On Jan 10, 2010, at 5:09 PM, Ryan Yan wrote:
>
> Hi Barry,
> I got the following result:
>
> vyan2000 at vyan2000-linux ~/NCproject/general $ mpirun -np 1
> ./HeatProfile1D -snes_mf_operator
> Alarm clock
> vyan2000 at vyan2000-linux ~/NCproject/general $ mpirun -np 1
> ./HeatProfile1D -snes_type test
> Alarm clock
>
> Are they normal responses and what do they indicate?
>
> Thanks a lot,
>
> Yan
>
>
> On Sun, Jan 10, 2010 at 5:57 PM, Barry Smith <bsmith at mcs.anl.gov>
> wrote:
>
> Run with JUST -snes_mf_operator, and then separately with
> -snes_type test, NOT both together.
>
> Barry
>
>
> On Jan 10, 2010, at 4:42 PM, Ryan Yan wrote:
>
> Hi Barry,
> Please see reply below,
>
> On Sun, Jan 10, 2010 at 4:57 PM, Barry Smith <bsmith at mcs.anl.gov>
> wrote:
>
> On Jan 10, 2010, at 3:52 PM, Ryan Yan wrote:
>
> Hi Barry,
> Yes, exactly. The original multi-component system is scaled quite
> unevenly. I will try to rescale it.
> Could rescaling help recover quadratic convergence?
>
> I wouldn't be concerned about "quadratic convergence". I'd only be
> concerned that it is converging to the correct answer and that you
> are getting close enough to the correct answer.
>
> Yes, I agree.
>
> You can run with -snes_mf_operator and -snes_type test to verify if
> the Jacobian being computed is accurate. Perhaps in your function
> evaluation you are not using the stencil that you set with the DA;
> this would cause the wrong Jacobian to be computed.
>
>
> After passing in -snes_mf_operator and -snes_type test as follows:
> vyan2000 at vyan2000-linux ~/NCproject/general : mpirun -np 2
> ./HeatProfile1D -snes_mf_operator -snes_type test
> I got errors, as expected:
>
> [0]PETSC ERROR: --------------------- Error Message ------------------------------------
> [0]PETSC ERROR: Invalid argument!
> [0]PETSC ERROR: Cannot test with alternative preconditioner!
> [0]PETSC ERROR: ------------------------------------------------------------------------
> [0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 5, Mon Apr 13 09:15:37 CDT 2009
> [0]PETSC ERROR: See docs/changes/index.html for recent updates.
> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
> [0]PETSC ERROR: See docs/index.html for manual pages.
> [0]PETSC ERROR: ------------------------------------------------------------------------
> [0]PETSC ERROR: ./HeatProfile1D on a O-hypre-p named vyan2000-linux by vyan2000 Sun Jan 10 17:16:00 2010
> [0]PETSC ERROR: Libraries linked from /home/vyan2000/local/PPETSc/petsc-3.0.0-p5/O-hypre-prometheus/lib
> [0]PETSC ERROR: Configure run at Thu Jun 25 13:49:36 2009
> [0]PETSC ERROR: Configure options --download-mpich=1 --with-debugger=gdb --download-hypre=1 --download-parmetis=1 --download-prometheus=1 --with-shared=0
> [0]PETSC ERROR: ------------------------------------------------------------------------
> [0]PETSC ERROR: SNESSolve_Test() line 28 in src/snes/impls/test/snestest.c
> [0]PETSC ERROR: SNESSolve() line 2221 in src/snes/interface/snes.c
> [0]PETSC ERROR: DMMGSolveSNES() line 510 in src/snes/utils/damgsnes.c
> [0]PETSC ERROR: DMMGSolve() line 372 in src/snes/utils/damg.c
> [0]PETSC ERROR: main() line 270 in /home/vyan2000/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/NCprojectHeatProfile1D.c
> application called MPI_Abort(MPI_COMM_WORLD, 62) - process 0[cli_0]: aborting job:
> application called MPI_Abort(MPI_COMM_WORLD, 62) - process 0
>
> The error makes sense, since I did not pass in the analytical
> Jacobian for the preconditioner matrix. Instead, I was using
> DMMGSetSNESLocal(dmmg,FormFunctionLocal,0,ad_FormFunctionLocal,admf_FormFunctionLocal).
> I am going to change the code a bit, pass in the analytical Jacobian,
> and run with -snes_type test.
>
> I will bear your suggestion in mind during the test.
>
> Thanks a lot,
>
> Yan
>
>
>
> Barry
>
>
>
> Thanks a lot,
>
> Yan
>
> On Sun, Jan 10, 2010 at 4:35 PM, Barry Smith <bsmith at mcs.anl.gov>
> wrote:
>
> You already got a 10^16 drop in the residual norm. It is not
> realistic to expect to get much more than that for double precision
> calculations. Perhaps your original F() has some funky scaling of
> different components that you can fix.
>
>
>
> Barry
>
>
> On Jan 10, 2010, at 2:55 PM, Ryan Yan wrote:
>
> Hi All,
> I am solving a nonlinear system using snes. The -snes_monitor option
> has the following output:
>
> 0 SNES Function norm 2.640163923729e+09
> 1 SNES Function norm 1.047643565314e+08
> 2 SNES Function norm 1.712732074788e+06
> 3 SNES Function norm 1.002169173269e+04
> 4 SNES Function norm 1.655878303433e+03
> 5 SNES Function norm 3.746498305706e+02
> 6 SNES Function norm 8.317435704773e+01
> 7 SNES Function norm 1.857639969641e+01
> 8 SNES Function norm 4.149691057773e+00
> 9 SNES Function norm 9.265604042412e-01
> 10 SNES Function norm 2.069527103214e-01
> 11 SNES Function norm 4.624186491082e-02
> 12 SNES Function norm 1.035558432688e-02
> 13 SNES Function norm 2.341362958811e-03
> 14 SNES Function norm 5.507445427277e-04
> 15 SNES Function norm 1.485123568354e-04
> 16 SNES Function norm 5.180043781814e-05
> 17 SNES Function norm 2.341966514486e-05
> 18 SNES Function norm 1.344936158651e-05
> 19 SNES Function norm 1.054812641176e-05
> Number of Newton iterations = 19
> Converged reason is 4
>
> It looks like the iteration never enters a region of quadratic
> convergence before it converges. Are there any hints for understanding
> this behavior?
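
One way to quantify this is to estimate the observed order of convergence from consecutive residual norms, p_k = log(r_{k+1}/r_k) / log(r_k/r_{k-1}). A value near 1 indicates linear convergence and a value near 2 indicates quadratic convergence. The sketch below applies this to the norms listed above; the helper name `observed_orders` is made up for illustration.

```python
import math


def observed_orders(norms):
    """Estimate p_k = log(r[k+1]/r[k]) / log(r[k]/r[k-1]) for each interior k."""
    orders = []
    for k in range(1, len(norms) - 1):
        numerator = math.log(norms[k + 1] / norms[k])
        denominator = math.log(norms[k] / norms[k - 1])
        orders.append(numerator / denominator)
    return orders


# SNES function norms copied from the -snes_monitor output above (iterations 0..19).
norms = [
    2.640163923729e+09, 1.047643565314e+08, 1.712732074788e+06,
    1.002169173269e+04, 1.655878303433e+03, 3.746498305706e+02,
    8.317435704773e+01, 1.857639969641e+01, 4.149691057773e+00,
    9.265604042412e-01, 2.069527103214e-01, 4.624186491082e-02,
    1.035558432688e-02, 2.341362958811e-03, 5.507445427277e-04,
    1.485123568354e-04, 5.180043781814e-05, 2.341966514486e-05,
    1.344936158651e-05, 1.054812641176e-05,
]

for k, p in enumerate(observed_orders(norms), start=1):
    ratio = norms[k + 1] / norms[k]
    print(f"iteration {k:2d} -> {k + 1:2d}: ratio {ratio:.3e}, observed order {p:.2f}")
```

For iterations 4 through 12 the ratio of successive norms stays near 0.22, so the estimated order is close to 1. That is linear convergence, which is what an inexact Jacobian typically produces.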
>
> Thanks a lot,
>
> Yan