[petsc-users] Snes behavior

Barry Smith bsmith at mcs.anl.gov
Mon Jan 11 11:27:32 CST 2010


On Jan 11, 2010, at 9:32 AM, Ryan Yan wrote:

> Hi Barry,
> This is my case: In 1-d, for the interior pts the stencil width is  
> 1, but for boundary pts the stencil width is 2.
>
> Should the stencil width in DACreate1d be 1 or 2?

2

>
> If I switch to stencil width 2, then I can get quadratic convergence.

Yes
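
A rough sketch of why the width matters, in plain Python rather than PETSc. It assumes a toy residual (made up for illustration) whose boundary rows use a one-sided difference reaching 2 cells away, and a simple coloring-based finite-difference Jacobian that trusts the declared stencil width. This is not PETSc's actual implementation; the names residual, colored_fd_jacobian and so on are hypothetical.

```python
def residual(u):
    """Toy 1-D residual: centered interior stencil, one-sided boundary stencil."""
    n = len(u)
    f = [0.0] * n
    # The one-sided second difference at each boundary reaches 2 cells away.
    f[0] = u[0] - 2.0 * u[1] + u[2] + u[0] ** 2
    for i in range(1, n - 1):
        f[i] = u[i - 1] - 2.0 * u[i] + u[i + 1] + u[i] ** 2
    f[n - 1] = u[n - 1] - 2.0 * u[n - 2] + u[n - 3] + u[n - 1] ** 2
    return f


def dense_fd_jacobian(u, h=1e-7):
    """Reference Jacobian: perturb one column at a time, assume nothing."""
    n = len(u)
    f0 = residual(u)
    jac = [[0.0] * n for _ in range(n)]
    for j in range(n):
        up = list(u)
        up[j] += h
        fp = residual(up)
        for i in range(n):
            jac[i][j] = (fp[i] - f0[i]) / h
    return jac


def colored_fd_jacobian(u, width, h=1e-7):
    """Colored finite differences that trust the declared stencil width."""
    n = len(u)
    ncolors = 2 * width + 1
    f0 = residual(u)
    jac = [[0.0] * n for _ in range(n)]
    for color in range(ncolors):
        # Perturb every column of this color at once.
        up = list(u)
        for j in range(color, n, ncolors):
            up[j] += h
        fp = residual(up)
        # Only entries inside the declared band |i - j| <= width are stored.
        for i in range(n):
            for j in range(max(0, i - width), min(n, i + width + 1)):
                if j % ncolors == color:
                    jac[i][j] = (fp[i] - f0[i]) / h
    return jac


def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


u = [0.1 * (i + 1) for i in range(8)]
reference = dense_fd_jacobian(u)
err_width1 = max_abs_diff(colored_fd_jacobian(u, 1), reference)
err_width2 = max_abs_diff(colored_fd_jacobian(u, 2), reference)
print("width 1 error:", err_width1)
print("width 2 error:", err_width2)
```

With width 1 the boundary-row entries that sit 2 columns off the diagonal fall outside the declared band and are silently dropped, so the Jacobian is wrong there and Newton loses its fast convergence. With width 2 the band covers everything the residual actually touches.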

>
> Thank you very much,
>
> Yan
>
> On Sun, Jan 10, 2010 at 11:22 PM, Barry Smith <bsmith at mcs.anl.gov>  
> wrote:
>
>   Like I said in my previous mail, this means that the Jacobian being  
> computed before was not accurate enough for some reason. This will  
> happen if your function evaluation does NOT respect the stencil you  
> provided when you created the DA. For example, if you told the DA to  
> use a stencil width of 1 but somewhere your function evaluation  
> grabs a value 2 cells away and uses it in the computation, for  
> example in some boundary condition. Or you told it to use a star  
> stencil but the function evaluation grabbed a value from a diagonal  
> (box stencil) neighbor.
> There is a 99% probability this is your problem.
>
>  If you cannot find the bug then send your function evaluation and  
> how you form the DA to petsc-maint at mcs.anl.gov and we'll find it.
>
>
>   Barry
>
>
>
> On Jan 10, 2010, at 9:13 PM, Ryan Yan wrote:
>
> Hi Barry,
> Here is the result for -snes_mf_operator:
>
> yy2250 at sci-m8n6 ~/test/1d $ srun -p sci-comp -N 2 -n 2 ./HeatProfile1D -snes_mf_operator -dmmg_grid_sequence -pc_type bjacobi -snes_rtol 1e-15 -snes_monitor -ksp_rtol 1.e-12 -ksp_converged_reason -snes_converged_reason > out.mf_operator &
>
>
> yy2250 at sci-m8n6 ~/test/1d $ cat out.mf_operator
>      0 SNES Function norm 1.411468156752e+08
> Linear solve converged due to CONVERGED_ITS iterations 1
>      1 SNES Function norm 1.396727866424e+08
> Linear solve converged due to CONVERGED_ITS iterations 1
> Nonlinear solve did not converge due to DIVERGED_LS_FAILURE
>    0 SNES Function norm 1.045985796073e+08
> Linear solve converged due to CONVERGED_RTOL iterations 44
>    1 SNES Function norm 7.912999174650e+07
> Linear solve converged due to CONVERGED_RTOL iterations 38
>    2 SNES Function norm 6.079225520436e+07
> Linear solve converged due to CONVERGED_RTOL iterations 66
>    3 SNES Function norm 4.610252725173e+07
> Linear solve converged due to CONVERGED_RTOL iterations 66
>    4 SNES Function norm 3.333587574896e+07
> Linear solve converged due to CONVERGED_RTOL iterations 46
>    5 SNES Function norm 2.326587775240e+07
> Linear solve converged due to CONVERGED_RTOL iterations 59
>    6 SNES Function norm 1.233942497170e+05
> Linear solve converged due to CONVERGED_RTOL iterations 38
>    7 SNES Function norm 2.309536978272e+03
> Linear solve converged due to CONVERGED_RTOL iterations 38
>    8 SNES Function norm 7.064055492974e-03
> Linear solve converged due to CONVERGED_RTOL iterations 38
>    9 SNES Function norm 3.419662119536e-06
> Nonlinear solve converged due to CONVERGED_PNORM_RELATIVE
>  0 SNES Function norm 3.420486720202e+09
> Linear solve converged due to CONVERGED_RTOL iterations 10
>  1 SNES Function norm 1.426355492941e+08
> Linear solve converged due to CONVERGED_RTOL iterations 36
>  2 SNES Function norm 1.429830907344e+06
> Linear solve converged due to CONVERGED_RTOL iterations 37
>  3 SNES Function norm 1.755275678702e+03
> Linear solve converged due to CONVERGED_RTOL iterations 37
>  4 SNES Function norm 1.216161531652e-03
> Linear solve converged due to CONVERGED_RTOL iterations 36
>  5 SNES Function norm 4.309776935609e-06
> Nonlinear solve converged due to CONVERGED_PNORM_RELATIVE
> Number of Newton iterations = 5
> Converged reason is 4
>
> This time the convergence is very fast, and it is quadratic  
> from iteration 0 to iteration 4.
>
> The solution vector x from this run is the same as in the run without  
> the option "-snes_mf_operator". Any hint as to why the convergence is  
> now quadratic throughout?
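
One way to put a number on "quadratic" from -snes_monitor output is to look at successive residual ratios and the observed order p = log(r[k+1]/r[k]) / log(r[k]/r[k-1]). A minimal sketch in Python, using the norms pasted in this thread; the helper names are made up for illustration, and residual norms are only a proxy for the error near the solution.

```python
import math


def contraction_ratios(norms):
    """Ratio r[k+1] / r[k] between successive residual norms."""
    return [b / a for a, b in zip(norms, norms[1:])]


def observed_orders(norms):
    """Estimate the order p from log(r[k+1]/r[k]) / log(r[k]/r[k-1])."""
    ratios = contraction_ratios(norms)
    return [math.log(q1) / math.log(q0) for q0, q1 in zip(ratios, ratios[1:])]


# Last grid level of the -snes_mf_operator run quoted above.
mf_run = [3.420486720202e+09, 1.426355492941e+08, 1.429830907344e+06,
          1.755275678702e+03, 1.216161531652e-03, 4.309776935609e-06]

# Iterations 4 to 10 of the original 19-iteration run from the first message.
original_tail = [1.655878303433e+03, 3.746498305706e+02, 8.317435704773e+01,
                 1.857639969641e+01, 4.149691057773e+00, 9.265604042412e-01,
                 2.069527103214e-01]

for name, norms in (("mf_operator", mf_run), ("original", original_tail)):
    print(name)
    print("  ratios:", ["%.3g" % q for q in contraction_ratios(norms)])
    print("  orders:", ["%.2f" % p for p in observed_orders(norms)])
```

A nearly constant ratio with p close to 1 means linear convergence, which is what an inexact Jacobian typically gives. Ratios that keep shrinking, with p well above 1, indicate superlinear convergence. The last step of a run often looks worse than it is because the residual is already at the level of round-off.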
>
> Thank you very much,
>
> Yan
>
>
> On Sun, Jan 10, 2010 at 7:25 PM, Barry Smith <bsmith at mcs.anl.gov>  
> wrote:
>
> On Jan 10, 2010, at 6:09 PM, Ryan Yan wrote:
>
> Hi Barry,
>
> I ran the code on another machine.
>
> yy2250 at sci-m8n6 ~/test/1d $ srun -p sci-comp -N 2 -n 2 ./HeatProfile1D -snes_type test > out.snes_test
> [0]PETSC ERROR: SNESSolve() line 2221 in src/snes/interface/snes.c
> [1]PETSC ERROR: SNESSolve() line 2221 in src/snes/interface/snes.c
> [0]PETSC ERROR: DMMGSolveSNES() line 510 in src/snes/utils/damgsnes.c
> [0]PETSC ERROR: DMMGSolve() line 372 in src/snes/utils/damg.c
> [0]PETSC ERROR: main() line 270 in /home/vyan2000/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/NCprojectHeatProfile1D.c
> application called MPI_Abort(MPI_COMM_WORLD, 73) - process 0
> [1]PETSC ERROR: DMMGSolveSNES() line 510 in src/snes/utils/damgsnes.c
> [1]PETSC ERROR: DMMGSolve() line 372 in src/snes/utils/damg.c
> [1]PETSC ERROR: main() line 270 in /home/vyan2000/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/NCprojectHeatProfile1D.c
> In: PMI_Abort(73, application called MPI_Abort(MPI_COMM_WORLD, 73) - process 0)
> application called MPI_Abort(MPI_COMM_WORLD, 73) - process 1
> In: PMI_Abort(73, application called MPI_Abort(MPI_COMM_WORLD, 73) - process 1)
> srun: error: task 1: Exited with exit code 73
> srun: error: task 0: Exited with exit code 73
>
> I was using three grid levels. The error occurs during the SNES  
> solve on the finest grid. The output in the file out.snes_test  
> is:
> yy2250 at sci-m8n6 ~/test/1d $ cat out.snes_test
> Testing hand-coded Jacobian, if the ratio is
> O(1.e-8), the hand-coded Jacobian is probably correct.
> Run with -snes_test_display to show difference
> of hand-coded and finite difference Jacobian.
> Norm of matrix ratio 1.21314e-10 difference 1.41447
> Norm of matrix ratio 4.30115e-06 difference 1.41427
> Norm of matrix ratio 4.30108e-06 difference 1.41427
>
>
>  Ok, Jacobians being used are probably ok.
>
>
>
> For -snes_mf_operator,
> yy2250 at sci-m8n6 ~/test/1d $ srun -p sci-comp -N 2 -n 2 ./HeatProfile1D -snes_mf_operator
> the program is still running. I suspect that it has hung.
>
>  It's not hanging. It is just running. Use -snes_monitor -ksp_rtol  
> 1.e-12 -ksp_converged_reason -snes_converged_reason and run again
>
>  Barry
>
>
>
> Is there any useful information in the file "out.snes_test"?
>
> BTW, I did not hand-code any Jacobian, since I passed 0 for the  
> Jacobian evaluation subroutine in the DMMGSetSNESLocal().
>
> Thank you very much,
>
> Yan
>
>
>
> On Sun, Jan 10, 2010 at 6:11 PM, Barry Smith <bsmith at mcs.anl.gov>  
> wrote:
>
>  That message ain't from PETSc. Something is likely killing the job.
>
>  Barry
>
>
> On Jan 10, 2010, at 5:09 PM, Ryan Yan wrote:
>
> Hi Barry,
> I got the following result:
>
> vyan2000 at vyan2000-linux ~/NCproject/general $ mpirun -np 1 ./HeatProfile1D -snes_mf_operator
> Alarm clock
> vyan2000 at vyan2000-linux ~/NCproject/general $ mpirun -np 1 ./HeatProfile1D -snes_type test
> Alarm clock
>
> Are these normal responses, and what do they indicate?
>
> Thanks a lot,
>
> Yan
>
>
> On Sun, Jan 10, 2010 at 5:57 PM, Barry Smith <bsmith at mcs.anl.gov>  
> wrote:
>
>  Run with JUST -snes_mf_operator, and then separately with JUST  
> -snes_type test, NOT both together.
>
>  Barry
>
>
> On Jan 10, 2010, at 4:42 PM, Ryan Yan wrote:
>
> Hi Barry,
> Please see reply below,
>
> On Sun, Jan 10, 2010 at 4:57 PM, Barry Smith <bsmith at mcs.anl.gov>  
> wrote:
>
> On Jan 10, 2010, at 3:52 PM, Ryan Yan wrote:
>
> Hi Barry,
> Yes, exactly. The original multi-component system is scaled quite  
> unevenly. I will try to rescale it.
> Could rescaling help recover quadratic convergence?
>
>  I wouldn't be concerned about "quadratic convergence"; I'd only be  
> concerned that it is converging to the correct answer and that you  
> are getting close enough to the correct answer.
>
> Yes, I agree.
>
>  You can run with -snes_mf_operator and -snes_type test to verify if  
> the Jacobian being computed is accurate. Perhaps in your function  
> evaluation you are not using the stencil that you set with the DA;  
> this would cause the wrong Jacobian to be computed.
>
>
> After passing in -snes_mf_operator and -snes_type test as follows:
> vyan2000 at vyan2000-linux ~/NCproject/general : mpirun -np 2 ./HeatProfile1D -snes_mf_operator -snes_type test
> I got errors, as expected:
>
> [0]PETSC ERROR: --------------------- Error Message ------------------------------------
> [0]PETSC ERROR: Invalid argument!
> [0]PETSC ERROR: Cannot test with alternative preconditioner!
> [0]PETSC ERROR: ------------------------------------------------------------------------
> [0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 5, Mon Apr 13 09:15:37 CDT 2009
> [0]PETSC ERROR: See docs/changes/index.html for recent updates.
> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
> [0]PETSC ERROR: See docs/index.html for manual pages.
> [0]PETSC ERROR: ------------------------------------------------------------------------
> [0]PETSC ERROR: ./HeatProfile1D on a O-hypre-p named vyan2000-linux by vyan2000 Sun Jan 10 17:16:00 2010
> [0]PETSC ERROR: Libraries linked from /home/vyan2000/local/PPETSc/petsc-3.0.0-p5/O-hypre-prometheus/lib
> [0]PETSC ERROR: Configure run at Thu Jun 25 13:49:36 2009
> [0]PETSC ERROR: Configure options --download-mpich=1 --with-debugger=gdb --download-hypre=1 --download-parmetis=1 --download-prometheus=1 --with-shared=0
> [0]PETSC ERROR: ------------------------------------------------------------------------
> [0]PETSC ERROR: SNESSolve_Test() line 28 in src/snes/impls/test/snestest.c
> [0]PETSC ERROR: SNESSolve() line 2221 in src/snes/interface/snes.c
> [0]PETSC ERROR: DMMGSolveSNES() line 510 in src/snes/utils/damgsnes.c
> [0]PETSC ERROR: DMMGSolve() line 372 in src/snes/utils/damg.c
> [0]PETSC ERROR: main() line 270 in /home/vyan2000/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/NCprojectHeatProfile1D.c
> application called MPI_Abort(MPI_COMM_WORLD, 62) - process 0[cli_0]: aborting job:
> application called MPI_Abort(MPI_COMM_WORLD, 62) - process 0
>
> The error makes sense, since I did not pass in the analytical  
> Jacobian for the preconditioner matrix. Instead, I was using  
> DMMGSetSNESLocal(dmmg,FormFunctionLocal,0,ad_FormFunctionLocal,admf_FormFunctionLocal).  
> I am going to change the code a bit, pass in the analytical Jacobian,  
> and do the -snes_type test.
>
> I will bear your suggestion in mind during the test.
>
> Thanks a lot,
>
> Yan
>
>
>
>  Barry
>
>
>
> Thanks a lot,
>
> Yan
>
> On Sun, Jan 10, 2010 at 4:35 PM, Barry Smith <bsmith at mcs.anl.gov>  
> wrote:
>
>  You already got a 10^16 drop in the residual norm. It is not  
> realistic to expect to get much more than that for double precision  
> calculations. Perhaps your original F() has some funky scaling of  
> different components that you can fix.
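
For reference, the relative reduction in that log can be compared against double-precision machine epsilon with a few lines of Python. The two norms are iterations 0 and 19 from the -snes_monitor output in the original message; the variable names are only for illustration.

```python
import sys

initial_norm = 2.640163923729e+09  # iteration 0 in the log
final_norm = 1.054812641176e-05    # iteration 19 in the log

# How far the residual norm dropped, relative to where it started.
relative_reduction = final_norm / initial_norm

# Spacing of doubles near 1.0; relative round-off is of this order.
eps = sys.float_info.epsilon

# How many multiples of eps the final relative residual represents.
headroom = relative_reduction / eps

print("relative reduction: %.3e" % relative_reduction)
print("machine epsilon:    %.3e" % eps)
print("reduction / eps:    %.1f" % headroom)
```

The final relative residual is only a modest multiple of epsilon, so further iterations mostly push round-off around rather than improve the answer.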
>
>
>
>  Barry
>
>
> On Jan 10, 2010, at 2:55 PM, Ryan Yan wrote:
>
> Hi All,
> I am solving a nonlinear system using snes. The -snes_monitor option  
> has the following output:
>
>  0 SNES Function norm 2.640163923729e+09
>  1 SNES Function norm 1.047643565314e+08
>  2 SNES Function norm 1.712732074788e+06
>  3 SNES Function norm 1.002169173269e+04
>  4 SNES Function norm 1.655878303433e+03
>  5 SNES Function norm 3.746498305706e+02
>  6 SNES Function norm 8.317435704773e+01
>  7 SNES Function norm 1.857639969641e+01
>  8 SNES Function norm 4.149691057773e+00
>  9 SNES Function norm 9.265604042412e-01
>  10 SNES Function norm 2.069527103214e-01
>  11 SNES Function norm 4.624186491082e-02
>  12 SNES Function norm 1.035558432688e-02
>  13 SNES Function norm 2.341362958811e-03
>  14 SNES Function norm 5.507445427277e-04
>  15 SNES Function norm 1.485123568354e-04
>  16 SNES Function norm 5.180043781814e-05
>  17 SNES Function norm 2.341966514486e-05
>  18 SNES Function norm 1.344936158651e-05
>  19 SNES Function norm 1.054812641176e-05
> Number of Newton iterations = 19
> Converged reason is 4
>
> It looks like the iterate never falls into a quadratic convergence  
> region before it converges. Is there any hint to understand this  
> behavior?
>
> Thanks a lot,
>
> Yan