[petsc-users] Newton LS - better results on single processor

zakaryah . zakaryah at gmail.com
Fri Nov 10 16:18:10 CST 2017


Thanks for the advice.

I put a VecView() on the solution vector, both before and after all the
solves and in the SNES function and Jacobian routines, added a MatView() in
the Jacobian routine and a VecView() on the residual after the solves.  Then
I ran a tiny problem with -pc_type redundant -redundant_pc_type lu, as
Stefano suggested, and compared the vectors and matrices between -n 1 and -n 2.
Although the monitor shows somewhat different residuals for the KSP and for
the SNES, the differences are very small.  For example, after the first
SNESSolve(), the SNES residual is 6.5e-18 with -n 1 and 6.0e-18 with -n 2,
and of course I don't care about that tiny difference unless it indicates
some mistake in my code.  The VecView() and MatView() show that the state
vector, the function vector, and the function Jacobian are identical (up to
the default output precision of the view routines).  The residuals are of
course slightly different.
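Since differences can hide below the viewers' default output precision, it can
help to compare the ASCII dumps numerically rather than by eye.  A minimal
plain-C sketch (not PETSc code; it assumes the dumps have been reduced to
whitespace-separated values, and skips any token that does not parse entirely
as a number):

```c
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

/* Read up to nmax whitespace-separated tokens from f, keeping only those
 * that parse completely as doubles (other text is skipped).
 * Returns the number of values stored in vals. */
static int read_values(FILE *f, double *vals, int nmax)
{
    char tok[128];
    int  n = 0;
    while (n < nmax && fscanf(f, "%127s", tok) == 1) {
        char  *end;
        double v = strtod(tok, &end);
        if (end != tok && *end == '\0') vals[n++] = v;
    }
    return n;
}

/* Largest absolute entrywise difference between two equal-length dumps. */
static double max_abs_diff(const double *a, const double *b, int n)
{
    double m = 0.0;
    for (int i = 0; i < n; i++) {
        double d = fabs(a[i] - b[i]);
        if (d > m) m = d;
    }
    return m;
}
```

Running this on the -n 1 and -n 2 dumps of the same vector, with the viewer
precision raised (e.g. by pushing PETSC_VIEWER_ASCII_MATLAB format), would show
whether the inputs are actually close to machine precision or only close at
the default output precision.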

For this problem it took 8 Riks iterations for the loading coefficient to
reach 1 (i.e. to finish the iterations).  For the last solve, the residuals
and their differences were larger: 8.4e-15 with -n 1 and 8.7e-15 with -n
2.  I think this supports my hypothesis that the iterations which feed one
SNESSolve() solution into the initial guess for the next solve can amplify
small differences.

To check a bit deeper, I removed all of the view calls except to the SNES
residual, and ran on a more realistic problem size, with the SNES defaults
(and KSP defaults, PC defaults, etc).

I will call the residual after the first SNESSolve Ri, and the residual
after the last SNESSolve Rf.  With -n 1, Ri and Rf are both spatially
smooth (as I expect).  I think the standard deviation (over space) of the
residual is a useful way to quantify its amplitude; for -n 1, sd(Ri) =
5.1e-13 and sd(Rf) = 6.8e-12.  With -n 2, both Ri and Rf have a
discontinuity in x, at x=0, x=20, x=21, and x=41 (i.e. at the global
boundary and at the boundary between the two processes).  For -n 2, sd(Ri)
= 1.2e-12 and sd(Rf) = 5.7e-12, with all of the additional fluctuations
coming from those boundary coordinates.  In other words, the fluctuations
of the residual are an order of magnitude larger at the boundary between
processes.

If I consider the residuals as a function of the other dimensions (y or z
instead of x), whose entire range is owned by each process, I don't see any
discontinuity.

I suppose that all of this has something to do with the spectra of the
matrices involved in the solve, but I don't know enough to improve the
results I'm obtaining.

On Thu, Nov 9, 2017 at 11:09 PM, Smith, Barry F. <bsmith at mcs.anl.gov> wrote:

>
>
> > On Nov 9, 2017, at 3:33 PM, zakaryah . <zakaryah at gmail.com> wrote:
> >
> > Hi Stefano - when I referred to the iterations, I was trying to point
> out that my method solves a series of nonlinear systems, with the solution
> to the first problem being used to initialize the state vector for the
> second problem, etc.  The reason I mentioned that was that I thought perhaps I
> could expect the residuals from a single-process solve to differ from the
> residuals from a multiprocess solve by a very small amount, say machine
> precision or the tolerance of the KSP/SNES; normally that would be fine.
> But, if there is a possibility that those differences are somehow amplified
> by each of the iterations (solution->initial state), that could explain
> what I see.
> >
> > I agree that it is more likely that I have a bug in my code but I'm
> having trouble finding it.
>
>   Run a tiny problem on one and two processes with an LU linear solver and
> the same mesh, so that in the first case all values live on the one process,
> and in the second case the first half live on one process and the second
> half on the other.
>
>   Now track the values in the actual vectors and matrices. For example, you
> can put VecView() and MatView() calls on all objects you pass into the
> solver, and also in the SNESComputeFunction/Jacobian routines.
> Print both the vectors passed into these routines and the vectors/matrices
> created in the routines. The output differences between the two runs should be
> small; determine where they vary significantly. This will tell you the
> likely location of the bug in your source code (for example, if certain
> values of the Jacobian differ).
>
>   Good luck, I've done this plenty of times, and if it is a
> "parallelization" bug this will help you find it much faster than guessing
> where the problem is and trying to find the bug by code inspection.
>
>   Barry
>
> >
> > I ran a small problem with -pc_type redundant -redundant_pc_type lu, as
> you suggested.  What I think is the relevant portion of the output is here
> (i.e. there are small differences in the KSP residuals and SNES residuals):
> >
> > -n 1, first "iteration" as described above:
> >
> >   0 SNES Function norm 6.053565720454e-02
> >     0 KSP Residual norm 4.883115701982e-05
> >
> >     0 KSP preconditioned resid norm 4.883115701982e-05 true resid norm
> 6.053565720454e-02 ||r(i)||/||b|| 1.000000000000e+00
> >
> >     1 KSP Residual norm 8.173640409069e-20
> >
> >     1 KSP preconditioned resid norm 8.173640409069e-20 true resid norm
> 1.742143029296e-16 ||r(i)||/||b|| 2.877879104227e-15
> >
> >   Linear solve converged due to CONVERGED_RTOL iterations 1
> >
> >       Line search: Using full step: fnorm 6.053565720454e-02 gnorm
> 2.735518570862e-07
> >
> >   1 SNES Function norm 2.735518570862e-07
> >
> >     0 KSP Residual norm 1.298536630766e-10
> >
> >     0 KSP preconditioned resid norm 1.298536630766e-10 true resid norm
> 2.735518570862e-07 ||r(i)||/||b|| 1.000000000000e+00
> >
> >     1 KSP Residual norm 2.152782096751e-25
> >
> >     1 KSP preconditioned resid norm 2.152782096751e-25 true resid norm
> 4.755555202641e-22 ||r(i)||/||b|| 1.738447420279e-15
> >
> >   Linear solve converged due to CONVERGED_RTOL iterations 1
> >
> >       Line search: Using full step: fnorm 2.735518570862e-07 gnorm
> 1.917989238989e-17
> >
> >   2 SNES Function norm 1.917989238989e-17
> >
> > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2
> >
> >
> >
> > -n 2, first "iteration" as described above:
> >
> >   0 SNES Function norm 6.053565720454e-02
> >
> >     0 KSP Residual norm 4.883115701982e-05
> >
> >     0 KSP preconditioned resid norm 4.883115701982e-05 true resid norm
> 6.053565720454e-02 ||r(i)||/||b|| 1.000000000000e+00
> >
> >     1 KSP Residual norm 1.007084240718e-19
> >
> >     1 KSP preconditioned resid norm 1.007084240718e-19 true resid norm
> 1.868472589717e-16 ||r(i)||/||b|| 3.086565300520e-15
> >
> >   Linear solve converged due to CONVERGED_RTOL iterations 1
> >
> >       Line search: Using full step: fnorm 6.053565720454e-02 gnorm
> 2.735518570379e-07
> >
> >   1 SNES Function norm 2.735518570379e-07
> >
> >     0 KSP Residual norm 1.298536630342e-10
> >
> >     0 KSP preconditioned resid norm 1.298536630342e-10 true resid norm
> 2.735518570379e-07 ||r(i)||/||b|| 1.000000000000e+00
> >
> >     1 KSP Residual norm 1.885083482938e-25
> >
> >     1 KSP preconditioned resid norm 1.885083482938e-25 true resid norm
> 4.735707460766e-22 ||r(i)||/||b|| 1.731191852267e-15
> >
> >   Linear solve converged due to CONVERGED_RTOL iterations 1
> >
> >       Line search: Using full step: fnorm 2.735518570379e-07 gnorm
> 1.851472273258e-17
> >
> >
> >   2 SNES Function norm 1.851472273258e-17
> >
> >
> > -n 1, final "iteration":
> >   0 SNES Function norm 9.695669610792e+01
> >
> >     0 KSP Residual norm 7.898912593878e-03
> >
> >     0 KSP preconditioned resid norm 7.898912593878e-03 true resid norm
> 9.695669610792e+01 ||r(i)||/||b|| 1.000000000000e+00
> >
> >     1 KSP Residual norm 1.720960785852e-17
> >
> >     1 KSP preconditioned resid norm 1.720960785852e-17 true resid norm
> 1.237111121391e-13 ||r(i)||/||b|| 1.275941911237e-15
> >
> >   Linear solve converged due to CONVERGED_RTOL iterations 1
> >
> >       Line search: Using full step: fnorm 9.695669610792e+01 gnorm
> 1.026572731653e-01
> >
> >   1 SNES Function norm 1.026572731653e-01
> >
> >     0 KSP Residual norm 1.382450412926e-04
> >
> >     0 KSP preconditioned resid norm 1.382450412926e-04 true resid norm
> 1.026572731653e-01 ||r(i)||/||b|| 1.000000000000e+00
> >
> >     1 KSP Residual norm 5.018078565710e-20
> >
> >     1 KSP preconditioned resid norm 5.018078565710e-20 true resid norm
> 9.031463071676e-17 ||r(i)||/||b|| 8.797684560673e-16
> >
> >   Linear solve converged due to CONVERGED_RTOL iterations 1
> >
> >       Line search: Using full step: fnorm 1.026572731653e-01 gnorm
> 7.982937980399e-06
> >
> >   2 SNES Function norm 7.982937980399e-06
> >
> >     0 KSP Residual norm 4.223898196692e-08
> >
> >     0 KSP preconditioned resid norm 4.223898196692e-08 true resid norm
> 7.982937980399e-06 ||r(i)||/||b|| 1.000000000000e+00
> >
> >     1 KSP Residual norm 1.038123933240e-22
> >
> >     1 KSP preconditioned resid norm 1.038123933240e-22 true resid norm
> 3.213931469966e-20 ||r(i)||/||b|| 4.026000800530e-15
> >
> >   Linear solve converged due to CONVERGED_RTOL iterations 1
> >
> >       Line search: Using full step: fnorm 7.982937980399e-06 gnorm
> 9.776066323463e-13
> >
> >   3 SNES Function norm 9.776066323463e-13
> >
> > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 3
> >
> > -n 2, final "iteration":
> >
> >   0 SNES Function norm 9.695669610792e+01
> >
> >     0 KSP Residual norm 7.898912593878e-03
> >
> >     0 KSP preconditioned resid norm 7.898912593878e-03 true resid norm
> 9.695669610792e+01 ||r(i)||/||b|| 1.000000000000e+00
> >
> >     1 KSP Residual norm 1.752819851736e-17
> >
> >     1 KSP preconditioned resid norm 1.752819851736e-17 true resid norm
> 1.017605437996e-13 ||r(i)||/||b|| 1.049546322064e-15
> >
> >   Linear solve converged due to CONVERGED_RTOL iterations 1
> >
> >       Line search: Using full step: fnorm 9.695669610792e+01 gnorm
> 1.026572731655e-01
> >
> >   1 SNES Function norm 1.026572731655e-01
> >
> >     0 KSP Residual norm 1.382450412926e-04
> >
> >     0 KSP preconditioned resid norm 1.382450412926e-04 true resid norm
> 1.026572731655e-01 ||r(i)||/||b|| 1.000000000000e+00
> >
> >     1 KSP Residual norm 1.701690118486e-19
> >
> >     1 KSP preconditioned resid norm 1.701690118486e-19 true resid norm
> 9.077679331860e-17 ||r(i)||/||b|| 8.842704517606e-16
> >
> >   Linear solve converged due to CONVERGED_RTOL iterations 1
> >
> >       Line search: Using full step: fnorm 1.026572731655e-01 gnorm
> 7.982937883350e-06
> >
> >   2 SNES Function norm 7.982937883350e-06
> >
> >     0 KSP Residual norm 4.223898196594e-08
> >
> >     0 KSP preconditioned resid norm 4.223898196594e-08 true resid norm
> 7.982937883350e-06 ||r(i)||/||b|| 1.000000000000e+00
> >
> >     1 KSP Residual norm 1.471638984554e-23
> >
> >     1 KSP preconditioned resid norm 1.471638984554e-23 true resid norm
> 2.483672977401e-20 ||r(i)||/||b|| 3.111226735938e-15
> >
> >   Linear solve converged due to CONVERGED_RTOL iterations 1
> >
> >       Line search: Using full step: fnorm 7.982937883350e-06 gnorm
> 1.019121417798e-12
> >
> >   3 SNES Function norm 1.019121417798e-12
> >
> >
> > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 3
> >
> >
> >
> > Of course these differences are still very small, but this is only true
> for such a small problem size.  For a regular sized problem, the
> differences at the final iteration can exceed 1 and even 100 at a
> particular grid point (i.e. in a sense that doesn't scale with problem
> size).
> >
> > I also compared -n 1 and -n 2 with the -snes_monitor_solution
> -ksp_view_rhs -ksp_view_mat -ksp_view_solution options on a tiny problem
> (5x5x5), and I was not able to find any differences in the Jacobian or the
> vectors, but I'm suspicious that this could be due to the output format,
> because even for the tiny problem there are non-trivial differences in the
> residuals of both the SNES and the KSP.
> >
> > In all cases, the differences in the residuals are localized to the
> boundary between parts of the displacement vector owned by the two
> processes.  The SNES residual with -n 2 typically looks discontinuous
> across that boundary.
> >
> >
> > On Thu, Nov 9, 2017 at 11:16 AM, zakaryah . <zakaryah at gmail.com> wrote:
> > Thanks Stefano, I will try what you suggest.
> >
> > Matt - my DM is a composite between the redundant field (loading
> coefficient, which is included in the Newton solve in Riks' method) and the
> displacements, which are represented by a 3D DA with 3 dof.  I am using
> finite difference.
> >
> > Probably my problem comes from confusion over how the composite DM is
> organized.  I am using FormFunction(), and within that I call
> DMCompositeGetLocalVectors(), DMCompositeScatter(), DMDAVecGetArray(), and,
> for the Jacobian, DMCompositeGetLocalISs() and MatGetLocalSubmatrix() to
> split J into Jbb, Jbh, Jhb, and Jhh (where b is the loading coefficient
> and h is the displacements).  The values of each submatrix are set using
> MatSetValuesLocal().
> >
> > I'm most suspicious of the part of the Jacobian routine where I
> calculate the rows of Jhb, the columns of Jbh, and the corresponding
> values.  I take the DA coordinates ix, iy, iz, then calculate the row of
> Jhb as ((((iz-info->gzs)*info->gym + (iy-info->gys))*info->gxm +
> (ix-info->gxs))*info->dof+c), where info is the DA local info and c is the
> degree of freedom.  The same calculation is performed for the column of
> Jbh.  I suspect that the indexing of the DA vector is not so simple, but I
> don't know for a fact that I'm doing it incorrectly, nor how to do it
> properly.
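For what it's worth, the quoted expression enumerates the ghosted local patch
in the natural x-fastest ordering, and it can be checked in isolation.  Below
is a sketch with a hypothetical stand-in for the DMDALocalInfo fields (the
real struct is declared in petscdmda.h).  One thing worth verifying: for a
DMComposite, the local numbering seen by MatSetValuesLocal() concatenates the
parts, so the DA block may need an offset for the redundant field; a missing
offset would be exactly the kind of mismatch that shows up only in parallel.

```c
/* Hypothetical stand-in for the DMDALocalInfo fields used here
 * (the real struct is declared in petscdmda.h). */
typedef struct {
    int gxs, gys, gzs;  /* first ghosted grid index owned locally, per axis */
    int gxm, gym, gzm;  /* ghosted patch widths, per axis */
    int dof;            /* degrees of freedom per grid point */
} LocalInfo;

/* Local (ghosted, x-fastest) index of dof c at grid point (ix, iy, iz);
 * this is exactly the expression quoted in the message above. */
static int local_index(const LocalInfo *info, int ix, int iy, int iz, int c)
{
    return ((((iz - info->gzs) * info->gym + (iy - info->gys)) * info->gxm +
             (ix - info->gxs)) * info->dof + c);
}
```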
> >
> > Thanks for all the help!
> >
> >
> > On Nov 9, 2017 8:44 AM, "Matthew Knepley" <knepley at gmail.com> wrote:
> > On Thu, Nov 9, 2017 at 12:14 AM, zakaryah . <zakaryah at gmail.com> wrote:
> > Well the saga of my problem continues.  As I described previously in an
> epic thread, I'm using the SNES to solve problems involving an elastic
> material on a rectangular grid, subjected to external forces.  In any case,
> I'm occasionally getting poor convergence using Newton's method with line
> search.  In troubleshooting by visualizing the residual, I saw that in data
> sets which had good convergence, the residual was nevertheless
> significantly larger along the boundary between different processors.
> Likewise, in data sets with poor convergence, the residual became very
> large on the boundary between different processors.  The residual is not
> significantly larger on the physical boundary, i.e. the global boundary.
> When I run on a single process, convergence seems to be good on all data
> sets.
> >
> > Any clues to fix this?
> >
> > It sounds like something is wrong with communication across domains:
> >
> >  - If this is FEM, it sounds like you are not adding contributions from
> the other domain to shared vertices/edges/faces
> >
> >  - If this is FDM/FVM, maybe the ghosts are not updated
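As a toy illustration of the second bullet (not PETSc code; in PETSc this
exchange is what DMGlobalToLocalBegin/End performs), here is a two-subdomain
1D halo update.  If the exchange is skipped, any stencil that straddles the
partition boundary reads stale ghost values, which would produce exactly the
kind of residual jump at the inter-process boundary described earlier in the
thread:

```c
enum { N = 4 }; /* interior cells per subdomain */

/* Each subdomain stores one ghost cell at each end: indices 1..N are
 * interior, 0 and N+1 are ghosts.  The exchange copies each neighbour's
 * adjacent interior value into the facing ghost slot. */
static void halo_exchange(double left[N + 2], double right[N + 2])
{
    right[0]    = left[N];   /* right's left ghost  <- left's last interior  */
    left[N + 1] = right[1];  /* left's right ghost  <- right's first interior */
}
```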
> >
> > What DM are you using? Are you using the Local assembly functions
> (FormFunctionLocal), or just FormFunction()?
> >
> >   Thanks,
> >
> >      Matt
> >
> > --
> > What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> > -- Norbert Wiener
> >
> > https://www.cse.buffalo.edu/~knepley/
> >
> >
> >
>
>

