[petsc-users] Newton LS - better results on single processor
Smith, Barry F.
bsmith at mcs.anl.gov
Thu Nov 9 22:09:10 CST 2017
> On Nov 9, 2017, at 3:33 PM, zakaryah . <zakaryah at gmail.com> wrote:
>
> Hi Stefano - when I referred to the iterations, I was trying to point out that my method solves a series of nonlinear systems, with the solution to the first problem used to initialize the state vector for the second problem, and so on. I mentioned that because residuals from a single-process solve that differ from the multiprocess residuals by a very small amount, say machine precision or the KSP/SNES tolerance, would normally be fine. But if there is a possibility that those differences are somehow amplified by each of the iterations (solution -> initial state), that could explain what I see.
>
> I agree that it is more likely that I have a bug in my code but I'm having trouble finding it.
Run a tiny problem on one and two processes with an LU linear solver and the same mesh, so that in the first case all values live on the first process, while in the second case the first half of the values lives on one process and the second half on the other.
Now track the values in the actual vectors and matrices. For example, you can put VecView() and MatView() calls on all objects you pass into the solver, and also inside the SNES function/Jacobian routines: print both the vectors input to those routines and the vectors/matrices created in them. The differences between the outputs of the two runs should be small; determine where they start to vary significantly. That will tell you the likely location of the bug in your source code (for example, if certain values of the Jacobian differ).
Good luck. I've done this plenty of times, and if it is a "parallelization" bug this will help you find it much faster than guessing where the problem is and trying to find the bug by code inspection.
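As a minimal sketch of what "put VecView() in the routines" looks like: the function name and user context below are whatever the application registered with SNESSetFunction(); only the VecView() calls and the stdout viewer are the standard PETSc API.

```c
#include <petscsnes.h>

/* Sketch: dump the input state and the computed residual inside the
 * user function routine, so the 1- and 2-process runs can be diffed
 * line by line.  FormFunction and ctx are illustrative names. */
PetscErrorCode FormFunction(SNES snes, Vec x, Vec f, void *ctx)
{
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  /* ... assemble f from x as usual ... */

  /* Print the input vector and the residual on all processes. */
  ierr = VecView(x, PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);
  ierr = VecView(f, PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}
```

The analogous MatView() call in the Jacobian routine shows whether any assembled entries differ between the runs.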
Barry
>
> I ran a small problem with -pc_type redundant -redundant_pc_type lu, as you suggested. What I think is the relevant portion of the output is here (i.e. there are small differences in the KSP residuals and SNES residuals):
>
> -n 1, first "iteration" as described above:
>
> 0 SNES Function norm 6.053565720454e-02
> 0 KSP Residual norm 4.883115701982e-05
>
> 0 KSP preconditioned resid norm 4.883115701982e-05 true resid norm 6.053565720454e-02 ||r(i)||/||b|| 1.000000000000e+00
>
> 1 KSP Residual norm 8.173640409069e-20
>
> 1 KSP preconditioned resid norm 8.173640409069e-20 true resid norm 1.742143029296e-16 ||r(i)||/||b|| 2.877879104227e-15
>
> Linear solve converged due to CONVERGED_RTOL iterations 1
>
> Line search: Using full step: fnorm 6.053565720454e-02 gnorm 2.735518570862e-07
>
> 1 SNES Function norm 2.735518570862e-07
>
> 0 KSP Residual norm 1.298536630766e-10
>
> 0 KSP preconditioned resid norm 1.298536630766e-10 true resid norm 2.735518570862e-07 ||r(i)||/||b|| 1.000000000000e+00
>
> 1 KSP Residual norm 2.152782096751e-25
>
> 1 KSP preconditioned resid norm 2.152782096751e-25 true resid norm 4.755555202641e-22 ||r(i)||/||b|| 1.738447420279e-15
>
> Linear solve converged due to CONVERGED_RTOL iterations 1
>
> Line search: Using full step: fnorm 2.735518570862e-07 gnorm 1.917989238989e-17
>
> 2 SNES Function norm 1.917989238989e-17
>
> Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2
>
>
>
> -n 2, first "iteration" as described above:
>
> 0 SNES Function norm 6.053565720454e-02
>
> 0 KSP Residual norm 4.883115701982e-05
>
> 0 KSP preconditioned resid norm 4.883115701982e-05 true resid norm 6.053565720454e-02 ||r(i)||/||b|| 1.000000000000e+00
>
> 1 KSP Residual norm 1.007084240718e-19
>
> 1 KSP preconditioned resid norm 1.007084240718e-19 true resid norm 1.868472589717e-16 ||r(i)||/||b|| 3.086565300520e-15
>
> Linear solve converged due to CONVERGED_RTOL iterations 1
>
> Line search: Using full step: fnorm 6.053565720454e-02 gnorm 2.735518570379e-07
>
> 1 SNES Function norm 2.735518570379e-07
>
> 0 KSP Residual norm 1.298536630342e-10
>
> 0 KSP preconditioned resid norm 1.298536630342e-10 true resid norm 2.735518570379e-07 ||r(i)||/||b|| 1.000000000000e+00
>
> 1 KSP Residual norm 1.885083482938e-25
>
> 1 KSP preconditioned resid norm 1.885083482938e-25 true resid norm 4.735707460766e-22 ||r(i)||/||b|| 1.731191852267e-15
>
> Linear solve converged due to CONVERGED_RTOL iterations 1
>
> Line search: Using full step: fnorm 2.735518570379e-07 gnorm 1.851472273258e-17
>
>
> 2 SNES Function norm 1.851472273258e-17
>
>
> -n 1, final "iteration":
> 0 SNES Function norm 9.695669610792e+01
>
> 0 KSP Residual norm 7.898912593878e-03
>
> 0 KSP preconditioned resid norm 7.898912593878e-03 true resid norm 9.695669610792e+01 ||r(i)||/||b|| 1.000000000000e+00
>
> 1 KSP Residual norm 1.720960785852e-17
>
> 1 KSP preconditioned resid norm 1.720960785852e-17 true resid norm 1.237111121391e-13 ||r(i)||/||b|| 1.275941911237e-15
>
> Linear solve converged due to CONVERGED_RTOL iterations 1
>
> Line search: Using full step: fnorm 9.695669610792e+01 gnorm 1.026572731653e-01
>
> 1 SNES Function norm 1.026572731653e-01
>
> 0 KSP Residual norm 1.382450412926e-04
>
> 0 KSP preconditioned resid norm 1.382450412926e-04 true resid norm 1.026572731653e-01 ||r(i)||/||b|| 1.000000000000e+00
>
> 1 KSP Residual norm 5.018078565710e-20
>
> 1 KSP preconditioned resid norm 5.018078565710e-20 true resid norm 9.031463071676e-17 ||r(i)||/||b|| 8.797684560673e-16
>
> Linear solve converged due to CONVERGED_RTOL iterations 1
>
> Line search: Using full step: fnorm 1.026572731653e-01 gnorm 7.982937980399e-06
>
> 2 SNES Function norm 7.982937980399e-06
>
> 0 KSP Residual norm 4.223898196692e-08
>
> 0 KSP preconditioned resid norm 4.223898196692e-08 true resid norm 7.982937980399e-06 ||r(i)||/||b|| 1.000000000000e+00
>
> 1 KSP Residual norm 1.038123933240e-22
>
> 1 KSP preconditioned resid norm 1.038123933240e-22 true resid norm 3.213931469966e-20 ||r(i)||/||b|| 4.026000800530e-15
>
> Linear solve converged due to CONVERGED_RTOL iterations 1
>
> Line search: Using full step: fnorm 7.982937980399e-06 gnorm 9.776066323463e-13
>
> 3 SNES Function norm 9.776066323463e-13
>
> Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 3
>
> -n 2, final "iteration":
>
> 0 SNES Function norm 9.695669610792e+01
>
> 0 KSP Residual norm 7.898912593878e-03
>
> 0 KSP preconditioned resid norm 7.898912593878e-03 true resid norm 9.695669610792e+01 ||r(i)||/||b|| 1.000000000000e+00
>
> 1 KSP Residual norm 1.752819851736e-17
>
> 1 KSP preconditioned resid norm 1.752819851736e-17 true resid norm 1.017605437996e-13 ||r(i)||/||b|| 1.049546322064e-15
>
> Linear solve converged due to CONVERGED_RTOL iterations 1
>
> Line search: Using full step: fnorm 9.695669610792e+01 gnorm 1.026572731655e-01
>
> 1 SNES Function norm 1.026572731655e-01
>
> 0 KSP Residual norm 1.382450412926e-04
>
> 0 KSP preconditioned resid norm 1.382450412926e-04 true resid norm 1.026572731655e-01 ||r(i)||/||b|| 1.000000000000e+00
>
> 1 KSP Residual norm 1.701690118486e-19
>
> 1 KSP preconditioned resid norm 1.701690118486e-19 true resid norm 9.077679331860e-17 ||r(i)||/||b|| 8.842704517606e-16
>
> Linear solve converged due to CONVERGED_RTOL iterations 1
>
> Line search: Using full step: fnorm 1.026572731655e-01 gnorm 7.982937883350e-06
>
> 2 SNES Function norm 7.982937883350e-06
>
> 0 KSP Residual norm 4.223898196594e-08
>
> 0 KSP preconditioned resid norm 4.223898196594e-08 true resid norm 7.982937883350e-06 ||r(i)||/||b|| 1.000000000000e+00
>
> 1 KSP Residual norm 1.471638984554e-23
>
> 1 KSP preconditioned resid norm 1.471638984554e-23 true resid norm 2.483672977401e-20 ||r(i)||/||b|| 3.111226735938e-15
>
> Linear solve converged due to CONVERGED_RTOL iterations 1
>
> Line search: Using full step: fnorm 7.982937883350e-06 gnorm 1.019121417798e-12
>
> 3 SNES Function norm 1.019121417798e-12
>
>
> Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 3
>
>
>
> Of course these differences are still very small, but this is only true for such a small problem size. For a regular sized problem, the differences at the final iteration can exceed 1 and even 100 at a particular grid point (i.e. in a sense that doesn't scale with problem size).
>
> I also compared -n 1 and -n 2 with the -snes_monitor_solution -ksp_view_rhs -ksp_view_mat -ksp_view_solution options on a tiny problem (5x5x5), and I was not able to find any differences in the Jacobian or the vectors, but I'm suspicious that this could be due to the output format, because even for the tiny problem there are non-trivial differences in the residuals of both the SNES and the KSP.
>
> In all cases, the differences in the residuals are localized to the boundary between parts of the displacement vector owned by the two processes. The SNES residual with -n 2 typically looks discontinuous across that boundary.
>
>
> On Thu, Nov 9, 2017 at 11:16 AM, zakaryah . <zakaryah at gmail.com> wrote:
> Thanks Stefano, I will try what you suggest.
>
> Matt - my DM is a composite between the redundant field (loading coefficient, which is included in the Newton solve in Riks' method) and the displacements, which are represented by a 3D DA with 3 dof. I am using finite difference.
>
> Probably my problem comes from confusion over how the composite DM is organized. I am using FormFunction(), and within that I call DMCompositeGetLocalVectors(), DMCompositeScatter(), DMDAVecGetArray(), and, for the Jacobian, DMCompositeGetLocalISs() and MatGetLocalSubmatrix() to split J into Jbb, Jbh, Jhb, and Jhh (where b is the loading coefficient and h is the displacements). The values of each submatrix are set using MatSetValuesLocal().
>
> I'm most suspicious of the part of the Jacobian routine where I calculate the rows of Jhb, the columns of Jbh, and the corresponding values. For grid indices ix, iy, iz, I calculate the row of Jhb as ((((iz-info->gzs)*info->gym + (iy-info->gys))*info->gxm + (ix-info->gxs))*info->dof+c), where info is the DA local info and c is the degree of freedom. The same calculation is performed for the column of Jbh. I suspect that the indexing of the DA vector is not so simple, but I don't know for a fact that I'm doing this incorrectly, nor how to do it properly.
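For reference, the index computation quoted above can be isolated in a plain C helper (a sketch; gxs/gys/gzs and gxm/gym correspond to the ghost-corner start and ghosted-extent fields of DMDALocalInfo):

```c
/* Ghosted local index of dof c at grid point (ix, iy, iz) of a 3D DMDA,
 * mirroring ((((iz-gzs)*gym + (iy-gys))*gxm + (ix-gxs))*dof + c.
 * gxs/gys/gzs are the ghost-corner starts and gxm/gym the ghosted
 * extents in x and y, as reported by DMDAGetLocalInfo(). */
int ghosted_local_index(int ix, int iy, int iz, int c,
                        int gxs, int gys, int gzs,
                        int gxm, int gym, int dof)
{
  return (((iz - gzs) * gym + (iy - gys)) * gxm + (ix - gxs)) * dof + c;
}
```

Note that this is the DA's own ghosted-local numbering. When such rows/columns are passed to MatSetValuesLocal() on a submatrix extracted with MatGetLocalSubmatrix(), they are interpreted through the composite DM's local-to-global mapping, so whether an additional offset for the redundant field is needed depends on how the ISs were obtained; that mapping is worth checking in a debugger or with ISView().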
>
> Thanks for all the help!
>
>
> On Nov 9, 2017 8:44 AM, "Matthew Knepley" <knepley at gmail.com> wrote:
> On Thu, Nov 9, 2017 at 12:14 AM, zakaryah . <zakaryah at gmail.com> wrote:
> Well the saga of my problem continues. As I described previously in an epic thread, I'm using the SNES to solve problems involving an elastic material on a rectangular grid, subjected to external forces. In any case, I'm occasionally getting poor convergence using Newton's method with line search. In troubleshooting by visualizing the residual, I saw that in data sets which had good convergence, the residual was nevertheless significantly larger along the boundary between different processors. Likewise, in data sets with poor convergence, the residual became very large on the boundary between different processors. The residual is not significantly larger on the physical boundary, i.e. the global boundary. When I run on a single process, convergence seems to be good on all data sets.
>
> Any clues to fix this?
>
> It sounds like something is wrong with communication across domains:
>
> - If this is FEM, it sounds like you are not adding contributions from the other domain to shared vertices/edges/faces
>
> - If this is FDM/FVM, maybe the ghosts are not updated
>
> What DM are you using? Are you using the Local assembly functions (FormFunctionLocal), or just FormFunction()?
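For the FDM case Matt mentions, the usual pattern for refreshing ghost values before differencing near a process boundary is the global-to-local scatter; a sketch, with illustrative names (DMCompositeScatter() performs the equivalent step for a composite DM):

```c
#include <petscdm.h>

/* Refresh ghost values before a finite-difference residual evaluation,
 * so that stencils near the process boundary see current neighbor
 * values.  da, xGlobal and xLocal are illustrative names; the caller
 * must DMRestoreLocalVector() the local vector when done. */
PetscErrorCode UpdateGhosts(DM da, Vec xGlobal, Vec *xLocal)
{
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = DMGetLocalVector(da, xLocal);CHKERRQ(ierr);
  ierr = DMGlobalToLocalBegin(da, xGlobal, INSERT_VALUES, *xLocal);CHKERRQ(ierr);
  ierr = DMGlobalToLocalEnd(da, xGlobal, INSERT_VALUES, *xLocal);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}
```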
>
> Thanks,
>
> Matt
>
> --
> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
>
>
>