<div dir="ltr">Your subdomain solves do not appear to be producing any descent whatsoever. Possible reasons:<div><br></div><div>  1) Your subdomain Jacobians are wrong (this is usually the problem).</div><div><br></div><div>  2) You have some global coupling field for which local solves give no descent. (For this you want nonlinear elimination, I think.)</div><div><br></div><div>  Thanks,</div><div><br></div><div>     Matt</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, May 12, 2022 at 9:02 AM Takahashi, Tadanaga <<a href="mailto:tt73@njit.edu" target="_blank">tt73@njit.edu</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">I ran the code with the additional options, but the raw output is about 75,000 lines, so I cannot paste it directly into the email. The output is in the attached file. </div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, May 11, 2022 at 11:44 PM Jed Brown <<a href="mailto:jed@jedbrown.org" target="_blank">jed@jedbrown.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Can you add -snes_linesearch_monitor -sub_snes_linesearch_monitor -ksp_converged_reason and send the output?<br>

<br>

"Takahashi, Tadanaga" <<a href="mailto:tt73@njit.edu" target="_blank">tt73@njit.edu</a>> writes:<br>

<br>

> Hello,<br>

><br>

> We are working on a finite difference solver for a 2D nonlinear PDE with<br>

> Dirichlet Boundary conditions on a rectangular domain. Our goal is to solve<br>

> the problem with parallel nonlinear additive Schwarz (NASM) as the outer<br>

> solver. Our code is similar to SNES example 5<br>

> <<a href="https://petsc.org/release/src/snes/tutorials/ex5.c.html" rel="noreferrer" target="_blank">https://petsc.org/release/src/snes/tutorials/ex5.c.html</a>>. In example 5,<br>

> the parallel NASM can be executed with a command like `mpiexec -n 4 ./ex5<br>

> -mms 3 -snes_type nasm -snes_nasm_type restrict -da_overlap 2` which gives<br>

> a convergent result. We assume this is the correct usage. A comment in the<br>

> source code for NASM mentions that NASM should be a preconditioner but<br>

> there's no documentation on the usage. The Brune paper does not cover<br>

> parallel NASM either. We observed that increasing the overlap leads to<br>

> fewer Schwarz iterations. The parallelization works seamlessly for an<br>

> arbitrary number of subdomains. This is the type of behavior we were<br>

> expecting from our code.<br>

><br>

> Our method uses a box-style stencil of width d = ceil(N^(1/3)) on an N by N DMDA.<br>

> The finite difference stencil consists of 4d+1 points spread out in a<br>

> diamond formation. If a stencil point is out of bounds, then it is<br>

> projected onto the boundary curve. Since the nodes on the boundary curve<br>

> would result in an irregular mesh, we chose not to treat boundary nodes as<br>

> unknowns as in Example 5. We use DMDACreate2d to create the DA for the<br>

> interior points and DMDASNESSetFunctionLocal to associate the residual<br>

> function to the SNES object.<br>

><br>

> Our code works serially. We have also tested our code<br>

> with Newton-Krylov-Schwarz (NKS) by running something akin to `mpiexec -n<br>

> <n> ./solve -snes_type newtonls`. We have tested the NKS for several<br>

> subdomain counts and overlap widths, and the code works as expected. We<br>

> have some confidence in the correctness of our code. The overlapping NASM<br>

> was implemented in MATLAB so we know the method converges. However, the<br>

> parallel NASM will not converge with our PETSc code. We don't understand<br>

> why NKS works while NASM does not. The residual norm decreases<br>

> monotonically and then stagnates.<br>

><br>

> Here is an example of the output when attempting to run NASM in parallel:<br>

> takahashi@ubuntu:~/Desktop/MA-DDM/Cpp/Rectangle$ mpiexec -n 4 ./test1 -t1_N<br>

> 20 -snes_max_it 50 -snes_monitor -snes_view -da_overlap 3 -snes_type nasm<br>

> -snes_nasm_type restrict<br>

>   0 SNES Function norm 7.244681057908e+02<br>

>   1 SNES Function norm 1.237688062971e+02<br>

>   2 SNES Function norm 1.068926073552e+02<br>

>   3 SNES Function norm 1.027563237834e+02<br>

>   4 SNES Function norm 1.022184806736e+02<br>

>   5 SNES Function norm 1.020818227640e+02<br>

>   6 SNES Function norm 1.020325629121e+02<br>

>   7 SNES Function norm 1.020149036595e+02<br>

>   8 SNES Function norm 1.020088110545e+02<br>

>   9 SNES Function norm 1.020067198030e+02<br>

>  10 SNES Function norm 1.020060034469e+02<br>

>  11 SNES Function norm 1.020057582380e+02<br>

>  12 SNES Function norm 1.020056743241e+02<br>

>  13 SNES Function norm 1.020056456101e+02<br>

>  14 SNES Function norm 1.020056357849e+02<br>

>  15 SNES Function norm 1.020056324231e+02<br>

>  16 SNES Function norm 1.020056312727e+02<br>

>  17 SNES Function norm 1.020056308791e+02<br>

>  18 SNES Function norm 1.020056307444e+02<br>

>  19 SNES Function norm 1.020056306983e+02<br>

>  20 SNES Function norm 1.020056306826e+02<br>

>  21 SNES Function norm 1.020056306772e+02<br>

>  22 SNES Function norm 1.020056306753e+02<br>

>  23 SNES Function norm 1.020056306747e+02<br>

>  24 SNES Function norm 1.020056306745e+02<br>

>  25 SNES Function norm 1.020056306744e+02<br>

>  26 SNES Function norm 1.020056306744e+02<br>

>  27 SNES Function norm 1.020056306744e+02<br>

>  28 SNES Function norm 1.020056306744e+02<br>

>  29 SNES Function norm 1.020056306744e+02<br>

>  30 SNES Function norm 1.020056306744e+02<br>

>  31 SNES Function norm 1.020056306744e+02<br>

>  32 SNES Function norm 1.020056306744e+02<br>

>  33 SNES Function norm 1.020056306744e+02<br>

>  34 SNES Function norm 1.020056306744e+02<br>

>  35 SNES Function norm 1.020056306744e+02<br>

>  36 SNES Function norm 1.020056306744e+02<br>

>  37 SNES Function norm 1.020056306744e+02<br>

>  38 SNES Function norm 1.020056306744e+02<br>

>  39 SNES Function norm 1.020056306744e+02<br>

>  40 SNES Function norm 1.020056306744e+02<br>

>  41 SNES Function norm 1.020056306744e+02<br>

>  42 SNES Function norm 1.020056306744e+02<br>

>  43 SNES Function norm 1.020056306744e+02<br>

>  44 SNES Function norm 1.020056306744e+02<br>

>  45 SNES Function norm 1.020056306744e+02<br>

>  46 SNES Function norm 1.020056306744e+02<br>

>  47 SNES Function norm 1.020056306744e+02<br>

>  48 SNES Function norm 1.020056306744e+02<br>

>  49 SNES Function norm 1.020056306744e+02<br>

>  50 SNES Function norm 1.020056306744e+02<br>

> SNES Object: 4 MPI processes<br>

>   type: nasm<br>

>     total subdomain blocks = 4<br>

>     Local solver information for first block on rank 0:<br>

>     Use -snes_view ::ascii_info_detail to display information for all blocks<br>

>     SNES Object: (sub_) 1 MPI processes<br>

>       type: newtonls<br>

>       maximum iterations=50, maximum function evaluations=10000<br>

>       tolerances: relative=1e-08, absolute=1e-50, solution=1e-08<br>

>       total number of linear solver iterations=22<br>

>       total number of function evaluations=40<br>

>       norm schedule ALWAYS<br>

>       Jacobian is built using a DMDA local Jacobian<br>

>       SNESLineSearch Object: (sub_) 1 MPI processes<br>

>         type: bt<br>

>           interpolation: cubic<br>

>           alpha=1.000000e-04<br>

>         maxstep=1.000000e+08, minlambda=1.000000e-12<br>

>         tolerances: relative=1.000000e-08, absolute=1.000000e-15,<br>

> lambda=1.000000e-08<br>

>         maximum iterations=40<br>

>       KSP Object: (sub_) 1 MPI processes<br>

>         type: preonly<br>

>         maximum iterations=10000, initial guess is zero<br>

>         tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.<br>

>         left preconditioning<br>

>         using NONE norm type for convergence test<br>

>       PC Object: (sub_) 1 MPI processes<br>

>         type: lu<br>

>           out-of-place factorization<br>

>           tolerance for zero pivot 2.22045e-14<br>

>           matrix ordering: nd<br>

>           factor fill ratio given 5., needed 2.13732<br>

>             Factored matrix follows:<br>

>               Mat Object: 1 MPI processes<br>

>                 type: seqaij<br>

>                 rows=169, cols=169<br>

>                 package used to perform factorization: petsc<br>

>                 total: nonzeros=13339, allocated nonzeros=13339<br>

>                   using I-node routines: found 104 nodes, limit used is 5<br>

>         linear system matrix = precond matrix:<br>

>         Mat Object: 1 MPI processes<br>

>           type: seqaij<br>

>           rows=169, cols=169<br>

>           total: nonzeros=6241, allocated nonzeros=6241<br>

>           total number of mallocs used during MatSetValues calls=0<br>

>             not using I-node routines<br>

>   maximum iterations=50, maximum function evaluations=10000<br>

>   tolerances: relative=1e-08, absolute=1e-50, solution=1e-08<br>

>   total number of function evaluations=51<br>

>   norm schedule ALWAYS<br>

>   Jacobian is built using a DMDA local Jacobian<br>

> problem ex10 on 20 x 20 point 2D grid with d = 3, and eps = 0.082:<br>

>   error |u-uexact|_inf = 3.996e-01, |u-uexact|_h = 2.837e-01<br>

><br>

> We have been stuck on this for a while now. We do not know how to debug<br>

> this issue. Please let us know if you have any insights.<br>

</blockquote></div>

</blockquote></div><br clear="all"><div><br></div>-- <br><div dir="ltr"><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div>What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.<br>-- Norbert Wiener</div><div><br></div><div><a href="http://www.cse.buffalo.edu/~knepley/" target="_blank">https://www.cse.buffalo.edu/~knepley/</a><br></div></div></div></div></div></div></div>
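<div><br></div><div>Regarding reason 1) above: one common way to check a hand-coded Jacobian is to compare it against a finite-difference approximation of the residual. The following is only a minimal sketch of that idea in plain Python, not code from this thread. The 1D model problem and the names residual, jacobian, fd_jacobian and max_jacobian_error are all hypothetical stand-ins for the user's local residual and Jacobian routines.</div>

```python
def residual(u):
    """Hypothetical 1D model residual: F_i = 2u_i - u_{i-1} - u_{i+1} + u_i^3 - 1."""
    n = len(u)
    F = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0        # zero Dirichlet value on the left
        right = u[i + 1] if i < n - 1 else 0.0   # zero Dirichlet value on the right
        F.append(2.0 * u[i] - left - right + u[i] ** 3 - 1.0)
    return F


def jacobian(u):
    """Hand-coded Jacobian of residual(); this is the routine being checked."""
    n = len(u)
    J = [[0.0] * n for _ in range(n)]
    for i in range(n):
        J[i][i] = 2.0 + 3.0 * u[i] ** 2
        if i > 0:
            J[i][i - 1] = -1.0
        if i < n - 1:
            J[i][i + 1] = -1.0
    return J


def fd_jacobian(F, u, h=1e-6):
    """Forward-difference approximation, one column per perturbed unknown."""
    n = len(u)
    F0 = F(u)
    J = [[0.0] * n for _ in range(n)]
    for j in range(n):
        up = list(u)
        up[j] += h
        Fp = F(up)
        for i in range(n):
            J[i][j] = (Fp[i] - F0[i]) / h
    return J


def max_jacobian_error(F, J, u):
    """Largest entrywise difference between the hand-coded and FD Jacobians."""
    Ja, Jfd = J(u), fd_jacobian(F, u)
    n = len(u)
    return max(abs(Ja[i][j] - Jfd[i][j]) for i in range(n) for j in range(n))


u = [0.1 * (i + 1) for i in range(5)]
# A correct Jacobian should give a small value, on the order of the step h.
print(max_jacobian_error(residual, jacobian, u))
```

<div>A difference that does not shrink toward the size of h points to a wrong entry in the hand-coded Jacobian. PETSc has a comparable built-in check, the -snes_test_jacobian option; whether it can be applied to the subdomain solvers through the sub_ prefix (as -sub_snes_test_jacobian) is an assumption here and should be verified against the PETSc documentation.</div>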