<div dir="ltr">Interesting. The main thing is it's now sorted out and then solver is back in production. <div><br></div><div>Thanks for your help</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Aug 11, 2017 at 1:05 PM, Barry Smith <span dir="ltr"><<a href="mailto:bsmith@mcs.anl.gov" target="_blank">bsmith@mcs.anl.gov</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class=""><br>
> On Aug 11, 2017, at 2:43 PM, Gaetan Kenway <gaetank@gmail.com> wrote:
>
> Huh. That's odd then. I was actually bisecting the PETSc releases to narrow it down... I knew 3.3 was ok and 3.7 was not. So I tried 3.5, which was ok, and then 3.6, which was ok as well, leading me to conclude the difference was between 3.6 and 3.7.

Other people have also reported only seeing the problem with later versions. I think it is related to the default tolerances, and it is just "pure luck" that it didn't need a shift with the intermediate versions. ILU with nearly-zero pivots is fragile.
<span class="HOEnZb"><font color="#888888"><br>
Barry<br>
</font></span><div class="HOEnZb"><div class="h5"><br>
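For reference, the shift discussed above can also be switched on from inside the application rather than on the command line. A minimal sketch, assuming a C driver built against PETSc 3.7 (the routine name UseNonzeroShift is only illustrative):

    #include <petscksp.h>

    /* Push the option into the global options database before the code calls
       KSPSetFromOptions(); the "sub_" prefix targets the block-Jacobi
       sub-solvers shown in the KSPView output below. As of PETSc 3.7,
       PetscOptionsSetValue() takes a PetscOptions argument (NULL = global). */
    static PetscErrorCode UseNonzeroShift(void)
    {
      PetscErrorCode ierr;

      PetscFunctionBeginUser;
      /* same effect as -sub_pc_factor_shift_type nonzero */
      ierr = PetscOptionsSetValue(NULL,"-sub_pc_factor_shift_type","nonzero");CHKERRQ(ierr);
      PetscFunctionReturn(0);
    }
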
>
> On Fri, Aug 11, 2017 at 12:03 PM, Barry Smith <bsmith@mcs.anl.gov> wrote:
>
> Thanks for confirming this. The change was actually in the 3.4 release. I have updated the 3.4 changes file to include this change in both the maint and master branches.
>
> Barry
>
> > On Aug 11, 2017, at 12:47 PM, Gaetan Kenway <gaetank@gmail.com> wrote:
> >
> > OK, so that was certainly it. I vaguely recall reading something about this on the mailing list at some point, but I couldn't find anything.
> > I would definitely put something in the 3.7 changes doc, since I looked there first to see if anything stuck out.
> >
> > Thanks!
> >
> > On Fri, Aug 11, 2017 at 10:28 AM, Barry Smith <bsmith@mcs.anl.gov> wrote:
> >
> > Run with the additional option
> >
> > -sub_pc_factor_shift_type nonzero
> >
> > Does this resolve the problem? We changed the default behavior for ILU when it detects "zero" pivots.
> >
> > Please let us know if this resolves the problem and we'll update the changes file.
> >
> > Barry
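The same shift can also be applied through the API once the block-Jacobi sub-solvers exist. A sketch, assuming the outer KSP handle from the views below and that operators have already been set (PCBJacobiGetSubKSP() is only valid after the solver is set up):

    #include <petscksp.h>

    /* Sketch only: apply MAT_SHIFT_NONZERO to each local ILU factorization.
       Call after KSPSetOperators()/KSPSetFromOptions() and before KSPSolve();
       the sub-KSPs exist only once the outer solver has been set up. */
    static PetscErrorCode EnableNonzeroShift(KSP ksp)
    {
      PetscErrorCode ierr;
      PC             pc,subpc;
      KSP           *subksp;
      PetscInt       nlocal,first,i;

      PetscFunctionBeginUser;
      ierr = KSPSetUp(ksp);CHKERRQ(ierr);
      ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
      ierr = PCBJacobiGetSubKSP(pc,&nlocal,&first,&subksp);CHKERRQ(ierr);
      for (i=0; i<nlocal; i++) {
        ierr = KSPGetPC(subksp[i],&subpc);CHKERRQ(ierr);
        ierr = PCFactorSetShiftType(subpc,MAT_SHIFT_NONZERO);CHKERRQ(ierr);
      }
      PetscFunctionReturn(0);
    }
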
> >
> >
> >
> > > On Aug 11, 2017, at 12:14 PM, Gaetan Kenway <gaetank@gmail.com> wrote:
> > >
> > > Hi All
> > >
> > > I'm in the process of updating an unstructured CFD code that uses PETSc for solving its linear systems. Until recently it was using an ancient version (3.3). However, when I updated it to 3.7.6 I ran into issues with one of the KSP solves. The remainder of the code is identical.
> > > I've tracked the issue down to a change between version 3.6.4 and version 3.7.0. The same issue is present in the most recent version, 3.7.6.
> > >
> > > Specifically, on the second iteration the 3.7 version's KSP kicks out with a converged reason of -11, KSP_DIVERGED_PCSETUP_FAILED. After that the two runs differ. The KSPView output for each version is given below; the two are identical up to small formatting changes. There is still more I can track down, but I thought I would ask first in case someone knows what might have changed between these two versions, which could save me a lot of time.
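A sketch of the kind of post-solve check that produces the "reason,its:" lines in the output below; the routine name and print format are illustrative rather than the application's actual code:

    #include <petscksp.h>

    /* Report how the solve ended: KSP_DIVERGED_PCSETUP_FAILED is the reason
       whose integer value is -11, i.e. the preconditioner setup (here the
       sub-domain ILU factorization) failed. */
    static PetscErrorCode ReportSolve(KSP ksp)
    {
      PetscErrorCode     ierr;
      KSPConvergedReason reason;
      PetscInt           its,maxits;
      PetscReal          rtol,atol,dtol;

      PetscFunctionBeginUser;
      ierr = KSPGetConvergedReason(ksp,&reason);CHKERRQ(ierr);
      ierr = KSPGetIterationNumber(ksp,&its);CHKERRQ(ierr);
      ierr = KSPGetTolerances(ksp,&rtol,&atol,&dtol,&maxits);CHKERRQ(ierr);
      ierr = PetscPrintf(PETSC_COMM_WORLD,"reason,its: %d %D %g %g\n",
                         (int)reason,its,(double)rtol,(double)atol);CHKERRQ(ierr);
      PetscFunctionReturn(0);
    }
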
> > >
> > > Thanks,
> > > Gaetan
> > >
> > > 3.6 KSP View:
> > > KSP Object: 8 MPI processes
> > > type: gmres
> > > GMRES: restart=3, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
> > > GMRES: happy breakdown tolerance 1e-30
> > > maximum iterations=3
> > > using preconditioner applied to right hand side for initial guess
> > > tolerances: relative=1e-08, absolute=1e-20, divergence=1e+15
> > > left preconditioning
> > > using nonzero initial guess
> > > using PRECONDITIONED norm type for convergence test
> > > PC Object: 8 MPI processes
> > > type: bjacobi
> > > block Jacobi: number of blocks = 8
> > > Local solve is same for all blocks, in the following KSP and PC objects:
> > > KSP Object: (sub_) 1 MPI processes
> > > type: preonly
> > > maximum iterations=10000, initial guess is zero
> > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000
> > > left preconditioning
> > > using NONE norm type for convergence test
> > > PC Object: (sub_) 1 MPI processes
> > > type: ilu
> > > ILU: out-of-place factorization
> > > 0 levels of fill
> > > tolerance for zero pivot 2.22045e-14
> > > matrix ordering: natural
> > > factor fill ratio given 1, needed 1
> > > Factored matrix follows:
> > > Mat Object: 1 MPI processes
> > > type: seqaij
> > > rows=46439, cols=46439
> > > package used to perform factorization: petsc
> > > total: nonzeros=502615, allocated nonzeros=502615
> > > total number of mallocs used during MatSetValues calls =0
> > > not using I-node routines
> > > linear system matrix = precond matrix:
> > > Mat Object: 1 MPI processes
> > > type: seqaij
> > > rows=46439, cols=46439
> > > total: nonzeros=502615, allocated nonzeros=504081
> > > total number of mallocs used during MatSetValues calls =0
> > > not using I-node routines
> > > linear system matrix = precond matrix:
> > > Mat Object: 8 MPI processes
> > > type: mpiaij
> > > rows=368656, cols=368656
> > > total: nonzeros=4.63682e+06, allocated nonzeros=4.64417e+06
> > > total number of mallocs used during MatSetValues calls =0
> > > not using I-node (on process 0) routines
> > > <my output: reason, iterations, rtol, atol>
> > > reason,its: 2 3 0.001 1e-20
> > >
> > >
> > > PETSc 3.7 KSP View:
> > > KSP Object: 8 MPI processes
> > > type: gmres
> > > GMRES: restart=3, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
> > > GMRES: happy breakdown tolerance 1e-30
> > > maximum iterations=3
> > > using preconditioner applied to right hand side for initial guess
> > > tolerances: relative=1e-08, absolute=1e-20, divergence=1e+15
> > > left preconditioning
> > > using nonzero initial guess
> > > using PRECONDITIONED norm type for convergence test
> > > PC Object: 8 MPI processes
> > > type: bjacobi
> > > block Jacobi: number of blocks = 8
> > > Local solve is same for all blocks, in the following KSP and PC objects:
> > > KSP Object: (sub_) 1 MPI processes
> > > type: preonly
> > > maximum iterations=10000, initial guess is zero
> > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
> > > left preconditioning
> > > using NONE norm type for convergence test
> > > PC Object: (sub_) 1 MPI processes
> > > type: ilu
> > > ILU: out-of-place factorization
> > > 0 levels of fill
> > > tolerance for zero pivot 2.22045e-14
> > > matrix ordering: natural
> > > factor fill ratio given 1., needed 1.
> > > Factored matrix follows:
> > > Mat Object: 1 MPI processes
> > > type: seqaij
> > > rows=46439, cols=46439
> > > package used to perform factorization: petsc
> > > total: nonzeros=502615, allocated nonzeros=502615
> > > total number of mallocs used during MatSetValues calls =0
> > > not using I-node routines
> > > linear system matrix = precond matrix:
> > > Mat Object: 1 MPI processes
> > > type: seqaij
> > > rows=46439, cols=46439
> > > total: nonzeros=502615, allocated nonzeros=504081
> > > total number of mallocs used during MatSetValues calls =0
> > > not using I-node routines
> > > linear system matrix = precond matrix:
> > > Mat Object: 8 MPI processes
> > > type: mpiaij
> > > rows=368656, cols=368656
> > > total: nonzeros=4636822, allocated nonzeros=4644168
> > > total number of mallocs used during MatSetValues calls =0
> > > not using I-node (on process 0) routines
> > > <my output: reason, iterations, rtol, atol>
> > > reason,its: -11 0 0.001 1e-20
> > >
> > >
> >
> >
>
>
