<div dir="ltr">Interesting. The main thing is it's now sorted out and then solver is back in production. <div><br></div><div>Thanks for your help</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Aug 11, 2017 at 1:05 PM, Barry Smith <span dir="ltr"><<a href="mailto:bsmith@mcs.anl.gov" target="_blank">bsmith@mcs.anl.gov</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class=""><br>

> On Aug 11, 2017, at 2:43 PM, Gaetan Kenway <<a href="mailto:gaetank@gmail.com">gaetank@gmail.com</a>> wrote:<br>

><br>

> Huh. That's odd then. I was actually bisecting the petsc releases to narrow it down...I knew 3.3 was ok and 3.7 was not. So I tried 3.5 which was ok, and then 3.6 which was ok as well, leading to me to conclude the difference was between 3.6 and 3.7.<br>

<br>

</span>   Other people have also reported only seeing the problem with later versions. I think it is related to the default tolerances and it is just "pure luck" that it didn't need a shift with the intermediate versions. ILU with nearly zero pivots is a fragile issue.<br>

<span class="HOEnZb"><font color="#888888"><br>

   Barry<br>

</font></span><div class="HOEnZb"><div class="h5"><br>

><br>

> On Fri, Aug 11, 2017 at 12:03 PM, Barry Smith <<a href="mailto:bsmith@mcs.anl.gov">bsmith@mcs.anl.gov</a>> wrote:<br>

><br>

>    Thanks for confirming this. The change was actually in the 3.4 release. I have updated the 3.4 changes file to include this change in both the maint and master branches.<br>

><br>

>     Barry<br>

><br>

> > On Aug 11, 2017, at 12:47 PM, Gaetan Kenway <<a href="mailto:gaetank@gmail.com">gaetank@gmail.com</a>> wrote:<br>

> ><br>

> > OK, so that was certainly it.  I vaguely recall reading something about this on the mailing list at one point in time, but couldn't find anything.<br>

> > I would definitely put something on the 3.7 change doc since I looked there first to see if anything stuck out.<br>

> ><br>

> > Thanks!<br>

> ><br>

> > On Fri, Aug 11, 2017 at 10:28 AM, Barry Smith <<a href="mailto:bsmith@mcs.anl.gov">bsmith@mcs.anl.gov</a>> wrote:<br>

> ><br>

> >    Run with the additional option<br>

> ><br>

> >  -sub_pc_factor_shift_type nonzero<br>

> ><br>

> > does this resolve the problem. We changed the default behavior for ILU when it detects "zero" pivots.<br>

> ><br>

> > Please let us know if this resolves the problem and we'll update the changes file.<br>

> ><br>

> >    Barry<br>

> ><br>

> ><br>

> ><br>

> > > On Aug 11, 2017, at 12:14 PM, Gaetan Kenway <<a href="mailto:gaetank@gmail.com">gaetank@gmail.com</a>> wrote:<br>

> > ><br>

> > > Hi All<br>

> > ><br>

> > > I'm in the process of updating a code that uses PETSc for solving linear systems for an unstructured CFD code. As of recently, it was using an ancient version (3.3). However, when I updated it up to 3.7.6 I ended up running into issues with one of the KSP solves. The remainder of the code is identical,<br>

> > > I've tracked the issue down to occurring between version 3.6.4 and version 3.7.0 . The same issue is present on the most recent version 3.7.6.<br>

> > ><br>

> > > Specifically the issue is that on the second iteration, on the 3.7 version the KSP kicks out with KSP converged reason of -11 or  KSP_DIVERGED_PCSETUP_FAILED . After that the two runs differ.  The KSPView for each of the two are given below which are identical, up to small formatting changes. There is still more I can track down, but I thought I would ask if someone knows what might have changed between these two versions which could save me a lot of time.<br>

> > ><br>

> > > Thanks,<br>

> > > Gaetan<br>

> > ><br>

> > > 3.6 KSP View:<br>

> > > KSP Object: 8 MPI processes<br>

> > >   type: gmres<br>

> > >     GMRES: restart=3, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement<br>

> > >     GMRES: happy breakdown tolerance 1e-30<br>

> > >   maximum iterations=3<br>

> > >   using preconditioner applied to right hand side for initial guess<br>

> > >   tolerances:  relative=1e-08, absolute=1e-20, divergence=1e+15<br>

> > >   left preconditioning<br>

> > >   using nonzero initial guess<br>

> > >   using PRECONDITIONED norm type for convergence test<br>

> > > PC Object: 8 MPI processes<br>

> > >   type: bjacobi<br>

> > >     block Jacobi: number of blocks = 8<br>

> > >     Local solve is same for all blocks, in the following KSP and PC objects:<br>

> > >   KSP Object:  (sub_)   1 MPI processes<br>

> > >     type: preonly<br>

> > >     maximum iterations=10000, initial guess is zero<br>

> > >     tolerances:  relative=1e-05, absolute=1e-50, divergence=10000<br>

> > >     left preconditioning<br>

> > >     using NONE norm type for convergence test<br>

> > >   PC Object:  (sub_)   1 MPI processes<br>

> > >     type: ilu<br>

> > >       ILU: out-of-place factorization<br>

> > >       0 levels of fill<br>

> > >       tolerance for zero pivot 2.22045e-14<br>

> > >       matrix ordering: natural<br>

> > >       factor fill ratio given 1, needed 1<br>

> > >         Factored matrix follows:<br>

> > >           Mat Object:           1 MPI processes<br>

> > >             type: seqaij<br>

> > >             rows=46439, cols=46439<br>

> > >             package used to perform factorization: petsc<br>

> > >             total: nonzeros=502615, allocated nonzeros=502615<br>

> > >             total number of mallocs used during MatSetValues calls =0<br>

> > >               not using I-node routines<br>

> > >     linear system matrix = precond matrix:<br>

> > >     Mat Object:     1 MPI processes<br>

> > >       type: seqaij<br>

> > >       rows=46439, cols=46439<br>

> > >       total: nonzeros=502615, allocated nonzeros=504081<br>

> > >       total number of mallocs used during MatSetValues calls =0<br>

> > >         not using I-node routines<br>

> > >   linear system matrix = precond matrix:<br>

> > >   Mat Object:   8 MPI processes<br>

> > >     type: mpiaij<br>

> > >     rows=368656, cols=368656<br>

> > >     total: nonzeros=4.63682e+06, allocated nonzeros=4.64417e+06<br>

> > >     total number of mallocs used during MatSetValues calls =0<br>

> > >       not using I-node (on process 0) routines<br>

> > > <my output: reason, iterations, rtol, atol><br>

> > > reason,its: 2 3 0.001 1e-20<br>

> > ><br>

> > ><br>

> > > Petsc 3.7 KSP View<br>

> > > KSP Object: 8 MPI processes<br>

> > >   type: gmres<br>

> > >     GMRES: restart=3, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement<br>

> > >     GMRES: happy breakdown tolerance 1e-30<br>

> > >   maximum iterations=3<br>

> > >   using preconditioner applied to right hand side for initial guess<br>

> > >   tolerances:  relative=1e-08, absolute=1e-20, divergence=1e+15<br>

> > >   left preconditioning<br>

> > >   using nonzero initial guess<br>

> > >   using PRECONDITIONED norm type for convergence test<br>

> > > PC Object: 8 MPI processes<br>

> > >   type: bjacobi<br>

> > >     block Jacobi: number of blocks = 8<br>

> > >     Local solve is same for all blocks, in the following KSP and PC objects:<br>

> > >   KSP Object:  (sub_)   1 MPI processes<br>

> > >     type: preonly<br>

> > >     maximum iterations=10000, initial guess is zero<br>

> > >     tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.<br>

> > >     left preconditioning<br>

> > >     using NONE norm type for convergence test<br>

> > >   PC Object:  (sub_)   1 MPI processes<br>

> > >     type: ilu<br>

> > >       ILU: out-of-place factorization<br>

> > >       0 levels of fill<br>

> > >       tolerance for zero pivot 2.22045e-14<br>

> > >       matrix ordering: natural<br>

> > >       factor fill ratio given 1., needed 1.<br>

> > >         Factored matrix follows:<br>

> > >           Mat Object:           1 MPI processes<br>

> > >             type: seqaij<br>

> > >             rows=46439, cols=46439<br>

> > >             package used to perform factorization: petsc<br>

> > >             total: nonzeros=502615, allocated nonzeros=502615<br>

> > >             total number of mallocs used during MatSetValues calls =0<br>

> > >               not using I-node routines<br>

> > >     linear system matrix = precond matrix:<br>

> > >     Mat Object:     1 MPI processes<br>

> > >       type: seqaij<br>

> > >       rows=46439, cols=46439<br>

> > >       total: nonzeros=502615, allocated nonzeros=504081<br>

> > >       total number of mallocs used during MatSetValues calls =0<br>

> > >         not using I-node routines<br>

> > >   linear system matrix = precond matrix:<br>

> > >   Mat Object:   8 MPI processes<br>

> > >     type: mpiaij<br>

> > >     rows=368656, cols=368656<br>

> > >     total: nonzeros=4636822, allocated nonzeros=4644168<br>

> > >     total number of mallocs used during MatSetValues calls =0<br>

> > >       not using I-node (on process 0) routines<br>

> > > <my output: reason, iterations, rtol, atol><br>

> > > reason,its: -11 0 0.001 1e-20<br>

> > ><br>

> > ><br>

> ><br>

> ><br>

><br>

><br>

<br>

</div></div></blockquote></div><br></div>