<div dir="ltr"><div><div>Thanks a lot for your explanation, Barry,<br><br></div>This makes sense!<br><br></div>Fande,<br></div><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Feb 24, 2017 at 1:56 PM, Barry Smith <span dir="ltr"><<a href="mailto:bsmith@mcs.anl.gov" target="_blank">bsmith@mcs.anl.gov</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><br>

    Fande,<br>

<br>

    Yes. Say one is doing a timestepping and using a direct solver. With time stepping we do not want to necessarily stop (in fact we almost never) on a failed solve (due to for example a failed factorization). We want the code to continue up the stack until it gets to the time stepper with the information that the linear solver failed; then the timestepper can decide what to do, decrease the time step and try again (common) or switch to another method etc. The same could be true for linear solvers called from within optimization routines.<br>

<br>

    The reason we do the funny business with putting infinity into the vector is so that errors that occur on any process get propagated to all processes during the next inner product or norm computation, otherwise we would need to have some "side channel" communication which MPI doesn't really provide to propagate errors to all processes, this would be a lot of performance crippling communication that would have to take place during the entire solution process. Plus our approach since it doesn't change the flow of control cannot result in lost memory etc. Note that C (and MPI) don't have any exception mechanism that could be used to handle these "exceptional cases" directly.<br>

<br>

   Note that you can use options like -snes_error_if_not_converged or -ksp_error_if_not_converged to force the error to be set as soon as any problem with convergence or zero factorization is found (good for debugging, but not usually for production unless production is solving a linear system only).<br>

<span class="HOEnZb"><font color="#888888"><br>

   Barry<br>

</font></span><div class="HOEnZb"><div class="h5"><br>

> On Feb 24, 2017, at 2:40 PM, Kong, Fande <<a href="mailto:fande.kong@inl.gov">fande.kong@inl.gov</a>> wrote:<br>

><br>

> Hi All,<br>

><br>

> In MatSolve(), there is a piece of code:<br>

><br>

>   if (mat->errortype) {<br>

>     ierr = PetscInfo1(mat,"MatFactorError %D\n",mat->errortype);CHKERRQ(<wbr>ierr);<br>

>     ierr = VecSetInf(x);CHKERRQ(ierr);<br>

>   } else {<br>

>     ierr = (*mat->ops->solve)(mat,b,x);<wbr>CHKERRQ(ierr);<br>

>   }<br>

><br>

> If a direct solver such as LU or superlu fails to create a factorization, why we do not stop here and throw out an error message here or earlier?  Now we just let solver keep doing garbage computation, and finally we have a solution like this:<br>

><br>

> Vec Object: 1 MPI processes<br>

>   type: mpi<br>

> Process [0]<br>

> inf.<br>

> inf.<br>

> inf.<br>

> inf.<br>

> inf.<br>

> inf.<br>

> inf.<br>

> inf.<br>

> inf.<br>

> inf.<br>

> inf.<br>

> inf.<br>

> inf.<br>

> inf.<br>

> inf.<br>

> inf.<br>

><br>

> Any particular reason to handle the thing in this way?<br>

><br>

> Fande,<br>

><br>

><br>

<br>

</div></div></blockquote></div><br></div>