<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">Danyang:</div><div class="gmail_quote">It is likely a zero pivot. I'm adding a feature to petsc. When matrix factorization fails, computation continues with error information stored in</div><div class="gmail_quote">ksp->reason=DIVERGED_PCSETUP_FAILED.</div><div class="gmail_quote">For your timestepping code, you may able to automatically reduce timestep and continue your simulation.</div><div class="gmail_quote"><br></div><div class="gmail_quote">Do you want to test it? If so, you need install petsc-dev with my branch hzhang/matpackage-erroriffpe on your cluster. We may merge this branch to petsc-master soon.<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF"><br>

> It's not easy to run in debugging mode as the cluster does not have
> petsc installed in debug mode. Restarting the case from the crashing
> time does not reproduce the problem, so if I want to detect this error
> I have to start the simulation from the beginning, which takes hours
> on the cluster.

This is why we are adding this new feature.
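
By the way, even without a debug build of PETSc you can ask MUMPS itself to print more diagnostics. A minimal sketch, assuming the factored matrix you get from PCFactorGetMatrix is a_flow_j as in your code below, and that MUMPS ICNTL(4)=3 is the print level you want (please check the MUMPS manual for the exact levels):

      ! raise the MUMPS output level (ICNTL(4)) on the factored matrix
      call MatMumpsSetIcntl(a_flow_j,4,3,ierr)
      CHKERRQ(ierr)

The same can be done at run time with the option -mat_mumps_icntl_4 3.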

> Do you mean I need to redo the symbolic factorization? For now, I only
> do the factorization once, at the first timestep, and then reuse it.
> Part of the code is shown below.
>
>       if (timestep == 1) then
>          call PCFactorSetMatSolverPackage(pc_flow,MATSOLVERMUMPS,ierr)
>          CHKERRQ(ierr)
>
>          call PCFactorSetUpMatSolverPackage(pc_flow,ierr)
>          CHKERRQ(ierr)
>
>          call PCFactorGetMatrix(pc_flow,a_flow_j,ierr)
>          CHKERRQ(ierr)
>       end if
>
>       call KSPSolve(ksp_flow,b_flow,x_flow,ierr)
>       CHKERRQ(ierr)

I do not think you need to change this part of the code.
Does your code check convergence at each time step?

Hong
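
P.S. If it does not already, a check along these lines right after the existing KSPSolve would catch a failed solve at each step (reason is a KSPConvergedReason variable; any negative value means failure, and with the branch above a failed factorization shows up here as well):

      call KSPGetConvergedReason(ksp_flow,reason,ierr)
      CHKERRQ(ierr)
      if (reason < 0) then
         ! e.g. cut the timestep and retry, as in the sketch above
      end if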

> On 15-12-02 08:39 AM, Hong wrote:
>> Danyang:
>>> My code fails due to an error in an external library. It works fine
>>> for the previous 2000+ timesteps but then crashes.
>>>
>>> [4]PETSC ERROR: Error in external library
>>> [4]PETSC ERROR: Error reported by MUMPS in numerical factorization phase: INFO(1)=-1, INFO(2)=0
>>
>> This simply says an error occurred on proc[0] during numerical
>> factorization, which usually means it either hit a zero pivot or ran
>> out of memory. Since it happens at a later timestep, where I guess you
>> reuse the matrix factor, a zero pivot might be the problem.
>> Is it possible to run it in debugging mode? That way, MUMPS would dump
>> out more information.
>>
>>> Then I tried the same simulation on another machine using the same
>>> number of processors, and it does not fail.
>>
>> Does this machine have larger memory?
>>
>> Hong