<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Wed, Nov 15, 2017 at 4:36 PM, Kong, Fande <span dir="ltr"><<a href="mailto:fande.kong@inl.gov" target="_blank">fande.kong@inl.gov</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><div><div><div>Hi Barry,<br><br></div>Thanks for your reply. I was wondering why this happens only when we use superlu_dist. I am trying to understand the algorithm in superlu_dist. If we use ASM or MUMPS, we do not produce these differences. <br><br></div>The differences actually are NOT meaningless. In fact, we have a real transient application that presents this issue. When we run the simulation with superlu_dist in parallel for thousands of time steps, the final physics solution looks totally different from different runs. The differences are not acceptable any more. For a steady problem, the difference may be meaningless. But it is significant for the transient problem. <br></div></div></div></blockquote><div><br></div><div>Are you sure this formulation is stable? It does not seem like it.</div><div><br></div><div> Matt</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><div></div>This makes the solution not reproducible, and we can not even set a targeting solution in the test system because the solution is so different from one run to another. I guess there might/may be a tiny bug in superlu_dist or the PETSc interface to superlu_dist.<span class="HOEnZb"><font color="#888888"><br><br><br></font></span></div><span class="HOEnZb"><font color="#888888">Fande,<br><div><br><br><div><div><br></div></div></div></font></span></div><div class="HOEnZb"><div class="h5"><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Nov 15, 2017 at 1:59 PM, Smith, Barry F. <span dir="ltr"><<a href="mailto:bsmith@mcs.anl.gov" target="_blank">bsmith@mcs.anl.gov</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><br>
Meaningless differences

<div class="m_-6513361910201159489HOEnZb"><div class="m_-6513361910201159489h5"><br>
<br>
> On Nov 15, 2017, at 2:26 PM, Kong, Fande <fande.kong@inl.gov> wrote:
>
> Hi,
>
> There is a heat conduction problem. When superlu_dist is used as a preconditioner, we have random results from different runs. Is there a random algorithm in superlu_dist? If we use ASM or MUMPS as the preconditioner, we then don't have this issue.
>
> run 1:
>
> 0 Nonlinear |R| = 9.447423e+03
> 0 Linear |R| = 9.447423e+03
> 1 Linear |R| = 1.013384e-02
> 2 Linear |R| = 4.020995e-08
> 1 Nonlinear |R| = 1.404678e-02
> 0 Linear |R| = 1.404678e-02
> 1 Linear |R| = 5.104757e-08
> 2 Linear |R| = 7.699637e-14
> 2 Nonlinear |R| = 5.106418e-08
>
> run 2:
>
> 0 Nonlinear |R| = 9.447423e+03
> 0 Linear |R| = 9.447423e+03
> 1 Linear |R| = 1.013384e-02
> 2 Linear |R| = 4.020995e-08
> 1 Nonlinear |R| = 1.404678e-02
> 0 Linear |R| = 1.404678e-02
> 1 Linear |R| = 5.109913e-08
> 2 Linear |R| = 7.189091e-14
> 2 Nonlinear |R| = 5.111591e-08
>
> run 3:
>
> 0 Nonlinear |R| = 9.447423e+03
> 0 Linear |R| = 9.447423e+03
> 1 Linear |R| = 1.013384e-02
> 2 Linear |R| = 4.020995e-08
> 1 Nonlinear |R| = 1.404678e-02
> 0 Linear |R| = 1.404678e-02
> 1 Linear |R| = 5.104942e-08
> 2 Linear |R| = 7.465572e-14
> 2 Nonlinear |R| = 5.106642e-08
>
> run 4:
>
> 0 Nonlinear |R| = 9.447423e+03
> 0 Linear |R| = 9.447423e+03
> 1 Linear |R| = 1.013384e-02
> 2 Linear |R| = 4.020995e-08
> 1 Nonlinear |R| = 1.404678e-02
> 0 Linear |R| = 1.404678e-02
> 1 Linear |R| = 5.102730e-08
> 2 Linear |R| = 7.132220e-14
> 2 Nonlinear |R| = 5.104442e-08
>
> Solver details:
>
> SNES Object: 8 MPI processes
>   type: newtonls
>   maximum iterations=15, maximum function evaluations=10000
>   tolerances: relative=1e-08, absolute=1e-11, solution=1e-50
>   total number of linear solver iterations=4
>   total number of function evaluations=7
>   norm schedule ALWAYS
>   SNESLineSearch Object: 8 MPI processes
>     type: basic
>     maxstep=1.000000e+08, minlambda=1.000000e-12
>     tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08
>     maximum iterations=40
>   KSP Object: 8 MPI processes
>     type: gmres
>       restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
>       happy breakdown tolerance 1e-30
>     maximum iterations=100, initial guess is zero
>     tolerances: relative=1e-06, absolute=1e-50, divergence=10000.
>     right preconditioning
>     using UNPRECONDITIONED norm type for convergence test
>   PC Object: 8 MPI processes
>     type: lu
>       out-of-place factorization
>       tolerance for zero pivot 2.22045e-14
>       matrix ordering: natural
>       factor fill ratio given 0., needed 0.
>         Factored matrix follows:
>           Mat Object: 8 MPI processes
>             type: superlu_dist
>             rows=7925, cols=7925
>             package used to perform factorization: superlu_dist
>             total: nonzeros=0, allocated nonzeros=0
>             total number of mallocs used during MatSetValues calls =0
>               SuperLU_DIST run parameters:
>                 Process grid nprow 4 x npcol 2
>                 Equilibrate matrix TRUE
>                 Matrix input mode 1
>                 Replace tiny pivots FALSE
>                 Use iterative refinement TRUE
>                 Processors in row 4 col partition 2
>                 Row permutation LargeDiag
>                 Column permutation METIS_AT_PLUS_A
>                 Parallel symbolic factorization FALSE
>                 Repeated factorization SamePattern
>     linear system matrix followed by preconditioner matrix:
>     Mat Object: 8 MPI processes
>       type: mffd
>       rows=7925, cols=7925
>       Matrix-free approximation:
>         err=1.49012e-08 (relative error in function evaluation)
>         Using wp compute h routine
>         Does not compute normU
>     Mat Object: () 8 MPI processes
>       type: mpiaij
>       rows=7925, cols=7925
>       total: nonzeros=63587, allocated nonzeros=63865
>       total number of mallocs used during MatSetValues calls =0
>       not using I-node (on process 0) routines
>
>
> Fande,
>
>
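In case it helps with testing, here is a minimal sketch (not the application's actual setup) of how the factorization package behind the LU preconditioner can be switched between SuperLU_DIST and MUMPS while leaving the rest of the solver stack above unchanged. It assumes the PETSc 3.8-era names PCFactorSetMatSolverPackage and -pc_factor_mat_solver_package; newer releases rename these to PCFactorSetMatSolverType and -pc_factor_mat_solver_type.

    /* Sketch under the assumptions above: select the parallel direct solver
       used to factor the preconditioning matrix. Problem setup (residual,
       Jacobian, mpiaij preconditioning matrix) is elided. */
    #include <petscsnes.h>

    int main(int argc, char **argv)
    {
      SNES           snes;
      KSP            ksp;
      PC             pc;
      PetscErrorCode ierr;

      ierr = PetscInitialize(&argc, &argv, NULL, NULL);CHKERRQ(ierr);
      ierr = SNESCreate(PETSC_COMM_WORLD, &snes);CHKERRQ(ierr);

      /* ... SNESSetFunction/SNESSetJacobian and the preconditioning matrix go here ... */

      ierr = SNESGetKSP(snes, &ksp);CHKERRQ(ierr);
      ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
      ierr = PCSetType(pc, PCLU);CHKERRQ(ierr);
      /* Swap MATSOLVERSUPERLU_DIST for MATSOLVERMUMPS to repeat the runs with MUMPS */
      ierr = PCFactorSetMatSolverPackage(pc, MATSOLVERSUPERLU_DIST);CHKERRQ(ierr);

      /* Command-line options such as -pc_factor_mat_solver_package mumps still override */
      ierr = SNESSetFromOptions(snes);CHKERRQ(ierr);

      /* ... SNESSolve(snes, NULL, x); ... */

      ierr = SNESDestroy(&snes);CHKERRQ(ierr);
      ierr = PetscFinalize();
      return ierr;
    }

Keeping everything else (GMRES, right preconditioning, the matrix-free operator) identical and only changing the package should isolate whether the run-to-run variation comes from the SuperLU_DIST factorization.
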
-- 
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/