<div dir="ltr">Thanks, Barry,<br><div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Nov 15, 2017 at 4:04 PM, Smith, Barry F. <span dir="ltr"><<a target="_blank" href="mailto:bsmith@mcs.anl.gov">bsmith@mcs.anl.gov</a>></span> wrote:<br><blockquote style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex" class="gmail_quote"><br>
Do the ASM runs for thousands of time steps produce the same final "physical results" as the MUMPS run for thousands of time steps, while with SuperLU you get very different "physical results"?<br></blockquote><div><br></div><div>Let me add a little more detail. The simulation with SuperLU may fail at some time step, but sometimes it also runs successfully over the whole time range. Which of the two happens appears to be random.<br><br></div><div>We will try ASM and MUMPS.<br><br></div><div>Fande,<br></div><div><br><br> </div><blockquote style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex" class="gmail_quote">
<span class="gmail-HOEnZb"><font color="#888888"><br>
Barry<br>
</font></span><div class="gmail-HOEnZb"><div class="gmail-h5"><br>
<br>
> On Nov 15, 2017, at 4:52 PM, Kong, Fande <<a href="mailto:fande.kong@inl.gov">fande.kong@inl.gov</a>> wrote:<br>
><br>
><br>
><br>
> On Wed, Nov 15, 2017 at 3:35 PM, Smith, Barry F. <<a href="mailto:bsmith@mcs.anl.gov">bsmith@mcs.anl.gov</a>> wrote:<br>
><br>
> Since the convergence labeled linear does not converge to 14 digits in one iteration, I am assuming you are using lagged preconditioning and/or a lagged Jacobian?<br>
><br>
> We are using Jacobian-free Newton, so the Jacobian is different from the preconditioning matrix.<br>
><br>
><br>
> What happens if you do no lagging and solve each linear solve with a new LU factorization?<br>
><br>
> We have the following results without using Jacobian-free Newton. Again, superlu_dist produces differences, while MUMPS gives the same results in terms of the residual norms.<br>
><br>
><br>
> Fande,<br>
><br>
><br>
> Superlu_dist run1:<br>
><br>
> 0 Nonlinear |R| = 9.447423e+03<br>
> 0 Linear |R| = 9.447423e+03<br>
> 1 Linear |R| = 1.322285e-11<br>
> 1 Nonlinear |R| = 1.666987e-11<br>
><br>
><br>
> Superlu_dist run2:<br>
><br>
> 0 Nonlinear |R| = 9.447423e+03<br>
> 0 Linear |R| = 9.447423e+03<br>
> 1 Linear |R| = 1.322171e-11<br>
> 1 Nonlinear |R| = 1.666977e-11<br>
><br>
><br>
> Superlu_dist run3:<br>
><br>
> 0 Nonlinear |R| = 9.447423e+03<br>
> 0 Linear |R| = 9.447423e+03<br>
> 1 Linear |R| = 1.321964e-11<br>
> 1 Nonlinear |R| = 1.666959e-11<br>
><br>
><br>
> Superlu_dist run4:<br>
><br>
> 0 Nonlinear |R| = 9.447423e+03<br>
> 0 Linear |R| = 9.447423e+03<br>
> 1 Linear |R| = 1.321978e-11<br>
> 1 Nonlinear |R| = 1.668688e-11<br>
><br>
><br>
> MUMPS run1:<br>
><br>
> 0 Nonlinear |R| = 9.447423e+03<br>
> 0 Linear |R| = 9.447423e+03<br>
> 1 Linear |R| = 1.360637e-11<br>
> 1 Nonlinear |R| = 1.654334e-11<br>
><br>
> MUMPS run 2:<br>
><br>
> 0 Nonlinear |R| = 9.447423e+03<br>
> 0 Linear |R| = 9.447423e+03<br>
> 1 Linear |R| = 1.360637e-11<br>
> 1 Nonlinear |R| = 1.654334e-11<br>
><br>
> MUMPS run 3:<br>
><br>
> 0 Nonlinear |R| = 9.447423e+03<br>
> 0 Linear |R| = 9.447423e+03<br>
> 1 Linear |R| = 1.360637e-11<br>
> 1 Nonlinear |R| = 1.654334e-11<br>
><br>
> MUMPS run4:<br>
><br>
> 0 Nonlinear |R| = 9.447423e+03<br>
> 0 Linear |R| = 9.447423e+03<br>
> 1 Linear |R| = 1.360637e-11<br>
> 1 Nonlinear |R| = 1.654334e-11<br>
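<br>
To put the run-to-run differences above in perspective, here is a small sketch that scales the final linear residuals by the initial residual. The numbers are copied from the logs above; the helper names are just for illustration.<br>
<pre>
```python
# Residual norms copied from the SuperLU_DIST and MUMPS logs quoted above.
initial = 9.447423e+03
superlu_linear = [1.322285e-11, 1.322171e-11, 1.321964e-11, 1.321978e-11]
mumps_linear = [1.360637e-11] * 4


def relative(residuals, r0):
    """Scale each final residual norm by the initial residual norm."""
    return [r / r0 for r in residuals]


def spread(values):
    """Largest minus smallest value across the runs."""
    return max(values) - min(values)


superlu_rel = relative(superlu_linear, initial)
mumps_rel = relative(mumps_linear, initial)

print("SuperLU_DIST relative residuals:", superlu_rel)
print("SuperLU_DIST run-to-run spread :", spread(superlu_rel))
print("MUMPS run-to-run spread        :", spread(mumps_rel))
```
</pre>
In relative terms every run reduces the residual to within a few multiples of double-precision machine epsilon, so the SuperLU_DIST runs differ only well below the level the solve can resolve.<br>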
><br>
><br>
><br>
><br>
><br>
><br>
><br>
><br>
><br>
> Barry<br>
><br>
><br>
> > On Nov 15, 2017, at 4:24 PM, Kong, Fande <<a href="mailto:fande.kong@inl.gov">fande.kong@inl.gov</a>> wrote:<br>
> ><br>
> ><br>
> ><br>
> > On Wed, Nov 15, 2017 at 2:52 PM, Smith, Barry F. <<a href="mailto:bsmith@mcs.anl.gov">bsmith@mcs.anl.gov</a>> wrote:<br>
> ><br>
> ><br>
> > > On Nov 15, 2017, at 3:36 PM, Kong, Fande <<a href="mailto:fande.kong@inl.gov">fande.kong@inl.gov</a>> wrote:<br>
> > ><br>
> > > Hi Barry,<br>
> > ><br>
> > > Thanks for your reply. I was wondering why this happens only when we use superlu_dist. I am trying to understand the algorithm in superlu_dist. If we use ASM or MUMPS, we do not produce these differences.<br>
> > ><br>
> > > The differences actually are NOT meaningless. In fact, we have a real transient application that shows this issue. When we run the simulation with superlu_dist in parallel for thousands of time steps, the final physics solution looks totally different from one run to another. The differences are no longer acceptable. For a steady problem, the difference may be meaningless, but it is significant for the transient problem.<br>
> ><br>
> > I submit that the "physics solution" of all of these runs is equally right and equally wrong. If the solutions are very different due to a small perturbation, then something is wrong with the model or the integrator; I don't think you can blame the linear solver (see below).<br>
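<br>
A toy illustration of the point about small perturbations, not the actual application: the two update rules below are made-up stand-ins. One damps a perturbation of 1e-12 and the other amplifies it until the two trajectories are unrelated.<br>
<pre>
```python
def evolve(step, x0, n_steps):
    """Apply the update x = step(x) n_steps times and return the trajectory."""
    xs = [x0]
    for _ in range(n_steps):
        xs.append(step(xs[-1]))
    return xs


def max_gap(step, x0, eps, n_steps):
    """Largest distance between a trajectory and its perturbed twin."""
    a = evolve(step, x0, n_steps)
    b = evolve(step, x0 + eps, n_steps)
    return max(abs(p - q) for p, q in zip(a, b))


def damped(x):
    """Toy dissipative update: a perturbation shrinks by a factor 0.9 per step."""
    return 0.9 * x + 1.0


def sensitive(x):
    """Toy sensitive update (logistic map with r = 4): perturbations grow."""
    return 4.0 * x * (1.0 - x)


eps = 1e-12  # stands in for a round-off-sized difference between runs
print("damped   :", max_gap(damped, 0.3, eps, 1000))
print("sensitive:", max_gap(sensitive, 0.3, eps, 1000))
```
</pre>
If a time integration behaves like the second rule, any source of round-off-level noise will eventually change the answer, whichever linear solver is used.<br>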
> > ><br>
> > > This makes the solution not reproducible, and we cannot even set a target solution in the test system because the solution differs so much from one run to another. I guess there may be a tiny bug in superlu_dist or in the PETSc interface to superlu_dist.<br>
> ><br>
> > This is possible, but it is also possible that this is due to normal round-off inside SuperLU_DIST.<br>
> ><br>
> > Since you have SuperLU_DIST inside a nonlinear iteration, it shouldn't really matter exactly how well SuperLU_DIST does. The nonlinear iteration essentially does defect correction for you; are you making sure that the nonlinear iteration always works for every time step? For example, confirm that SNESGetConvergedReason() is always positive.<br>
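<br>
A minimal sketch of such a check, assuming a Python driver and a solver object with a getConvergedReason() method as in petsc4py. The FakeSNES class and the helper name are hypothetical and exist only so the snippet runs on its own; in C the corresponding call is SNESGetConvergedReason().<br>
<pre>
```python
def check_snes_converged(snes, step):
    """Raise if the nonlinear solve at this time step did not converge.

    Assumes PETSc's convention: a positive reason means converged,
    a negative reason means diverged, zero means still iterating.
    """
    reason = snes.getConvergedReason()
    if reason <= 0:
        raise RuntimeError(
            f"time step {step}: SNES did not converge (reason {reason})"
        )
    return reason


class FakeSNES:
    """Hypothetical stand-in for a real solver object, for illustration only."""

    def __init__(self, reason):
        self._reason = reason

    def getConvergedReason(self):
        return self._reason


# A converged step passes through; a diverged step stops the run loudly.
print(check_snes_converged(FakeSNES(3), step=0))
try:
    check_snes_converged(FakeSNES(-5), step=1)
except RuntimeError as err:
    print(err)
```
</pre>
Calling a check like this after every time step would show whether the runs that look different also contain a nonlinear solve that failed without being noticed.<br>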
> ><br>
> > Definitely it could be something wrong on my side. But let us focus on the simple question first.<br>
> ><br>
> > To make the discussion a little simpler, let us go back to the simple problem (heat conduction). Now I want to understand why this happens with superlu_dist only. When we use ASM or MUMPS, why do we not see differences from one run to another? I posted the residual histories for MUMPS and ASM. We cannot see any differences in the residual norms when using MUMPS or ASM. Does superlu_dist have higher round-off than other solvers?<br>
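<br>
On the round-off question: different round-off from run to run is not the same as higher round-off. The sketch below only illustrates that floating-point addition depends on the order of the operands. Whether a changing order of operations is what happens inside superlu_dist is an assumption here, not something shown in this thread.<br>
<pre>
```python
from functools import reduce


def sum_in_order(values):
    """Add values strictly left to right, i.e. in one fixed reduction order."""
    return reduce(lambda acc, v: acc + v, values, 0.0)


# Stand-ins for partial results that could arrive from three processes.
contributions = [0.1, 0.2, 0.3]

forward = sum_in_order(contributions)
backward = sum_in_order(list(reversed(contributions)))

print("forward :", repr(forward))
print("backward:", repr(backward))
print("identical:", forward == backward)
```
</pre>
Both sums are equally accurate, but they are not bitwise identical. A solver that always combines contributions in the same order reproduces its residuals exactly; one whose order depends on message timing does not.<br>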
> ><br>
> ><br>
> ><br>
> > MUMPS run1:<br>
> ><br>
> > 0 Nonlinear |R| = 9.447423e+03<br>
> > 0 Linear |R| = 9.447423e+03<br>
> > 1 Linear |R| = 1.013384e-02<br>
> > 2 Linear |R| = 4.020993e-08<br>
> > 1 Nonlinear |R| = 1.404678e-02<br>
> > 0 Linear |R| = 1.404678e-02<br>
> > 1 Linear |R| = 4.836162e-08<br>
> > 2 Linear |R| = 7.055620e-14<br>
> > 2 Nonlinear |R| = 4.836392e-08<br>
> ><br>
> > MUMPS run2:<br>
> ><br>
> > 0 Nonlinear |R| = 9.447423e+03<br>
> > 0 Linear |R| = 9.447423e+03<br>
> > 1 Linear |R| = 1.013384e-02<br>
> > 2 Linear |R| = 4.020993e-08<br>
> > 1 Nonlinear |R| = 1.404678e-02<br>
> > 0 Linear |R| = 1.404678e-02<br>
> > 1 Linear |R| = 4.836162e-08<br>
> > 2 Linear |R| = 7.055620e-14<br>
> > 2 Nonlinear |R| = 4.836392e-08<br>
> ><br>
> > MUMPS run3:<br>
> ><br>
> > 0 Nonlinear |R| = 9.447423e+03<br>
> > 0 Linear |R| = 9.447423e+03<br>
> > 1 Linear |R| = 1.013384e-02<br>
> > 2 Linear |R| = 4.020993e-08<br>
> > 1 Nonlinear |R| = 1.404678e-02<br>
> > 0 Linear |R| = 1.404678e-02<br>
> > 1 Linear |R| = 4.836162e-08<br>
> > 2 Linear |R| = 7.055620e-14<br>
> > 2 Nonlinear |R| = 4.836392e-08<br>
> ><br>
> > MUMPS run4:<br>
> ><br>
> > 0 Nonlinear |R| = 9.447423e+03<br>
> > 0 Linear |R| = 9.447423e+03<br>
> > 1 Linear |R| = 1.013384e-02<br>
> > 2 Linear |R| = 4.020993e-08<br>
> > 1 Nonlinear |R| = 1.404678e-02<br>
> > 0 Linear |R| = 1.404678e-02<br>
> > 1 Linear |R| = 4.836162e-08<br>
> > 2 Linear |R| = 7.055620e-14<br>
> > 2 Nonlinear |R| = 4.836392e-08<br>
> ><br>
> ><br>
> ><br>
> > ASM run1:<br>
> ><br>
> > 0 Nonlinear |R| = 9.447423e+03<br>
> > 0 Linear |R| = 9.447423e+03<br>
> > 1 Linear |R| = 6.189229e+03<br>
> > 2 Linear |R| = 3.252487e+02<br>
> > 3 Linear |R| = 3.485174e+01<br>
> > 4 Linear |R| = 8.600695e+00<br>
> > 5 Linear |R| = 3.333942e+00<br>
> > 6 Linear |R| = 1.706112e+00<br>
> > 7 Linear |R| = 5.047863e-01<br>
> > 8 Linear |R| = 2.337297e-01<br>
> > 9 Linear |R| = 1.071627e-01<br>
> > 10 Linear |R| = 4.692177e-02<br>
> > 11 Linear |R| = 1.340717e-02<br>
> > 12 Linear |R| = 4.753951e-03<br>
> > 1 Nonlinear |R| = 2.320271e-02<br>
> > 0 Linear |R| = 2.320271e-02<br>
> > 1 Linear |R| = 4.367880e-03<br>
> > 2 Linear |R| = 1.407852e-03<br>
> > 3 Linear |R| = 6.036360e-04<br>
> > 4 Linear |R| = 1.867661e-04<br>
> > 5 Linear |R| = 8.760076e-05<br>
> > 6 Linear |R| = 3.260519e-05<br>
> > 7 Linear |R| = 1.435418e-05<br>
> > 8 Linear |R| = 4.532875e-06<br>
> > 9 Linear |R| = 2.439053e-06<br>
> > 10 Linear |R| = 7.998549e-07<br>
> > 11 Linear |R| = 2.428064e-07<br>
> > 12 Linear |R| = 4.766918e-08<br>
> > 13 Linear |R| = 1.713748e-08<br>
> > 2 Nonlinear |R| = 3.671573e-07<br>
> ><br>
> ><br>
> > ASM run2:<br>
> ><br>
> > 0 Nonlinear |R| = 9.447423e+03<br>
> > 0 Linear |R| = 9.447423e+03<br>
> > 1 Linear |R| = 6.189229e+03<br>
> > 2 Linear |R| = 3.252487e+02<br>
> > 3 Linear |R| = 3.485174e+01<br>
> > 4 Linear |R| = 8.600695e+00<br>
> > 5 Linear |R| = 3.333942e+00<br>
> > 6 Linear |R| = 1.706112e+00<br>
> > 7 Linear |R| = 5.047863e-01<br>
> > 8 Linear |R| = 2.337297e-01<br>
> > 9 Linear |R| = 1.071627e-01<br>
> > 10 Linear |R| = 4.692177e-02<br>
> > 11 Linear |R| = 1.340717e-02<br>
> > 12 Linear |R| = 4.753951e-03<br>
> > 1 Nonlinear |R| = 2.320271e-02<br>
> > 0 Linear |R| = 2.320271e-02<br>
> > 1 Linear |R| = 4.367880e-03<br>
> > 2 Linear |R| = 1.407852e-03<br>
> > 3 Linear |R| = 6.036360e-04<br>
> > 4 Linear |R| = 1.867661e-04<br>
> > 5 Linear |R| = 8.760076e-05<br>
> > 6 Linear |R| = 3.260519e-05<br>
> > 7 Linear |R| = 1.435418e-05<br>
> > 8 Linear |R| = 4.532875e-06<br>
> > 9 Linear |R| = 2.439053e-06<br>
> > 10 Linear |R| = 7.998549e-07<br>
> > 11 Linear |R| = 2.428064e-07<br>
> > 12 Linear |R| = 4.766918e-08<br>
> > 13 Linear |R| = 1.713748e-08<br>
> > 2 Nonlinear |R| = 3.671573e-07<br>
> ><br>
> > ASM run3:<br>
> ><br>
> > 0 Nonlinear |R| = 9.447423e+03<br>
> > 0 Linear |R| = 9.447423e+03<br>
> > 1 Linear |R| = 6.189229e+03<br>
> > 2 Linear |R| = 3.252487e+02<br>
> > 3 Linear |R| = 3.485174e+01<br>
> > 4 Linear |R| = 8.600695e+00<br>
> > 5 Linear |R| = 3.333942e+00<br>
> > 6 Linear |R| = 1.706112e+00<br>
> > 7 Linear |R| = 5.047863e-01<br>
> > 8 Linear |R| = 2.337297e-01<br>
> > 9 Linear |R| = 1.071627e-01<br>
> > 10 Linear |R| = 4.692177e-02<br>
> > 11 Linear |R| = 1.340717e-02<br>
> > 12 Linear |R| = 4.753951e-03<br>
> > 1 Nonlinear |R| = 2.320271e-02<br>
> > 0 Linear |R| = 2.320271e-02<br>
> > 1 Linear |R| = 4.367880e-03<br>
> > 2 Linear |R| = 1.407852e-03<br>
> > 3 Linear |R| = 6.036360e-04<br>
> > 4 Linear |R| = 1.867661e-04<br>
> > 5 Linear |R| = 8.760076e-05<br>
> > 6 Linear |R| = 3.260519e-05<br>
> > 7 Linear |R| = 1.435418e-05<br>
> > 8 Linear |R| = 4.532875e-06<br>
> > 9 Linear |R| = 2.439053e-06<br>
> > 10 Linear |R| = 7.998549e-07<br>
> > 11 Linear |R| = 2.428064e-07<br>
> > 12 Linear |R| = 4.766918e-08<br>
> > 13 Linear |R| = 1.713748e-08<br>
> > 2 Nonlinear |R| = 3.671573e-07<br>
> ><br>
> ><br>
> ><br>
> > ASM run4:<br>
> > 0 Nonlinear |R| = 9.447423e+03<br>
> > 0 Linear |R| = 9.447423e+03<br>
> > 1 Linear |R| = 6.189229e+03<br>
> > 2 Linear |R| = 3.252487e+02<br>
> > 3 Linear |R| = 3.485174e+01<br>
> > 4 Linear |R| = 8.600695e+00<br>
> > 5 Linear |R| = 3.333942e+00<br>
> > 6 Linear |R| = 1.706112e+00<br>
> > 7 Linear |R| = 5.047863e-01<br>
> > 8 Linear |R| = 2.337297e-01<br>
> > 9 Linear |R| = 1.071627e-01<br>
> > 10 Linear |R| = 4.692177e-02<br>
> > 11 Linear |R| = 1.340717e-02<br>
> > 12 Linear |R| = 4.753951e-03<br>
> > 1 Nonlinear |R| = 2.320271e-02<br>
> > 0 Linear |R| = 2.320271e-02<br>
> > 1 Linear |R| = 4.367880e-03<br>
> > 2 Linear |R| = 1.407852e-03<br>
> > 3 Linear |R| = 6.036360e-04<br>
> > 4 Linear |R| = 1.867661e-04<br>
> > 5 Linear |R| = 8.760076e-05<br>
> > 6 Linear |R| = 3.260519e-05<br>
> > 7 Linear |R| = 1.435418e-05<br>
> > 8 Linear |R| = 4.532875e-06<br>
> > 9 Linear |R| = 2.439053e-06<br>
> > 10 Linear |R| = 7.998549e-07<br>
> > 11 Linear |R| = 2.428064e-07<br>
> > 12 Linear |R| = 4.766918e-08<br>
> > 13 Linear |R| = 1.713748e-08<br>
> > 2 Nonlinear |R| = 3.671573e-07<br>
> ><br>
> ><br>
> ><br>
> ><br>
> ><br>
> ><br>
> ><br>
> > ><br>
> > ><br>
> > > Fande,<br>
> > ><br>
> > ><br>
> > ><br>
> > ><br>
> > > On Wed, Nov 15, 2017 at 1:59 PM, Smith, Barry F. <<a href="mailto:bsmith@mcs.anl.gov">bsmith@mcs.anl.gov</a>> wrote:<br>
> > ><br>
> > > Meaningless differences<br>
> > ><br>
> > ><br>
> > > > On Nov 15, 2017, at 2:26 PM, Kong, Fande <<a href="mailto:fande.kong@inl.gov">fande.kong@inl.gov</a>> wrote:<br>
> > > ><br>
> > > > Hi,<br>
> > > ><br>
> > > > There is a heat conduction problem. When superlu_dist is used as a preconditioner, we have random results from different runs. Is there a random algorithm in superlu_dist? If we use ASM or MUMPS as the preconditioner, we then don't have this issue.<br>
> > > ><br>
> > > > run 1:<br>
> > > ><br>
> > > > 0 Nonlinear |R| = 9.447423e+03<br>
> > > > 0 Linear |R| = 9.447423e+03<br>
> > > > 1 Linear |R| = 1.013384e-02<br>
> > > > 2 Linear |R| = 4.020995e-08<br>
> > > > 1 Nonlinear |R| = 1.404678e-02<br>
> > > > 0 Linear |R| = 1.404678e-02<br>
> > > > 1 Linear |R| = 5.104757e-08<br>
> > > > 2 Linear |R| = 7.699637e-14<br>
> > > > 2 Nonlinear |R| = 5.106418e-08<br>
> > > ><br>
> > > ><br>
> > > > run 2:<br>
> > > ><br>
> > > > 0 Nonlinear |R| = 9.447423e+03<br>
> > > > 0 Linear |R| = 9.447423e+03<br>
> > > > 1 Linear |R| = 1.013384e-02<br>
> > > > 2 Linear |R| = 4.020995e-08<br>
> > > > 1 Nonlinear |R| = 1.404678e-02<br>
> > > > 0 Linear |R| = 1.404678e-02<br>
> > > > 1 Linear |R| = 5.109913e-08<br>
> > > > 2 Linear |R| = 7.189091e-14<br>
> > > > 2 Nonlinear |R| = 5.111591e-08<br>
> > > ><br>
> > > > run 3:<br>
> > > ><br>
> > > > 0 Nonlinear |R| = 9.447423e+03<br>
> > > > 0 Linear |R| = 9.447423e+03<br>
> > > > 1 Linear |R| = 1.013384e-02<br>
> > > > 2 Linear |R| = 4.020995e-08<br>
> > > > 1 Nonlinear |R| = 1.404678e-02<br>
> > > > 0 Linear |R| = 1.404678e-02<br>
> > > > 1 Linear |R| = 5.104942e-08<br>
> > > > 2 Linear |R| = 7.465572e-14<br>
> > > > 2 Nonlinear |R| = 5.106642e-08<br>
> > > ><br>
> > > > run 4:<br>
> > > ><br>
> > > > 0 Nonlinear |R| = 9.447423e+03<br>
> > > > 0 Linear |R| = 9.447423e+03<br>
> > > > 1 Linear |R| = 1.013384e-02<br>
> > > > 2 Linear |R| = 4.020995e-08<br>
> > > > 1 Nonlinear |R| = 1.404678e-02<br>
> > > > 0 Linear |R| = 1.404678e-02<br>
> > > > 1 Linear |R| = 5.102730e-08<br>
> > > > 2 Linear |R| = 7.132220e-14<br>
> > > > 2 Nonlinear |R| = 5.104442e-08<br>
> > > ><br>
> > > > Solver details:<br>
> > > ><br>
> > > > SNES Object: 8 MPI processes<br>
> > > > type: newtonls<br>
> > > > maximum iterations=15, maximum function evaluations=10000<br>
> > > > tolerances: relative=1e-08, absolute=1e-11, solution=1e-50<br>
> > > > total number of linear solver iterations=4<br>
> > > > total number of function evaluations=7<br>
> > > > norm schedule ALWAYS<br>
> > > > SNESLineSearch Object: 8 MPI processes<br>
> > > > type: basic<br>
> > > > maxstep=1.000000e+08, minlambda=1.000000e-12<br>
> > > > tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08<br>
> > > > maximum iterations=40<br>
> > > > KSP Object: 8 MPI processes<br>
> > > > type: gmres<br>
> > > > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement<br>
> > > > happy breakdown tolerance 1e-30<br>
> > > > maximum iterations=100, initial guess is zero<br>
> > > > tolerances: relative=1e-06, absolute=1e-50, divergence=10000.<br>
> > > > right preconditioning<br>
> > > > using UNPRECONDITIONED norm type for convergence test<br>
> > > > PC Object: 8 MPI processes<br>
> > > > type: lu<br>
> > > > out-of-place factorization<br>
> > > > tolerance for zero pivot 2.22045e-14<br>
> > > > matrix ordering: natural<br>
> > > > factor fill ratio given 0., needed 0.<br>
> > > > Factored matrix follows:<br>
> > > > Mat Object: 8 MPI processes<br>
> > > > type: superlu_dist<br>
> > > > rows=7925, cols=7925<br>
> > > > package used to perform factorization: superlu_dist<br>
> > > > total: nonzeros=0, allocated nonzeros=0<br>
> > > > total number of mallocs used during MatSetValues calls =0<br>
> > > > SuperLU_DIST run parameters:<br>
> > > > Process grid nprow 4 x npcol 2<br>
> > > > Equilibrate matrix TRUE<br>
> > > > Matrix input mode 1<br>
> > > > Replace tiny pivots FALSE<br>
> > > > Use iterative refinement TRUE<br>
> > > > Processors in row 4 col partition 2<br>
> > > > Row permutation LargeDiag<br>
> > > > Column permutation METIS_AT_PLUS_A<br>
> > > > Parallel symbolic factorization FALSE<br>
> > > > Repeated factorization SamePattern<br>
> > > > linear system matrix followed by preconditioner matrix:<br>
> > > > Mat Object: 8 MPI processes<br>
> > > > type: mffd<br>
> > > > rows=7925, cols=7925<br>
> > > > Matrix-free approximation:<br>
> > > > err=1.49012e-08 (relative error in function evaluation)<br>
> > > > Using wp compute h routine<br>
> > > > Does not compute normU<br>
> > > > Mat Object: () 8 MPI processes<br>
> > > > type: mpiaij<br>
> > > > rows=7925, cols=7925<br>
> > > > total: nonzeros=63587, allocated nonzeros=63865<br>
> > > > total number of mallocs used during MatSetValues calls =0<br>
> > > > not using I-node (on process 0) routines<br>
> > > ><br>
> > > ><br>
> > > > Fande,<br>
> > > ><br>
> > > ><br>
> > ><br>
> > ><br>
><br>
><br>
<br>
</div></div></blockquote></div><br></div></div></div>