<div dir="ltr">We had the similar problem before with superlu_dist, but it happened only when the number of processor cores is larger than 2. Direct solvers, in our experiences, often involve more messages (especially non-block communication). Then this causes different operation orders, and have different results. BUT, the differences do not matter as long as your formula is stable. <div><br></div><div>The problem we had before was that the convergence path even changed from run to run, and it turned out because our formula was unstable. </div><div><br></div><div>If you do not trust MUMPS at this point, you could use it as a preconditioner together with a Krylove method (such as GMRES). The preconditioner may not affect your solution as long as it is not singular. </div><div><br></div><div><br></div><div>Fande,</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Mar 14, 2018 at 5:10 AM, Tim Steinhoff <span dir="ltr"><<a href="mailto:kandanovian@gmail.com" target="_blank">kandanovian@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">I guess that the partioning is fixed, as two results can also differ<br>
Fande,

On Wed, Mar 14, 2018 at 5:10 AM, Tim Steinhoff <kandanovian@gmail.com> wrote:
I guess that the partitioning is fixed, as two results can also differ
when I call two successive solves, where matrix and rhs vector and
everything is identical.
In that case the factorization/partitioning is reused by MUMPS and
only the solve phase is executed twice, which alone already leads to
slightly different results on two processes.
The MUMPS manual says that there is some kind of dynamic task
scheduling for the pivoting part, but I could not find any
information on whether they use something similar in the solve phase.
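One way to see more of what MUMPS does in each phase would be to rerun the reproducer with -ksp_view, which should print the MUMPS ICNTL/INFOG settings actually used, and with a higher MUMPS print level via ICNTL(4); for example (the process count and print level here are just illustrative choices):

mpiexec -np 2 ./ex10 -f ./1.mat -rhs ./1.vec -ksp_type preonly -pc_type lu \
    -pc_factor_mat_solver_package mumps -ksp_view -mat_mumps_icntl_4 2

Comparing that output between two runs that end with different residual norms should at least show whether the analysis and factorization data are identical and only the solve phase differs.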
<div class="HOEnZb"><div class="h5"><br>
2018-03-13 21:59 GMT+01:00 Smith, Barry F. <<a href="mailto:bsmith@mcs.anl.gov">bsmith@mcs.anl.gov</a>>:<br>
>
>
>> On Mar 13, 2018, at 1:10 PM, Tim Steinhoff <kandanovian@gmail.com> wrote:
>>
>> Thanks for your fast reply.
>> I see that I can't expect the same results when changing the number of
>> processes, but how does MPI change the order of operations when there
>> are, for example, 2 processes and the partitioning is fixed?
>
> Hmm, how do you know the partitioning is fixed? Is there use of a random number generator in MUMPS or in the partitioning packages it uses?
>
> Barry
>
>> With GMRES I could not reproduce that behavior, no matter how many processes.
>>
>> 2018-03-13 18:17 GMT+01:00 Stefano Zampini <stefano.zampini@gmail.com>:
>>> This is expected. In parallel, you cannot assume the order of operations is
>>> preserved.
>>>
>>> On 13 Mar 2018 8:14 PM, "Tim Steinhoff" <kandanovian@gmail.com> wrote:
>>>>
>>>> Hi all,
>>>>
>>>> I get some randomness when solving certain equation systems with MUMPS.
>>>> When I repeatedly solve the attached equation system with KSP example
>>>> 10 (ex10), I get different solution vectors and therefore different
>>>> residual norms.
>>>>
>>>> jac@jac-VirtualBox:~/data/rep/petsc/src/ksp/ksp/examples/tutorials$
>>>> mpiexec -np 4 ./ex10 -f ./1.mat -rhs ./1.vec -ksp_type preonly
>>>> -pc_type lu -pc_factor_mat_solver_package mumps
>>>> Number of iterations = 1
>>>> Residual norm 4.15502e-12
>>>> jac@jac-VirtualBox:~/data/rep/petsc/src/ksp/ksp/examples/tutorials$
>>>> mpiexec -np 4 ./ex10 -f ./1.mat -rhs ./1.vec -ksp_type preonly
>>>> -pc_type lu -pc_factor_mat_solver_package mumps
>>>> Number of iterations = 1
>>>> Residual norm 4.15502e-12
>>>> jac@jac-VirtualBox:~/data/rep/petsc/src/ksp/ksp/examples/tutorials$
>>>> mpiexec -np 4 ./ex10 -f ./1.mat -rhs ./1.vec -ksp_type preonly
>>>> -pc_type lu -pc_factor_mat_solver_package mumps
>>>> Number of iterations = 1
>>>> Residual norm 4.17364e-12
>>>> jac@jac-VirtualBox:~/data/rep/petsc/src/ksp/ksp/examples/tutorials$
>>>> mpiexec -np 4 ./ex10 -f ./1.mat -rhs ./1.vec -ksp_type preonly
>>>> -pc_type lu -pc_factor_mat_solver_package mumps
>>>> Number of iterations = 1
>>>> Residual norm 4.17364e-12
>>>> jac@jac-VirtualBox:~/data/rep/petsc/src/ksp/ksp/examples/tutorials$
>>>> mpiexec -np 4 ./ex10 -f ./1.mat -rhs ./1.vec -ksp_type preonly
>>>> -pc_type lu -pc_factor_mat_solver_package mumps
>>>> Number of iterations = 1
>>>> Residual norm 4.17364e-12
>>>> jac@jac-VirtualBox:~/data/rep/petsc/src/ksp/ksp/examples/tutorials$
>>>> mpiexec -np 4 ./ex10 -f ./1.mat -rhs ./1.vec -ksp_type preonly
>>>> -pc_type lu -pc_factor_mat_solver_package mumps
>>>> Number of iterations = 1
>>>> Residual norm 4.15502e-12
>>>> jac@jac-VirtualBox:~/data/rep/petsc/src/ksp/ksp/examples/tutorials$
>>>> mpiexec -np 4 ./ex10 -f ./1.mat -rhs ./1.vec -ksp_type preonly
>>>> -pc_type lu -pc_factor_mat_solver_package mumps
>>>> Number of iterations = 1
>>>> Residual norm 4.15502e-12
>>>> jac@jac-VirtualBox:~/data/rep/petsc/src/ksp/ksp/examples/tutorials$
>>>> mpiexec -np 4 ./ex10 -f ./1.mat -rhs ./1.vec -ksp_type preonly
>>>> -pc_type lu -pc_factor_mat_solver_package mumps
>>>> Number of iterations = 1
>>>> Residual norm 4.17364e-12
>>>> jac@jac-VirtualBox:~/data/rep/petsc/src/ksp/ksp/examples/tutorials$
>>>> mpiexec -np 4 ./ex10 -f ./1.mat -rhs ./1.vec -ksp_type preonly
>>>> -pc_type lu -pc_factor_mat_solver_package mumps
>>>> Number of iterations = 1
>>>> Residual norm 4.15502e-12
>>>>
>>>> It seems to depend on a combination of the number of processes and
>>>> the equation system.
>>>> I used GCC 7.2.0, Intel 16, MUMPS 5.1.1 / 5.1.2 (with & without
>>>> METIS/ParMETIS), Open MPI 2.1.2, all with the same results.
>>>> PETSc configuration is the current maint branch:
>>>> ./configure --download-mumps --with-debugging=0 --COPTFLAGS="-O3"
>>>> --CXXOPTFLAGS="-O3" --FOPTFLAGS="-O3" --with-scalapack
>>>>
>>>> Using "--download-fblaslapack --download-scalapack" didn't make a
>>>> difference either.
>>>> Can anyone reproduce that issue?
>>>>
>>>> Thanks and kind regards,
>>>>
>>>> Volker