[petsc-users] Non-deterministic results with MUMPS?

Fande Kong fdkong.jd at gmail.com
Wed Mar 14 09:27:26 CDT 2018


We had a similar problem before with superlu_dist, but it happened only
when the number of processor cores was larger than 2.  Direct solvers, in
our experience, often involve more messages (especially non-blocking
communication).  This leads to different operation orders and hence
slightly different results. BUT the differences do not matter as long as
your formulation is stable.
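
To illustrate (a minimal aside using plain IEEE double precision, nothing
MUMPS-specific): floating-point addition is not associative, so a different
reduction order across MPI ranks already changes the last digits of a sum.

    #include <stdio.h>

    int main(void)
    {
      double a = 0.1, b = 0.2, c = 0.3;
      /* The two groupings round differently, which is the same effect a
         changed message/reduction order has on a parallel solve. */
      printf("(a+b)+c = %.17g\n", (a + b) + c);  /* 0.60000000000000009 */
      printf("a+(b+c) = %.17g\n", a + (b + c));  /* 0.59999999999999998 */
      return 0;
    }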

The problem we had before was that the convergence path even changed from
run to run, and it turned out to be because our formulation was unstable.

If you do not trust MUMPS at this point, you could use it as a
preconditioner together with a Krylov method (such as GMRES).  The
preconditioner will not change the converged solution as long as it is not
singular.
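
For example (a sketch based only on the ex10 command line used later in this
thread, not something I have run on your matrix), swapping -ksp_type preonly
for -ksp_type gmres keeps the MUMPS factorization as the preconditioner and
lets GMRES iterate on top of it:

    mpiexec -np 4 ./ex10 -f ./1.mat -rhs ./1.vec -ksp_type gmres -pc_type lu -pc_factor_mat_solver_package mumps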


Fande,

On Wed, Mar 14, 2018 at 5:10 AM, Tim Steinhoff <kandanovian at gmail.com>
wrote:

> I guess that the partitioning is fixed, as two results can also differ
> when I call two successive solves, where the matrix, the RHS vector, and
> everything else are identical.
> In that case the factorization/partitioning is reused by MUMPS and
> only the solve phase is executed twice, which alone leads to
> slightly different results on two processes.
> The MUMPS manual says that there is some kind of dynamic task
> scheduling for the pivoting part, but I could not find any
> information on whether they use something similar in the solve phase.
>
> 2018-03-13 21:59 GMT+01:00 Smith, Barry F. <bsmith at mcs.anl.gov>:
> >
> >
> >> On Mar 13, 2018, at 1:10 PM, Tim Steinhoff <kandanovian at gmail.com>
> wrote:
> >>
> >> Thanks for your fast reply.
> >> I see that I can't expect the same results when changing the number of
> >> processes, but how does MPI change the order of operations when there
> >> are, for example, 2 processes and the partitioning is fixed?
> >
> >     Hmm, how do you know the partitioning is fixed? Is a random number
> > generator used in MUMPS or in one of the partitioning packages it uses?
> >
> >    Barry
> >
> >> With GMRES I could not produce that behavior, no matter how many
> >> processes.
> >>
> >> 2018-03-13 18:17 GMT+01:00 Stefano Zampini <stefano.zampini at gmail.com>:
> >>> This is expected. In parallel, you cannot assume that the order of
> >>> operations is preserved.
> >>>
> >>> On 13 Mar 2018 8:14 PM, "Tim Steinhoff" <kandanovian at gmail.com> wrote:
> >>>>
> >>>> Hi all,
> >>>>
> >>>> I get some randomness when solving certain equation systems with
> MUMPS.
> >>>> When I repeatedly solve the attached equation system by ksp example
> >>>> 10, I get different solution vectors and therefore different residual
> >>>> norms.
> >>>>
> >>>> jac at jac-VirtualBox:~/data/rep/petsc/src/ksp/ksp/examples/tutorials$
> >>>> mpiexec -np 4 ./ex10 -f ./1.mat -rhs ./1.vec -ksp_type preonly
> >>>> -pc_type lu -pc_factor_mat_solver_package mumps
> >>>> Number of iterations =   1
> >>>> Residual norm 4.15502e-12
> >>>> jac at jac-VirtualBox:~/data/rep/petsc/src/ksp/ksp/examples/tutorials$
> >>>> mpiexec -np 4 ./ex10 -f ./1.mat -rhs ./1.vec -ksp_type preonly
> >>>> -pc_type lu -pc_factor_mat_solver_package mumps
> >>>> Number of iterations =   1
> >>>> Residual norm 4.15502e-12
> >>>> jac at jac-VirtualBox:~/data/rep/petsc/src/ksp/ksp/examples/tutorials$
> >>>> mpiexec -np 4 ./ex10 -f ./1.mat -rhs ./1.vec -ksp_type preonly
> >>>> -pc_type lu -pc_factor_mat_solver_package mumps
> >>>> Number of iterations =   1
> >>>> Residual norm 4.17364e-12
> >>>> jac at jac-VirtualBox:~/data/rep/petsc/src/ksp/ksp/examples/tutorials$
> >>>> mpiexec -np 4 ./ex10 -f ./1.mat -rhs ./1.vec -ksp_type preonly
> >>>> -pc_type lu -pc_factor_mat_solver_package mumps
> >>>> Number of iterations =   1
> >>>> Residual norm 4.17364e-12
> >>>> jac at jac-VirtualBox:~/data/rep/petsc/src/ksp/ksp/examples/tutorials$
> >>>> mpiexec -np 4 ./ex10 -f ./1.mat -rhs ./1.vec -ksp_type preonly
> >>>> -pc_type lu -pc_factor_mat_solver_package mumps
> >>>> Number of iterations =   1
> >>>> Residual norm 4.17364e-12
> >>>> jac at jac-VirtualBox:~/data/rep/petsc/src/ksp/ksp/examples/tutorials$
> >>>> mpiexec -np 4 ./ex10 -f ./1.mat -rhs ./1.vec -ksp_type preonly
> >>>> -pc_type lu -pc_factor_mat_solver_package mumps
> >>>> Number of iterations =   1
> >>>> Residual norm 4.15502e-12
> >>>> jac at jac-VirtualBox:~/data/rep/petsc/src/ksp/ksp/examples/tutorials$
> >>>> mpiexec -np 4 ./ex10 -f ./1.mat -rhs ./1.vec -ksp_type preonly
> >>>> -pc_type lu -pc_factor_mat_solver_package mumps
> >>>> Number of iterations =   1
> >>>> Residual norm 4.15502e-12
> >>>> jac at jac-VirtualBox:~/data/rep/petsc/src/ksp/ksp/examples/tutorials$
> >>>> mpiexec -np 4 ./ex10 -f ./1.mat -rhs ./1.vec -ksp_type preonly
> >>>> -pc_type lu -pc_factor_mat_solver_package mumps
> >>>> Number of iterations =   1
> >>>> Residual norm 4.17364e-12
> >>>> jac at jac-VirtualBox:~/data/rep/petsc/src/ksp/ksp/examples/tutorials$
> >>>> mpiexec -np 4 ./ex10 -f ./1.mat -rhs ./1.vec -ksp_type preonly
> >>>> -pc_type lu -pc_factor_mat_solver_package mumps
> >>>> Number of iterations =   1
> >>>> Residual norm 4.15502e-12
> >>>>
> >>>> It seems to depend on the combination of the number of processes and
> >>>> the particular equation system.
> >>>> I used GCC 7.2.0, Intel 16, MUMPS 5.1.1 / 5.1.2 (with & without
> >>>> metis/parmetis), and Open MPI 2.1.2, all with the same result.
> >>>> PETSc configuration is the current maint branch:
> >>>> ./configure --download-mumps --with-debugging=0 --COPTFLAGS="-O3"
> >>>> --CXXOPTFLAGS="-O3" --FOPTFLAGS="-O3" --with-scalapack
> >>>>
> >>>> Using "--download-fblaslapack --download-scalapack" didn't make a
> >>>> difference either.
> >>>> Can anyone reproduce that issue?
> >>>>
> >>>> Thanks and kind regards,
> >>>>
> >>>> Volker
> >
>