[petsc-users] Fieldsplit with sub pc MUMPS in parallel

Dave May dave.mayhem23 at gmail.com
Thu Jan 5 05:58:45 CST 2017


Do you now see identical residual histories for a job using 1 rank and 4
ranks?

If not, I am inclined to believe that the ISs you are defining for the
splits in the parallel case are incorrect. The operator created to
approximate the Schur complement with selfp should not depend on the
number of ranks.
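
One way to sanity-check the index sets before handing them to the fieldsplit preconditioner is sketched below. It is a plain-Python illustration, not PETSc code: the function name check_splits and its arguments are hypothetical, and it assumes that each rank owns a contiguous range of global rows starting at 0 and that each rank's IS should list only rows that rank owns.

```python
def check_splits(ownership, split0, split1):
    """Return a list of problems found in per-rank split index lists.

    ownership: list of (start, end) global row ranges, one per rank
               (assumed contiguous, starting at 0).
    split0, split1: per-rank lists of global indices for each field.
    """
    problems = []
    seen = {}
    for rank, (start, end) in enumerate(ownership):
        for name, split in (("0", split0), ("1", split1)):
            for i in split[rank]:
                # Each rank should only list rows it owns.
                if not start <= i < end:
                    problems.append(
                        f"rank {rank}: index {i} in split {name} is not locally owned")
                # No row may be claimed twice (by another split or rank).
                if i in seen:
                    problems.append(f"index {i} appears more than once")
                seen[i] = name
    # Every global row must belong to exactly one split.
    n_global = ownership[-1][1]
    missing = [i for i in range(n_global) if i not in seen]
    if missing:
        problems.append(f"indices in no split: {missing}")
    return problems


# Hypothetical 8-row problem on 2 ranks.
print(check_splits([(0, 4), (4, 8)], [[0, 1, 2], [4, 5]], [[3], [6, 7]]))
```

An empty list means the splits are disjoint, locally owned, and cover every row.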

Or possibly your problem is horribly ill-conditioned. If it is, then this
could result in slightly different residual histories when using different
numbers of ranks - even if the operators are in fact identical.
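
To compare the residual histories of two runs, something like the following could be used. It is only a sketch: it assumes both jobs were run with -ksp_monitor on the outer solver only, and that the monitor lines look like "  0 KSP Residual norm 1.000000000000e+00". The function names and the tolerance are made up for illustration.

```python
import re

# Assumed shape of an outer -ksp_monitor line: "<it> KSP Residual norm <value>"
MONITOR_LINE = re.compile(r"^\s*(\d+) KSP Residual norm\s+(\S+)")


def residual_history(log_text):
    """Extract the residual norms, in order, from monitor output."""
    history = []
    for line in log_text.splitlines():
        m = MONITOR_LINE.match(line)
        if m:
            history.append(float(m.group(2)))
    return history


def histories_match(a, b, rtol=1e-8):
    """True if both histories have the same length and agree to rtol."""
    if len(a) != len(b):
        return False
    return all(abs(x - y) <= rtol * max(abs(x), abs(y)) for x, y in zip(a, b))


# Made-up monitor output standing in for the 1-rank and 4-rank logs.
run_1 = "  0 KSP Residual norm 1.0e+00\n  1 KSP Residual norm 1.0e-03\n"
run_4 = "  0 KSP Residual norm 1.0e+00\n  1 KSP Residual norm 2.5e-02\n"
print(histories_match(residual_history(run_1), residual_history(run_4)))
```

A loose rtol tolerates round-off differences between rank counts; a mismatch in length or in the leading digits points at different operators or splits.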


Thanks,
  Dave




On Thu, 5 Jan 2017 at 12:14, Karin&NiKo <niko.karin at gmail.com> wrote:

> Dear Barry, dear Dave,
>
> THANK YOU!
> You two pointed out the right problem. With the options you provided
> (-fieldsplit_0_ksp_type gmres -fieldsplit_0_ksp_pc_side right
> -fieldsplit_1_ksp_type gmres -fieldsplit_1_ksp_pc_side right), the solver
> converges in 3 iterations whatever the size of the communicator.
> The whole trick lies in solving the Schur complement accurately, by
> using a Krylov method (and not only preonly) *and* applying the
> preconditioner on the right (so that convergence is evaluated on the
> unpreconditioned residual).
>
> @Barry : the difference you see in the nonzero allocations for the
> different runs is just an artefact : when using more than one proc, we
> slightly over-estimate the number of nonzero terms. If I run the same
> problem with the -info option, I get extra information :
> [2] MatAssemblyEnd_SeqAIJ(): Matrix size: 110 X 110; storage space: 0 unneeded,5048 used
> [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 271 X 271; storage space: 4249 unneeded,26167 used
> [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 307 X 307; storage space: 7988 unneeded,31093 used
> [2] MatAssemblyEnd_SeqAIJ(): Matrix size: 110 X 244; storage space: 0 unneeded,6194 used
> [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 271 X 233; storage space: 823 unneeded,9975 used
> [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 307 X 197; storage space: 823 unneeded,8263 used
> And 5048+26167+31093+6194+9975+8263=86740, which is the number of nonzero
> terms for 1 proc, where the estimate is exact.
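
The sum above can be re-checked mechanically. The sketch below parses the six -info lines quoted in this message and adds up the "used" counts; the function name is hypothetical, and the regular expression assumes the line shape shown above.

```python
import re

# The six MatAssemblyEnd_SeqAIJ lines from the -info output quoted above.
INFO_OUTPUT = """\
[2] MatAssemblyEnd_SeqAIJ(): Matrix size: 110 X 110; storage space: 0 unneeded,5048 used
[1] MatAssemblyEnd_SeqAIJ(): Matrix size: 271 X 271; storage space: 4249 unneeded,26167 used
[0] MatAssemblyEnd_SeqAIJ(): Matrix size: 307 X 307; storage space: 7988 unneeded,31093 used
[2] MatAssemblyEnd_SeqAIJ(): Matrix size: 110 X 244; storage space: 0 unneeded,6194 used
[1] MatAssemblyEnd_SeqAIJ(): Matrix size: 271 X 233; storage space: 823 unneeded,9975 used
[0] MatAssemblyEnd_SeqAIJ(): Matrix size: 307 X 197; storage space: 823 unneeded,8263 used
"""

STORAGE = re.compile(r"storage space:\s*(\d+)\s+unneeded,\s*(\d+)\s+used")


def total_used_nonzeros(info_text):
    """Sum the 'used' counts over all local diagonal and off-diagonal blocks."""
    return sum(int(used) for _unneeded, used in STORAGE.findall(info_text))


print(total_used_nonzeros(INFO_OUTPUT))  # 86740
```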
>
>
> Thank you again!
>
> Best regards,
> Nicolas
>
>
> 2017-01-05 1:36 GMT+01:00 Barry Smith <bsmith at mcs.anl.gov>:
>
>    There is something wrong with your set up.
>
> 1 process
>
>            total: nonzeros=140616, allocated nonzeros=140616
>           total: nonzeros=68940, allocated nonzeros=68940
>                 total: nonzeros=3584, allocated nonzeros=3584
>                 total: nonzeros=1000, allocated nonzeros=1000
>                 total: nonzeros=8400, allocated nonzeros=8400
>
> 2 processes
>
>                 total: nonzeros=146498, allocated nonzeros=146498
>           total: nonzeros=73470, allocated nonzeros=73470
>                 total: nonzeros=3038, allocated nonzeros=3038
>                 total: nonzeros=1110, allocated nonzeros=1110
>                 total: nonzeros=6080, allocated nonzeros=6080
>                         total: nonzeros=146498, allocated nonzeros=146498
>                   total: nonzeros=73470, allocated nonzeros=73470
>                 total: nonzeros=6080, allocated nonzeros=6080
>           total: nonzeros=2846, allocated nonzeros=2846
>     total: nonzeros=86740, allocated nonzeros=94187
>
>   It looks like you are setting up the problem differently in parallel and
> sequentially. If it is supposed to be an identical problem, then the number of
> nonzeros should be the same in at least the first two matrices.
>
> > On Jan 4, 2017, at 3:39 PM, Karin&NiKo <niko.karin at gmail.com> wrote:
> >
> > Dear Petsc team,
> >
> > I am (still) trying to solve Biot's poroelasticity problem :
> >  <image.png>
> >
> > I am using a mixed P2-P1 finite element discretization. The matrix of
> > the discretized system in binary format is attached to this email.
> >
> > I am using the fieldsplit framework to solve the linear system. Since I
> > am facing some troubles, I have decided to go back to simple things. Here
> > are the options I am using :
> >
> > -ksp_rtol 1.0e-5
> > -ksp_type fgmres
> > -pc_type fieldsplit
> > -pc_fieldsplit_schur_factorization_type full
> > -pc_fieldsplit_type schur
> > -pc_fieldsplit_schur_precondition selfp
> > -fieldsplit_0_pc_type lu
> > -fieldsplit_0_pc_factor_mat_solver_package mumps
> > -fieldsplit_0_ksp_type preonly
> > -fieldsplit_0_ksp_converged_reason
> > -fieldsplit_1_pc_type lu
> > -fieldsplit_1_pc_factor_mat_solver_package mumps
> > -fieldsplit_1_ksp_type preonly
> > -fieldsplit_1_ksp_converged_reason
> >
> > On a single proc, everything runs fine : the solver converges in 3
> > iterations, according to the theory (see Run-1-proc.txt [contains
> > -log_view]).
> >
> > On 2 procs, the solver converges in 28 iterations (see Run-2-proc.txt).
> >
> > On 3 procs, the solver converges in 91 iterations (see Run-3-proc.txt).
> >
> > I do not understand this behavior : since MUMPS is a parallel direct
> > solver, shouldn't the solver converge in max 3 iterations whatever the
> > number of procs?
> >
> > Thanks for your precious help,
> > Nicolas
> >
> > <Run-1-proc.txt><Run-2-proc.txt><Run-3-proc.txt><1_Warning.txt>