[petsc-users] problem with nested fieldsplits

Thu Sep 25 13:05:09 CDT 2014

On Mon, Sep 22, 2014 at 9:15 PM, Jean-Arthur Louis Olive <jaolive at mit.edu>
wrote:

>  Hi all,
> I am using PETSc (dev version) to solve the Stokes + temperature
> equations. My DM has fields (vx, vy, p, T).
>

I have finally had time to look at this. I have tried to reproduce this
setup in a PETSc example. Here is SNES ex19:

cd src/snes/examples/tutorials
make ex19
./ex19 -ksp_type fgmres -pc_type fieldsplit -pc_fieldsplit_block_size 4
-pc_fieldsplit_type SCHUR -pc_fieldsplit_0_fields 0,1,2
-pc_fieldsplit_1_fields 3 -fieldsplit_1_pc_type lu -fieldsplit_0_pc_type
fieldsplit -fieldsplit_0_pc_fieldsplit_block_size 3
-fieldsplit_0_pc_fieldsplit_0_fields 0,1
-fieldsplit_0_pc_fieldsplit_1_fields 2 -fieldsplit_0_pc_fieldsplit_type
schur -fieldsplit_0_fieldsplit_0_pc_type lu
-fieldsplit_0_fieldsplit_1_pc_type lu -snes_monitor_short
-ksp_monitor_short -snes_view

lid velocity = 0.0625, prandtl # = 1, grashof # = 1
  0 SNES Function norm 0.239155
    0 KSP Residual norm 0.239155
    1 KSP Residual norm 8.25786e-07
  1 SNES Function norm 6.82106e-05
    0 KSP Residual norm 6.82106e-05
    1 KSP Residual norm 1.478e-11
  2 SNES Function norm 1.533e-11
SNES Object: 1 MPI processes
  type: newtonls
  maximum iterations=50, maximum function evaluations=10000
  tolerances: relative=1e-08, absolute=1e-50, solution=1e-08
  total number of linear solver iterations=2
  total number of function evaluations=3
  SNESLineSearch Object:   1 MPI processes
    type: bt
      interpolation: cubic
      alpha=1.000000e-04
    maxstep=1.000000e+08, minlambda=1.000000e-12
    tolerances: relative=1.000000e-08, absolute=1.000000e-15,
lambda=1.000000e-08
    maximum iterations=40
  KSP Object:   1 MPI processes
    type: fgmres
      GMRES: restart=30, using Classical (unmodified) Gram-Schmidt
Orthogonalization with no iterative refinement
      GMRES: happy breakdown tolerance 1e-30
    maximum iterations=10000, initial guess is zero
    tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
    right preconditioning
    using UNPRECONDITIONED norm type for convergence test
  PC Object:   1 MPI processes
    type: fieldsplit
      FieldSplit with Schur preconditioner, blocksize = 4, factorization
FULL
      Preconditioner for the Schur complement formed from A11
      Split info:
      Split number 0 Fields  0, 1, 2
      Split number 1 Fields  3
      KSP solver for A00 block
        KSP Object:        (fieldsplit_0_)         1 MPI processes
          type: gmres
            GMRES: restart=30, using Classical (unmodified) Gram-Schmidt
Orthogonalization with no iterative refinement
            GMRES: happy breakdown tolerance 1e-30
          maximum iterations=10000, initial guess is zero
          tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
          left preconditioning
          using PRECONDITIONED norm type for convergence test
        PC Object:        (fieldsplit_0_)         1 MPI processes
          type: fieldsplit
            FieldSplit with Schur preconditioner, blocksize = 3,
factorization FULL
            Preconditioner for the Schur complement formed from A11
            Split info:
            Split number 0 Fields  0, 1
            Split number 1 Fields  2
            KSP solver for A00 block
              KSP Object:              (fieldsplit_0_fieldsplit_0_)
      1 MPI processes
                type: gmres
                  GMRES: restart=30, using Classical (unmodified)
Gram-Schmidt Orthogonalization with no iterative refinement
                  GMRES: happy breakdown tolerance 1e-30
                maximum iterations=10000, initial guess is zero
                tolerances:  relative=1e-05, absolute=1e-50,
divergence=10000
                left preconditioning
                using PRECONDITIONED norm type for convergence test
              PC Object:              (fieldsplit_0_fieldsplit_0_)
      1 MPI processes
                type: lu
                  LU: out-of-place factorization
                  tolerance for zero pivot 2.22045e-14
                  matrix ordering: nd
                  factor fill ratio given 5, needed 1.875
                    Factored matrix follows:
                      Mat Object:                       1 MPI processes
                        type: seqaij
                        rows=32, cols=32
                        package used to perform factorization: petsc
                        total: nonzeros=480, allocated nonzeros=480
                        total number of mallocs used during MatSetValues
calls =0
                          using I-node routines: found 12 nodes, limit used
is 5
                linear system matrix = precond matrix:
                Mat Object:                (fieldsplit_0_fieldsplit_0_)
            1 MPI processes
                  type: seqaij
                  rows=32, cols=32
                  total: nonzeros=256, allocated nonzeros=256
                  total number of mallocs used during MatSetValues calls =0
                    using I-node routines: found 16 nodes, limit used is 5
            KSP solver for S = A11 - A10 inv(A00) A01
              KSP Object:              (fieldsplit_0_fieldsplit_1_)
      1 MPI processes
                type: gmres
                  GMRES: restart=30, using Classical (unmodified)
Gram-Schmidt Orthogonalization with no iterative refinement
                  GMRES: happy breakdown tolerance 1e-30
                maximum iterations=10000, initial guess is zero
                tolerances:  relative=1e-05, absolute=1e-50,
divergence=10000
                left preconditioning
                using PRECONDITIONED norm type for convergence test
              PC Object:              (fieldsplit_0_fieldsplit_1_)
      1 MPI processes
                type: lu
                  LU: out-of-place factorization
                  tolerance for zero pivot 2.22045e-14
                  matrix ordering: nd
                  factor fill ratio given 5, needed 1.875
                    Factored matrix follows:
                      Mat Object:                       1 MPI processes
                        type: seqaij
                        rows=16, cols=16
                        package used to perform factorization: petsc
                        total: nonzeros=120, allocated nonzeros=120
                        total number of mallocs used during MatSetValues
calls =0
                          using I-node routines: found 12 nodes, limit used
is 5
                linear system matrix followed by preconditioner matrix:
                Mat Object:                (fieldsplit_0_fieldsplit_1_)
            1 MPI processes
                  type: schurcomplement
                  rows=16, cols=16
                    Schur complement A11 - A10 inv(A00) A01
                    A11
                      Mat Object:
 (fieldsplit_0_fieldsplit_1_)                       1 MPI processes
                        type: seqaij
                        rows=16, cols=16
                        total: nonzeros=64, allocated nonzeros=64
                        total number of mallocs used during MatSetValues
calls =0
                          not using I-node routines
                    A10
                      Mat Object:                       1 MPI processes
                        type: seqaij
                        rows=16, cols=32
                        total: nonzeros=128, allocated nonzeros=128
                        total number of mallocs used during MatSetValues
calls =0
                          not using I-node routines
                    KSP of A00
                      KSP Object:
 (fieldsplit_0_fieldsplit_0_)                       1 MPI processes
                        type: gmres
                          GMRES: restart=30, using Classical (unmodified)
Gram-Schmidt Orthogonalization with no iterative refinement
                          GMRES: happy breakdown tolerance 1e-30
                        maximum iterations=10000, initial guess is zero
                        tolerances:  relative=1e-05, absolute=1e-50,
divergence=10000
                        left preconditioning
                        using PRECONDITIONED norm type for convergence test
                      PC Object:
 (fieldsplit_0_fieldsplit_0_)                       1 MPI processes
                        type: lu
                          LU: out-of-place factorization
                          tolerance for zero pivot 2.22045e-14
                          matrix ordering: nd
                          factor fill ratio given 5, needed 1.875
                            Factored matrix follows:
                              Mat Object:                               1
MPI processes
                                type: seqaij
                                rows=32, cols=32
                                package used to perform factorization: petsc
                                total: nonzeros=480, allocated nonzeros=480
                                total number of mallocs used during
MatSetValues calls =0
                                  using I-node routines: found 12 nodes,
limit used is 5
                        linear system matrix = precond matrix:
                        Mat Object:
 (fieldsplit_0_fieldsplit_0_)                         1 MPI processes
                          type: seqaij
                          rows=32, cols=32
                          total: nonzeros=256, allocated nonzeros=256
                          total number of mallocs used during MatSetValues
calls =0
                            using I-node routines: found 16 nodes, limit
used is 5
                    A01
                      Mat Object:                       1 MPI processes
                        type: seqaij
                        rows=32, cols=16
                        total: nonzeros=128, allocated nonzeros=128
                        total number of mallocs used during MatSetValues
calls =0
                          using I-node routines: found 16 nodes, limit used
is 5
                Mat Object:                (fieldsplit_0_fieldsplit_1_)
            1 MPI processes
                  type: seqaij
                  rows=16, cols=16
                  total: nonzeros=64, allocated nonzeros=64
                  total number of mallocs used during MatSetValues calls =0
                    not using I-node routines
          linear system matrix = precond matrix:
          Mat Object:          (fieldsplit_0_)           1 MPI processes
            type: seqaij
            rows=48, cols=48
            total: nonzeros=576, allocated nonzeros=576
            total number of mallocs used during MatSetValues calls =0
              using I-node routines: found 16 nodes, limit used is 5
      KSP solver for S = A11 - A10 inv(A00) A01
        KSP Object:        (fieldsplit_1_)         1 MPI processes
          type: gmres
            GMRES: restart=30, using Classical (unmodified) Gram-Schmidt
Orthogonalization with no iterative refinement
            GMRES: happy breakdown tolerance 1e-30
          maximum iterations=10000, initial guess is zero
          tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
          left preconditioning
          using PRECONDITIONED norm type for convergence test
        PC Object:        (fieldsplit_1_)         1 MPI processes
          type: lu
            LU: out-of-place factorization
            tolerance for zero pivot 2.22045e-14
            matrix ordering: nd
            factor fill ratio given 5, needed 1.875
              Factored matrix follows:
                Mat Object:                 1 MPI processes
                  type: seqaij
                  rows=16, cols=16
                  package used to perform factorization: petsc
                  total: nonzeros=120, allocated nonzeros=120
                  total number of mallocs used during MatSetValues calls =0
                    using I-node routines: found 12 nodes, limit used is 5
          linear system matrix followed by preconditioner matrix:
          Mat Object:          (fieldsplit_1_)           1 MPI processes
            type: schurcomplement
            rows=16, cols=16
              Schur complement A11 - A10 inv(A00) A01
              A11
                Mat Object:                (fieldsplit_1_)
1 MPI processes
                  type: seqaij
                  rows=16, cols=16
                  total: nonzeros=64, allocated nonzeros=64
                  total number of mallocs used during MatSetValues calls =0
                    not using I-node routines
              A10
                Mat Object:                 1 MPI processes
                  type: seqaij
                  rows=16, cols=48
                  total: nonzeros=192, allocated nonzeros=192
                  total number of mallocs used during MatSetValues calls =0
                    not using I-node routines
              KSP of A00
                KSP Object:                (fieldsplit_0_)
1 MPI processes
                  type: gmres
                    GMRES: restart=30, using Classical (unmodified)
Gram-Schmidt Orthogonalization with no iterative refinement
                    GMRES: happy breakdown tolerance 1e-30
                  maximum iterations=10000, initial guess is zero
                  tolerances:  relative=1e-05, absolute=1e-50,
divergence=10000
                  left preconditioning
                  using PRECONDITIONED norm type for convergence test
                PC Object:                (fieldsplit_0_)                 1
MPI processes
                  type: fieldsplit
                    FieldSplit with Schur preconditioner, blocksize = 3,
factorization FULL
                    Preconditioner for the Schur complement formed from A11
                    Split info:
                    Split number 0 Fields  0, 1
                    Split number 1 Fields  2
                    KSP solver for A00 block
                      KSP Object:
 (fieldsplit_0_fieldsplit_0_)                       1 MPI processes
                        type: gmres
                          GMRES: restart=30, using Classical (unmodified)
Gram-Schmidt Orthogonalization with no iterative refinement
                          GMRES: happy breakdown tolerance 1e-30
                        maximum iterations=10000, initial guess is zero
                        tolerances:  relative=1e-05, absolute=1e-50,
divergence=10000
                        left preconditioning
                        using PRECONDITIONED norm type for convergence test
                      PC Object:
 (fieldsplit_0_fieldsplit_0_)                       1 MPI processes
                        type: lu
                          LU: out-of-place factorization
                          tolerance for zero pivot 2.22045e-14
                          matrix ordering: nd
                          factor fill ratio given 5, needed 1.875
                            Factored matrix follows:
                              Mat Object:                               1
MPI processes
                                type: seqaij
                                rows=32, cols=32
                                package used to perform factorization: petsc
                                total: nonzeros=480, allocated nonzeros=480
                                total number of mallocs used during
MatSetValues calls =0
                                  using I-node routines: found 12 nodes,
limit used is 5
                        linear system matrix = precond matrix:
                        Mat Object:
 (fieldsplit_0_fieldsplit_0_)                         1 MPI processes
                          type: seqaij
                          rows=32, cols=32
                          total: nonzeros=256, allocated nonzeros=256
                          total number of mallocs used during MatSetValues
calls =0
                            using I-node routines: found 16 nodes, limit
used is 5
                    KSP solver for S = A11 - A10 inv(A00) A01
                      KSP Object:
 (fieldsplit_0_fieldsplit_1_)                       1 MPI processes
                        type: gmres
                          GMRES: restart=30, using Classical (unmodified)
Gram-Schmidt Orthogonalization with no iterative refinement
                          GMRES: happy breakdown tolerance 1e-30
                        maximum iterations=10000, initial guess is zero
                        tolerances:  relative=1e-05, absolute=1e-50,
divergence=10000
                        left preconditioning
                        using PRECONDITIONED norm type for convergence test
                      PC Object:
 (fieldsplit_0_fieldsplit_1_)                       1 MPI processes
                        type: lu
                          LU: out-of-place factorization
                          tolerance for zero pivot 2.22045e-14
                          matrix ordering: nd
                          factor fill ratio given 5, needed 1.875
                            Factored matrix follows:
                              Mat Object:                               1
MPI processes
                                type: seqaij
                                rows=16, cols=16
                                package used to perform factorization: petsc
                                total: nonzeros=120, allocated nonzeros=120
                                total number of mallocs used during
MatSetValues calls =0
                                  using I-node routines: found 12 nodes,
limit used is 5
                        linear system matrix followed by preconditioner
matrix:
                        Mat Object:
 (fieldsplit_0_fieldsplit_1_)                         1 MPI processes
                          type: schurcomplement
                          rows=16, cols=16
                            Schur complement A11 - A10 inv(A00) A01
                            A11
                              Mat Object:
 (fieldsplit_0_fieldsplit_1_)                               1 MPI processes
                                type: seqaij
                                rows=16, cols=16
                                total: nonzeros=64, allocated nonzeros=64
                                total number of mallocs used during
MatSetValues calls =0
                                  not using I-node routines
                            A10
                              Mat Object:                               1
MPI processes
                                type: seqaij
                                rows=16, cols=32
                                total: nonzeros=128, allocated nonzeros=128
                                total number of mallocs used during
MatSetValues calls =0
                                  not using I-node routines
                            KSP of A00
                              KSP Object:
 (fieldsplit_0_fieldsplit_0_)                               1 MPI processes
                                type: gmres
                                  GMRES: restart=30, using Classical
(unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
                                  GMRES: happy breakdown tolerance 1e-30
                                maximum iterations=10000, initial guess is
zero
                                tolerances:  relative=1e-05,
absolute=1e-50, divergence=10000
                                left preconditioning
                                using PRECONDITIONED norm type for
convergence test
                              PC Object:
 (fieldsplit_0_fieldsplit_0_)                               1 MPI processes
                                type: lu
                                  LU: out-of-place factorization
                                  tolerance for zero pivot 2.22045e-14
                                  matrix ordering: nd
                                  factor fill ratio given 5, needed 1.875
                                    Factored matrix follows:
                                      Mat Object:
            1 MPI processes
                                        type: seqaij
                                        rows=32, cols=32
                                        package used to perform
factorization: petsc
                                        total: nonzeros=480, allocated
nonzeros=480
                                        total number of mallocs used during
MatSetValues calls =0
                                          using I-node routines: found 12
nodes, limit used is 5
                                linear system matrix = precond matrix:
                                Mat Object:
 (fieldsplit_0_fieldsplit_0_)                                 1 MPI
processes
                                  type: seqaij
                                  rows=32, cols=32
                                  total: nonzeros=256, allocated
nonzeros=256
                                  total number of mallocs used during
MatSetValues calls =0
                                    using I-node routines: found 16 nodes,
limit used is 5
                            A01
                              Mat Object:                               1
MPI processes
                                type: seqaij
                                rows=32, cols=16
                                total: nonzeros=128, allocated nonzeros=128
                                total number of mallocs used during
MatSetValues calls =0
                                  using I-node routines: found 16 nodes,
limit used is 5
                        Mat Object:
 (fieldsplit_0_fieldsplit_1_)                         1 MPI processes
                          type: seqaij
                          rows=16, cols=16
                          total: nonzeros=64, allocated nonzeros=64
                          total number of mallocs used during MatSetValues
calls =0
                            not using I-node routines
                  linear system matrix = precond matrix:
                  Mat Object:                  (fieldsplit_0_)
      1 MPI processes
                    type: seqaij
                    rows=48, cols=48
                    total: nonzeros=576, allocated nonzeros=576
                    total number of mallocs used during MatSetValues calls
=0
                      using I-node routines: found 16 nodes, limit used is 5
              A01
                Mat Object:                 1 MPI processes
                  type: seqaij
                  rows=48, cols=16
                  total: nonzeros=192, allocated nonzeros=192
                  total number of mallocs used during MatSetValues calls =0
                    using I-node routines: found 16 nodes, limit used is 5
          Mat Object:          (fieldsplit_1_)           1 MPI processes
            type: seqaij
            rows=16, cols=16
            total: nonzeros=64, allocated nonzeros=64
            total number of mallocs used during MatSetValues calls =0
              not using I-node routines
    linear system matrix = precond matrix:
    Mat Object:     1 MPI processes
      type: seqaij
      rows=64, cols=64, bs=4
      total: nonzeros=1024, allocated nonzeros=1024
      total number of mallocs used during MatSetValues calls =0
        using I-node routines: found 16 nodes, limit used is 5
Number of SNES iterations = 2

I cannot replicate your failure here. I am running on the 'next' branch
here. What are you using?
Also, are you using the DMDA for data layout?

I would like to use nested fieldsplits to separate the T part from the
> Stokes part, and apply a Schur complement approach to the Stokes block.
> Unfortunately, I keep getting this error message:
> [1]PETSC ERROR: DMCreateFieldDecomposition() line 1274 in
> /home/jolive/petsc/src/dm/interface/dm.c Decomposition defined only after
> DMSetUp
>
>  Here are the command line options I tried:
>
>  -snes_type ksponly \
>  -ksp_type fgmres \
> # define 2 fields: [vx vy p] and [T]
> -pc_type fieldsplit -pc_fieldsplit_0_fields 0,1,2 -pc_fieldsplit_1_fields
> 3 \
> # split [vx vy p] into 2 fields: [vx vy] and [p]
> -fieldsplit_0_pc_type fieldsplit \
>  -pc_fieldsplit_0_fieldsplit_0_fields 0,1 -pc_fieldsplit_0_fieldsplit_1
> _fields 2 \
>

Note that the 2 options above are wrong. It should be
-fieldsplit_0_pc_fieldsplit_0_fields 0,1

  Thanks,

     Matt

> # apply schur complement to [vx vy p]
>  -fieldsplit_0_pc_fieldsplit_type schur \
> -fieldsplit_0_pc_fieldsplit_schur_factorization_type upper \
>
>  # solve everything with lu, just for testing
> -fieldsplit_0_fieldsplit_0_ksp_type preonly \
> -fieldsplit_0_fieldsplit_0_pc_type lu -fieldsplit_0_fieldsplit_0_pc_factor_mat_solver_package
> superlu_dist \
> -fieldsplit_0_fieldsplit_1_ksp_type preonly \
> -fieldsplit_0_fieldsplit_1_pc_type lu -fieldsplit_0_fieldsplit_1_pc_factor_mat_solver_package
> superlu_dist \
> -fieldsplit_1_ksp_type preonly \
> -fieldsplit_1_pc_type lu -fieldsplit_1_pc_factor_mat_solver_package
> superlu_dist \
>
>  Any idea what could be causing this?
> Thanks a lot,
> Arthur
>

-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20140925/3dff5b40/attachment-0001.html>