<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Thu, Sep 4, 2014 at 7:26 AM, Klaij, Christiaan <span dir="ltr"><<a href="mailto:C.Klaij@marin.nl" target="_blank">C.Klaij@marin.nl</a>></span> wrote:<br>


<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">Sorry, here's the ksp_view. I'm expecting<br>


<br>


-fieldsplit_1_inner_ksp_type preonly<br>


<br>


to set the ksp(A00) in the Schur complement only, but it seems to set it in the inv(A00) of the diagonal as well.<br></blockquote><div><br></div><div>I think something is wrong in your example (we strongly advise against using MatNest directly). I cannot reproduce this using SNES ex62:</div>


<div><br></div><div>  ./config/builder2.py check src/snes/examples/tutorials/ex62.c --testnum=36 --args="-fieldsplit_pressure_inner_ksp_type preonly -fieldsplit_pressure_inner_pc_type jacobi"</div><div><br></div>


<div>which translates to</div><div><br></div><div>  ex62 -run_type full -refinement_limit 0.00625 -bc_type dirichlet -interpolate 1 -vel_petscspace_order 2 -pres_petscspace_order 1 -ksp_type fgmres -ksp_gmres_restart 100 -ksp_rtol 1.0e-9 -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_schur_factorization_type full -fieldsplit_pressure_ksp_rtol 1e-10 -fieldsplit_velocity_ksp_type gmres -fieldsplit_velocity_pc_type lu -fieldsplit_pressure_pc_type jacobi -snes_monitor_short -ksp_monitor_short -snes_converged_reason -ksp_converged_reason -snes_view -show_solution 0 -fieldsplit_pressure_inner_ksp_type preonly -fieldsplit_pressure_inner_pc_type jacobi</div>


<div><br></div><div>gives</div><div><br></div><div>  Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1</div><div>SNES Object: 1 MPI processes</div><div>  type: newtonls</div><div>  maximum iterations=50, maximum function evaluations=10000</div>


<div>  tolerances: relative=1e-08, absolute=1e-50, solution=1e-08</div><div>  total number of linear solver iterations=20</div><div>  total number of function evaluations=2</div><div>  SNESLineSearch Object:   1 MPI processes</div>


<div>    type: bt</div><div>      interpolation: cubic</div><div>      alpha=1.000000e-04</div><div>    maxstep=1.000000e+08, minlambda=1.000000e-12</div><div>    tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08</div>


<div>    maximum iterations=40</div><div>  KSP Object:   1 MPI processes</div><div>    type: fgmres</div><div>      GMRES: restart=100, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement</div>


<div>      GMRES: happy breakdown tolerance 1e-30</div><div>    maximum iterations=10000, initial guess is zero</div><div>    tolerances:  relative=1e-09, absolute=1e-50, divergence=10000</div><div>    right preconditioning</div>


<div>    has attached null space</div><div>    using UNPRECONDITIONED norm type for convergence test</div><div>  PC Object:   1 MPI processes</div><div>    type: fieldsplit</div><div>      FieldSplit with Schur preconditioner, factorization FULL</div>


<div>      Preconditioner for the Schur complement formed from A11</div><div>      Split info:</div><div>      Split number 0 Defined by IS</div><div>      Split number 1 Defined by IS</div><div>      KSP solver for A00 block</div>


<div>        KSP Object:        (fieldsplit_velocity_)         1 MPI processes</div><div>          type: gmres</div><div>            GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement</div>


<div>            GMRES: happy breakdown tolerance 1e-30</div><div>          maximum iterations=10000, initial guess is zero</div><div>          tolerances:  relative=1e-05, absolute=1e-50, divergence=10000</div><div>          left preconditioning</div>


<div>          using PRECONDITIONED norm type for convergence test</div><div>        PC Object:        (fieldsplit_velocity_)         1 MPI processes</div><div>          type: lu</div><div>            LU: out-of-place factorization</div>


<div>            tolerance for zero pivot 2.22045e-14</div><div>            matrix ordering: nd</div><div>            factor fill ratio given 5, needed 3.45047</div><div>              Factored matrix follows:</div><div>                Mat Object:                 1 MPI processes</div>


<div>                  type: seqaij</div><div>                  rows=962, cols=962</div><div>                  package used to perform factorization: petsc</div><div>                  total: nonzeros=68692, allocated nonzeros=68692</div>


<div>                  total number of mallocs used during MatSetValues calls =0</div><div>                    using I-node routines: found 456 nodes, limit used is 5</div><div>          linear system matrix = precond matrix:</div>


<div>          Mat Object:          (fieldsplit_velocity_)           1 MPI processes</div><div>            type: seqaij</div><div>            rows=962, cols=962</div><div>            total: nonzeros=19908, allocated nonzeros=19908</div>


<div>            total number of mallocs used during MatSetValues calls =0</div><div>              using I-node routines: found 481 nodes, limit used is 5</div><div>      KSP solver for S = A11 - A10 inv(A00) A01 </div><div>


        KSP Object:        (fieldsplit_pressure_)         1 MPI processes</div><div>          type: gmres</div><div>            GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement</div>


<div>            GMRES: happy breakdown tolerance 1e-30</div><div>          maximum iterations=10000, initial guess is zero</div><div>          tolerances:  relative=1e-10, absolute=1e-50, divergence=10000</div><div>          left preconditioning</div>


<div>          has attached null space</div><div>          using PRECONDITIONED norm type for convergence test</div><div>        PC Object:        (fieldsplit_pressure_)         1 MPI processes</div><div>          type: jacobi</div>


<div>          linear system matrix followed by preconditioner matrix:</div><div>          Mat Object:          (fieldsplit_pressure_)           1 MPI processes</div><div>            type: schurcomplement</div><div>            rows=145, cols=145</div>


<div>              has attached null space</div><div>              Schur complement A11 - A10 inv(A00) A01</div><div>              A11</div><div>                Mat Object:                (fieldsplit_pressure_)                 1 MPI processes</div>


<div>                  type: seqaij</div><div>                  rows=145, cols=145</div><div>                  total: nonzeros=945, allocated nonzeros=945</div><div>                  total number of mallocs used during MatSetValues calls =0</div>


<div>                    has attached null space</div><div>                    not using I-node routines</div><div>              A10</div><div>                Mat Object:                 1 MPI processes</div><div>                  type: seqaij</div>


<div>                  rows=145, cols=962</div><div>                  total: nonzeros=4466, allocated nonzeros=4466</div><div>                  total number of mallocs used during MatSetValues calls =0</div><div>                    not using I-node routines</div>


<div>              KSP of A00</div><div>                KSP Object:                (fieldsplit_pressure_inner_)                 1 MPI processes</div><div>                  type: preonly</div><div>                  maximum iterations=10000, initial guess is zero</div>


<div>                  tolerances:  relative=1e-09, absolute=1e-50, divergence=10000</div><div>                  left preconditioning</div><div>                  using NONE norm type for convergence test</div><div>                PC Object:                (fieldsplit_pressure_inner_)                 1 MPI processes</div>


<div>                  type: jacobi</div><div>                  linear system matrix = precond matrix:</div><div>                  Mat Object:                  (fieldsplit_velocity_)                   1 MPI processes</div>


<div>                    type: seqaij</div><div>                    rows=962, cols=962</div><div>                    total: nonzeros=19908, allocated nonzeros=19908</div><div>                    total number of mallocs used during MatSetValues calls =0</div>


<div>                      using I-node routines: found 481 nodes, limit used is 5</div><div>              A01</div><div>                Mat Object:                 1 MPI processes</div><div>                  type: seqaij</div>


<div>                  rows=962, cols=145</div><div>                  total: nonzeros=4466, allocated nonzeros=4466</div><div>                  total number of mallocs used during MatSetValues calls =0</div><div>                    using I-node routines: found 481 nodes, limit used is 5</div>


<div>          Mat Object:          (fieldsplit_pressure_)           1 MPI processes</div><div>            type: seqaij</div><div>            rows=145, cols=145</div><div>            total: nonzeros=945, allocated nonzeros=945</div>


<div>            total number of mallocs used during MatSetValues calls =0</div><div>              has attached null space</div><div>              not using I-node routines</div><div>    linear system matrix = precond matrix:</div>


<div>    Mat Object:     1 MPI processes</div><div>      type: seqaij</div><div>      rows=1107, cols=1107</div><div>      total: nonzeros=29785, allocated nonzeros=29785</div><div>      total number of mallocs used during MatSetValues calls =0</div>


<div>        has attached null space</div><div>        using I-node routines: found 513 nodes, limit used is 5</div><div><br></div><div>   Matt</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">


Chris<br>


<br>


  0 KSP Residual norm 1.229687498638e+00<br>


    Residual norms for fieldsplit_1_ solve.<br>


    0 KSP Residual norm 7.185799114488e+01<br>


    1 KSP Residual norm 3.873274154012e+01<br>


  1 KSP Residual norm 1.107969383366e+00<br>


KSP Object: 1 MPI processes<br>


  type: fgmres<br>


    GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement<br>


    GMRES: happy breakdown tolerance 1e-30<br>


  maximum iterations=1, initial guess is zero<br>


  tolerances:  relative=1e-05, absolute=1e-50, divergence=10000<br>


  right preconditioning<br>


  using UNPRECONDITIONED norm type for convergence test<br>


PC Object: 1 MPI processes<br>


  type: fieldsplit<br>


    FieldSplit with Schur preconditioner, factorization LOWER<br>


    Preconditioner for the Schur complement formed from Sp, an assembled approximation to S, which uses (the lumped) A00's diagonal's inverse<br>


    Split info:<br>


    Split number 0 Defined by IS<br>


    Split number 1 Defined by IS<br>


    KSP solver for A00 block<br>


      KSP Object:      (fieldsplit_0_)       1 MPI processes<br>


        type: preonly<br>


        maximum iterations=1, initial guess is zero<br>


        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000<br>


        left preconditioning<br>


        using NONE norm type for convergence test<br>


      PC Object:      (fieldsplit_0_)       1 MPI processes<br>


        type: bjacobi<br>


          block Jacobi: number of blocks = 1<br>


          Local solve is same for all blocks, in the following KSP and PC objects:<br>


          KSP Object:          (fieldsplit_0_sub_)           1 MPI processes<br>


            type: preonly<br>


            maximum iterations=10000, initial guess is zero<br>


            tolerances:  relative=1e-05, absolute=1e-50, divergence=10000<br>


            left preconditioning<br>


            using NONE norm type for convergence test<br>


          PC Object:          (fieldsplit_0_sub_)           1 MPI processes<br>


            type: ilu<br>


              ILU: out-of-place factorization<br>


              0 levels of fill<br>


              tolerance for zero pivot 2.22045e-14<br>


              using diagonal shift on blocks to prevent zero pivot [INBLOCKS]<br>


              matrix ordering: natural<br>


              factor fill ratio given 1, needed 1<br>


                Factored matrix follows:<br>


                  Mat Object:                   1 MPI processes<br>


                    type: seqaij<br>


                    rows=48, cols=48<br>


                    package used to perform factorization: petsc<br>


                    total: nonzeros=200, allocated nonzeros=200<br>


                    total number of mallocs used during MatSetValues calls =0<br>


                      not using I-node routines<br>


            linear system matrix = precond matrix:<br>


            Mat Object:            (fieldsplit_0_)             1 MPI processes<br>


              type: seqaij<br>


              rows=48, cols=48<br>


              total: nonzeros=200, allocated nonzeros=240<br>


              total number of mallocs used during MatSetValues calls =0<br>


                not using I-node routines<br>


        linear system matrix = precond matrix:<br>


        Mat Object:        (fieldsplit_0_)         1 MPI processes<br>


          type: mpiaij<br>


          rows=48, cols=48<br>


          total: nonzeros=200, allocated nonzeros=480<br>


          total number of mallocs used during MatSetValues calls =0<br>


            not using I-node (on process 0) routines<br>


    KSP solver for S = A11 - A10 inv(A00) A01<br>


      KSP Object:      (fieldsplit_1_)       1 MPI processes<br>


        type: gmres<br>


          GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement<br>


          GMRES: happy breakdown tolerance 1e-30<br>


        maximum iterations=1, initial guess is zero<br>


        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000<br>


        left preconditioning<br>


        using PRECONDITIONED norm type for convergence test<br>


      PC Object:      (fieldsplit_1_)       1 MPI processes<br>


        type: bjacobi<br>


          block Jacobi: number of blocks = 1<br>


          Local solve is same for all blocks, in the following KSP and PC objects:<br>


          KSP Object:          (fieldsplit_1_sub_)           1 MPI processes<br>


            type: preonly<br>


            maximum iterations=10000, initial guess is zero<br>


            tolerances:  relative=1e-05, absolute=1e-50, divergence=10000<br>


            left preconditioning<br>


            using NONE norm type for convergence test<br>


          PC Object:          (fieldsplit_1_sub_)           1 MPI processes<br>


            type: bjacobi<br>


              block Jacobi: number of blocks = 1<br>


              Local solve is same for all blocks, in the following KSP and PC objects:<br>


              KSP Object:              (fieldsplit_1_sub_sub_)               1 MPI processes<br>


                type: preonly<br>


                maximum iterations=10000, initial guess is zero<br>


                tolerances:  relative=1e-05, absolute=1e-50, divergence=10000<br>


                left preconditioning<br>


                using NONE norm type for convergence test<br>


              PC Object:              (fieldsplit_1_sub_sub_)               1 MPI processes<br>


                type: ilu<br>


                  ILU: out-of-place factorization<br>


                  0 levels of fill<br>


                  tolerance for zero pivot 2.22045e-14<br>


                  using diagonal shift on blocks to prevent zero pivot [INBLOCKS]<br>


                  matrix ordering: natural<br>


                  factor fill ratio given 1, needed 1<br>


                    Factored matrix follows:<br>


                      Mat Object:                       1 MPI processes<br>


                        type: seqaij<br>


                        rows=24, cols=24<br>


                        package used to perform factorization: petsc<br>


                        total: nonzeros=120, allocated nonzeros=120<br>


                        total number of mallocs used during MatSetValues calls =0<br>


                          not using I-node routines<br>


                linear system matrix = precond matrix:<br>


                Mat Object:                 1 MPI processes<br>


                  type: seqaij<br>


                  rows=24, cols=24<br>


                  total: nonzeros=120, allocated nonzeros=120<br>


                  total number of mallocs used during MatSetValues calls =0<br>


                    not using I-node routines<br>


            linear system matrix = precond matrix:<br>


            Mat Object:             1 MPI processes<br>


              type: mpiaij<br>


              rows=24, cols=24<br>


              total: nonzeros=120, allocated nonzeros=120<br>


              total number of mallocs used during MatSetValues calls =0<br>


                not using I-node (on process 0) routines<br>


        linear system matrix followed by preconditioner matrix:<br>


        Mat Object:        (fieldsplit_1_)         1 MPI processes<br>


          type: schurcomplement<br>


          rows=24, cols=24<br>


            Schur complement A11 - A10 inv(A00) A01<br>


            A11<br>


              Mat Object:              (fieldsplit_1_)               1 MPI processes<br>


                type: mpiaij<br>


                rows=24, cols=24<br>


                total: nonzeros=0, allocated nonzeros=0<br>


                total number of mallocs used during MatSetValues calls =0<br>


                  using I-node (on process 0) routines: found 5 nodes, limit used is 5<br>


            A10<br>


              Mat Object:              (a10_)               1 MPI processes<br>


                type: mpiaij<br>


                rows=24, cols=48<br>


                total: nonzeros=96, allocated nonzeros=96<br>


                total number of mallocs used during MatSetValues calls =0<br>


                  not using I-node (on process 0) routines<br>


            KSP of A00<br>


              KSP Object:              (fieldsplit_1_inner_)               1 MPI processes<br>


                type: preonly<br>


                maximum iterations=1, initial guess is zero<br>


                tolerances:  relative=1e-05, absolute=1e-50, divergence=10000<br>


                left preconditioning<br>


                using NONE norm type for convergence test<br>


              PC Object:              (fieldsplit_1_inner_)               1 MPI processes<br>


                type: jacobi<br>


                linear system matrix = precond matrix:<br>


                Mat Object:                (fieldsplit_0_)                 1 MPI processes<br>


                  type: mpiaij<br>


                  rows=48, cols=48<br>


                  total: nonzeros=200, allocated nonzeros=480<br>


                  total number of mallocs used during MatSetValues calls =0<br>


                    not using I-node (on process 0) routines<br>


            A01<br>


              Mat Object:              (a01_)               1 MPI processes<br>


                type: mpiaij<br>


                rows=48, cols=24<br>


                total: nonzeros=96, allocated nonzeros=480<br>


                total number of mallocs used during MatSetValues calls =0<br>


                  not using I-node (on process 0) routines<br>


        Mat Object:         1 MPI processes<br>


          type: mpiaij<br>


          rows=24, cols=24<br>


          total: nonzeros=120, allocated nonzeros=120<br>


          total number of mallocs used during MatSetValues calls =0<br>


            not using I-node (on process 0) routines<br>


  linear system matrix = precond matrix:<br>


  Mat Object:   1 MPI processes<br>


    type: nest<br>


    rows=72, cols=72<br>


      Matrix object:<br>


        type=nest, rows=2, cols=2<br>


        MatNest structure:<br>


        (0,0) : prefix="fieldsplit_0_", type=mpiaij, rows=48, cols=48<br>


        (0,1) : prefix="a01_", type=mpiaij, rows=48, cols=24<br>


        (1,0) : prefix="a10_", type=mpiaij, rows=24, cols=48<br>


        (1,1) : prefix="fieldsplit_1_", type=mpiaij, rows=24, cols=24<br>


<br>


<br>


From: Matthew Knepley <<a href="mailto:knepley@gmail.com">knepley@gmail.com</a>><br>


Sent: Thursday, September 04, 2014 2:20 PM<br>


To: Klaij, Christiaan<br>


Cc: <a href="mailto:petsc-users@mcs.anl.gov">petsc-users@mcs.anl.gov</a><br>


Subject: Re: [petsc-users] fieldsplit_0_ monitor in combination with selfp<br>


 <br>


<br>


<br>


<br>


On Thu, Sep 4, 2014 at 7:06 AM, Klaij, Christiaan  <<a href="mailto:C.Klaij@marin.nl">C.Klaij@marin.nl</a>> wrote:<br>


 I'm playing with the selfp option in fieldsplit using<br>


snes/examples/tutorials/ex70.c. For example:<br>


<br>


mpiexec -n 2 ./ex70 -nx 4 -ny 6 \<br>


-ksp_type fgmres \<br>


-pc_type fieldsplit \<br>


-pc_fieldsplit_type schur \<br>


-pc_fieldsplit_schur_fact_type lower \<br>


-pc_fieldsplit_schur_precondition selfp \<br>


-fieldsplit_1_inner_ksp_type preonly \<br>


-fieldsplit_1_inner_pc_type jacobi \<br>


-fieldsplit_0_ksp_monitor -fieldsplit_0_ksp_max_it 1 \<br>


-fieldsplit_1_ksp_monitor -fieldsplit_1_ksp_max_it 1 \<br>


-ksp_monitor -ksp_max_it 1<br>


<br>


gives the following output<br>


<br>


  0 KSP Residual norm 1.229687498638e+00<br>


    Residual norms for fieldsplit_1_ solve.<br>


    0 KSP Residual norm 2.330138480101e+01<br>


    1 KSP Residual norm 1.609000846751e+01<br>


  1 KSP Residual norm 1.180287268335e+00<br>


<br>


To my surprise, I don't see anything for the fieldsplit_0_ solve,<br>


why?<br>


<br>


<br>


<br>


Always run with -ksp_view for any solver question.<br>


<br>


<br>


  Thanks,<br>


<br>


<br>


    Matt<br>


   Furthermore, if I understand correctly, the above should be<br>


exactly equivalent to<br>


<br>


mpiexec -n 2 ./ex70 -nx 4 -ny 6 \<br>


-ksp_type fgmres \<br>


-pc_type fieldsplit \<br>


-pc_fieldsplit_type schur \<br>


-pc_fieldsplit_schur_fact_type lower \<br>


-user_ksp \<br>


-fieldsplit_0_ksp_monitor -fieldsplit_0_ksp_max_it 1 \<br>


-fieldsplit_1_ksp_monitor -fieldsplit_1_ksp_max_it 1 \<br>


-ksp_monitor -ksp_max_it 1<br>


<br>


  0 KSP Residual norm 1.229687498638e+00<br>


    Residual norms for fieldsplit_0_ solve.<br>


    0 KSP Residual norm 5.486639587672e-01<br>


    1 KSP Residual norm 6.348354253703e-02<br>


    Residual norms for fieldsplit_1_ solve.<br>


    0 KSP Residual norm 2.321938107977e+01<br>


    1 KSP Residual norm 1.605484031258e+01<br>


  1 KSP Residual norm 1.183225251166e+00<br>


<br>


because -user_ksp replaces the Schur complement by the simple<br>


approximation A11 - A10 inv(diag(A00)) A01. Besides the missing<br>


fieldsplit_0_ part, the numbers are pretty close but not exactly<br>


the same. Any explanation?<br>
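<br>
A minimal NumPy sketch (not PETSc code) of the approximation described above, assuming small random blocks with hypothetical sizes n0 and n1. It assembles Sp = A11 - A10 inv(diag(A00)) A01 and checks that applying A11 - A10 ksp(A00) A01 with a single Jacobi application standing in for ksp(A00) gives the same operator. It illustrates only the algebra; it does not reproduce the ex70 matrices or the PETSc solver configuration.<br>

```python
import numpy as np

rng = np.random.default_rng(0)
n0, n1 = 6, 3  # hypothetical sizes of the velocity (0) and pressure (1) blocks

# Hypothetical saddle-point blocks; A00 is shifted to be well conditioned.
A00 = rng.standard_normal((n0, n0)) + 10.0 * np.eye(n0)
A01 = rng.standard_normal((n0, n1))
A10 = A01.T.copy()
A11 = np.zeros((n1, n1))

d_inv = 1.0 / np.diag(A00)  # inverse of diag(A00), stored as a vector

# Assembled approximation Sp = A11 - A10 inv(diag(A00)) A01.
Sp = A11 - A10 @ (d_inv[:, None] * A01)


def apply_schur_jacobi_inner(x):
    """Matrix-free A11 x - A10 ksp(A00) A01 x, with one Jacobi application as ksp(A00)."""
    return A11 @ x - A10 @ (d_inv * (A01 @ x))


# Exact Schur complement S = A11 - A10 inv(A00) A01, for comparison only.
S = A11 - A10 @ np.linalg.solve(A00, A01)

x = rng.standard_normal(n1)
same_operator = np.allclose(apply_schur_jacobi_inner(x), Sp @ x)
gap = np.linalg.norm(S - Sp) / np.linalg.norm(S)

print("matrix-free operator equals assembled Sp:", same_operator)
print("relative gap between exact S and Sp:", gap)
```

In this sketch the matrix-free operator and the assembled Sp agree to rounding error, while the exact Schur complement S differs from Sp because A00 is not diagonal.<br>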


<br>


Chris<br>


<br>


<br>


dr. ir. Christiaan Klaij<br>


CFD Researcher<br>


Research & Development<br>


E <a href="mailto:C.Klaij@marin.nl">C.Klaij@marin.nl</a><br>


T <a href="tel:%2B31%20317%2049%2033%2044" value="+31317493344">+31 317 49 33 44</a><br>


<br>


<br>


MARIN<br>


2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands<br>


T <a href="tel:%2B31%20317%2049%2039%2011" value="+31317493911">+31 317 49 39 11</a>, F <a href="tel:%2B31%20317%2049%2032%2045" value="+31317493245">+31 317 49 32 45</a>, I <a href="http://www.marin.nl" target="_blank">www.marin.nl</a><br>


<br>


<br>


<br>


<br>


<br>


 --<br>


What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.<br>


-- Norbert Wiener      </blockquote></div><br><br clear="all"><div><br></div>


</div></div>