[petsc-users] Questions about PCMG
Yuqi Wu
ywu at culink.Colorado.EDU
Tue Apr 3 14:25:36 CDT 2012
Dear All,
I want to create a two-grid preconditioner for the linear Jacobian solves inside a nonlinear problem. I am using inexact Newton as the nonlinear solver and fGMRES as the linear solver. As the preconditioner for the linear solve, I want to build a two-level ASM preconditioner using PCMG.
I use an additive Schwarz (ASM) preconditioned Richardson iteration as the smoother on the fine grid, and ASM-preconditioned GMRES as the coarse solve.
But I have the following questions about setting up the preconditioner:
1. When I use PCMGGetSmoother to set up the fine grid smoother, it only takes effect on the down solver but not on the up solver. Although I want to use the same smoother for the up and down sweeps, I have to call PCMGGetSmootherUp and PCMGGetSmootherDown to set up the smoother separately. Do you have any ideas about this issue?
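Concretely, what I mean is that I end up configuring the two smoothers separately, roughly like the following sketch (kspdown, kspup, pcdown and pcup are just placeholder names here; declarations and the rest of the setup are as in the snippet at the end of this mail):
/* Sketch of the workaround: fetch and configure the down and up smoothers separately */
ierr = PCMGGetSmootherDown(finepc,1,&kspdown);CHKERRQ(ierr);
ierr = KSPSetType(kspdown,KSPRICHARDSON);CHKERRQ(ierr);
ierr = KSPGetPC(kspdown,&pcdown);CHKERRQ(ierr);
ierr = PCSetType(pcdown,PCASM);CHKERRQ(ierr);
ierr = PCMGGetSmootherUp(finepc,1,&kspup);CHKERRQ(ierr);
ierr = KSPSetType(kspup,KSPRICHARDSON);CHKERRQ(ierr);
ierr = KSPGetPC(kspup,&pcup);CHKERRQ(ierr);
ierr = PCSetType(pcup,PCASM);CHKERRQ(ierr);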
2. In my two-level preconditioner I need three ASM preconditioners: two for the up and down smoothers and one for the coarse solve. I want to use LU factorization as the subdomain solve for both the up and down smoothers, but I don't want to redo the LU factorization for the up smoother. How can I keep the LU factorization from the down smoother and reuse it for the up smoother?
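One idea I have considered, though I am not sure it is the intended mechanism, is to attach the same PC object to both smoother KSPs so that the ASM subdomain LU is set up only once. A minimal sketch (again with placeholder variable names, untested):
/* Untested sketch: share one ASM PC between the down and up smoother KSPs,
   so the subdomain LU factorization is performed only once */
ierr = PCMGGetSmootherDown(finepc,1,&kspdown);CHKERRQ(ierr);
ierr = PCMGGetSmootherUp(finepc,1,&kspup);CHKERRQ(ierr);
ierr = KSPGetPC(kspdown,&asmpc);CHKERRQ(ierr);
ierr = KSPSetPC(kspup,asmpc);CHKERRQ(ierr);
Is KSPSetPC appropriate for this, or is there a better way?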
3. I ran my code with my two-level preconditioner, and SNES converges in two iterations:
0 SNES norm 1.014991e+02, 0 KSP its (nan coarse its average), last norm 0.000000e+00.
1 SNES norm 9.925218e-05, 4 KSP its (5.25 coarse its average), last norm 2.268574e-06.
2 SNES norm 1.397282e-09, 5 KSP its (5.20 coarse its average), last norm 1.312605e-12.
In the log summary, I checked the number of factorizations in the run:
MatLUFactorSym 4 1.0 1.1232e+00 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+01 2 0 0 0 1 2 0 0 0 1 0
MatLUFactorNum 4 1.0 1.3245e+01 2.2 1.22e+09 3.0 0.0e+00 0.0e+00 0.0e+00 28 86 0 0 0 30 86 0 0 0 553
Do you have any idea why the MatLUFactorSym count is 4? And is there any way to find out how many LU factorizations are done for the coarse solve and how many for the up and down smoothers? I believe the number should be six: in each SNES iteration I need two LU factorizations for the up and down smoothers and one for the coarse solve, so with two SNES iterations the total number of LU factorizations should be 6 instead of 4.
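The only crude way I could think of to attribute the counts (just an idea, I have not verified it) is to do one extra run in which the coarse subdomain solver is switched to a preconditioner that performs no LU at all, e.g.
/* Hypothetical diagnostic: with a non-factoring coarse subdomain PC, any
   remaining MatLUFactorSym calls in -log_summary come from the smoothers */
ierr = PetscOptionsSetValue("-coarse_sub_pc_type","jacobi");CHKERRQ(ierr);
and then compare the MatLUFactorSym counts of the two runs, but a more direct way of reading this off from the log would be much appreciated.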
Thank you so much for your help. Below is the output of SNESView from my program, followed by the setup of my preconditioner.
Best
Yuqi Wu
/* SNES view */
SNES Object: 8 MPI processes
type: ls
line search variant: SNESLineSearchCubic
alpha=1.000000000000e-04, maxstep=1.000000000000e+08, minlambda=1.000000000000e-12
maximum iterations=10, maximum function evaluations=10000
tolerances: relative=1e-07, absolute=1e-50, solution=1e-08
total number of linear solver iterations=9
total number of function evaluations=3
KSP Object: 8 MPI processes
type: fgmres
GMRES: restart=100, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
GMRES: happy breakdown tolerance 1e-30
maximum iterations=1000, initial guess is zero
tolerances: relative=1e-06, absolute=1e-14, divergence=10000
right preconditioning
using UNPRECONDITIONED norm type for convergence test
PC Object: 8 MPI processes
type: mg
MG: type is MULTIPLICATIVE, levels=2 cycles=v
Cycles per PCApply=1
Using Galerkin computed coarse grid matrices
Coarse grid solver -- level -------------------------------
KSP Object: (coarse_) 8 MPI processes
type: gmres
GMRES: restart=100, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
GMRES: happy breakdown tolerance 1e-30
maximum iterations=1000, initial guess is zero
tolerances: relative=0.001, absolute=1e-50, divergence=10000
right preconditioning
using UNPRECONDITIONED norm type for convergence test
PC Object: (coarse_) 8 MPI processes
type: asm
Additive Schwarz: total subdomain blocks = 8, user-defined overlap
Additive Schwarz: restriction/interpolation type - RESTRICT
Local solve is same for all blocks, in the following KSP and PC objects:
KSP Object: (coarse_sub_) 1 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using NONE norm type for convergence test
PC Object: (coarse_sub_) 1 MPI processes
type: lu
LU: out-of-place factorization
tolerance for zero pivot 1e-12
matrix ordering: nd
factor fill ratio given 5, needed 3.73447
Factored matrix follows:
Matrix Object: 1 MPI processes
type: seqaij
rows=840, cols=840
package used to perform factorization: petsc
total: nonzeros=384486, allocated nonzeros=384486
total number of mallocs used during MatSetValues calls =0
using I-node routines: found 349 nodes, limit used is 5
linear system matrix = precond matrix:
Matrix Object: 1 MPI processes
type: seqaij
rows=840, cols=840
total: nonzeros=102956, allocated nonzeros=102956
total number of mallocs used during MatSetValues calls =0
using I-node routines: found 360 nodes, limit used is 5
linear system matrix = precond matrix:
Matrix Object: 8 MPI processes
type: mpiaij
rows=4186, cols=4186
total: nonzeros=656174, allocated nonzeros=656174
total number of mallocs used during MatSetValues calls =0
using I-node (on process 0) routines: found 315 nodes, limit used is 5
Down solver (pre-smoother) on level 1 -------------------------------
KSP Object: (mg_levels_1_) 8 MPI processes
type: richardson
Richardson: damping factor=1
maximum iterations=1, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object: (fine_) 8 MPI processes
type: asm
Additive Schwarz: total subdomain blocks = 8, user-defined overlap
Additive Schwarz: restriction/interpolation type - RESTRICT
Local solve is same for all blocks, in the following KSP and PC objects:
KSP Object: (fine_sub_) 1 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using NONE norm type for convergence test
PC Object: (fine_sub_) 1 MPI processes
type: lu
LU: out-of-place factorization
tolerance for zero pivot 1e-12
matrix ordering: nd
factor fill ratio given 5, needed 5.49559
Factored matrix follows:
Matrix Object: 1 MPI processes
type: seqaij
rows=2100, cols=2100
package used to perform factorization: petsc
total: nonzeros=401008, allocated nonzeros=401008
total number of mallocs used during MatSetValues calls =0
using I-node routines: found 1491 nodes, limit used is 5
linear system matrix = precond matrix:
Matrix Object: 1 MPI processes
type: seqaij
rows=2100, cols=2100
total: nonzeros=72969, allocated nonzeros=72969
total number of mallocs used during MatSetValues calls =0
using I-node routines: found 1532 nodes, limit used is 5
linear system matrix = precond matrix:
Matrix Object: 8 MPI processes
type: mpiaij
rows=11585, cols=11585
total: nonzeros=458097, allocated nonzeros=3026770
total number of mallocs used during MatSetValues calls =978
using I-node (on process 0) routines: found 1365 nodes, limit used is 5
Up solver (post-smoother) on level 1 -------------------------------
KSP Object: (mg_levels_1_) 8 MPI processes
type: richardson
Richardson: damping factor=1
maximum iterations=1
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using nonzero initial guess
using PRECONDITIONED norm type for convergence test
PC Object: (fine_) 8 MPI processes
type: asm
Additive Schwarz: total subdomain blocks = 8, user-defined overlap
Additive Schwarz: restriction/interpolation type - RESTRICT
Local solve is same for all blocks, in the following KSP and PC objects:
KSP Object: (fine_sub_) 1 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using NONE norm type for convergence test
PC Object: (fine_sub_) 1 MPI processes
type: lu
LU: out-of-place factorization
tolerance for zero pivot 1e-12
matrix ordering: nd
factor fill ratio given 5, needed 5.49559
Factored matrix follows:
Matrix Object: 1 MPI processes
type: seqaij
rows=2100, cols=2100
package used to perform factorization: petsc
total: nonzeros=401008, allocated nonzeros=401008
total number of mallocs used during MatSetValues calls =0
using I-node routines: found 1491 nodes, limit used is 5
linear system matrix = precond matrix:
Matrix Object: 1 MPI processes
type: seqaij
rows=2100, cols=2100
total: nonzeros=72969, allocated nonzeros=72969
total number of mallocs used during MatSetValues calls =0
using I-node routines: found 1532 nodes, limit used is 5
linear system matrix = precond matrix:
Matrix Object: 8 MPI processes
type: mpiaij
rows=11585, cols=11585
total: nonzeros=458097, allocated nonzeros=3026770
total number of mallocs used during MatSetValues calls =978
using I-node (on process 0) routines: found 1365 nodes, limit used is 5
linear system matrix = precond matrix:
Matrix Object: 8 MPI processes
type: mpiaij
rows=11585, cols=11585
total: nonzeros=458097, allocated nonzeros=3026770
total number of mallocs used during MatSetValues calls =978
using I-node (on process 0) routines: found 1365 nodes, limit used is 5
SNES converged: CONVERGED_FNORM_RELATIVE.
//*******************************
Below is the setup of my preconditioner:
/* set up the MG preconditioner */
ierr = SNESGetKSP(snes,&fineksp);CHKERRQ(ierr);
ierr = KSPGetPC(fineksp,&finepc);CHKERRQ(ierr);
ierr = PCSetType(finepc,PCMG);CHKERRQ(ierr);
ierr = PCMGSetType(finepc,PC_MG_MULTIPLICATIVE);CHKERRQ(ierr);
ierr = PCMGSetLevels(finepc,2,PETSC_NULL);CHKERRQ(ierr);
ierr = PCMGSetCycleType(finepc,PC_MG_CYCLE_V);CHKERRQ(ierr);
ierr = PCMGSetNumberSmoothUp(finepc,1);CHKERRQ(ierr);
ierr = PCMGSetNumberSmoothDown(finepc,1);CHKERRQ(ierr);
ierr = PCMGSetGalerkin(finepc,PETSC_TRUE);CHKERRQ(ierr);
ierr = PCMGSetResidual(finepc,1,PCMGDefaultResidual,algebra->J);CHKERRQ(ierr);
ierr = PCMGSetInterpolation(finepc,1,ctx->Interp);CHKERRQ(ierr);
/* set up the coarse solve */
ierr = PCMGGetCoarseSolve(finepc,&ctx->coarseksp);CHKERRQ(ierr);
ierr = KSPSetOptionsPrefix(ctx->coarseksp,"coarse_");CHKERRQ(ierr);
ierr = KSPSetFromOptions(ctx->coarseksp);CHKERRQ(ierr);
/* set up the fine grid smoother */
ierr = PCMGGetSmoother(finepc,1,&kspsmooth);CHKERRQ(ierr);
ierr = KSPSetType(kspsmooth, KSPRICHARDSON);CHKERRQ(ierr);
ierr = KSPGetPC(kspsmooth,&asmpc);CHKERRQ(ierr);
ierr = PCSetType(asmpc,PCASM);CHKERRQ(ierr);
ierr = PCASMSetOverlap(asmpc,0);CHKERRQ(ierr);
ierr = PCASMSetLocalSubdomains(asmpc,1,&grid->df_global_asm,PETSC_NULL);CHKERRQ(ierr);