[petsc-users] Questions about PCMG

Yuqi Wu ywu at culink.Colorado.EDU
Tue Apr 3 18:56:28 CDT 2012


Hi, Mark,

Thank you so much for your suggestion.

Problem 1 is resolved by avoiding the call to PCMGSetNumberSmoothUp.

But since I am using an unstructured grid in my application, I don't use DA or DMMG, so -pc_mg_log didn't give any level information. I tried running my code with -info on 1 processor, and I found some interesting issues.

As I mentioned before, I am solving a nonlinear problem using inexact Newton with line search as the nonlinear solver, fGMRES as the linear solver, and a two-level multigrid-type ASM preconditioner. For example, suppose solving the nonlinear problem takes two SNES iterations. In my understanding, in each SNES iteration we need to set up the ASM preconditioners for the up and down smoothers, and the coarse solver (ASM-preconditioned GMRES) is applied in every KSP iteration. That means every SNES iteration needs 2 LU factorizations to set up the up and down smoothers, and at least one LU factorization for the coarse solver. Since the up and down smoothers are the same, the LU factorization can be reused in the up-smoother stage, so each SNES iteration needs at least 2 LU factorizations: one for the down smoother and one for the coarse solver. If my program converges in 2 SNES iterations, it should therefore need at least 4 LU factorizations, 2 for the down smoother and 2 for the coarse solver.

But the -info output shows that only 3 LU factorizations are carried out in my program: one MatLUFactorSymbolic for the down smoother in the first SNES iteration, one for the coarse solver in the first SNES iteration, and one for the coarse solver in the second SNES iteration. It seems that PETSc didn't call MatLUFactorSymbolic for the down smoother in the second SNES iteration. Do you have any ideas about this issue? It looks as if, in the second SNES iteration, the smoothers still use the LU factorization from the previous SNES iteration. I have enclosed the -info output as an attachment to this email. The -info output file has the following line

[0] PCSetUp_MG(): Using outer operators to define finest grid operator because PCMGGetSmoother(pc,nlevels-1,&ksp);KSPSetOperators(ksp,...); was not called.

It seems that the KSP for the smoother is using the SAME_PRECONDITIONER flag, so that MatLUFactorSymbolic is no longer called for the down smoother in the second SNES iteration.
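
If that is the cause, is the right fix to set the operators on the finest-level smoother explicitly, as the -info message suggests? Below is a rough sketch of what I have in mind; it reuses finepc, kspsmooth, and algebra->J from my setup code below, and I am guessing that SAME_NONZERO_PATTERN is the right MatStructure flag. I imagine it would have to be called after each new Jacobian is assembled.

 ierr = PCMGGetSmoother(finepc,1,&kspsmooth);CHKERRQ(ierr);
 /* pass the current Jacobian as both the operator and the preconditioning
    matrix for the finest-level (level 1) smoother; SAME_NONZERO_PATTERN
    should redo the numeric factorization for each new Jacobian while
    reusing the symbolic factorization */
 ierr = KSPSetOperators(kspsmooth,algebra->J,algebra->J,SAME_NONZERO_PATTERN);CHKERRQ(ierr);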

Can I also ask another question related to the two-level preconditioner? I have also tried to use PCComposite to create a two-level ASM preconditioner instead of using PCMG. Since I am not able to use DA, which approach is better for creating a two-level preconditioner?

If I use the PCMG approach, it seems that the LU factorization in the up smoother can be avoided; only the LU factorization in the down smoother needs to be computed. But suppose I have the following preconditioner created with PCComposite:

 ierr = PCSetType(finepc,PCCOMPOSITE);CHKERRQ(ierr);
 /* smooth down preconditioner */
 ierr = PCCompositeAddPC(finepc,PCASM);CHKERRQ(ierr);
 /* coarse solve */
 ierr = PCCompositeAddPC(finepc,PCSHELL);CHKERRQ(ierr);
 /* smooth up preconditioner */
 ierr = PCCompositeAddPC(finepc,PCASM);CHKERRQ(ierr);

How can I save the LU factorization from the smooth-down stage and reuse it in the smooth-up stage, if I want to use the same preconditioner for both?
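
For reference, this is roughly how I imagine getting at the sub-PCs of the composite in order to configure them (only a sketch; the "down_" and "up_" prefixes are names I made up, and since the two PCASM entries are separate objects I don't see how they would share a factorization):

 PC downpc, uppc;
 /* sub-PC 0 is the smooth-down ASM, sub-PC 2 is the smooth-up ASM */
 ierr = PCCompositeGetPC(finepc,0,&downpc);CHKERRQ(ierr);
 ierr = PCCompositeGetPC(finepc,2,&uppc);CHKERRQ(ierr);
 /* give them separate option prefixes so each can be configured at run time */
 ierr = PCSetOptionsPrefix(downpc,"down_");CHKERRQ(ierr);
 ierr = PCSetOptionsPrefix(uppc,"up_");CHKERRQ(ierr);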

Thank you.


Yuqi 



---- Original message ----
>Date: Tue, 3 Apr 2012 16:18:46 -0400
>From: petsc-users-bounces at mcs.anl.gov (on behalf of "Mark F. Adams" <mark.adams at columbia.edu>)
>Subject: Re: [petsc-users] Questions about PCMG  
>To: PETSc users list <petsc-users at mcs.anl.gov>
>
>
>On Apr 3, 2012, at 3:25 PM, Yuqi Wu wrote:
>
>> Dear All,
>> 
>> I want to create a two-grid preconditioner for the Jacobian solve of a nonlinear problem. I am using inexact Newton as the nonlinear solver and fGMRES as the linear solver. For the preconditioner of the linear solve, I want to create a two-level ASM preconditioner using PCMG.
>> 
>> I use an additive Schwarz (ASM) preconditioned Richardson iteration as the smoother on the fine grid, and ASM-preconditioned GMRES as the coarse solver.
>> 
>> But I have the following questions about setting up the preconditioner:
>> 
>> 1. When I use the command PCMGGetSmoother to set up the fine grid smoother, it is only effective on the down solver but not the up solver.
>
>This should work.  I see that you have PCMGSetNumberSmoothUp in the code below.  By calling an up/down method I think PETSc will decouple the two smoothers, that is, make a copy of the smoother and use one for up and one for down.  So you want to avoid any Up/Down methods or command line parameters.
>
>> Although I want to use the same smoother for the up and down procedures, I need to call PCMGGetSmootherUp and PCMGGetSmootherDown to set up the smoothers separately. Do you have any ideas about this issue?
>> 
>> 2. In my two-level preconditioner, I need three ASM preconditioners: two for the up and down smoothers, and one for the coarse solve. I want to use LU factorization as the subdomain solve for both the up and down smoothers, but I don't want to redo the LU factorization for the up smoother. How can I keep the LU factorization from the down smoother so it can be reused for the up smoother?
>> 
>> 3. I run my code with my two-level preconditioner. The SNES converges in two iterations.
>> 
>>  0 SNES norm 1.014991e+02, 0 KSP its (nan coarse its average), last norm 0.000000e+00.
>>  1 SNES norm 9.925218e-05, 4 KSP its (5.25 coarse its average), last norm 2.268574e-06.
>>  2 SNES norm 1.397282e-09, 5 KSP its (5.20 coarse its average), last norm 1.312605e-12.
>> 
>> In the log summary, I check the number of the factorization in the problem
>> 
>> MatLUFactorSym         4 1.0 1.1232e+00 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+01  2  0  0  0  1   2  0  0  0  1     0
>> MatLUFactorNum         4 1.0 1.3245e+01 2.2 1.22e+09 3.0 0.0e+00 0.0e+00 0.0e+00 28 86  0  0  0  30 86  0  0  0   553
>> 
>> Do you have any idea why the number of MatLUFactorSym is 4? And is there any way to find out how many LU factorizations are done for the coarse solve, and how many are done for the up and down smoothers?
>
>-pc_mg_log will give some level information but I'm not sure if it does setup stuff.  
>
>Running with -info gives verbose output (I would run with one processor) and you can see the sizes of the factorizations, along with fill stats, etc.
>
>Mark
>
>> I believe the number should be six: in each SNES iteration, I need two LU factorizations for the up and down smoothers and one LU factorization for the coarse solve. Because we have two SNES iterations, the number of LU factorizations should be 6 instead of 4.
>> 
>> Thank you so much for your help. Below is the output from SNESView for my program, and the setup of my preconditioner.
>> 
>> Best
>> 
>> Yuqi Wu
>> 
>> 
>> /* SNES view */
>> SNES Object: 8 MPI processes
>>  type: ls
>>    line search variant: SNESLineSearchCubic
>>    alpha=1.000000000000e-04, maxstep=1.000000000000e+08, minlambda=1.000000000000e-12
>>  maximum iterations=10, maximum function evaluations=10000
>>  tolerances: relative=1e-07, absolute=1e-50, solution=1e-08
>>  total number of linear solver iterations=9
>>  total number of function evaluations=3
>>  KSP Object:   8 MPI processes  
>>    type: fgmres
>>      GMRES: restart=100, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
>>      GMRES: happy breakdown tolerance 1e-30
>>    maximum iterations=1000, initial guess is zero
>>    tolerances:  relative=1e-06, absolute=1e-14, divergence=10000
>>    right preconditioning
>>    using UNPRECONDITIONED norm type for convergence test
>>  PC Object:   8 MPI processes  
>>    type: mg
>>      MG: type is MULTIPLICATIVE, levels=2 cycles=v
>>        Cycles per PCApply=1
>>        Using Galerkin computed coarse grid matrices
>>    Coarse grid solver -- level -------------------------------
>>      KSP Object:      (coarse_)       8 MPI processes      
>>        type: gmres
>>          GMRES: restart=100, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
>>          GMRES: happy breakdown tolerance 1e-30
>>        maximum iterations=1000, initial guess is zero
>>        tolerances:  relative=0.001, absolute=1e-50, divergence=10000
>>        right preconditioning
>>        using UNPRECONDITIONED norm type for convergence test
>>      PC Object:      (coarse_)       8 MPI processes      
>>        type: asm
>>          Additive Schwarz: total subdomain blocks = 8, user-defined overlap
>>          Additive Schwarz: restriction/interpolation type - RESTRICT
>>          Local solve is same for all blocks, in the following KSP and PC objects:
>>        KSP Object:        (coarse_sub_)         1 MPI processes        
>>          type: preonly
>>          maximum iterations=10000, initial guess is zero
>>          tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>>          left preconditioning
>>          using NONE norm type for convergence test
>>        PC Object:        (coarse_sub_)         1 MPI processes        
>>          type: lu
>>            LU: out-of-place factorization
>>            tolerance for zero pivot 1e-12
>>            matrix ordering: nd
>>            factor fill ratio given 5, needed 3.73447
>>              Factored matrix follows:
>>                Matrix Object:                 1 MPI processes                
>>                  type: seqaij
>>                  rows=840, cols=840
>>                  package used to perform factorization: petsc
>>                  total: nonzeros=384486, allocated nonzeros=384486
>>                  total number of mallocs used during MatSetValues calls =0
>>                    using I-node routines: found 349 nodes, limit used is 5
>>          linear system matrix = precond matrix:
>>          Matrix Object:           1 MPI processes          
>>            type: seqaij
>>            rows=840, cols=840
>>            total: nonzeros=102956, allocated nonzeros=102956
>>            total number of mallocs used during MatSetValues calls =0
>>              using I-node routines: found 360 nodes, limit used is 5
>>        linear system matrix = precond matrix:
>>        Matrix Object:         8 MPI processes        
>>          type: mpiaij
>>          rows=4186, cols=4186
>>          total: nonzeros=656174, allocated nonzeros=656174
>>          total number of mallocs used during MatSetValues calls =0
>>            using I-node (on process 0) routines: found 315 nodes, limit used is 5
>>    Down solver (pre-smoother) on level 1 -------------------------------
>>      KSP Object:      (mg_levels_1_)       8 MPI processes      
>>        type: richardson
>>          Richardson: damping factor=1
>>        maximum iterations=1, initial guess is zero
>>        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>>        left preconditioning
>>        using PRECONDITIONED norm type for convergence test
>>      PC Object:      (fine_)       8 MPI processes      
>>        type: asm
>>          Additive Schwarz: total subdomain blocks = 8, user-defined overlap
>>          Additive Schwarz: restriction/interpolation type - RESTRICT
>>          Local solve is same for all blocks, in the following KSP and PC objects:
>>        KSP Object:        (fine_sub_)         1 MPI processes        
>>          type: preonly
>>          maximum iterations=10000, initial guess is zero
>>          tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>>          left preconditioning
>>          using NONE norm type for convergence test
>>        PC Object:        (fine_sub_)         1 MPI processes        
>>          type: lu
>>            LU: out-of-place factorization
>>            tolerance for zero pivot 1e-12
>>            matrix ordering: nd
>>            factor fill ratio given 5, needed 5.49559
>>              Factored matrix follows:
>>                Matrix Object:                 1 MPI processes                
>>                  type: seqaij
>>                  rows=2100, cols=2100
>>                  package used to perform factorization: petsc
>>                  total: nonzeros=401008, allocated nonzeros=401008
>>                  total number of mallocs used during MatSetValues calls =0
>>                    using I-node routines: found 1491 nodes, limit used is 5
>>          linear system matrix = precond matrix:
>>          Matrix Object:           1 MPI processes          
>>            type: seqaij
>>            rows=2100, cols=2100
>>            total: nonzeros=72969, allocated nonzeros=72969
>>            total number of mallocs used during MatSetValues calls =0
>>              using I-node routines: found 1532 nodes, limit used is 5
>>        linear system matrix = precond matrix:
>>        Matrix Object:         8 MPI processes        
>>          type: mpiaij
>>          rows=11585, cols=11585
>>          total: nonzeros=458097, allocated nonzeros=3026770
>>          total number of mallocs used during MatSetValues calls =978
>>            using I-node (on process 0) routines: found 1365 nodes, limit used is 5
>>    Up solver (post-smoother) on level 1 -------------------------------
>>      KSP Object:      (mg_levels_1_)       8 MPI processes      
>>        type: richardson
>>          Richardson: damping factor=1
>>        maximum iterations=1
>>        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>>        left preconditioning
>>        using nonzero initial guess
>>        using PRECONDITIONED norm type for convergence test
>>      PC Object:      (fine_)       8 MPI processes      
>>        type: asm
>>          Additive Schwarz: total subdomain blocks = 8, user-defined overlap
>>          Additive Schwarz: restriction/interpolation type - RESTRICT
>>          Local solve is same for all blocks, in the following KSP and PC objects:
>>        KSP Object:        (fine_sub_)         1 MPI processes        
>>          type: preonly
>>          maximum iterations=10000, initial guess is zero
>>          tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>>          left preconditioning
>>          using NONE norm type for convergence test
>>        PC Object:        (fine_sub_)         1 MPI processes        
>>          type: lu
>>            LU: out-of-place factorization
>>            tolerance for zero pivot 1e-12
>>            matrix ordering: nd
>>            factor fill ratio given 5, needed 5.49559
>>              Factored matrix follows:
>>                Matrix Object:                 1 MPI processes                
>>                  type: seqaij
>>                  rows=2100, cols=2100
>>                  package used to perform factorization: petsc
>>                  total: nonzeros=401008, allocated nonzeros=401008
>>                  total number of mallocs used during MatSetValues calls =0
>>                    using I-node routines: found 1491 nodes, limit used is 5
>>          linear system matrix = precond matrix:
>>          Matrix Object:           1 MPI processes          
>>            type: seqaij
>>            rows=2100, cols=2100
>>            total: nonzeros=72969, allocated nonzeros=72969
>>            total number of mallocs used during MatSetValues calls =0
>>              using I-node routines: found 1532 nodes, limit used is 5
>>        linear system matrix = precond matrix:
>>        Matrix Object:         8 MPI processes        
>>          type: mpiaij
>>          rows=11585, cols=11585
>>          total: nonzeros=458097, allocated nonzeros=3026770
>>          total number of mallocs used during MatSetValues calls =978
>>            using I-node (on process 0) routines: found 1365 nodes, limit used is 5
>>    linear system matrix = precond matrix:
>>    Matrix Object:     8 MPI processes    
>>      type: mpiaij
>>      rows=11585, cols=11585
>>      total: nonzeros=458097, allocated nonzeros=3026770
>>      total number of mallocs used during MatSetValues calls =978
>>        using I-node (on process 0) routines: found 1365 nodes, limit used is 5
>> SNES converged: CONVERGED_FNORM_RELATIVE.
>> 
>> //*******************************
>> Below setup of my preconditioner,
>> 
>>  /* set up the MG preconditioner */
>>  ierr = SNESGetKSP(snes,&fineksp);CHKERRQ(ierr);
>>  ierr = KSPGetPC(fineksp,&finepc);CHKERRQ(ierr);
>>  ierr = PCSetType(finepc,PCMG);CHKERRQ(ierr);
>>  ierr = PCMGSetType(finepc,PC_MG_MULTIPLICATIVE);CHKERRQ(ierr);
>>  ierr = PCMGSetLevels(finepc,2,PETSC_NULL);CHKERRQ(ierr);
>>  ierr = PCMGSetCycleType(finepc,PC_MG_CYCLE_V);CHKERRQ(ierr);
>>  ierr = PCMGSetNumberSmoothUp(finepc,1);CHKERRQ(ierr);
>>  ierr = PCMGSetNumberSmoothDown(finepc,1);CHKERRQ(ierr);
>>  ierr = PCMGSetGalerkin(finepc,PETSC_TRUE);CHKERRQ(ierr);
>>  ierr = PCMGSetResidual(finepc,1,PCMGDefaultResidual,algebra->J);CHKERRQ(ierr);
>> 
>>  ierr = PCMGSetInterpolation(finepc,1,ctx->Interp);CHKERRQ(ierr); 
>> 
>>  /* set up the coarse solve */
>>  ierr = PCMGGetCoarseSolve(finepc,&ctx->coarseksp);CHKERRQ(ierr);
>>  ierr = KSPSetOptionsPrefix(ctx->coarseksp,"coarse_");CHKERRQ(ierr);
>>  ierr = KSPSetFromOptions(ctx->coarseksp);CHKERRQ(ierr);
>> 
>>  /* set up the fine grid smoother */
>>  ierr = PCMGGetSmoother(finepc,1,&kspsmooth);CHKERRQ(ierr);
>>  ierr = KSPSetType(kspsmooth, KSPRICHARDSON);CHKERRQ(ierr);
>>  ierr = KSPGetPC(kspsmooth,&asmpc);CHKERRQ(ierr);
>>  ierr = PCSetType(asmpc,PCASM);CHKERRQ(ierr);
>>  ierr = PCASMSetOverlap(asmpc,0);CHKERRQ(ierr);
>>  ierr = PCASMSetLocalSubdomains(asmpc,1,&grid->df_global_asm,PETSC_NULL);CHKERRQ(ierr);
>> 
>> 
>> 
>> 
>> 
>> 
>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: mg_info.txt
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20120403/4aac5bbd/attachment-0001.txt>

