[petsc-users] PCFactorSetShiftType does not work in code but -pc_factor_shift_type works

Danyang Su danyang.su at gmail.com
Thu May 25 02:26:16 CDT 2017


Dear Hong and Barry,

I have implemented this option in the code, as we also need to read the 
configuration from a file for convenience. When I run the code with 
command-line options it works fine; however, when I run it with the 
configuration file it does not work. The code has two sets of equations, 
flow and reactive, with the prefixes set to "flow_" and "react_". When I 
run the code using

mpiexec -n 4 ../executable -flow_sub_pc_factor_shift_type nonzero 
-react_sub_pc_factor_shift_type nonzero

it works. However, if I run using

mpiexec -n 4 ../executable

and let the executable read the options from the file, it just does not 
work at "call PCFactorSetShiftType(pc_flow, MAT_SHIFT_NONZERO, ierr)" (or 
MAT_SHIFT_NONE, MAT_SHIFT_POSITIVE_DEFINITE, ...). Am I missing something here?
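
One workaround I am considering (just a sketch, nothing tested yet): as far 
as I understand, PCFactorSetShiftType called on the outer block-Jacobi PC 
does not reach the ILU inside each block, which may be why the working 
command-line option carries the "sub_" prefix. Instead of the direct call, 
the same option could be pushed into the options database from the code 
before KSPSetFromOptions is called, e.g.

         ! Sketch only: mimic -flow_sub_pc_factor_shift_type nonzero from
         ! inside the code; the same with the "react_" prefix for the
         ! reactive equations. PETSC_NULL_OPTIONS selects the global options
         ! database in recent PETSc releases; older releases use a different
         ! null object (or omit this argument), so check your version.
         call PetscOptionsSetValue(PETSC_NULL_OPTIONS,                   &
                 '-flow_sub_pc_factor_shift_type', 'nonzero', ierr)
         CHKERRQ(ierr)

and then let KSPSetFromOptions (already called below) pick it up.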

Below is the pseudo code I use for the flow equations; the reactive 
equations are handled similarly.

         call MatCreateAIJ(Petsc_Comm_World,nndof,nndof,nngbldof,        &
                           nngbldof,d_nz,PETSC_NULL_INTEGER,o_nz,        &
                           PETSC_NULL_INTEGER,a_flow,ierr)
         CHKERRQ(ierr)

         call MatSetFromOptions(a_flow,ierr)
         CHKERRQ(ierr)

         call KSPCreate(Petsc_Comm_World, ksp_flow, ierr)
         CHKERRQ(ierr)

         call KSPAppendOptionsPrefix(ksp_flow,"flow_",ierr)
         CHKERRQ(ierr)

         call KSPSetInitialGuessNonzero(ksp_flow,                       &
                 b_initial_guess_nonzero_flow, ierr)
         CHKERRQ(ierr)

         call KSPSetDM(ksp_flow,dmda_flow%da,ierr)
         CHKERRQ(ierr)
         call KSPSetDMActive(ksp_flow,PETSC_FALSE,ierr)
         CHKERRQ(ierr)

         !!!!*********CHECK IF READ OPTION FROM FILE*********!!!!
         if (read_option_from_file) then

           call KSPSetType(ksp_flow, KSPGMRES, ierr)     ! or KSPBCGS, ...
           CHKERRQ(ierr)

           call KSPGetPC(ksp_flow, pc_flow, ierr)
           CHKERRQ(ierr)

           call PCSetType(pc_flow, PCBJACOBI, ierr)      ! or PCILU, PCJACOBI, PCHYPRE, ...
           CHKERRQ(ierr)

           call PCFactorSetShiftType(pc_flow, MAT_SHIFT_NONZERO, ierr)   ! or MAT_SHIFT_NONE, MAT_SHIFT_POSITIVE_DEFINITE, ...
           CHKERRQ(ierr)

         end if

         call PCFactorGetMatSolverPackage(pc_flow,solver_pkg_flow,ierr)
         CHKERRQ(ierr)

         call compute_jacobian(rank,dmda_flow%da,                        &
                               a_flow,a_in,ia_in,ja_in,nngl_in,          &
                               row_idx_l2pg,col_idx_l2pg,                &
                               b_non_interlaced)

         call KSPSetFromOptions(ksp_flow,ierr)
         CHKERRQ(ierr)

         call KSPSetUp(ksp_flow,ierr)
         CHKERRQ(ierr)

         call KSPSetUpOnBlocks(ksp_flow,ierr)
         CHKERRQ(ierr)

         call KSPSolve(ksp_flow,b_flow,x_flow,ierr)
         CHKERRQ(ierr)
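
If the direct call is preferred over the options database, my understanding 
(to be confirmed) is that with PCBJACOBI the shift must be set on the PC of 
each local block, and the blocks only exist once the solver is set up. A 
rough sketch, inserted between the KSPSetUp and KSPSetUpOnBlocks calls 
above; nlocal, first_blk, iblk, subksp and subpc are new variables I would 
have to declare (PetscInt nlocal, first_blk, iblk; KSP subksp(1) with the 
default of one block per process; PC subpc):

         ! Sketch only, not tested: set the factorization shift on the ILU
         ! of each local block of the block-Jacobi preconditioner, before
         ! the block factorizations are computed in KSPSetUpOnBlocks.
         call PCBJacobiGetSubKSP(pc_flow, nlocal, first_blk, subksp, ierr)
         CHKERRQ(ierr)
         do iblk = 1, nlocal
           call KSPGetPC(subksp(iblk), subpc, ierr)
           CHKERRQ(ierr)
           call PCFactorSetShiftType(subpc, MAT_SHIFT_NONZERO, ierr)
           CHKERRQ(ierr)
         end do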


Thanks and Regards,

Danyang

On 17-05-24 06:32 PM, Hong wrote:
> Remove your option '-vecload_block_size 10'.
> Hong
>
> On Wed, May 24, 2017 at 3:06 PM, Danyang Su <danyang.su at gmail.com> wrote:
>
>     Dear Hong,
>
>     I just tested the same matrix with different numbers of processors.
>     It sometimes fails with "ERROR: Arguments are incompatible",
>     depending on the processor count: it works fine with 4, 8, or 24
>     processors, but fails with 16 or 48. The error information is
>     attached. I tested this on my local computer with 6 cores / 12
>     threads. Any suggestions on this?
>
>     Thanks,
>
>     Danyang
>
>
>     On 17-05-24 12:28 PM, Danyang Su wrote:
>>
>>     Hi Hong,
>>
>>     Awesome. Thanks for testing the case. I will try your options for
>>     the code and get back to you later.
>>
>>     Regards,
>>
>>     Danyang
>>
>>
>>     On 17-05-24 12:21 PM, Hong wrote:
>>>     Danyang :
>>>     I tested your data.
>>>     Your matrices encountered zero pivots, e.g.
>>>     petsc/src/ksp/ksp/examples/tutorials (master)
>>>     $ mpiexec -n 24 ./ex10 -f0 a_react_in_2.bin -rhs
>>>     b_react_in_2.bin -ksp_monitor -ksp_error_if_not_converged
>>>
>>>     [15]PETSC ERROR: Zero pivot in LU factorization:
>>>     http://www.mcs.anl.gov/petsc/documentation/faq.html#zeropivot
>>>     [15]PETSC ERROR: Zero pivot row 1249 value 2.05808e-14 tolerance
>>>     2.22045e-14
>>>     ...
>>>
>>>     Adding option '-sub_pc_factor_shift_type nonzero', I got
>>>     mpiexec -n 24 ./ex10 -f0 a_react_in_2.bin -rhs b_react_in_2.bin
>>>     -ksp_monitor -ksp_error_if_not_converged
>>>     -sub_pc_factor_shift_type nonzero -mat_view ascii::ascii_info
>>>
>>>     Mat Object: 24 MPI processes
>>>       type: mpiaij
>>>       rows=450000, cols=450000
>>>       total: nonzeros=6991400, allocated nonzeros=6991400
>>>       total number of mallocs used during MatSetValues calls =0
>>>         not using I-node (on process 0) routines
>>>       0 KSP Residual norm 5.849777711755e+01
>>>       1 KSP Residual norm 6.824179430230e-01
>>>       2 KSP Residual norm 3.994483555787e-02
>>>       3 KSP Residual norm 6.085841461433e-03
>>>       4 KSP Residual norm 8.876162583511e-04
>>>       5 KSP Residual norm 9.407780665278e-05
>>>     Number of iterations =   5
>>>     Residual norm 0.00542891
>>>
>>>     Hong
>>>
>>>         Hi Matt,
>>>
>>>         Yes. The matrix is a 450000x450000 sparse matrix. hypre takes
>>>         hundreds of iterations, not in all but in most of the
>>>         timesteps. The matrix is not well conditioned, with nonzero
>>>         entries ranging from 1.0e-29 to 1.0e2. I also double-checked
>>>         whether there is anything wrong in the parallel version;
>>>         however, the matrix is the same as in the sequential version
>>>         except for some relatively very small round-off error.
>>>         Usually for such ill-conditioned matrices a direct solver
>>>         should be faster than an iterative solver, right? But when I
>>>         use a sequential iterative solver with an ILU preconditioner,
>>>         developed almost 20 years ago by others, the solver converges
>>>         fast with an appropriate factorization level. In other words,
>>>         when I use 24 processors with hypre, the speed is almost the
>>>         same as the old sequential iterative solver using 1
>>>         processor.
>>>
>>>         I use mostly the default configuration for the general case,
>>>         with pretty good speedup, and I am not sure if I am missing
>>>         something for this problem.
>>>
>>>         Thanks,
>>>
>>>         Danyang
>>>
>>>
>>>         On 17-05-24 11:12 AM, Matthew Knepley wrote:
>>>>         On Wed, May 24, 2017 at 12:50 PM, Danyang Su
>>>>         <danyang.su at gmail.com> wrote:
>>>>
>>>>             Hi Matthew and Barry,
>>>>
>>>>             Thanks for the quick response.
>>>>
>>>>             I also tried SuperLU and MUMPS; both work, but they are
>>>>             about four times slower than the ILU(dt) preconditioner
>>>>             through hypre with the 24 processors I have tested.
>>>>
>>>>         You mean the total time is 4x? And you are taking hundreds
>>>>         of iterates? That seems hard to believe, unless you are
>>>>         dropping
>>>>         a huge number of elements.
>>>>
>>>>             When I look into the convergence information, the
>>>>             method using ILU(dt) still takes 200 to 3000 linear
>>>>             iterations for each Newton iteration. One reason is
>>>>             that this equation is hard to solve. For the general
>>>>             cases, the same method works very well and gets very
>>>>             good speedup.
>>>>
>>>>         I do not understand what you mean here.
>>>>
>>>>             I also doubt whether I am using hypre correctly for
>>>>             this case. Is there any way to check this, or is it
>>>>             possible to increase the factorization level through
>>>>             hypre?
>>>>
>>>>         I don't know.
>>>>
>>>>           Matt
>>>>
>>>>             Thanks,
>>>>
>>>>             Danyang
>>>>
>>>>
>>>>             On 17-05-24 04:59 AM, Matthew Knepley wrote:
>>>>>             On Wed, May 24, 2017 at 2:21 AM, Danyang Su
>>>>>             <danyang.su at gmail.com> wrote:
>>>>>
>>>>>                 Dear All,
>>>>>
>>>>>                 I use PCFactorSetLevels for ILU and
>>>>>                 PCFactorSetFill for other preconditioners in my
>>>>>                 code to help solve problems that are hard to solve
>>>>>                 with the default options. However, I found that
>>>>>                 the latter, PCFactorSetFill, does not take effect
>>>>>                 for my problem. The matrices and rhs as well as
>>>>>                 the solutions are available from the link below. I
>>>>>                 obtain the solution using the hypre preconditioner,
>>>>>                 and it takes 7 and 38 iterations for matrix 1 and
>>>>>                 matrix 2, respectively. However, if I use another
>>>>>                 preconditioner, the solver just fails at the first
>>>>>                 matrix. I have tested this matrix using a native
>>>>>                 sequential solver (not PETSc) with ILU
>>>>>                 preconditioning. If I set the incomplete
>>>>>                 factorization level to 0, this sequential solver
>>>>>                 takes more than 100 iterations. If I increase the
>>>>>                 factorization level to 1 or more, it takes only a
>>>>>                 few iterations. This suggests that the PC
>>>>>                 factorization level for these matrices should be
>>>>>                 increased. However, when I tried that in PETSc, it
>>>>>                 just does not work.
>>>>>
>>>>>                 Matrix and rhs can be obtained from the link below.
>>>>>
>>>>>                 https://eilinator.eos.ubc.ca:8443/index.php/s/CalUcq9CMeblk4R
>>>>>
>>>>>                 Would anyone help check whether this can be made
>>>>>                 to work by increasing the PC factorization level
>>>>>                 or fill?
>>>>>
>>>>>
>>>>>             We have ILU(k) supported in serial. However, ILU(dt),
>>>>>             which takes a drop tolerance, only works through Hypre:
>>>>>
>>>>>             http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html
>>>>>
>>>>>             I recommend you try SuperLU or MUMPS, which can both
>>>>>             be downloaded automatically by configure, and
>>>>>             do a full sparse LU.
>>>>>
>>>>>               Thanks,
>>>>>
>>>>>                 Matt
>>>>>
>>>>>                 Thanks and regards,
>>>>>
>>>>>                 Danyang
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>             -- 
>>>>>             What most experimenters take for granted before they
>>>>>             begin their experiments is infinitely more interesting
>>>>>             than any results to which their experiments lead.
>>>>>             -- Norbert Wiener
>>>>>
>>>>>             http://www.caam.rice.edu/~mk51/
>>>>
>>>>
>>>>
>>>>
>>>>         -- 
>>>>         What most experimenters take for granted before they begin
>>>>         their experiments is infinitely more interesting than any
>>>>         results to which their experiments lead.
>>>>         -- Norbert Wiener
>>>>
>>>>         http://www.caam.rice.edu/~mk51/
>>>
>>>
>>
>
>
