[petsc-users] Question on incomplete factorization level and fill

Barry Smith bsmith at mcs.anl.gov
Wed May 24 15:18:09 CDT 2017


   I don't think this has anything to do with the specific solver. It happens because you are loading both a vector and a matrix from a file: since you pass -matload_block_size 1 and -vecload_block_size 10, the default parallel layouts chosen for the matrix and the vector do not match.

   Remove the -matload_block_size 1 and -vecload_block_size 10 options; they don't mean anything here anyway.
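
   For example, a minimal sketch along the lines of ex10 that loads the matrix first and then gives the right-hand side the same row layout (file names taken from the command line quoted later in this thread):

      Mat            A;
      Vec            b;
      PetscViewer    fd;
      PetscErrorCode ierr;

      /* Load the matrix with the default parallel layout */
      ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD,"a_react_in_2.bin",FILE_MODE_READ,&fd);CHKERRQ(ierr);
      ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr);
      ierr = MatLoad(A,fd);CHKERRQ(ierr);
      ierr = PetscViewerDestroy(&fd);CHKERRQ(ierr);

      /* Take the row layout from the matrix so the vector is guaranteed to match */
      ierr = MatCreateVecs(A,NULL,&b);CHKERRQ(ierr);
      ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD,"b_react_in_2.bin",FILE_MODE_READ,&fd);CHKERRQ(ierr);
      ierr = VecLoad(b,fd);CHKERRQ(ierr);
      ierr = PetscViewerDestroy(&fd);CHKERRQ(ierr);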

   Does this resolve the problem?

   Barry

> On May 24, 2017, at 3:06 PM, Danyang Su <danyang.su at gmail.com> wrote:
> 
> Dear Hong,
> 
> I just tested the same matrix with different numbers of processors. It sometimes fails with "ERROR: Arguments are incompatible" depending on the processor count: it works fine with 4, 8, or 24 processors, but fails with 16 or 48. The error output is attached. I tested this on my local computer with 6 cores / 12 threads. Any suggestions?
> 
> Thanks,
> Danyang
> 
> On 17-05-24 12:28 PM, Danyang Su wrote:
>> Hi Hong,
>> 
>> Awesome. Thanks for testing the case. I will try your options for the code and get back to you later.
>> 
>> Regards,
>> 
>> Danyang
>> 
>> On 17-05-24 12:21 PM, Hong wrote:
>>> Danyang :
>>> I tested your data.
>>> Your matrices encountered zero pivots, e.g.
>>> petsc/src/ksp/ksp/examples/tutorials (master)
>>> $ mpiexec -n 24 ./ex10 -f0 a_react_in_2.bin -rhs b_react_in_2.bin -ksp_monitor -ksp_error_if_not_converged
>>> 
>>> [15]PETSC ERROR: Zero pivot in LU factorization: http://www.mcs.anl.gov/petsc/documentation/faq.html#zeropivot
>>> [15]PETSC ERROR: Zero pivot row 1249 value 2.05808e-14 tolerance 2.22045e-14
>>> ...
>>> 
>>> Adding the option '-sub_pc_factor_shift_type nonzero', I got
>>> mpiexec -n 24 ./ex10 -f0 a_react_in_2.bin -rhs b_react_in_2.bin -ksp_monitor -ksp_error_if_not_converged -sub_pc_factor_shift_type nonzero -mat_view ascii::ascii_info
>>> 
>>> Mat Object: 24 MPI processes
>>>   type: mpiaij
>>>   rows=450000, cols=450000
>>>   total: nonzeros=6991400, allocated nonzeros=6991400
>>>   total number of mallocs used during MatSetValues calls =0
>>>     not using I-node (on process 0) routines
>>>   0 KSP Residual norm 5.849777711755e+01
>>>   1 KSP Residual norm 6.824179430230e-01
>>>   2 KSP Residual norm 3.994483555787e-02
>>>   3 KSP Residual norm 6.085841461433e-03
>>>   4 KSP Residual norm 8.876162583511e-04
>>>   5 KSP Residual norm 9.407780665278e-05
>>> Number of iterations =   5
>>> Residual norm 0.00542891
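>>> 
>>> The same shift can also be set from the code (a minimal sketch; ksp is assumed to be your KSP object and the call goes before KSPSetFromOptions):
>>> 
>>>   /* Put the shift into the options database so the block-Jacobi ILU
>>>      sub-factorizations use a nonzero shift instead of failing on
>>>      (near-)zero pivots */
>>>   ierr = PetscOptionsSetValue(NULL,"-sub_pc_factor_shift_type","nonzero");CHKERRQ(ierr);
>>>   ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);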
>>> 
>>> Hong
>>> Hi Matt,
>>> 
>>> Yes. The matrix is a sparse 450000x450000 matrix. Hypre takes hundreds of iterations, not in all but in most of the timesteps. The matrix is not well conditioned, with nonzero entries ranging from 1.0e-29 to 1.0e2. I also double-checked whether anything is wrong in the parallel version; however, the matrix is the same as in the sequential version except for some relatively small round-off error. Usually, for such ill-conditioned matrices, a direct solver should be faster than an iterative solver, right? But when I use a sequential iterative solver with an ILU preconditioner developed by others almost 20 years ago, the solver converges quickly with an appropriate factorization level. In other words, when I use 24 processors with hypre, the speed is almost the same as the old sequential iterative solver on 1 processor.
>>> 
>>> I use mostly the default configuration for the general case, with pretty good speedup, and I am not sure whether I am missing something for this problem.
>>> 
>>> Thanks,
>>> 
>>> Danyang
>>> 
>>> On 17-05-24 11:12 AM, Matthew Knepley wrote:
>>>> On Wed, May 24, 2017 at 12:50 PM, Danyang Su <danyang.su at gmail.com> wrote:
>>>> Hi Matthew and Barry,
>>>> 
>>>> Thanks for the quick response. 
>>>> I also tried SuperLU and MUMPS; both work, but they are about four times slower than the ILU(dt) preconditioner through hypre with the 24 processors I have tested.
>>>> 
>>>> You mean the total time is 4x? And you are taking hundreds of iterates? That seems hard to believe, unless you are dropping
>>>> a huge number of elements. 
>>>> When I look at the convergence information, the method using ILU(dt) still takes 200 to 3000 linear iterations for each Newton iteration. One reason is that this equation is hard to solve. For the general cases, the same method works very well and gets very good speedup.
>>>> 
>>>> I do not understand what you mean here. 
>>>> I am also not sure whether I am using hypre correctly for this case. Is there any way to check this, or is it possible to increase the factorization level through hypre?
>>>> 
>>>> I don't know.
>>>> 
>>>>   Matt 
>>>> Thanks,
>>>> 
>>>> Danyang
>>>> 
>>>> On 17-05-24 04:59 AM, Matthew Knepley wrote:
>>>>> On Wed, May 24, 2017 at 2:21 AM, Danyang Su <danyang.su at gmail.com> wrote:
>>>>> Dear All,
>>>>> 
>>>>> I use PCFactorSetLevels for ILU and PCFactorSetFill for the other preconditioners in my code to help with problems that are hard to solve with the default options. However, I found that the latter, PCFactorSetFill, does not take effect for my problem. The matrices and rhs, as well as the solutions, are available from the link below. I obtain the solution using the hypre preconditioner, and it takes 7 and 38 iterations for matrix 1 and matrix 2, respectively. However, if I use another preconditioner, the solver simply fails on the first matrix. I have tested this matrix using a native sequential solver (not PETSc) with ILU preconditioning. If I set the incomplete factorization level to 0, that sequential solver takes more than 100 iterations. If I increase the factorization level to 1 or more, it takes only a few iterations. This suggests that the factorization level for these matrices should be increased. However, when I tried that in PETSc, it just does not work.
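>>>>> 
>>>>> A simplified sketch of the kind of setup I mean, assuming the default block-Jacobi/ILU preconditioner in parallel and a KSP object named ksp (the runtime equivalent would be -sub_pc_factor_levels <k>):
>>>>> 
>>>>>   PC       pc, subpc;
>>>>>   KSP      *subksp;
>>>>>   PetscInt nlocal, i;
>>>>> 
>>>>>   ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
>>>>>   ierr = KSPSetUp(ksp);CHKERRQ(ierr);                 /* sub-solvers exist only after setup */
>>>>>   ierr = PCBJacobiGetSubKSP(pc,&nlocal,NULL,&subksp);CHKERRQ(ierr);
>>>>>   for (i=0; i<nlocal; i++) {
>>>>>     ierr = KSPGetPC(subksp[i],&subpc);CHKERRQ(ierr);
>>>>>     ierr = PCFactorSetLevels(subpc,2);CHKERRQ(ierr);  /* e.g. ILU(2) instead of the default ILU(0) */
>>>>>   }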
>>>>> 
>>>>> Matrix and rhs can be obtained from the link below.
>>>>> 
>>>>> https://eilinator.eos.ubc.ca:8443/index.php/s/CalUcq9CMeblk4R
>>>>> 
>>>>> Would anyone help to check if you can make this work by increasing the PC factor level or fill?
>>>>> 
>>>>> We have ILU(k) supported in serial. However, ILU(dt), which takes a drop tolerance, only works through Hypre:
>>>>> 
>>>>>   http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html
>>>>> 
>>>>> I recommend you try SuperLU or MUMPS, which can both be downloaded automatically by configure, and
>>>>> do a full sparse LU.
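>>>>> 
>>>>> A minimal sketch of selecting a full sparse LU through one of those packages, assuming a KSP named ksp and PETSc-3.7-era calls (the runtime equivalent is -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package mumps or superlu_dist):
>>>>> 
>>>>>   PC pc;
>>>>> 
>>>>>   ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
>>>>>   ierr = KSPSetType(ksp,KSPPREONLY);CHKERRQ(ierr);   /* a single application of the direct solve */
>>>>>   ierr = PCSetType(pc,PCLU);CHKERRQ(ierr);           /* full sparse LU factorization */
>>>>>   ierr = PCFactorSetMatSolverPackage(pc,MATSOLVERMUMPS);CHKERRQ(ierr); /* or MATSOLVERSUPERLU_DIST */
>>>>>   ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);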
>>>>> 
>>>>>   Thanks,
>>>>> 
>>>>>     Matt
>>>>>  
>>>>> Thanks and regards,
>>>>> 
>>>>> Danyang
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> -- 
>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
>>>>> -- Norbert Wiener
>>>>> 
>>>>> http://www.caam.rice.edu/~mk51/
>>>> 
>>> 
>>> 
>> 
> 
> <outscreen_p16.txt><outscreen_p48.txt><outscreen_p8.txt>


