[petsc-users] SuperLU convergence problem (More test)

Xiaoye S. Li xsli at lbl.gov
Tue Dec 8 11:08:29 CST 2015


Hi Hong,
Thank you very much for continuing to track down the problem. I apologize
for the slow response... some administrative duties consumed most of my
cycles for a few days.

SamePattern can be used for a sequence of linear systems with the same
sparsity pattern. The sparsity ordering is then done only once, saving some
work.

SamePattern_SameRowPerm is more aggressive: it assumes that both the pattern
and the numerical values are the same (or similar), so the numerical pivoting
(row permutation) is also reused. This option should be used with caution,
because when the values change a lot, the row permutation is no longer good.
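As a rough illustration of that risk, here is a small SciPy sketch. SciPy's splu wraps sequential SuperLU (not SuperLU_DIST), and it reuses the entire factorization rather than just the row permutation, so this only mimics the failure mode; the matrices A1 and A2 are made up for the demo.

```python
# Hypothetical demo: reusing stale factors for a matrix with the same
# sparsity pattern but different values. Not SuperLU_DIST itself;
# scipy.sparse.linalg.splu wraps the sequential SuperLU library.
import numpy as np
from scipy.sparse import csc_matrix
from scipy.sparse.linalg import splu

n = 100
rng = np.random.default_rng(0)
T = (np.diag(np.full(n, 4.0))
     + np.diag(np.full(n - 1, -1.0), 1)
     + np.diag(np.full(n - 1, -1.0), -1))
A1 = csc_matrix(T)
# Same nonzero pattern, but every entry rescaled by a factor in [0.5, 2]:
A2 = csc_matrix(T * rng.uniform(0.5, 2.0, size=(n, n)))

b = np.ones(n)
x_stale = splu(A1).solve(b)   # pretend A1's factorization still applies to A2
x_fresh = splu(A2).solve(b)   # factor A2 from scratch

r_stale = np.linalg.norm(A2 @ x_stale - b)   # large: the "reused" solve is wrong
r_fresh = np.linalg.norm(A2 @ x_fresh - b)   # tiny: fresh factorization is fine
```

In real SuperLU_DIST usage only the row permutation is reused, but the symptom is the same kind of staleness: the solve "succeeds" while the answer drifts away from the true solution.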

In SuperLU_DIST/SRC/pdgssvx.c, the leading comment describes the details.
I am copying the relevant part below.

Thanks,
Sherry


 *   3. The second value of options->Fact assumes that a matrix with the same
 *      sparsity pattern as A has already been factored:
 *
 *      -  options->Fact = SamePattern: A is factored, assuming that it has
 *            the same nonzero pattern as a previously factored matrix. In
 *            this case the algorithm saves time by reusing the previously
 *            computed column permutation vector stored in
 *            ScalePermstruct->perm_c and the "elimination tree" of A
 *            stored in LUstruct->etree
 *
 *      In this case the user must still specify the following options
 *      as before:
 *
 *        o  options->Equil
 *        o  options->RowPerm
 *        o  options->ReplaceTinyPivot
 *
 *      but not options->ColPerm, whose value is ignored. This is because the
 *      previous column permutation from ScalePermstruct->perm_c is used as
 *      input. The user must also supply
 *
 *        o  A, the input matrix
 *        o  ScalePermstruct->perm_c, the column permutation
 *        o  LUstruct->etree, the elimination tree
 *
 *      The outputs returned include
 *
 *        o  A, the input matrix A overwritten by the scaled and permuted
 *              matrix as described above
 *        o  ScalePermstruct, modified to describe how the input matrix A was
 *                            equilibrated and row permuted
 *        o  LUstruct, modified to contain the new L and U factors
 *
 *   4. The third value of options->Fact assumes that a matrix B with the same
 *      sparsity pattern as A has already been factored, and where the
 *      row permutation of B can be reused for A. This is useful when A and B
 *      have similar numerical values, so that the same row permutation
 *      will make both factorizations numerically stable. This lets us reuse
 *      all of the previously computed structure of L and U.
 *
 *      -  options->Fact = SamePattern_SameRowPerm: A is factored,
 *            assuming not only the same nonzero pattern as the previously
 *            factored matrix B, but reusing B's row permutation.
 *
 *      In this case the user must still specify the following options
 *      as before:
 *
 *        o  options->Equil
 *        o  options->ReplaceTinyPivot
 *
 *      but not options->RowPerm or options->ColPerm, whose values are
 *      ignored. This is because the permutations from ScalePermstruct->perm_r
 *      and ScalePermstruct->perm_c are used as input.
 *
 *      The user must also supply
 *
 *        o  A, the input matrix
 *        o  ScalePermstruct->DiagScale, how the previous matrix was row
 *                                       and/or column scaled
 *        o  ScalePermstruct->R, the row scalings of the previous matrix,
 *                               if any
 *        o  ScalePermstruct->C, the column scalings of the previous matrix,
 *                               if any
 *        o  ScalePermstruct->perm_r, the row permutation of the previous
 *                                    matrix
 *        o  ScalePermstruct->perm_c, the column permutation of the previous
 *                                    matrix
 *        o  all of LUstruct, the previously computed information about
 *                            L and U (the actual numerical values of L and U
 *                            stored in LUstruct->Llu are ignored)
 *
 *      The outputs returned include
 *
 *        o  A, the input matrix A overwritten by the scaled and permuted
 *              matrix as described above
 *        o  ScalePermstruct, modified to describe how the input matrix A was
 *                            equilibrated (thus ScalePermstruct->DiagScale,
 *                            R and C may be modified)
 *        o  LUstruct, modified to contain the new L and U factors




On Mon, Dec 7, 2015 at 10:42 AM, Hong <hzhang at mcs.anl.gov> wrote:

> Danyang :
>
> Adding '-mat_superlu_dist_fact SamePattern' fixed the problem. Below is
> how I figured it out.
>
> 1. Reading ex52f.F, I see '-superlu_default' =
> '-pc_factor_mat_solver_package superlu_dist'; the latter enables runtime
> options for other packages. I use superlu_dist-4.2 and superlu-4.1 for the
> tests below.
>
> 2. Use Matrix 168 to set up the KSP solver and factorization; all packages
> (petsc, superlu_dist and mumps) give the same correct results:
>
> ./ex52f -f0 matrix_and_rhs_bin/a_flow_check_168.bin -rhs
> matrix_and_rhs_bin/b_flow_check_168.bin -loop_matrices flow_check
> -loop_folder matrix_and_rhs_bin -pc_type lu -pc_factor_mat_solver_package
> petsc
>  -->loac matrix a
>  -->load rhs b
>  size l,m,n,mm       90000       90000       90000       90000
> Norm of error  7.7308E-11 iterations     1
>  -->Test for matrix          168
> ..
>  -->Test for matrix          172
> Norm of error  3.8461E-11 iterations     1
>
> ./ex52f -f0 matrix_and_rhs_bin/a_flow_check_168.bin -rhs
> matrix_and_rhs_bin/b_flow_check_168.bin -loop_matrices flow_check
> -loop_folder matrix_and_rhs_bin -pc_type lu -pc_factor_mat_solver_package
> superlu_dist
> Norm of error  9.4073E-11 iterations     1
>  -->Test for matrix          168
> ...
>  -->Test for matrix          172
> Norm of error  3.8187E-11 iterations     1
>
> 3. Use superlu, I get
> ./ex52f -f0 matrix_and_rhs_bin/a_flow_check_168.bin -rhs
> matrix_and_rhs_bin/b_flow_check_168.bin -loop_matrices flow_check
> -loop_folder matrix_and_rhs_bin -pc_type lu -pc_factor_mat_solver_package
> superlu
> Norm of error  1.0191E-06 iterations     1
>  -->Test for matrix          168
> ...
>  -->Test for matrix          172
> Norm of error  9.7858E-07 iterations     1
>
> Replacing the default DiagPivotThresh of 1.0 with 0.0, I get the same
> solutions as the other packages:
>
> ./ex52f -f0 matrix_and_rhs_bin/a_flow_check_168.bin -rhs
> matrix_and_rhs_bin/b_flow_check_168.bin -loop_matrices flow_check
> -loop_folder matrix_and_rhs_bin -pc_type lu -pc_factor_mat_solver_package
> superlu -mat_superlu_diagpivotthresh 0.0
>
> Norm of error  8.3614E-11 iterations     1
>  -->Test for matrix          168
> ...
>  -->Test for matrix          172
> Norm of error  3.7098E-11 iterations     1
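For what it's worth, the same knob is reachable from SciPy's sequential-SuperLU wrapper as diag_pivot_thresh, which makes it easy to experiment with. The toy matrix below is well conditioned, so both settings give small residuals; the difference only shows up on ill-conditioned matrices like the ones in this thread.

```python
# Sketch: passing SuperLU's DiagPivotThresh through SciPy's wrapper.
# thresh=1.0 is classical partial pivoting; thresh=0.0 keeps diagonal
# pivots, preserving the preordering (the setting recommended above).
import numpy as np
from scipy.sparse import csc_matrix
from scipy.sparse.linalg import splu

n = 50
A = csc_matrix(np.diag(np.full(n, 2.0))
               + np.diag(np.full(n - 1, -1.0), 1)
               + np.diag(np.full(n - 1, -1.0), -1))  # 1D Laplacian, well conditioned
b = np.ones(n)

res = {}
for thresh in (1.0, 0.0):
    x = splu(A, diag_pivot_thresh=thresh).solve(b)
    res[thresh] = np.linalg.norm(A @ x - b)  # residual for each pivoting threshold
```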
>
> 4.
> Using '-mat_view ascii::ascii_info', I found that a_flow_check_1.bin and
> a_flow_check_168.bin seem to have the same structure:
>
>  -->loac matrix a
> Mat Object: 1 MPI processes
>   type: seqaij
>   rows=90000, cols=90000
>   total: nonzeros=895600, allocated nonzeros=895600
>   total number of mallocs used during MatSetValues calls =0
>     using I-node routines: found 45000 nodes, limit used is 5
>
> 5.
> Using a_flow_check_1.bin, I am able to reproduce the error you reported:
> all packages give correct results except superlu_dist:
> ./ex52f -f0 matrix_and_rhs_bin/a_flow_check_1.bin -rhs
> matrix_and_rhs_bin/b_flow_check_168.bin -loop_matrices flow_check
> -loop_folder matrix_and_rhs_bin -pc_type lu -pc_factor_mat_solver_package
> superlu_dist
> Norm of error  2.5970E-12 iterations     1
>  -->Test for matrix          168
> Norm of error  1.3936E-01 iterations    34
>  -->Test for matrix          169
>
> I guess the error might come from reuse of the matrix factor. Replacing the
> default -mat_superlu_dist_fact <SamePattern_SameRowPerm> with
> -mat_superlu_dist_fact SamePattern, I get
>
> ./ex52f -f0 matrix_and_rhs_bin/a_flow_check_1.bin -rhs
> matrix_and_rhs_bin/b_flow_check_168.bin -loop_matrices flow_check
> -loop_folder matrix_and_rhs_bin -pc_type lu -pc_factor_mat_solver_package
> superlu_dist -mat_superlu_dist_fact SamePattern
>
> Norm of error  2.5970E-12 iterations     1
>  -->Test for matrix          168
> Norm of error  9.4073E-11 iterations     1
>  -->Test for matrix          169
> Norm of error  6.4303E-11 iterations     1
>  -->Test for matrix          170
> Norm of error  7.4327E-11 iterations     1
>  -->Test for matrix          171
> Norm of error  5.4162E-11 iterations     1
>  -->Test for matrix          172
> Norm of error  3.4440E-11 iterations     1
>  --> End of test, bye
>
> Sherry may tell you why SamePattern_SameRowPerm causes the difference here.
> Based on the above experiments, I would set the following as defaults:
> '-mat_superlu_diagpivotthresh 0.0' in the petsc/superlu interface and
> '-mat_superlu_dist_fact SamePattern' in the petsc/superlu_dist interface.
>
> Hong
>
> Hi Hong,
>>
>> I did more tests today and finally found that the solution accuracy
>> depends on the quality of the initial (first) matrix. I modified ex52f.F
>> to do the test. There are 6 matrices and right-hand-side vectors, all
>> taken from my reactive transport simulation. The results differ depending
>> on which matrix you use to do the factorization, and also differ if you
>> run with different options. My code is similar to the First or the Second
>> test below. When the matrix is well conditioned, it works fine. But if the
>> initial matrix is well conditioned, it is likely to crash when the matrix
>> becomes ill-conditioned. Since most of my cases are well conditioned, I
>> didn't detect the problem before. This case is a special one.
>>
>>
>> How can I avoid this problem? Shall I redo the factorization? Can PETSc
>> automatically detect this problem, or is there an option available to do
>> so?
>>
>> All the data and test code (modified ex52f) can be found via the dropbox
>> link below.
>>
>> https://www.dropbox.com/s/4al1a60creogd8m/petsc-superlu-test.tar.gz?dl=0
>>
>>
>> Summary of my test is shown below.
>>
>> First, use Matrix 1 to set up the KSP solver and factorization, then
>> solve matrices 168 to 172:
>>
>> mpiexec.hydra -n 1 ./ex52f -f0
>> /home/dsu/work/petsc-superlu-test/matrix_and_rhs_bin/a_flow_check_1.bin
>> -rhs
>> /home/dsu/work/petsc-superlu-test/matrix_and_rhs_bin/b_flow_check_1.bin
>> -loop_matrices flow_check -loop_folder
>> /home/dsu/work/petsc-superlu-test/matrix_and_rhs_bin -pc_type lu
>> -pc_factor_mat_solver_package superlu_dist
>>
>> Norm of error  3.8815E-11 iterations     1
>>  -->Test for matrix          168
>> Norm of error  4.2307E-01 iterations    32
>>  -->Test for matrix          169
>> Norm of error  3.0528E-01 iterations    32
>>  -->Test for matrix          170
>> Norm of error  3.1177E-01 iterations    32
>>  -->Test for matrix          171
>> Norm of error  3.2793E-01 iterations    32
>>  -->Test for matrix          172
>> Norm of error  3.1251E-01 iterations    31
>>
>> Second, use Matrix 1 to set up the KSP solver and factorization using the
>> SuperLU-related code implemented in ex52f. I thought this would generate
>> the same results as the First test, but it actually does not.
>>
>> mpiexec.hydra -n 1 ./ex52f -f0
>> /home/dsu/work/petsc-superlu-test/matrix_and_rhs_bin/a_flow_check_1.bin
>> -rhs
>> /home/dsu/work/petsc-superlu-test/matrix_and_rhs_bin/b_flow_check_1.bin
>> -loop_matrices flow_check -loop_folder
>> /home/dsu/work/petsc-superlu-test/matrix_and_rhs_bin -superlu_default
>>
>> Norm of error  2.2632E-12 iterations     1
>>  -->Test for matrix          168
>> Norm of error  1.0817E+04 iterations     1
>>  -->Test for matrix          169
>> Norm of error  1.0786E+04 iterations     1
>>  -->Test for matrix          170
>> Norm of error  1.0792E+04 iterations     1
>>  -->Test for matrix          171
>> Norm of error  1.0792E+04 iterations     1
>>  -->Test for matrix          172
>> Norm of error  1.0792E+04 iterations     1
>>
>>
>> Third, use Matrix 168 to set up the KSP solver and factorization, then
>> solve matrices 168 to 172:
>>
>> mpiexec.hydra -n 1 ./ex52f -f0
>> /home/dsu/work/petsc-superlu-test/matrix_and_rhs_bin/a_flow_check_168.bin
>> -rhs
>> /home/dsu/work/petsc-superlu-test/matrix_and_rhs_bin/b_flow_check_168.bin
>> -loop_matrices flow_check -loop_folder
>> /home/dsu/work/petsc-superlu-test/matrix_and_rhs_bin -pc_type lu
>> -pc_factor_mat_solver_package superlu_dist
>>
>> Norm of error  9.5528E-10 iterations     1
>>  -->Test for matrix          168
>> Norm of error  9.4945E-10 iterations     1
>>  -->Test for matrix          169
>> Norm of error  6.4279E-10 iterations     1
>>  -->Test for matrix          170
>> Norm of error  7.4633E-10 iterations     1
>>  -->Test for matrix          171
>> Norm of error  7.4863E-10 iterations     1
>>  -->Test for matrix          172
>> Norm of error  8.9701E-10 iterations     1
>>
>> Fourth, use Matrix 168 to set up the KSP solver and factorization using
>> the SuperLU-related code implemented in ex52f. I thought this would
>> generate the same results as the Third test, but it actually does not.
>>
>> mpiexec.hydra -n 1 ./ex52f -f0
>> /home/dsu/work/petsc-superlu-test/matrix_and_rhs_bin/a_flow_check_168.bin
>> -rhs
>> /home/dsu/work/petsc-superlu-test/matrix_and_rhs_bin/b_flow_check_168.bin
>> -loop_matrices flow_check -loop_folder
>> /home/dsu/work/petsc-superlu-test/matrix_and_rhs_bin -superlu_default
>>
>> Norm of error  3.7017E-11 iterations     1
>>  -->Test for matrix          168
>> Norm of error  3.6420E-11 iterations     1
>>  -->Test for matrix          169
>> Norm of error  3.7184E-11 iterations     1
>>  -->Test for matrix          170
>> Norm of error  3.6847E-11 iterations     1
>>  -->Test for matrix          171
>> Norm of error  3.7883E-11 iterations     1
>>  -->Test for matrix          172
>> Norm of error  3.8805E-11 iterations     1
>>
>> Thanks very much,
>>
>> Danyang
>>
>> On 15-12-03 01:59 PM, Hong wrote:
>>
>> Danyang :
>> Further testing a_flow_check_168.bin,
>> ./ex10 -f0 /Users/Hong/Downloads/matrix_and_rhs_bin/a_flow_check_168.bin
>> -rhs /Users/Hong/Downloads/matrix_and_rhs_bin/x_flow_check_168.bin -pc_type
>> lu -pc_factor_mat_solver_package superlu -ksp_monitor_true_residual
>> -mat_superlu_conditionnumber
>>   Recip. condition number = 1.610480e-12
>>   0 KSP preconditioned resid norm 6.873340313547e+09 true resid norm
>> 7.295020990196e+03 ||r(i)||/||b|| 1.000000000000e+00
>>   1 KSP preconditioned resid norm 2.051833296449e-02 true resid norm
>> 2.976859070118e-02 ||r(i)||/||b|| 4.080672384793e-06
>> Number of iterations =   1
>> Residual norm 0.0297686
>>
>> The condition number of this matrix = 1/1.610480e-12 ≈ 1.e+12,
>> i.e., this matrix is ill-conditioned.
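The digit loss implied by a condition number of that size can be checked on a small dense stand-in; the Hilbert matrix below is chosen only because its condition number grows quickly and is unrelated to the matrices in this thread.

```python
# With condition number ~1e12-1e16, even a backward-stable direct solve
# can lose most of the ~16 significant digits of double precision.
import numpy as np

n = 12
H = np.array([[1.0 / (i + j + 1) for j in range(n)] for i in range(n)])
cond = np.linalg.cond(H)          # very large for n = 12

x_true = np.ones(n)
b = H @ x_true                    # right-hand side with known solution
x_hat = np.linalg.solve(H, b)
rel_err = np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true)
# rel_err is roughly eps * cond: far above machine precision
```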
>>
>> Hong
>>
>>
>> Hi Hong,
>>>
>>> The binary format of matrix, rhs and solution can be downloaded via the
>>> link below.
>>>
>>> https://www.dropbox.com/s/cl3gfi0s0kjlktf/matrix_and_rhs_bin.tar.gz?dl=0
>>>
>>> Thanks,
>>>
>>> Danyang
>>>
>>>
>>> On 15-12-03 10:50 AM, Hong wrote:
>>>
>>> Danyang:
>>>
>>>>
>>>>
>>>> To my surprise, the solutions from SuperLU at timestep 29 seem not to be
>>>> correct for the first 4 Newton iterations, but the solutions from the
>>>> iterative solver and MUMPS are correct.
>>>>
>>>> Please find all the matrices, rhs and solutions at timestep 29 via the
>>>> link below. The data is a bit large, so I just share it through Dropbox.
>>>> A piece of matlab code to read these data and then compute the norm has
>>>> also been attached.
>>>> https://www.dropbox.com/s/rr8ueysgflmxs7h/results-check.tar.gz?dl=0
>>>>
>>>
>>> Can you send us the matrix in petsc binary format?
>>>
>>> e.g., call MatView(M, PETSC_VIEWER_BINARY_(PETSC_COMM_WORLD))
>>> or '-ksp_view_mat binary'
>>>
>>> Hong
>>>
>>>>
>>>>
>>>> Below is a summary of the norm from the three solvers at timestep 29,
>>>> newton iteration 1 to 5.
>>>>
>>>> Timestep 29
>>>> Norm of residual seq 1.661321e-09, superlu 1.657103e+04, mumps
>>>> 3.731225e-11
>>>> Norm of residual seq 1.753079e-09, superlu 6.675467e+02, mumps
>>>> 1.509919e-13
>>>> Norm of residual seq 4.914971e-10, superlu 1.236362e-01, mumps
>>>> 2.139303e-17
>>>> Norm of residual seq 3.532769e-10, superlu 1.304670e-04, mumps
>>>> 5.387000e-20
>>>> Norm of residual seq 3.885629e-10, superlu 2.754876e-07, mumps
>>>> 4.108675e-21
>>>>
>>>> Would anybody please check whether SuperLU can solve these matrices?
>>>> Another possibility is that something is wrong in my own code, but so
>>>> far I cannot find any problem in it, since the same code works fine if I
>>>> use an iterative solver or the direct solver MUMPS. For the other cases
>>>> I have tested, all these solvers work fine.
>>>>
>>>> Please let me know if I did not write down the problem clearly.
>>>>
>>>> Thanks,
>>>>
>>>> Danyang
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>

