[petsc-users] Question concerning ilu and bcgs
Sun, Hui
hus003 at ucsd.edu
Wed Feb 18 11:10:20 CST 2015
I tried field splitting several months ago; it didn't work due to the complicated coupled irregular boundary conditions. So I tried a direct solver, and now I have modified the PDE system a little so that ILU/bcgs works in MATLAB. But thank you for the suggestions; although I doubt it will work, maybe I will still try field splitting with my new system.
________________________________
From: Matthew Knepley [knepley at gmail.com]
Sent: Wednesday, February 18, 2015 8:54 AM
To: Sun, Hui
Cc: hong at aspiritech.org; petsc-users at mcs.anl.gov
Subject: Re: [petsc-users] Question concerning ilu and bcgs
On Wed, Feb 18, 2015 at 10:47 AM, Sun, Hui <hus003 at ucsd.edu> wrote:
The matrix is from a 3D fluid problem with complicated irregular boundary conditions. I've tried direct solvers such as UMFPACK, SuperLU_dist, and MUMPS. SuperLU_dist does not solve my linear system; UMFPACK solves the system but runs into memory issues even with small matrices and cannot run in parallel; MUMPS does solve the system, but it also fails when the size is big, and it takes a lot of time. That's why I'm seeking an iterative method.
I guess a direct method is faster than an iterative method for a small A, but that may not be true for a bigger A.
If this is a Stokes flow, you should use PCFIELDSPLIT and multigrid. If it is advection dominated, I know of nothing better than sparse direct or perhaps block Jacobi with sparse direct blocks. Since MUMPS solved your system, I would consider using BJacobi/ASM with MUMPS or UMFPACK as the block solver.
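[For reference, a minimal sketch of the block-Jacobi (or ASM) plus sparse-direct combination described here, assuming PETSc was configured with MUMPS support; the Krylov method choice is carried over from earlier in the thread:

-ksp_type bcgs -pc_type bjacobi -sub_pc_type lu -sub_pc_factor_mat_solver_package mumps

or, with overlapping blocks:

-ksp_type bcgs -pc_type asm -sub_pc_type lu -sub_pc_factor_mat_solver_package mumps]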
Thanks,
Matt
________________________________
From: Matthew Knepley [knepley at gmail.com]
Sent: Wednesday, February 18, 2015 8:33 AM
To: Sun, Hui
Cc: hong at aspiritech.org; petsc-users at mcs.anl.gov
Subject: Re: [petsc-users] Question concerning ilu and bcgs
On Wed, Feb 18, 2015 at 10:31 AM, Sun, Hui <hus003 at ucsd.edu> wrote:
So far I have just been trying things; I haven't looked into the literature yet.
However, both MATLAB's ilu+gmres and ilu+bcgs work. Is it possible that some parameters or options need to be tuned when using PETSc's ILU or hypre's ILU? Also, is there a way to see how well the preconditioner performs, and to output the factors L and U, so that I can do some tests in MATLAB?
1) It's not clear exactly what MATLAB is doing.
2) PETSc uses ILU(0) by default (you can set it to use ILU(k)).
3) I don't know what hypre's ILU can do.
I would really discourage you from using ILU. I cannot imagine it is faster than a sparse direct factorization for your system, such as SuperLU or MUMPS.
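[On the side question of getting the system out of PETSc for tests in MATLAB, a minimal sketch, not from this thread: write A and b with a binary viewer and load them with the PetscBinaryRead.m script shipped in the PETSc source tree. The helper name DumpSystem and the filename Ab.dat are arbitrary choices:

#include <petscksp.h>

/* Hypothetical helper: dump an assembled Mat and Vec to one PETSc binary
 * file so they can be loaded in MATLAB with PetscBinaryRead.m, e.g.
 *   [A,b] = PetscBinaryRead('Ab.dat');
 * and then tested with MATLAB's ilu()/bicgstab() as done later in this thread. */
PetscErrorCode DumpSystem(Mat A, Vec b, const char *fname)
{
  PetscViewer    viewer;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD, fname, FILE_MODE_WRITE, &viewer);CHKERRQ(ierr);
  ierr = MatView(A, viewer);CHKERRQ(ierr);  /* matrix first */
  ierr = VecView(b, viewer);CHKERRQ(ierr);  /* then the right-hand side */
  ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}]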
Thanks,
Matt
Hui
________________________________
From: Matthew Knepley [knepley at gmail.com]
Sent: Wednesday, February 18, 2015 8:09 AM
To: Sun, Hui
Cc: hong at aspiritech.org; petsc-users at mcs.anl.gov
Subject: Re: [petsc-users] Question concerning ilu and bcgs
On Wed, Feb 18, 2015 at 10:02 AM, Sun, Hui <hus003 at ucsd.edu> wrote:
Yes, I've tried other solvers; gmres/ilu does not work, and neither does bcgs/ilu. Here are the options:
-pc_type ilu -pc_factor_nonzeros_along_diagonal -pc_factor_levels 0 -pc_factor_reuse_ordering -ksp_type bcgs -ksp_rtol 1e-6 -ksp_max_it 10 -ksp_monitor_short -ksp_view
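[As an aside, the ILU(k) mentioned elsewhere in the thread would just change the fill level in this same command line, e.g. -pc_factor_levels 2 for ILU(2); k=2 is an arbitrary choice here.]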
Note here that ILU(0) is an unreliable and generally crappy preconditioner. Have you looked in the
literature for the kinds of preconditioners that are effective for your problem?
Thanks,
Matt
Here is the output:
0 KSP Residual norm 211292
1 KSP Residual norm 13990.2
2 KSP Residual norm 9870.08
3 KSP Residual norm 9173.9
4 KSP Residual norm 9121.94
5 KSP Residual norm 7386.1
6 KSP Residual norm 6222.55
7 KSP Residual norm 7192.94
8 KSP Residual norm 33964
9 KSP Residual norm 33960.4
10 KSP Residual norm 1068.54
KSP Object: 1 MPI processes
  type: bcgs
  maximum iterations=10, initial guess is zero
  tolerances: relative=1e-06, absolute=1e-50, divergence=10000
  left preconditioning
  using PRECONDITIONED norm type for convergence test
PC Object: 1 MPI processes
  type: ilu
    ILU: out-of-place factorization
    ILU: Reusing reordering from past factorization
    0 levels of fill
    tolerance for zero pivot 2.22045e-14
    using diagonal shift on blocks to prevent zero pivot [INBLOCKS]
    matrix ordering: natural
    factor fill ratio given 1, needed 1
      Factored matrix follows:
        Mat Object: 1 MPI processes
          type: seqaij
          rows=62500, cols=62500
          package used to perform factorization: petsc
          total: nonzeros=473355, allocated nonzeros=473355
          total number of mallocs used during MatSetValues calls =0
            not using I-node routines
  linear system matrix = precond matrix:
  Mat Object: 1 MPI processes
    type: seqaij
    rows=62500, cols=62500
    total: nonzeros=473355, allocated nonzeros=7.8125e+06
    total number of mallocs used during MatSetValues calls =0
      not using I-node routines
Time cost: 0.307149, 0.268402, 0.0990018
________________________________
From: hong at aspiritech.org [hong at aspiritech.org]
Sent: Wednesday, February 18, 2015 7:49 AM
To: Sun, Hui
Cc: Matthew Knepley; petsc-users at mcs.anl.gov
Subject: Re: [petsc-users] Question concerning ilu and bcgs
Have you tried other solvers, e.g., PETSc's default gmres/ilu, bcgs/ilu, etc.?
The matrix is small. If it is ill-conditioned, then -pc_type lu would work best.
Hong
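[For completeness, the combinations suggested here would look something like the following option sets; a sketch, where the direct solve uses PETSc's built-in LU unless an external package is requested:

-ksp_type gmres -pc_type ilu
-ksp_type bcgs -pc_type ilu
-ksp_type preonly -pc_type lu]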
On Wed, Feb 18, 2015 at 9:34 AM, Sun, Hui <hus003 at ucsd.edu> wrote:
With options:
-pc_type hypre -pc_hypre_type pilut -pc_hypre_pilut_maxiter 1000 -pc_hypre_pilut_tol 1e-3 -ksp_type bcgs -ksp_rtol 1e-10 -ksp_max_it 10 -ksp_monitor_short -ksp_converged_reason -ksp_view
Here is the full output:
0 KSP Residual norm 1404.62
1 KSP Residual norm 88.9068
2 KSP Residual norm 64.73
3 KSP Residual norm 71.0224
4 KSP Residual norm 69.5044
5 KSP Residual norm 455.458
6 KSP Residual norm 174.876
7 KSP Residual norm 183.031
8 KSP Residual norm 650.675
9 KSP Residual norm 79.2441
10 KSP Residual norm 84.1985
Linear solve did not converge due to DIVERGED_ITS iterations 10
KSP Object: 1 MPI processes
  type: bcgs
  maximum iterations=10, initial guess is zero
  tolerances: relative=1e-10, absolute=1e-50, divergence=10000
  left preconditioning
  using PRECONDITIONED norm type for convergence test
PC Object: 1 MPI processes
  type: hypre
    HYPRE Pilut preconditioning
    HYPRE Pilut: maximum number of iterations 1000
    HYPRE Pilut: drop tolerance 0.001
    HYPRE Pilut: default factor row size
  linear system matrix = precond matrix:
  Mat Object: 1 MPI processes
    type: seqaij
    rows=62500, cols=62500
    total: nonzeros=473355, allocated nonzeros=7.8125e+06
    total number of mallocs used during MatSetValues calls =0
      not using I-node routines
Time cost: 0.756198, 0.662984, 0.105672
________________________________
From: Matthew Knepley [knepley at gmail.com]
Sent: Wednesday, February 18, 2015 3:30 AM
To: Sun, Hui
Cc: petsc-users at mcs.anl.gov
Subject: Re: [petsc-users] Question concerning ilu and bcgs
On Wed, Feb 18, 2015 at 12:33 AM, Sun, Hui <hus003 at ucsd.edu> wrote:
I have a linear system Ax = b, where A is of type MatSeqAIJ or MatMPIAIJ, depending on the number of cores.
I try to solve this problem with pc_type ilu and ksp_type bcgs, but it does not converge. The options I specify are:
-pc_type hypre -pc_hypre_type pilut -pc_hypre_pilut_maxiter 1000 -pc_hypre_pilut_tol 1e-3 -ksp_type bcgs -ksp_rtol 1e-10 -ksp_max_it 1000 -ksp_monitor_short -ksp_converged_reason
1) Run with -ksp_view, so we can see exactly what was used.
2) ILUT is unfortunately not a well-defined algorithm, and I believe the parallel version makes different decisions than the serial version.
Thanks,
Matt
The first few lines of the output are:
0 KSP Residual norm 1404.62
1 KSP Residual norm 88.9068
2 KSP Residual norm 64.73
3 KSP Residual norm 71.0224
4 KSP Residual norm 69.5044
5 KSP Residual norm 455.458
6 KSP Residual norm 174.876
7 KSP Residual norm 183.031
8 KSP Residual norm 650.675
9 KSP Residual norm 79.2441
10 KSP Residual norm 84.1985
This clearly indicates non-convergence. However, I output the sparse matrix A and vector b to MATLAB and ran the following commands:
[L,U] = ilu(A,struct('type','ilutp','droptol',1e-3));  % ILUTP factorization with drop tolerance 1e-3
[ux1,fl1,rr1,it1,rv1] = bicgstab(A,b,1e-10,1000,L,U);  % preconditioned BiCGStab, rtol 1e-10, maxit 1000
And it converges in MATLAB, with flag fl1=0, relative residual rr1=8.2725e-11, and iteration count it1=89.5. I'm wondering how I can figure out what's wrong.
Best,
Hui
--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener