[petsc-users] DIVERGED_NANORINF when using HYPRE/BoomerAMG

Mark Adams mfadams at lbl.gov
Wed Jul 1 07:15:13 CDT 2020


As Matt said, AMG does not work for everything out of the box.

Hypre is pretty robust, and I am surprised that every option gives
Infs/NaNs. But if it solves your stretched Laplacian, then it's probably
hooked up correctly.

You might try a non-adjoint Navier-Stokes problem, if you can.

You could try 'gamg', using the fgmres/ilu solver that works as your
smoother (use -ksp_view to check that everything gets set correctly).
Use fgmres as the outer Krylov method and add -pc_gamg_agg_nsmooths 0.
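
Something along these lines, as a rough (untested) sketch; here I am
assuming a gmres/ilu smoother on each level, with fgmres on the outside
since the smoother is then a variable preconditioner:

  -ksp_type fgmres
  -ksp_view
  -pc_type gamg
  -pc_gamg_agg_nsmooths 0
  -mg_levels_ksp_type gmres
  -mg_levels_pc_type ilu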

Mark


On Tue, Jun 30, 2020 at 9:16 AM Matthew Knepley <knepley at gmail.com> wrote:

> On Tue, Jun 30, 2020 at 7:59 AM Andrea Iob <andrea_iob at hotmail.com> wrote:
>
>> Hi,
>>
>> I'm trying to solve a linear system using HYPRE/BoomerAMG as a
>> preconditioner. The system comes from a two-dimensional adjoint
>> Navier-Stokes problem.
>>
>
> Do you expect the system you are solving to be elliptic? BoomerAMG is
> designed for elliptic systems, and can easily fail if applied
> to a more general system, say one with zeros on the diagonal.
>
>   Thanks,
>
>      Matt
>
>
>> The mesh is structured (6400 cells) and there are 24 DoFs per cell
>> (the matrix has a total of 153600 rows). The condition number of the matrix
>> should be on the order of 1e9. Using an ILU preconditioner with FGMRES, the
>> system is solved correctly (it takes 800 iterations to reach a residual of
>> 1e-12). However, replacing ILU with HYPRE/BoomerAMG, the solver stops right
>> after the first iteration (DIVERGED_NANORINF). I've tried different
>> BoomerAMG options and different solvers (including Richardson, to use
>> BoomerAMG without a Krylov method), but I always get the same
>> DIVERGED_NANORINF error.
>>
>> Using the same HYPRE/BoomerAMG + FGMRES setup on another problem
>> (a Laplacian on a highly stretched grid), the solver reaches convergence very
>> quickly without any problems.
>>
>> At the bottom of this mail, you'll find the log of a run that stops with
>> DIVERGED_NANORINF. Do you spot anything in my setup that might explain why
>> the solver is not converging?
>>
>> Thanks. Best regards,
>> Andrea
>>
>> -----------------------------------------
>> Assembly started...
>>  Reading matrix file...
>>    - Number of rows (N) = 153600
>>    - Number of non-zero entries (NNZ) = 18247680
>>  Assembly completed.
>>  Time elapsed 83.3231s
>>  Set up started...
>>    Set up completed.
>>    Time elapsed 6.62441s
>>  Solution started...
>>   0 KSP unpreconditioned resid norm 7.736650641501e-01 true resid norm
>> 7.736650641501e-01 ||r(i)||/||b|| 1.000000000000e+00
>>   0 KSP Residual norm 7.736650641501e-01 % max 1.000000000000e+00 min
>> 1.000000000000e+00 max/min 1.000000000000e+00
>> [0]PETSC ERROR: --------------------- Error Message
>> --------------------------------------------------------------
>> [0]PETSC ERROR:
>> [0]PETSC ERROR: KSPSolve has not converged due to Nan or Inf inner product
>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html
>> for trouble shooting.
>> [0]PETSC ERROR: Petsc Release Version 3.10.3, unknown
>> [0]PETSC ERROR: bitpit_system_solver on a  named xxx by xxx Tue Jun 30
>> 09:42:09 2020
>> [0]PETSC ERROR: Configure options PETSC_ARCH=arch-linux2-c-opt-gcc7-hypre
>> --with-blaslapack-dir=/opt/lapack/3.8.0-gcc.7.4.0/lib64 --with-debugging=0
>> COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 --with-valgrind=1
>> --with-valgrind-dir=/opt/valgrind/3.14.0/
>> --prefix=/opt/petsc/3.10.3_gcc-7.4.0 --with-mpi=1 --download-hypre
>> [0]PETSC ERROR: #1 KSPGMRESClassicalGramSchmidtOrthogonalization() line
>> 67 in /root/InstallSources/petsc/src/ksp/ksp/impls/gmres/borthog2.c
>> [0]PETSC ERROR: #2 KSPFGMRESCycle() line 175 in
>> /root/InstallSources/petsc/src/ksp/ksp/impls/gmres/fgmres/fgmres.c
>> [0]PETSC ERROR: #3 KSPSolve_FGMRES() line 291 in
>> /root/InstallSources/petsc/src/ksp/ksp/impls/gmres/fgmres/fgmres.c
>> [0]PETSC ERROR: #4 KSPSolve() line 780 in
>> /root/InstallSources/petsc/src/ksp/ksp/interface/itfunc.c
>> KSP Object: 1 MPI processes
>>   type: fgmres
>>     restart=45, using Classical (unmodified) Gram-Schmidt
>> Orthogonalization with no iterative refinement
>>     happy breakdown tolerance 1e-30
>>   maximum iterations=10000, nonzero initial guess
>>   tolerances:  relative=1e-12, absolute=1e-50, divergence=10000.
>>   right preconditioning
>>   using UNPRECONDITIONED norm type for convergence test
>> PC Object: 1 MPI processes
>>   type: hypre
>>     HYPRE BoomerAMG preconditioning
>>       Cycle type V
>>       Maximum number of levels 25
>>       Maximum number of iterations PER hypre call 1
>>       Convergence tolerance PER hypre call 0.
>>       Threshold for strong coupling 0.7
>>       Interpolation truncation factor 0.3
>>       Interpolation: max elements per row 2
>>       Number of levels of aggressive coarsening 4
>>       Number of paths for aggressive coarsening 5
>>       Maximum row sums 0.9
>>       Sweeps down         1
>>       Sweeps up           1
>>       Sweeps on coarse    1
>>       Relax down          sequential-Gauss-Seidel
>>       Relax up            sequential-Gauss-Seidel
>>       Relax on coarse     Gaussian-elimination
>>       Relax weight  (all)      1.
>>       Outer relax weight (all) 1.
>>       Using CF-relaxation
>>       Smooth type          Euclid
>>       Smooth num levels    25
>>       Euclid ILU(k) levels 0
>>       Euclid ILU(k) drop tolerance 0.
>>       Euclid ILU use Block-Jacobi? 1
>>       Measure type        local
>>       Coarsen type        HMIS
>>       Interpolation type  ext+i
>>   linear system matrix = precond matrix:
>>   Mat Object: Mat_0x1e88040_0 1 MPI processes
>>     type: seqaij
>>     rows=153600, cols=153600, bs=24
>>     total: nonzeros=18247680, allocated nonzeros=18247680
>>     total number of mallocs used during MatSetValues calls =0
>>       using I-node routines: found 32000 nodes, limit used is 5
>>    Solution completed.
>>    Time elapsed 6.93336s
>>  Solution information...
>>     Error (L2)   : 32.8359
>>     Error (Linf) : 1.93278
>>
>> ************************************************************************************************************************
>> ***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r
>> -fCourier9' to print this document            ***
>>
>> ************************************************************************************************************************
>>
>> ---------------------------------------------- PETSc Performance Summary:
>> ----------------------------------------------
>>
>> bitpit_system_solver on a  named xxx with 1 processor, by xxx Tue Jun 30
>> 09:43:46 2020
>> Using Petsc Release Version 3.10.3, unknown
>>
>>                          Max       Max/Min     Avg       Total
>> Time (sec):           9.718e+01     1.000   9.718e+01
>> Objects:              5.300e+01     1.000   5.300e+01
>> Flop:                 1.107e+08     1.000   1.107e+08  1.107e+08
>> Flop/sec:             1.139e+06     1.000   1.139e+06  1.139e+06
>> MPI Messages:         0.000e+00     0.000   0.000e+00  0.000e+00
>> MPI Message Lengths:  0.000e+00     0.000   0.000e+00  0.000e+00
>> MPI Reductions:       0.000e+00     0.000
>>
>> Flop counting convention: 1 flop = 1 real number operation of type
>> (multiply/divide/add/subtract)
>>                             e.g., VecAXPY() for real vectors of length N
>> --> 2N flop
>>                             and VecAXPY() for complex vectors of length N
>> --> 8N flop
>>
>> Summary of Stages:   ----- Time ------  ----- Flop ------  --- Messages
>> ---  -- Message Lengths --  -- Reductions --
>>                         Avg     %Total     Avg     %Total    Count
>> %Total     Avg         %Total    Count   %Total
>>  0:      Main Stage: 9.7178e+01 100.0%  1.1071e+08 100.0%  0.000e+00
>> 0.0%  0.000e+00        0.0%  0.000e+00   0.0%
>>
>>
>> ------------------------------------------------------------------------------------------------------------------------
>> See the 'Profiling' chapter of the users' manual for details on
>> interpreting output.
>> Phase summary info:
>>    Count: number of times phase was executed
>>    Time and Flop: Max - maximum over all processors
>>                   Ratio - ratio of maximum to minimum over all processors
>>    Mess: number of messages sent
>>    AvgLen: average message length (bytes)
>>    Reduct: number of global reductions
>>    Global: entire computation
>>    Stage: stages of a computation. Set stages with PetscLogStagePush()
>> and PetscLogStagePop().
>>       %T - percent time in this phase         %F - percent flop in this
>> phase
>>       %M - percent messages in this phase     %L - percent message
>> lengths in this phase
>>       %R - percent reductions in this phase
>>    Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time
>> over all processors)
>>
>> ------------------------------------------------------------------------------------------------------------------------
>> Event                Count      Time (sec)
>> Flop                              --- Global ---  --- Stage ----  Total
>>                    Max Ratio  Max     Ratio   Max  Ratio  Mess   AvgLen
>> Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
>>
>> ------------------------------------------------------------------------------------------------------------------------
>>
>> --- Event Stage 0: Main Stage
>>
>> BuildTwoSidedF         2 1.0 1.8471e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
>> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
>> MatMult                3 1.0 6.8688e-02 1.0 1.09e+08 1.0 0.0e+00 0.0e+00
>> 0.0e+00  0 98  0  0  0   0 98  0  0  0  1587
>> MatConvert             2 1.0 3.8534e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
>> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
>> MatAssemblyBegin       3 1.0 2.8112e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
>> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
>> MatAssemblyEnd         3 1.0 7.8563e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
>> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
>> MatGetRowIJ            2 1.0 3.6100e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
>> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
>> MatView                2 1.0 5.4618e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
>> 0.0e+00 56  0  0  0  0  56  0  0  0  0     0
>> VecView                2 1.0 7.6981e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
>> 0.0e+00  1  0  0  0  0   1  0  0  0  0     0
>> VecMDot                1 1.0 1.7635e-04 1.0 3.07e+05 1.0 0.0e+00 0.0e+00
>> 0.0e+00  0  0  0  0  0   0  0  0  0  0  1742
>> VecNorm                3 1.0 5.4832e-04 1.0 9.22e+05 1.0 0.0e+00 0.0e+00
>> 0.0e+00  0  1  0  0  0   0  1  0  0  0  1681
>> VecScale               1 1.0 1.5216e-04 1.0 1.54e+05 1.0 0.0e+00 0.0e+00
>> 0.0e+00  0  0  0  0  0   0  0  0  0  0  1009
>> VecCopy                1 1.0 1.5406e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
>> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
>> VecSet                21 1.0 7.1106e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
>> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
>> VecAYPX                1 1.0 1.9940e-04 1.0 1.54e+05 1.0 0.0e+00 0.0e+00
>> 0.0e+00  0  0  0  0  0   0  0  0  0  0   770
>> VecWAXPY               1 1.0 7.8802e-04 1.0 1.54e+05 1.0 0.0e+00 0.0e+00
>> 0.0e+00  0  0  0  0  0   0  0  0  0  0   195
>> KSPSetUp               2 1.0 6.8995e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
>> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
>> PCSetUp                2 1.0 1.3190e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
>> 0.0e+00 14  0  0  0  0  14  0  0  0  0     0
>> PCApply                1 1.0 2.2728e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
>> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
>>
>> ------------------------------------------------------------------------------------------------------------------------
>>
>> Memory usage is given in bytes:
>>
>> Object Type          Creations   Destructions     Memory  Descendants'
>> Mem.
>> Reports information only for process 0.
>>
>> --- Event Stage 0: Main Stage
>>
>>               Matrix     3              1         2872     0.
>>               Vector    32             20     17234800     0.
>>            Index Set     4              4         3200     0.
>>    IS L to G Mapping     2              0            0     0.
>>          Vec Scatter     2              0            0     0.
>>               Viewer     6              4         3392     0.
>>        Krylov Solver     2              1        61676     0.
>>       Preconditioner     2              1         1432     0.
>>
>> ========================================================================================================================
>> Average time to get PetscTime(): 3.36e-08
>> #PETSc Option Table entries:
>> -ksp_converged_reason
>> -ksp_error_if_not_converged
>> -ksp_monitor_singular_value
>> -ksp_monitor_true_residual
>> -log_view
>> -pc_hypre_boomeramg_agg_nl 4
>> -pc_hypre_boomeramg_agg_num_paths 5
>> -pc_hypre_boomeramg_coarsen_type HMIS
>> -pc_hypre_boomeramg_eu_bj true
>> -pc_hypre_boomeramg_interp_type ext+i
>> -pc_hypre_boomeramg_P_max 2
>> -pc_hypre_boomeramg_relax_type_all sequential-Gauss-Seidel
>> -pc_hypre_boomeramg_smooth_type Euclid
>> -pc_hypre_boomeramg_strong_threshold 0.7
>> -pc_hypre_boomeramg_truncfactor 0.3
>> #End of PETSc Option Table entries
>> Compiled without FORTRAN kernels
>> Compiled with full precision matrices (default)
>> sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8
>> sizeof(PetscScalar) 8 sizeof(PetscInt) 4
>> Configure options: PETSC_ARCH=arch-linux2-c-opt-gcc7-hypre
>> --with-blaslapack-dir=/opt/lapack/3.8.0-gcc.7.4.0/lib64 --with-debugging=0
>> COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 --with-valgrind=1
>> --with-valgrind-dir=/opt/valgrind/3.14.0/
>> --prefix=/opt/petsc/3.10.3_gcc-7.4.0 --with-mpi=1 --download-hypre
>> -----------------------------------------
>> Libraries compiled on 2020-06-26 12:35:46 on xxx
>> Machine characteristics:
>> Linux-3.10.0-1127.8.2.el7.x86_64-x86_64-with-centos-7.6.1810-Core
>> Using PETSc directory: /opt/petsc/3.10.3_gcc-7.4.0
>> Using PETSc arch:
>> -----------------------------------------
>>
>> Using C compiler: mpicc  -fPIC  -Wall -Wwrite-strings
>> -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector
>> -fvisibility=hidden -O3
>> Using Fortran compiler: mpif90  -fPIC -Wall -ffree-line-length-0
>> -Wno-unused-dummy-argument -O3
>> -----------------------------------------
>>
>> Using include paths: -I/opt/petsc/3.10.3_gcc-7.4.0/include
>> -I/opt/valgrind/3.14.0/include
>> -----------------------------------------
>>
>> Using C linker: mpicc
>> Using Fortran linker: mpif90
>> Using libraries: -Wl,-rpath,/opt/petsc/3.10.3_gcc-7.4.0/lib
>> -L/opt/petsc/3.10.3_gcc-7.4.0/lib -lpetsc
>> -Wl,-rpath,/opt/petsc/3.10.3_gcc-7.4.0/lib
>> -L/opt/petsc/3.10.3_gcc-7.4.0/lib
>> -Wl,-rpath,/opt/lapack/3.8.0-gcc.7.4.0/lib64
>> -L/opt/lapack/3.8.0-gcc.7.4.0/lib64
>> -Wl,-rpath,/opt/openmpi4.0.0-gcc7.4.0_slurm/lib
>> -L/opt/openmpi4.0.0-gcc7.4.0_slurm/lib
>> -Wl,-rpath,/opt/gnu/7.4/lib/gcc/x86_64-pc-linux-gnu/7.4.0
>> -L/opt/gnu/7.4/lib/gcc/x86_64-pc-linux-gnu/7.4.0
>> -Wl,-rpath,/opt/gnu/7.4/lib/gcc -L/opt/gnu/7.4/lib/gcc
>> -Wl,-rpath,/opt/gnu/7.4/lib64 -L/opt/gnu/7.4/lib64
>> -Wl,-rpath,/opt/gnu/7.4/lib -L/opt/gnu/7.4/lib -lHYPRE -llapack -lblas -lm
>> -lX11 -lstdc++ -ldl -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh
>> -lmpi -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lpthread -lstdc++
>> -ldl
>> -----------------------------------------
>>
>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
>