[petsc-users] GAMG and zero pivots follow up

David Knezevic david.knezevic at akselos.com
Fri Nov 13 08:52:31 CST 2015


On Fri, Nov 13, 2015 at 9:45 AM, Matthew Knepley <knepley at gmail.com> wrote:

> On Thu, Nov 12, 2015 at 10:02 PM, David Knezevic <
> david.knezevic at akselos.com> wrote:
>
>> On Thu, Nov 12, 2015 at 9:58 AM, David Knezevic <
>> david.knezevic at akselos.com> wrote:
>>
>>> On Thu, Nov 12, 2015 at 9:50 AM, Mark Adams <mfadams at lbl.gov> wrote:
>>>
>>>> Note, I suspect the zero pivot is coming from a coarse grid.  I don't
>>>> know why no-inode fixed it.  N.B., this might not be deterministic.
>>>>
>>>> If you run with -info and grep on 'GAMG' you will see the block sizes.
>>>> They should be 3 on the fine grid and 6 on the coarse grids if you set
>>>> everything up correctly.
>>>>
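For reference, checking the block sizes amounts to something like the following (a hypothetical invocation; the executable name and the remaining options are placeholders for whatever the actual run uses):

    mpiexec -n 8 ./fe_solver-opt_real <usual solver options> -info 2>&1 | grep GAMG

As Mark describes, the per-level output should report block size 3 on the fine grid and 6 on the coarser grids.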
>>>
>>> OK, thanks for the info, I'll look into this further.
>>>
>>> Though note that I got it to work without needing no-inode now. The
>>> change I made was to make sure that I matched the order of function calls
>>> from the PETSc GAMG examples. The libMesh code I was using was doing things
>>> somewhat out of order, apparently.
>>>
>>
>>
>> As I mentioned above, GAMG seems to be working fine for me now. Thanks
>> for the help on this.
>>
>> However, I wanted to ask about another 3D elasticity test case I ran in
>> which GAMG didn't converge. The "-ksp_view -ksp_monitor" output is below.
>> Any insights into what options I should set in order to get it to converge
>> in this case would be appreciated. (By the way, ML worked in this case with
>> the default options.)
>>
>> Thanks,
>> David
>>
>> --------------------------------------------
>>
>>   0 KSP Residual norm 2.475892877233e-03
>>   1 KSP Residual norm 6.181785698923e-05
>>   2 KSP Residual norm 9.439050821656e-04
>>
>
> Something very strange is happening here. CG should converge
> monotonically, but above it does not. What could be happening?
>
>   a) We could have lost orthogonality
>
>       This seems really unlikely in 2 iterates.
>
>   b) The preconditioner could be nonlinear
>
>      In order to test this, could you run with both GMRES and FGMRES? If
> the convergence is different, then something is wrong
>      with the preconditioner.
>
>      I have looked below, and it seems like this should be linear. Mark,
> could something be happening in GAMG?
>
> Also, you should not be using CG/V-cycle. You should be using 1 iterate of
> FMG, -pc_mg_type full.  I have just been going over
> this in class. It is normal for people to solve way past discretization
> error, but that does not make much sense. It's not hard to
> get a sense of discretization error using a manufactured solution.
>


OK, thanks for this info. I'll do some tests based on your comments above
when I get some time and I'll let you know what happens.
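Concretely, I take the suggested tests to be along the lines of the following (hypothetical command lines; the executable name and the problem/setup options are placeholders for whatever the actual run uses):

    # same case, changing only the Krylov method, to see whether GMRES and FGMRES converge differently
    mpiexec -n 8 ./fe_solver-opt_real <setup options> -pc_type gamg -ksp_type gmres  -ksp_monitor_true_residual
    mpiexec -n 8 ./fe_solver-opt_real <setup options> -pc_type gamg -ksp_type fgmres -ksp_monitor_true_residual

    # full multigrid cycle (-pc_mg_type full) instead of the default V-cycle
    mpiexec -n 8 ./fe_solver-opt_real <setup options> -pc_type gamg -pc_mg_type full -ksp_monitor_true_residual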

Thanks,
David




> KSP Object: 8 MPI processes
>>   type: cg
>>   maximum iterations=5000
>>   tolerances:  relative=1e-12, absolute=1e-50, divergence=10000
>>   left preconditioning
>>   using nonzero initial guess
>>   using PRECONDITIONED norm type for convergence test
>> PC Object: 8 MPI processes
>>   type: gamg
>>     MG: type is MULTIPLICATIVE, levels=5 cycles=v
>>       Cycles per PCApply=1
>>       Using Galerkin computed coarse grid matrices
>>       GAMG specific options
>>         Threshold for dropping small values from graph 0
>>         AGG specific options
>>           Symmetric graph false
>>   Coarse grid solver -- level -------------------------------
>>     KSP Object:    (mg_coarse_)     8 MPI processes
>>       type: gmres
>>         GMRES: restart=30, using Classical (unmodified) Gram-Schmidt
>> Orthogonalization with no iterative refinement
>>         GMRES: happy breakdown tolerance 1e-30
>>       maximum iterations=1, initial guess is zero
>>       tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>>       left preconditioning
>>       using NONE norm type for convergence test
>>     PC Object:    (mg_coarse_)     8 MPI processes
>>       type: bjacobi
>>         block Jacobi: number of blocks = 8
>>         Local solve is same for all blocks, in the following KSP and PC
>> objects:
>>       KSP Object:      (mg_coarse_sub_)       1 MPI processes
>>         type: preonly
>>         maximum iterations=1, initial guess is zero
>>         tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>>         left preconditioning
>>         using NONE norm type for convergence test
>>       PC Object:      (mg_coarse_sub_)       1 MPI processes
>>         type: lu
>>           LU: out-of-place factorization
>>           tolerance for zero pivot 2.22045e-14
>>           using diagonal shift on blocks to prevent zero pivot [INBLOCKS]
>>           matrix ordering: nd
>>           factor fill ratio given 5, needed 1
>>             Factored matrix follows:
>>               Mat Object:               1 MPI processes
>>                 type: seqaij
>>                 rows=6, cols=6, bs=6
>>                 package used to perform factorization: petsc
>>                 total: nonzeros=36, allocated nonzeros=36
>>                 total number of mallocs used during MatSetValues calls =0
>>                   using I-node routines: found 2 nodes, limit used is 5
>>         linear system matrix = precond matrix:
>>         Mat Object:         1 MPI processes
>>           type: seqaij
>>           rows=6, cols=6, bs=6
>>           total: nonzeros=36, allocated nonzeros=36
>>           total number of mallocs used during MatSetValues calls =0
>>             using I-node routines: found 2 nodes, limit used is 5
>>       linear system matrix = precond matrix:
>>       Mat Object:       8 MPI processes
>>         type: mpiaij
>>         rows=6, cols=6, bs=6
>>         total: nonzeros=36, allocated nonzeros=36
>>         total number of mallocs used during MatSetValues calls =0
>>           using I-node (on process 0) routines: found 2 nodes, limit used
>> is 5
>>   Down solver (pre-smoother) on level 1 -------------------------------
>>     KSP Object:    (mg_levels_1_)     8 MPI processes
>>       type: chebyshev
>>         Chebyshev: eigenvalue estimates:  min = 0.146922, max = 1.61615
>>         Chebyshev: eigenvalues estimated using gmres with translations
>>  [0 0.1; 0 1.1]
>>         KSP Object:        (mg_levels_1_esteig_)         8 MPI processes
>>           type: gmres
>>             GMRES: restart=30, using Classical (unmodified) Gram-Schmidt
>> Orthogonalization with no iterative refinement
>>             GMRES: happy breakdown tolerance 1e-30
>>           maximum iterations=10, initial guess is zero
>>           tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>>           left preconditioning
>>           using NONE norm type for convergence test
>>       maximum iterations=2
>>       tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>>       left preconditioning
>>       using nonzero initial guess
>>       using NONE norm type for convergence test
>>     PC Object:    (mg_levels_1_)     8 MPI processes
>>       type: sor
>>         SOR: type = local_symmetric, iterations = 1, local iterations =
>> 1, omega = 1
>>       linear system matrix = precond matrix:
>>       Mat Object:       8 MPI processes
>>         type: mpiaij
>>         rows=162, cols=162, bs=6
>>         total: nonzeros=26244, allocated nonzeros=26244
>>         total number of mallocs used during MatSetValues calls =0
>>           using I-node (on process 0) routines: found 5 nodes, limit used
>> is 5
>>   Up solver (post-smoother) same as down solver (pre-smoother)
>>   Down solver (pre-smoother) on level 2 -------------------------------
>>     KSP Object:    (mg_levels_2_)     8 MPI processes
>>       type: chebyshev
>>         Chebyshev: eigenvalue estimates:  min = 0.143941, max = 1.58335
>>         Chebyshev: eigenvalues estimated using gmres with translations
>>  [0 0.1; 0 1.1]
>>         KSP Object:        (mg_levels_2_esteig_)         8 MPI processes
>>           type: gmres
>>             GMRES: restart=30, using Classical (unmodified) Gram-Schmidt
>> Orthogonalization with no iterative refinement
>>             GMRES: happy breakdown tolerance 1e-30
>>           maximum iterations=10, initial guess is zero
>>           tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>>           left preconditioning
>>           using NONE norm type for convergence test
>>       maximum iterations=2
>>       tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>>       left preconditioning
>>       using nonzero initial guess
>>       using NONE norm type for convergence test
>>     PC Object:    (mg_levels_2_)     8 MPI processes
>>       type: sor
>>         SOR: type = local_symmetric, iterations = 1, local iterations =
>> 1, omega = 1
>>       linear system matrix = precond matrix:
>>       Mat Object:       8 MPI processes
>>         type: mpiaij
>>         rows=4668, cols=4668, bs=6
>>         total: nonzeros=3.00254e+06, allocated nonzeros=3.00254e+06
>>         total number of mallocs used during MatSetValues calls =0
>>           using I-node (on process 0) routines: found 158 nodes, limit
>> used is 5
>>   Up solver (post-smoother) same as down solver (pre-smoother)
>>   Down solver (pre-smoother) on level 3 -------------------------------
>>     KSP Object:    (mg_levels_3_)     8 MPI processes
>>       type: chebyshev
>>         Chebyshev: eigenvalue estimates:  min = 0.128239, max = 1.41063
>>         Chebyshev: eigenvalues estimated using gmres with translations
>>  [0 0.1; 0 1.1]
>>         KSP Object:        (mg_levels_3_esteig_)         8 MPI processes
>>           type: gmres
>>             GMRES: restart=30, using Classical (unmodified) Gram-Schmidt
>> Orthogonalization with no iterative refinement
>>             GMRES: happy breakdown tolerance 1e-30
>>           maximum iterations=10, initial guess is zero
>>           tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>>           left preconditioning
>>           using NONE norm type for convergence test
>>       maximum iterations=2
>>       tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>>       left preconditioning
>>       using nonzero initial guess
>>       using NONE norm type for convergence test
>>     PC Object:    (mg_levels_3_)     8 MPI processes
>>       type: sor
>>         SOR: type = local_symmetric, iterations = 1, local iterations =
>> 1, omega = 1
>>       linear system matrix = precond matrix:
>>       Mat Object:       8 MPI processes
>>         type: mpiaij
>>         rows=66390, cols=66390, bs=6
>>         total: nonzeros=1.65982e+07, allocated nonzeros=1.65982e+07
>>         total number of mallocs used during MatSetValues calls =0
>>           using I-node (on process 0) routines: found 2668 nodes, limit
>> used is 5
>>   Up solver (post-smoother) same as down solver (pre-smoother)
>>   Down solver (pre-smoother) on level 4 -------------------------------
>>     KSP Object:    (mg_levels_4_)     8 MPI processes
>>       type: chebyshev
>>         Chebyshev: eigenvalue estimates:  min = 0.1, max = 1.1
>>         Chebyshev: eigenvalues estimated using gmres with translations
>>  [0 0.1; 0 1.1]
>>         KSP Object:        (mg_levels_4_esteig_)         8 MPI processes
>>           type: gmres
>>             GMRES: restart=30, using Classical (unmodified) Gram-Schmidt
>> Orthogonalization with no iterative refinement
>>             GMRES: happy breakdown tolerance 1e-30
>>           maximum iterations=10, initial guess is zero
>>           tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>>           left preconditioning
>>           using NONE norm type for convergence test
>>       maximum iterations=2
>>       tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>>       left preconditioning
>>       using nonzero initial guess
>>       using NONE norm type for convergence test
>>     PC Object:    (mg_levels_4_)     8 MPI processes
>>       type: sor
>>         SOR: type = local_symmetric, iterations = 1, local iterations =
>> 1, omega = 1
>>       linear system matrix = precond matrix:
>>       Mat Object:      ()       8 MPI processes
>>         type: mpiaij
>>         rows=1532187, cols=1532187, bs=3
>>         total: nonzeros=1.21304e+08, allocated nonzeros=1.21304e+08
>>         total number of mallocs used during MatSetValues calls =0
>>           has attached near null space
>>           using I-node (on process 0) routines: found 66403 nodes, limit
>> used is 5
>>   Up solver (post-smoother) same as down solver (pre-smoother)
>>   linear system matrix = precond matrix:
>>   Mat Object:  ()   8 MPI processes
>>     type: mpiaij
>>     rows=1532187, cols=1532187, bs=3
>>     total: nonzeros=1.21304e+08, allocated nonzeros=1.21304e+08
>>     total number of mallocs used during MatSetValues calls =0
>>       has attached near null space
>>       using I-node (on process 0) routines: found 66403 nodes, limit used
>> is 5
>> Error, conv_flag < 0!
>>
>>>
>>> On Wed, Nov 11, 2015 at 1:36 PM, David Knezevic <
>>> david.knezevic at akselos.com> wrote:
>>>
>>>> On Wed, Nov 11, 2015 at 12:57 PM, David Knezevic <
>>>> david.knezevic at akselos.com> wrote:
>>>>
>>>>> On Wed, Nov 11, 2015 at 12:24 PM, David Knezevic <
>>>>> david.knezevic at akselos.com> wrote:
>>>>>
>>>>>> On Tue, Nov 10, 2015 at 10:28 PM, David Knezevic <
>>>>>> david.knezevic at akselos.com> wrote:
>>>>>>
>>>>>>> On Tue, Nov 10, 2015 at 10:24 PM, Matthew Knepley <knepley at gmail.com
>>>>>>> > wrote:
>>>>>>>
>>>>>>>> On Tue, Nov 10, 2015 at 9:21 PM, David Knezevic <
>>>>>>>> david.knezevic at akselos.com> wrote:
>>>>>>>>
>>>>>>>>> On Tue, Nov 10, 2015 at 10:00 PM, Matthew Knepley <
>>>>>>>>> knepley at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> On Tue, Nov 10, 2015 at 8:39 PM, David Knezevic <
>>>>>>>>>> david.knezevic at akselos.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> I'm looking into using GAMG, so I wanted to start with a simple
>>>>>>>>>>> 3D elasticity problem. When I first tried this, I got the following "zero
>>>>>>>>>>> pivot" error:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> -----------------------------------------------------------------------
>>>>>>>>>>>
>>>>>>>>>>> [0]PETSC ERROR: Zero pivot in LU factorization:
>>>>>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#zeropivot
>>>>>>>>>>> [0]PETSC ERROR: Zero pivot, row 3
>>>>>>>>>>> [0]PETSC ERROR: See
>>>>>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble
>>>>>>>>>>> shooting.
>>>>>>>>>>> [0]PETSC ERROR: Petsc Release Version 3.6.1, Jul, 22, 2015
>>>>>>>>>>> [0]PETSC ERROR:
>>>>>>>>>>> /home/dknez/akselos-dev/scrbe/build/bin/fe_solver-opt_real on a
>>>>>>>>>>> arch-linux2-c-opt named david-Lenovo by dknez Tue Nov 10 21:26:39 2015
>>>>>>>>>>> [0]PETSC ERROR: Configure options --with-shared-libraries=1
>>>>>>>>>>> --with-debugging=0 --download-suitesparse --download-parmetis
>>>>>>>>>>> --download-blacs
>>>>>>>>>>> --with-blas-lapack-dir=/opt/intel/system_studio_2015.2.050/mkl
>>>>>>>>>>> --CXXFLAGS=-Wl,--no-as-needed --download-scalapack --download-mumps
>>>>>>>>>>> --download-metis --download-superlu_dist
>>>>>>>>>>> --prefix=/home/dknez/software/libmesh_install/opt_real/petsc
>>>>>>>>>>> --download-hypre --download-ml
>>>>>>>>>>> [0]PETSC ERROR: #1 PetscKernel_A_gets_inverse_A_5() line 48 in
>>>>>>>>>>> /home/dknez/software/petsc-3.6.1/src/mat/impls/baij/seq/dgefa5.c
>>>>>>>>>>> [0]PETSC ERROR: #2 MatSOR_SeqAIJ_Inode() line 2808 in
>>>>>>>>>>> /home/dknez/software/petsc-3.6.1/src/mat/impls/aij/seq/inode.c
>>>>>>>>>>> [0]PETSC ERROR: #3 MatSOR() line 3697 in
>>>>>>>>>>> /home/dknez/software/petsc-3.6.1/src/mat/interface/matrix.c
>>>>>>>>>>> [0]PETSC ERROR: #4 PCApply_SOR() line 37 in
>>>>>>>>>>> /home/dknez/software/petsc-3.6.1/src/ksp/pc/impls/sor/sor.c
>>>>>>>>>>> [0]PETSC ERROR: #5 PCApply() line 482 in
>>>>>>>>>>> /home/dknez/software/petsc-3.6.1/src/ksp/pc/interface/precon.c
>>>>>>>>>>> [0]PETSC ERROR: #6 KSP_PCApply() line 242 in
>>>>>>>>>>> /home/dknez/software/petsc-3.6.1/include/petsc/private/kspimpl.h
>>>>>>>>>>> [0]PETSC ERROR: #7 KSPInitialResidual() line 63 in
>>>>>>>>>>> /home/dknez/software/petsc-3.6.1/src/ksp/ksp/interface/itres.c
>>>>>>>>>>> [0]PETSC ERROR: #8 KSPSolve_GMRES() line 235 in
>>>>>>>>>>> /home/dknez/software/petsc-3.6.1/src/ksp/ksp/impls/gmres/gmres.c
>>>>>>>>>>> [0]PETSC ERROR: #9 KSPSolve() line 604 in
>>>>>>>>>>> /home/dknez/software/petsc-3.6.1/src/ksp/ksp/interface/itfunc.c
>>>>>>>>>>> [0]PETSC ERROR: #10 KSPSolve_Chebyshev() line 381 in
>>>>>>>>>>> /home/dknez/software/petsc-3.6.1/src/ksp/ksp/impls/cheby/cheby.c
>>>>>>>>>>> [0]PETSC ERROR: #11 KSPSolve() line 604 in
>>>>>>>>>>> /home/dknez/software/petsc-3.6.1/src/ksp/ksp/interface/itfunc.c
>>>>>>>>>>> [0]PETSC ERROR: #12 PCMGMCycle_Private() line 19 in
>>>>>>>>>>> /home/dknez/software/petsc-3.6.1/src/ksp/pc/impls/mg/mg.c
>>>>>>>>>>> [0]PETSC ERROR: #13 PCMGMCycle_Private() line 48 in
>>>>>>>>>>> /home/dknez/software/petsc-3.6.1/src/ksp/pc/impls/mg/mg.c
>>>>>>>>>>> [0]PETSC ERROR: #14 PCApply_MG() line 338 in
>>>>>>>>>>> /home/dknez/software/petsc-3.6.1/src/ksp/pc/impls/mg/mg.c
>>>>>>>>>>> [0]PETSC ERROR: #15 PCApply() line 482 in
>>>>>>>>>>> /home/dknez/software/petsc-3.6.1/src/ksp/pc/interface/precon.c
>>>>>>>>>>> [0]PETSC ERROR: #16 KSP_PCApply() line 242 in
>>>>>>>>>>> /home/dknez/software/petsc-3.6.1/include/petsc/private/kspimpl.h
>>>>>>>>>>> [0]PETSC ERROR: #17 KSPSolve_CG() line 139 in
>>>>>>>>>>> /home/dknez/software/petsc-3.6.1/src/ksp/ksp/impls/cg/cg.c
>>>>>>>>>>> [0]PETSC ERROR: #18 KSPSolve() line 604 in
>>>>>>>>>>> /home/dknez/software/petsc-3.6.1/src/ksp/ksp/interface/itfunc.c
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> -----------------------------------------------------------------------
>>>>>>>>>>>
>>>>>>>>>>> I saw that there was a thread about this in September (subject:
>>>>>>>>>>> "gamg and zero pivots"), and that the fix is to use "-mg_levels_pc_type
>>>>>>>>>>> jacobi." When I do that, the solve succeeds (I pasted the -ksp_view at the
>>>>>>>>>>> end of this email).
>>>>>>>>>>>
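(For reference, that workaround corresponds to running with options along the lines of the following, where the executable name and the remaining options are placeholders; it replaces the default SOR smoother on the multigrid levels with point Jacobi:)

    mpiexec -n 8 ./fe_solver-opt_real <setup options> -pc_type gamg -mg_levels_pc_type jacobi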
>>>>>>>>>>> So I have two questions about this:
>>>>>>>>>>>
>>>>>>>>>>> 1. Is it surprising that I hit this issue for a 3D elasticity
>>>>>>>>>>> problem? Note that matrix assembly was done in libMesh; I can look into the
>>>>>>>>>>> structure of the assembled matrix more carefully if needed. Also, note
>>>>>>>>>>> that I can solve this problem with direct solvers just fine.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Yes, this seems like a bug, but it could be some strange BC thing
>>>>>>>>>> I do not understand.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> OK, I can look into the matrix in more detail. I agree that it
>>>>>>>>> should have a non-zero diagonal, so I'll have a look at what's happening
>>>>>>>>> with that.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Naively, the elastic element matrix has a nonzero diagonal. I see
>>>>>>>>>> that you are doing LU
>>>>>>>>>> of size 5. That seems strange for 3D elasticity. Am I missing
>>>>>>>>>> something? I would expect
>>>>>>>>>> block size 3.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I'm not sure what is causing the LU of size 5. Is there a setting
>>>>>>>>> to control that?
>>>>>>>>>
>>>>>>>>> Regarding the block size: I set the vector and matrix block size
>>>>>>>>> to 3 via VecSetBlockSize and MatSetBlockSize. I also
>>>>>>>>> used MatNullSpaceCreateRigidBody on a vector with block size of 3, and set
>>>>>>>>> the matrix's near nullspace using that.
>>>>>>>>>
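For illustration, the setup described above corresponds roughly to the fragment below (a hypothetical sketch rather than the actual libMesh/application code; it assumes A is the assembled stiffness matrix and coords is a vector holding the nodal coordinates, three components per node):

    #include <petscmat.h>

    /* Hypothetical helper: attach a rigid-body near null space to a 3D elasticity
       operator. Note that in practice the block size has to be set before the matrix
       is preallocated and assembled. */
    static PetscErrorCode AttachElasticityNearNullSpace(Mat A, Vec coords)
    {
      MatNullSpace   nearnull;
      PetscErrorCode ierr;

      PetscFunctionBeginUser;
      ierr = MatSetBlockSize(A, 3);CHKERRQ(ierr);        /* 3 displacement dofs per node */
      ierr = VecSetBlockSize(coords, 3);CHKERRQ(ierr);   /* coordinates stored as (x,y,z) per node */
      ierr = MatNullSpaceCreateRigidBody(coords, &nearnull);CHKERRQ(ierr); /* 6 rigid-body modes in 3D */
      ierr = MatSetNearNullSpace(A, nearnull);CHKERRQ(ierr); /* GAMG uses this to build coarse spaces */
      ierr = MatNullSpaceDestroy(&nearnull);CHKERRQ(ierr);
      PetscFunctionReturn(0);
    }

The six rigid-body modes are also what make block size 6 on the coarse grids (as mentioned earlier in the thread) the expected result.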
>>>>>>>>
>>>>>>>> Can you run this same example with -mat_no_inode? I think it may be
>>>>>>>> a strange blocking that is causing this.
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> That works. The -ksp_view output is below.
>>>>>>>
>>>>>>
>>>>>>
>>>>>> I just wanted to follow up on this. I had a more careful look at the
>>>>>> matrix, and confirmed that there are no zero entries on the diagonal (as
>>>>>> expected for elasticity). The matrix is from one of libMesh's example
>>>>>> problems: a simple cantilever model using HEX8 elements.
>>>>>>
>>>>>> Do you have any further thoughts about what might cause the "strange
>>>>>> blocking" that you referred to? If there's something non-standard that
>>>>>> libMesh is doing with the blocks, I'd be interested to look into that. I
>>>>>> can send over the matrix if that would be helpful.
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>>
>>>>>>
>>>>> P.S. I was previously calling VecSetBlockSize and MatSetBlockSize to
>>>>> set the block size to 3. When I don't do that, I no longer need to pass
>>>>> -mat_no_inode. I've pasted the -ksp_view output below. Does it look like
>>>>> that's working OK?
>>>>>
>>>>
>>>>
>>>> Sorry for the multiple messages, but I think I found the issue. libMesh
>>>> internally sets the block size to 1 earlier on (in PetscMatrix::init()). I
>>>> guess it'll work fine if I get it to set the block size to 3 instead, so
>>>> I'll look into that. (libMesh has an enable-blocked-storage configure
>>>> option that should take care of this automatically.)
>>>>
>>>> David
>>>>
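(As a quick sanity check once libMesh sets the block size, one could query the assembled system matrix directly; a hypothetical fragment, assuming A is the system matrix:)

    PetscInt       bs;
    PetscErrorCode ierr;
    ierr = MatGetBlockSize(A, &bs);CHKERRQ(ierr);
    ierr = PetscPrintf(PETSC_COMM_WORLD, "system matrix block size = %D\n", bs);CHKERRQ(ierr); /* expect 3 */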
>>>>
>>>>
>>>
>>>
>>
>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>