From aldo.bonfiglioli at unibas.it Mon Dec 1 07:22:21 2025 From: aldo.bonfiglioli at unibas.it (Aldo Bonfiglioli) Date: Mon, 1 Dec 2025 14:22:21 +0100 Subject: [petsc-users] Trouble when viewing a subDM in vtk format Message-ID: <0732cd7a-6ea5-4720-acf8-6ce8a314136d@unibas.it> Dear developers, I wrote a code that extracts subDMs corresponding to the various strata in the Face Sets. I run into troubles when I view a subDM or a Vec attached to the subDM using the VTK format. More precisely, the problem only occurs on more than one processor when the rank=0 processor has no points on a given subDM. For instance, when the attached reproducer is run on 2 procs, the u_01.vtu file (global u Vec mapped to the subDM corresponding to stratum=1) only includes the header, but no data. All other u_0?.vtu files can successfully be loaded and viewed in paraview. The problem does NOT arise when I view the same objects in HDF5 format. However, my problem in using the HDF5 lies in the fact that: while the hdf5 file obtained with DMView can be post-processed with "petsc/lib/petsc/bin/petsc_gen_xdmf.py" to create a xmf file readable by paraview I do not know how to view the field(s) associated with the DM when the hdf5 file is obtained from VecView. The reproducer compiles with the latest petsc release. Thanks, Aldo -- Dr. Aldo Bonfiglioli Associate professor of Fluid Mechanics Dipartimento di Ingegneria Universita' della Basilicata V.le dell'Ateneo Lucano, 10 85100 Potenza ITALY tel:+39.0971.205203 fax:+39.0971.205215 web: https://urldefense.us/v3/__http://docenti.unibas.it/site/home/docente.html?m=002423__;!!G_uCfscf7eWS!eXE2csxUb1J5bnRosf9LIGxLs17TdstMxQcGs0mbXFroRjzLN81jVmeAWfMr41X8JGEuVm286lVTmDgmi5s1zfsfegJNuWVVWTM$ -------------- next part -------------- A non-text attachment was scrubbed... Name: rgmsh.F90 Type: text/x-fortran Size: 26554 bytes Desc: not available URL: -------------- next part -------------- -dm_plex_dim 3 -dm_plex_shape box -dm_plex_box_faces 20,20,20 -dm_plex_box_lower 0.,0.,0. -dm_plex_box_upper 1.,1.,1. 
##-dm_plex_filename cube6.msh ##-dm_plex_simplex false -dm_plex_simplex true -dm_plex_interpolate ##-dm_plex_check_all ##-dm_plex_filename /home/abonfi/grids/3D/MASA_ns3d/unnested/cube1/cube1.msh # # read a solution from an existing file # #-viewer_type hdf5 #-viewer_format vtk #-viewer_filename /home/abonfi/testcases/3D/scalar/advdiff/hiro/DMPlex/Re100/cube0/u.h5 # ##-viewer_binary_filename /home/abonfi/testcases/3D/scalar/advdiff/hiro/DMPlex/Re100/cube1/sol.bin # # and write it to u.* # ###-vec_view hdf5:u.h5 -vec_view vtk:u.vtu ## ## I can read both mesh.h5 and mesh.vtu in paraview ## -dm_view hdf5:mesh.h5 ##-dm_view vtk:mesh.vtu # # uncomment the following to write each boundary patch separately in an HDF5 file # works both serial and parallel # -patch_01_dm_view hdf5:patch_01.h5 -patch_02_dm_view hdf5:patch_02.h5 -patch_03_dm_view hdf5:patch_03.h5 -patch_04_dm_view hdf5:patch_04.h5 -patch_05_dm_view hdf5:patch_05.h5 -patch_06_dm_view hdf5:patch_06.h5 # # uncomment the following to write each boundary patch separately in a VTK file # it work on one processor, but fails whenever the rank=0 processor has no points # on a given submesh # # #-patch_01_dm_view vtk:patch_01.vtu #-patch_02_dm_view vtk:patch_02.vtu #-patch_03_dm_view vtk:patch_03.vtu #-patch_04_dm_view vtk:patch_04.vtu #-patch_05_dm_view vtk:patch_05.vtu #-patch_06_dm_view vtk:patch_06.vtu # # # ##-dm_plex_view_labels "marker" ##-dm_plex_view_labels "Face Sets" -petscpartitioner_view ####-dm_petscsection_view -options_left From knepley at gmail.com Tue Dec 2 08:02:18 2025 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 2 Dec 2025 09:02:18 -0500 Subject: [petsc-users] Trouble when viewing a subDM in vtk format In-Reply-To: <0732cd7a-6ea5-4720-acf8-6ce8a314136d@unibas.it> References: <0732cd7a-6ea5-4720-acf8-6ce8a314136d@unibas.it> Message-ID: On Mon, Dec 1, 2025 at 8:22?AM Aldo Bonfiglioli wrote: > Dear developers, > > I wrote a code that extracts subDMs corresponding to the various strata > in the Face Sets. > > I run into troubles when I view a subDM or a Vec attached to the subDM > using the VTK format. > > More precisely, the problem only occurs on more than one processor when > the rank=0 processor has no points on a given subDM. > > For instance, when the attached reproducer is run on 2 procs, the > u_01.vtu file (global u Vec mapped to the subDM corresponding to stratum=1) > > only includes the header, but no data. All other u_0?.vtu files can > successfully be loaded and viewed in paraview. > > The problem does NOT arise when I view the same objects in HDF5 format. > > However, my problem in using the HDF5 lies in the fact that: > > while the hdf5 file obtained with DMView can be post-processed with > "petsc/lib/petsc/bin/petsc_gen_xdmf.py" to create a xmf file readable by > paraview > Hi Aldo, Sorry about this, I would like to make it more intuitive. First, the solution (I think) -dm_plex_view_hdf5_storage_version 1.1.0 will write the Viz field by default, so that PAraview will see it. Can you try this? Why do we need this? I have now made version-controlled output formats. There is something about this in the manual, but not enough. Paraview only supports vertex-based fields and cell-based fields (at least that I understand), so we need to write a separate copy of the field (since Plex supports any layout). Lots of people do not want a separate copy, since they are checkpointing, so we control this with a format (PETSC_VIEWER_HDF5_VIZ). 
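A minimal C sketch of the per-output route, assuming a DM named dm and a Vec named u already exist (the helper name WriteForParaview is illustrative; the -dm_plex_view_hdf5_storage_version 1.1.0 option mentioned above achieves the same thing automatically):

#include <petscdmplex.h>
#include <petscviewerhdf5.h>

/* Sketch: write dm and u to one HDF5 file, pushing the visualization
   format so an extra Paraview-friendly copy of the field is written.
   Assumes dm and u exist; error handling via PetscCall(). */
static PetscErrorCode WriteForParaview(DM dm, Vec u, const char fname[])
{
  PetscViewer viewer;

  PetscFunctionBeginUser;
  PetscCall(PetscViewerHDF5Open(PetscObjectComm((PetscObject)dm), fname, FILE_MODE_WRITE, &viewer));
  PetscCall(DMView(dm, viewer));                                   /* topology + coordinates */
  PetscCall(PetscViewerPushFormat(viewer, PETSC_VIEWER_HDF5_VIZ)); /* request the extra viz copy */
  PetscCall(VecView(u, viewer));                                   /* field data */
  PetscCall(PetscViewerPopFormat(viewer));
  PetscCall(PetscViewerDestroy(&viewer));
  PetscFunctionReturn(PETSC_SUCCESS);
}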
You can pass this for specific output, or use the format version that does it automatically. Let me know if this works. Thanks, Matt > I do not know how to view the field(s) associated with the DM when the > hdf5 file is obtained from VecView. > > The reproducer compiles with the latest petsc release. > > Thanks, > > Aldo > > -- > Dr. Aldo Bonfiglioli > Associate professor of Fluid Mechanics > Dipartimento di Ingegneria > Universita' della Basilicata > V.le dell'Ateneo Lucano, 10 85100 Potenza ITALY > tel:+39.0971.205203 fax:+39.0971.205215 > web: > https://urldefense.us/v3/__http://docenti.unibas.it/site/home/docente.html?m=002423__;!!G_uCfscf7eWS!eXE2csxUb1J5bnRosf9LIGxLs17TdstMxQcGs0mbXFroRjzLN81jVmeAWfMr41X8JGEuVm286lVTmDgmi5s1zfsfegJNuWVVWTM$ > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!f1RvXE2KMeUsh5sgUBZzIIBhluwlYswPKT9rJ68dK6QgBnEwc3sSJnMK0IaiZzswk8NZNJjy-jh0mT2gw3zP$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From Elena.Moral.Sanchez at ipp.mpg.de Tue Dec 2 10:53:22 2025 From: Elena.Moral.Sanchez at ipp.mpg.de (Moral Sanchez, Elena) Date: Tue, 2 Dec 2025 16:53:22 +0000 Subject: [petsc-users] error setting the type of the TAO solver Message-ID: <0e75a579f52348bb9eee6d26636885c1@ipp.mpg.de> Hi, I am trying to initialize a LCL TAO solver with petsc4py: from petsc4py import PETSc solver = PETSc.TAO().create() solver.setType(PETSc.TAO.Type.LCL) The last line throws the following error: Traceback (most recent call last): File "", line 3, in File "petsc4py/PETSc/TAO.pyx", line 183, in petsc4py.PETSc.TAO.setType petsc4py.PETSc.Error: error code 86 [0] TaoSetType() at /petsc/src/tao/interface/taosolver.c:2164 [0] Unknown type. Check for miss-spelling or missing package: https://urldefense.us/v3/__https://petsc.org/release/install/install/*external-packages__;Iw!!G_uCfscf7eWS!bi3UN8Pwci-Vryovl2zHhUj6yCPxh-3xwyOp74MnoU6mnVpJN8twrV3OQEGKWOU6UtghBOlXVbBW_TAta4L0NMGih55H4vncwyyG$ [0] Unable to find requested Tao type lcl However, hasattr(solver.Type(), 'LCL') returns True. The same happens with any other PETSc.TAO.Type. What am I missing here? Cheers, Elena -------------- next part -------------- An HTML attachment was scrubbed... URL: From liufield at gmail.com Wed Dec 3 15:00:34 2025 From: liufield at gmail.com (neil liu) Date: Wed, 3 Dec 2025 16:00:34 -0500 Subject: [petsc-users] Questions about memory usage in Petsc Message-ID: Dear users and developers, I am recently running a large system from Nedelec element, 14 million dofs (complex number). A little confused about the memory there. Then I tried a small system (34,000 dofs) to see the memory usage. It was solved with MUMPS with 1 rank. Then I used PetscMemoryGetCurrentUsage() to show the memory used there. The pseudocode is PetscMemoryGetCurrentUsage (*Memory 1:* 64.237M) KSPset KSPsolve (INFOG(18) (size of all MUMPS internal data allocated during factorization: value on the most memory consuming processor): *Memory 2:* 408 MB) PetscMemoryGetCurrentUsage (*Memory 3: *54.307M) [0] Maximum memory PetscMalloc()ed 49.45MB maximum size of entire process 424MB (*Memory 4:* 54.307M) The following is my understanding, please correct me if I am wrong, It seems the difference between Memory 1 and 3 is approximately the size of 30 Krylov vectors (complex). 
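For reference, a compilable version of the measurement pattern in the pseudocode above, combining the OS-level query with PetscMallocGetCurrentUsage for comparison (the helper name and report format are illustrative):

#include <petscsys.h>

/* Sketch: report both OS-level and PetscMalloc-level memory at a named
   point in the run.  ReportMemory is an illustrative helper name. */
static PetscErrorCode ReportMemory(const char stage[])
{
  PetscLogDouble rss, mal;

  PetscFunctionBeginUser;
  PetscCall(PetscMemoryGetCurrentUsage(&rss)); /* resident size reported by the OS (approximate) */
  PetscCall(PetscMallocGetCurrentUsage(&mal)); /* bytes currently allocated by PetscMalloc() */
  PetscCall(PetscPrintf(PETSC_COMM_WORLD, "%s: process %.1f MB, PetscMalloc %.1f MB\n", stage, rss / 1048576.0, mal / 1048576.0));
  PetscFunctionReturn(PETSC_SUCCESS);
}

/* call ReportMemory("before KSPSolve") and ReportMemory("after KSPSolve")
   around the solve to see how usage scales with problem size */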
It seems Memory 4 is not the summation of Memory 2 and 3; but on the same order of magnitude. It is a little confusing here. Thanks, Xiaodong -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Dec 3 15:18:45 2025 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 3 Dec 2025 16:18:45 -0500 Subject: [petsc-users] Questions about memory usage in Petsc In-Reply-To: References: Message-ID: <9EC264D4-C478-48CB-9FB1-96A4A2ED7974@petsc.dev> Unix/Linux has never had a good API for tracking process memory usage. PetscMemoryGetCurrentUsage() gets what it can from the OS, but the exact number should not be considered a true measure of process memory usage at that point in time. Jumps up and down are not accurate measures of changes in memory usage. PetscMallocGetCurrentUsage() and the number from MUMPS are (assuming no bugs in our code and MUMPS counting space) accurate values of memory usage. You should use these to see how memory usage is scaling with your problem size. Barry > On Dec 3, 2025, at 4:00?PM, neil liu wrote: > > Dear users and developers, > > I am recently running a large system from Nedelec element, 14 million dofs (complex number). > A little confused about the memory there. Then I tried a small system (34,000 dofs) to see the memory usage. It was solved with MUMPS with 1 rank. > Then I used PetscMemoryGetCurrentUsage() to show the memory used there. > The pseudocode is > PetscMemoryGetCurrentUsage (Memory 1: 64.237M) > KSPset > KSPsolve (INFOG(18) (size of all MUMPS internal data allocated during factorization: value on the most memory consuming processor): Memory 2: 408 MB) > PetscMemoryGetCurrentUsage (Memory 3: 54.307M) > > [0] Maximum memory PetscMalloc()ed 49.45MB maximum size of entire process 424MB (Memory 4: 54.307M) > The following is my understanding, please correct me if I am wrong, > It seems the difference between Memory 1 and 3 is approximately the size of 30 Krylov vectors (complex). > It seems Memory 4 is not the summation of Memory 2 and 3; but on the same order of magnitude. It is a little confusing here. > > Thanks, > Xiaodong > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From liufield at gmail.com Wed Dec 3 17:32:47 2025 From: liufield at gmail.com (neil liu) Date: Wed, 3 Dec 2025 18:32:47 -0500 Subject: [petsc-users] Questions about memory usage in Petsc In-Reply-To: <9EC264D4-C478-48CB-9FB1-96A4A2ED7974@petsc.dev> References: <9EC264D4-C478-48CB-9FB1-96A4A2ED7974@petsc.dev> Message-ID: Thanks a lot for this advice. Will do. On Wed, Dec 3, 2025 at 4:18?PM Barry Smith wrote: > > Unix/Linux has never had a good API for tracking process memory usage. PetscMemoryGetCurrentUsage() > gets what it can from the OS, but the exact number should not be considered > a true measure of process memory usage at that point in time. Jumps up and > down are not accurate measures of changes in memory usage. > > PetscMallocGetCurrentUsage() and the number from MUMPS are (assuming no > bugs in our code and MUMPS counting space) accurate values of memory usage. > You should use these to see how memory usage is scaling with your problem > size. > > Barry > > > On Dec 3, 2025, at 4:00?PM, neil liu wrote: > > Dear users and developers, > > I am recently running a large system from Nedelec element, 14 million dofs > (complex number). > A little confused about the memory there. Then I tried a small system > (34,000 dofs) to see the memory usage. It was solved with MUMPS with 1 > rank. 
> Then I used PetscMemoryGetCurrentUsage() to show the memory used there. > The pseudocode is > PetscMemoryGetCurrentUsage (*Memory 1:* 64.237M) > KSPset > KSPsolve (INFOG(18) (size of all MUMPS internal data allocated during > factorization: value on the most memory consuming processor): *Memory 2:* 408 > MB) > PetscMemoryGetCurrentUsage (*Memory 3: *54.307M) > > [0] Maximum memory PetscMalloc()ed 49.45MB maximum size of entire process > 424MB (*Memory 4:* 54.307M) > The following is my understanding, please correct me if I am wrong, > It seems the difference between Memory 1 and 3 is approximately the size > of 30 Krylov vectors (complex). > It seems Memory 4 is not the summation of Memory 2 and 3; but on the same > order of magnitude. It is a little confusing here. > > Thanks, > Xiaodong > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From simon.wiesheier at gmail.com Thu Dec 4 02:03:22 2025 From: simon.wiesheier at gmail.com (Simon Wiesheier) Date: Thu, 4 Dec 2025 09:03:22 +0100 Subject: [petsc-users] TAO PDIPM handling of objective evaluation failures (NaN / PDE non-convergence) Message-ID: Dear PETSc developers and users, I am considering using TAO?s Primal-Dual Interior-Point Method (PDIPM) for a constrained optimization problem in solid mechanics. The objective involves solving a nonlinear PDE (hyperelasticity) for each parameter vector, and for some parameter combinations the PDE solver may fail to converge or produce non-physical states. With MATLAB?s fmincon, it is possible to signal such failures by returning NaN/Inf for the objective, and the solver will then backtrack or try a different step without crashing. My questions are: 1. How does TAO?s PDIPM handle cases where the user objective or gradient callback returns NaN/Inf (e.g., due to PDE solver failure)? 2. Is there a recommended way in TAO/PETSc to gracefully signal an evaluation failure (like ?bad point in parameter space?) so that the algorithm can back off and try a smaller step, instead of aborting? 3. If the recommended pattern is *not* to return NaNs, what is the best practice in TAO for such PDE-constrained problems? Any guidance on how TAO/PDIPM is intended to behave in the presence of evaluation failures would be greatly appreciated. Best regards, Simon -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Thu Dec 4 09:31:55 2025 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 4 Dec 2025 10:31:55 -0500 Subject: [petsc-users] TAO PDIPM handling of objective evaluation failures (NaN / PDE non-convergence) In-Reply-To: References: Message-ID: Simon, Thanks for the question. We would love to have such functionality in TAO, but we do not currently have it. For SNES, we provide SNESSetFunctionDomainError() and SNESSetJacobianDomainError() which can be called (on any subset of the MPI processes in the MPI communicator) to indicate such a domain error. Currently, SNES simply checks this way (in a collective, friendly manner without extra communication) and returns a clean SNES_DIVERGED_FUNCTION_DOMAIN via the SNESConvergedReason. We would like at least some of the SNES solvers to handle this better, as you suggest, by backtracking to find a valid point in the domain. For TS we provide TSSetFunctionDomainError(), which allows the user to pass in a function to check if a point is in the domain. It is currently unused. I think it should use the same approach as SNES. 
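As a concrete illustration of the SNES mechanism described above, a residual callback can flag a bad state along these lines (the physics check is a placeholder; the point is the call to SNESSetFunctionDomainError and the clean return):

#include <petscsnes.h>

/* Sketch: instead of returning NaN/Inf, flag a domain error and return
   normally; SNESSolve then reports SNES_DIVERGED_FUNCTION_DOMAIN through
   SNESGetConvergedReason().  The state check itself is a placeholder. */
static PetscErrorCode FormFunction(SNES snes, Vec x, Vec f, void *ctx)
{
  PetscBool bad = PETSC_FALSE;

  PetscFunctionBeginUser;
  /* ... evaluate the residual into f; set bad = PETSC_TRUE on a non-physical state ... */
  if (bad) PetscCall(SNESSetFunctionDomainError(snes)); /* may be called on any subset of ranks */
  PetscFunctionReturn(PETSC_SUCCESS); /* return normally; do not raise an error */
}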
So for TAO, I envision a similar TaoSetFunctionDomainError(), TaoSetJacobianDomainError(), and TaoSetHessianDomainError(), which would allow particular TAO solvers to "back off" but continue running as you request. Would you be interested in collaborating with us on adding such support to Tao? In particular, focusing exactly on the Tao solver algorithm you are using? We don't currently have PETSc developers focusing on Tao, so can only make progress on it by actively collaborating with others who need new/improved functionality. Barry > On Dec 4, 2025, at 3:03?AM, Simon Wiesheier wrote: > > Dear PETSc developers and users, > I am considering using TAO?s Primal-Dual Interior-Point Method (PDIPM) for a constrained optimization problem in solid mechanics. The objective involves solving a nonlinear PDE (hyperelasticity) for each parameter vector, and for some parameter combinations the PDE solver may fail to converge or produce non-physical states. > > With MATLAB?s fmincon, it is possible to signal such failures by returning NaN/Inf for the objective, and the solver will then backtrack or try a different step without crashing. > > My questions are: > > How does TAO?s PDIPM handle cases where the user objective or gradient callback returns NaN/Inf (e.g., due to PDE solver failure)? > > Is there a recommended way in TAO/PETSc to gracefully signal an evaluation failure (like ?bad point in parameter space?) so that the algorithm can back off and try a smaller step, instead of aborting? > > If the recommended pattern is not to return NaNs, what is the best practice in TAO for such PDE-constrained problems? > > Any guidance on how TAO/PDIPM is intended to behave in the presence of evaluation failures would be greatly appreciated. > > Best regards, > > Simon > -------------- next part -------------- An HTML attachment was scrubbed... URL: From simon.wiesheier at gmail.com Fri Dec 5 10:50:28 2025 From: simon.wiesheier at gmail.com (Simon Wiesheier) Date: Fri, 5 Dec 2025 17:50:28 +0100 Subject: [petsc-users] TAO PDIPM handling of objective evaluation failures (NaN / PDE non-convergence) In-Reply-To: References: Message-ID: Thank you very much for the detailed explanation. I would in principle be interested in collaborating on adding such functionality to TAO, especially since robust handling of domain errors would be very valuable for PDE-constrained inverse problems. That said, my experience with PETSc is mostly through petsc4py and some interfacing from deal.II, so I am not very familiar with the internals of TAO or the C-level API. I would therefore need some guidance on where such functionality should live inside the TAO infrastructure, and how similar mechanisms are implemented in SNES or TS. If you can point me to the relevant parts of the TAO codebase and outline the expected design (e.g., how the domain-error flag should propagate through the solver), I would be happy to explore how I could contribute. Thanks again for your support and for considering this extension to TAO. Best, Simon Am Do., 4. Dez. 2025 um 16:32 Uhr schrieb Barry Smith : > Simon, > > Thanks for the question. We would love to have such functionality in > TAO, but we do not currently have it. > > For SNES, we provide SNESSetFunctionDomainError() and > SNESSetJacobianDomainError() which can be called (on any subset of the MPI > processes in the MPI communicator) to indicate such a domain error. 
> Currently, SNES simply checks this way (in a collective, friendly manner > without extra communication) and returns a clean > SNES_DIVERGED_FUNCTION_DOMAIN via the SNESConvergedReason. We would like > at least some of the SNES solvers to handle this better, as you suggest, by > backtracking to find a valid point in the domain. > > For TS we provide TSSetFunctionDomainError(), which allows the user to > pass in a function to check if a point is in the domain. It is currently > unused. I think it should use the same approach as SNES. > > So for TAO, I envision a similar TaoSetFunctionDomainError(), > TaoSetJacobianDomainError(), and TaoSetHessianDomainError(), which would > allow particular TAO solvers to "back off" but continue running as you > request. > > Would you be interested in collaborating with us on adding such support > to Tao? In particular, focusing exactly on the Tao solver algorithm you are > using? We don't currently have PETSc developers focusing on Tao, so can > only make progress on it by actively collaborating with others who need > new/improved functionality. > > > Barry > > > On Dec 4, 2025, at 3:03?AM, Simon Wiesheier > wrote: > > Dear PETSc developers and users, > > I am considering using TAO?s Primal-Dual Interior-Point Method (PDIPM) for > a constrained optimization problem in solid mechanics. The objective > involves solving a nonlinear PDE (hyperelasticity) for each parameter > vector, and for some parameter combinations the PDE solver may fail to > converge or produce non-physical states. > > With MATLAB?s fmincon, it is possible to signal such failures by returning > NaN/Inf for the objective, and the solver will then backtrack or try a > different step without crashing. > > My questions are: > > 1. > > How does TAO?s PDIPM handle cases where the user objective or gradient > callback returns NaN/Inf (e.g., due to PDE solver failure)? > 2. > > Is there a recommended way in TAO/PETSc to gracefully signal an > evaluation failure (like ?bad point in parameter space?) so that the > algorithm can back off and try a smaller step, instead of aborting? > 3. > > If the recommended pattern is *not* to return NaNs, what is the best > practice in TAO for such PDE-constrained problems? > > Any guidance on how TAO/PDIPM is intended to behave in the presence of > evaluation failures would be greatly appreciated. > > Best regards, > > Simon > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From zhaowenbo.npic at gmail.com Sun Dec 7 22:03:37 2025 From: zhaowenbo.npic at gmail.com (Wenbo Zhao) Date: Mon, 8 Dec 2025 12:03:37 +0800 Subject: [petsc-users] Petsc veccuda device to host copy Message-ID: Hi, we are using petsc's veccuda and found that the data in the host array obtained via VecGetArrayRead is partially updated sometime. Vec vgpu, vcpu; iterations: // ksp solve a * vgpu=b const PetscScalar * agpu; PetscScalar * acpu; VecGetArrayRead(vgpu, &agpu); VecGetArray(vcpu, &acpu); PetscArraycpy (acpu, agpu,size); // check updating std::cout << agpu[0] << agpu [size-1]< From bsmith at petsc.dev Sun Dec 7 22:35:26 2025 From: bsmith at petsc.dev (Barry Smith) Date: Sun, 7 Dec 2025 23:35:26 -0500 Subject: [petsc-users] Petsc veccuda device to host copy In-Reply-To: References: Message-ID: I am sorry to hear you are having difficulties. Please send a full reproducer so we can track down the problem using the latest PETSc release. Software changes very rapidly for GPUs so we cannot support or debug PETSc 3.21.1, which is a couple of years old. 
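For reference, the snippet in the report above was truncated by the archiver; the intended device-to-host copy presumably follows the sketch below. In particular, every VecGetArray/VecGetArrayRead needs a matching restore call, which the truncated snippet does not show (the helper name is illustrative, and this is not claimed to be the cause of the reported behavior):

#include <petscvec.h>

/* Sketch of the device-to-host copy described above.  Assumes vgpu is a
   VECCUDA vector and vcpu a host vector with the same local size. */
static PetscErrorCode CopyDeviceToHost(Vec vgpu, Vec vcpu)
{
  const PetscScalar *agpu;
  PetscScalar       *acpu;
  PetscInt           n;

  PetscFunctionBeginUser;
  PetscCall(VecGetLocalSize(vgpu, &n));
  PetscCall(VecGetArrayRead(vgpu, &agpu)); /* should synchronize device data to the host */
  PetscCall(VecGetArray(vcpu, &acpu));
  PetscCall(PetscArraycpy(acpu, agpu, n));
  PetscCall(VecRestoreArray(vcpu, &acpu));
  PetscCall(VecRestoreArrayRead(vgpu, &agpu)); /* pair every get with a restore */
  PetscFunctionReturn(PETSC_SUCCESS);
}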
But if the problem persists in 3.24, we will definitely track it down if you provide a reproducer. Barry > On Dec 7, 2025, at 11:03?PM, Wenbo Zhao wrote: > > Hi, > > we are using petsc's veccuda and found that the data in the host array obtained via VecGetArrayRead is partially updated sometime. > > > > > Vec vgpu, vcpu; > > iterations: > > // ksp solve a * vgpu=b > > const PetscScalar * agpu; > > PetscScalar * acpu; > > VecGetArrayRead(vgpu, &agpu); > > VecGetArray(vcpu, &acpu); > > > PetscArraycpy (acpu, agpu,size); > > // check updating > > std::cout << agpu[0] << agpu [size-1]< > // we found that agpu[0] is last iterations value, agpu[size-1] updated from device value, randomly > > // use acpu values to update matrix a .... > > > > > > > > Petsc 3.21.1 is used. And manual said, > > For vectors that may also have array data in GPU memory, for example, VECCUDA, this call ensures the CPU array has the most recent array values by copying the data from the GPU memory if needed. > > > > > > Best wishes, > > Wenbo > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dontbugthedevs at proton.me Wed Dec 10 07:14:06 2025 From: dontbugthedevs at proton.me (Noam T.) Date: Wed, 10 Dec 2025 13:14:06 +0000 Subject: [petsc-users] Use of flag dm_plex_high_order_view Message-ID: Hello, In an old question about obtaining the node connectivity of a cell with a high order approximation space, the use of the flag "-dm_plex_high_order_view" for visualization purposes was brought up, as a way to refine the grid down to linear space. Link: https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/blob/main/src/dm/impls/plex/plex.c?ref_type=heads*L2021__;Iw!!G_uCfscf7eWS!ZZVaar4OGVdTXOAAjkEadcw-CAZUpAAzibpD_2SKpRupYG4oxSH-W7u5necsgTYDii6v-1kjfV4aqsllxL9XI_hK2YITfNz-$ I'm trying to make use of this flag to see the refinement, but I see no difference with higher order approximations. Perhaps I am misunderstanding its use? I thought that by using it, one could see a "subdivision" of each element. Say, a single triangle, FE approximation space order 2 (3 corner nodes, 3 mid-edge nodes), would be refined into e.g. 4 linear triangles. The code (Fortran) looks something along these lines: ! Create a DM from the mesh file DM :: cdm PetscInt :: K, cdim, fedim PetscBool :: simplex PetscFE :: cfe PetscViewer :: viewer PetscErrorCode :: ierr PetsCallA(DMSetFromOptions(cdm, ierr)) PetsCallA(DMGetDimension(cdm, cdim, ierr)) PetsCallA(DMPlexSimplex(cdm, simplex, ierr)) PetsCallA(PetscFECreateDefault(PETSC_COMM_WORLD, cdim, cdim, simplex, "ho_", K, cfe, ierr)) PetsCallA(PetscFEGetDimension(cfe, fedim, ierr)) PetsCallA(PetscViewerCreate(PETSC_COMM_WORLD, viewer, ierr)) PetsCallA(PetscViewerDrawOpen(PETSC_COMM_WORLD, PETSC_NULL_CHARACTER, PETSC_NULL_CHARACTER, 100, 100, 800, 800, viewer, ierr)) PetsCallA(PetscViewerSetFromOptions(viewer, ierr)) ! also using flag -draw_pause 5 PetsCallA(DMView(cdm, viewer, ierr)) [...] With the following list of flags: -ho_petscspace_degree K -ho_petscdualspace_lagrange_node_type equispaced -ho_petscdualspace_lagrange_node_endpoints 1 -dm_plex_high_order_view -options_left Using "-options_left" shows that "there are no unused options"; so "-dm_plex_high_order_view" is used somehow; it is at least required for the call to "DMPlexCreateHighOrderSurrogate_Internal" within the "draw" functions. >From the CoordinateDM I can see the additional nodes created for the higher order approximation (e.g. mid-edge, mid-face nodes), so it seems the FE space is correct. 
Regardless of the order "K", the mesh plotted with "DMView" is always the same, corresponding to the linear case. Thank you. Noam -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Dec 11 11:00:48 2025 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 11 Dec 2025 12:00:48 -0500 Subject: [petsc-users] Use of flag dm_plex_high_order_view In-Reply-To: References: Message-ID: On Wed, Dec 10, 2025 at 8:14?AM Noam T. via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hello, > > In an old question about obtaining the node connectivity of a cell with a > high order approximation space, > This is the misunderstanding here. What I implemented was visualization for meshes with high order _coordinate_ spaces. I am "guaranteed" that this will work because I know that coordinates are discretized with Lagrange spaces. I did not do this for approximation spaces because I can't guarantee this. We could do the same thing for approximation spaces, and I would be happy to help you factor it out. In general, I would need to figure out the smallest DG space containing the approximation space (don't know how to do yet, but possible), then project to that first and start the descent as before. Thanks, Matt > the use of the flag "-dm_plex_high_order_view" for visualization purposes > was brought up, as a way to refine the grid down to linear space. > > Link: > https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/blob/main/src/dm/impls/plex/plex.c?ref_type=heads*L2021__;Iw!!G_uCfscf7eWS!ekDGozJlgGi9X5H88PnWzpu8atTA6FPYDxbKBZbEfCAjP3O-HlOcRx5bJdDfiII0H3rwmPSpEjZTGy6TVz1M$ > > > I'm trying to make use of this flag to see the refinement, but I see no > difference with higher order approximations. Perhaps I am misunderstanding > its use? I thought that by using it, one could see a "subdivision" of each > element. Say, a single triangle, FE approximation space order 2 (3 corner > nodes, 3 mid-edge nodes), would be refined into e.g. 4 linear triangles. > > The code (Fortran) looks something along these lines: > > ! Create a DM from the mesh file > > DM :: cdm > PetscInt :: K, cdim, fedim > PetscBool :: simplex > PetscFE :: cfe > PetscViewer :: viewer > PetscErrorCode :: ierr > > PetsCallA(DMSetFromOptions(cdm, ierr)) > PetsCallA(DMGetDimension(cdm, cdim, ierr)) > PetsCallA(DMPlexSimplex(cdm, simplex, ierr)) > PetsCallA(PetscFECreateDefault(PETSC_COMM_WORLD, cdim, cdim, simplex, > "ho_", K, cfe, ierr)) > PetsCallA(PetscFEGetDimension(cfe, fedim, ierr)) > > PetsCallA(PetscViewerCreate(PETSC_COMM_WORLD, viewer, ierr)) > PetsCallA(PetscViewerDrawOpen(PETSC_COMM_WORLD, PETSC_NULL_CHARACTER, > PETSC_NULL_CHARACTER, 100, 100, 800, 800, viewer, ierr)) > PetsCallA(PetscViewerSetFromOptions(viewer, ierr)) ! also using flag > -draw_pause 5 > PetsCallA(DMView(cdm, viewer, ierr)) > [...] > > > With the following list of flags: > > -ho_petscspace_degree K > -ho_petscdualspace_lagrange_node_type equispaced > -ho_petscdualspace_lagrange_node_endpoints 1 > -dm_plex_high_order_view > -options_left > > Using "-options_left" shows that "there are no unused options"; so > "-dm_plex_high_order_view" is used somehow; it is at least required for the > call to "DMPlexCreateHighOrderSurrogate_Internal" within the "draw" > functions. > > From the CoordinateDM I can see the additional nodes created for the > higher order approximation (e.g. mid-edge, mid-face nodes), so it seems the > FE space is correct. 
> > Regardless of the order "K", the mesh plotted with "DMView" is always the > same, corresponding to the linear case. > > Thank you. > > Noam > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!ekDGozJlgGi9X5H88PnWzpu8atTA6FPYDxbKBZbEfCAjP3O-HlOcRx5bJdDfiII0H3rwmPSpEjZTG4wF9UiD$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From aldo.bonfiglioli at unibas.it Mon Dec 15 02:43:45 2025 From: aldo.bonfiglioli at unibas.it (Aldo Bonfiglioli) Date: Mon, 15 Dec 2025 09:43:45 +0100 Subject: [petsc-users] Trouble when viewing a subDM in vtk format In-Reply-To: References: <0732cd7a-6ea5-4720-acf8-6ce8a314136d@unibas.it> Message-ID: <8c7c0794-3ede-4aa5-a515-03af3187fb4c@unibas.it> On 12/2/25 15:02, Matthew Knepley wrote: > On Mon, Dec 1, 2025 at 8:22?AM Aldo Bonfiglioli > wrote: > > Dear developers, > > I wrote a code that extracts subDMs corresponding to the various > strata > in the Face Sets. > > I run into troubles when I view a subDM or a Vec attached to the > subDM > using the VTK format. > > More precisely, the problem only occurs on more than one processor > when > the rank=0 processor has no points on a given subDM. > > For instance, when the attached reproducer is run on 2 procs, the > u_01.vtu file (global u Vec mapped to the subDM corresponding to > stratum=1) > > only includes the header, but no data. All other u_0?.vtu files can > successfully be loaded and viewed in paraview. > > The problem does NOT arise when I view the same objects in HDF5 > format. > > However, my problem in using the HDF5 lies in the fact that: > > while the hdf5 file obtained with DMView can be post-processed with > "petsc/lib/petsc/bin/petsc_gen_xdmf.py" to create a xmf file > readable by > paraview > > > Hi Aldo, > > Sorry about this, I would like to make it more intuitive. First, the > solution (I think) > > ??-dm_plex_view_hdf5_storage_version 1.1.0 > > will write the Viz field by default, so that PAraview will see it. Can > you try this? > > Why do we need this? I have now made version-controlled output > formats. There is something about > this in the manual, but not enough. Paraview only supports > vertex-based fields and cell-based fields > (at least that I understand), so we need to write a separate copy of > the field (since Plex supports any > layout). Lots of people do not want a separate copy, since they are > checkpointing, so we control this > with a format (PETSC_VIEWER_HDF5_VIZ). You can pass this for specific > output, or use the format > version that does it automatically. > > Let me know if this works. > > ? Thanks, > > ? ? ?Matt > > I do not know how to view the field(s) associated with the DM when > the > hdf5 file is obtained from VecView. > > The reproducer compiles with the latest petsc release. > > Thanks, > > Aldo > > -- > Dr. 
Aldo Bonfiglioli > Associate professor of Fluid Mechanics > Dipartimento di Ingegneria > Universita' della Basilicata > V.le dell'Ateneo Lucano, 10 85100 Potenza ITALY > tel:+39.0971.205203 fax:+39.0971.205215 > web: > https://urldefense.us/v3/__http://docenti.unibas.it/site/home/docente.html?m=002423__;!!G_uCfscf7eWS!eXE2csxUb1J5bnRosf9LIGxLs17TdstMxQcGs0mbXFroRjzLN81jVmeAWfMr41X8JGEuVm286lVTmDgmi5s1zfsfegJNuWVVWTM$ > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!Zla-FYPeT9khxXRG5MuCbh6NLunsaK0ljFdIlNk-3XDnbGEJPBARoshmLdtN7p3kHgPJbQDyNgvHIi-GT6IyAxxdaZ2jTWw_kEg$ > Matt, use of the option "-dm_plex_view_hdf5_storage_version 1.1.0" is indeed necessary, but I also realized, thanks to a suggestion from matteo.semplice at uninsubria.it, that I have to View BOTH the dm and vec in the same hdf5 file, i.e. > ! > ! ???dump the dm+u to the same HDF5 file > ! > > filename ="test.h5" > ??PetscCall(PetscViewerHDF5Open(PETSC_COMM_WORLD, trim(filename), > FILE_MODE_WRITE, viewer, ierr)) > ??PetscCall(DMView(dm, viewer, ierr)) > ??PetscCall(PetscViewerDestroy(viewer, ierr)) > ??PetscCall(PetscViewerHDF5Open(PETSC_COMM_WORLD, trim(filename), > FILE_MODE_APPEND, viewer, ierr)) > ??PetscCall(VecView(u, viewer, ierr)) > ??PetscCall(PetscViewerDestroy(viewer, ierr)) > > Once test.h5 has been processed with "petsc_gen_xdmf.py", I can load the xmf file in paraview e see the solution (there is NO solution unless -dm_plex_view_hdf5_storage_version 1.1.0 is in the options db). I was probably misled by the fact that a single VecView in VTK format gives both the mesh and solution in the same file. Does this make sense? Final question: is it possible to specify, using command line options or the options db, that vecview should be appended to an existing file? Thanks, Aldo -- Dr. Aldo Bonfiglioli Associate professor of Fluid Mechanics Dipartimento di Ingegneria Universita' della Basilicata V.le dell'Ateneo Lucano, 10 85100 Potenza ITALY tel:+39.0971.205203 fax:+39.0971.205215 web:https://urldefense.us/v3/__http://docenti.unibas.it/site/home/docente.html?m=002423__;!!G_uCfscf7eWS!Zla-FYPeT9khxXRG5MuCbh6NLunsaK0ljFdIlNk-3XDnbGEJPBARoshmLdtN7p3kHgPJbQDyNgvHIi-GT6IyAxxdaZ2jJl4AAbM$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Dec 15 06:41:07 2025 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 15 Dec 2025 07:41:07 -0500 Subject: [petsc-users] Trouble when viewing a subDM in vtk format In-Reply-To: <8c7c0794-3ede-4aa5-a515-03af3187fb4c@unibas.it> References: <0732cd7a-6ea5-4720-acf8-6ce8a314136d@unibas.it> <8c7c0794-3ede-4aa5-a515-03af3187fb4c@unibas.it> Message-ID: On Mon, Dec 15, 2025 at 3:43?AM Aldo Bonfiglioli wrote: > On 12/2/25 15:02, Matthew Knepley wrote: > > On Mon, Dec 1, 2025 at 8:22?AM Aldo Bonfiglioli < > aldo.bonfiglioli at unibas.it> wrote: > >> Dear developers, >> >> I wrote a code that extracts subDMs corresponding to the various strata >> in the Face Sets. >> >> I run into troubles when I view a subDM or a Vec attached to the subDM >> using the VTK format. >> >> More precisely, the problem only occurs on more than one processor when >> the rank=0 processor has no points on a given subDM. 
>> >> For instance, when the attached reproducer is run on 2 procs, the >> u_01.vtu file (global u Vec mapped to the subDM corresponding to >> stratum=1) >> >> only includes the header, but no data. All other u_0?.vtu files can >> successfully be loaded and viewed in paraview. >> >> The problem does NOT arise when I view the same objects in HDF5 format. >> >> However, my problem in using the HDF5 lies in the fact that: >> >> while the hdf5 file obtained with DMView can be post-processed with >> "petsc/lib/petsc/bin/petsc_gen_xdmf.py" to create a xmf file readable by >> paraview >> > > Hi Aldo, > > Sorry about this, I would like to make it more intuitive. First, the > solution (I think) > > -dm_plex_view_hdf5_storage_version 1.1.0 > > will write the Viz field by default, so that PAraview will see it. Can you > try this? > > Why do we need this? I have now made version-controlled output formats. > There is something about > this in the manual, but not enough. Paraview only supports vertex-based > fields and cell-based fields > (at least that I understand), so we need to write a separate copy of the > field (since Plex supports any > layout). Lots of people do not want a separate copy, since they are > checkpointing, so we control this > with a format (PETSC_VIEWER_HDF5_VIZ). You can pass this for specific > output, or use the format > version that does it automatically. > > Let me know if this works. > > Thanks, > > Matt > > >> I do not know how to view the field(s) associated with the DM when the >> hdf5 file is obtained from VecView. >> >> The reproducer compiles with the latest petsc release. >> >> Thanks, >> >> Aldo >> >> -- >> Dr. Aldo Bonfiglioli >> Associate professor of Fluid Mechanics >> Dipartimento di Ingegneria >> Universita' della Basilicata >> V.le dell'Ateneo Lucano, 10 85100 Potenza ITALY >> tel:+39.0971.205203 fax:+39.0971.205215 >> web: >> https://urldefense.us/v3/__http://docenti.unibas.it/site/home/docente.html?m=002423__;!!G_uCfscf7eWS!eXE2csxUb1J5bnRosf9LIGxLs17TdstMxQcGs0mbXFroRjzLN81jVmeAWfMr41X8JGEuVm286lVTmDgmi5s1zfsfegJNuWVVWTM$ >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!e3ZEg0fMK55wMqFxwA8I9trdBUMCc27ap7Q8V8beAiyV258mz8N43KSQa8MUBZMDUTVJw33Yay2EgEb4psDr$ > > > Matt, > > use of the option "-dm_plex_view_hdf5_storage_version 1.1.0" is indeed > necessary, > > but I also realized, thanks to a suggestion from > matteo.semplice at uninsubria.it, that I have to View BOTH the dm and vec in > the same hdf5 file, i.e. > > ! > ! dump the dm+u to the same HDF5 file > ! > > filename = "test.h5" > PetscCall(PetscViewerHDF5Open(PETSC_COMM_WORLD, trim(filename), > FILE_MODE_WRITE, viewer, ierr)) > PetscCall(DMView(dm, viewer, ierr)) > PetscCall(PetscViewerDestroy(viewer, ierr)) > PetscCall(PetscViewerHDF5Open(PETSC_COMM_WORLD, trim(filename), > FILE_MODE_APPEND, viewer, ierr)) > PetscCall(VecView(u, viewer, ierr)) > PetscCall(PetscViewerDestroy(viewer, ierr)) > > > Once test.h5 has been processed with "petsc_gen_xdmf.py", I can load the > xmf file in paraview e see the solution (there is NO solution unless > -dm_plex_view_hdf5_storage_version 1.1.0 is in the options db). > > I was probably misled by the fact that a single VecView in VTK format > gives both the mesh and solution in the same file. > > Does this make sense? > Ah, yes. 
VTK does not have a way to construct the file without the DM, so we force it. Our HDF5 format can actually handle multiple DMs (so the adaptive refinement can be visualized), so you need to specify what to put in. > Final question: > > is it possible to specify, using command line options or the options db, > that vecview should be appended to an existing file? > > Yes. You use the mode -vec_view hdf5:sol.h5::append Thanks, Matt > Thanks, > > Aldo > > -- > Dr. Aldo Bonfiglioli > Associate professor of Fluid Mechanics > Dipartimento di Ingegneria > Universita' della Basilicata > V.le dell'Ateneo Lucano, 10 85100 Potenza ITALY > tel:+39.0971.205203 fax:+39.0971.205215 > web: https://urldefense.us/v3/__http://docenti.unibas.it/site/home/docente.html?m=002423__;!!G_uCfscf7eWS!e3ZEg0fMK55wMqFxwA8I9trdBUMCc27ap7Q8V8beAiyV258mz8N43KSQa8MUBZMDUTVJw33Yay2EgLcMeEyC$ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!e3ZEg0fMK55wMqFxwA8I9trdBUMCc27ap7Q8V8beAiyV258mz8N43KSQa8MUBZMDUTVJw33Yay2EgEb4psDr$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmolinos at us.es Tue Dec 16 11:39:07 2025 From: mmolinos at us.es (MIGUEL MOLINOS PEREZ) Date: Tue, 16 Dec 2025 17:39:07 +0000 Subject: [petsc-users] Handling inactive (zero-occupancy) equations in large SNES systems References: Message-ID: <8C1BB514-0528-46FC-A5B8-D88BD1C8AA90@us.es> Dear all, I am working with a large nonlinear system solved with SNES, where a significant fraction of the unknowns are temporarily inactive due to a physical parameter being zero (e.g. zero occupancy / zero weight). For those DOF the corresponding equilibrium equation is physically inactive, but the unknown still appears in the global vector and in couplings of neighboring particles (Im using dmswarm). At the moment, these inactive equations contribute with a zero residual (F_i=0), which (I think) leads to poor conditioning and convergence issues for large problems. My question is about best numerical practice in this situation. For the position field, should I do something like F_i = q_i - q_(i,n)? Where q_(i,n) is the position of the particle at the previous configuration. Best regards, Miguel -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Dec 16 12:23:45 2025 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 16 Dec 2025 13:23:45 -0500 Subject: [petsc-users] Handling inactive (zero-occupancy) equations in large SNES systems In-Reply-To: <8C1BB514-0528-46FC-A5B8-D88BD1C8AA90@us.es> References: <8C1BB514-0528-46FC-A5B8-D88BD1C8AA90@us.es> Message-ID: On Tue, Dec 16, 2025 at 12:39?PM MIGUEL MOLINOS PEREZ wrote: > > Dear all, > > I am working with a large nonlinear system solved with SNES, where a > significant fraction of the unknowns are temporarily inactive due to a > physical parameter being zero (e.g. zero occupancy / zero weight). > > > For those DOF the corresponding equilibrium equation is physically > inactive, but the unknown still appears in the global vector and in > couplings of neighboring particles (Im using dmswarm). > > At the moment, these inactive equations contribute with a zero residual > (F_i=0), which (I think) leads to poor conditioning and convergence issues > for large problems. 
> > > My question is about best numerical practice in this situation. For the > position field, should I do something like F_i = q_i - q_(i,n)? Where q_(i,n) > is the position of the particle at the previous configuration. > This puts a 1 on the diagonal, which is usually what you want (esp for particle problems). However, there could be convergence problems with Newton, with these directions swamping other descent directions. That is the argument for eliminating these unknowns. It sounds like it would be worth trying to see if this is the case. Thanks, Matt > Best regards, > > Miguel > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!a9ge_7Blw6FP4XR8osFatvOvy7Q2pdIyLX8lVZs3eFcKKQhLJ0TRkrPMAXOBllnlR6EP1Oa_qkH_pcJzgpX-$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmolinos at us.es Tue Dec 16 13:04:05 2025 From: mmolinos at us.es (MIGUEL MOLINOS PEREZ) Date: Tue, 16 Dec 2025 19:04:05 +0000 Subject: [petsc-users] Handling inactive (zero-occupancy) equations in large SNES systems In-Reply-To: References: <8C1BB514-0528-46FC-A5B8-D88BD1C8AA90@us.es> Message-ID: <642C9300-38AB-4906-AEE6-FE5DE9C715A0@us.es> I?ll give it a try to F_i = q_i - q_(i,n). The problem with the dof elimination is that it messes up with local-to-global numbering and ghost particles creation too. Miguel On 16 Dec 2025, at 19:24, Matthew Knepley wrote: ? On Tue, Dec 16, 2025 at 12:39?PM MIGUEL MOLINOS PEREZ > wrote: Dear all, I am working with a large nonlinear system solved with SNES, where a significant fraction of the unknowns are temporarily inactive due to a physical parameter being zero (e.g. zero occupancy / zero weight). For those DOF the corresponding equilibrium equation is physically inactive, but the unknown still appears in the global vector and in couplings of neighboring particles (Im using dmswarm). At the moment, these inactive equations contribute with a zero residual (F_i=0), which (I think) leads to poor conditioning and convergence issues for large problems. My question is about best numerical practice in this situation. For the position field, should I do something like F_i = q_i - q_(i,n)? Where q_(i,n) is the position of the particle at the previous configuration. This puts a 1 on the diagonal, which is usually what you want (esp for particle problems). However, there could be convergence problems with Newton, with these directions swamping other descent directions. That is the argument for eliminating these unknowns. It sounds like it would be worth trying to see if this is the case. Thanks, Matt Best regards, Miguel -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!dXRvo4zLp_xH8xtPz1XikKeBlmIcplblRVj-9N1BTV2H0XI0cUS-2cnQoPgCRz5QvCkTONlTpwY_jFYEtSoYLw$ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at petsc.dev Tue Dec 16 13:52:10 2025 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 16 Dec 2025 14:52:10 -0500 Subject: [petsc-users] Handling inactive (zero-occupancy) equations in large SNES systems In-Reply-To: References: <8C1BB514-0528-46FC-A5B8-D88BD1C8AA90@us.es> Message-ID: <0B95E014-6132-4194-BEAC-EF8F303BCCC4@petsc.dev> > On Dec 16, 2025, at 1:23?PM, Matthew Knepley wrote: > > On Tue, Dec 16, 2025 at 12:39?PM MIGUEL MOLINOS PEREZ > wrote: >> >> Dear all, >> >> I am working with a large nonlinear system solved with SNES, where a significant fraction of the unknowns are temporarily inactive due to a physical parameter being zero (e.g. zero occupancy / zero weight). >> >> >> >> For those DOF the corresponding equilibrium equation is physically inactive, but the unknown still appears in the global vector and in couplings of neighboring particles (Im using dmswarm). >> >> At the moment, these inactive equations contribute with a zero residual (F_i=0), which (I think) leads to poor conditioning and convergence issues for large problems. >> >> >> >> My question is about best numerical practice in this situation. For the position field, should I do something like F_i = q_i - q_(i,n)? Where q_(i,n) is the position of the particle at the previous configuration. >> > This puts a 1 on the diagonal, which is usually what you want (esp for particle problems). > > However, there could be convergence problems with Newton, with these directions swamping other descent directions. That is the argument for eliminating these unknowns. It sounds like it would be worth trying to see if this is the case. Instead of putting 1 on the diagonal you can put a value on the diagonal that is "near" the other diagonal values of the matrix. This is usally (always?) better than using 1 > > Thanks, > > Matt > > >> Best regards, >> >> Miguel >> > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!e7ZjkKLVfmP45FsSYYPcHnoVEh9Kv6xPWA0L7i3BWuJkZi6jqWFQQujUIpV08-8TqGp0djoNntSxHUMhMeBAx5U$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.knezevic at akselos.com Wed Dec 17 11:50:18 2025 From: david.knezevic at akselos.com (David Knezevic) Date: Wed, 17 Dec 2025 12:50:18 -0500 Subject: [petsc-users] Question regarding SNES error about locked vectors Message-ID: Hi all, I have a question about this error: > Vector 'Vec_0x84000005_0' (argument #2) was locked for read-only access in > unknown_function() at unknown file:0 (line numbers only accurate to > function begin) I'm encountering this error in an FE solve where there is an error encountered during the residual/jacobian assembly, and what we normally do in that situation is shrink the load step and continue, starting from the "last converged solution". However, in this case I'm running on 32 processes, and 5 of the processes report the error above about a "locked vector". We clear the SNES object (via SNESDestroy) before we reset the solution to the "last converged solution", and then we make a new SNES object subsequently. But it seems to me that somehow the solution vector is still marked as "locked" on 5 of the processes when we modify the solution vector, which leads to the error above. I was wondering if someone could advise on what the best way to handle this would be? 
I thought one option could be to add an MPI barrier call prior to updating the solution vector to "last converged solution", to make sure that the SNES object is destroyed on all procs (and hence the locks cleared) before editing the solution vector, but I'm unsure if that would make a difference. Any help would be most appreciated! Thanks, David -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefano.zampini at gmail.com Wed Dec 17 13:02:21 2025 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Wed, 17 Dec 2025 22:02:21 +0300 Subject: [petsc-users] Question regarding SNES error about locked vectors In-Reply-To: References: Message-ID: You are not allowed to call VecGetArray on the solution vector of an SNES object within a user callback, nor to modify its values in any other way. Put in C++ lingo, the solution vector is a "const" argument It would be great if you could provide an MWE to help us understand your problem Il giorno mer 17 dic 2025 alle ore 20:51 David Knezevic via petsc-users < petsc-users at mcs.anl.gov> ha scritto: > Hi all, > > I have a question about this error: > >> Vector 'Vec_0x84000005_0' (argument #2) was locked for read-only access >> in unknown_function() at unknown file:0 (line numbers only accurate to >> function begin) > > > I'm encountering this error in an FE solve where there is an error > encountered during the residual/jacobian assembly, and what we normally do > in that situation is shrink the load step and continue, starting from the > "last converged solution". However, in this case I'm running on 32 > processes, and 5 of the processes report the error above about a "locked > vector". > > We clear the SNES object (via SNESDestroy) before we reset the solution to > the "last converged solution", and then we make a new SNES object > subsequently. But it seems to me that somehow the solution vector is still > marked as "locked" on 5 of the processes when we modify the solution > vector, which leads to the error above. > > I was wondering if someone could advise on what the best way to handle > this would be? I thought one option could be to add an MPI barrier call > prior to updating the solution vector to "last converged solution", to make > sure that the SNES object is destroyed on all procs (and hence the locks > cleared) before editing the solution vector, but I'm unsure if that would > make a difference. Any help would be most appreciated! > > Thanks, > David > -- Stefano -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.knezevic at akselos.com Wed Dec 17 13:08:55 2025 From: david.knezevic at akselos.com (David Knezevic) Date: Wed, 17 Dec 2025 14:08:55 -0500 Subject: [petsc-users] Question regarding SNES error about locked vectors In-Reply-To: References: Message-ID: Hi, I'm using PETSc via the libMesh framework, so creating a MWE is complicated by that, unfortunately. The situation is that I am not modifying the solution vector in a callback. The SNES solve has terminated, with PetscErrorCode 82, and I then want to update the solution vector (reset it to the "previously converged value") and then try to solve again with a smaller load increment. This is a typical "auto load stepping" strategy in FE. I think the key piece of info I'd like to know is, at what point is the solution vector "unlocked" by the SNES object? Should it be unlocked as soon as the SNES solve has terminated with PetscErrorCode 82? 
Since it seems to me that it hasn't been unlocked yet (maybe just on a subset of the processes). Should I manually "unlock" the solution vector by calling VecLockWriteSet? Thanks, David On Wed, Dec 17, 2025 at 2:02?PM Stefano Zampini wrote: > You are not allowed to call VecGetArray on the solution vector of an SNES > object within a user callback, nor to modify its values in any other way. > Put in C++ lingo, the solution vector is a "const" argument > It would be great if you could provide an MWE to help us understand your > problem > > > Il giorno mer 17 dic 2025 alle ore 20:51 David Knezevic via petsc-users < > petsc-users at mcs.anl.gov> ha scritto: > >> Hi all, >> >> I have a question about this error: >> >>> Vector 'Vec_0x84000005_0' (argument #2) was locked for read-only access >>> in unknown_function() at unknown file:0 (line numbers only accurate to >>> function begin) >> >> >> I'm encountering this error in an FE solve where there is an error >> encountered during the residual/jacobian assembly, and what we normally do >> in that situation is shrink the load step and continue, starting from the >> "last converged solution". However, in this case I'm running on 32 >> processes, and 5 of the processes report the error above about a "locked >> vector". >> >> We clear the SNES object (via SNESDestroy) before we reset the solution >> to the "last converged solution", and then we make a new SNES object >> subsequently. But it seems to me that somehow the solution vector is still >> marked as "locked" on 5 of the processes when we modify the solution >> vector, which leads to the error above. >> >> I was wondering if someone could advise on what the best way to handle >> this would be? I thought one option could be to add an MPI barrier call >> prior to updating the solution vector to "last converged solution", to make >> sure that the SNES object is destroyed on all procs (and hence the locks >> cleared) before editing the solution vector, but I'm unsure if that would >> make a difference. Any help would be most appreciated! >> >> Thanks, >> David >> > > > -- > Stefano > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Dec 17 13:12:39 2025 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 17 Dec 2025 14:12:39 -0500 Subject: [petsc-users] Question regarding SNES error about locked vectors In-Reply-To: References: Message-ID: > On Dec 17, 2025, at 12:50?PM, David Knezevic via petsc-users wrote: > > Hi all, > > I have a question about this error: >> Vector 'Vec_0x84000005_0' (argument #2) was locked for read-only access in unknown_function() at unknown file:0 (line numbers only accurate to function begin) > > I'm encountering this error in an FE solve where there is an error encountered during the residual/jacobian assembly, and what we normally do in that situation is shrink the load step and continue, starting from the "last converged solution". However, in this case I'm running on 32 processes, and 5 of the processes report the error above about a "locked vector". It is very surprising that the vector is only locked on a subset of processes; this not normally expected behavior and likely indicates memory corruption or a logic error in the code. So the first thing to do is determine where the locking took place. 
You can build another instance of the PETSc libraries with debugging turned on by setting PETSC_ARCH to a new value and using the same ./configure options you used before but with --with-debugging=1 (instead of --with-debugging=0). If you are using a prebuilt PETSc from a package manager I don't know if there is an easy way to get a version with debugging turned on; I would hope so, but some package managers may not provide it; in that case you need to build PETSc yourself from source. Then run your code again and it should give detailed information (a stack trace) about where the locking took place. Why does PETSc lock vectors? During a nonlinear solve, for example, the solver algorithm will request your function to be evaluated using the function you pass to SNESSetFunction(). To ensure user code does not corrupt the solution process, the input vector (often called u in the documentation and code) is locked. It is locked because if the user code changes the values of the input vector it will "break" the iterative solver code (that is, incorrect answers could be produced). During a standard Newton solve the user never needs to "adjust" the Newton proposed solutions; that is all handled by the PETSc solver code (and line search, etc.). But for some difficult problems, the user may want to have custom code that "messes around" directly with the proposed Newton steps. SNES provides "hooks" where the user can provide such custom code (the "messing around" should never take place in the SNESSetFunction() callback, only within the "hooks"). For SNES Newton's method with line search (the default) one can provide hook functions using `SNESLineSearchSetPostCheck()` or `SNESLineSearchSetPreCheck()`, where the line search object is obtained with SNESGetLineSearch(). The Newton trust region methods have their own set of hooks. Based on your mention of "shrink the load step", I am speculating that for your code standard Newton's method may not converge, so you are adding additional code to help get Newton to converge, and this is triggering the error you are seeing. But it is possible my guess is incorrect and there is some other cause for the error; in either case, running with the debug version will help indicate where the locking issue is occurring. Barry > > We clear the SNES object (via SNESDestroy) before we reset the solution to the "last converged solution", and then we make a new SNES object subsequently. But it seems to me that somehow the solution vector is still marked as "locked" on 5 of the processes when we modify the solution vector, which leads to the error above. > > I was wondering if someone could advise on what the best way to handle this would be? I thought one option could be to add an MPI barrier call prior to updating the solution vector to "last converged solution", to make sure that the SNES object is destroyed on all procs (and hence the locks cleared) before editing the solution vector, but I'm unsure if that would make a difference. Any help would be most appreciated! > > Thanks, > David -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefano.zampini at gmail.com Wed Dec 17 13:20:51 2025 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Wed, 17 Dec 2025 22:20:51 +0300 Subject: [petsc-users] Question regarding SNES error about locked vectors In-Reply-To: References: Message-ID: Note that all error codes in PETSc are fatal, and we don't handle these cases gracefully.
So, code like this ierr = SNESSolve(snes, ...); if (ierr) { "do something and expect PETSc will keep working as usual" } is not supported in general. However, an SNES that does not converge should not generate an error, but rather return a negative SNESConvergedReason. Maybe you are using -snes_error_if_not_converged (via command line) or calling SNESSetErrorIfNotConverged at some point in the code. You can then do PetscCall(SNESSolve(snes, ...)); PetscCall(SNESGetConvergedReason(snes, &reason)); if (reason < 0) { "now you can do whatever you want and PETSc will keep working as usual" } Il giorno mer 17 dic 2025 alle ore 22:09 David Knezevic < david.knezevic at akselos.com> ha scritto: > Hi, > > I'm using PETSc via the libMesh framework, so creating a MWE is > complicated by that, unfortunately. > > The situation is that I am not modifying the solution vector in a > callback. The SNES solve has terminated, with PetscErrorCode 82, and I then > want to update the solution vector (reset it to the "previously converged > value") and then try to solve again with a smaller load increment. This is > a typical "auto load stepping" strategy in FE. > > I think the key piece of info I'd like to know is, at what point is the > solution vector "unlocked" by the SNES object? Should it be unlocked as > soon as the SNES solve has terminated with PetscErrorCode 82? Since it > seems to me that it hasn't been unlocked yet (maybe just on a subset of the > processes). Should I manually "unlock" the solution vector by > calling VecLockWriteSet? > > Thanks, > David > > > > On Wed, Dec 17, 2025 at 2:02 PM Stefano Zampini > wrote: > >> You are not allowed to call VecGetArray on the solution vector of an SNES >> object within a user callback, nor to modify its values in any other way. >> Put in C++ lingo, the solution vector is a "const" argument >> It would be great if you could provide an MWE to help us understand your >> problem >> >> >> Il giorno mer 17 dic 2025 alle ore 20:51 David Knezevic via petsc-users < >> petsc-users at mcs.anl.gov> ha scritto: >> >>> Hi all, >>> >>> I have a question about this error: >>> >>>> Vector 'Vec_0x84000005_0' (argument #2) was locked for read-only access >>>> in unknown_function() at unknown file:0 (line numbers only accurate to >>>> function begin) >>> >>> >>> I'm encountering this error in an FE solve where there is an error >>> encountered during the residual/jacobian assembly, and what we normally do >>> in that situation is shrink the load step and continue, starting from the >>> "last converged solution". However, in this case I'm running on 32 >>> processes, and 5 of the processes report the error above about a "locked >>> vector". >>> >>> We clear the SNES object (via SNESDestroy) before we reset the solution >>> to the "last converged solution", and then we make a new SNES object >>> subsequently. But it seems to me that somehow the solution vector is still >>> marked as "locked" on 5 of the processes when we modify the solution >>> vector, which leads to the error above. >>> >>> I was wondering if someone could advise on what the best way to handle >>> this would be? I thought one option could be to add an MPI barrier call >>> prior to updating the solution vector to "last converged solution", to make >>> sure that the SNES object is destroyed on all procs (and hence the locks >>> cleared) before editing the solution vector, but I'm unsure if that would >>> make a difference. Any help would be most appreciated!
>>> >>> Thanks, >>> David >>> >> >> >> -- >> Stefano >> > -- Stefano -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Dec 17 13:25:08 2025 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 17 Dec 2025 14:25:08 -0500 Subject: [petsc-users] Question regarding SNES error about locked vectors In-Reply-To: References: Message-ID: <855F3D06-08B9-4CD1-ABE8-3E55D4DD802E@petsc.dev> > On Dec 17, 2025, at 2:08 PM, David Knezevic via petsc-users wrote: > > Hi, > > I'm using PETSc via the libMesh framework, so creating a MWE is complicated by that, unfortunately. > > The situation is that I am not modifying the solution vector in a callback. The SNES solve has terminated, with PetscErrorCode 82, and I then want to update the solution vector (reset it to the "previously converged value") and then try to solve again with a smaller load increment. This is a typical "auto load stepping" strategy in FE. Once a PetscError is generated you CANNOT continue the PETSc program; it is not designed to allow this, and trying to continue will lead to further problems. So what you need to do is prevent PETSc from getting to the point where an actual PetscErrorCode of 82 is generated. Normally SNESSolve() returns without generating an error even if the nonlinear solver failed (for example, did not converge). One then uses SNESGetConvergedReason to check if it converged or not. Normally when SNESSolve() returns, regardless of whether the converged reason is negative or positive, there will be no locked vectors and one can modify the SNES object and call SNESSolve again. So my guess is that an actual PETSc error is being generated because SNESSetErrorIfNotConverged(snes,PETSC_TRUE) is being called by either your code or libMesh, or the option -snes_error_if_not_converged is being used. In your case, when you wish the code to work after a non-converged SNESSolve(), these options should never be set; instead you should check the result of SNESGetConvergedReason() to see if SNESSolve has failed. If SNESSetErrorIfNotConverged() is never being set, that may indicate you are using an old version of PETSc or have hit a bug inside PETSc's SNES that does not handle errors correctly, and we can help fix the problem if you can provide the full output from a debug build when the error occurs. Barry > > I think the key piece of info I'd like to know is, at what point is the solution vector "unlocked" by the SNES object? Should it be unlocked as soon as the SNES solve has terminated with PetscErrorCode 82? Since it seems to me that it hasn't been unlocked yet (maybe just on a subset of the processes). Should I manually "unlock" the solution vector by calling VecLockWriteSet? > > Thanks, > David > > > > On Wed, Dec 17, 2025 at 2:02 PM Stefano Zampini > wrote: >> You are not allowed to call VecGetArray on the solution vector of an SNES object within a user callback, nor to modify its values in any other way.
>> Put in C++ lingo, the solution vector is a "const" argument >> It would be great if you could provide an MWE to help us understand your problem >> >> >> Il giorno mer 17 dic 2025 alle ore 20:51 David Knezevic via petsc-users > ha scritto: >>> Hi all, >>> >>> I have a question about this error: >>>> Vector 'Vec_0x84000005_0' (argument #2) was locked for read-only access in unknown_function() at unknown file:0 (line numbers only accurate to function begin) >>> >>> I'm encountering this error in an FE solve where there is an error encountered during the residual/jacobian assembly, and what we normally do in that situation is shrink the load step and continue, starting from the "last converged solution". However, in this case I'm running on 32 processes, and 5 of the processes report the error above about a "locked vector". >>> >>> We clear the SNES object (via SNESDestroy) before we reset the solution to the "last converged solution", and then we make a new SNES object subsequently. But it seems to me that somehow the solution vector is still marked as "locked" on 5 of the processes when we modify the solution vector, which leads to the error above. >>> >>> I was wondering if someone could advise on what the best way to handle this would be? I thought one option could be to add an MPI barrier call prior to updating the solution vector to "last converged solution", to make sure that the SNES object is destroyed on all procs (and hence the locks cleared) before editing the solution vector, but I'm unsure if that would make a difference. Any help would be most appreciated! >>> >>> Thanks, >>> David >> >> >> >> -- >> Stefano -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.knezevic at akselos.com Wed Dec 17 13:47:58 2025 From: david.knezevic at akselos.com (David Knezevic) Date: Wed, 17 Dec 2025 14:47:58 -0500 Subject: [petsc-users] Question regarding SNES error about locked vectors In-Reply-To: <855F3D06-08B9-4CD1-ABE8-3E55D4DD802E@petsc.dev> References: <855F3D06-08B9-4CD1-ABE8-3E55D4DD802E@petsc.dev> Message-ID: Stefano and Barry: Thank you, this is very helpful. I'll give some more info here which may help to clarify further. Normally we do just get a negative "converged reason", as you described. But in this specific case where I'm having issues the solve is a numerically sensitive creep solve, which has exponential terms in the residual and jacobian callback that can "blow up" and give NaN values. In this case, the root cause is that we hit a NaN value during a callback, and then we throw an exception (in libMesh C++ code) which I gather leads to the SNES solve exiting with this error code. Is there a way to tell the SNES to terminate with a negative "converged reason" because we've encountered some issue during the callback? Thanks, David On Wed, Dec 17, 2025 at 2:25?PM Barry Smith wrote: > > > On Dec 17, 2025, at 2:08?PM, David Knezevic via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Hi, > > I'm using PETSc via the libMesh framework, so creating a MWE is > complicated by that, unfortunately. > > The situation is that I am not modifying the solution vector in a > callback. The SNES solve has terminated, with PetscErrorCode 82, and I then > want to update the solution vector (reset it to the "previously converged > value") and then try to solve again with a smaller load increment. This is > a typical "auto load stepping" strategy in FE. 
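The load-stepping loop referred to here typically has the following shape when built directly on SNES; this is a sketch only, assuming SNESSetErrorIfNotConverged() is not enabled so that SNESSolve() returns normally on failure, and with ApplyLoadIncrement() standing in for application code that is not part of PETSc:

  /* Sketch: drive the load from 0 to load_end, halving the increment after a failed solve. */
  static PetscErrorCode SolveWithLoadStepping(SNES snes, Vec u, PetscReal load_end, PetscReal dload)
  {
    Vec                 u_last; /* last converged solution */
    PetscReal           load = 0.0;
    SNESConvergedReason reason;

    PetscFunctionBeginUser;
    PetscCall(VecDuplicate(u, &u_last));
    PetscCall(VecCopy(u, u_last));
    while (load < load_end && dload > PETSC_SMALL) {
      PetscCall(ApplyLoadIncrement(load + dload)); /* hypothetical application routine: set loads/BCs for this step */
      PetscCall(SNESSolve(snes, NULL, u));
      PetscCall(SNESGetConvergedReason(snes, &reason));
      if (reason > 0) { /* converged: accept the step and remember the state */
        load += dload;
        PetscCall(VecCopy(u, u_last));
      } else { /* failed: restore the last converged state and retry with half the increment */
        PetscCall(VecCopy(u_last, u));
        dload *= 0.5;
      }
    }
    PetscCall(VecDestroy(&u_last));
    PetscFunctionReturn(PETSC_SUCCESS);
  }

Modifying u between calls to SNESSolve() is fine in this pattern because no locks are held once SNESSolve() has returned.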
> > > Once a PetscError is generated you CANNOT continue the PETSc program, > it is not designed to allow this and trying to continue will lead to > further problems. > > So what you need to do is prevent PETSc from getting to the point where > an actual PetscErrorCode of 82 is generated. Normally SNESSolve() returns > without generating an error even if the nonlinear solver failed (for > example did not converge). One then uses SNESGetConvergedReason to check if > it converged or not. Normally when SNESSolve() returns, regardless of > whether the converged reason is negative or positive, there will be no > locked vectors and one can modify the SNES object and call SNESSolve again. > > So my guess is that an actual PETSc error is being generated > because SNESSetErrorIfNotConverged(snes,PETSC_TRUE) is being called by > either your code or libMesh or the option -snes_error_if_not_converged is > being used. In your case when you wish the code to work after a > non-converged SNESSolve() these options should never be set instead you > should check the result of SNESGetConvergedReason() to check if SNESSolve > has failed. If SNESSetErrorIfNotConverged() is never being set that may > indicate you are using an old version of PETSc or have it a bug inside > PETSc's SNES that does not handle errors correctly and we can help fix the > problem if you can provide a full debug output version of when the error > occurs. > > Barry > > > > > > > > > I think the key piece of info I'd like to know is, at what point is the > solution vector "unlocked" by the SNES object? Should it be unlocked as > soon as the SNES solve has terminated with PetscErrorCode 82? Since it > seems to me that it hasn't been unlocked yet (maybe just on a subset of the > processes). Should I manually "unlock" the solution vector by > calling VecLockWriteSet? > > Thanks, > David > > > > On Wed, Dec 17, 2025 at 2:02?PM Stefano Zampini > wrote: > >> You are not allowed to call VecGetArray on the solution vector of an SNES >> object within a user callback, nor to modify its values in any other way. >> Put in C++ lingo, the solution vector is a "const" argument >> It would be great if you could provide an MWE to help us understand your >> problem >> >> >> Il giorno mer 17 dic 2025 alle ore 20:51 David Knezevic via petsc-users < >> petsc-users at mcs.anl.gov> ha scritto: >> >>> Hi all, >>> >>> I have a question about this error: >>> >>>> Vector 'Vec_0x84000005_0' (argument #2) was locked for read-only access >>>> in unknown_function() at unknown file:0 (line numbers only accurate to >>>> function begin) >>> >>> >>> I'm encountering this error in an FE solve where there is an error >>> encountered during the residual/jacobian assembly, and what we normally do >>> in that situation is shrink the load step and continue, starting from the >>> "last converged solution". However, in this case I'm running on 32 >>> processes, and 5 of the processes report the error above about a "locked >>> vector". >>> >>> We clear the SNES object (via SNESDestroy) before we reset the solution >>> to the "last converged solution", and then we make a new SNES object >>> subsequently. But it seems to me that somehow the solution vector is still >>> marked as "locked" on 5 of the processes when we modify the solution >>> vector, which leads to the error above. >>> >>> I was wondering if someone could advise on what the best way to handle >>> this would be? 
I thought one option could be to add an MPI barrier call >>> prior to updating the solution vector to "last converged solution", to make >>> sure that the SNES object is destroyed on all procs (and hence the locks >>> cleared) before editing the solution vector, but I'm unsure if that would >>> make a difference. Any help would be most appreciated! >>> >>> Thanks, >>> David >>> >> >> >> -- >> Stefano >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.knezevic at akselos.com Wed Dec 17 14:17:55 2025 From: david.knezevic at akselos.com (David Knezevic) Date: Wed, 17 Dec 2025 15:17:55 -0500 Subject: [petsc-users] Question regarding SNES error about locked vectors In-Reply-To: References: <855F3D06-08B9-4CD1-ABE8-3E55D4DD802E@petsc.dev> Message-ID: P.S. I checked our code more carefully, and I see that the PETSC_ERROR_CODE 82 is coming from our code. Sorry for not realizing that earlier. We encounter the NaN I mentioned in my previous email, which leads to us returning that error code 82 from the "residual assembly" callback. I guess instead of doing that, we should just set the "converged reason" to be a negative value (e.g. SNES_DIVERGED_USER), and that should let PETSc exit the solve properly? Is it possible to set "converged reason" to SNES_DIVERGED_USER, or is there a better way to handle this? Thanks, David On Wed, Dec 17, 2025 at 2:47?PM David Knezevic wrote: > Stefano and Barry: Thank you, this is very helpful. > > I'll give some more info here which may help to clarify further. Normally > we do just get a negative "converged reason", as you described. But in this > specific case where I'm having issues the solve is a numerically sensitive > creep solve, which has exponential terms in the residual and jacobian > callback that can "blow up" and give NaN values. In this case, the root > cause is that we hit a NaN value during a callback, and then we throw an > exception (in libMesh C++ code) which I gather leads to the SNES solve > exiting with this error code. > > Is there a way to tell the SNES to terminate with a negative "converged > reason" because we've encountered some issue during the callback? > > Thanks, > David > > > On Wed, Dec 17, 2025 at 2:25?PM Barry Smith wrote: > >> >> >> On Dec 17, 2025, at 2:08?PM, David Knezevic via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> >> Hi, >> >> I'm using PETSc via the libMesh framework, so creating a MWE is >> complicated by that, unfortunately. >> >> The situation is that I am not modifying the solution vector in a >> callback. The SNES solve has terminated, with PetscErrorCode 82, and I then >> want to update the solution vector (reset it to the "previously converged >> value") and then try to solve again with a smaller load increment. This is >> a typical "auto load stepping" strategy in FE. >> >> >> Once a PetscError is generated you CANNOT continue the PETSc program, >> it is not designed to allow this and trying to continue will lead to >> further problems. >> >> So what you need to do is prevent PETSc from getting to the point >> where an actual PetscErrorCode of 82 is generated. Normally SNESSolve() >> returns without generating an error even if the nonlinear solver failed >> (for example did not converge). One then uses SNESGetConvergedReason to >> check if it converged or not. 
Normally when SNESSolve() returns, regardless >> of whether the converged reason is negative or positive, there will be no >> locked vectors and one can modify the SNES object and call SNESSolve again. >> >> So my guess is that an actual PETSc error is being generated >> because SNESSetErrorIfNotConverged(snes,PETSC_TRUE) is being called by >> either your code or libMesh or the option -snes_error_if_not_converged is >> being used. In your case when you wish the code to work after a >> non-converged SNESSolve() these options should never be set instead you >> should check the result of SNESGetConvergedReason() to check if SNESSolve >> has failed. If SNESSetErrorIfNotConverged() is never being set that may >> indicate you are using an old version of PETSc or have it a bug inside >> PETSc's SNES that does not handle errors correctly and we can help fix the >> problem if you can provide a full debug output version of when the error >> occurs. >> >> Barry >> >> >> >> >> >> >> >> >> I think the key piece of info I'd like to know is, at what point is the >> solution vector "unlocked" by the SNES object? Should it be unlocked as >> soon as the SNES solve has terminated with PetscErrorCode 82? Since it >> seems to me that it hasn't been unlocked yet (maybe just on a subset of the >> processes). Should I manually "unlock" the solution vector by >> calling VecLockWriteSet? >> >> Thanks, >> David >> >> >> >> On Wed, Dec 17, 2025 at 2:02?PM Stefano Zampini < >> stefano.zampini at gmail.com> wrote: >> >>> You are not allowed to call VecGetArray on the solution vector of an >>> SNES object within a user callback, nor to modify its values in any other >>> way. >>> Put in C++ lingo, the solution vector is a "const" argument >>> It would be great if you could provide an MWE to help us understand your >>> problem >>> >>> >>> Il giorno mer 17 dic 2025 alle ore 20:51 David Knezevic via petsc-users < >>> petsc-users at mcs.anl.gov> ha scritto: >>> >>>> Hi all, >>>> >>>> I have a question about this error: >>>> >>>>> Vector 'Vec_0x84000005_0' (argument #2) was locked for read-only >>>>> access in unknown_function() at unknown file:0 (line numbers only accurate >>>>> to function begin) >>>> >>>> >>>> I'm encountering this error in an FE solve where there is an error >>>> encountered during the residual/jacobian assembly, and what we normally do >>>> in that situation is shrink the load step and continue, starting from the >>>> "last converged solution". However, in this case I'm running on 32 >>>> processes, and 5 of the processes report the error above about a "locked >>>> vector". >>>> >>>> We clear the SNES object (via SNESDestroy) before we reset the solution >>>> to the "last converged solution", and then we make a new SNES object >>>> subsequently. But it seems to me that somehow the solution vector is still >>>> marked as "locked" on 5 of the processes when we modify the solution >>>> vector, which leads to the error above. >>>> >>>> I was wondering if someone could advise on what the best way to handle >>>> this would be? I thought one option could be to add an MPI barrier call >>>> prior to updating the solution vector to "last converged solution", to make >>>> sure that the SNES object is destroyed on all procs (and hence the locks >>>> cleared) before editing the solution vector, but I'm unsure if that would >>>> make a difference. Any help would be most appreciated! 
>>>> >>>> Thanks, >>>> David >>>> >>> >>> >>> -- >>> Stefano >>> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Dec 17 14:43:07 2025 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 17 Dec 2025 15:43:07 -0500 Subject: [petsc-users] Question regarding SNES error about locked vectors In-Reply-To: References: <855F3D06-08B9-4CD1-ABE8-3E55D4DD802E@petsc.dev> Message-ID: > On Dec 17, 2025, at 2:47?PM, David Knezevic wrote: > > Stefano and Barry: Thank you, this is very helpful. > > I'll give some more info here which may help to clarify further. Normally we do just get a negative "converged reason", as you described. But in this specific case where I'm having issues the solve is a numerically sensitive creep solve, which has exponential terms in the residual and jacobian callback that can "blow up" and give NaN values. In this case, the root cause is that we hit a NaN value during a callback, and then we throw an exception (in libMesh C++ code) which I gather leads to the SNES solve exiting with this error code. > > Is there a way to tell the SNES to terminate with a negative "converged reason" because we've encountered some issue during the callback? In your callback you should call SNESSetFunctionDomainError() and make sure the function value has an infinity or NaN in it (you can call VecFlag() for this purpose)). Now SNESConvergedReason will be a completely reasonable SNES_DIVERGED_FUNCTION_DOMAIN Barry If you are using an ancient version of PETSc (I hope you are using the latest since that always has more bug fixes and features) that does not have SNESSetFunctionDomainError then just make sure the function vector result has an infinity or NaN in it and then SNESConvergedReason will be SNES_DIVERGED_FNORM_NAN > > Thanks, > David > > > On Wed, Dec 17, 2025 at 2:25?PM Barry Smith > wrote: >> >> >>> On Dec 17, 2025, at 2:08?PM, David Knezevic via petsc-users > wrote: >>> >>> Hi, >>> >>> I'm using PETSc via the libMesh framework, so creating a MWE is complicated by that, unfortunately. >>> >>> The situation is that I am not modifying the solution vector in a callback. The SNES solve has terminated, with PetscErrorCode 82, and I then want to update the solution vector (reset it to the "previously converged value") and then try to solve again with a smaller load increment. This is a typical "auto load stepping" strategy in FE. >> >> Once a PetscError is generated you CANNOT continue the PETSc program, it is not designed to allow this and trying to continue will lead to further problems. >> >> So what you need to do is prevent PETSc from getting to the point where an actual PetscErrorCode of 82 is generated. Normally SNESSolve() returns without generating an error even if the nonlinear solver failed (for example did not converge). One then uses SNESGetConvergedReason to check if it converged or not. Normally when SNESSolve() returns, regardless of whether the converged reason is negative or positive, there will be no locked vectors and one can modify the SNES object and call SNESSolve again. >> >> So my guess is that an actual PETSc error is being generated because SNESSetErrorIfNotConverged(snes,PETSC_TRUE) is being called by either your code or libMesh or the option -snes_error_if_not_converged is being used. In your case when you wish the code to work after a non-converged SNESSolve() these options should never be set instead you should check the result of SNESGetConvergedReason() to check if SNESSolve has failed. 
If SNESSetErrorIfNotConverged() is never being set that may indicate you are using an old version of PETSc or have it a bug inside PETSc's SNES that does not handle errors correctly and we can help fix the problem if you can provide a full debug output version of when the error occurs. >> >> Barry >> >> >> >> >> >> >> >>> >>> I think the key piece of info I'd like to know is, at what point is the solution vector "unlocked" by the SNES object? Should it be unlocked as soon as the SNES solve has terminated with PetscErrorCode 82? Since it seems to me that it hasn't been unlocked yet (maybe just on a subset of the processes). Should I manually "unlock" the solution vector by calling VecLockWriteSet? >>> >>> Thanks, >>> David >>> >>> >>> >>> On Wed, Dec 17, 2025 at 2:02?PM Stefano Zampini > wrote: >>>> You are not allowed to call VecGetArray on the solution vector of an SNES object within a user callback, nor to modify its values in any other way. >>>> Put in C++ lingo, the solution vector is a "const" argument >>>> It would be great if you could provide an MWE to help us understand your problem >>>> >>>> >>>> Il giorno mer 17 dic 2025 alle ore 20:51 David Knezevic via petsc-users > ha scritto: >>>>> Hi all, >>>>> >>>>> I have a question about this error: >>>>>> Vector 'Vec_0x84000005_0' (argument #2) was locked for read-only access in unknown_function() at unknown file:0 (line numbers only accurate to function begin) >>>>> >>>>> I'm encountering this error in an FE solve where there is an error encountered during the residual/jacobian assembly, and what we normally do in that situation is shrink the load step and continue, starting from the "last converged solution". However, in this case I'm running on 32 processes, and 5 of the processes report the error above about a "locked vector". >>>>> >>>>> We clear the SNES object (via SNESDestroy) before we reset the solution to the "last converged solution", and then we make a new SNES object subsequently. But it seems to me that somehow the solution vector is still marked as "locked" on 5 of the processes when we modify the solution vector, which leads to the error above. >>>>> >>>>> I was wondering if someone could advise on what the best way to handle this would be? I thought one option could be to add an MPI barrier call prior to updating the solution vector to "last converged solution", to make sure that the SNES object is destroyed on all procs (and hence the locks cleared) before editing the solution vector, but I'm unsure if that would make a difference. Any help would be most appreciated! >>>>> >>>>> Thanks, >>>>> David >>>> >>>> >>>> >>>> -- >>>> Stefano >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.knezevic at akselos.com Thu Dec 18 07:10:14 2025 From: david.knezevic at akselos.com (David Knezevic) Date: Thu, 18 Dec 2025 08:10:14 -0500 Subject: [petsc-users] Question regarding SNES error about locked vectors In-Reply-To: References: <855F3D06-08B9-4CD1-ABE8-3E55D4DD802E@petsc.dev> Message-ID: Thank you very much for this guidance. I switched to use SNES_DIVERGED_FUNCTION_DOMAIN, and I don't get any errors now. Thanks! David On Wed, Dec 17, 2025 at 3:43?PM Barry Smith wrote: > > > On Dec 17, 2025, at 2:47?PM, David Knezevic > wrote: > > Stefano and Barry: Thank you, this is very helpful. > > I'll give some more info here which may help to clarify further. Normally > we do just get a negative "converged reason", as you described. 
But in this > specific case where I'm having issues the solve is a numerically sensitive > creep solve, which has exponential terms in the residual and jacobian > callback that can "blow up" and give NaN values. In this case, the root > cause is that we hit a NaN value during a callback, and then we throw an > exception (in libMesh C++ code) which I gather leads to the SNES solve > exiting with this error code. > > Is there a way to tell the SNES to terminate with a negative "converged > reason" because we've encountered some issue during the callback? > > > In your callback you should call SNESSetFunctionDomainError() and make > sure the function value has an infinity or NaN in it (you can call > VecFlag() for this purpose)). > > Now SNESConvergedReason will be a completely > reasonable SNES_DIVERGED_FUNCTION_DOMAIN > > Barry > > If you are using an ancient version of PETSc (I hope you are using the > latest since that always has more bug fixes and features) that does not > have SNESSetFunctionDomainError then just make sure the function vector > result has an infinity or NaN in it and then SNESConvergedReason will be > SNES_DIVERGED_FNORM_NAN > > > > > Thanks, > David > > > On Wed, Dec 17, 2025 at 2:25?PM Barry Smith wrote: > >> >> >> On Dec 17, 2025, at 2:08?PM, David Knezevic via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> >> Hi, >> >> I'm using PETSc via the libMesh framework, so creating a MWE is >> complicated by that, unfortunately. >> >> The situation is that I am not modifying the solution vector in a >> callback. The SNES solve has terminated, with PetscErrorCode 82, and I then >> want to update the solution vector (reset it to the "previously converged >> value") and then try to solve again with a smaller load increment. This is >> a typical "auto load stepping" strategy in FE. >> >> >> Once a PetscError is generated you CANNOT continue the PETSc program, >> it is not designed to allow this and trying to continue will lead to >> further problems. >> >> So what you need to do is prevent PETSc from getting to the point >> where an actual PetscErrorCode of 82 is generated. Normally SNESSolve() >> returns without generating an error even if the nonlinear solver failed >> (for example did not converge). One then uses SNESGetConvergedReason to >> check if it converged or not. Normally when SNESSolve() returns, regardless >> of whether the converged reason is negative or positive, there will be no >> locked vectors and one can modify the SNES object and call SNESSolve again. >> >> So my guess is that an actual PETSc error is being generated >> because SNESSetErrorIfNotConverged(snes,PETSC_TRUE) is being called by >> either your code or libMesh or the option -snes_error_if_not_converged is >> being used. In your case when you wish the code to work after a >> non-converged SNESSolve() these options should never be set instead you >> should check the result of SNESGetConvergedReason() to check if SNESSolve >> has failed. If SNESSetErrorIfNotConverged() is never being set that may >> indicate you are using an old version of PETSc or have it a bug inside >> PETSc's SNES that does not handle errors correctly and we can help fix the >> problem if you can provide a full debug output version of when the error >> occurs. >> >> Barry >> >> >> >> >> >> >> >> >> I think the key piece of info I'd like to know is, at what point is the >> solution vector "unlocked" by the SNES object? Should it be unlocked as >> soon as the SNES solve has terminated with PetscErrorCode 82? 
Since it >> seems to me that it hasn't been unlocked yet (maybe just on a subset of the >> processes). Should I manually "unlock" the solution vector by >> calling VecLockWriteSet? >> >> Thanks, >> David >> >> >> >> On Wed, Dec 17, 2025 at 2:02?PM Stefano Zampini < >> stefano.zampini at gmail.com> wrote: >> >>> You are not allowed to call VecGetArray on the solution vector of an >>> SNES object within a user callback, nor to modify its values in any other >>> way. >>> Put in C++ lingo, the solution vector is a "const" argument >>> It would be great if you could provide an MWE to help us understand your >>> problem >>> >>> >>> Il giorno mer 17 dic 2025 alle ore 20:51 David Knezevic via petsc-users < >>> petsc-users at mcs.anl.gov> ha scritto: >>> >>>> Hi all, >>>> >>>> I have a question about this error: >>>> >>>>> Vector 'Vec_0x84000005_0' (argument #2) was locked for read-only >>>>> access in unknown_function() at unknown file:0 (line numbers only accurate >>>>> to function begin) >>>> >>>> >>>> I'm encountering this error in an FE solve where there is an error >>>> encountered during the residual/jacobian assembly, and what we normally do >>>> in that situation is shrink the load step and continue, starting from the >>>> "last converged solution". However, in this case I'm running on 32 >>>> processes, and 5 of the processes report the error above about a "locked >>>> vector". >>>> >>>> We clear the SNES object (via SNESDestroy) before we reset the solution >>>> to the "last converged solution", and then we make a new SNES object >>>> subsequently. But it seems to me that somehow the solution vector is still >>>> marked as "locked" on 5 of the processes when we modify the solution >>>> vector, which leads to the error above. >>>> >>>> I was wondering if someone could advise on what the best way to handle >>>> this would be? I thought one option could be to add an MPI barrier call >>>> prior to updating the solution vector to "last converged solution", to make >>>> sure that the SNES object is destroyed on all procs (and hence the locks >>>> cleared) before editing the solution vector, but I'm unsure if that would >>>> make a difference. Any help would be most appreciated! >>>> >>>> Thanks, >>>> David >>>> >>> >>> >>> -- >>> Stefano >>> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Fri Dec 19 11:27:57 2025 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Fri, 19 Dec 2025 11:27:57 -0600 Subject: [petsc-users] Calling for short user presentations in a PETSc BoF Message-ID: Dear PETSc/TAO Community, We are soliciting PETSc users to share their usage experiences, application successes, and ongoing challenges in an online Zoom Birds-of-a-Feather (BoF) session, to be held between February 10~12, 2026. We are seeking approximately five short user presentations, each consisting of a 5-minute talk followed by 2 minutes of questions. If you are interested in presenting, please contact petsc-maint at mcs.anl.gov with your talk title, a brief abstract, and your preferred time slot (11:00 AM, 1:00 PM, or 3:00 PM EST). The BoF is hosted by the Consortium for the Advancement of Scientific Software (CASS ) and led by the PESO (Partnering for Scientific Software Ecosystem Stewardship) project. The PETSc session will last 90 minutes and will take place on one day between February 10 and 12, 2026 (exact date to be finalized). Please note that the session will not be recorded. 
Our preferred time slot is 11:00 AM EST (5:00 PM UTC) to better accommodate European participants, although alternative options at 1:00 PM or 3:00 PM EST are also under consideration. In addition to user presentations, during the session, PETSc developers will highlight recent advances developed following the Exascale Computing Project, including the new PETSc Fortran bindings, PetscRegressor, TaoTerm, updates to PETSc GPU backends, mixed-precision support in PETSc/MUMPS, and integration with OpenFOAM, among other topics. The program will also include an open discussion of emerging PETSc research directions, such as leveraging agentic artificial intelligence to enhance and exploit the PETSc knowledge base. The BoF will provide insight into PETSc?s near-term development roadmap and offer a forum for user feedback on desired features and improvements. Active participation and questions from the audience are strongly encouraged, enabling the PETSc team to better align future development with community needs. The agenda will be posted once the program is finalized. Thank you, and we look forward to your participation. Junchao Zhang On behalf of the PETSc team -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.knezevic at akselos.com Sun Dec 21 16:53:25 2025 From: david.knezevic at akselos.com (David Knezevic) Date: Sun, 21 Dec 2025 17:53:25 -0500 Subject: [petsc-users] Question regarding SNES error about locked vectors In-Reply-To: References: <855F3D06-08B9-4CD1-ABE8-3E55D4DD802E@petsc.dev> Message-ID: Hi, actually, I have a follow up on this topic. I noticed that when I call SNESSetFunctionDomainError(), it exits the solve as expected, but it leads to a converged reason "DIVERGED_LINE_SEARCH" instead of "DIVERGED_FUNCTION_DOMAIN". If I also set SNESSetConvergedReason(snes, SNES_DIVERGED_FUNCTION_DOMAIN) in the callback, then I get the expected SNES_DIVERGED_FUNCTION_DOMAIN converged reason, so that's what I'm doing now. I was surprised by this behavior, though, since I expected that calling SNESSetFunctionDomainError woudld lead to the DIVERGED_FUNCTION_DOMAIN converged reason, so I just wanted to check on what could be causing this. FYI, I'm using PETSc 3.23.4 Thanks, David On Thu, Dec 18, 2025 at 8:10?AM David Knezevic wrote: > Thank you very much for this guidance. I switched to use > SNES_DIVERGED_FUNCTION_DOMAIN, and I don't get any errors now. > > Thanks! > David > > > On Wed, Dec 17, 2025 at 3:43?PM Barry Smith wrote: > >> >> >> On Dec 17, 2025, at 2:47?PM, David Knezevic >> wrote: >> >> Stefano and Barry: Thank you, this is very helpful. >> >> I'll give some more info here which may help to clarify further. Normally >> we do just get a negative "converged reason", as you described. But in this >> specific case where I'm having issues the solve is a numerically sensitive >> creep solve, which has exponential terms in the residual and jacobian >> callback that can "blow up" and give NaN values. In this case, the root >> cause is that we hit a NaN value during a callback, and then we throw an >> exception (in libMesh C++ code) which I gather leads to the SNES solve >> exiting with this error code. >> >> Is there a way to tell the SNES to terminate with a negative "converged >> reason" because we've encountered some issue during the callback? >> >> >> In your callback you should call SNESSetFunctionDomainError() and make >> sure the function value has an infinity or NaN in it (you can call >> VecFlag() for this purpose)). 
>> >> Now SNESConvergedReason will be a completely >> reasonable SNES_DIVERGED_FUNCTION_DOMAIN >> >> Barry >> >> If you are using an ancient version of PETSc (I hope you are using the >> latest since that always has more bug fixes and features) that does not >> have SNESSetFunctionDomainError then just make sure the function vector >> result has an infinity or NaN in it and then SNESConvergedReason will be >> SNES_DIVERGED_FNORM_NAN >> >> >> >> >> Thanks, >> David >> >> >> On Wed, Dec 17, 2025 at 2:25?PM Barry Smith wrote: >> >>> >>> >>> On Dec 17, 2025, at 2:08?PM, David Knezevic via petsc-users < >>> petsc-users at mcs.anl.gov> wrote: >>> >>> Hi, >>> >>> I'm using PETSc via the libMesh framework, so creating a MWE is >>> complicated by that, unfortunately. >>> >>> The situation is that I am not modifying the solution vector in a >>> callback. The SNES solve has terminated, with PetscErrorCode 82, and I then >>> want to update the solution vector (reset it to the "previously converged >>> value") and then try to solve again with a smaller load increment. This is >>> a typical "auto load stepping" strategy in FE. >>> >>> >>> Once a PetscError is generated you CANNOT continue the PETSc program, >>> it is not designed to allow this and trying to continue will lead to >>> further problems. >>> >>> So what you need to do is prevent PETSc from getting to the point >>> where an actual PetscErrorCode of 82 is generated. Normally SNESSolve() >>> returns without generating an error even if the nonlinear solver failed >>> (for example did not converge). One then uses SNESGetConvergedReason to >>> check if it converged or not. Normally when SNESSolve() returns, regardless >>> of whether the converged reason is negative or positive, there will be no >>> locked vectors and one can modify the SNES object and call SNESSolve again. >>> >>> So my guess is that an actual PETSc error is being generated >>> because SNESSetErrorIfNotConverged(snes,PETSC_TRUE) is being called by >>> either your code or libMesh or the option -snes_error_if_not_converged is >>> being used. In your case when you wish the code to work after a >>> non-converged SNESSolve() these options should never be set instead you >>> should check the result of SNESGetConvergedReason() to check if SNESSolve >>> has failed. If SNESSetErrorIfNotConverged() is never being set that may >>> indicate you are using an old version of PETSc or have it a bug inside >>> PETSc's SNES that does not handle errors correctly and we can help fix the >>> problem if you can provide a full debug output version of when the error >>> occurs. >>> >>> Barry >>> >>> >>> >>> >>> >>> >>> >>> >>> I think the key piece of info I'd like to know is, at what point is the >>> solution vector "unlocked" by the SNES object? Should it be unlocked as >>> soon as the SNES solve has terminated with PetscErrorCode 82? Since it >>> seems to me that it hasn't been unlocked yet (maybe just on a subset of the >>> processes). Should I manually "unlock" the solution vector by >>> calling VecLockWriteSet? >>> >>> Thanks, >>> David >>> >>> >>> >>> On Wed, Dec 17, 2025 at 2:02?PM Stefano Zampini < >>> stefano.zampini at gmail.com> wrote: >>> >>>> You are not allowed to call VecGetArray on the solution vector of an >>>> SNES object within a user callback, nor to modify its values in any other >>>> way. 
>>>> Put in C++ lingo, the solution vector is a "const" argument >>>> It would be great if you could provide an MWE to help us understand >>>> your problem >>>> >>>> >>>> Il giorno mer 17 dic 2025 alle ore 20:51 David Knezevic via petsc-users >>>> ha scritto: >>>> >>>>> Hi all, >>>>> >>>>> I have a question about this error: >>>>> >>>>>> Vector 'Vec_0x84000005_0' (argument #2) was locked for read-only >>>>>> access in unknown_function() at unknown file:0 (line numbers only accurate >>>>>> to function begin) >>>>> >>>>> >>>>> I'm encountering this error in an FE solve where there is an error >>>>> encountered during the residual/jacobian assembly, and what we normally do >>>>> in that situation is shrink the load step and continue, starting from the >>>>> "last converged solution". However, in this case I'm running on 32 >>>>> processes, and 5 of the processes report the error above about a "locked >>>>> vector". >>>>> >>>>> We clear the SNES object (via SNESDestroy) before we reset the >>>>> solution to the "last converged solution", and then we make a new SNES >>>>> object subsequently. But it seems to me that somehow the solution vector is >>>>> still marked as "locked" on 5 of the processes when we modify the solution >>>>> vector, which leads to the error above. >>>>> >>>>> I was wondering if someone could advise on what the best way to handle >>>>> this would be? I thought one option could be to add an MPI barrier call >>>>> prior to updating the solution vector to "last converged solution", to make >>>>> sure that the SNES object is destroyed on all procs (and hence the locks >>>>> cleared) before editing the solution vector, but I'm unsure if that would >>>>> make a difference. Any help would be most appreciated! >>>>> >>>>> Thanks, >>>>> David >>>>> >>>> >>>> >>>> -- >>>> Stefano >>>> >>> >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From liluo at um.edu.mo Mon Dec 22 04:46:29 2025 From: liluo at um.edu.mo (liluo) Date: Mon, 22 Dec 2025 10:46:29 +0000 Subject: [petsc-users] A partition of DMPlex mesh similar to what DMDA provides? Message-ID: Dear PETSc developers, I?m using DMPlex to manage an unstructured mesh. However, in my case, the input mesh is actually a structured tetrahedral mesh, and its geometric domain is just a simple box. Is there any PETSc functionality or recommended approach to obtain a partition similar to what DMDA provides?i.e., a simple Cartesian block partition?when working with such a mesh in DMPlex? Any guidance or best practices would be greatly appreciated. Thank you! Bests, Li Luo -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Mon Dec 22 09:25:39 2025 From: bsmith at petsc.dev (Barry Smith) Date: Mon, 22 Dec 2025 10:25:39 -0500 Subject: [petsc-users] Question regarding SNES error about locked vectors In-Reply-To: References: <855F3D06-08B9-4CD1-ABE8-3E55D4DD802E@petsc.dev> Message-ID: David, This is due to a software glitch. SNES_DIVERGED_FUNCTION_DOMAIN was added long after the origins of SNES and, in places, the code was never fully updated to handle function domain problems. In particular, parts of the line search don't handle it correctly. Can you run with -snes_view and that will help us find the spot that needs to be updated. Barry > On Dec 21, 2025, at 5:53?PM, David Knezevic wrote: > > Hi, actually, I have a follow up on this topic. 
> > I noticed that when I call SNESSetFunctionDomainError(), it exits the solve as expected, but it leads to a converged reason "DIVERGED_LINE_SEARCH" instead of "DIVERGED_FUNCTION_DOMAIN". If I also set SNESSetConvergedReason(snes, SNES_DIVERGED_FUNCTION_DOMAIN) in the callback, then I get the expected SNES_DIVERGED_FUNCTION_DOMAIN converged reason, so that's what I'm doing now. I was surprised by this behavior, though, since I expected that calling SNESSetFunctionDomainError woudld lead to the DIVERGED_FUNCTION_DOMAIN converged reason, so I just wanted to check on what could be causing this. > > FYI, I'm using PETSc 3.23.4 > > Thanks, > David > > > On Thu, Dec 18, 2025 at 8:10?AM David Knezevic > wrote: >> Thank you very much for this guidance. I switched to use SNES_DIVERGED_FUNCTION_DOMAIN, and I don't get any errors now. >> >> Thanks! >> David >> >> >> On Wed, Dec 17, 2025 at 3:43?PM Barry Smith > wrote: >>> >>> >>>> On Dec 17, 2025, at 2:47?PM, David Knezevic > wrote: >>>> >>>> Stefano and Barry: Thank you, this is very helpful. >>>> >>>> I'll give some more info here which may help to clarify further. Normally we do just get a negative "converged reason", as you described. But in this specific case where I'm having issues the solve is a numerically sensitive creep solve, which has exponential terms in the residual and jacobian callback that can "blow up" and give NaN values. In this case, the root cause is that we hit a NaN value during a callback, and then we throw an exception (in libMesh C++ code) which I gather leads to the SNES solve exiting with this error code. >>>> >>>> Is there a way to tell the SNES to terminate with a negative "converged reason" because we've encountered some issue during the callback? >>> >>> In your callback you should call SNESSetFunctionDomainError() and make sure the function value has an infinity or NaN in it (you can call VecFlag() for this purpose)). >>> >>> Now SNESConvergedReason will be a completely reasonable SNES_DIVERGED_FUNCTION_DOMAIN >>> >>> Barry >>> >>> If you are using an ancient version of PETSc (I hope you are using the latest since that always has more bug fixes and features) that does not have SNESSetFunctionDomainError then just make sure the function vector result has an infinity or NaN in it and then SNESConvergedReason will be SNES_DIVERGED_FNORM_NAN >>> >>> >>> >>>> >>>> Thanks, >>>> David >>>> >>>> >>>> On Wed, Dec 17, 2025 at 2:25?PM Barry Smith > wrote: >>>>> >>>>> >>>>>> On Dec 17, 2025, at 2:08?PM, David Knezevic via petsc-users > wrote: >>>>>> >>>>>> Hi, >>>>>> >>>>>> I'm using PETSc via the libMesh framework, so creating a MWE is complicated by that, unfortunately. >>>>>> >>>>>> The situation is that I am not modifying the solution vector in a callback. The SNES solve has terminated, with PetscErrorCode 82, and I then want to update the solution vector (reset it to the "previously converged value") and then try to solve again with a smaller load increment. This is a typical "auto load stepping" strategy in FE. >>>>> >>>>> Once a PetscError is generated you CANNOT continue the PETSc program, it is not designed to allow this and trying to continue will lead to further problems. >>>>> >>>>> So what you need to do is prevent PETSc from getting to the point where an actual PetscErrorCode of 82 is generated. Normally SNESSolve() returns without generating an error even if the nonlinear solver failed (for example did not converge). One then uses SNESGetConvergedReason to check if it converged or not. 
Normally when SNESSolve() returns, regardless of whether the converged reason is negative or positive, there will be no locked vectors and one can modify the SNES object and call SNESSolve again. >>>>> >>>>> So my guess is that an actual PETSc error is being generated because SNESSetErrorIfNotConverged(snes,PETSC_TRUE) is being called by either your code or libMesh or the option -snes_error_if_not_converged is being used. In your case when you wish the code to work after a non-converged SNESSolve() these options should never be set instead you should check the result of SNESGetConvergedReason() to check if SNESSolve has failed. If SNESSetErrorIfNotConverged() is never being set that may indicate you are using an old version of PETSc or have it a bug inside PETSc's SNES that does not handle errors correctly and we can help fix the problem if you can provide a full debug output version of when the error occurs. >>>>> >>>>> Barry >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>>> >>>>>> I think the key piece of info I'd like to know is, at what point is the solution vector "unlocked" by the SNES object? Should it be unlocked as soon as the SNES solve has terminated with PetscErrorCode 82? Since it seems to me that it hasn't been unlocked yet (maybe just on a subset of the processes). Should I manually "unlock" the solution vector by calling VecLockWriteSet? >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>> >>>>>> >>>>>> On Wed, Dec 17, 2025 at 2:02?PM Stefano Zampini > wrote: >>>>>>> You are not allowed to call VecGetArray on the solution vector of an SNES object within a user callback, nor to modify its values in any other way. >>>>>>> Put in C++ lingo, the solution vector is a "const" argument >>>>>>> It would be great if you could provide an MWE to help us understand your problem >>>>>>> >>>>>>> >>>>>>> Il giorno mer 17 dic 2025 alle ore 20:51 David Knezevic via petsc-users > ha scritto: >>>>>>>> Hi all, >>>>>>>> >>>>>>>> I have a question about this error: >>>>>>>>> Vector 'Vec_0x84000005_0' (argument #2) was locked for read-only access in unknown_function() at unknown file:0 (line numbers only accurate to function begin) >>>>>>>> >>>>>>>> I'm encountering this error in an FE solve where there is an error encountered during the residual/jacobian assembly, and what we normally do in that situation is shrink the load step and continue, starting from the "last converged solution". However, in this case I'm running on 32 processes, and 5 of the processes report the error above about a "locked vector". >>>>>>>> >>>>>>>> We clear the SNES object (via SNESDestroy) before we reset the solution to the "last converged solution", and then we make a new SNES object subsequently. But it seems to me that somehow the solution vector is still marked as "locked" on 5 of the processes when we modify the solution vector, which leads to the error above. >>>>>>>> >>>>>>>> I was wondering if someone could advise on what the best way to handle this would be? I thought one option could be to add an MPI barrier call prior to updating the solution vector to "last converged solution", to make sure that the SNES object is destroyed on all procs (and hence the locks cleared) before editing the solution vector, but I'm unsure if that would make a difference. Any help would be most appreciated! >>>>>>>> >>>>>>>> Thanks, >>>>>>>> David >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Stefano >>>>> >>> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Mon Dec 22 10:21:22 2025 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 22 Dec 2025 11:21:22 -0500 Subject: [petsc-users] A partition of DMPlex mesh similar to what DMDA provides? In-Reply-To: References: Message-ID: On Mon, Dec 22, 2025 at 5:46?AM liluo wrote: > Dear PETSc developers, > > > I?m using DMPlex to manage an unstructured mesh. However, in my case, the > input mesh is actually a structured tetrahedral mesh, and its geometric > domain is just a simple box. > > > Is there any PETSc functionality or recommended approach to obtain a > partition similar to what DMDA provides?i.e., a simple Cartesian block > partition?when working with such a mesh in DMPlex? > > Any guidance or best practices would be greatly appreciated. > This is trivial in 2D because triangles nicely tile the box, but in 3D tetrahedra are harder to handle.I can see three avenues: 1) Manually You can use PlexPartitioner type user, which allows you to explicitly indicate the cell numbers that go to each process. This is probably more work than you want. 2) Mesh Partitioner + Refinement You can run a partitioner on a small mesh, for which they are pretty good, and then refine that. This is mostly what I do. 3) New algorithm Amal Timalsina published a nice algorithm for converting hexes to tets, so you could create a hex mesh that is partitioned exactly as you want, and then convert it to tets, but this would mean writing new code. Why are you using tets instead of hexes for this problem? Thanks, Matt > Thank you! > > > Bests, > > Li Luo > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!a8JIUtZ9kWgwf5HLe7vrUozP6RnDa-KxLqpAyxrAnKFhl_wgCNxF1SgnsC3wHJFY61YTVZF3nYa7ruCuM9mC$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.knezevic at akselos.com Mon Dec 22 13:58:49 2025 From: david.knezevic at akselos.com (David Knezevic) Date: Mon, 22 Dec 2025 14:58:49 -0500 Subject: [petsc-users] Question regarding SNES error about locked vectors In-Reply-To: References: <855F3D06-08B9-4CD1-ABE8-3E55D4DD802E@petsc.dev> Message-ID: The print out I get from -snes_view is shown below. I wonder if the issue is related to "using user-defined postcheck step"? SNES Object: 1 MPI process type: newtonls maximum iterations=5, maximum function evaluations=10000 tolerances: relative=0., absolute=0., solution=0. total number of linear solver iterations=3 total number of function evaluations=4 norm schedule ALWAYS SNESLineSearch Object: 1 MPI process type: basic maxstep=1.000000e+08, minlambda=1.000000e-12 tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 maximum iterations=40 using user-defined postcheck step KSP Object: 1 MPI process type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: 1 MPI process type: cholesky out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: external factor fill ratio given 0., needed 0. 
Factored matrix follows: Mat Object: 1 MPI process type: mumps rows=1152, cols=1152 package used to perform factorization: mumps total: nonzeros=126936, allocated nonzeros=126936 MUMPS run parameters: Use -ksp_view ::ascii_info_detail to display information for all processes RINFOG(1) (global estimated flops for the elimination after analysis): 1.63461e+07 RINFOG(2) (global estimated flops for the assembly after factorization): 74826. RINFOG(3) (global estimated flops for the elimination after factorization): 1.63461e+07 (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): (0.,0.)*(2^0) INFOG(3) (estimated real workspace for factors on all processors after analysis): 150505 INFOG(4) (estimated integer workspace for factors on all processors after analysis): 6276 INFOG(5) (estimated maximum front size in the complete tree): 216 INFOG(6) (number of nodes in the complete tree): 24 INFOG(7) (ordering option effectively used after analysis): 2 INFOG(8) (structural symmetry in percent of the permuted matrix after analysis): 100 INFOG(9) (total real/complex workspace to store the matrix factors after factorization): 150505 INFOG(10) (total integer space store the matrix factors after factorization): 6276 INFOG(11) (order of largest frontal matrix after factorization): 216 INFOG(12) (number of off-diagonal pivots): 1044 INFOG(13) (number of delayed pivots after factorization): 0 INFOG(14) (number of memory compress after factorization): 0 INFOG(15) (number of steps of iterative refinement after solution): 0 INFOG(16) (estimated size (in MB) of all MUMPS internal data for factorization after analysis: value on the most memory consuming processor): 2 INFOG(17) (estimated size of all MUMPS internal data for factorization after analysis: sum over all processors): 2 INFOG(18) (size of all MUMPS internal data allocated during factorization: value on the most memory consuming processor): 2 INFOG(19) (size of all MUMPS internal data allocated during factorization: sum over all processors): 2 INFOG(20) (estimated number of entries in the factors): 126936 INFOG(21) (size in MB of memory effectively used during factorization - value on the most memory consuming processor): 2 INFOG(22) (size in MB of memory effectively used during factorization - sum over all processors): 2 INFOG(23) (after analysis: value of ICNTL(6) effectively used): 0 INFOG(24) (after analysis: value of ICNTL(12) effectively used): 1 INFOG(25) (after factorization: number of pivots modified by static pivoting): 0 INFOG(28) (after factorization: number of null pivots encountered): 0 INFOG(29) (after factorization: effective number of entries in the factors (sum over all processors)): 126936 INFOG(30, 31) (after solution: size in Mbytes of memory used during solution phase): 2, 2 INFOG(32) (after analysis: type of analysis done): 1 INFOG(33) (value used for ICNTL(8)): 7 INFOG(34) (exponent of the determinant if determinant is requested): 0 INFOG(35) (after factorization: number of entries taking into account BLR factor compression - sum over all processors): 126936 INFOG(36) (after analysis: estimated size of all MUMPS internal data for running BLR in-core - value on the most memory consuming processor): 0 INFOG(37) (after analysis: estimated size of all MUMPS internal data for running BLR in-core - sum over all processors): 0 INFOG(38) (after analysis: estimated size of all MUMPS internal data for running BLR out-of-core - value on the most memory consuming processor): 0 INFOG(39) (after analysis: estimated size of all MUMPS internal data 
for running BLR out-of-core - sum over all processors): 0 linear system matrix = precond matrix: Mat Object: 1 MPI process type: seqaij rows=1152, cols=1152 total: nonzeros=60480, allocated nonzeros=60480 total number of mallocs used during MatSetValues calls=0 using I-node routines: found 384 nodes, limit used is 5 On Mon, Dec 22, 2025 at 9:25?AM Barry Smith wrote: > David, > > This is due to a software glitch. SNES_DIVERGED_FUNCTION_DOMAIN was > added long after the origins of SNES and, in places, the code was never > fully updated to handle function domain problems. In particular, parts of > the line search don't handle it correctly. Can you run with -snes_view and > that will help us find the spot that needs to be updated. > > Barry > > > On Dec 21, 2025, at 5:53?PM, David Knezevic > wrote: > > Hi, actually, I have a follow up on this topic. > > I noticed that when I call SNESSetFunctionDomainError(), it exits the > solve as expected, but it leads to a converged reason > "DIVERGED_LINE_SEARCH" instead of "DIVERGED_FUNCTION_DOMAIN". If I also > set SNESSetConvergedReason(snes, SNES_DIVERGED_FUNCTION_DOMAIN) in the > callback, then I get the expected SNES_DIVERGED_FUNCTION_DOMAIN converged > reason, so that's what I'm doing now. I was surprised by this behavior, > though, since I expected that calling SNESSetFunctionDomainError woudld > lead to the DIVERGED_FUNCTION_DOMAIN converged reason, so I just wanted to > check on what could be causing this. > > FYI, I'm using PETSc 3.23.4 > > Thanks, > David > > > On Thu, Dec 18, 2025 at 8:10?AM David Knezevic > wrote: > >> Thank you very much for this guidance. I switched to use >> SNES_DIVERGED_FUNCTION_DOMAIN, and I don't get any errors now. >> >> Thanks! >> David >> >> >> On Wed, Dec 17, 2025 at 3:43?PM Barry Smith wrote: >> >>> >>> >>> On Dec 17, 2025, at 2:47?PM, David Knezevic >>> wrote: >>> >>> Stefano and Barry: Thank you, this is very helpful. >>> >>> I'll give some more info here which may help to clarify further. >>> Normally we do just get a negative "converged reason", as you described. >>> But in this specific case where I'm having issues the solve is a >>> numerically sensitive creep solve, which has exponential terms in the >>> residual and jacobian callback that can "blow up" and give NaN values. In >>> this case, the root cause is that we hit a NaN value during a callback, and >>> then we throw an exception (in libMesh C++ code) which I gather leads to >>> the SNES solve exiting with this error code. >>> >>> Is there a way to tell the SNES to terminate with a negative "converged >>> reason" because we've encountered some issue during the callback? >>> >>> >>> In your callback you should call SNESSetFunctionDomainError() and >>> make sure the function value has an infinity or NaN in it (you can call >>> VecFlag() for this purpose)). 
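
A minimal sketch of the callback pattern described just above (untested; FormFunction and the residual expression are placeholders rather than the actual libMesh code in this thread, and VecFlag() needs a recent PETSc release):

#include <petscsnes.h>

/* Hypothetical residual callback: if assembly produces an Inf/NaN, mark the
   point as outside the function domain instead of throwing an exception, so
   SNESSolve() returns cleanly with a negative converged reason. */
static PetscErrorCode FormFunction(SNES snes, Vec X, Vec F, void *ctx)
{
  const PetscScalar *x;
  PetscScalar       *f;
  PetscInt           i, n;
  PetscBool          bad = PETSC_FALSE;

  PetscFunctionBeginUser;
  PetscCall(VecGetLocalSize(F, &n));
  PetscCall(VecGetArrayRead(X, &x));   /* the solution is read-only inside callbacks */
  PetscCall(VecGetArray(F, &f));
  for (i = 0; i < n; i++) {
    f[i] = PetscExpScalar(x[i]) - 1.0; /* stand-in for a creep-type residual that can blow up */
    if (PetscIsInfOrNanScalar(f[i])) bad = PETSC_TRUE;
  }
  PetscCall(VecRestoreArray(F, &f));
  PetscCall(VecRestoreArrayRead(X, &x));
  if (bad) {
    PetscCall(SNESSetFunctionDomainError(snes)); /* flag the domain failure to SNES */
    PetscCall(VecFlag(F, PETSC_TRUE));           /* put an Inf into F so norm checks see it */
  }
  PetscFunctionReturn(PETSC_SUCCESS);
}

The driver then relies on SNESGetConvergedReason() after SNESSolve(), rather than -snes_error_if_not_converged, as discussed elsewhere in this thread.
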
>>> >>> Now SNESConvergedReason will be a completely >>> reasonable SNES_DIVERGED_FUNCTION_DOMAIN >>> >>> Barry >>> >>> If you are using an ancient version of PETSc (I hope you are using the >>> latest since that always has more bug fixes and features) that does not >>> have SNESSetFunctionDomainError then just make sure the function vector >>> result has an infinity or NaN in it and then SNESConvergedReason will be >>> SNES_DIVERGED_FNORM_NAN >>> >>> >>> >>> >>> Thanks, >>> David >>> >>> >>> On Wed, Dec 17, 2025 at 2:25?PM Barry Smith wrote: >>> >>>> >>>> >>>> On Dec 17, 2025, at 2:08?PM, David Knezevic via petsc-users < >>>> petsc-users at mcs.anl.gov> wrote: >>>> >>>> Hi, >>>> >>>> I'm using PETSc via the libMesh framework, so creating a MWE is >>>> complicated by that, unfortunately. >>>> >>>> The situation is that I am not modifying the solution vector in a >>>> callback. The SNES solve has terminated, with PetscErrorCode 82, and I then >>>> want to update the solution vector (reset it to the "previously converged >>>> value") and then try to solve again with a smaller load increment. This is >>>> a typical "auto load stepping" strategy in FE. >>>> >>>> >>>> Once a PetscError is generated you CANNOT continue the PETSc >>>> program, it is not designed to allow this and trying to continue will lead >>>> to further problems. >>>> >>>> So what you need to do is prevent PETSc from getting to the point >>>> where an actual PetscErrorCode of 82 is generated. Normally SNESSolve() >>>> returns without generating an error even if the nonlinear solver failed >>>> (for example did not converge). One then uses SNESGetConvergedReason to >>>> check if it converged or not. Normally when SNESSolve() returns, regardless >>>> of whether the converged reason is negative or positive, there will be no >>>> locked vectors and one can modify the SNES object and call SNESSolve again. >>>> >>>> So my guess is that an actual PETSc error is being generated >>>> because SNESSetErrorIfNotConverged(snes,PETSC_TRUE) is being called by >>>> either your code or libMesh or the option -snes_error_if_not_converged is >>>> being used. In your case when you wish the code to work after a >>>> non-converged SNESSolve() these options should never be set instead you >>>> should check the result of SNESGetConvergedReason() to check if SNESSolve >>>> has failed. If SNESSetErrorIfNotConverged() is never being set that may >>>> indicate you are using an old version of PETSc or have it a bug inside >>>> PETSc's SNES that does not handle errors correctly and we can help fix the >>>> problem if you can provide a full debug output version of when the error >>>> occurs. >>>> >>>> Barry >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> I think the key piece of info I'd like to know is, at what point is the >>>> solution vector "unlocked" by the SNES object? Should it be unlocked as >>>> soon as the SNES solve has terminated with PetscErrorCode 82? Since it >>>> seems to me that it hasn't been unlocked yet (maybe just on a subset of the >>>> processes). Should I manually "unlock" the solution vector by >>>> calling VecLockWriteSet? >>>> >>>> Thanks, >>>> David >>>> >>>> >>>> >>>> On Wed, Dec 17, 2025 at 2:02?PM Stefano Zampini < >>>> stefano.zampini at gmail.com> wrote: >>>> >>>>> You are not allowed to call VecGetArray on the solution vector of an >>>>> SNES object within a user callback, nor to modify its values in any other >>>>> way. 
>>>>> Put in C++ lingo, the solution vector is a "const" argument >>>>> It would be great if you could provide an MWE to help us understand >>>>> your problem >>>>> >>>>> >>>>> Il giorno mer 17 dic 2025 alle ore 20:51 David Knezevic via >>>>> petsc-users ha scritto: >>>>> >>>>>> Hi all, >>>>>> >>>>>> I have a question about this error: >>>>>> >>>>>>> Vector 'Vec_0x84000005_0' (argument #2) was locked for read-only >>>>>>> access in unknown_function() at unknown file:0 (line numbers only accurate >>>>>>> to function begin) >>>>>> >>>>>> >>>>>> I'm encountering this error in an FE solve where there is an error >>>>>> encountered during the residual/jacobian assembly, and what we normally do >>>>>> in that situation is shrink the load step and continue, starting from the >>>>>> "last converged solution". However, in this case I'm running on 32 >>>>>> processes, and 5 of the processes report the error above about a "locked >>>>>> vector". >>>>>> >>>>>> We clear the SNES object (via SNESDestroy) before we reset the >>>>>> solution to the "last converged solution", and then we make a new SNES >>>>>> object subsequently. But it seems to me that somehow the solution vector is >>>>>> still marked as "locked" on 5 of the processes when we modify the solution >>>>>> vector, which leads to the error above. >>>>>> >>>>>> I was wondering if someone could advise on what the best way to >>>>>> handle this would be? I thought one option could be to add an MPI barrier >>>>>> call prior to updating the solution vector to "last converged solution", to >>>>>> make sure that the SNES object is destroyed on all procs (and hence the >>>>>> locks cleared) before editing the solution vector, but I'm unsure if that >>>>>> would make a difference. Any help would be most appreciated! >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>> >>>>> >>>>> -- >>>>> Stefano >>>>> >>>> >>>> >>> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.knezevic at akselos.com Mon Dec 22 14:03:03 2025 From: david.knezevic at akselos.com (David Knezevic) Date: Mon, 22 Dec 2025 15:03:03 -0500 Subject: [petsc-users] Question regarding SNES error about locked vectors In-Reply-To: References: <855F3D06-08B9-4CD1-ABE8-3E55D4DD802E@petsc.dev> Message-ID: P.S. As a test I removed the "postcheck" callback, and I still get the same behavior with the DIVERGED_LINE_SEARCH converged reason, so I guess the "postcheck" is not related. On Mon, Dec 22, 2025 at 1:58?PM David Knezevic wrote: > The print out I get from -snes_view is shown below. I wonder if the issue > is related to "using user-defined postcheck step"? > > > SNES Object: 1 MPI process > type: newtonls > maximum iterations=5, maximum function evaluations=10000 > tolerances: relative=0., absolute=0., solution=0. > total number of linear solver iterations=3 > total number of function evaluations=4 > norm schedule ALWAYS > SNESLineSearch Object: 1 MPI process > type: basic > maxstep=1.000000e+08, minlambda=1.000000e-12 > tolerances: relative=1.000000e-08, absolute=1.000000e-15, > lambda=1.000000e-08 > maximum iterations=40 > using user-defined postcheck step > KSP Object: 1 MPI process > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: 1 MPI process > type: cholesky > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: external > factor fill ratio given 0., needed 0. 
> Factored matrix follows: > Mat Object: 1 MPI process > type: mumps > rows=1152, cols=1152 > package used to perform factorization: mumps > total: nonzeros=126936, allocated nonzeros=126936 > MUMPS run parameters: > Use -ksp_view ::ascii_info_detail to display information > for all processes > RINFOG(1) (global estimated flops for the elimination > after analysis): 1.63461e+07 > RINFOG(2) (global estimated flops for the assembly after > factorization): 74826. > RINFOG(3) (global estimated flops for the elimination > after factorization): 1.63461e+07 > (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): > (0.,0.)*(2^0) > INFOG(3) (estimated real workspace for factors on all > processors after analysis): 150505 > INFOG(4) (estimated integer workspace for factors on all > processors after analysis): 6276 > INFOG(5) (estimated maximum front size in the complete > tree): 216 > INFOG(6) (number of nodes in the complete tree): 24 > INFOG(7) (ordering option effectively used after > analysis): 2 > INFOG(8) (structural symmetry in percent of the permuted > matrix after analysis): 100 > INFOG(9) (total real/complex workspace to store the matrix > factors after factorization): 150505 > INFOG(10) (total integer space store the matrix factors > after factorization): 6276 > INFOG(11) (order of largest frontal matrix after > factorization): 216 > INFOG(12) (number of off-diagonal pivots): 1044 > INFOG(13) (number of delayed pivots after factorization): 0 > INFOG(14) (number of memory compress after factorization): > 0 > INFOG(15) (number of steps of iterative refinement after > solution): 0 > INFOG(16) (estimated size (in MB) of all MUMPS internal > data for factorization after analysis: value on the most memory consuming > processor): 2 > INFOG(17) (estimated size of all MUMPS internal data for > factorization after analysis: sum over all processors): 2 > INFOG(18) (size of all MUMPS internal data allocated > during factorization: value on the most memory consuming processor): 2 > INFOG(19) (size of all MUMPS internal data allocated > during factorization: sum over all processors): 2 > INFOG(20) (estimated number of entries in the factors): > 126936 > INFOG(21) (size in MB of memory effectively used during > factorization - value on the most memory consuming processor): 2 > INFOG(22) (size in MB of memory effectively used during > factorization - sum over all processors): 2 > INFOG(23) (after analysis: value of ICNTL(6) effectively > used): 0 > INFOG(24) (after analysis: value of ICNTL(12) effectively > used): 1 > INFOG(25) (after factorization: number of pivots modified > by static pivoting): 0 > INFOG(28) (after factorization: number of null pivots > encountered): 0 > INFOG(29) (after factorization: effective number of > entries in the factors (sum over all processors)): 126936 > INFOG(30, 31) (after solution: size in Mbytes of memory > used during solution phase): 2, 2 > INFOG(32) (after analysis: type of analysis done): 1 > INFOG(33) (value used for ICNTL(8)): 7 > INFOG(34) (exponent of the determinant if determinant is > requested): 0 > INFOG(35) (after factorization: number of entries taking > into account BLR factor compression - sum over all processors): 126936 > INFOG(36) (after analysis: estimated size of all MUMPS > internal data for running BLR in-core - value on the most memory consuming > processor): 0 > INFOG(37) (after analysis: estimated size of all MUMPS > internal data for running BLR in-core - sum over all processors): 0 > INFOG(38) (after analysis: estimated size of all MUMPS > 
internal data for running BLR out-of-core - value on the most memory > consuming processor): 0 > INFOG(39) (after analysis: estimated size of all MUMPS > internal data for running BLR out-of-core - sum over all processors): 0 > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqaij > rows=1152, cols=1152 > total: nonzeros=60480, allocated nonzeros=60480 > total number of mallocs used during MatSetValues calls=0 > using I-node routines: found 384 nodes, limit used is 5 > > > > On Mon, Dec 22, 2025 at 9:25?AM Barry Smith wrote: > >> David, >> >> This is due to a software glitch. SNES_DIVERGED_FUNCTION_DOMAIN was >> added long after the origins of SNES and, in places, the code was never >> fully updated to handle function domain problems. In particular, parts of >> the line search don't handle it correctly. Can you run with -snes_view and >> that will help us find the spot that needs to be updated. >> >> Barry >> >> >> On Dec 21, 2025, at 5:53?PM, David Knezevic >> wrote: >> >> Hi, actually, I have a follow up on this topic. >> >> I noticed that when I call SNESSetFunctionDomainError(), it exits the >> solve as expected, but it leads to a converged reason >> "DIVERGED_LINE_SEARCH" instead of "DIVERGED_FUNCTION_DOMAIN". If I also >> set SNESSetConvergedReason(snes, SNES_DIVERGED_FUNCTION_DOMAIN) in the >> callback, then I get the expected SNES_DIVERGED_FUNCTION_DOMAIN converged >> reason, so that's what I'm doing now. I was surprised by this behavior, >> though, since I expected that calling SNESSetFunctionDomainError woudld >> lead to the DIVERGED_FUNCTION_DOMAIN converged reason, so I just wanted to >> check on what could be causing this. >> >> FYI, I'm using PETSc 3.23.4 >> >> Thanks, >> David >> >> >> On Thu, Dec 18, 2025 at 8:10?AM David Knezevic < >> david.knezevic at akselos.com> wrote: >> >>> Thank you very much for this guidance. I switched to use >>> SNES_DIVERGED_FUNCTION_DOMAIN, and I don't get any errors now. >>> >>> Thanks! >>> David >>> >>> >>> On Wed, Dec 17, 2025 at 3:43?PM Barry Smith wrote: >>> >>>> >>>> >>>> On Dec 17, 2025, at 2:47?PM, David Knezevic >>>> wrote: >>>> >>>> Stefano and Barry: Thank you, this is very helpful. >>>> >>>> I'll give some more info here which may help to clarify further. >>>> Normally we do just get a negative "converged reason", as you described. >>>> But in this specific case where I'm having issues the solve is a >>>> numerically sensitive creep solve, which has exponential terms in the >>>> residual and jacobian callback that can "blow up" and give NaN values. In >>>> this case, the root cause is that we hit a NaN value during a callback, and >>>> then we throw an exception (in libMesh C++ code) which I gather leads to >>>> the SNES solve exiting with this error code. >>>> >>>> Is there a way to tell the SNES to terminate with a negative "converged >>>> reason" because we've encountered some issue during the callback? >>>> >>>> >>>> In your callback you should call SNESSetFunctionDomainError() and >>>> make sure the function value has an infinity or NaN in it (you can call >>>> VecFlag() for this purpose)). 
>>>> >>>> Now SNESConvergedReason will be a completely >>>> reasonable SNES_DIVERGED_FUNCTION_DOMAIN >>>> >>>> Barry >>>> >>>> If you are using an ancient version of PETSc (I hope you are using the >>>> latest since that always has more bug fixes and features) that does not >>>> have SNESSetFunctionDomainError then just make sure the function vector >>>> result has an infinity or NaN in it and then SNESConvergedReason will be >>>> SNES_DIVERGED_FNORM_NAN >>>> >>>> >>>> >>>> >>>> Thanks, >>>> David >>>> >>>> >>>> On Wed, Dec 17, 2025 at 2:25?PM Barry Smith wrote: >>>> >>>>> >>>>> >>>>> On Dec 17, 2025, at 2:08?PM, David Knezevic via petsc-users < >>>>> petsc-users at mcs.anl.gov> wrote: >>>>> >>>>> Hi, >>>>> >>>>> I'm using PETSc via the libMesh framework, so creating a MWE is >>>>> complicated by that, unfortunately. >>>>> >>>>> The situation is that I am not modifying the solution vector in a >>>>> callback. The SNES solve has terminated, with PetscErrorCode 82, and I then >>>>> want to update the solution vector (reset it to the "previously converged >>>>> value") and then try to solve again with a smaller load increment. This is >>>>> a typical "auto load stepping" strategy in FE. >>>>> >>>>> >>>>> Once a PetscError is generated you CANNOT continue the PETSc >>>>> program, it is not designed to allow this and trying to continue will lead >>>>> to further problems. >>>>> >>>>> So what you need to do is prevent PETSc from getting to the point >>>>> where an actual PetscErrorCode of 82 is generated. Normally SNESSolve() >>>>> returns without generating an error even if the nonlinear solver failed >>>>> (for example did not converge). One then uses SNESGetConvergedReason to >>>>> check if it converged or not. Normally when SNESSolve() returns, regardless >>>>> of whether the converged reason is negative or positive, there will be no >>>>> locked vectors and one can modify the SNES object and call SNESSolve again. >>>>> >>>>> So my guess is that an actual PETSc error is being generated >>>>> because SNESSetErrorIfNotConverged(snes,PETSC_TRUE) is being called by >>>>> either your code or libMesh or the option -snes_error_if_not_converged is >>>>> being used. In your case when you wish the code to work after a >>>>> non-converged SNESSolve() these options should never be set instead you >>>>> should check the result of SNESGetConvergedReason() to check if SNESSolve >>>>> has failed. If SNESSetErrorIfNotConverged() is never being set that may >>>>> indicate you are using an old version of PETSc or have it a bug inside >>>>> PETSc's SNES that does not handle errors correctly and we can help fix the >>>>> problem if you can provide a full debug output version of when the error >>>>> occurs. >>>>> >>>>> Barry >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> I think the key piece of info I'd like to know is, at what point is >>>>> the solution vector "unlocked" by the SNES object? Should it be unlocked as >>>>> soon as the SNES solve has terminated with PetscErrorCode 82? Since it >>>>> seems to me that it hasn't been unlocked yet (maybe just on a subset of the >>>>> processes). Should I manually "unlock" the solution vector by >>>>> calling VecLockWriteSet? >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>> >>>>> >>>>> On Wed, Dec 17, 2025 at 2:02?PM Stefano Zampini < >>>>> stefano.zampini at gmail.com> wrote: >>>>> >>>>>> You are not allowed to call VecGetArray on the solution vector of an >>>>>> SNES object within a user callback, nor to modify its values in any other >>>>>> way. 
>>>>>> Put in C++ lingo, the solution vector is a "const" argument >>>>>> It would be great if you could provide an MWE to help us understand >>>>>> your problem >>>>>> >>>>>> >>>>>> Il giorno mer 17 dic 2025 alle ore 20:51 David Knezevic via >>>>>> petsc-users ha scritto: >>>>>> >>>>>>> Hi all, >>>>>>> >>>>>>> I have a question about this error: >>>>>>> >>>>>>>> Vector 'Vec_0x84000005_0' (argument #2) was locked for read-only >>>>>>>> access in unknown_function() at unknown file:0 (line numbers only accurate >>>>>>>> to function begin) >>>>>>> >>>>>>> >>>>>>> I'm encountering this error in an FE solve where there is an error >>>>>>> encountered during the residual/jacobian assembly, and what we normally do >>>>>>> in that situation is shrink the load step and continue, starting from the >>>>>>> "last converged solution". However, in this case I'm running on 32 >>>>>>> processes, and 5 of the processes report the error above about a "locked >>>>>>> vector". >>>>>>> >>>>>>> We clear the SNES object (via SNESDestroy) before we reset the >>>>>>> solution to the "last converged solution", and then we make a new SNES >>>>>>> object subsequently. But it seems to me that somehow the solution vector is >>>>>>> still marked as "locked" on 5 of the processes when we modify the solution >>>>>>> vector, which leads to the error above. >>>>>>> >>>>>>> I was wondering if someone could advise on what the best way to >>>>>>> handle this would be? I thought one option could be to add an MPI barrier >>>>>>> call prior to updating the solution vector to "last converged solution", to >>>>>>> make sure that the SNES object is destroyed on all procs (and hence the >>>>>>> locks cleared) before editing the solution vector, but I'm unsure if that >>>>>>> would make a difference. Any help would be most appreciated! >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> Stefano >>>>>> >>>>> >>>>> >>>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From liluo at um.edu.mo Mon Dec 22 21:41:42 2025 From: liluo at um.edu.mo (liluo) Date: Tue, 23 Dec 2025 03:41:42 +0000 Subject: [petsc-users] A partition of DMPlex mesh similar to what DMDA provides? In-Reply-To: References: , Message-ID: <5e6806fe357b417e85969c8a0ae62418@um.edu.mo> Thanks for your suggestions! Since the code is already using tets for finite element discretization, I don't want to change it, but want a classical DMDA type partition. Bests, Li ________________________________ From: Matthew Knepley Sent: Tuesday, 23 December, 2025 00:21:22 To: liluo Cc: petsc-users at mcs.anl.gov; Zhang Pai Subject: Re: [petsc-users] A partition of DMPlex mesh similar to what DMDA provides? On Mon, Dec 22, 2025 at 5:?46 AM liluo wrote: Dear PETSc developers, I?m using DMPlex to manage an unstructured mesh. However, in my case, the input mesh is actually a structured tetrahedral mesh, and its geometric domain On Mon, Dec 22, 2025 at 5:46?AM liluo > wrote: Dear PETSc developers, I?m using DMPlex to manage an unstructured mesh. However, in my case, the input mesh is actually a structured tetrahedral mesh, and its geometric domain is just a simple box. Is there any PETSc functionality or recommended approach to obtain a partition similar to what DMDA provides?i.e., a simple Cartesian block partition?when working with such a mesh in DMPlex? Any guidance or best practices would be greatly appreciated. 
This is trivial in 2D because triangles nicely tile the box, but in 3D tetrahedra are harder to handle.I can see three avenues: 1) Manually You can use PlexPartitioner type user, which allows you to explicitly indicate the cell numbers that go to each process. This is probably more work than you want. 2) Mesh Partitioner + Refinement You can run a partitioner on a small mesh, for which they are pretty good, and then refine that. This is mostly what I do. 3) New algorithm Amal Timalsina published a nice algorithm for converting hexes to tets, so you could create a hex mesh that is partitioned exactly as you want, and then convert it to tets, but this would mean writing new code. Why are you using tets instead of hexes for this problem? Thanks, Matt Thank you! Bests, Li Luo -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!f78XCziNeTCVb1AnVMYJVAf2Ped-KlffK4RRVwpO9gKfJ013n07va2DBl_SzI6is-rHSJwgZ3R8slp9jZfmZKg$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Dec 24 22:02:17 2025 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 24 Dec 2025 23:02:17 -0500 Subject: [petsc-users] Question regarding SNES error about locked vectors In-Reply-To: References: <855F3D06-08B9-4CD1-ABE8-3E55D4DD802E@petsc.dev> Message-ID: I have started a merge request to properly propagate failure reasons up from the line search to the SNESSolve in https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/8914__;!!G_uCfscf7eWS!b2nWDVWqfyc96V63w_2sLd0siVZ769Ztwal8rZgfCzJ3q3V3ALVEMdGDLu6IvbSPmudCO08cQL4r0J54oVEz12k$ Could you give it a try when you get the chance? > On Dec 22, 2025, at 3:03?PM, David Knezevic wrote: > > P.S. As a test I removed the "postcheck" callback, and I still get the same behavior with the DIVERGED_LINE_SEARCH converged reason, so I guess the "postcheck" is not related. > > > On Mon, Dec 22, 2025 at 1:58?PM David Knezevic > wrote: >> The print out I get from -snes_view is shown below. I wonder if the issue is related to "using user-defined postcheck step"? >> >> >> SNES Object: 1 MPI process >> type: newtonls >> maximum iterations=5, maximum function evaluations=10000 >> tolerances: relative=0., absolute=0., solution=0. >> total number of linear solver iterations=3 >> total number of function evaluations=4 >> norm schedule ALWAYS >> SNESLineSearch Object: 1 MPI process >> type: basic >> maxstep=1.000000e+08, minlambda=1.000000e-12 >> tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 >> maximum iterations=40 >> using user-defined postcheck step >> KSP Object: 1 MPI process >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: 1 MPI process >> type: cholesky >> out-of-place factorization >> tolerance for zero pivot 2.22045e-14 >> matrix ordering: external >> factor fill ratio given 0., needed 0. 
>> Factored matrix follows: >> Mat Object: 1 MPI process >> type: mumps >> rows=1152, cols=1152 >> package used to perform factorization: mumps >> total: nonzeros=126936, allocated nonzeros=126936 >> MUMPS run parameters: >> Use -ksp_view ::ascii_info_detail to display information for all processes >> RINFOG(1) (global estimated flops for the elimination after analysis): 1.63461e+07 >> RINFOG(2) (global estimated flops for the assembly after factorization): 74826. >> RINFOG(3) (global estimated flops for the elimination after factorization): 1.63461e+07 >> (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): (0.,0.)*(2^0) >> INFOG(3) (estimated real workspace for factors on all processors after analysis): 150505 >> INFOG(4) (estimated integer workspace for factors on all processors after analysis): 6276 >> INFOG(5) (estimated maximum front size in the complete tree): 216 >> INFOG(6) (number of nodes in the complete tree): 24 >> INFOG(7) (ordering option effectively used after analysis): 2 >> INFOG(8) (structural symmetry in percent of the permuted matrix after analysis): 100 >> INFOG(9) (total real/complex workspace to store the matrix factors after factorization): 150505 >> INFOG(10) (total integer space store the matrix factors after factorization): 6276 >> INFOG(11) (order of largest frontal matrix after factorization): 216 >> INFOG(12) (number of off-diagonal pivots): 1044 >> INFOG(13) (number of delayed pivots after factorization): 0 >> INFOG(14) (number of memory compress after factorization): 0 >> INFOG(15) (number of steps of iterative refinement after solution): 0 >> INFOG(16) (estimated size (in MB) of all MUMPS internal data for factorization after analysis: value on the most memory consuming processor): 2 >> INFOG(17) (estimated size of all MUMPS internal data for factorization after analysis: sum over all processors): 2 >> INFOG(18) (size of all MUMPS internal data allocated during factorization: value on the most memory consuming processor): 2 >> INFOG(19) (size of all MUMPS internal data allocated during factorization: sum over all processors): 2 >> INFOG(20) (estimated number of entries in the factors): 126936 >> INFOG(21) (size in MB of memory effectively used during factorization - value on the most memory consuming processor): 2 >> INFOG(22) (size in MB of memory effectively used during factorization - sum over all processors): 2 >> INFOG(23) (after analysis: value of ICNTL(6) effectively used): 0 >> INFOG(24) (after analysis: value of ICNTL(12) effectively used): 1 >> INFOG(25) (after factorization: number of pivots modified by static pivoting): 0 >> INFOG(28) (after factorization: number of null pivots encountered): 0 >> INFOG(29) (after factorization: effective number of entries in the factors (sum over all processors)): 126936 >> INFOG(30, 31) (after solution: size in Mbytes of memory used during solution phase): 2, 2 >> INFOG(32) (after analysis: type of analysis done): 1 >> INFOG(33) (value used for ICNTL(8)): 7 >> INFOG(34) (exponent of the determinant if determinant is requested): 0 >> INFOG(35) (after factorization: number of entries taking into account BLR factor compression - sum over all processors): 126936 >> INFOG(36) (after analysis: estimated size of all MUMPS internal data for running BLR in-core - value on the most memory consuming processor): 0 >> INFOG(37) (after analysis: estimated size of all MUMPS internal data for running BLR in-core - sum over all processors): 0 >> INFOG(38) (after analysis: estimated size of all MUMPS internal data for running BLR 
out-of-core - value on the most memory consuming processor): 0 >> INFOG(39) (after analysis: estimated size of all MUMPS internal data for running BLR out-of-core - sum over all processors): 0 >> linear system matrix = precond matrix: >> Mat Object: 1 MPI process >> type: seqaij >> rows=1152, cols=1152 >> total: nonzeros=60480, allocated nonzeros=60480 >> total number of mallocs used during MatSetValues calls=0 >> using I-node routines: found 384 nodes, limit used is 5 >> >> >> >> On Mon, Dec 22, 2025 at 9:25?AM Barry Smith > wrote: >>> David, >>> >>> This is due to a software glitch. SNES_DIVERGED_FUNCTION_DOMAIN was added long after the origins of SNES and, in places, the code was never fully updated to handle function domain problems. In particular, parts of the line search don't handle it correctly. Can you run with -snes_view and that will help us find the spot that needs to be updated. >>> >>> Barry >>> >>> >>>> On Dec 21, 2025, at 5:53?PM, David Knezevic > wrote: >>>> >>>> Hi, actually, I have a follow up on this topic. >>>> >>>> I noticed that when I call SNESSetFunctionDomainError(), it exits the solve as expected, but it leads to a converged reason "DIVERGED_LINE_SEARCH" instead of "DIVERGED_FUNCTION_DOMAIN". If I also set SNESSetConvergedReason(snes, SNES_DIVERGED_FUNCTION_DOMAIN) in the callback, then I get the expected SNES_DIVERGED_FUNCTION_DOMAIN converged reason, so that's what I'm doing now. I was surprised by this behavior, though, since I expected that calling SNESSetFunctionDomainError woudld lead to the DIVERGED_FUNCTION_DOMAIN converged reason, so I just wanted to check on what could be causing this. >>>> >>>> FYI, I'm using PETSc 3.23.4 >>>> >>>> Thanks, >>>> David >>>> >>>> >>>> On Thu, Dec 18, 2025 at 8:10?AM David Knezevic > wrote: >>>>> Thank you very much for this guidance. I switched to use SNES_DIVERGED_FUNCTION_DOMAIN, and I don't get any errors now. >>>>> >>>>> Thanks! >>>>> David >>>>> >>>>> >>>>> On Wed, Dec 17, 2025 at 3:43?PM Barry Smith > wrote: >>>>>> >>>>>> >>>>>>> On Dec 17, 2025, at 2:47?PM, David Knezevic > wrote: >>>>>>> >>>>>>> Stefano and Barry: Thank you, this is very helpful. >>>>>>> >>>>>>> I'll give some more info here which may help to clarify further. Normally we do just get a negative "converged reason", as you described. But in this specific case where I'm having issues the solve is a numerically sensitive creep solve, which has exponential terms in the residual and jacobian callback that can "blow up" and give NaN values. In this case, the root cause is that we hit a NaN value during a callback, and then we throw an exception (in libMesh C++ code) which I gather leads to the SNES solve exiting with this error code. >>>>>>> >>>>>>> Is there a way to tell the SNES to terminate with a negative "converged reason" because we've encountered some issue during the callback? >>>>>> >>>>>> In your callback you should call SNESSetFunctionDomainError() and make sure the function value has an infinity or NaN in it (you can call VecFlag() for this purpose)). 
>>>>>> >>>>>> Now SNESConvergedReason will be a completely reasonable SNES_DIVERGED_FUNCTION_DOMAIN >>>>>> >>>>>> Barry >>>>>> >>>>>> If you are using an ancient version of PETSc (I hope you are using the latest since that always has more bug fixes and features) that does not have SNESSetFunctionDomainError then just make sure the function vector result has an infinity or NaN in it and then SNESConvergedReason will be SNES_DIVERGED_FNORM_NAN >>>>>> >>>>>> >>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> >>>>>>> >>>>>>> On Wed, Dec 17, 2025 at 2:25?PM Barry Smith > wrote: >>>>>>>> >>>>>>>> >>>>>>>>> On Dec 17, 2025, at 2:08?PM, David Knezevic via petsc-users > wrote: >>>>>>>>> >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> I'm using PETSc via the libMesh framework, so creating a MWE is complicated by that, unfortunately. >>>>>>>>> >>>>>>>>> The situation is that I am not modifying the solution vector in a callback. The SNES solve has terminated, with PetscErrorCode 82, and I then want to update the solution vector (reset it to the "previously converged value") and then try to solve again with a smaller load increment. This is a typical "auto load stepping" strategy in FE. >>>>>>>> >>>>>>>> Once a PetscError is generated you CANNOT continue the PETSc program, it is not designed to allow this and trying to continue will lead to further problems. >>>>>>>> >>>>>>>> So what you need to do is prevent PETSc from getting to the point where an actual PetscErrorCode of 82 is generated. Normally SNESSolve() returns without generating an error even if the nonlinear solver failed (for example did not converge). One then uses SNESGetConvergedReason to check if it converged or not. Normally when SNESSolve() returns, regardless of whether the converged reason is negative or positive, there will be no locked vectors and one can modify the SNES object and call SNESSolve again. >>>>>>>> >>>>>>>> So my guess is that an actual PETSc error is being generated because SNESSetErrorIfNotConverged(snes,PETSC_TRUE) is being called by either your code or libMesh or the option -snes_error_if_not_converged is being used. In your case when you wish the code to work after a non-converged SNESSolve() these options should never be set instead you should check the result of SNESGetConvergedReason() to check if SNESSolve has failed. If SNESSetErrorIfNotConverged() is never being set that may indicate you are using an old version of PETSc or have it a bug inside PETSc's SNES that does not handle errors correctly and we can help fix the problem if you can provide a full debug output version of when the error occurs. >>>>>>>> >>>>>>>> Barry >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> >>>>>>>>> I think the key piece of info I'd like to know is, at what point is the solution vector "unlocked" by the SNES object? Should it be unlocked as soon as the SNES solve has terminated with PetscErrorCode 82? Since it seems to me that it hasn't been unlocked yet (maybe just on a subset of the processes). Should I manually "unlock" the solution vector by calling VecLockWriteSet? >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> David >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Wed, Dec 17, 2025 at 2:02?PM Stefano Zampini > wrote: >>>>>>>>>> You are not allowed to call VecGetArray on the solution vector of an SNES object within a user callback, nor to modify its values in any other way. 
>>>>>>>>>> Put in C++ lingo, the solution vector is a "const" argument >>>>>>>>>> It would be great if you could provide an MWE to help us understand your problem >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Il giorno mer 17 dic 2025 alle ore 20:51 David Knezevic via petsc-users > ha scritto: >>>>>>>>>>> Hi all, >>>>>>>>>>> >>>>>>>>>>> I have a question about this error: >>>>>>>>>>>> Vector 'Vec_0x84000005_0' (argument #2) was locked for read-only access in unknown_function() at unknown file:0 (line numbers only accurate to function begin) >>>>>>>>>>> >>>>>>>>>>> I'm encountering this error in an FE solve where there is an error encountered during the residual/jacobian assembly, and what we normally do in that situation is shrink the load step and continue, starting from the "last converged solution". However, in this case I'm running on 32 processes, and 5 of the processes report the error above about a "locked vector". >>>>>>>>>>> >>>>>>>>>>> We clear the SNES object (via SNESDestroy) before we reset the solution to the "last converged solution", and then we make a new SNES object subsequently. But it seems to me that somehow the solution vector is still marked as "locked" on 5 of the processes when we modify the solution vector, which leads to the error above. >>>>>>>>>>> >>>>>>>>>>> I was wondering if someone could advise on what the best way to handle this would be? I thought one option could be to add an MPI barrier call prior to updating the solution vector to "last converged solution", to make sure that the SNES object is destroyed on all procs (and hence the locks cleared) before editing the solution vector, but I'm unsure if that would make a difference. Any help would be most appreciated! >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> David >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> Stefano >>>>>>>> >>>>>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.knezevic at akselos.com Thu Dec 25 15:00:26 2025 From: david.knezevic at akselos.com (David Knezevic) Date: Thu, 25 Dec 2025 15:00:26 -0600 Subject: [petsc-users] Question regarding SNES error about locked vectors In-Reply-To: References: <855F3D06-08B9-4CD1-ABE8-3E55D4DD802E@petsc.dev> Message-ID: OK, thanks! I'll let you know once I get a chance to try it out. On Wed, Dec 24, 2025 at 10:02?PM Barry Smith wrote: > I have started a merge request to properly propagate failure reasons up > from the line search to the SNESSolve in > https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/8914__;!!G_uCfscf7eWS!c1k47peCRJtiG7O9EYxpFZUWSVyAnoq-6zoYdEPVFi0-gbNBHUxwlalV7EwvUqCe4iRdsX2nR2S2lzW1Ww7O2LY0rRmDhMQ$ Could you give it a > try when you get the chance? > > > On Dec 22, 2025, at 3:03?PM, David Knezevic > wrote: > > P.S. As a test I removed the "postcheck" callback, and I still get > the same behavior with the DIVERGED_LINE_SEARCH converged reason, so I > guess the "postcheck" is not related. > > > On Mon, Dec 22, 2025 at 1:58?PM David Knezevic > wrote: > >> The print out I get from -snes_view is shown below. I wonder if the issue >> is related to "using user-defined postcheck step"? >> >> >> SNES Object: 1 MPI process >> type: newtonls >> maximum iterations=5, maximum function evaluations=10000 >> tolerances: relative=0., absolute=0., solution=0. 
>> total number of linear solver iterations=3 >> total number of function evaluations=4 >> norm schedule ALWAYS >> SNESLineSearch Object: 1 MPI process >> type: basic >> maxstep=1.000000e+08, minlambda=1.000000e-12 >> tolerances: relative=1.000000e-08, absolute=1.000000e-15, >> lambda=1.000000e-08 >> maximum iterations=40 >> using user-defined postcheck step >> KSP Object: 1 MPI process >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: 1 MPI process >> type: cholesky >> out-of-place factorization >> tolerance for zero pivot 2.22045e-14 >> matrix ordering: external >> factor fill ratio given 0., needed 0. >> Factored matrix follows: >> Mat Object: 1 MPI process >> type: mumps >> rows=1152, cols=1152 >> package used to perform factorization: mumps >> total: nonzeros=126936, allocated nonzeros=126936 >> MUMPS run parameters: >> Use -ksp_view ::ascii_info_detail to display information >> for all processes >> RINFOG(1) (global estimated flops for the elimination >> after analysis): 1.63461e+07 >> RINFOG(2) (global estimated flops for the assembly after >> factorization): 74826. >> RINFOG(3) (global estimated flops for the elimination >> after factorization): 1.63461e+07 >> (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): >> (0.,0.)*(2^0) >> INFOG(3) (estimated real workspace for factors on all >> processors after analysis): 150505 >> INFOG(4) (estimated integer workspace for factors on all >> processors after analysis): 6276 >> INFOG(5) (estimated maximum front size in the complete >> tree): 216 >> INFOG(6) (number of nodes in the complete tree): 24 >> INFOG(7) (ordering option effectively used after >> analysis): 2 >> INFOG(8) (structural symmetry in percent of the permuted >> matrix after analysis): 100 >> INFOG(9) (total real/complex workspace to store the >> matrix factors after factorization): 150505 >> INFOG(10) (total integer space store the matrix factors >> after factorization): 6276 >> INFOG(11) (order of largest frontal matrix after >> factorization): 216 >> INFOG(12) (number of off-diagonal pivots): 1044 >> INFOG(13) (number of delayed pivots after factorization): >> 0 >> INFOG(14) (number of memory compress after >> factorization): 0 >> INFOG(15) (number of steps of iterative refinement after >> solution): 0 >> INFOG(16) (estimated size (in MB) of all MUMPS internal >> data for factorization after analysis: value on the most memory consuming >> processor): 2 >> INFOG(17) (estimated size of all MUMPS internal data for >> factorization after analysis: sum over all processors): 2 >> INFOG(18) (size of all MUMPS internal data allocated >> during factorization: value on the most memory consuming processor): 2 >> INFOG(19) (size of all MUMPS internal data allocated >> during factorization: sum over all processors): 2 >> INFOG(20) (estimated number of entries in the factors): >> 126936 >> INFOG(21) (size in MB of memory effectively used during >> factorization - value on the most memory consuming processor): 2 >> INFOG(22) (size in MB of memory effectively used during >> factorization - sum over all processors): 2 >> INFOG(23) (after analysis: value of ICNTL(6) effectively >> used): 0 >> INFOG(24) (after analysis: value of ICNTL(12) effectively >> used): 1 >> INFOG(25) (after factorization: number of pivots modified >> by static pivoting): 0 >> INFOG(28) (after factorization: number of null pivots >> encountered): 0 >> 
INFOG(29) (after factorization: effective number of >> entries in the factors (sum over all processors)): 126936 >> INFOG(30, 31) (after solution: size in Mbytes of memory >> used during solution phase): 2, 2 >> INFOG(32) (after analysis: type of analysis done): 1 >> INFOG(33) (value used for ICNTL(8)): 7 >> INFOG(34) (exponent of the determinant if determinant is >> requested): 0 >> INFOG(35) (after factorization: number of entries taking >> into account BLR factor compression - sum over all processors): 126936 >> INFOG(36) (after analysis: estimated size of all MUMPS >> internal data for running BLR in-core - value on the most memory consuming >> processor): 0 >> INFOG(37) (after analysis: estimated size of all MUMPS >> internal data for running BLR in-core - sum over all processors): 0 >> INFOG(38) (after analysis: estimated size of all MUMPS >> internal data for running BLR out-of-core - value on the most memory >> consuming processor): 0 >> INFOG(39) (after analysis: estimated size of all MUMPS >> internal data for running BLR out-of-core - sum over all processors): 0 >> linear system matrix = precond matrix: >> Mat Object: 1 MPI process >> type: seqaij >> rows=1152, cols=1152 >> total: nonzeros=60480, allocated nonzeros=60480 >> total number of mallocs used during MatSetValues calls=0 >> using I-node routines: found 384 nodes, limit used is 5 >> >> >> >> On Mon, Dec 22, 2025 at 9:25?AM Barry Smith wrote: >> >>> David, >>> >>> This is due to a software glitch. SNES_DIVERGED_FUNCTION_DOMAIN was >>> added long after the origins of SNES and, in places, the code was never >>> fully updated to handle function domain problems. In particular, parts of >>> the line search don't handle it correctly. Can you run with -snes_view and >>> that will help us find the spot that needs to be updated. >>> >>> Barry >>> >>> >>> On Dec 21, 2025, at 5:53?PM, David Knezevic >>> wrote: >>> >>> Hi, actually, I have a follow up on this topic. >>> >>> I noticed that when I call SNESSetFunctionDomainError(), it exits the >>> solve as expected, but it leads to a converged reason >>> "DIVERGED_LINE_SEARCH" instead of "DIVERGED_FUNCTION_DOMAIN". If I also >>> set SNESSetConvergedReason(snes, SNES_DIVERGED_FUNCTION_DOMAIN) in the >>> callback, then I get the expected SNES_DIVERGED_FUNCTION_DOMAIN converged >>> reason, so that's what I'm doing now. I was surprised by this behavior, >>> though, since I expected that calling SNESSetFunctionDomainError woudld >>> lead to the DIVERGED_FUNCTION_DOMAIN converged reason, so I just wanted to >>> check on what could be causing this. >>> >>> FYI, I'm using PETSc 3.23.4 >>> >>> Thanks, >>> David >>> >>> >>> On Thu, Dec 18, 2025 at 8:10?AM David Knezevic < >>> david.knezevic at akselos.com> wrote: >>> >>>> Thank you very much for this guidance. I switched to use >>>> SNES_DIVERGED_FUNCTION_DOMAIN, and I don't get any errors now. >>>> >>>> Thanks! >>>> David >>>> >>>> >>>> On Wed, Dec 17, 2025 at 3:43?PM Barry Smith wrote: >>>> >>>>> >>>>> >>>>> On Dec 17, 2025, at 2:47?PM, David Knezevic < >>>>> david.knezevic at akselos.com> wrote: >>>>> >>>>> Stefano and Barry: Thank you, this is very helpful. >>>>> >>>>> I'll give some more info here which may help to clarify further. >>>>> Normally we do just get a negative "converged reason", as you described. >>>>> But in this specific case where I'm having issues the solve is a >>>>> numerically sensitive creep solve, which has exponential terms in the >>>>> residual and jacobian callback that can "blow up" and give NaN values. 
In >>>>> this case, the root cause is that we hit a NaN value during a callback, and >>>>> then we throw an exception (in libMesh C++ code) which I gather leads to >>>>> the SNES solve exiting with this error code. >>>>> >>>>> Is there a way to tell the SNES to terminate with a negative >>>>> "converged reason" because we've encountered some issue during the callback? >>>>> >>>>> >>>>> In your callback you should call SNESSetFunctionDomainError() and >>>>> make sure the function value has an infinity or NaN in it (you can call >>>>> VecFlag() for this purpose)). >>>>> >>>>> Now SNESConvergedReason will be a completely >>>>> reasonable SNES_DIVERGED_FUNCTION_DOMAIN >>>>> >>>>> Barry >>>>> >>>>> If you are using an ancient version of PETSc (I hope you are using the >>>>> latest since that always has more bug fixes and features) that does not >>>>> have SNESSetFunctionDomainError then just make sure the function vector >>>>> result has an infinity or NaN in it and then SNESConvergedReason will be >>>>> SNES_DIVERGED_FNORM_NAN >>>>> >>>>> >>>>> >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>> >>>>> On Wed, Dec 17, 2025 at 2:25?PM Barry Smith wrote: >>>>> >>>>>> >>>>>> >>>>>> On Dec 17, 2025, at 2:08?PM, David Knezevic via petsc-users < >>>>>> petsc-users at mcs.anl.gov> wrote: >>>>>> >>>>>> Hi, >>>>>> >>>>>> I'm using PETSc via the libMesh framework, so creating a MWE is >>>>>> complicated by that, unfortunately. >>>>>> >>>>>> The situation is that I am not modifying the solution vector in a >>>>>> callback. The SNES solve has terminated, with PetscErrorCode 82, and I then >>>>>> want to update the solution vector (reset it to the "previously converged >>>>>> value") and then try to solve again with a smaller load increment. This is >>>>>> a typical "auto load stepping" strategy in FE. >>>>>> >>>>>> >>>>>> Once a PetscError is generated you CANNOT continue the PETSc >>>>>> program, it is not designed to allow this and trying to continue will lead >>>>>> to further problems. >>>>>> >>>>>> So what you need to do is prevent PETSc from getting to the point >>>>>> where an actual PetscErrorCode of 82 is generated. Normally SNESSolve() >>>>>> returns without generating an error even if the nonlinear solver failed >>>>>> (for example did not converge). One then uses SNESGetConvergedReason to >>>>>> check if it converged or not. Normally when SNESSolve() returns, regardless >>>>>> of whether the converged reason is negative or positive, there will be no >>>>>> locked vectors and one can modify the SNES object and call SNESSolve again. >>>>>> >>>>>> So my guess is that an actual PETSc error is being generated >>>>>> because SNESSetErrorIfNotConverged(snes,PETSC_TRUE) is being called by >>>>>> either your code or libMesh or the option -snes_error_if_not_converged is >>>>>> being used. In your case when you wish the code to work after a >>>>>> non-converged SNESSolve() these options should never be set instead you >>>>>> should check the result of SNESGetConvergedReason() to check if SNESSolve >>>>>> has failed. If SNESSetErrorIfNotConverged() is never being set that may >>>>>> indicate you are using an old version of PETSc or have it a bug inside >>>>>> PETSc's SNES that does not handle errors correctly and we can help fix the >>>>>> problem if you can provide a full debug output version of when the error >>>>>> occurs. 
>>>>>> >>>>>> Barry >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> I think the key piece of info I'd like to know is, at what point is >>>>>> the solution vector "unlocked" by the SNES object? Should it be unlocked as >>>>>> soon as the SNES solve has terminated with PetscErrorCode 82? Since it >>>>>> seems to me that it hasn't been unlocked yet (maybe just on a subset of the >>>>>> processes). Should I manually "unlock" the solution vector by >>>>>> calling VecLockWriteSet? >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>> >>>>>> >>>>>> On Wed, Dec 17, 2025 at 2:02?PM Stefano Zampini < >>>>>> stefano.zampini at gmail.com> wrote: >>>>>> >>>>>>> You are not allowed to call VecGetArray on the solution vector of an >>>>>>> SNES object within a user callback, nor to modify its values in any other >>>>>>> way. >>>>>>> Put in C++ lingo, the solution vector is a "const" argument >>>>>>> It would be great if you could provide an MWE to help us understand >>>>>>> your problem >>>>>>> >>>>>>> >>>>>>> Il giorno mer 17 dic 2025 alle ore 20:51 David Knezevic via >>>>>>> petsc-users ha scritto: >>>>>>> >>>>>>>> Hi all, >>>>>>>> >>>>>>>> I have a question about this error: >>>>>>>> >>>>>>>>> Vector 'Vec_0x84000005_0' (argument #2) was locked for read-only >>>>>>>>> access in unknown_function() at unknown file:0 (line numbers only accurate >>>>>>>>> to function begin) >>>>>>>> >>>>>>>> >>>>>>>> I'm encountering this error in an FE solve where there is an error >>>>>>>> encountered during the residual/jacobian assembly, and what we normally do >>>>>>>> in that situation is shrink the load step and continue, starting from the >>>>>>>> "last converged solution". However, in this case I'm running on 32 >>>>>>>> processes, and 5 of the processes report the error above about a "locked >>>>>>>> vector". >>>>>>>> >>>>>>>> We clear the SNES object (via SNESDestroy) before we reset the >>>>>>>> solution to the "last converged solution", and then we make a new SNES >>>>>>>> object subsequently. But it seems to me that somehow the solution vector is >>>>>>>> still marked as "locked" on 5 of the processes when we modify the solution >>>>>>>> vector, which leads to the error above. >>>>>>>> >>>>>>>> I was wondering if someone could advise on what the best way to >>>>>>>> handle this would be? I thought one option could be to add an MPI barrier >>>>>>>> call prior to updating the solution vector to "last converged solution", to >>>>>>>> make sure that the SNES object is destroyed on all procs (and hence the >>>>>>>> locks cleared) before editing the solution vector, but I'm unsure if that >>>>>>>> would make a difference. Any help would be most appreciated! >>>>>>>> >>>>>>>> Thanks, >>>>>>>> David >>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Stefano >>>>>>> >>>>>> >>>>>> >>>>> >>> > -------------- next part -------------- An HTML attachment was scrubbed... URL: