From aldo.bonfiglioli at unibas.it Mon Dec 1 07:22:21 2025 From: aldo.bonfiglioli at unibas.it (Aldo Bonfiglioli) Date: Mon, 1 Dec 2025 14:22:21 +0100 Subject: [petsc-users] Trouble when viewing a subDM in vtk format Message-ID: <0732cd7a-6ea5-4720-acf8-6ce8a314136d@unibas.it> Dear developers, I wrote a code that extracts subDMs corresponding to the various strata in the Face Sets. I run into troubles when I view a subDM or a Vec attached to the subDM using the VTK format. More precisely, the problem only occurs on more than one processor when the rank=0 processor has no points on a given subDM. For instance, when the attached reproducer is run on 2 procs, the u_01.vtu file (global u Vec mapped to the subDM corresponding to stratum=1) only includes the header, but no data. All other u_0?.vtu files can successfully be loaded and viewed in paraview. The problem does NOT arise when I view the same objects in HDF5 format. However, my problem in using the HDF5 lies in the fact that: while the hdf5 file obtained with DMView can be post-processed with "petsc/lib/petsc/bin/petsc_gen_xdmf.py" to create a xmf file readable by paraview I do not know how to view the field(s) associated with the DM when the hdf5 file is obtained from VecView. The reproducer compiles with the latest petsc release. Thanks, Aldo -- Dr. Aldo Bonfiglioli Associate professor of Fluid Mechanics Dipartimento di Ingegneria Universita' della Basilicata V.le dell'Ateneo Lucano, 10 85100 Potenza ITALY tel:+39.0971.205203 fax:+39.0971.205215 web: https://urldefense.us/v3/__http://docenti.unibas.it/site/home/docente.html?m=002423__;!!G_uCfscf7eWS!eXE2csxUb1J5bnRosf9LIGxLs17TdstMxQcGs0mbXFroRjzLN81jVmeAWfMr41X8JGEuVm286lVTmDgmi5s1zfsfegJNuWVVWTM$ -------------- next part -------------- A non-text attachment was scrubbed... Name: rgmsh.F90 Type: text/x-fortran Size: 26554 bytes Desc: not available URL: -------------- next part -------------- -dm_plex_dim 3 -dm_plex_shape box -dm_plex_box_faces 20,20,20 -dm_plex_box_lower 0.,0.,0. -dm_plex_box_upper 1.,1.,1. 
##-dm_plex_filename cube6.msh ##-dm_plex_simplex false -dm_plex_simplex true -dm_plex_interpolate ##-dm_plex_check_all ##-dm_plex_filename /home/abonfi/grids/3D/MASA_ns3d/unnested/cube1/cube1.msh # # read a solution from an existing file # #-viewer_type hdf5 #-viewer_format vtk #-viewer_filename /home/abonfi/testcases/3D/scalar/advdiff/hiro/DMPlex/Re100/cube0/u.h5 # ##-viewer_binary_filename /home/abonfi/testcases/3D/scalar/advdiff/hiro/DMPlex/Re100/cube1/sol.bin # # and write it to u.* # ###-vec_view hdf5:u.h5 -vec_view vtk:u.vtu ## ## I can read both mesh.h5 and mesh.vtu in paraview ## -dm_view hdf5:mesh.h5 ##-dm_view vtk:mesh.vtu # # uncomment the following to write each boundary patch separately in an HDF5 file # works both serial and parallel # -patch_01_dm_view hdf5:patch_01.h5 -patch_02_dm_view hdf5:patch_02.h5 -patch_03_dm_view hdf5:patch_03.h5 -patch_04_dm_view hdf5:patch_04.h5 -patch_05_dm_view hdf5:patch_05.h5 -patch_06_dm_view hdf5:patch_06.h5 # # uncomment the following to write each boundary patch separately in a VTK file # it work on one processor, but fails whenever the rank=0 processor has no points # on a given submesh # # #-patch_01_dm_view vtk:patch_01.vtu #-patch_02_dm_view vtk:patch_02.vtu #-patch_03_dm_view vtk:patch_03.vtu #-patch_04_dm_view vtk:patch_04.vtu #-patch_05_dm_view vtk:patch_05.vtu #-patch_06_dm_view vtk:patch_06.vtu # # # ##-dm_plex_view_labels "marker" ##-dm_plex_view_labels "Face Sets" -petscpartitioner_view ####-dm_petscsection_view -options_left From knepley at gmail.com Tue Dec 2 08:02:18 2025 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 2 Dec 2025 09:02:18 -0500 Subject: [petsc-users] Trouble when viewing a subDM in vtk format In-Reply-To: <0732cd7a-6ea5-4720-acf8-6ce8a314136d@unibas.it> References: <0732cd7a-6ea5-4720-acf8-6ce8a314136d@unibas.it> Message-ID: On Mon, Dec 1, 2025 at 8:22?AM Aldo Bonfiglioli wrote: > Dear developers, > > I wrote a code that extracts subDMs corresponding to the various strata > in the Face Sets. > > I run into troubles when I view a subDM or a Vec attached to the subDM > using the VTK format. > > More precisely, the problem only occurs on more than one processor when > the rank=0 processor has no points on a given subDM. > > For instance, when the attached reproducer is run on 2 procs, the > u_01.vtu file (global u Vec mapped to the subDM corresponding to stratum=1) > > only includes the header, but no data. All other u_0?.vtu files can > successfully be loaded and viewed in paraview. > > The problem does NOT arise when I view the same objects in HDF5 format. > > However, my problem in using the HDF5 lies in the fact that: > > while the hdf5 file obtained with DMView can be post-processed with > "petsc/lib/petsc/bin/petsc_gen_xdmf.py" to create a xmf file readable by > paraview > Hi Aldo, Sorry about this, I would like to make it more intuitive. First, the solution (I think) -dm_plex_view_hdf5_storage_version 1.1.0 will write the Viz field by default, so that PAraview will see it. Can you try this? Why do we need this? I have now made version-controlled output formats. There is something about this in the manual, but not enough. Paraview only supports vertex-based fields and cell-based fields (at least that I understand), so we need to write a separate copy of the field (since Plex supports any layout). Lots of people do not want a separate copy, since they are checkpointing, so we control this with a format (PETSC_VIEWER_HDF5_VIZ). 
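A minimal C sketch of the per-output route, assuming a DM named dm and a Vec named u already exist (the helper name WriteForParaview is illustrative; the -dm_plex_view_hdf5_storage_version 1.1.0 option mentioned above achieves the same thing automatically):

#include <petscdmplex.h>
#include <petscviewerhdf5.h>

/* Sketch: write dm and u to one HDF5 file, pushing the visualization
   format so an extra Paraview-friendly copy of the field is written.
   Assumes dm and u exist; error handling via PetscCall(). */
static PetscErrorCode WriteForParaview(DM dm, Vec u, const char fname[])
{
  PetscViewer viewer;

  PetscFunctionBeginUser;
  PetscCall(PetscViewerHDF5Open(PetscObjectComm((PetscObject)dm), fname, FILE_MODE_WRITE, &viewer));
  PetscCall(DMView(dm, viewer));                                   /* topology + coordinates */
  PetscCall(PetscViewerPushFormat(viewer, PETSC_VIEWER_HDF5_VIZ)); /* request the extra viz copy */
  PetscCall(VecView(u, viewer));                                   /* field data */
  PetscCall(PetscViewerPopFormat(viewer));
  PetscCall(PetscViewerDestroy(&viewer));
  PetscFunctionReturn(PETSC_SUCCESS);
}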
You can pass this for specific output, or use the format version that does it automatically. Let me know if this works. Thanks, Matt > I do not know how to view the field(s) associated with the DM when the > hdf5 file is obtained from VecView. > > The reproducer compiles with the latest petsc release. > > Thanks, > > Aldo > > -- > Dr. Aldo Bonfiglioli > Associate professor of Fluid Mechanics > Dipartimento di Ingegneria > Universita' della Basilicata > V.le dell'Ateneo Lucano, 10 85100 Potenza ITALY > tel:+39.0971.205203 fax:+39.0971.205215 > web: > https://urldefense.us/v3/__http://docenti.unibas.it/site/home/docente.html?m=002423__;!!G_uCfscf7eWS!eXE2csxUb1J5bnRosf9LIGxLs17TdstMxQcGs0mbXFroRjzLN81jVmeAWfMr41X8JGEuVm286lVTmDgmi5s1zfsfegJNuWVVWTM$ > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!f1RvXE2KMeUsh5sgUBZzIIBhluwlYswPKT9rJ68dK6QgBnEwc3sSJnMK0IaiZzswk8NZNJjy-jh0mT2gw3zP$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From Elena.Moral.Sanchez at ipp.mpg.de Tue Dec 2 10:53:22 2025 From: Elena.Moral.Sanchez at ipp.mpg.de (Moral Sanchez, Elena) Date: Tue, 2 Dec 2025 16:53:22 +0000 Subject: [petsc-users] error setting the type of the TAO solver Message-ID: <0e75a579f52348bb9eee6d26636885c1@ipp.mpg.de> Hi, I am trying to initialize a LCL TAO solver with petsc4py: from petsc4py import PETSc solver = PETSc.TAO().create() solver.setType(PETSc.TAO.Type.LCL) The last line throws the following error: Traceback (most recent call last): File "", line 3, in File "petsc4py/PETSc/TAO.pyx", line 183, in petsc4py.PETSc.TAO.setType petsc4py.PETSc.Error: error code 86 [0] TaoSetType() at /petsc/src/tao/interface/taosolver.c:2164 [0] Unknown type. Check for miss-spelling or missing package: https://urldefense.us/v3/__https://petsc.org/release/install/install/*external-packages__;Iw!!G_uCfscf7eWS!bi3UN8Pwci-Vryovl2zHhUj6yCPxh-3xwyOp74MnoU6mnVpJN8twrV3OQEGKWOU6UtghBOlXVbBW_TAta4L0NMGih55H4vncwyyG$ [0] Unable to find requested Tao type lcl However, hasattr(solver.Type(), 'LCL') returns True. The same happens with any other PETSc.TAO.Type. What am I missing here? Cheers, Elena -------------- next part -------------- An HTML attachment was scrubbed... URL: From liufield at gmail.com Wed Dec 3 15:00:34 2025 From: liufield at gmail.com (neil liu) Date: Wed, 3 Dec 2025 16:00:34 -0500 Subject: [petsc-users] Questions about memory usage in Petsc Message-ID: Dear users and developers, I am recently running a large system from Nedelec element, 14 million dofs (complex number). A little confused about the memory there. Then I tried a small system (34,000 dofs) to see the memory usage. It was solved with MUMPS with 1 rank. Then I used PetscMemoryGetCurrentUsage() to show the memory used there. The pseudocode is PetscMemoryGetCurrentUsage (*Memory 1:* 64.237M) KSPset KSPsolve (INFOG(18) (size of all MUMPS internal data allocated during factorization: value on the most memory consuming processor): *Memory 2:* 408 MB) PetscMemoryGetCurrentUsage (*Memory 3: *54.307M) [0] Maximum memory PetscMalloc()ed 49.45MB maximum size of entire process 424MB (*Memory 4:* 54.307M) The following is my understanding, please correct me if I am wrong, It seems the difference between Memory 1 and 3 is approximately the size of 30 Krylov vectors (complex). 
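For reference, a compilable version of the measurement pattern in the pseudocode above, combining the OS-level query with PetscMallocGetCurrentUsage for comparison (the helper name and report format are illustrative):

#include <petscsys.h>

/* Sketch: report both OS-level and PetscMalloc-level memory at a named
   point in the run.  ReportMemory is an illustrative helper name. */
static PetscErrorCode ReportMemory(const char stage[])
{
  PetscLogDouble rss, mal;

  PetscFunctionBeginUser;
  PetscCall(PetscMemoryGetCurrentUsage(&rss)); /* resident size reported by the OS (approximate) */
  PetscCall(PetscMallocGetCurrentUsage(&mal)); /* bytes currently allocated by PetscMalloc() */
  PetscCall(PetscPrintf(PETSC_COMM_WORLD, "%s: process %.1f MB, PetscMalloc %.1f MB\n", stage, rss / 1048576.0, mal / 1048576.0));
  PetscFunctionReturn(PETSC_SUCCESS);
}

/* call ReportMemory("before KSPSolve") and ReportMemory("after KSPSolve")
   around the solve to see how usage scales with problem size */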
It seems Memory 4 is not the summation of Memory 2 and 3; but on the same order of magnitude. It is a little confusing here. Thanks, Xiaodong -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Dec 3 15:18:45 2025 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 3 Dec 2025 16:18:45 -0500 Subject: [petsc-users] Questions about memory usage in Petsc In-Reply-To: References: Message-ID: <9EC264D4-C478-48CB-9FB1-96A4A2ED7974@petsc.dev> Unix/Linux has never had a good API for tracking process memory usage. PetscMemoryGetCurrentUsage() gets what it can from the OS, but the exact number should not be considered a true measure of process memory usage at that point in time. Jumps up and down are not accurate measures of changes in memory usage. PetscMallocGetCurrentUsage() and the number from MUMPS are (assuming no bugs in our code and MUMPS counting space) accurate values of memory usage. You should use these to see how memory usage is scaling with your problem size. Barry > On Dec 3, 2025, at 4:00?PM, neil liu wrote: > > Dear users and developers, > > I am recently running a large system from Nedelec element, 14 million dofs (complex number). > A little confused about the memory there. Then I tried a small system (34,000 dofs) to see the memory usage. It was solved with MUMPS with 1 rank. > Then I used PetscMemoryGetCurrentUsage() to show the memory used there. > The pseudocode is > PetscMemoryGetCurrentUsage (Memory 1: 64.237M) > KSPset > KSPsolve (INFOG(18) (size of all MUMPS internal data allocated during factorization: value on the most memory consuming processor): Memory 2: 408 MB) > PetscMemoryGetCurrentUsage (Memory 3: 54.307M) > > [0] Maximum memory PetscMalloc()ed 49.45MB maximum size of entire process 424MB (Memory 4: 54.307M) > The following is my understanding, please correct me if I am wrong, > It seems the difference between Memory 1 and 3 is approximately the size of 30 Krylov vectors (complex). > It seems Memory 4 is not the summation of Memory 2 and 3; but on the same order of magnitude. It is a little confusing here. > > Thanks, > Xiaodong > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From liufield at gmail.com Wed Dec 3 17:32:47 2025 From: liufield at gmail.com (neil liu) Date: Wed, 3 Dec 2025 18:32:47 -0500 Subject: [petsc-users] Questions about memory usage in Petsc In-Reply-To: <9EC264D4-C478-48CB-9FB1-96A4A2ED7974@petsc.dev> References: <9EC264D4-C478-48CB-9FB1-96A4A2ED7974@petsc.dev> Message-ID: Thanks a lot for this advice. Will do. On Wed, Dec 3, 2025 at 4:18?PM Barry Smith wrote: > > Unix/Linux has never had a good API for tracking process memory usage. PetscMemoryGetCurrentUsage() > gets what it can from the OS, but the exact number should not be considered > a true measure of process memory usage at that point in time. Jumps up and > down are not accurate measures of changes in memory usage. > > PetscMallocGetCurrentUsage() and the number from MUMPS are (assuming no > bugs in our code and MUMPS counting space) accurate values of memory usage. > You should use these to see how memory usage is scaling with your problem > size. > > Barry > > > On Dec 3, 2025, at 4:00?PM, neil liu wrote: > > Dear users and developers, > > I am recently running a large system from Nedelec element, 14 million dofs > (complex number). > A little confused about the memory there. Then I tried a small system > (34,000 dofs) to see the memory usage. It was solved with MUMPS with 1 > rank. 
> Then I used PetscMemoryGetCurrentUsage() to show the memory used there. > The pseudocode is > PetscMemoryGetCurrentUsage (*Memory 1:* 64.237M) > KSPset > KSPsolve (INFOG(18) (size of all MUMPS internal data allocated during > factorization: value on the most memory consuming processor): *Memory 2:* 408 > MB) > PetscMemoryGetCurrentUsage (*Memory 3: *54.307M) > > [0] Maximum memory PetscMalloc()ed 49.45MB maximum size of entire process > 424MB (*Memory 4:* 54.307M) > The following is my understanding, please correct me if I am wrong, > It seems the difference between Memory 1 and 3 is approximately the size > of 30 Krylov vectors (complex). > It seems Memory 4 is not the summation of Memory 2 and 3; but on the same > order of magnitude. It is a little confusing here. > > Thanks, > Xiaodong > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From simon.wiesheier at gmail.com Thu Dec 4 02:03:22 2025 From: simon.wiesheier at gmail.com (Simon Wiesheier) Date: Thu, 4 Dec 2025 09:03:22 +0100 Subject: [petsc-users] TAO PDIPM handling of objective evaluation failures (NaN / PDE non-convergence) Message-ID: Dear PETSc developers and users, I am considering using TAO?s Primal-Dual Interior-Point Method (PDIPM) for a constrained optimization problem in solid mechanics. The objective involves solving a nonlinear PDE (hyperelasticity) for each parameter vector, and for some parameter combinations the PDE solver may fail to converge or produce non-physical states. With MATLAB?s fmincon, it is possible to signal such failures by returning NaN/Inf for the objective, and the solver will then backtrack or try a different step without crashing. My questions are: 1. How does TAO?s PDIPM handle cases where the user objective or gradient callback returns NaN/Inf (e.g., due to PDE solver failure)? 2. Is there a recommended way in TAO/PETSc to gracefully signal an evaluation failure (like ?bad point in parameter space?) so that the algorithm can back off and try a smaller step, instead of aborting? 3. If the recommended pattern is *not* to return NaNs, what is the best practice in TAO for such PDE-constrained problems? Any guidance on how TAO/PDIPM is intended to behave in the presence of evaluation failures would be greatly appreciated. Best regards, Simon -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Thu Dec 4 09:31:55 2025 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 4 Dec 2025 10:31:55 -0500 Subject: [petsc-users] TAO PDIPM handling of objective evaluation failures (NaN / PDE non-convergence) In-Reply-To: References: Message-ID: Simon, Thanks for the question. We would love to have such functionality in TAO, but we do not currently have it. For SNES, we provide SNESSetFunctionDomainError() and SNESSetJacobianDomainError() which can be called (on any subset of the MPI processes in the MPI communicator) to indicate such a domain error. Currently, SNES simply checks this way (in a collective, friendly manner without extra communication) and returns a clean SNES_DIVERGED_FUNCTION_DOMAIN via the SNESConvergedReason. We would like at least some of the SNES solvers to handle this better, as you suggest, by backtracking to find a valid point in the domain. For TS we provide TSSetFunctionDomainError(), which allows the user to pass in a function to check if a point is in the domain. It is currently unused. I think it should use the same approach as SNES. 
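As a concrete illustration of the SNES mechanism described above, a residual callback can flag a bad state along these lines (the physics check is a placeholder; the point is the call to SNESSetFunctionDomainError and the clean return):

#include <petscsnes.h>

/* Sketch: instead of returning NaN/Inf, flag a domain error and return
   normally; SNESSolve then reports SNES_DIVERGED_FUNCTION_DOMAIN through
   SNESGetConvergedReason().  The state check itself is a placeholder. */
static PetscErrorCode FormFunction(SNES snes, Vec x, Vec f, void *ctx)
{
  PetscBool bad = PETSC_FALSE;

  PetscFunctionBeginUser;
  /* ... evaluate the residual into f; set bad = PETSC_TRUE on a non-physical state ... */
  if (bad) PetscCall(SNESSetFunctionDomainError(snes)); /* may be called on any subset of ranks */
  PetscFunctionReturn(PETSC_SUCCESS); /* return normally; do not raise an error */
}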
So for TAO, I envision a similar TaoSetFunctionDomainError(), TaoSetJacobianDomainError(), and TaoSetHessianDomainError(), which would allow particular TAO solvers to "back off" but continue running as you request. Would you be interested in collaborating with us on adding such support to Tao? In particular, focusing exactly on the Tao solver algorithm you are using? We don't currently have PETSc developers focusing on Tao, so can only make progress on it by actively collaborating with others who need new/improved functionality. Barry > On Dec 4, 2025, at 3:03?AM, Simon Wiesheier wrote: > > Dear PETSc developers and users, > I am considering using TAO?s Primal-Dual Interior-Point Method (PDIPM) for a constrained optimization problem in solid mechanics. The objective involves solving a nonlinear PDE (hyperelasticity) for each parameter vector, and for some parameter combinations the PDE solver may fail to converge or produce non-physical states. > > With MATLAB?s fmincon, it is possible to signal such failures by returning NaN/Inf for the objective, and the solver will then backtrack or try a different step without crashing. > > My questions are: > > How does TAO?s PDIPM handle cases where the user objective or gradient callback returns NaN/Inf (e.g., due to PDE solver failure)? > > Is there a recommended way in TAO/PETSc to gracefully signal an evaluation failure (like ?bad point in parameter space?) so that the algorithm can back off and try a smaller step, instead of aborting? > > If the recommended pattern is not to return NaNs, what is the best practice in TAO for such PDE-constrained problems? > > Any guidance on how TAO/PDIPM is intended to behave in the presence of evaluation failures would be greatly appreciated. > > Best regards, > > Simon > -------------- next part -------------- An HTML attachment was scrubbed... URL: From simon.wiesheier at gmail.com Fri Dec 5 10:50:28 2025 From: simon.wiesheier at gmail.com (Simon Wiesheier) Date: Fri, 5 Dec 2025 17:50:28 +0100 Subject: [petsc-users] TAO PDIPM handling of objective evaluation failures (NaN / PDE non-convergence) In-Reply-To: References: Message-ID: Thank you very much for the detailed explanation. I would in principle be interested in collaborating on adding such functionality to TAO, especially since robust handling of domain errors would be very valuable for PDE-constrained inverse problems. That said, my experience with PETSc is mostly through petsc4py and some interfacing from deal.II, so I am not very familiar with the internals of TAO or the C-level API. I would therefore need some guidance on where such functionality should live inside the TAO infrastructure, and how similar mechanisms are implemented in SNES or TS. If you can point me to the relevant parts of the TAO codebase and outline the expected design (e.g., how the domain-error flag should propagate through the solver), I would be happy to explore how I could contribute. Thanks again for your support and for considering this extension to TAO. Best, Simon Am Do., 4. Dez. 2025 um 16:32 Uhr schrieb Barry Smith : > Simon, > > Thanks for the question. We would love to have such functionality in > TAO, but we do not currently have it. > > For SNES, we provide SNESSetFunctionDomainError() and > SNESSetJacobianDomainError() which can be called (on any subset of the MPI > processes in the MPI communicator) to indicate such a domain error. 
> Currently, SNES simply checks this way (in a collective, friendly manner > without extra communication) and returns a clean > SNES_DIVERGED_FUNCTION_DOMAIN via the SNESConvergedReason. We would like > at least some of the SNES solvers to handle this better, as you suggest, by > backtracking to find a valid point in the domain. > > For TS we provide TSSetFunctionDomainError(), which allows the user to > pass in a function to check if a point is in the domain. It is currently > unused. I think it should use the same approach as SNES. > > So for TAO, I envision a similar TaoSetFunctionDomainError(), > TaoSetJacobianDomainError(), and TaoSetHessianDomainError(), which would > allow particular TAO solvers to "back off" but continue running as you > request. > > Would you be interested in collaborating with us on adding such support > to Tao? In particular, focusing exactly on the Tao solver algorithm you are > using? We don't currently have PETSc developers focusing on Tao, so can > only make progress on it by actively collaborating with others who need > new/improved functionality. > > > Barry > > > On Dec 4, 2025, at 3:03?AM, Simon Wiesheier > wrote: > > Dear PETSc developers and users, > > I am considering using TAO?s Primal-Dual Interior-Point Method (PDIPM) for > a constrained optimization problem in solid mechanics. The objective > involves solving a nonlinear PDE (hyperelasticity) for each parameter > vector, and for some parameter combinations the PDE solver may fail to > converge or produce non-physical states. > > With MATLAB?s fmincon, it is possible to signal such failures by returning > NaN/Inf for the objective, and the solver will then backtrack or try a > different step without crashing. > > My questions are: > > 1. > > How does TAO?s PDIPM handle cases where the user objective or gradient > callback returns NaN/Inf (e.g., due to PDE solver failure)? > 2. > > Is there a recommended way in TAO/PETSc to gracefully signal an > evaluation failure (like ?bad point in parameter space?) so that the > algorithm can back off and try a smaller step, instead of aborting? > 3. > > If the recommended pattern is *not* to return NaNs, what is the best > practice in TAO for such PDE-constrained problems? > > Any guidance on how TAO/PDIPM is intended to behave in the presence of > evaluation failures would be greatly appreciated. > > Best regards, > > Simon > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From zhaowenbo.npic at gmail.com Sun Dec 7 22:03:37 2025 From: zhaowenbo.npic at gmail.com (Wenbo Zhao) Date: Mon, 8 Dec 2025 12:03:37 +0800 Subject: [petsc-users] Petsc veccuda device to host copy Message-ID: Hi, we are using petsc's veccuda and found that the data in the host array obtained via VecGetArrayRead is partially updated sometime. Vec vgpu, vcpu; iterations: // ksp solve a * vgpu=b const PetscScalar * agpu; PetscScalar * acpu; VecGetArrayRead(vgpu, &agpu); VecGetArray(vcpu, &acpu); PetscArraycpy (acpu, agpu,size); // check updating std::cout << agpu[0] << agpu [size-1]< From bsmith at petsc.dev Sun Dec 7 22:35:26 2025 From: bsmith at petsc.dev (Barry Smith) Date: Sun, 7 Dec 2025 23:35:26 -0500 Subject: [petsc-users] Petsc veccuda device to host copy In-Reply-To: References: Message-ID: I am sorry to hear you are having difficulties. Please send a full reproducer so we can track down the problem using the latest PETSc release. Software changes very rapidly for GPUs so we cannot support or debug PETSc 3.21.1, which is a couple of years old. 
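For reference, the snippet in the report above was truncated by the archiver; the intended device-to-host copy presumably follows the sketch below. In particular, every VecGetArray/VecGetArrayRead needs a matching restore call, which the truncated snippet does not show (the helper name is illustrative, and this is not claimed to be the cause of the reported behavior):

#include <petscvec.h>

/* Sketch of the device-to-host copy described above.  Assumes vgpu is a
   VECCUDA vector and vcpu a host vector with the same local size. */
static PetscErrorCode CopyDeviceToHost(Vec vgpu, Vec vcpu)
{
  const PetscScalar *agpu;
  PetscScalar       *acpu;
  PetscInt           n;

  PetscFunctionBeginUser;
  PetscCall(VecGetLocalSize(vgpu, &n));
  PetscCall(VecGetArrayRead(vgpu, &agpu)); /* should synchronize device data to the host */
  PetscCall(VecGetArray(vcpu, &acpu));
  PetscCall(PetscArraycpy(acpu, agpu, n));
  PetscCall(VecRestoreArray(vcpu, &acpu));
  PetscCall(VecRestoreArrayRead(vgpu, &agpu)); /* pair every get with a restore */
  PetscFunctionReturn(PETSC_SUCCESS);
}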
But if the problem persists in 3.24, we will definitely track it down if you provide a reproducer. Barry > On Dec 7, 2025, at 11:03?PM, Wenbo Zhao wrote: > > Hi, > > we are using petsc's veccuda and found that the data in the host array obtained via VecGetArrayRead is partially updated sometime. > > > > > Vec vgpu, vcpu; > > iterations: > > // ksp solve a * vgpu=b > > const PetscScalar * agpu; > > PetscScalar * acpu; > > VecGetArrayRead(vgpu, &agpu); > > VecGetArray(vcpu, &acpu); > > > PetscArraycpy (acpu, agpu,size); > > // check updating > > std::cout << agpu[0] << agpu [size-1]< > // we found that agpu[0] is last iterations value, agpu[size-1] updated from device value, randomly > > // use acpu values to update matrix a .... > > > > > > > > Petsc 3.21.1 is used. And manual said, > > For vectors that may also have array data in GPU memory, for example, VECCUDA, this call ensures the CPU array has the most recent array values by copying the data from the GPU memory if needed. > > > > > > Best wishes, > > Wenbo > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dontbugthedevs at proton.me Wed Dec 10 07:14:06 2025 From: dontbugthedevs at proton.me (Noam T.) Date: Wed, 10 Dec 2025 13:14:06 +0000 Subject: [petsc-users] Use of flag dm_plex_high_order_view Message-ID: Hello, In an old question about obtaining the node connectivity of a cell with a high order approximation space, the use of the flag "-dm_plex_high_order_view" for visualization purposes was brought up, as a way to refine the grid down to linear space. Link: https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/blob/main/src/dm/impls/plex/plex.c?ref_type=heads*L2021__;Iw!!G_uCfscf7eWS!ZZVaar4OGVdTXOAAjkEadcw-CAZUpAAzibpD_2SKpRupYG4oxSH-W7u5necsgTYDii6v-1kjfV4aqsllxL9XI_hK2YITfNz-$ I'm trying to make use of this flag to see the refinement, but I see no difference with higher order approximations. Perhaps I am misunderstanding its use? I thought that by using it, one could see a "subdivision" of each element. Say, a single triangle, FE approximation space order 2 (3 corner nodes, 3 mid-edge nodes), would be refined into e.g. 4 linear triangles. The code (Fortran) looks something along these lines: ! Create a DM from the mesh file DM :: cdm PetscInt :: K, cdim, fedim PetscBool :: simplex PetscFE :: cfe PetscViewer :: viewer PetscErrorCode :: ierr PetsCallA(DMSetFromOptions(cdm, ierr)) PetsCallA(DMGetDimension(cdm, cdim, ierr)) PetsCallA(DMPlexSimplex(cdm, simplex, ierr)) PetsCallA(PetscFECreateDefault(PETSC_COMM_WORLD, cdim, cdim, simplex, "ho_", K, cfe, ierr)) PetsCallA(PetscFEGetDimension(cfe, fedim, ierr)) PetsCallA(PetscViewerCreate(PETSC_COMM_WORLD, viewer, ierr)) PetsCallA(PetscViewerDrawOpen(PETSC_COMM_WORLD, PETSC_NULL_CHARACTER, PETSC_NULL_CHARACTER, 100, 100, 800, 800, viewer, ierr)) PetsCallA(PetscViewerSetFromOptions(viewer, ierr)) ! also using flag -draw_pause 5 PetsCallA(DMView(cdm, viewer, ierr)) [...] With the following list of flags: -ho_petscspace_degree K -ho_petscdualspace_lagrange_node_type equispaced -ho_petscdualspace_lagrange_node_endpoints 1 -dm_plex_high_order_view -options_left Using "-options_left" shows that "there are no unused options"; so "-dm_plex_high_order_view" is used somehow; it is at least required for the call to "DMPlexCreateHighOrderSurrogate_Internal" within the "draw" functions. >From the CoordinateDM I can see the additional nodes created for the higher order approximation (e.g. mid-edge, mid-face nodes), so it seems the FE space is correct. 
Regardless of the order "K", the mesh plotted with "DMView" is always the same, corresponding to the linear case. Thank you. Noam -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Dec 11 11:00:48 2025 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 11 Dec 2025 12:00:48 -0500 Subject: [petsc-users] Use of flag dm_plex_high_order_view In-Reply-To: References: Message-ID: On Wed, Dec 10, 2025 at 8:14?AM Noam T. via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hello, > > In an old question about obtaining the node connectivity of a cell with a > high order approximation space, > This is the misunderstanding here. What I implemented was visualization for meshes with high order _coordinate_ spaces. I am "guaranteed" that this will work because I know that coordinates are discretized with Lagrange spaces. I did not do this for approximation spaces because I can't guarantee this. We could do the same thing for approximation spaces, and I would be happy to help you factor it out. In general, I would need to figure out the smallest DG space containing the approximation space (don't know how to do yet, but possible), then project to that first and start the descent as before. Thanks, Matt > the use of the flag "-dm_plex_high_order_view" for visualization purposes > was brought up, as a way to refine the grid down to linear space. > > Link: > https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/blob/main/src/dm/impls/plex/plex.c?ref_type=heads*L2021__;Iw!!G_uCfscf7eWS!ekDGozJlgGi9X5H88PnWzpu8atTA6FPYDxbKBZbEfCAjP3O-HlOcRx5bJdDfiII0H3rwmPSpEjZTGy6TVz1M$ > > > I'm trying to make use of this flag to see the refinement, but I see no > difference with higher order approximations. Perhaps I am misunderstanding > its use? I thought that by using it, one could see a "subdivision" of each > element. Say, a single triangle, FE approximation space order 2 (3 corner > nodes, 3 mid-edge nodes), would be refined into e.g. 4 linear triangles. > > The code (Fortran) looks something along these lines: > > ! Create a DM from the mesh file > > DM :: cdm > PetscInt :: K, cdim, fedim > PetscBool :: simplex > PetscFE :: cfe > PetscViewer :: viewer > PetscErrorCode :: ierr > > PetsCallA(DMSetFromOptions(cdm, ierr)) > PetsCallA(DMGetDimension(cdm, cdim, ierr)) > PetsCallA(DMPlexSimplex(cdm, simplex, ierr)) > PetsCallA(PetscFECreateDefault(PETSC_COMM_WORLD, cdim, cdim, simplex, > "ho_", K, cfe, ierr)) > PetsCallA(PetscFEGetDimension(cfe, fedim, ierr)) > > PetsCallA(PetscViewerCreate(PETSC_COMM_WORLD, viewer, ierr)) > PetsCallA(PetscViewerDrawOpen(PETSC_COMM_WORLD, PETSC_NULL_CHARACTER, > PETSC_NULL_CHARACTER, 100, 100, 800, 800, viewer, ierr)) > PetsCallA(PetscViewerSetFromOptions(viewer, ierr)) ! also using flag > -draw_pause 5 > PetsCallA(DMView(cdm, viewer, ierr)) > [...] > > > With the following list of flags: > > -ho_petscspace_degree K > -ho_petscdualspace_lagrange_node_type equispaced > -ho_petscdualspace_lagrange_node_endpoints 1 > -dm_plex_high_order_view > -options_left > > Using "-options_left" shows that "there are no unused options"; so > "-dm_plex_high_order_view" is used somehow; it is at least required for the > call to "DMPlexCreateHighOrderSurrogate_Internal" within the "draw" > functions. > > From the CoordinateDM I can see the additional nodes created for the > higher order approximation (e.g. mid-edge, mid-face nodes), so it seems the > FE space is correct. 
> > Regardless of the order "K", the mesh plotted with "DMView" is always the > same, corresponding to the linear case. > > Thank you. > > Noam > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!ekDGozJlgGi9X5H88PnWzpu8atTA6FPYDxbKBZbEfCAjP3O-HlOcRx5bJdDfiII0H3rwmPSpEjZTG4wF9UiD$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From aldo.bonfiglioli at unibas.it Mon Dec 15 02:43:45 2025 From: aldo.bonfiglioli at unibas.it (Aldo Bonfiglioli) Date: Mon, 15 Dec 2025 09:43:45 +0100 Subject: [petsc-users] Trouble when viewing a subDM in vtk format In-Reply-To: References: <0732cd7a-6ea5-4720-acf8-6ce8a314136d@unibas.it> Message-ID: <8c7c0794-3ede-4aa5-a515-03af3187fb4c@unibas.it> On 12/2/25 15:02, Matthew Knepley wrote: > On Mon, Dec 1, 2025 at 8:22?AM Aldo Bonfiglioli > wrote: > > Dear developers, > > I wrote a code that extracts subDMs corresponding to the various > strata > in the Face Sets. > > I run into troubles when I view a subDM or a Vec attached to the > subDM > using the VTK format. > > More precisely, the problem only occurs on more than one processor > when > the rank=0 processor has no points on a given subDM. > > For instance, when the attached reproducer is run on 2 procs, the > u_01.vtu file (global u Vec mapped to the subDM corresponding to > stratum=1) > > only includes the header, but no data. All other u_0?.vtu files can > successfully be loaded and viewed in paraview. > > The problem does NOT arise when I view the same objects in HDF5 > format. > > However, my problem in using the HDF5 lies in the fact that: > > while the hdf5 file obtained with DMView can be post-processed with > "petsc/lib/petsc/bin/petsc_gen_xdmf.py" to create a xmf file > readable by > paraview > > > Hi Aldo, > > Sorry about this, I would like to make it more intuitive. First, the > solution (I think) > > ??-dm_plex_view_hdf5_storage_version 1.1.0 > > will write the Viz field by default, so that PAraview will see it. Can > you try this? > > Why do we need this? I have now made version-controlled output > formats. There is something about > this in the manual, but not enough. Paraview only supports > vertex-based fields and cell-based fields > (at least that I understand), so we need to write a separate copy of > the field (since Plex supports any > layout). Lots of people do not want a separate copy, since they are > checkpointing, so we control this > with a format (PETSC_VIEWER_HDF5_VIZ). You can pass this for specific > output, or use the format > version that does it automatically. > > Let me know if this works. > > ? Thanks, > > ? ? ?Matt > > I do not know how to view the field(s) associated with the DM when > the > hdf5 file is obtained from VecView. > > The reproducer compiles with the latest petsc release. > > Thanks, > > Aldo > > -- > Dr. 
Aldo Bonfiglioli > Associate professor of Fluid Mechanics > Dipartimento di Ingegneria > Universita' della Basilicata > V.le dell'Ateneo Lucano, 10 85100 Potenza ITALY > tel:+39.0971.205203 fax:+39.0971.205215 > web: > https://urldefense.us/v3/__http://docenti.unibas.it/site/home/docente.html?m=002423__;!!G_uCfscf7eWS!eXE2csxUb1J5bnRosf9LIGxLs17TdstMxQcGs0mbXFroRjzLN81jVmeAWfMr41X8JGEuVm286lVTmDgmi5s1zfsfegJNuWVVWTM$ > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!Zla-FYPeT9khxXRG5MuCbh6NLunsaK0ljFdIlNk-3XDnbGEJPBARoshmLdtN7p3kHgPJbQDyNgvHIi-GT6IyAxxdaZ2jTWw_kEg$ > Matt, use of the option "-dm_plex_view_hdf5_storage_version 1.1.0" is indeed necessary, but I also realized, thanks to a suggestion from matteo.semplice at uninsubria.it, that I have to View BOTH the dm and vec in the same hdf5 file, i.e. > ! > ! ???dump the dm+u to the same HDF5 file > ! > > filename ="test.h5" > ??PetscCall(PetscViewerHDF5Open(PETSC_COMM_WORLD, trim(filename), > FILE_MODE_WRITE, viewer, ierr)) > ??PetscCall(DMView(dm, viewer, ierr)) > ??PetscCall(PetscViewerDestroy(viewer, ierr)) > ??PetscCall(PetscViewerHDF5Open(PETSC_COMM_WORLD, trim(filename), > FILE_MODE_APPEND, viewer, ierr)) > ??PetscCall(VecView(u, viewer, ierr)) > ??PetscCall(PetscViewerDestroy(viewer, ierr)) > > Once test.h5 has been processed with "petsc_gen_xdmf.py", I can load the xmf file in paraview e see the solution (there is NO solution unless -dm_plex_view_hdf5_storage_version 1.1.0 is in the options db). I was probably misled by the fact that a single VecView in VTK format gives both the mesh and solution in the same file. Does this make sense? Final question: is it possible to specify, using command line options or the options db, that vecview should be appended to an existing file? Thanks, Aldo -- Dr. Aldo Bonfiglioli Associate professor of Fluid Mechanics Dipartimento di Ingegneria Universita' della Basilicata V.le dell'Ateneo Lucano, 10 85100 Potenza ITALY tel:+39.0971.205203 fax:+39.0971.205215 web:https://urldefense.us/v3/__http://docenti.unibas.it/site/home/docente.html?m=002423__;!!G_uCfscf7eWS!Zla-FYPeT9khxXRG5MuCbh6NLunsaK0ljFdIlNk-3XDnbGEJPBARoshmLdtN7p3kHgPJbQDyNgvHIi-GT6IyAxxdaZ2jJl4AAbM$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Dec 15 06:41:07 2025 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 15 Dec 2025 07:41:07 -0500 Subject: [petsc-users] Trouble when viewing a subDM in vtk format In-Reply-To: <8c7c0794-3ede-4aa5-a515-03af3187fb4c@unibas.it> References: <0732cd7a-6ea5-4720-acf8-6ce8a314136d@unibas.it> <8c7c0794-3ede-4aa5-a515-03af3187fb4c@unibas.it> Message-ID: On Mon, Dec 15, 2025 at 3:43?AM Aldo Bonfiglioli wrote: > On 12/2/25 15:02, Matthew Knepley wrote: > > On Mon, Dec 1, 2025 at 8:22?AM Aldo Bonfiglioli < > aldo.bonfiglioli at unibas.it> wrote: > >> Dear developers, >> >> I wrote a code that extracts subDMs corresponding to the various strata >> in the Face Sets. >> >> I run into troubles when I view a subDM or a Vec attached to the subDM >> using the VTK format. >> >> More precisely, the problem only occurs on more than one processor when >> the rank=0 processor has no points on a given subDM. 
>> >> For instance, when the attached reproducer is run on 2 procs, the >> u_01.vtu file (global u Vec mapped to the subDM corresponding to >> stratum=1) >> >> only includes the header, but no data. All other u_0?.vtu files can >> successfully be loaded and viewed in paraview. >> >> The problem does NOT arise when I view the same objects in HDF5 format. >> >> However, my problem in using the HDF5 lies in the fact that: >> >> while the hdf5 file obtained with DMView can be post-processed with >> "petsc/lib/petsc/bin/petsc_gen_xdmf.py" to create a xmf file readable by >> paraview >> > > Hi Aldo, > > Sorry about this, I would like to make it more intuitive. First, the > solution (I think) > > -dm_plex_view_hdf5_storage_version 1.1.0 > > will write the Viz field by default, so that PAraview will see it. Can you > try this? > > Why do we need this? I have now made version-controlled output formats. > There is something about > this in the manual, but not enough. Paraview only supports vertex-based > fields and cell-based fields > (at least that I understand), so we need to write a separate copy of the > field (since Plex supports any > layout). Lots of people do not want a separate copy, since they are > checkpointing, so we control this > with a format (PETSC_VIEWER_HDF5_VIZ). You can pass this for specific > output, or use the format > version that does it automatically. > > Let me know if this works. > > Thanks, > > Matt > > >> I do not know how to view the field(s) associated with the DM when the >> hdf5 file is obtained from VecView. >> >> The reproducer compiles with the latest petsc release. >> >> Thanks, >> >> Aldo >> >> -- >> Dr. Aldo Bonfiglioli >> Associate professor of Fluid Mechanics >> Dipartimento di Ingegneria >> Universita' della Basilicata >> V.le dell'Ateneo Lucano, 10 85100 Potenza ITALY >> tel:+39.0971.205203 fax:+39.0971.205215 >> web: >> https://urldefense.us/v3/__http://docenti.unibas.it/site/home/docente.html?m=002423__;!!G_uCfscf7eWS!eXE2csxUb1J5bnRosf9LIGxLs17TdstMxQcGs0mbXFroRjzLN81jVmeAWfMr41X8JGEuVm286lVTmDgmi5s1zfsfegJNuWVVWTM$ >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!e3ZEg0fMK55wMqFxwA8I9trdBUMCc27ap7Q8V8beAiyV258mz8N43KSQa8MUBZMDUTVJw33Yay2EgEb4psDr$ > > > Matt, > > use of the option "-dm_plex_view_hdf5_storage_version 1.1.0" is indeed > necessary, > > but I also realized, thanks to a suggestion from > matteo.semplice at uninsubria.it, that I have to View BOTH the dm and vec in > the same hdf5 file, i.e. > > ! > ! dump the dm+u to the same HDF5 file > ! > > filename = "test.h5" > PetscCall(PetscViewerHDF5Open(PETSC_COMM_WORLD, trim(filename), > FILE_MODE_WRITE, viewer, ierr)) > PetscCall(DMView(dm, viewer, ierr)) > PetscCall(PetscViewerDestroy(viewer, ierr)) > PetscCall(PetscViewerHDF5Open(PETSC_COMM_WORLD, trim(filename), > FILE_MODE_APPEND, viewer, ierr)) > PetscCall(VecView(u, viewer, ierr)) > PetscCall(PetscViewerDestroy(viewer, ierr)) > > > Once test.h5 has been processed with "petsc_gen_xdmf.py", I can load the > xmf file in paraview e see the solution (there is NO solution unless > -dm_plex_view_hdf5_storage_version 1.1.0 is in the options db). > > I was probably misled by the fact that a single VecView in VTK format > gives both the mesh and solution in the same file. > > Does this make sense? > Ah, yes. 
VTK does not have a way to construct the file without the DM, so we force it. Our HDF5 format can actually handle multiple DMs (so the adaptive refinement can be visualized), so you need to specify what to put in. > Final question: > > is it possible to specify, using command line options or the options db, > that vecview should be appended to an existing file? > > Yes. You use the mode -vec_view hdf5:sol.h5::append Thanks, Matt > Thanks, > > Aldo > > -- > Dr. Aldo Bonfiglioli > Associate professor of Fluid Mechanics > Dipartimento di Ingegneria > Universita' della Basilicata > V.le dell'Ateneo Lucano, 10 85100 Potenza ITALY > tel:+39.0971.205203 fax:+39.0971.205215 > web: https://urldefense.us/v3/__http://docenti.unibas.it/site/home/docente.html?m=002423__;!!G_uCfscf7eWS!e3ZEg0fMK55wMqFxwA8I9trdBUMCc27ap7Q8V8beAiyV258mz8N43KSQa8MUBZMDUTVJw33Yay2EgLcMeEyC$ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!e3ZEg0fMK55wMqFxwA8I9trdBUMCc27ap7Q8V8beAiyV258mz8N43KSQa8MUBZMDUTVJw33Yay2EgEb4psDr$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmolinos at us.es Tue Dec 16 11:39:07 2025 From: mmolinos at us.es (MIGUEL MOLINOS PEREZ) Date: Tue, 16 Dec 2025 17:39:07 +0000 Subject: [petsc-users] Handling inactive (zero-occupancy) equations in large SNES systems References: Message-ID: <8C1BB514-0528-46FC-A5B8-D88BD1C8AA90@us.es> Dear all, I am working with a large nonlinear system solved with SNES, where a significant fraction of the unknowns are temporarily inactive due to a physical parameter being zero (e.g. zero occupancy / zero weight). For those DOF the corresponding equilibrium equation is physically inactive, but the unknown still appears in the global vector and in couplings of neighboring particles (Im using dmswarm). At the moment, these inactive equations contribute with a zero residual (F_i=0), which (I think) leads to poor conditioning and convergence issues for large problems. My question is about best numerical practice in this situation. For the position field, should I do something like F_i = q_i - q_(i,n)? Where q_(i,n) is the position of the particle at the previous configuration. Best regards, Miguel -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Dec 16 12:23:45 2025 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 16 Dec 2025 13:23:45 -0500 Subject: [petsc-users] Handling inactive (zero-occupancy) equations in large SNES systems In-Reply-To: <8C1BB514-0528-46FC-A5B8-D88BD1C8AA90@us.es> References: <8C1BB514-0528-46FC-A5B8-D88BD1C8AA90@us.es> Message-ID: On Tue, Dec 16, 2025 at 12:39?PM MIGUEL MOLINOS PEREZ wrote: > > Dear all, > > I am working with a large nonlinear system solved with SNES, where a > significant fraction of the unknowns are temporarily inactive due to a > physical parameter being zero (e.g. zero occupancy / zero weight). > > > For those DOF the corresponding equilibrium equation is physically > inactive, but the unknown still appears in the global vector and in > couplings of neighboring particles (Im using dmswarm). > > At the moment, these inactive equations contribute with a zero residual > (F_i=0), which (I think) leads to poor conditioning and convergence issues > for large problems. 
> > > My question is about best numerical practice in this situation. For the > position field, should I do something like F_i = q_i - q_(i,n)? Where q_(i,n) > is the position of the particle at the previous configuration. > This puts a 1 on the diagonal, which is usually what you want (esp for particle problems). However, there could be convergence problems with Newton, with these directions swamping other descent directions. That is the argument for eliminating these unknowns. It sounds like it would be worth trying to see if this is the case. Thanks, Matt > Best regards, > > Miguel > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!a9ge_7Blw6FP4XR8osFatvOvy7Q2pdIyLX8lVZs3eFcKKQhLJ0TRkrPMAXOBllnlR6EP1Oa_qkH_pcJzgpX-$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmolinos at us.es Tue Dec 16 13:04:05 2025 From: mmolinos at us.es (MIGUEL MOLINOS PEREZ) Date: Tue, 16 Dec 2025 19:04:05 +0000 Subject: [petsc-users] Handling inactive (zero-occupancy) equations in large SNES systems In-Reply-To: References: <8C1BB514-0528-46FC-A5B8-D88BD1C8AA90@us.es> Message-ID: <642C9300-38AB-4906-AEE6-FE5DE9C715A0@us.es> I?ll give it a try to F_i = q_i - q_(i,n). The problem with the dof elimination is that it messes up with local-to-global numbering and ghost particles creation too. Miguel On 16 Dec 2025, at 19:24, Matthew Knepley wrote: ? On Tue, Dec 16, 2025 at 12:39?PM MIGUEL MOLINOS PEREZ > wrote: Dear all, I am working with a large nonlinear system solved with SNES, where a significant fraction of the unknowns are temporarily inactive due to a physical parameter being zero (e.g. zero occupancy / zero weight). For those DOF the corresponding equilibrium equation is physically inactive, but the unknown still appears in the global vector and in couplings of neighboring particles (Im using dmswarm). At the moment, these inactive equations contribute with a zero residual (F_i=0), which (I think) leads to poor conditioning and convergence issues for large problems. My question is about best numerical practice in this situation. For the position field, should I do something like F_i = q_i - q_(i,n)? Where q_(i,n) is the position of the particle at the previous configuration. This puts a 1 on the diagonal, which is usually what you want (esp for particle problems). However, there could be convergence problems with Newton, with these directions swamping other descent directions. That is the argument for eliminating these unknowns. It sounds like it would be worth trying to see if this is the case. Thanks, Matt Best regards, Miguel -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!dXRvo4zLp_xH8xtPz1XikKeBlmIcplblRVj-9N1BTV2H0XI0cUS-2cnQoPgCRz5QvCkTONlTpwY_jFYEtSoYLw$ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at petsc.dev Tue Dec 16 13:52:10 2025 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 16 Dec 2025 14:52:10 -0500 Subject: [petsc-users] Handling inactive (zero-occupancy) equations in large SNES systems In-Reply-To: References: <8C1BB514-0528-46FC-A5B8-D88BD1C8AA90@us.es> Message-ID: <0B95E014-6132-4194-BEAC-EF8F303BCCC4@petsc.dev> > On Dec 16, 2025, at 1:23?PM, Matthew Knepley wrote: > > On Tue, Dec 16, 2025 at 12:39?PM MIGUEL MOLINOS PEREZ > wrote: >> >> Dear all, >> >> I am working with a large nonlinear system solved with SNES, where a significant fraction of the unknowns are temporarily inactive due to a physical parameter being zero (e.g. zero occupancy / zero weight). >> >> >> >> For those DOF the corresponding equilibrium equation is physically inactive, but the unknown still appears in the global vector and in couplings of neighboring particles (Im using dmswarm). >> >> At the moment, these inactive equations contribute with a zero residual (F_i=0), which (I think) leads to poor conditioning and convergence issues for large problems. >> >> >> >> My question is about best numerical practice in this situation. For the position field, should I do something like F_i = q_i - q_(i,n)? Where q_(i,n) is the position of the particle at the previous configuration. >> > This puts a 1 on the diagonal, which is usually what you want (esp for particle problems). > > However, there could be convergence problems with Newton, with these directions swamping other descent directions. That is the argument for eliminating these unknowns. It sounds like it would be worth trying to see if this is the case. Instead of putting 1 on the diagonal you can put a value on the diagonal that is "near" the other diagonal values of the matrix. This is usally (always?) better than using 1 > > Thanks, > > Matt > > >> Best regards, >> >> Miguel >> > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!e7ZjkKLVfmP45FsSYYPcHnoVEh9Kv6xPWA0L7i3BWuJkZi6jqWFQQujUIpV08-8TqGp0djoNntSxHUMhMeBAx5U$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.knezevic at akselos.com Wed Dec 17 11:50:18 2025 From: david.knezevic at akselos.com (David Knezevic) Date: Wed, 17 Dec 2025 12:50:18 -0500 Subject: [petsc-users] Question regarding SNES error about locked vectors Message-ID: Hi all, I have a question about this error: > Vector 'Vec_0x84000005_0' (argument #2) was locked for read-only access in > unknown_function() at unknown file:0 (line numbers only accurate to > function begin) I'm encountering this error in an FE solve where there is an error encountered during the residual/jacobian assembly, and what we normally do in that situation is shrink the load step and continue, starting from the "last converged solution". However, in this case I'm running on 32 processes, and 5 of the processes report the error above about a "locked vector". We clear the SNES object (via SNESDestroy) before we reset the solution to the "last converged solution", and then we make a new SNES object subsequently. But it seems to me that somehow the solution vector is still marked as "locked" on 5 of the processes when we modify the solution vector, which leads to the error above. I was wondering if someone could advise on what the best way to handle this would be? 
I thought one option could be to add an MPI barrier call prior to updating the solution vector to "last converged solution", to make sure that the SNES object is destroyed on all procs (and hence the locks cleared) before editing the solution vector, but I'm unsure if that would make a difference. Any help would be most appreciated! Thanks, David -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefano.zampini at gmail.com Wed Dec 17 13:02:21 2025 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Wed, 17 Dec 2025 22:02:21 +0300 Subject: [petsc-users] Question regarding SNES error about locked vectors In-Reply-To: References: Message-ID: You are not allowed to call VecGetArray on the solution vector of an SNES object within a user callback, nor to modify its values in any other way. Put in C++ lingo, the solution vector is a "const" argument It would be great if you could provide an MWE to help us understand your problem Il giorno mer 17 dic 2025 alle ore 20:51 David Knezevic via petsc-users < petsc-users at mcs.anl.gov> ha scritto: > Hi all, > > I have a question about this error: > >> Vector 'Vec_0x84000005_0' (argument #2) was locked for read-only access >> in unknown_function() at unknown file:0 (line numbers only accurate to >> function begin) > > > I'm encountering this error in an FE solve where there is an error > encountered during the residual/jacobian assembly, and what we normally do > in that situation is shrink the load step and continue, starting from the > "last converged solution". However, in this case I'm running on 32 > processes, and 5 of the processes report the error above about a "locked > vector". > > We clear the SNES object (via SNESDestroy) before we reset the solution to > the "last converged solution", and then we make a new SNES object > subsequently. But it seems to me that somehow the solution vector is still > marked as "locked" on 5 of the processes when we modify the solution > vector, which leads to the error above. > > I was wondering if someone could advise on what the best way to handle > this would be? I thought one option could be to add an MPI barrier call > prior to updating the solution vector to "last converged solution", to make > sure that the SNES object is destroyed on all procs (and hence the locks > cleared) before editing the solution vector, but I'm unsure if that would > make a difference. Any help would be most appreciated! > > Thanks, > David > -- Stefano -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.knezevic at akselos.com Wed Dec 17 13:08:55 2025 From: david.knezevic at akselos.com (David Knezevic) Date: Wed, 17 Dec 2025 14:08:55 -0500 Subject: [petsc-users] Question regarding SNES error about locked vectors In-Reply-To: References: Message-ID: Hi, I'm using PETSc via the libMesh framework, so creating a MWE is complicated by that, unfortunately. The situation is that I am not modifying the solution vector in a callback. The SNES solve has terminated, with PetscErrorCode 82, and I then want to update the solution vector (reset it to the "previously converged value") and then try to solve again with a smaller load increment. This is a typical "auto load stepping" strategy in FE. I think the key piece of info I'd like to know is, at what point is the solution vector "unlocked" by the SNES object? Should it be unlocked as soon as the SNES solve has terminated with PetscErrorCode 82? 
Since it seems to me that it hasn't been unlocked yet (maybe just on a subset of the processes). Should I manually "unlock" the solution vector by calling VecLockWriteSet? Thanks, David On Wed, Dec 17, 2025 at 2:02?PM Stefano Zampini wrote: > You are not allowed to call VecGetArray on the solution vector of an SNES > object within a user callback, nor to modify its values in any other way. > Put in C++ lingo, the solution vector is a "const" argument > It would be great if you could provide an MWE to help us understand your > problem > > > Il giorno mer 17 dic 2025 alle ore 20:51 David Knezevic via petsc-users < > petsc-users at mcs.anl.gov> ha scritto: > >> Hi all, >> >> I have a question about this error: >> >>> Vector 'Vec_0x84000005_0' (argument #2) was locked for read-only access >>> in unknown_function() at unknown file:0 (line numbers only accurate to >>> function begin) >> >> >> I'm encountering this error in an FE solve where there is an error >> encountered during the residual/jacobian assembly, and what we normally do >> in that situation is shrink the load step and continue, starting from the >> "last converged solution". However, in this case I'm running on 32 >> processes, and 5 of the processes report the error above about a "locked >> vector". >> >> We clear the SNES object (via SNESDestroy) before we reset the solution >> to the "last converged solution", and then we make a new SNES object >> subsequently. But it seems to me that somehow the solution vector is still >> marked as "locked" on 5 of the processes when we modify the solution >> vector, which leads to the error above. >> >> I was wondering if someone could advise on what the best way to handle >> this would be? I thought one option could be to add an MPI barrier call >> prior to updating the solution vector to "last converged solution", to make >> sure that the SNES object is destroyed on all procs (and hence the locks >> cleared) before editing the solution vector, but I'm unsure if that would >> make a difference. Any help would be most appreciated! >> >> Thanks, >> David >> > > > -- > Stefano > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Dec 17 13:12:39 2025 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 17 Dec 2025 14:12:39 -0500 Subject: [petsc-users] Question regarding SNES error about locked vectors In-Reply-To: References: Message-ID: > On Dec 17, 2025, at 12:50?PM, David Knezevic via petsc-users wrote: > > Hi all, > > I have a question about this error: >> Vector 'Vec_0x84000005_0' (argument #2) was locked for read-only access in unknown_function() at unknown file:0 (line numbers only accurate to function begin) > > I'm encountering this error in an FE solve where there is an error encountered during the residual/jacobian assembly, and what we normally do in that situation is shrink the load step and continue, starting from the "last converged solution". However, in this case I'm running on 32 processes, and 5 of the processes report the error above about a "locked vector". It is very surprising that the vector is only locked on a subset of processes; this not normally expected behavior and likely indicates memory corruption or a logic error in the code. So the first thing to do is determine where the locking took place. 
You can build another instance of the PETSc libraries with debugging turned on by setting PETSC_ARCH to a new value and using the same ./configure options you used before but with --with-debugging=1 (instead of --with-debugging=0). If you are using a prebuilt PETSc from a package manager I don't know if there is an easy way to get a version with debugging turned on; I would hope so, but some package managers may not provide it; in that case you need to build PETSc yourself from source. Then run your code again and it should give detailed information (a stack trace) about where the locking took place. Why does PETSc lock vectors? During a nonlinear solve, for example, the solver algorithm will request your function to be evaluated using the function you pass to SNESSetFunction(). To ensure user code does not corrupt the solution process, the input vector (often called u in the documentation and code) is locked. It is locked because if the user code changes the values of the input vector it will "break" the iterative solver code (that is, incorrect answers could be produced). During a standard Newton solve the user never needs to "adjust" the Newton proposed solutions; that is all handled by the PETSc solver code (and line search, etc.). But for some difficult problems, the user may want to have custom code that "messes around" directly with the proposed Newton steps. SNES provides "hooks" where the user can provide such custom code (the "messing around" should never take place in the SNESSetFunction() callback, only within the "hooks"). For SNES Newton's method with line search (the default) one can provide hook functions using `SNESLineSearchSetPostCheck()` or `SNESLineSearchSetPreCheck()`, where the line search object is obtained with SNESGetLineSearch(). The Newton trust region methods have their own set of hooks. Based on your mention of "shrink the load step", I am speculating that for your code standard Newton's method may not converge, so you are adding additional code to help get Newton to converge, and this is triggering the error you are seeing. But it is possible my guess is incorrect and there is some other cause for the error; in either case, running with the debug version will help indicate where the locking issue is occurring. Barry > > We clear the SNES object (via SNESDestroy) before we reset the solution to the "last converged solution", and then we make a new SNES object subsequently. But it seems to me that somehow the solution vector is still marked as "locked" on 5 of the processes when we modify the solution vector, which leads to the error above. > > I was wondering if someone could advise on what the best way to handle this would be? I thought one option could be to add an MPI barrier call prior to updating the solution vector to "last converged solution", to make sure that the SNES object is destroyed on all procs (and hence the locks cleared) before editing the solution vector, but I'm unsure if that would make a difference. Any help would be most appreciated! > > Thanks, > David -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefano.zampini at gmail.com Wed Dec 17 13:20:51 2025 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Wed, 17 Dec 2025 22:20:51 +0300 Subject: [petsc-users] Question regarding SNES error about locked vectors In-Reply-To: References: Message-ID: Note that all error codes in PETSc are fatal, and we don't handle these cases gracefully.
So, code like this ierr = SNESSolve(snes, ...); if (ierr) { "do something and expect PETSc will keep working as usual" } is not supported in general. However, an SNES that does not converge should not generate an error, but rather return a negative SNESConvergedReason. Maybe you are using -snes_error_if_not_converged (via command line) or calling SNESSetErrorIfNotConverged at some point in the code. You can then do PetscCall(SNESSolve(snes, ...)); PetscCall(SNESGetConvergedReason(snes, &reason)); if (reason < 0) { "now you can do whatever you want and PETSc will keep working as usual" } Il giorno mer 17 dic 2025 alle ore 22:09 David Knezevic < david.knezevic at akselos.com> ha scritto: > Hi, > > I'm using PETSc via the libMesh framework, so creating a MWE is > complicated by that, unfortunately. > > The situation is that I am not modifying the solution vector in a > callback. The SNES solve has terminated, with PetscErrorCode 82, and I then > want to update the solution vector (reset it to the "previously converged > value") and then try to solve again with a smaller load increment. This is > a typical "auto load stepping" strategy in FE. > > I think the key piece of info I'd like to know is, at what point is the > solution vector "unlocked" by the SNES object? Should it be unlocked as > soon as the SNES solve has terminated with PetscErrorCode 82? Since it > seems to me that it hasn't been unlocked yet (maybe just on a subset of the > processes). Should I manually "unlock" the solution vector by > calling VecLockWriteSet? > > Thanks, > David > > > > On Wed, Dec 17, 2025 at 2:02 PM Stefano Zampini > wrote: > >> You are not allowed to call VecGetArray on the solution vector of an SNES >> object within a user callback, nor to modify its values in any other way. >> Put in C++ lingo, the solution vector is a "const" argument >> It would be great if you could provide an MWE to help us understand your >> problem >> >> >> Il giorno mer 17 dic 2025 alle ore 20:51 David Knezevic via petsc-users < >> petsc-users at mcs.anl.gov> ha scritto: >> >>> Hi all, >>> >>> I have a question about this error: >>> >>>> Vector 'Vec_0x84000005_0' (argument #2) was locked for read-only access >>>> in unknown_function() at unknown file:0 (line numbers only accurate to >>>> function begin) >>> >>> >>> I'm encountering this error in an FE solve where there is an error >>> encountered during the residual/jacobian assembly, and what we normally do >>> in that situation is shrink the load step and continue, starting from the >>> "last converged solution". However, in this case I'm running on 32 >>> processes, and 5 of the processes report the error above about a "locked >>> vector". >>> >>> We clear the SNES object (via SNESDestroy) before we reset the solution >>> to the "last converged solution", and then we make a new SNES object >>> subsequently. But it seems to me that somehow the solution vector is still >>> marked as "locked" on 5 of the processes when we modify the solution >>> vector, which leads to the error above. >>> >>> I was wondering if someone could advise on what the best way to handle >>> this would be? I thought one option could be to add an MPI barrier call >>> prior to updating the solution vector to "last converged solution", to make >>> sure that the SNES object is destroyed on all procs (and hence the locks >>> cleared) before editing the solution vector, but I'm unsure if that would >>> make a difference. Any help would be most appreciated!
>>> >>> Thanks, >>> David >>> >> >> >> -- >> Stefano >> > -- Stefano -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Dec 17 13:25:08 2025 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 17 Dec 2025 14:25:08 -0500 Subject: [petsc-users] Question regarding SNES error about locked vectors In-Reply-To: References: Message-ID: <855F3D06-08B9-4CD1-ABE8-3E55D4DD802E@petsc.dev> > On Dec 17, 2025, at 2:08 PM, David Knezevic via petsc-users wrote: > > Hi, > > I'm using PETSc via the libMesh framework, so creating a MWE is complicated by that, unfortunately. > > The situation is that I am not modifying the solution vector in a callback. The SNES solve has terminated, with PetscErrorCode 82, and I then want to update the solution vector (reset it to the "previously converged value") and then try to solve again with a smaller load increment. This is a typical "auto load stepping" strategy in FE. Once a PetscError is generated you CANNOT continue the PETSc program; it is not designed to allow this, and trying to continue will lead to further problems. So what you need to do is prevent PETSc from getting to the point where an actual PetscErrorCode of 82 is generated. Normally SNESSolve() returns without generating an error even if the nonlinear solver failed (for example, did not converge). One then uses SNESGetConvergedReason to check if it converged or not. Normally when SNESSolve() returns, regardless of whether the converged reason is negative or positive, there will be no locked vectors and one can modify the SNES object and call SNESSolve again. So my guess is that an actual PETSc error is being generated because SNESSetErrorIfNotConverged(snes,PETSC_TRUE) is being called by either your code or libMesh, or the option -snes_error_if_not_converged is being used. In your case, when you wish the code to work after a non-converged SNESSolve(), these options should never be set; instead you should check the result of SNESGetConvergedReason() to see if SNESSolve has failed. If SNESSetErrorIfNotConverged() is never being set, that may indicate you are using an old version of PETSc or have hit a bug inside PETSc's SNES that does not handle errors correctly, and we can help fix the problem if you can provide the full output from a debug build when the error occurs. Barry > > I think the key piece of info I'd like to know is, at what point is the solution vector "unlocked" by the SNES object? Should it be unlocked as soon as the SNES solve has terminated with PetscErrorCode 82? Since it seems to me that it hasn't been unlocked yet (maybe just on a subset of the processes). Should I manually "unlock" the solution vector by calling VecLockWriteSet? > > Thanks, > David > > > > On Wed, Dec 17, 2025 at 2:02 PM Stefano Zampini > wrote: >> You are not allowed to call VecGetArray on the solution vector of an SNES object within a user callback, nor to modify its values in any other way.
>> Put in C++ lingo, the solution vector is a "const" argument >> It would be great if you could provide an MWE to help us understand your problem >> >> >> Il giorno mer 17 dic 2025 alle ore 20:51 David Knezevic via petsc-users > ha scritto: >>> Hi all, >>> >>> I have a question about this error: >>>> Vector 'Vec_0x84000005_0' (argument #2) was locked for read-only access in unknown_function() at unknown file:0 (line numbers only accurate to function begin) >>> >>> I'm encountering this error in an FE solve where there is an error encountered during the residual/jacobian assembly, and what we normally do in that situation is shrink the load step and continue, starting from the "last converged solution". However, in this case I'm running on 32 processes, and 5 of the processes report the error above about a "locked vector". >>> >>> We clear the SNES object (via SNESDestroy) before we reset the solution to the "last converged solution", and then we make a new SNES object subsequently. But it seems to me that somehow the solution vector is still marked as "locked" on 5 of the processes when we modify the solution vector, which leads to the error above. >>> >>> I was wondering if someone could advise on what the best way to handle this would be? I thought one option could be to add an MPI barrier call prior to updating the solution vector to "last converged solution", to make sure that the SNES object is destroyed on all procs (and hence the locks cleared) before editing the solution vector, but I'm unsure if that would make a difference. Any help would be most appreciated! >>> >>> Thanks, >>> David >> >> >> >> -- >> Stefano -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.knezevic at akselos.com Wed Dec 17 13:47:58 2025 From: david.knezevic at akselos.com (David Knezevic) Date: Wed, 17 Dec 2025 14:47:58 -0500 Subject: [petsc-users] Question regarding SNES error about locked vectors In-Reply-To: <855F3D06-08B9-4CD1-ABE8-3E55D4DD802E@petsc.dev> References: <855F3D06-08B9-4CD1-ABE8-3E55D4DD802E@petsc.dev> Message-ID: Stefano and Barry: Thank you, this is very helpful. I'll give some more info here which may help to clarify further. Normally we do just get a negative "converged reason", as you described. But in this specific case where I'm having issues the solve is a numerically sensitive creep solve, which has exponential terms in the residual and jacobian callback that can "blow up" and give NaN values. In this case, the root cause is that we hit a NaN value during a callback, and then we throw an exception (in libMesh C++ code) which I gather leads to the SNES solve exiting with this error code. Is there a way to tell the SNES to terminate with a negative "converged reason" because we've encountered some issue during the callback? Thanks, David On Wed, Dec 17, 2025 at 2:25?PM Barry Smith wrote: > > > On Dec 17, 2025, at 2:08?PM, David Knezevic via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Hi, > > I'm using PETSc via the libMesh framework, so creating a MWE is > complicated by that, unfortunately. > > The situation is that I am not modifying the solution vector in a > callback. The SNES solve has terminated, with PetscErrorCode 82, and I then > want to update the solution vector (reset it to the "previously converged > value") and then try to solve again with a smaller load increment. This is > a typical "auto load stepping" strategy in FE. 
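The load-stepping loop referred to here typically has the following shape when built directly on SNES; this is a sketch only, assuming SNESSetErrorIfNotConverged() is not enabled so that SNESSolve() returns normally on failure, and with ApplyLoadIncrement() standing in for application code that is not part of PETSc:

  /* Sketch: drive the load from 0 to load_end, halving the increment after a failed solve. */
  static PetscErrorCode SolveWithLoadStepping(SNES snes, Vec u, PetscReal load_end, PetscReal dload)
  {
    Vec                 u_last; /* last converged solution */
    PetscReal           load = 0.0;
    SNESConvergedReason reason;

    PetscFunctionBeginUser;
    PetscCall(VecDuplicate(u, &u_last));
    PetscCall(VecCopy(u, u_last));
    while (load < load_end && dload > PETSC_SMALL) {
      PetscCall(ApplyLoadIncrement(load + dload)); /* hypothetical application routine: set loads/BCs for this step */
      PetscCall(SNESSolve(snes, NULL, u));
      PetscCall(SNESGetConvergedReason(snes, &reason));
      if (reason > 0) { /* converged: accept the step and remember the state */
        load += dload;
        PetscCall(VecCopy(u, u_last));
      } else { /* failed: restore the last converged state and retry with half the increment */
        PetscCall(VecCopy(u_last, u));
        dload *= 0.5;
      }
    }
    PetscCall(VecDestroy(&u_last));
    PetscFunctionReturn(PETSC_SUCCESS);
  }

Modifying u between calls to SNESSolve() is fine in this pattern because no locks are held once SNESSolve() has returned.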
> > > Once a PetscError is generated you CANNOT continue the PETSc program, > it is not designed to allow this and trying to continue will lead to > further problems. > > So what you need to do is prevent PETSc from getting to the point where > an actual PetscErrorCode of 82 is generated. Normally SNESSolve() returns > without generating an error even if the nonlinear solver failed (for > example did not converge). One then uses SNESGetConvergedReason to check if > it converged or not. Normally when SNESSolve() returns, regardless of > whether the converged reason is negative or positive, there will be no > locked vectors and one can modify the SNES object and call SNESSolve again. > > So my guess is that an actual PETSc error is being generated > because SNESSetErrorIfNotConverged(snes,PETSC_TRUE) is being called by > either your code or libMesh or the option -snes_error_if_not_converged is > being used. In your case when you wish the code to work after a > non-converged SNESSolve() these options should never be set instead you > should check the result of SNESGetConvergedReason() to check if SNESSolve > has failed. If SNESSetErrorIfNotConverged() is never being set that may > indicate you are using an old version of PETSc or have it a bug inside > PETSc's SNES that does not handle errors correctly and we can help fix the > problem if you can provide a full debug output version of when the error > occurs. > > Barry > > > > > > > > > I think the key piece of info I'd like to know is, at what point is the > solution vector "unlocked" by the SNES object? Should it be unlocked as > soon as the SNES solve has terminated with PetscErrorCode 82? Since it > seems to me that it hasn't been unlocked yet (maybe just on a subset of the > processes). Should I manually "unlock" the solution vector by > calling VecLockWriteSet? > > Thanks, > David > > > > On Wed, Dec 17, 2025 at 2:02?PM Stefano Zampini > wrote: > >> You are not allowed to call VecGetArray on the solution vector of an SNES >> object within a user callback, nor to modify its values in any other way. >> Put in C++ lingo, the solution vector is a "const" argument >> It would be great if you could provide an MWE to help us understand your >> problem >> >> >> Il giorno mer 17 dic 2025 alle ore 20:51 David Knezevic via petsc-users < >> petsc-users at mcs.anl.gov> ha scritto: >> >>> Hi all, >>> >>> I have a question about this error: >>> >>>> Vector 'Vec_0x84000005_0' (argument #2) was locked for read-only access >>>> in unknown_function() at unknown file:0 (line numbers only accurate to >>>> function begin) >>> >>> >>> I'm encountering this error in an FE solve where there is an error >>> encountered during the residual/jacobian assembly, and what we normally do >>> in that situation is shrink the load step and continue, starting from the >>> "last converged solution". However, in this case I'm running on 32 >>> processes, and 5 of the processes report the error above about a "locked >>> vector". >>> >>> We clear the SNES object (via SNESDestroy) before we reset the solution >>> to the "last converged solution", and then we make a new SNES object >>> subsequently. But it seems to me that somehow the solution vector is still >>> marked as "locked" on 5 of the processes when we modify the solution >>> vector, which leads to the error above. >>> >>> I was wondering if someone could advise on what the best way to handle >>> this would be? 
I thought one option could be to add an MPI barrier call >>> prior to updating the solution vector to "last converged solution", to make >>> sure that the SNES object is destroyed on all procs (and hence the locks >>> cleared) before editing the solution vector, but I'm unsure if that would >>> make a difference. Any help would be most appreciated! >>> >>> Thanks, >>> David >>> >> >> >> -- >> Stefano >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.knezevic at akselos.com Wed Dec 17 14:17:55 2025 From: david.knezevic at akselos.com (David Knezevic) Date: Wed, 17 Dec 2025 15:17:55 -0500 Subject: [petsc-users] Question regarding SNES error about locked vectors In-Reply-To: References: <855F3D06-08B9-4CD1-ABE8-3E55D4DD802E@petsc.dev> Message-ID: P.S. I checked our code more carefully, and I see that the PETSC_ERROR_CODE 82 is coming from our code. Sorry for not realizing that earlier. We encounter the NaN I mentioned in my previous email, which leads to us returning that error code 82 from the "residual assembly" callback. I guess instead of doing that, we should just set the "converged reason" to be a negative value (e.g. SNES_DIVERGED_USER), and that should let PETSc exit the solve properly? Is it possible to set "converged reason" to SNES_DIVERGED_USER, or is there a better way to handle this? Thanks, David On Wed, Dec 17, 2025 at 2:47?PM David Knezevic wrote: > Stefano and Barry: Thank you, this is very helpful. > > I'll give some more info here which may help to clarify further. Normally > we do just get a negative "converged reason", as you described. But in this > specific case where I'm having issues the solve is a numerically sensitive > creep solve, which has exponential terms in the residual and jacobian > callback that can "blow up" and give NaN values. In this case, the root > cause is that we hit a NaN value during a callback, and then we throw an > exception (in libMesh C++ code) which I gather leads to the SNES solve > exiting with this error code. > > Is there a way to tell the SNES to terminate with a negative "converged > reason" because we've encountered some issue during the callback? > > Thanks, > David > > > On Wed, Dec 17, 2025 at 2:25?PM Barry Smith wrote: > >> >> >> On Dec 17, 2025, at 2:08?PM, David Knezevic via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> >> Hi, >> >> I'm using PETSc via the libMesh framework, so creating a MWE is >> complicated by that, unfortunately. >> >> The situation is that I am not modifying the solution vector in a >> callback. The SNES solve has terminated, with PetscErrorCode 82, and I then >> want to update the solution vector (reset it to the "previously converged >> value") and then try to solve again with a smaller load increment. This is >> a typical "auto load stepping" strategy in FE. >> >> >> Once a PetscError is generated you CANNOT continue the PETSc program, >> it is not designed to allow this and trying to continue will lead to >> further problems. >> >> So what you need to do is prevent PETSc from getting to the point >> where an actual PetscErrorCode of 82 is generated. Normally SNESSolve() >> returns without generating an error even if the nonlinear solver failed >> (for example did not converge). One then uses SNESGetConvergedReason to >> check if it converged or not. 
Normally when SNESSolve() returns, regardless >> of whether the converged reason is negative or positive, there will be no >> locked vectors and one can modify the SNES object and call SNESSolve again. >> >> So my guess is that an actual PETSc error is being generated >> because SNESSetErrorIfNotConverged(snes,PETSC_TRUE) is being called by >> either your code or libMesh or the option -snes_error_if_not_converged is >> being used. In your case when you wish the code to work after a >> non-converged SNESSolve() these options should never be set instead you >> should check the result of SNESGetConvergedReason() to check if SNESSolve >> has failed. If SNESSetErrorIfNotConverged() is never being set that may >> indicate you are using an old version of PETSc or have it a bug inside >> PETSc's SNES that does not handle errors correctly and we can help fix the >> problem if you can provide a full debug output version of when the error >> occurs. >> >> Barry >> >> >> >> >> >> >> >> >> I think the key piece of info I'd like to know is, at what point is the >> solution vector "unlocked" by the SNES object? Should it be unlocked as >> soon as the SNES solve has terminated with PetscErrorCode 82? Since it >> seems to me that it hasn't been unlocked yet (maybe just on a subset of the >> processes). Should I manually "unlock" the solution vector by >> calling VecLockWriteSet? >> >> Thanks, >> David >> >> >> >> On Wed, Dec 17, 2025 at 2:02?PM Stefano Zampini < >> stefano.zampini at gmail.com> wrote: >> >>> You are not allowed to call VecGetArray on the solution vector of an >>> SNES object within a user callback, nor to modify its values in any other >>> way. >>> Put in C++ lingo, the solution vector is a "const" argument >>> It would be great if you could provide an MWE to help us understand your >>> problem >>> >>> >>> Il giorno mer 17 dic 2025 alle ore 20:51 David Knezevic via petsc-users < >>> petsc-users at mcs.anl.gov> ha scritto: >>> >>>> Hi all, >>>> >>>> I have a question about this error: >>>> >>>>> Vector 'Vec_0x84000005_0' (argument #2) was locked for read-only >>>>> access in unknown_function() at unknown file:0 (line numbers only accurate >>>>> to function begin) >>>> >>>> >>>> I'm encountering this error in an FE solve where there is an error >>>> encountered during the residual/jacobian assembly, and what we normally do >>>> in that situation is shrink the load step and continue, starting from the >>>> "last converged solution". However, in this case I'm running on 32 >>>> processes, and 5 of the processes report the error above about a "locked >>>> vector". >>>> >>>> We clear the SNES object (via SNESDestroy) before we reset the solution >>>> to the "last converged solution", and then we make a new SNES object >>>> subsequently. But it seems to me that somehow the solution vector is still >>>> marked as "locked" on 5 of the processes when we modify the solution >>>> vector, which leads to the error above. >>>> >>>> I was wondering if someone could advise on what the best way to handle >>>> this would be? I thought one option could be to add an MPI barrier call >>>> prior to updating the solution vector to "last converged solution", to make >>>> sure that the SNES object is destroyed on all procs (and hence the locks >>>> cleared) before editing the solution vector, but I'm unsure if that would >>>> make a difference. Any help would be most appreciated! 
>>>> >>>> Thanks, >>>> David >>>> >>> >>> >>> -- >>> Stefano >>> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Dec 17 14:43:07 2025 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 17 Dec 2025 15:43:07 -0500 Subject: [petsc-users] Question regarding SNES error about locked vectors In-Reply-To: References: <855F3D06-08B9-4CD1-ABE8-3E55D4DD802E@petsc.dev> Message-ID: > On Dec 17, 2025, at 2:47?PM, David Knezevic wrote: > > Stefano and Barry: Thank you, this is very helpful. > > I'll give some more info here which may help to clarify further. Normally we do just get a negative "converged reason", as you described. But in this specific case where I'm having issues the solve is a numerically sensitive creep solve, which has exponential terms in the residual and jacobian callback that can "blow up" and give NaN values. In this case, the root cause is that we hit a NaN value during a callback, and then we throw an exception (in libMesh C++ code) which I gather leads to the SNES solve exiting with this error code. > > Is there a way to tell the SNES to terminate with a negative "converged reason" because we've encountered some issue during the callback? In your callback you should call SNESSetFunctionDomainError() and make sure the function value has an infinity or NaN in it (you can call VecFlag() for this purpose)). Now SNESConvergedReason will be a completely reasonable SNES_DIVERGED_FUNCTION_DOMAIN Barry If you are using an ancient version of PETSc (I hope you are using the latest since that always has more bug fixes and features) that does not have SNESSetFunctionDomainError then just make sure the function vector result has an infinity or NaN in it and then SNESConvergedReason will be SNES_DIVERGED_FNORM_NAN > > Thanks, > David > > > On Wed, Dec 17, 2025 at 2:25?PM Barry Smith > wrote: >> >> >>> On Dec 17, 2025, at 2:08?PM, David Knezevic via petsc-users > wrote: >>> >>> Hi, >>> >>> I'm using PETSc via the libMesh framework, so creating a MWE is complicated by that, unfortunately. >>> >>> The situation is that I am not modifying the solution vector in a callback. The SNES solve has terminated, with PetscErrorCode 82, and I then want to update the solution vector (reset it to the "previously converged value") and then try to solve again with a smaller load increment. This is a typical "auto load stepping" strategy in FE. >> >> Once a PetscError is generated you CANNOT continue the PETSc program, it is not designed to allow this and trying to continue will lead to further problems. >> >> So what you need to do is prevent PETSc from getting to the point where an actual PetscErrorCode of 82 is generated. Normally SNESSolve() returns without generating an error even if the nonlinear solver failed (for example did not converge). One then uses SNESGetConvergedReason to check if it converged or not. Normally when SNESSolve() returns, regardless of whether the converged reason is negative or positive, there will be no locked vectors and one can modify the SNES object and call SNESSolve again. >> >> So my guess is that an actual PETSc error is being generated because SNESSetErrorIfNotConverged(snes,PETSC_TRUE) is being called by either your code or libMesh or the option -snes_error_if_not_converged is being used. In your case when you wish the code to work after a non-converged SNESSolve() these options should never be set instead you should check the result of SNESGetConvergedReason() to check if SNESSolve has failed. 
If SNESSetErrorIfNotConverged() is never being set that may indicate you are using an old version of PETSc or have it a bug inside PETSc's SNES that does not handle errors correctly and we can help fix the problem if you can provide a full debug output version of when the error occurs. >> >> Barry >> >> >> >> >> >> >> >>> >>> I think the key piece of info I'd like to know is, at what point is the solution vector "unlocked" by the SNES object? Should it be unlocked as soon as the SNES solve has terminated with PetscErrorCode 82? Since it seems to me that it hasn't been unlocked yet (maybe just on a subset of the processes). Should I manually "unlock" the solution vector by calling VecLockWriteSet? >>> >>> Thanks, >>> David >>> >>> >>> >>> On Wed, Dec 17, 2025 at 2:02?PM Stefano Zampini > wrote: >>>> You are not allowed to call VecGetArray on the solution vector of an SNES object within a user callback, nor to modify its values in any other way. >>>> Put in C++ lingo, the solution vector is a "const" argument >>>> It would be great if you could provide an MWE to help us understand your problem >>>> >>>> >>>> Il giorno mer 17 dic 2025 alle ore 20:51 David Knezevic via petsc-users > ha scritto: >>>>> Hi all, >>>>> >>>>> I have a question about this error: >>>>>> Vector 'Vec_0x84000005_0' (argument #2) was locked for read-only access in unknown_function() at unknown file:0 (line numbers only accurate to function begin) >>>>> >>>>> I'm encountering this error in an FE solve where there is an error encountered during the residual/jacobian assembly, and what we normally do in that situation is shrink the load step and continue, starting from the "last converged solution". However, in this case I'm running on 32 processes, and 5 of the processes report the error above about a "locked vector". >>>>> >>>>> We clear the SNES object (via SNESDestroy) before we reset the solution to the "last converged solution", and then we make a new SNES object subsequently. But it seems to me that somehow the solution vector is still marked as "locked" on 5 of the processes when we modify the solution vector, which leads to the error above. >>>>> >>>>> I was wondering if someone could advise on what the best way to handle this would be? I thought one option could be to add an MPI barrier call prior to updating the solution vector to "last converged solution", to make sure that the SNES object is destroyed on all procs (and hence the locks cleared) before editing the solution vector, but I'm unsure if that would make a difference. Any help would be most appreciated! >>>>> >>>>> Thanks, >>>>> David >>>> >>>> >>>> >>>> -- >>>> Stefano >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.knezevic at akselos.com Thu Dec 18 07:10:14 2025 From: david.knezevic at akselos.com (David Knezevic) Date: Thu, 18 Dec 2025 08:10:14 -0500 Subject: [petsc-users] Question regarding SNES error about locked vectors In-Reply-To: References: <855F3D06-08B9-4CD1-ABE8-3E55D4DD802E@petsc.dev> Message-ID: Thank you very much for this guidance. I switched to use SNES_DIVERGED_FUNCTION_DOMAIN, and I don't get any errors now. Thanks! David On Wed, Dec 17, 2025 at 3:43?PM Barry Smith wrote: > > > On Dec 17, 2025, at 2:47?PM, David Knezevic > wrote: > > Stefano and Barry: Thank you, this is very helpful. > > I'll give some more info here which may help to clarify further. Normally > we do just get a negative "converged reason", as you described. 
But in this > specific case where I'm having issues the solve is a numerically sensitive > creep solve, which has exponential terms in the residual and jacobian > callback that can "blow up" and give NaN values. In this case, the root > cause is that we hit a NaN value during a callback, and then we throw an > exception (in libMesh C++ code) which I gather leads to the SNES solve > exiting with this error code. > > Is there a way to tell the SNES to terminate with a negative "converged > reason" because we've encountered some issue during the callback? > > > In your callback you should call SNESSetFunctionDomainError() and make > sure the function value has an infinity or NaN in it (you can call > VecFlag() for this purpose)). > > Now SNESConvergedReason will be a completely > reasonable SNES_DIVERGED_FUNCTION_DOMAIN > > Barry > > If you are using an ancient version of PETSc (I hope you are using the > latest since that always has more bug fixes and features) that does not > have SNESSetFunctionDomainError then just make sure the function vector > result has an infinity or NaN in it and then SNESConvergedReason will be > SNES_DIVERGED_FNORM_NAN > > > > > Thanks, > David > > > On Wed, Dec 17, 2025 at 2:25?PM Barry Smith wrote: > >> >> >> On Dec 17, 2025, at 2:08?PM, David Knezevic via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> >> Hi, >> >> I'm using PETSc via the libMesh framework, so creating a MWE is >> complicated by that, unfortunately. >> >> The situation is that I am not modifying the solution vector in a >> callback. The SNES solve has terminated, with PetscErrorCode 82, and I then >> want to update the solution vector (reset it to the "previously converged >> value") and then try to solve again with a smaller load increment. This is >> a typical "auto load stepping" strategy in FE. >> >> >> Once a PetscError is generated you CANNOT continue the PETSc program, >> it is not designed to allow this and trying to continue will lead to >> further problems. >> >> So what you need to do is prevent PETSc from getting to the point >> where an actual PetscErrorCode of 82 is generated. Normally SNESSolve() >> returns without generating an error even if the nonlinear solver failed >> (for example did not converge). One then uses SNESGetConvergedReason to >> check if it converged or not. Normally when SNESSolve() returns, regardless >> of whether the converged reason is negative or positive, there will be no >> locked vectors and one can modify the SNES object and call SNESSolve again. >> >> So my guess is that an actual PETSc error is being generated >> because SNESSetErrorIfNotConverged(snes,PETSC_TRUE) is being called by >> either your code or libMesh or the option -snes_error_if_not_converged is >> being used. In your case when you wish the code to work after a >> non-converged SNESSolve() these options should never be set instead you >> should check the result of SNESGetConvergedReason() to check if SNESSolve >> has failed. If SNESSetErrorIfNotConverged() is never being set that may >> indicate you are using an old version of PETSc or have it a bug inside >> PETSc's SNES that does not handle errors correctly and we can help fix the >> problem if you can provide a full debug output version of when the error >> occurs. >> >> Barry >> >> >> >> >> >> >> >> >> I think the key piece of info I'd like to know is, at what point is the >> solution vector "unlocked" by the SNES object? Should it be unlocked as >> soon as the SNES solve has terminated with PetscErrorCode 82? 
Since it >> seems to me that it hasn't been unlocked yet (maybe just on a subset of the >> processes). Should I manually "unlock" the solution vector by >> calling VecLockWriteSet? >> >> Thanks, >> David >> >> >> >> On Wed, Dec 17, 2025 at 2:02?PM Stefano Zampini < >> stefano.zampini at gmail.com> wrote: >> >>> You are not allowed to call VecGetArray on the solution vector of an >>> SNES object within a user callback, nor to modify its values in any other >>> way. >>> Put in C++ lingo, the solution vector is a "const" argument >>> It would be great if you could provide an MWE to help us understand your >>> problem >>> >>> >>> Il giorno mer 17 dic 2025 alle ore 20:51 David Knezevic via petsc-users < >>> petsc-users at mcs.anl.gov> ha scritto: >>> >>>> Hi all, >>>> >>>> I have a question about this error: >>>> >>>>> Vector 'Vec_0x84000005_0' (argument #2) was locked for read-only >>>>> access in unknown_function() at unknown file:0 (line numbers only accurate >>>>> to function begin) >>>> >>>> >>>> I'm encountering this error in an FE solve where there is an error >>>> encountered during the residual/jacobian assembly, and what we normally do >>>> in that situation is shrink the load step and continue, starting from the >>>> "last converged solution". However, in this case I'm running on 32 >>>> processes, and 5 of the processes report the error above about a "locked >>>> vector". >>>> >>>> We clear the SNES object (via SNESDestroy) before we reset the solution >>>> to the "last converged solution", and then we make a new SNES object >>>> subsequently. But it seems to me that somehow the solution vector is still >>>> marked as "locked" on 5 of the processes when we modify the solution >>>> vector, which leads to the error above. >>>> >>>> I was wondering if someone could advise on what the best way to handle >>>> this would be? I thought one option could be to add an MPI barrier call >>>> prior to updating the solution vector to "last converged solution", to make >>>> sure that the SNES object is destroyed on all procs (and hence the locks >>>> cleared) before editing the solution vector, but I'm unsure if that would >>>> make a difference. Any help would be most appreciated! >>>> >>>> Thanks, >>>> David >>>> >>> >>> >>> -- >>> Stefano >>> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Fri Dec 19 11:27:57 2025 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Fri, 19 Dec 2025 11:27:57 -0600 Subject: [petsc-users] Calling for short user presentations in a PETSc BoF Message-ID: Dear PETSc/TAO Community, We are soliciting PETSc users to share their usage experiences, application successes, and ongoing challenges in an online Zoom Birds-of-a-Feather (BoF) session, to be held between February 10~12, 2026. We are seeking approximately five short user presentations, each consisting of a 5-minute talk followed by 2 minutes of questions. If you are interested in presenting, please contact petsc-maint at mcs.anl.gov with your talk title, a brief abstract, and your preferred time slot (11:00 AM, 1:00 PM, or 3:00 PM EST). The BoF is hosted by the Consortium for the Advancement of Scientific Software (CASS ) and led by the PESO (Partnering for Scientific Software Ecosystem Stewardship) project. The PETSc session will last 90 minutes and will take place on one day between February 10 and 12, 2026 (exact date to be finalized). Please note that the session will not be recorded. 
Our preferred time slot is 11:00 AM EST (5:00 PM UTC) to better accommodate European participants, although alternative options at 1:00 PM or 3:00 PM EST are also under consideration. In addition to user presentations, during the session, PETSc developers will highlight recent advances developed following the Exascale Computing Project, including the new PETSc Fortran bindings, PetscRegressor, TaoTerm, updates to PETSc GPU backends, mixed-precision support in PETSc/MUMPS, and integration with OpenFOAM, among other topics. The program will also include an open discussion of emerging PETSc research directions, such as leveraging agentic artificial intelligence to enhance and exploit the PETSc knowledge base. The BoF will provide insight into PETSc?s near-term development roadmap and offer a forum for user feedback on desired features and improvements. Active participation and questions from the audience are strongly encouraged, enabling the PETSc team to better align future development with community needs. The agenda will be posted once the program is finalized. Thank you, and we look forward to your participation. Junchao Zhang On behalf of the PETSc team -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.knezevic at akselos.com Sun Dec 21 16:53:25 2025 From: david.knezevic at akselos.com (David Knezevic) Date: Sun, 21 Dec 2025 17:53:25 -0500 Subject: [petsc-users] Question regarding SNES error about locked vectors In-Reply-To: References: <855F3D06-08B9-4CD1-ABE8-3E55D4DD802E@petsc.dev> Message-ID: Hi, actually, I have a follow up on this topic. I noticed that when I call SNESSetFunctionDomainError(), it exits the solve as expected, but it leads to a converged reason "DIVERGED_LINE_SEARCH" instead of "DIVERGED_FUNCTION_DOMAIN". If I also set SNESSetConvergedReason(snes, SNES_DIVERGED_FUNCTION_DOMAIN) in the callback, then I get the expected SNES_DIVERGED_FUNCTION_DOMAIN converged reason, so that's what I'm doing now. I was surprised by this behavior, though, since I expected that calling SNESSetFunctionDomainError woudld lead to the DIVERGED_FUNCTION_DOMAIN converged reason, so I just wanted to check on what could be causing this. FYI, I'm using PETSc 3.23.4 Thanks, David On Thu, Dec 18, 2025 at 8:10?AM David Knezevic wrote: > Thank you very much for this guidance. I switched to use > SNES_DIVERGED_FUNCTION_DOMAIN, and I don't get any errors now. > > Thanks! > David > > > On Wed, Dec 17, 2025 at 3:43?PM Barry Smith wrote: > >> >> >> On Dec 17, 2025, at 2:47?PM, David Knezevic >> wrote: >> >> Stefano and Barry: Thank you, this is very helpful. >> >> I'll give some more info here which may help to clarify further. Normally >> we do just get a negative "converged reason", as you described. But in this >> specific case where I'm having issues the solve is a numerically sensitive >> creep solve, which has exponential terms in the residual and jacobian >> callback that can "blow up" and give NaN values. In this case, the root >> cause is that we hit a NaN value during a callback, and then we throw an >> exception (in libMesh C++ code) which I gather leads to the SNES solve >> exiting with this error code. >> >> Is there a way to tell the SNES to terminate with a negative "converged >> reason" because we've encountered some issue during the callback? >> >> >> In your callback you should call SNESSetFunctionDomainError() and make >> sure the function value has an infinity or NaN in it (you can call >> VecFlag() for this purpose)). 
>> >> Now SNESConvergedReason will be a completely >> reasonable SNES_DIVERGED_FUNCTION_DOMAIN >> >> Barry >> >> If you are using an ancient version of PETSc (I hope you are using the >> latest since that always has more bug fixes and features) that does not >> have SNESSetFunctionDomainError then just make sure the function vector >> result has an infinity or NaN in it and then SNESConvergedReason will be >> SNES_DIVERGED_FNORM_NAN >> >> >> >> >> Thanks, >> David >> >> >> On Wed, Dec 17, 2025 at 2:25?PM Barry Smith wrote: >> >>> >>> >>> On Dec 17, 2025, at 2:08?PM, David Knezevic via petsc-users < >>> petsc-users at mcs.anl.gov> wrote: >>> >>> Hi, >>> >>> I'm using PETSc via the libMesh framework, so creating a MWE is >>> complicated by that, unfortunately. >>> >>> The situation is that I am not modifying the solution vector in a >>> callback. The SNES solve has terminated, with PetscErrorCode 82, and I then >>> want to update the solution vector (reset it to the "previously converged >>> value") and then try to solve again with a smaller load increment. This is >>> a typical "auto load stepping" strategy in FE. >>> >>> >>> Once a PetscError is generated you CANNOT continue the PETSc program, >>> it is not designed to allow this and trying to continue will lead to >>> further problems. >>> >>> So what you need to do is prevent PETSc from getting to the point >>> where an actual PetscErrorCode of 82 is generated. Normally SNESSolve() >>> returns without generating an error even if the nonlinear solver failed >>> (for example did not converge). One then uses SNESGetConvergedReason to >>> check if it converged or not. Normally when SNESSolve() returns, regardless >>> of whether the converged reason is negative or positive, there will be no >>> locked vectors and one can modify the SNES object and call SNESSolve again. >>> >>> So my guess is that an actual PETSc error is being generated >>> because SNESSetErrorIfNotConverged(snes,PETSC_TRUE) is being called by >>> either your code or libMesh or the option -snes_error_if_not_converged is >>> being used. In your case when you wish the code to work after a >>> non-converged SNESSolve() these options should never be set instead you >>> should check the result of SNESGetConvergedReason() to check if SNESSolve >>> has failed. If SNESSetErrorIfNotConverged() is never being set that may >>> indicate you are using an old version of PETSc or have it a bug inside >>> PETSc's SNES that does not handle errors correctly and we can help fix the >>> problem if you can provide a full debug output version of when the error >>> occurs. >>> >>> Barry >>> >>> >>> >>> >>> >>> >>> >>> >>> I think the key piece of info I'd like to know is, at what point is the >>> solution vector "unlocked" by the SNES object? Should it be unlocked as >>> soon as the SNES solve has terminated with PetscErrorCode 82? Since it >>> seems to me that it hasn't been unlocked yet (maybe just on a subset of the >>> processes). Should I manually "unlock" the solution vector by >>> calling VecLockWriteSet? >>> >>> Thanks, >>> David >>> >>> >>> >>> On Wed, Dec 17, 2025 at 2:02?PM Stefano Zampini < >>> stefano.zampini at gmail.com> wrote: >>> >>>> You are not allowed to call VecGetArray on the solution vector of an >>>> SNES object within a user callback, nor to modify its values in any other >>>> way. 
>>>> Put in C++ lingo, the solution vector is a "const" argument >>>> It would be great if you could provide an MWE to help us understand >>>> your problem >>>> >>>> >>>> Il giorno mer 17 dic 2025 alle ore 20:51 David Knezevic via petsc-users >>>> ha scritto: >>>> >>>>> Hi all, >>>>> >>>>> I have a question about this error: >>>>> >>>>>> Vector 'Vec_0x84000005_0' (argument #2) was locked for read-only >>>>>> access in unknown_function() at unknown file:0 (line numbers only accurate >>>>>> to function begin) >>>>> >>>>> >>>>> I'm encountering this error in an FE solve where there is an error >>>>> encountered during the residual/jacobian assembly, and what we normally do >>>>> in that situation is shrink the load step and continue, starting from the >>>>> "last converged solution". However, in this case I'm running on 32 >>>>> processes, and 5 of the processes report the error above about a "locked >>>>> vector". >>>>> >>>>> We clear the SNES object (via SNESDestroy) before we reset the >>>>> solution to the "last converged solution", and then we make a new SNES >>>>> object subsequently. But it seems to me that somehow the solution vector is >>>>> still marked as "locked" on 5 of the processes when we modify the solution >>>>> vector, which leads to the error above. >>>>> >>>>> I was wondering if someone could advise on what the best way to handle >>>>> this would be? I thought one option could be to add an MPI barrier call >>>>> prior to updating the solution vector to "last converged solution", to make >>>>> sure that the SNES object is destroyed on all procs (and hence the locks >>>>> cleared) before editing the solution vector, but I'm unsure if that would >>>>> make a difference. Any help would be most appreciated! >>>>> >>>>> Thanks, >>>>> David >>>>> >>>> >>>> >>>> -- >>>> Stefano >>>> >>> >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From liluo at um.edu.mo Mon Dec 22 04:46:29 2025 From: liluo at um.edu.mo (liluo) Date: Mon, 22 Dec 2025 10:46:29 +0000 Subject: [petsc-users] A partition of DMPlex mesh similar to what DMDA provides? Message-ID: Dear PETSc developers, I?m using DMPlex to manage an unstructured mesh. However, in my case, the input mesh is actually a structured tetrahedral mesh, and its geometric domain is just a simple box. Is there any PETSc functionality or recommended approach to obtain a partition similar to what DMDA provides?i.e., a simple Cartesian block partition?when working with such a mesh in DMPlex? Any guidance or best practices would be greatly appreciated. Thank you! Bests, Li Luo -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Mon Dec 22 09:25:39 2025 From: bsmith at petsc.dev (Barry Smith) Date: Mon, 22 Dec 2025 10:25:39 -0500 Subject: [petsc-users] Question regarding SNES error about locked vectors In-Reply-To: References: <855F3D06-08B9-4CD1-ABE8-3E55D4DD802E@petsc.dev> Message-ID: David, This is due to a software glitch. SNES_DIVERGED_FUNCTION_DOMAIN was added long after the origins of SNES and, in places, the code was never fully updated to handle function domain problems. In particular, parts of the line search don't handle it correctly. Can you run with -snes_view and that will help us find the spot that needs to be updated. Barry > On Dec 21, 2025, at 5:53?PM, David Knezevic wrote: > > Hi, actually, I have a follow up on this topic. 
> > I noticed that when I call SNESSetFunctionDomainError(), it exits the solve as expected, but it leads to a converged reason "DIVERGED_LINE_SEARCH" instead of "DIVERGED_FUNCTION_DOMAIN". If I also set SNESSetConvergedReason(snes, SNES_DIVERGED_FUNCTION_DOMAIN) in the callback, then I get the expected SNES_DIVERGED_FUNCTION_DOMAIN converged reason, so that's what I'm doing now. I was surprised by this behavior, though, since I expected that calling SNESSetFunctionDomainError woudld lead to the DIVERGED_FUNCTION_DOMAIN converged reason, so I just wanted to check on what could be causing this. > > FYI, I'm using PETSc 3.23.4 > > Thanks, > David > > > On Thu, Dec 18, 2025 at 8:10?AM David Knezevic > wrote: >> Thank you very much for this guidance. I switched to use SNES_DIVERGED_FUNCTION_DOMAIN, and I don't get any errors now. >> >> Thanks! >> David >> >> >> On Wed, Dec 17, 2025 at 3:43?PM Barry Smith > wrote: >>> >>> >>>> On Dec 17, 2025, at 2:47?PM, David Knezevic > wrote: >>>> >>>> Stefano and Barry: Thank you, this is very helpful. >>>> >>>> I'll give some more info here which may help to clarify further. Normally we do just get a negative "converged reason", as you described. But in this specific case where I'm having issues the solve is a numerically sensitive creep solve, which has exponential terms in the residual and jacobian callback that can "blow up" and give NaN values. In this case, the root cause is that we hit a NaN value during a callback, and then we throw an exception (in libMesh C++ code) which I gather leads to the SNES solve exiting with this error code. >>>> >>>> Is there a way to tell the SNES to terminate with a negative "converged reason" because we've encountered some issue during the callback? >>> >>> In your callback you should call SNESSetFunctionDomainError() and make sure the function value has an infinity or NaN in it (you can call VecFlag() for this purpose)). >>> >>> Now SNESConvergedReason will be a completely reasonable SNES_DIVERGED_FUNCTION_DOMAIN >>> >>> Barry >>> >>> If you are using an ancient version of PETSc (I hope you are using the latest since that always has more bug fixes and features) that does not have SNESSetFunctionDomainError then just make sure the function vector result has an infinity or NaN in it and then SNESConvergedReason will be SNES_DIVERGED_FNORM_NAN >>> >>> >>> >>>> >>>> Thanks, >>>> David >>>> >>>> >>>> On Wed, Dec 17, 2025 at 2:25?PM Barry Smith > wrote: >>>>> >>>>> >>>>>> On Dec 17, 2025, at 2:08?PM, David Knezevic via petsc-users > wrote: >>>>>> >>>>>> Hi, >>>>>> >>>>>> I'm using PETSc via the libMesh framework, so creating a MWE is complicated by that, unfortunately. >>>>>> >>>>>> The situation is that I am not modifying the solution vector in a callback. The SNES solve has terminated, with PetscErrorCode 82, and I then want to update the solution vector (reset it to the "previously converged value") and then try to solve again with a smaller load increment. This is a typical "auto load stepping" strategy in FE. >>>>> >>>>> Once a PetscError is generated you CANNOT continue the PETSc program, it is not designed to allow this and trying to continue will lead to further problems. >>>>> >>>>> So what you need to do is prevent PETSc from getting to the point where an actual PetscErrorCode of 82 is generated. Normally SNESSolve() returns without generating an error even if the nonlinear solver failed (for example did not converge). One then uses SNESGetConvergedReason to check if it converged or not. 
Normally when SNESSolve() returns, regardless of whether the converged reason is negative or positive, there will be no locked vectors and one can modify the SNES object and call SNESSolve again. >>>>> >>>>> So my guess is that an actual PETSc error is being generated because SNESSetErrorIfNotConverged(snes,PETSC_TRUE) is being called by either your code or libMesh or the option -snes_error_if_not_converged is being used. In your case when you wish the code to work after a non-converged SNESSolve() these options should never be set instead you should check the result of SNESGetConvergedReason() to check if SNESSolve has failed. If SNESSetErrorIfNotConverged() is never being set that may indicate you are using an old version of PETSc or have it a bug inside PETSc's SNES that does not handle errors correctly and we can help fix the problem if you can provide a full debug output version of when the error occurs. >>>>> >>>>> Barry >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>>> >>>>>> I think the key piece of info I'd like to know is, at what point is the solution vector "unlocked" by the SNES object? Should it be unlocked as soon as the SNES solve has terminated with PetscErrorCode 82? Since it seems to me that it hasn't been unlocked yet (maybe just on a subset of the processes). Should I manually "unlock" the solution vector by calling VecLockWriteSet? >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>> >>>>>> >>>>>> On Wed, Dec 17, 2025 at 2:02?PM Stefano Zampini > wrote: >>>>>>> You are not allowed to call VecGetArray on the solution vector of an SNES object within a user callback, nor to modify its values in any other way. >>>>>>> Put in C++ lingo, the solution vector is a "const" argument >>>>>>> It would be great if you could provide an MWE to help us understand your problem >>>>>>> >>>>>>> >>>>>>> Il giorno mer 17 dic 2025 alle ore 20:51 David Knezevic via petsc-users > ha scritto: >>>>>>>> Hi all, >>>>>>>> >>>>>>>> I have a question about this error: >>>>>>>>> Vector 'Vec_0x84000005_0' (argument #2) was locked for read-only access in unknown_function() at unknown file:0 (line numbers only accurate to function begin) >>>>>>>> >>>>>>>> I'm encountering this error in an FE solve where there is an error encountered during the residual/jacobian assembly, and what we normally do in that situation is shrink the load step and continue, starting from the "last converged solution". However, in this case I'm running on 32 processes, and 5 of the processes report the error above about a "locked vector". >>>>>>>> >>>>>>>> We clear the SNES object (via SNESDestroy) before we reset the solution to the "last converged solution", and then we make a new SNES object subsequently. But it seems to me that somehow the solution vector is still marked as "locked" on 5 of the processes when we modify the solution vector, which leads to the error above. >>>>>>>> >>>>>>>> I was wondering if someone could advise on what the best way to handle this would be? I thought one option could be to add an MPI barrier call prior to updating the solution vector to "last converged solution", to make sure that the SNES object is destroyed on all procs (and hence the locks cleared) before editing the solution vector, but I'm unsure if that would make a difference. Any help would be most appreciated! >>>>>>>> >>>>>>>> Thanks, >>>>>>>> David >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Stefano >>>>> >>> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Mon Dec 22 10:21:22 2025 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 22 Dec 2025 11:21:22 -0500 Subject: [petsc-users] A partition of DMPlex mesh similar to what DMDA provides? In-Reply-To: References: Message-ID: On Mon, Dec 22, 2025 at 5:46?AM liluo wrote: > Dear PETSc developers, > > > I?m using DMPlex to manage an unstructured mesh. However, in my case, the > input mesh is actually a structured tetrahedral mesh, and its geometric > domain is just a simple box. > > > Is there any PETSc functionality or recommended approach to obtain a > partition similar to what DMDA provides?i.e., a simple Cartesian block > partition?when working with such a mesh in DMPlex? > > Any guidance or best practices would be greatly appreciated. > This is trivial in 2D because triangles nicely tile the box, but in 3D tetrahedra are harder to handle.I can see three avenues: 1) Manually You can use PlexPartitioner type user, which allows you to explicitly indicate the cell numbers that go to each process. This is probably more work than you want. 2) Mesh Partitioner + Refinement You can run a partitioner on a small mesh, for which they are pretty good, and then refine that. This is mostly what I do. 3) New algorithm Amal Timalsina published a nice algorithm for converting hexes to tets, so you could create a hex mesh that is partitioned exactly as you want, and then convert it to tets, but this would mean writing new code. Why are you using tets instead of hexes for this problem? Thanks, Matt > Thank you! > > > Bests, > > Li Luo > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!a8JIUtZ9kWgwf5HLe7vrUozP6RnDa-KxLqpAyxrAnKFhl_wgCNxF1SgnsC3wHJFY61YTVZF3nYa7ruCuM9mC$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.knezevic at akselos.com Mon Dec 22 13:58:49 2025 From: david.knezevic at akselos.com (David Knezevic) Date: Mon, 22 Dec 2025 14:58:49 -0500 Subject: [petsc-users] Question regarding SNES error about locked vectors In-Reply-To: References: <855F3D06-08B9-4CD1-ABE8-3E55D4DD802E@petsc.dev> Message-ID: The print out I get from -snes_view is shown below. I wonder if the issue is related to "using user-defined postcheck step"? SNES Object: 1 MPI process type: newtonls maximum iterations=5, maximum function evaluations=10000 tolerances: relative=0., absolute=0., solution=0. total number of linear solver iterations=3 total number of function evaluations=4 norm schedule ALWAYS SNESLineSearch Object: 1 MPI process type: basic maxstep=1.000000e+08, minlambda=1.000000e-12 tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 maximum iterations=40 using user-defined postcheck step KSP Object: 1 MPI process type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: 1 MPI process type: cholesky out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: external factor fill ratio given 0., needed 0. 
Factored matrix follows: Mat Object: 1 MPI process type: mumps rows=1152, cols=1152 package used to perform factorization: mumps total: nonzeros=126936, allocated nonzeros=126936 MUMPS run parameters: Use -ksp_view ::ascii_info_detail to display information for all processes RINFOG(1) (global estimated flops for the elimination after analysis): 1.63461e+07 RINFOG(2) (global estimated flops for the assembly after factorization): 74826. RINFOG(3) (global estimated flops for the elimination after factorization): 1.63461e+07 (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): (0.,0.)*(2^0) INFOG(3) (estimated real workspace for factors on all processors after analysis): 150505 INFOG(4) (estimated integer workspace for factors on all processors after analysis): 6276 INFOG(5) (estimated maximum front size in the complete tree): 216 INFOG(6) (number of nodes in the complete tree): 24 INFOG(7) (ordering option effectively used after analysis): 2 INFOG(8) (structural symmetry in percent of the permuted matrix after analysis): 100 INFOG(9) (total real/complex workspace to store the matrix factors after factorization): 150505 INFOG(10) (total integer space store the matrix factors after factorization): 6276 INFOG(11) (order of largest frontal matrix after factorization): 216 INFOG(12) (number of off-diagonal pivots): 1044 INFOG(13) (number of delayed pivots after factorization): 0 INFOG(14) (number of memory compress after factorization): 0 INFOG(15) (number of steps of iterative refinement after solution): 0 INFOG(16) (estimated size (in MB) of all MUMPS internal data for factorization after analysis: value on the most memory consuming processor): 2 INFOG(17) (estimated size of all MUMPS internal data for factorization after analysis: sum over all processors): 2 INFOG(18) (size of all MUMPS internal data allocated during factorization: value on the most memory consuming processor): 2 INFOG(19) (size of all MUMPS internal data allocated during factorization: sum over all processors): 2 INFOG(20) (estimated number of entries in the factors): 126936 INFOG(21) (size in MB of memory effectively used during factorization - value on the most memory consuming processor): 2 INFOG(22) (size in MB of memory effectively used during factorization - sum over all processors): 2 INFOG(23) (after analysis: value of ICNTL(6) effectively used): 0 INFOG(24) (after analysis: value of ICNTL(12) effectively used): 1 INFOG(25) (after factorization: number of pivots modified by static pivoting): 0 INFOG(28) (after factorization: number of null pivots encountered): 0 INFOG(29) (after factorization: effective number of entries in the factors (sum over all processors)): 126936 INFOG(30, 31) (after solution: size in Mbytes of memory used during solution phase): 2, 2 INFOG(32) (after analysis: type of analysis done): 1 INFOG(33) (value used for ICNTL(8)): 7 INFOG(34) (exponent of the determinant if determinant is requested): 0 INFOG(35) (after factorization: number of entries taking into account BLR factor compression - sum over all processors): 126936 INFOG(36) (after analysis: estimated size of all MUMPS internal data for running BLR in-core - value on the most memory consuming processor): 0 INFOG(37) (after analysis: estimated size of all MUMPS internal data for running BLR in-core - sum over all processors): 0 INFOG(38) (after analysis: estimated size of all MUMPS internal data for running BLR out-of-core - value on the most memory consuming processor): 0 INFOG(39) (after analysis: estimated size of all MUMPS internal data 
for running BLR out-of-core - sum over all processors): 0 linear system matrix = precond matrix: Mat Object: 1 MPI process type: seqaij rows=1152, cols=1152 total: nonzeros=60480, allocated nonzeros=60480 total number of mallocs used during MatSetValues calls=0 using I-node routines: found 384 nodes, limit used is 5 On Mon, Dec 22, 2025 at 9:25?AM Barry Smith wrote: > David, > > This is due to a software glitch. SNES_DIVERGED_FUNCTION_DOMAIN was > added long after the origins of SNES and, in places, the code was never > fully updated to handle function domain problems. In particular, parts of > the line search don't handle it correctly. Can you run with -snes_view and > that will help us find the spot that needs to be updated. > > Barry > > > On Dec 21, 2025, at 5:53?PM, David Knezevic > wrote: > > Hi, actually, I have a follow up on this topic. > > I noticed that when I call SNESSetFunctionDomainError(), it exits the > solve as expected, but it leads to a converged reason > "DIVERGED_LINE_SEARCH" instead of "DIVERGED_FUNCTION_DOMAIN". If I also > set SNESSetConvergedReason(snes, SNES_DIVERGED_FUNCTION_DOMAIN) in the > callback, then I get the expected SNES_DIVERGED_FUNCTION_DOMAIN converged > reason, so that's what I'm doing now. I was surprised by this behavior, > though, since I expected that calling SNESSetFunctionDomainError woudld > lead to the DIVERGED_FUNCTION_DOMAIN converged reason, so I just wanted to > check on what could be causing this. > > FYI, I'm using PETSc 3.23.4 > > Thanks, > David > > > On Thu, Dec 18, 2025 at 8:10?AM David Knezevic > wrote: > >> Thank you very much for this guidance. I switched to use >> SNES_DIVERGED_FUNCTION_DOMAIN, and I don't get any errors now. >> >> Thanks! >> David >> >> >> On Wed, Dec 17, 2025 at 3:43?PM Barry Smith wrote: >> >>> >>> >>> On Dec 17, 2025, at 2:47?PM, David Knezevic >>> wrote: >>> >>> Stefano and Barry: Thank you, this is very helpful. >>> >>> I'll give some more info here which may help to clarify further. >>> Normally we do just get a negative "converged reason", as you described. >>> But in this specific case where I'm having issues the solve is a >>> numerically sensitive creep solve, which has exponential terms in the >>> residual and jacobian callback that can "blow up" and give NaN values. In >>> this case, the root cause is that we hit a NaN value during a callback, and >>> then we throw an exception (in libMesh C++ code) which I gather leads to >>> the SNES solve exiting with this error code. >>> >>> Is there a way to tell the SNES to terminate with a negative "converged >>> reason" because we've encountered some issue during the callback? >>> >>> >>> In your callback you should call SNESSetFunctionDomainError() and >>> make sure the function value has an infinity or NaN in it (you can call >>> VecFlag() for this purpose)). 
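
A minimal sketch of the callback pattern described just above (untested; FormFunction and the residual expression are placeholders rather than the actual libMesh code in this thread, and VecFlag() needs a recent PETSc release):

#include <petscsnes.h>

/* Hypothetical residual callback: if assembly produces an Inf/NaN, mark the
   point as outside the function domain instead of throwing an exception, so
   SNESSolve() returns cleanly with a negative converged reason. */
static PetscErrorCode FormFunction(SNES snes, Vec X, Vec F, void *ctx)
{
  const PetscScalar *x;
  PetscScalar       *f;
  PetscInt           i, n;
  PetscBool          bad = PETSC_FALSE;

  PetscFunctionBeginUser;
  PetscCall(VecGetLocalSize(F, &n));
  PetscCall(VecGetArrayRead(X, &x));   /* the solution is read-only inside callbacks */
  PetscCall(VecGetArray(F, &f));
  for (i = 0; i < n; i++) {
    f[i] = PetscExpScalar(x[i]) - 1.0; /* stand-in for a creep-type residual that can blow up */
    if (PetscIsInfOrNanScalar(f[i])) bad = PETSC_TRUE;
  }
  PetscCall(VecRestoreArray(F, &f));
  PetscCall(VecRestoreArrayRead(X, &x));
  if (bad) {
    PetscCall(SNESSetFunctionDomainError(snes)); /* flag the domain failure to SNES */
    PetscCall(VecFlag(F, PETSC_TRUE));           /* put an Inf into F so norm checks see it */
  }
  PetscFunctionReturn(PETSC_SUCCESS);
}

The driver then relies on SNESGetConvergedReason() after SNESSolve(), rather than -snes_error_if_not_converged, as discussed elsewhere in this thread.
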
>>> >>> Now SNESConvergedReason will be a completely >>> reasonable SNES_DIVERGED_FUNCTION_DOMAIN >>> >>> Barry >>> >>> If you are using an ancient version of PETSc (I hope you are using the >>> latest since that always has more bug fixes and features) that does not >>> have SNESSetFunctionDomainError then just make sure the function vector >>> result has an infinity or NaN in it and then SNESConvergedReason will be >>> SNES_DIVERGED_FNORM_NAN >>> >>> >>> >>> >>> Thanks, >>> David >>> >>> >>> On Wed, Dec 17, 2025 at 2:25?PM Barry Smith wrote: >>> >>>> >>>> >>>> On Dec 17, 2025, at 2:08?PM, David Knezevic via petsc-users < >>>> petsc-users at mcs.anl.gov> wrote: >>>> >>>> Hi, >>>> >>>> I'm using PETSc via the libMesh framework, so creating a MWE is >>>> complicated by that, unfortunately. >>>> >>>> The situation is that I am not modifying the solution vector in a >>>> callback. The SNES solve has terminated, with PetscErrorCode 82, and I then >>>> want to update the solution vector (reset it to the "previously converged >>>> value") and then try to solve again with a smaller load increment. This is >>>> a typical "auto load stepping" strategy in FE. >>>> >>>> >>>> Once a PetscError is generated you CANNOT continue the PETSc >>>> program, it is not designed to allow this and trying to continue will lead >>>> to further problems. >>>> >>>> So what you need to do is prevent PETSc from getting to the point >>>> where an actual PetscErrorCode of 82 is generated. Normally SNESSolve() >>>> returns without generating an error even if the nonlinear solver failed >>>> (for example did not converge). One then uses SNESGetConvergedReason to >>>> check if it converged or not. Normally when SNESSolve() returns, regardless >>>> of whether the converged reason is negative or positive, there will be no >>>> locked vectors and one can modify the SNES object and call SNESSolve again. >>>> >>>> So my guess is that an actual PETSc error is being generated >>>> because SNESSetErrorIfNotConverged(snes,PETSC_TRUE) is being called by >>>> either your code or libMesh or the option -snes_error_if_not_converged is >>>> being used. In your case when you wish the code to work after a >>>> non-converged SNESSolve() these options should never be set instead you >>>> should check the result of SNESGetConvergedReason() to check if SNESSolve >>>> has failed. If SNESSetErrorIfNotConverged() is never being set that may >>>> indicate you are using an old version of PETSc or have it a bug inside >>>> PETSc's SNES that does not handle errors correctly and we can help fix the >>>> problem if you can provide a full debug output version of when the error >>>> occurs. >>>> >>>> Barry >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> I think the key piece of info I'd like to know is, at what point is the >>>> solution vector "unlocked" by the SNES object? Should it be unlocked as >>>> soon as the SNES solve has terminated with PetscErrorCode 82? Since it >>>> seems to me that it hasn't been unlocked yet (maybe just on a subset of the >>>> processes). Should I manually "unlock" the solution vector by >>>> calling VecLockWriteSet? >>>> >>>> Thanks, >>>> David >>>> >>>> >>>> >>>> On Wed, Dec 17, 2025 at 2:02?PM Stefano Zampini < >>>> stefano.zampini at gmail.com> wrote: >>>> >>>>> You are not allowed to call VecGetArray on the solution vector of an >>>>> SNES object within a user callback, nor to modify its values in any other >>>>> way. 
>>>>> Put in C++ lingo, the solution vector is a "const" argument >>>>> It would be great if you could provide an MWE to help us understand >>>>> your problem >>>>> >>>>> >>>>> Il giorno mer 17 dic 2025 alle ore 20:51 David Knezevic via >>>>> petsc-users ha scritto: >>>>> >>>>>> Hi all, >>>>>> >>>>>> I have a question about this error: >>>>>> >>>>>>> Vector 'Vec_0x84000005_0' (argument #2) was locked for read-only >>>>>>> access in unknown_function() at unknown file:0 (line numbers only accurate >>>>>>> to function begin) >>>>>> >>>>>> >>>>>> I'm encountering this error in an FE solve where there is an error >>>>>> encountered during the residual/jacobian assembly, and what we normally do >>>>>> in that situation is shrink the load step and continue, starting from the >>>>>> "last converged solution". However, in this case I'm running on 32 >>>>>> processes, and 5 of the processes report the error above about a "locked >>>>>> vector". >>>>>> >>>>>> We clear the SNES object (via SNESDestroy) before we reset the >>>>>> solution to the "last converged solution", and then we make a new SNES >>>>>> object subsequently. But it seems to me that somehow the solution vector is >>>>>> still marked as "locked" on 5 of the processes when we modify the solution >>>>>> vector, which leads to the error above. >>>>>> >>>>>> I was wondering if someone could advise on what the best way to >>>>>> handle this would be? I thought one option could be to add an MPI barrier >>>>>> call prior to updating the solution vector to "last converged solution", to >>>>>> make sure that the SNES object is destroyed on all procs (and hence the >>>>>> locks cleared) before editing the solution vector, but I'm unsure if that >>>>>> would make a difference. Any help would be most appreciated! >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>> >>>>> >>>>> -- >>>>> Stefano >>>>> >>>> >>>> >>> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.knezevic at akselos.com Mon Dec 22 14:03:03 2025 From: david.knezevic at akselos.com (David Knezevic) Date: Mon, 22 Dec 2025 15:03:03 -0500 Subject: [petsc-users] Question regarding SNES error about locked vectors In-Reply-To: References: <855F3D06-08B9-4CD1-ABE8-3E55D4DD802E@petsc.dev> Message-ID: P.S. As a test I removed the "postcheck" callback, and I still get the same behavior with the DIVERGED_LINE_SEARCH converged reason, so I guess the "postcheck" is not related. On Mon, Dec 22, 2025 at 1:58?PM David Knezevic wrote: > The print out I get from -snes_view is shown below. I wonder if the issue > is related to "using user-defined postcheck step"? > > > SNES Object: 1 MPI process > type: newtonls > maximum iterations=5, maximum function evaluations=10000 > tolerances: relative=0., absolute=0., solution=0. > total number of linear solver iterations=3 > total number of function evaluations=4 > norm schedule ALWAYS > SNESLineSearch Object: 1 MPI process > type: basic > maxstep=1.000000e+08, minlambda=1.000000e-12 > tolerances: relative=1.000000e-08, absolute=1.000000e-15, > lambda=1.000000e-08 > maximum iterations=40 > using user-defined postcheck step > KSP Object: 1 MPI process > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: 1 MPI process > type: cholesky > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: external > factor fill ratio given 0., needed 0. 
> Factored matrix follows: > Mat Object: 1 MPI process > type: mumps > rows=1152, cols=1152 > package used to perform factorization: mumps > total: nonzeros=126936, allocated nonzeros=126936 > MUMPS run parameters: > Use -ksp_view ::ascii_info_detail to display information > for all processes > RINFOG(1) (global estimated flops for the elimination > after analysis): 1.63461e+07 > RINFOG(2) (global estimated flops for the assembly after > factorization): 74826. > RINFOG(3) (global estimated flops for the elimination > after factorization): 1.63461e+07 > (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): > (0.,0.)*(2^0) > INFOG(3) (estimated real workspace for factors on all > processors after analysis): 150505 > INFOG(4) (estimated integer workspace for factors on all > processors after analysis): 6276 > INFOG(5) (estimated maximum front size in the complete > tree): 216 > INFOG(6) (number of nodes in the complete tree): 24 > INFOG(7) (ordering option effectively used after > analysis): 2 > INFOG(8) (structural symmetry in percent of the permuted > matrix after analysis): 100 > INFOG(9) (total real/complex workspace to store the matrix > factors after factorization): 150505 > INFOG(10) (total integer space store the matrix factors > after factorization): 6276 > INFOG(11) (order of largest frontal matrix after > factorization): 216 > INFOG(12) (number of off-diagonal pivots): 1044 > INFOG(13) (number of delayed pivots after factorization): 0 > INFOG(14) (number of memory compress after factorization): > 0 > INFOG(15) (number of steps of iterative refinement after > solution): 0 > INFOG(16) (estimated size (in MB) of all MUMPS internal > data for factorization after analysis: value on the most memory consuming > processor): 2 > INFOG(17) (estimated size of all MUMPS internal data for > factorization after analysis: sum over all processors): 2 > INFOG(18) (size of all MUMPS internal data allocated > during factorization: value on the most memory consuming processor): 2 > INFOG(19) (size of all MUMPS internal data allocated > during factorization: sum over all processors): 2 > INFOG(20) (estimated number of entries in the factors): > 126936 > INFOG(21) (size in MB of memory effectively used during > factorization - value on the most memory consuming processor): 2 > INFOG(22) (size in MB of memory effectively used during > factorization - sum over all processors): 2 > INFOG(23) (after analysis: value of ICNTL(6) effectively > used): 0 > INFOG(24) (after analysis: value of ICNTL(12) effectively > used): 1 > INFOG(25) (after factorization: number of pivots modified > by static pivoting): 0 > INFOG(28) (after factorization: number of null pivots > encountered): 0 > INFOG(29) (after factorization: effective number of > entries in the factors (sum over all processors)): 126936 > INFOG(30, 31) (after solution: size in Mbytes of memory > used during solution phase): 2, 2 > INFOG(32) (after analysis: type of analysis done): 1 > INFOG(33) (value used for ICNTL(8)): 7 > INFOG(34) (exponent of the determinant if determinant is > requested): 0 > INFOG(35) (after factorization: number of entries taking > into account BLR factor compression - sum over all processors): 126936 > INFOG(36) (after analysis: estimated size of all MUMPS > internal data for running BLR in-core - value on the most memory consuming > processor): 0 > INFOG(37) (after analysis: estimated size of all MUMPS > internal data for running BLR in-core - sum over all processors): 0 > INFOG(38) (after analysis: estimated size of all MUMPS > 
internal data for running BLR out-of-core - value on the most memory > consuming processor): 0 > INFOG(39) (after analysis: estimated size of all MUMPS > internal data for running BLR out-of-core - sum over all processors): 0 > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqaij > rows=1152, cols=1152 > total: nonzeros=60480, allocated nonzeros=60480 > total number of mallocs used during MatSetValues calls=0 > using I-node routines: found 384 nodes, limit used is 5 > > > > On Mon, Dec 22, 2025 at 9:25?AM Barry Smith wrote: > >> David, >> >> This is due to a software glitch. SNES_DIVERGED_FUNCTION_DOMAIN was >> added long after the origins of SNES and, in places, the code was never >> fully updated to handle function domain problems. In particular, parts of >> the line search don't handle it correctly. Can you run with -snes_view and >> that will help us find the spot that needs to be updated. >> >> Barry >> >> >> On Dec 21, 2025, at 5:53?PM, David Knezevic >> wrote: >> >> Hi, actually, I have a follow up on this topic. >> >> I noticed that when I call SNESSetFunctionDomainError(), it exits the >> solve as expected, but it leads to a converged reason >> "DIVERGED_LINE_SEARCH" instead of "DIVERGED_FUNCTION_DOMAIN". If I also >> set SNESSetConvergedReason(snes, SNES_DIVERGED_FUNCTION_DOMAIN) in the >> callback, then I get the expected SNES_DIVERGED_FUNCTION_DOMAIN converged >> reason, so that's what I'm doing now. I was surprised by this behavior, >> though, since I expected that calling SNESSetFunctionDomainError woudld >> lead to the DIVERGED_FUNCTION_DOMAIN converged reason, so I just wanted to >> check on what could be causing this. >> >> FYI, I'm using PETSc 3.23.4 >> >> Thanks, >> David >> >> >> On Thu, Dec 18, 2025 at 8:10?AM David Knezevic < >> david.knezevic at akselos.com> wrote: >> >>> Thank you very much for this guidance. I switched to use >>> SNES_DIVERGED_FUNCTION_DOMAIN, and I don't get any errors now. >>> >>> Thanks! >>> David >>> >>> >>> On Wed, Dec 17, 2025 at 3:43?PM Barry Smith wrote: >>> >>>> >>>> >>>> On Dec 17, 2025, at 2:47?PM, David Knezevic >>>> wrote: >>>> >>>> Stefano and Barry: Thank you, this is very helpful. >>>> >>>> I'll give some more info here which may help to clarify further. >>>> Normally we do just get a negative "converged reason", as you described. >>>> But in this specific case where I'm having issues the solve is a >>>> numerically sensitive creep solve, which has exponential terms in the >>>> residual and jacobian callback that can "blow up" and give NaN values. In >>>> this case, the root cause is that we hit a NaN value during a callback, and >>>> then we throw an exception (in libMesh C++ code) which I gather leads to >>>> the SNES solve exiting with this error code. >>>> >>>> Is there a way to tell the SNES to terminate with a negative "converged >>>> reason" because we've encountered some issue during the callback? >>>> >>>> >>>> In your callback you should call SNESSetFunctionDomainError() and >>>> make sure the function value has an infinity or NaN in it (you can call >>>> VecFlag() for this purpose)). 
>>>> >>>> Now SNESConvergedReason will be a completely >>>> reasonable SNES_DIVERGED_FUNCTION_DOMAIN >>>> >>>> Barry >>>> >>>> If you are using an ancient version of PETSc (I hope you are using the >>>> latest since that always has more bug fixes and features) that does not >>>> have SNESSetFunctionDomainError then just make sure the function vector >>>> result has an infinity or NaN in it and then SNESConvergedReason will be >>>> SNES_DIVERGED_FNORM_NAN >>>> >>>> >>>> >>>> >>>> Thanks, >>>> David >>>> >>>> >>>> On Wed, Dec 17, 2025 at 2:25?PM Barry Smith wrote: >>>> >>>>> >>>>> >>>>> On Dec 17, 2025, at 2:08?PM, David Knezevic via petsc-users < >>>>> petsc-users at mcs.anl.gov> wrote: >>>>> >>>>> Hi, >>>>> >>>>> I'm using PETSc via the libMesh framework, so creating a MWE is >>>>> complicated by that, unfortunately. >>>>> >>>>> The situation is that I am not modifying the solution vector in a >>>>> callback. The SNES solve has terminated, with PetscErrorCode 82, and I then >>>>> want to update the solution vector (reset it to the "previously converged >>>>> value") and then try to solve again with a smaller load increment. This is >>>>> a typical "auto load stepping" strategy in FE. >>>>> >>>>> >>>>> Once a PetscError is generated you CANNOT continue the PETSc >>>>> program, it is not designed to allow this and trying to continue will lead >>>>> to further problems. >>>>> >>>>> So what you need to do is prevent PETSc from getting to the point >>>>> where an actual PetscErrorCode of 82 is generated. Normally SNESSolve() >>>>> returns without generating an error even if the nonlinear solver failed >>>>> (for example did not converge). One then uses SNESGetConvergedReason to >>>>> check if it converged or not. Normally when SNESSolve() returns, regardless >>>>> of whether the converged reason is negative or positive, there will be no >>>>> locked vectors and one can modify the SNES object and call SNESSolve again. >>>>> >>>>> So my guess is that an actual PETSc error is being generated >>>>> because SNESSetErrorIfNotConverged(snes,PETSC_TRUE) is being called by >>>>> either your code or libMesh or the option -snes_error_if_not_converged is >>>>> being used. In your case when you wish the code to work after a >>>>> non-converged SNESSolve() these options should never be set instead you >>>>> should check the result of SNESGetConvergedReason() to check if SNESSolve >>>>> has failed. If SNESSetErrorIfNotConverged() is never being set that may >>>>> indicate you are using an old version of PETSc or have it a bug inside >>>>> PETSc's SNES that does not handle errors correctly and we can help fix the >>>>> problem if you can provide a full debug output version of when the error >>>>> occurs. >>>>> >>>>> Barry >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> I think the key piece of info I'd like to know is, at what point is >>>>> the solution vector "unlocked" by the SNES object? Should it be unlocked as >>>>> soon as the SNES solve has terminated with PetscErrorCode 82? Since it >>>>> seems to me that it hasn't been unlocked yet (maybe just on a subset of the >>>>> processes). Should I manually "unlock" the solution vector by >>>>> calling VecLockWriteSet? >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>> >>>>> >>>>> On Wed, Dec 17, 2025 at 2:02?PM Stefano Zampini < >>>>> stefano.zampini at gmail.com> wrote: >>>>> >>>>>> You are not allowed to call VecGetArray on the solution vector of an >>>>>> SNES object within a user callback, nor to modify its values in any other >>>>>> way. 
>>>>>> Put in C++ lingo, the solution vector is a "const" argument >>>>>> It would be great if you could provide an MWE to help us understand >>>>>> your problem >>>>>> >>>>>> >>>>>> Il giorno mer 17 dic 2025 alle ore 20:51 David Knezevic via >>>>>> petsc-users ha scritto: >>>>>> >>>>>>> Hi all, >>>>>>> >>>>>>> I have a question about this error: >>>>>>> >>>>>>>> Vector 'Vec_0x84000005_0' (argument #2) was locked for read-only >>>>>>>> access in unknown_function() at unknown file:0 (line numbers only accurate >>>>>>>> to function begin) >>>>>>> >>>>>>> >>>>>>> I'm encountering this error in an FE solve where there is an error >>>>>>> encountered during the residual/jacobian assembly, and what we normally do >>>>>>> in that situation is shrink the load step and continue, starting from the >>>>>>> "last converged solution". However, in this case I'm running on 32 >>>>>>> processes, and 5 of the processes report the error above about a "locked >>>>>>> vector". >>>>>>> >>>>>>> We clear the SNES object (via SNESDestroy) before we reset the >>>>>>> solution to the "last converged solution", and then we make a new SNES >>>>>>> object subsequently. But it seems to me that somehow the solution vector is >>>>>>> still marked as "locked" on 5 of the processes when we modify the solution >>>>>>> vector, which leads to the error above. >>>>>>> >>>>>>> I was wondering if someone could advise on what the best way to >>>>>>> handle this would be? I thought one option could be to add an MPI barrier >>>>>>> call prior to updating the solution vector to "last converged solution", to >>>>>>> make sure that the SNES object is destroyed on all procs (and hence the >>>>>>> locks cleared) before editing the solution vector, but I'm unsure if that >>>>>>> would make a difference. Any help would be most appreciated! >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> Stefano >>>>>> >>>>> >>>>> >>>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From liluo at um.edu.mo Mon Dec 22 21:41:42 2025 From: liluo at um.edu.mo (liluo) Date: Tue, 23 Dec 2025 03:41:42 +0000 Subject: [petsc-users] A partition of DMPlex mesh similar to what DMDA provides? In-Reply-To: References: , Message-ID: <5e6806fe357b417e85969c8a0ae62418@um.edu.mo> Thanks for your suggestions! Since the code is already using tets for finite element discretization, I don't want to change it, but want a classical DMDA type partition. Bests, Li ________________________________ From: Matthew Knepley Sent: Tuesday, 23 December, 2025 00:21:22 To: liluo Cc: petsc-users at mcs.anl.gov; Zhang Pai Subject: Re: [petsc-users] A partition of DMPlex mesh similar to what DMDA provides? On Mon, Dec 22, 2025 at 5:?46 AM liluo wrote: Dear PETSc developers, I?m using DMPlex to manage an unstructured mesh. However, in my case, the input mesh is actually a structured tetrahedral mesh, and its geometric domain On Mon, Dec 22, 2025 at 5:46?AM liluo > wrote: Dear PETSc developers, I?m using DMPlex to manage an unstructured mesh. However, in my case, the input mesh is actually a structured tetrahedral mesh, and its geometric domain is just a simple box. Is there any PETSc functionality or recommended approach to obtain a partition similar to what DMDA provides?i.e., a simple Cartesian block partition?when working with such a mesh in DMPlex? Any guidance or best practices would be greatly appreciated. 
This is trivial in 2D because triangles nicely tile the box, but in 3D tetrahedra are harder to handle.I can see three avenues: 1) Manually You can use PlexPartitioner type user, which allows you to explicitly indicate the cell numbers that go to each process. This is probably more work than you want. 2) Mesh Partitioner + Refinement You can run a partitioner on a small mesh, for which they are pretty good, and then refine that. This is mostly what I do. 3) New algorithm Amal Timalsina published a nice algorithm for converting hexes to tets, so you could create a hex mesh that is partitioned exactly as you want, and then convert it to tets, but this would mean writing new code. Why are you using tets instead of hexes for this problem? Thanks, Matt Thank you! Bests, Li Luo -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!f78XCziNeTCVb1AnVMYJVAf2Ped-KlffK4RRVwpO9gKfJ013n07va2DBl_SzI6is-rHSJwgZ3R8slp9jZfmZKg$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Dec 24 22:02:17 2025 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 24 Dec 2025 23:02:17 -0500 Subject: [petsc-users] Question regarding SNES error about locked vectors In-Reply-To: References: <855F3D06-08B9-4CD1-ABE8-3E55D4DD802E@petsc.dev> Message-ID: I have started a merge request to properly propagate failure reasons up from the line search to the SNESSolve in https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/8914__;!!G_uCfscf7eWS!b2nWDVWqfyc96V63w_2sLd0siVZ769Ztwal8rZgfCzJ3q3V3ALVEMdGDLu6IvbSPmudCO08cQL4r0J54oVEz12k$ Could you give it a try when you get the chance? > On Dec 22, 2025, at 3:03?PM, David Knezevic wrote: > > P.S. As a test I removed the "postcheck" callback, and I still get the same behavior with the DIVERGED_LINE_SEARCH converged reason, so I guess the "postcheck" is not related. > > > On Mon, Dec 22, 2025 at 1:58?PM David Knezevic > wrote: >> The print out I get from -snes_view is shown below. I wonder if the issue is related to "using user-defined postcheck step"? >> >> >> SNES Object: 1 MPI process >> type: newtonls >> maximum iterations=5, maximum function evaluations=10000 >> tolerances: relative=0., absolute=0., solution=0. >> total number of linear solver iterations=3 >> total number of function evaluations=4 >> norm schedule ALWAYS >> SNESLineSearch Object: 1 MPI process >> type: basic >> maxstep=1.000000e+08, minlambda=1.000000e-12 >> tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 >> maximum iterations=40 >> using user-defined postcheck step >> KSP Object: 1 MPI process >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: 1 MPI process >> type: cholesky >> out-of-place factorization >> tolerance for zero pivot 2.22045e-14 >> matrix ordering: external >> factor fill ratio given 0., needed 0. 
>> Factored matrix follows: >> Mat Object: 1 MPI process >> type: mumps >> rows=1152, cols=1152 >> package used to perform factorization: mumps >> total: nonzeros=126936, allocated nonzeros=126936 >> MUMPS run parameters: >> Use -ksp_view ::ascii_info_detail to display information for all processes >> RINFOG(1) (global estimated flops for the elimination after analysis): 1.63461e+07 >> RINFOG(2) (global estimated flops for the assembly after factorization): 74826. >> RINFOG(3) (global estimated flops for the elimination after factorization): 1.63461e+07 >> (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): (0.,0.)*(2^0) >> INFOG(3) (estimated real workspace for factors on all processors after analysis): 150505 >> INFOG(4) (estimated integer workspace for factors on all processors after analysis): 6276 >> INFOG(5) (estimated maximum front size in the complete tree): 216 >> INFOG(6) (number of nodes in the complete tree): 24 >> INFOG(7) (ordering option effectively used after analysis): 2 >> INFOG(8) (structural symmetry in percent of the permuted matrix after analysis): 100 >> INFOG(9) (total real/complex workspace to store the matrix factors after factorization): 150505 >> INFOG(10) (total integer space store the matrix factors after factorization): 6276 >> INFOG(11) (order of largest frontal matrix after factorization): 216 >> INFOG(12) (number of off-diagonal pivots): 1044 >> INFOG(13) (number of delayed pivots after factorization): 0 >> INFOG(14) (number of memory compress after factorization): 0 >> INFOG(15) (number of steps of iterative refinement after solution): 0 >> INFOG(16) (estimated size (in MB) of all MUMPS internal data for factorization after analysis: value on the most memory consuming processor): 2 >> INFOG(17) (estimated size of all MUMPS internal data for factorization after analysis: sum over all processors): 2 >> INFOG(18) (size of all MUMPS internal data allocated during factorization: value on the most memory consuming processor): 2 >> INFOG(19) (size of all MUMPS internal data allocated during factorization: sum over all processors): 2 >> INFOG(20) (estimated number of entries in the factors): 126936 >> INFOG(21) (size in MB of memory effectively used during factorization - value on the most memory consuming processor): 2 >> INFOG(22) (size in MB of memory effectively used during factorization - sum over all processors): 2 >> INFOG(23) (after analysis: value of ICNTL(6) effectively used): 0 >> INFOG(24) (after analysis: value of ICNTL(12) effectively used): 1 >> INFOG(25) (after factorization: number of pivots modified by static pivoting): 0 >> INFOG(28) (after factorization: number of null pivots encountered): 0 >> INFOG(29) (after factorization: effective number of entries in the factors (sum over all processors)): 126936 >> INFOG(30, 31) (after solution: size in Mbytes of memory used during solution phase): 2, 2 >> INFOG(32) (after analysis: type of analysis done): 1 >> INFOG(33) (value used for ICNTL(8)): 7 >> INFOG(34) (exponent of the determinant if determinant is requested): 0 >> INFOG(35) (after factorization: number of entries taking into account BLR factor compression - sum over all processors): 126936 >> INFOG(36) (after analysis: estimated size of all MUMPS internal data for running BLR in-core - value on the most memory consuming processor): 0 >> INFOG(37) (after analysis: estimated size of all MUMPS internal data for running BLR in-core - sum over all processors): 0 >> INFOG(38) (after analysis: estimated size of all MUMPS internal data for running BLR 
out-of-core - value on the most memory consuming processor): 0 >> INFOG(39) (after analysis: estimated size of all MUMPS internal data for running BLR out-of-core - sum over all processors): 0 >> linear system matrix = precond matrix: >> Mat Object: 1 MPI process >> type: seqaij >> rows=1152, cols=1152 >> total: nonzeros=60480, allocated nonzeros=60480 >> total number of mallocs used during MatSetValues calls=0 >> using I-node routines: found 384 nodes, limit used is 5 >> >> >> >> On Mon, Dec 22, 2025 at 9:25?AM Barry Smith > wrote: >>> David, >>> >>> This is due to a software glitch. SNES_DIVERGED_FUNCTION_DOMAIN was added long after the origins of SNES and, in places, the code was never fully updated to handle function domain problems. In particular, parts of the line search don't handle it correctly. Can you run with -snes_view and that will help us find the spot that needs to be updated. >>> >>> Barry >>> >>> >>>> On Dec 21, 2025, at 5:53?PM, David Knezevic > wrote: >>>> >>>> Hi, actually, I have a follow up on this topic. >>>> >>>> I noticed that when I call SNESSetFunctionDomainError(), it exits the solve as expected, but it leads to a converged reason "DIVERGED_LINE_SEARCH" instead of "DIVERGED_FUNCTION_DOMAIN". If I also set SNESSetConvergedReason(snes, SNES_DIVERGED_FUNCTION_DOMAIN) in the callback, then I get the expected SNES_DIVERGED_FUNCTION_DOMAIN converged reason, so that's what I'm doing now. I was surprised by this behavior, though, since I expected that calling SNESSetFunctionDomainError woudld lead to the DIVERGED_FUNCTION_DOMAIN converged reason, so I just wanted to check on what could be causing this. >>>> >>>> FYI, I'm using PETSc 3.23.4 >>>> >>>> Thanks, >>>> David >>>> >>>> >>>> On Thu, Dec 18, 2025 at 8:10?AM David Knezevic > wrote: >>>>> Thank you very much for this guidance. I switched to use SNES_DIVERGED_FUNCTION_DOMAIN, and I don't get any errors now. >>>>> >>>>> Thanks! >>>>> David >>>>> >>>>> >>>>> On Wed, Dec 17, 2025 at 3:43?PM Barry Smith > wrote: >>>>>> >>>>>> >>>>>>> On Dec 17, 2025, at 2:47?PM, David Knezevic > wrote: >>>>>>> >>>>>>> Stefano and Barry: Thank you, this is very helpful. >>>>>>> >>>>>>> I'll give some more info here which may help to clarify further. Normally we do just get a negative "converged reason", as you described. But in this specific case where I'm having issues the solve is a numerically sensitive creep solve, which has exponential terms in the residual and jacobian callback that can "blow up" and give NaN values. In this case, the root cause is that we hit a NaN value during a callback, and then we throw an exception (in libMesh C++ code) which I gather leads to the SNES solve exiting with this error code. >>>>>>> >>>>>>> Is there a way to tell the SNES to terminate with a negative "converged reason" because we've encountered some issue during the callback? >>>>>> >>>>>> In your callback you should call SNESSetFunctionDomainError() and make sure the function value has an infinity or NaN in it (you can call VecFlag() for this purpose)). 
>>>>>> >>>>>> Now SNESConvergedReason will be a completely reasonable SNES_DIVERGED_FUNCTION_DOMAIN >>>>>> >>>>>> Barry >>>>>> >>>>>> If you are using an ancient version of PETSc (I hope you are using the latest since that always has more bug fixes and features) that does not have SNESSetFunctionDomainError then just make sure the function vector result has an infinity or NaN in it and then SNESConvergedReason will be SNES_DIVERGED_FNORM_NAN >>>>>> >>>>>> >>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> >>>>>>> >>>>>>> On Wed, Dec 17, 2025 at 2:25?PM Barry Smith > wrote: >>>>>>>> >>>>>>>> >>>>>>>>> On Dec 17, 2025, at 2:08?PM, David Knezevic via petsc-users > wrote: >>>>>>>>> >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> I'm using PETSc via the libMesh framework, so creating a MWE is complicated by that, unfortunately. >>>>>>>>> >>>>>>>>> The situation is that I am not modifying the solution vector in a callback. The SNES solve has terminated, with PetscErrorCode 82, and I then want to update the solution vector (reset it to the "previously converged value") and then try to solve again with a smaller load increment. This is a typical "auto load stepping" strategy in FE. >>>>>>>> >>>>>>>> Once a PetscError is generated you CANNOT continue the PETSc program, it is not designed to allow this and trying to continue will lead to further problems. >>>>>>>> >>>>>>>> So what you need to do is prevent PETSc from getting to the point where an actual PetscErrorCode of 82 is generated. Normally SNESSolve() returns without generating an error even if the nonlinear solver failed (for example did not converge). One then uses SNESGetConvergedReason to check if it converged or not. Normally when SNESSolve() returns, regardless of whether the converged reason is negative or positive, there will be no locked vectors and one can modify the SNES object and call SNESSolve again. >>>>>>>> >>>>>>>> So my guess is that an actual PETSc error is being generated because SNESSetErrorIfNotConverged(snes,PETSC_TRUE) is being called by either your code or libMesh or the option -snes_error_if_not_converged is being used. In your case when you wish the code to work after a non-converged SNESSolve() these options should never be set instead you should check the result of SNESGetConvergedReason() to check if SNESSolve has failed. If SNESSetErrorIfNotConverged() is never being set that may indicate you are using an old version of PETSc or have it a bug inside PETSc's SNES that does not handle errors correctly and we can help fix the problem if you can provide a full debug output version of when the error occurs. >>>>>>>> >>>>>>>> Barry >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> >>>>>>>>> I think the key piece of info I'd like to know is, at what point is the solution vector "unlocked" by the SNES object? Should it be unlocked as soon as the SNES solve has terminated with PetscErrorCode 82? Since it seems to me that it hasn't been unlocked yet (maybe just on a subset of the processes). Should I manually "unlock" the solution vector by calling VecLockWriteSet? >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> David >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Wed, Dec 17, 2025 at 2:02?PM Stefano Zampini > wrote: >>>>>>>>>> You are not allowed to call VecGetArray on the solution vector of an SNES object within a user callback, nor to modify its values in any other way. 
>>>>>>>>>> Put in C++ lingo, the solution vector is a "const" argument >>>>>>>>>> It would be great if you could provide an MWE to help us understand your problem >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Il giorno mer 17 dic 2025 alle ore 20:51 David Knezevic via petsc-users > ha scritto: >>>>>>>>>>> Hi all, >>>>>>>>>>> >>>>>>>>>>> I have a question about this error: >>>>>>>>>>>> Vector 'Vec_0x84000005_0' (argument #2) was locked for read-only access in unknown_function() at unknown file:0 (line numbers only accurate to function begin) >>>>>>>>>>> >>>>>>>>>>> I'm encountering this error in an FE solve where there is an error encountered during the residual/jacobian assembly, and what we normally do in that situation is shrink the load step and continue, starting from the "last converged solution". However, in this case I'm running on 32 processes, and 5 of the processes report the error above about a "locked vector". >>>>>>>>>>> >>>>>>>>>>> We clear the SNES object (via SNESDestroy) before we reset the solution to the "last converged solution", and then we make a new SNES object subsequently. But it seems to me that somehow the solution vector is still marked as "locked" on 5 of the processes when we modify the solution vector, which leads to the error above. >>>>>>>>>>> >>>>>>>>>>> I was wondering if someone could advise on what the best way to handle this would be? I thought one option could be to add an MPI barrier call prior to updating the solution vector to "last converged solution", to make sure that the SNES object is destroyed on all procs (and hence the locks cleared) before editing the solution vector, but I'm unsure if that would make a difference. Any help would be most appreciated! >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> David >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> Stefano >>>>>>>> >>>>>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.knezevic at akselos.com Thu Dec 25 15:00:26 2025 From: david.knezevic at akselos.com (David Knezevic) Date: Thu, 25 Dec 2025 15:00:26 -0600 Subject: [petsc-users] Question regarding SNES error about locked vectors In-Reply-To: References: <855F3D06-08B9-4CD1-ABE8-3E55D4DD802E@petsc.dev> Message-ID: OK, thanks! I'll let you know once I get a chance to try it out. On Wed, Dec 24, 2025 at 10:02?PM Barry Smith wrote: > I have started a merge request to properly propagate failure reasons up > from the line search to the SNESSolve in > https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/8914__;!!G_uCfscf7eWS!c1k47peCRJtiG7O9EYxpFZUWSVyAnoq-6zoYdEPVFi0-gbNBHUxwlalV7EwvUqCe4iRdsX2nR2S2lzW1Ww7O2LY0rRmDhMQ$ Could you give it a > try when you get the chance? > > > On Dec 22, 2025, at 3:03?PM, David Knezevic > wrote: > > P.S. As a test I removed the "postcheck" callback, and I still get > the same behavior with the DIVERGED_LINE_SEARCH converged reason, so I > guess the "postcheck" is not related. > > > On Mon, Dec 22, 2025 at 1:58?PM David Knezevic > wrote: > >> The print out I get from -snes_view is shown below. I wonder if the issue >> is related to "using user-defined postcheck step"? >> >> >> SNES Object: 1 MPI process >> type: newtonls >> maximum iterations=5, maximum function evaluations=10000 >> tolerances: relative=0., absolute=0., solution=0. 
>> total number of linear solver iterations=3 >> total number of function evaluations=4 >> norm schedule ALWAYS >> SNESLineSearch Object: 1 MPI process >> type: basic >> maxstep=1.000000e+08, minlambda=1.000000e-12 >> tolerances: relative=1.000000e-08, absolute=1.000000e-15, >> lambda=1.000000e-08 >> maximum iterations=40 >> using user-defined postcheck step >> KSP Object: 1 MPI process >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: 1 MPI process >> type: cholesky >> out-of-place factorization >> tolerance for zero pivot 2.22045e-14 >> matrix ordering: external >> factor fill ratio given 0., needed 0. >> Factored matrix follows: >> Mat Object: 1 MPI process >> type: mumps >> rows=1152, cols=1152 >> package used to perform factorization: mumps >> total: nonzeros=126936, allocated nonzeros=126936 >> MUMPS run parameters: >> Use -ksp_view ::ascii_info_detail to display information >> for all processes >> RINFOG(1) (global estimated flops for the elimination >> after analysis): 1.63461e+07 >> RINFOG(2) (global estimated flops for the assembly after >> factorization): 74826. >> RINFOG(3) (global estimated flops for the elimination >> after factorization): 1.63461e+07 >> (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): >> (0.,0.)*(2^0) >> INFOG(3) (estimated real workspace for factors on all >> processors after analysis): 150505 >> INFOG(4) (estimated integer workspace for factors on all >> processors after analysis): 6276 >> INFOG(5) (estimated maximum front size in the complete >> tree): 216 >> INFOG(6) (number of nodes in the complete tree): 24 >> INFOG(7) (ordering option effectively used after >> analysis): 2 >> INFOG(8) (structural symmetry in percent of the permuted >> matrix after analysis): 100 >> INFOG(9) (total real/complex workspace to store the >> matrix factors after factorization): 150505 >> INFOG(10) (total integer space store the matrix factors >> after factorization): 6276 >> INFOG(11) (order of largest frontal matrix after >> factorization): 216 >> INFOG(12) (number of off-diagonal pivots): 1044 >> INFOG(13) (number of delayed pivots after factorization): >> 0 >> INFOG(14) (number of memory compress after >> factorization): 0 >> INFOG(15) (number of steps of iterative refinement after >> solution): 0 >> INFOG(16) (estimated size (in MB) of all MUMPS internal >> data for factorization after analysis: value on the most memory consuming >> processor): 2 >> INFOG(17) (estimated size of all MUMPS internal data for >> factorization after analysis: sum over all processors): 2 >> INFOG(18) (size of all MUMPS internal data allocated >> during factorization: value on the most memory consuming processor): 2 >> INFOG(19) (size of all MUMPS internal data allocated >> during factorization: sum over all processors): 2 >> INFOG(20) (estimated number of entries in the factors): >> 126936 >> INFOG(21) (size in MB of memory effectively used during >> factorization - value on the most memory consuming processor): 2 >> INFOG(22) (size in MB of memory effectively used during >> factorization - sum over all processors): 2 >> INFOG(23) (after analysis: value of ICNTL(6) effectively >> used): 0 >> INFOG(24) (after analysis: value of ICNTL(12) effectively >> used): 1 >> INFOG(25) (after factorization: number of pivots modified >> by static pivoting): 0 >> INFOG(28) (after factorization: number of null pivots >> encountered): 0 >> 
INFOG(29) (after factorization: effective number of >> entries in the factors (sum over all processors)): 126936 >> INFOG(30, 31) (after solution: size in Mbytes of memory >> used during solution phase): 2, 2 >> INFOG(32) (after analysis: type of analysis done): 1 >> INFOG(33) (value used for ICNTL(8)): 7 >> INFOG(34) (exponent of the determinant if determinant is >> requested): 0 >> INFOG(35) (after factorization: number of entries taking >> into account BLR factor compression - sum over all processors): 126936 >> INFOG(36) (after analysis: estimated size of all MUMPS >> internal data for running BLR in-core - value on the most memory consuming >> processor): 0 >> INFOG(37) (after analysis: estimated size of all MUMPS >> internal data for running BLR in-core - sum over all processors): 0 >> INFOG(38) (after analysis: estimated size of all MUMPS >> internal data for running BLR out-of-core - value on the most memory >> consuming processor): 0 >> INFOG(39) (after analysis: estimated size of all MUMPS >> internal data for running BLR out-of-core - sum over all processors): 0 >> linear system matrix = precond matrix: >> Mat Object: 1 MPI process >> type: seqaij >> rows=1152, cols=1152 >> total: nonzeros=60480, allocated nonzeros=60480 >> total number of mallocs used during MatSetValues calls=0 >> using I-node routines: found 384 nodes, limit used is 5 >> >> >> >> On Mon, Dec 22, 2025 at 9:25?AM Barry Smith wrote: >> >>> David, >>> >>> This is due to a software glitch. SNES_DIVERGED_FUNCTION_DOMAIN was >>> added long after the origins of SNES and, in places, the code was never >>> fully updated to handle function domain problems. In particular, parts of >>> the line search don't handle it correctly. Can you run with -snes_view and >>> that will help us find the spot that needs to be updated. >>> >>> Barry >>> >>> >>> On Dec 21, 2025, at 5:53?PM, David Knezevic >>> wrote: >>> >>> Hi, actually, I have a follow up on this topic. >>> >>> I noticed that when I call SNESSetFunctionDomainError(), it exits the >>> solve as expected, but it leads to a converged reason >>> "DIVERGED_LINE_SEARCH" instead of "DIVERGED_FUNCTION_DOMAIN". If I also >>> set SNESSetConvergedReason(snes, SNES_DIVERGED_FUNCTION_DOMAIN) in the >>> callback, then I get the expected SNES_DIVERGED_FUNCTION_DOMAIN converged >>> reason, so that's what I'm doing now. I was surprised by this behavior, >>> though, since I expected that calling SNESSetFunctionDomainError woudld >>> lead to the DIVERGED_FUNCTION_DOMAIN converged reason, so I just wanted to >>> check on what could be causing this. >>> >>> FYI, I'm using PETSc 3.23.4 >>> >>> Thanks, >>> David >>> >>> >>> On Thu, Dec 18, 2025 at 8:10?AM David Knezevic < >>> david.knezevic at akselos.com> wrote: >>> >>>> Thank you very much for this guidance. I switched to use >>>> SNES_DIVERGED_FUNCTION_DOMAIN, and I don't get any errors now. >>>> >>>> Thanks! >>>> David >>>> >>>> >>>> On Wed, Dec 17, 2025 at 3:43?PM Barry Smith wrote: >>>> >>>>> >>>>> >>>>> On Dec 17, 2025, at 2:47?PM, David Knezevic < >>>>> david.knezevic at akselos.com> wrote: >>>>> >>>>> Stefano and Barry: Thank you, this is very helpful. >>>>> >>>>> I'll give some more info here which may help to clarify further. >>>>> Normally we do just get a negative "converged reason", as you described. >>>>> But in this specific case where I'm having issues the solve is a >>>>> numerically sensitive creep solve, which has exponential terms in the >>>>> residual and jacobian callback that can "blow up" and give NaN values. 
In >>>>> this case, the root cause is that we hit a NaN value during a callback, and >>>>> then we throw an exception (in libMesh C++ code) which I gather leads to >>>>> the SNES solve exiting with this error code. >>>>> >>>>> Is there a way to tell the SNES to terminate with a negative >>>>> "converged reason" because we've encountered some issue during the callback? >>>>> >>>>> >>>>> In your callback you should call SNESSetFunctionDomainError() and >>>>> make sure the function value has an infinity or NaN in it (you can call >>>>> VecFlag() for this purpose)). >>>>> >>>>> Now SNESConvergedReason will be a completely >>>>> reasonable SNES_DIVERGED_FUNCTION_DOMAIN >>>>> >>>>> Barry >>>>> >>>>> If you are using an ancient version of PETSc (I hope you are using the >>>>> latest since that always has more bug fixes and features) that does not >>>>> have SNESSetFunctionDomainError then just make sure the function vector >>>>> result has an infinity or NaN in it and then SNESConvergedReason will be >>>>> SNES_DIVERGED_FNORM_NAN >>>>> >>>>> >>>>> >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>> >>>>> On Wed, Dec 17, 2025 at 2:25?PM Barry Smith wrote: >>>>> >>>>>> >>>>>> >>>>>> On Dec 17, 2025, at 2:08?PM, David Knezevic via petsc-users < >>>>>> petsc-users at mcs.anl.gov> wrote: >>>>>> >>>>>> Hi, >>>>>> >>>>>> I'm using PETSc via the libMesh framework, so creating a MWE is >>>>>> complicated by that, unfortunately. >>>>>> >>>>>> The situation is that I am not modifying the solution vector in a >>>>>> callback. The SNES solve has terminated, with PetscErrorCode 82, and I then >>>>>> want to update the solution vector (reset it to the "previously converged >>>>>> value") and then try to solve again with a smaller load increment. This is >>>>>> a typical "auto load stepping" strategy in FE. >>>>>> >>>>>> >>>>>> Once a PetscError is generated you CANNOT continue the PETSc >>>>>> program, it is not designed to allow this and trying to continue will lead >>>>>> to further problems. >>>>>> >>>>>> So what you need to do is prevent PETSc from getting to the point >>>>>> where an actual PetscErrorCode of 82 is generated. Normally SNESSolve() >>>>>> returns without generating an error even if the nonlinear solver failed >>>>>> (for example did not converge). One then uses SNESGetConvergedReason to >>>>>> check if it converged or not. Normally when SNESSolve() returns, regardless >>>>>> of whether the converged reason is negative or positive, there will be no >>>>>> locked vectors and one can modify the SNES object and call SNESSolve again. >>>>>> >>>>>> So my guess is that an actual PETSc error is being generated >>>>>> because SNESSetErrorIfNotConverged(snes,PETSC_TRUE) is being called by >>>>>> either your code or libMesh or the option -snes_error_if_not_converged is >>>>>> being used. In your case when you wish the code to work after a >>>>>> non-converged SNESSolve() these options should never be set instead you >>>>>> should check the result of SNESGetConvergedReason() to check if SNESSolve >>>>>> has failed. If SNESSetErrorIfNotConverged() is never being set that may >>>>>> indicate you are using an old version of PETSc or have it a bug inside >>>>>> PETSc's SNES that does not handle errors correctly and we can help fix the >>>>>> problem if you can provide a full debug output version of when the error >>>>>> occurs. 
>>>>>> >>>>>> Barry >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> I think the key piece of info I'd like to know is, at what point is >>>>>> the solution vector "unlocked" by the SNES object? Should it be unlocked as >>>>>> soon as the SNES solve has terminated with PetscErrorCode 82? Since it >>>>>> seems to me that it hasn't been unlocked yet (maybe just on a subset of the >>>>>> processes). Should I manually "unlock" the solution vector by >>>>>> calling VecLockWriteSet? >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>> >>>>>> >>>>>> On Wed, Dec 17, 2025 at 2:02?PM Stefano Zampini < >>>>>> stefano.zampini at gmail.com> wrote: >>>>>> >>>>>>> You are not allowed to call VecGetArray on the solution vector of an >>>>>>> SNES object within a user callback, nor to modify its values in any other >>>>>>> way. >>>>>>> Put in C++ lingo, the solution vector is a "const" argument >>>>>>> It would be great if you could provide an MWE to help us understand >>>>>>> your problem >>>>>>> >>>>>>> >>>>>>> Il giorno mer 17 dic 2025 alle ore 20:51 David Knezevic via >>>>>>> petsc-users ha scritto: >>>>>>> >>>>>>>> Hi all, >>>>>>>> >>>>>>>> I have a question about this error: >>>>>>>> >>>>>>>>> Vector 'Vec_0x84000005_0' (argument #2) was locked for read-only >>>>>>>>> access in unknown_function() at unknown file:0 (line numbers only accurate >>>>>>>>> to function begin) >>>>>>>> >>>>>>>> >>>>>>>> I'm encountering this error in an FE solve where there is an error >>>>>>>> encountered during the residual/jacobian assembly, and what we normally do >>>>>>>> in that situation is shrink the load step and continue, starting from the >>>>>>>> "last converged solution". However, in this case I'm running on 32 >>>>>>>> processes, and 5 of the processes report the error above about a "locked >>>>>>>> vector". >>>>>>>> >>>>>>>> We clear the SNES object (via SNESDestroy) before we reset the >>>>>>>> solution to the "last converged solution", and then we make a new SNES >>>>>>>> object subsequently. But it seems to me that somehow the solution vector is >>>>>>>> still marked as "locked" on 5 of the processes when we modify the solution >>>>>>>> vector, which leads to the error above. >>>>>>>> >>>>>>>> I was wondering if someone could advise on what the best way to >>>>>>>> handle this would be? I thought one option could be to add an MPI barrier >>>>>>>> call prior to updating the solution vector to "last converged solution", to >>>>>>>> make sure that the SNES object is destroyed on all procs (and hence the >>>>>>>> locks cleared) before editing the solution vector, but I'm unsure if that >>>>>>>> would make a difference. Any help would be most appreciated! >>>>>>>> >>>>>>>> Thanks, >>>>>>>> David >>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Stefano >>>>>>> >>>>>> >>>>>> >>>>> >>> > -------------- next part -------------- An HTML attachment was scrubbed... URL: