From tisaac at anl.gov Mon Jun 2 10:47:05 2025
From: tisaac at anl.gov (Isaac, Toby)
Date: Mon, 2 Jun 2025 15:47:05 +0000
Subject: [petsc-users] Request for comment: proposed 2026 meeting dates
Message-ID:

Hello PETSc users,

We had a great time at the PETSc Annual Meeting this year in Buffalo, NY, and we are already looking forward to next year. Our friends at the Firedrake Project have proposed holding a joint PETSc/Firedrake meeting next year in the UK near London, and have suggested dates in the week of **June 1-5, 2026**.

It would be great to find dates that work for as many of the people in our community who are interested in attending as possible. If there are conflicting events that we should be aware of, please let us know, either by responding to this email or in a thread we have made for this purpose in our Discord: https://urldefense.us/v3/__https://discord.com/channels/1119324534303109172/1379121958318374965__;!!G_uCfscf7eWS!e-pwbZysvB5ie_fmaCpEVemEfV7iRKPWWURwEw6pi1IdyXbXmqlEnTmtxa9KGlcghXtomY86y45Roa6WFkdMkw$ .

Cheers,
Toby

From bramkamp at nsc.liu.se Wed Jun 4 08:35:37 2025
From: bramkamp at nsc.liu.se (Frank Bramkamp)
Date: Wed, 4 Jun 2025 15:35:37 +0200
Subject: [petsc-users] BLOCK INVERSE FOR ILU
Message-ID: <70CFDF74-8B47-419F-8CE3-0C5322BB2316@nsc.liu.se>

Dear PETSc team,

I have a general question regarding the implementation of the inverse for the ILU factorization.

I am looking, e.g., at the implementation for block size 6 for the BAIJ matrix format in serial: src/mat/impls/baij/seq/dgefa6.c

I wonder: you are using pivoting in the inverse computation. But if one changes the position of elements or rows, do we not have to apply an inverse permutation to the columns at the end to get the inverse of the original matrix?! Otherwise it is the inverse of the permuted matrix?!

I am not quite sure whether one has to do this in principle or whether my understanding is not quite correct. Maybe I am missing something in your code.

Thanks, Frank

From dsalac at gmail.com Wed Jun 4 09:52:03 2025
From: dsalac at gmail.com (David Salac)
Date: Wed, 4 Jun 2025 10:52:03 -0400
Subject: [petsc-users] DMPlexTransform with DMForest
Message-ID: <0ea3e043-58b3-416a-9a11-65c094500b42@gmail.com>

Hello PETSc team,

I am trying to extrude some cells from a mesh that comes from a DMForest. The DMForest is converted to a DMPlex, which is then used in a modified version of DMPlexExtrude (patch is included). The issue is that the DMForest-to-DMPlex conversion switches the stratum ordering. It appears that DMPlexTransform assumes the standard DMPlex ordering, Cells->Vertices->Edges in 2D, while the plex that comes from DMForest has Cells->Edges->Vertices. This results in an edge being incorrectly flagged as outside of the range of edges. I'm including a minimal code showing this. Sample output and the error message are also shown below.

Thanks, Dave

Original Plex:
    0: [100,221)
    1: [221,441)
    2: [0,100)
Before Extrude Box: +0.000000e+00, +1.000000e+00 +0.000000e+00, +1.000000e+00
After Extrude Box: -1.000000e-01, +1.100000e+00 -1.000000e-01, +1.100000e+00
Converted Plex:
    0: [320,441)
    1: [100,320)
    2: [0,100)
Before Extrude Box: +0.000000e+00, +1.000000e+00 +0.000000e+00, +1.000000e+00
[0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
[0]PETSC ERROR: PETSc has generated inconsistent data
[0]PETSC ERROR: Point 100 is not a segment [221, 441)
[0]PETSC ERROR: See https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!cKv8WrCLXVqU9jE_pHlDNyVtTMk8aT2xxJcPTjksuQP4PiWtA5Kv8E0Ph8jKhOm7MbkM6qFHmK3R5N4y_sQ$ for trouble shooting.
[0]PETSC ERROR: PETSc Development Git Revision: v3.23.3-199-g3b15527e756 Git Date: 2025-06-03 15:20:49 +0000
[0]PETSC ERROR: ./ex with 1 MPI process(es) and PETSC_ARCH arch-ablate-opt on libuse.eng.buffalo.edu by davidsal Wed Jun 4 10:39:41 2025
[0]PETSC ERROR: Configure options: --with-64-bit-indices=0 --download-metis --download-mpich --download-p4est --download-parmetis --download-scalapack --download-slepc --download-tetgen --download-triangle --download-zlib --with-debugging=0 --with-libpng --with-slepc PETSC_ARCH=arch-ablate-opt --with-fortran-bindings=0
[0]PETSC ERROR: #1 DMPlexTransformGetTargetPoint() at /home/davidsal/Research/petsc/src/dm/impls/plex/transform/interface/plextransform.c:1003
[0]PETSC ERROR: #2 DMPlexTransformSetConeSizes() at /home/davidsal/Research/petsc/src/dm/impls/plex/transform/interface/plextransform.c:1400
[0]PETSC ERROR: #3 DMPlexTransformApply() at /home/davidsal/Research/petsc/src/dm/impls/plex/transform/interface/plextransform.c:2385
[0]PETSC ERROR: #4 DMPlexExtrude() at /home/davidsal/Research/petsc/src/dm/impls/plex/plexextrude.c:82
[0]PETSC ERROR: #5 extrudeMesh() at ex.c:34
[0]PETSC ERROR: #6 main() at ex.c:86
[0]PETSC ERROR: No PETSc Option Table entries
[0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov----------
-------------- next part --------------
A non-text attachment was scrubbed...
Name: extrude.patch
Type: text/x-patch
Size: 3981 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ex.c
Type: text/x-csrc
Size: 3838 bytes
Desc: not available
URL:

From mfadams at lbl.gov Wed Jun 4 12:06:55 2025
From: mfadams at lbl.gov (Mark Adams)
Date: Wed, 4 Jun 2025 13:06:55 -0400
Subject: [petsc-users] BLOCK INVERSE FOR ILU
In-Reply-To: <70CFDF74-8B47-419F-8CE3-0C5322BB2316@nsc.liu.se>
References: <70CFDF74-8B47-419F-8CE3-0C5322BB2316@nsc.liu.se>
Message-ID:

It looks like it pivots in blocks and is a point-wise ILU: https://urldefense.us/v3/__https://petsc.org/release/manualpages/PC/PCILU/__;!!G_uCfscf7eWS!d82xhlyQ6-cWmSO0PBI3pnGPDROVlCOPZn87hYtI4Pi9Kkbf4wgpQjnYsrJVy0pwHGtZYvPKXDxQCoq9BTvE6Ao$

Mark

On Wed, Jun 4, 2025 at 9:36 AM Frank Bramkamp wrote:

> Dear PETSc team,
>
> I have a general question regarding the implementation of the inverse
> For the ILU factorization.
>
> I am looking e.g. on the implementation for block size 6 for the BAIJ
> matrix format in serial:
> src/mat/impls/baij/seq/dgefa6.c
>
> I wonder:
> You are using pivoting in the inverse computation.
>
> But if one changes the position of elements or rows, do we not have to
> apply an inverse permutation at the end
> to get the inverse of the original matrix to some column ?! Otherwise it
> is the inverse of the permutated matrix ?!
>
> I am not quite sure if one has to do this in principle or if my
> understanding is not quite correct.
> Maybe I miss something in your code.
> > > Thanks, Frank > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Jun 4 15:59:36 2025 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 4 Jun 2025 16:59:36 -0400 Subject: [petsc-users] BLOCK INVERSE FOR ILU In-Reply-To: References: <70CFDF74-8B47-419F-8CE3-0C5322BB2316@nsc.liu.se> Message-ID: > On Jun 4, 2025, at 1:06?PM, Mark Adams wrote: > > It looks like it pivots in blocks and is a point-wise ILU: https://urldefense.us/v3/__https://petsc.org/release/manualpages/PC/PCILU/__;!!G_uCfscf7eWS!bLJjV_o5EdHrYFCvaAx1cH4gN3bYp79B0t10NWDhrlPZQWdn7KcKBX_pOyjuYD5pyZO4oNCt22EC7u-7hUelMUc$ > > Mark > > On Wed, Jun 4, 2025 at 9:36?AM Frank Bramkamp > wrote: >> Dear PETSc team, >> >> I have a general question regarding the implementation of the inverse >> For the ILU factorization. >> >> I am looking e.g. on the implementation for block size 6 for the BAIJ matrix format in serial: >> src/mat/impls/baij/seq/dgefa6.c >> >> I wonder: >> You are using pivoting in the inverse computation. >> >> But if one changes the position of elements or rows, do we not have to apply an inverse permutation at the end >> to get the inverse of the original matrix to some column ?! Otherwise it is the inverse of the permutated matrix ?! To compute the explicit inverse it does 6 triangular solves with columns of the identity. During the triangular solves it does the needed permutations so the inverse is the inverse of the original matrix, not a permuted version. >> >> I am not quite sure if one has to do this in principle or if my understanding is not quite correct. >> Maybe I miss something in your code. >> >> >> Thanks, Frank >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From yangzongze at gmail.com Wed Jun 4 20:30:39 2025 From: yangzongze at gmail.com (Zongze Yang) Date: Thu, 5 Jun 2025 01:30:39 +0000 Subject: [petsc-users] Issue with tau_exec and PETSc: "perfstubs could not be initialized" on macOS M-series Message-ID: Dear PETSc team, I?m encountering an issue when running a PETSc-based application with TAU instrumentation on macOS with Apple Silicon (M-series chip). The command I used is: ``` tau_exec ./ex56 -log_perfstubs ??``` However, it results in the following error: ?``` ? tau_exec ./ex56 -log_perfstubs [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Error in external library [0]PETSC ERROR: perfstubs could not be initialized [0]PETSC ERROR: See https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!Y921vmhOfEgEAvsThMr8bYV0Dl7pvKJs2Jjextm_IgBm1MrgwUsBGAxFZ9uPb9Ev5ZSFB2VzhgsJcDkhw3ZDGTII$ for trouble shooting. 
[0]PETSC ERROR: PETSc Release Version 3.23.0, unknown [0]PETSC ERROR: ./ex56 with 1 MPI process(es) and PETSC_ARCH arch-firedrake-default on yzzs-mac.local by zzyang Thu Jun 5 09:19:42 2025 [0]PETSC ERROR: Configure options: --COPTFLAGS="-O3 -march=native -mtune=native" --CXXOPTFLAGS="-O3 -march=native -mtune=native" --FOPTFLAGS="-O3 -mtune=native" --download-bison --download-cmake --download-ctetgen --download-eigen --download-fftw --download-hpddm --download-hypre --download-libpng --download-metis --download-mmg --download-mumps --download-mumps-avoid-mpi-in-place --download-netcdf --download-p4est --download-parmmg --download-pnetcdf --download-pragmatic --download-ptscotch --download-scalapack --download-slepc --download-suitesparse --download-superlu_dist --download-tetgen --download-triangle --with-c2html=0 --with-debugging=0 --with-fortran-bindings=0 --with-hdf5-dir=/opt/homebrew --with-hwloc-dir=/opt/homebrew --with-shared-libraries=1 --with-strict-petscerrorcode --with-zlib PETSC_ARCH=arch-firedrake-default --with-tau-perfstubs [0]PETSC ERROR: #1 PetscLogHandlerCreate_Perfstubs() at /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/impls/perfstubs/logperfstubs.c:184 [0]PETSC ERROR: #2 PetscLogHandlerSetType() at /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/interface/lhreg.c:104 [0]PETSC ERROR: #3 PetscLogTypeBegin() at /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/plog.c:423 [0]PETSC ERROR: #4 PetscLogPerfstubsBegin() at /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/plog.c:658 [0]PETSC ERROR: #5 PetscOptionsCheckInitial_Private() at /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/objects/init.c:528 [0]PETSC ERROR: #6 PetscInitialize_Common() at /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/objects/pinit.c:1046 [0]PETSC ERROR: #7 PetscInitialize() at /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/objects/pinit.c:1369 [0]PETSC ERROR: #8 main() at ex56.c:208 [0]PETSC ERROR: PETSc Option Table entries: [0]PETSC ERROR: -log_perfstubs (source: command line) [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- -------------------------------------------------------------------------- MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_SELF Proc: [[8742,0],0] Errorcode: 76 NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. You may or may not see output from other processes, depending on exactly when Open MPI kills them. -------------------------------------------------------------------------- ``` I would appreciate any guidance on how to resolve or further debug this issue: 1. Are there known issues with PerfStubs support on Apple Silicon? 2. Are there recommended configuration steps or flags needed when building TAU and PETSc for this platform? -------------- next part -------------- An HTML attachment was scrubbed... URL: From bramkamp at nsc.liu.se Thu Jun 5 03:27:00 2025 From: bramkamp at nsc.liu.se (Frank Bramkamp) Date: Thu, 5 Jun 2025 10:27:00 +0200 Subject: [petsc-users] BLOCK INVERSE FOR ILU In-Reply-To: <70CFDF74-8B47-419F-8CE3-0C5322BB2316@nsc.liu.se> References: <70CFDF74-8B47-419F-8CE3-0C5322BB2316@nsc.liu.se> Message-ID: Thanks for the response, I refer to the routine src/mat/impls/baij/seq/dgefa6.c /* Inverts 6 by 6 matrix using gaussian elimination with partial pivoting. 
Used by the sparse factorization routines in src/mat/impls/baij/seq This is a combination of the Linpack routines dgefa() and dgedi() specialized for a size of 6. */ which implements the the computation of an inverse for a dense 6x6 block. I think this routine is used in the ILU numeric factorization to compute the inverse diagonal block when we later apply ILU preconditioning where we need the inverse. And there it looks we use a direct LU solve and not triangular solves ?! In this context I am not sure if something is missing at the end or not regarding the inverse pivoting ?! Greetings, Frank From knepley at gmail.com Thu Jun 5 06:49:51 2025 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 5 Jun 2025 05:49:51 -0600 Subject: [petsc-users] BLOCK INVERSE FOR ILU In-Reply-To: References: <70CFDF74-8B47-419F-8CE3-0C5322BB2316@nsc.liu.se> Message-ID: On Thu, Jun 5, 2025 at 2:27?AM Frank Bramkamp wrote: > Thanks for the response, > > > I refer to the routine src/mat/impls/baij/seq/dgefa6.c > > > /* > Inverts 6 by 6 matrix using gaussian elimination with partial > pivoting. > > Used by the sparse factorization routines in > src/mat/impls/baij/seq > > This is a combination of the Linpack routines > dgefa() and dgedi() specialized for a size of 6. > > */ > > > which implements the the computation of an inverse for a dense 6x6 block. > I think this routine is used in the ILU numeric factorization to compute > the inverse diagonal block > when we later apply ILU preconditioning where we need the inverse. > > And there it looks we use a direct LU solve and not triangular solves ?! > > In this context I am not sure if something is missing at the end or not > regarding the inverse pivoting ?! > We do not think so, but there is a simple test for this. You can construct a small block diagonal matrix with blocksize 6, and apply it to the vector [1, 2, 3, 4, 5, 6 ...]. Use that vector as the RHS for a solve with preonly/ILU. That should get the exact inverse and output the original vector. If the vector is permuted, then we indeed have a bug. Thanks! Matt > Greetings, Frank > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!d8vX5tFZOpXshlbVrdeXXKCaT9OQePsnBaR3YBpe5DEoej9o3r1ZB9Rs-RgRm4xVgXdw-FNm8ShhNQ_vNueb$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Thu Jun 5 09:08:37 2025 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 5 Jun 2025 10:08:37 -0400 Subject: [petsc-users] BLOCK INVERSE FOR ILU In-Reply-To: References: <70CFDF74-8B47-419F-8CE3-0C5322BB2316@nsc.liu.se> Message-ID: <9DF81966-9D26-4605-89E1-B2C76C31ED69@petsc.dev> I do urge you to run a test as Matt suggests to detect possible bugs in our code and satisfy yourself the code is correct or incorrect. The code that begins l = ipvt[k - 1]; if (l != k) { ax = &a[k3 + 1]; ay = &a[6 * l + 1]; is attempting to take into account the pivoting. This chunk of code may be incorrect, of course. Note that there is similar code for block size 5 etc that may also be buggy I guess. Barry > On Jun 5, 2025, at 7:49?AM, Matthew Knepley wrote: > > On Thu, Jun 5, 2025 at 2:27?AM Frank Bramkamp > wrote: >> Thanks for the response, >> >> >> I refer to the routine src/mat/impls/baij/seq/dgefa6.c >> >> >> /* >> Inverts 6 by 6 matrix using gaussian elimination with partial pivoting. 
>> >> Used by the sparse factorization routines in >> src/mat/impls/baij/seq >> >> This is a combination of the Linpack routines >> dgefa() and dgedi() specialized for a size of 6. >> >> */ >> >> >> which implements the the computation of an inverse for a dense 6x6 block. >> I think this routine is used in the ILU numeric factorization to compute the inverse diagonal block >> when we later apply ILU preconditioning where we need the inverse. >> >> And there it looks we use a direct LU solve and not triangular solves ?! >> >> In this context I am not sure if something is missing at the end or not regarding the inverse pivoting ?! > > We do not think so, but there is a simple test for this. You can construct a small block diagonal matrix with > blocksize 6, and apply it to the vector [1, 2, 3, 4, 5, 6 ...]. Use that vector as the RHS for a solve with > preonly/ILU. That should get the exact inverse and output the original vector. If the vector is permuted, then > we indeed have a bug. > > Thanks! > > Matt > >> Greetings, Frank >> >> >> >> > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!bMXreiYRM3eT86Jioq3IATcR2EVaQfLh4ml3pmh88QtCtsm-80eT25djN21OkkgGiBm1v8KIhRDmklo_cvaliCI$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bramkamp at nsc.liu.se Thu Jun 5 11:26:57 2025 From: bramkamp at nsc.liu.se (Frank Bramkamp) Date: Thu, 5 Jun 2025 18:26:57 +0200 Subject: [petsc-users] BLOCK INVERSE FOR ILU In-Reply-To: <9DF81966-9D26-4605-89E1-B2C76C31ED69@petsc.dev> References: <70CFDF74-8B47-419F-8CE3-0C5322BB2316@nsc.liu.se> <9DF81966-9D26-4605-89E1-B2C76C31ED69@petsc.dev> Message-ID: <26C5C826-9AD4-4951-9B53-13FBF795BEB8@nsc.liu.se> Thanks for the reply, I will do a small test. 
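For reference, a minimal sketch of such a test (this is not code from the thread; the 6x6 entries below are arbitrary, chosen only so that partial pivoting has to swap rows, and it assumes a single dense block stored in a SEQBAIJ matrix with block size 6 so that the preonly/ILU solve should exercise the same 6x6 block-inverse kernel discussed above):

```
#include <petscksp.h>

int main(int argc, char **argv)
{
  Mat         A;
  Vec         u, b, x;
  KSP         ksp;
  PC          pc;
  PetscInt    idx[6];
  PetscScalar vals[36];
  PetscReal   err;

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  /* One dense, unsymmetric 6x6 block stored with block size 6, so the BAIJ
     factorization must invert it as a block */
  PetscCall(MatCreateSeqBAIJ(PETSC_COMM_SELF, 6, 6, 6, 1, NULL, &A));
  for (PetscInt i = 0; i < 6; i++) {
    idx[i] = i;
    /* Hilbert-like entries plus a large anti-diagonal so partial pivoting must swap rows */
    for (PetscInt j = 0; j < 6; j++) vals[6 * i + j] = 1.0 / (1.0 + i + 2 * j) + (i + j == 5 ? 3.0 : 0.0);
  }
  PetscCall(MatSetValues(A, 6, idx, 6, idx, vals, INSERT_VALUES));
  PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
  PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));

  /* u = [1, 2, 3, 4, 5, 6], b = A u */
  PetscCall(MatCreateVecs(A, &u, &b));
  PetscCall(VecDuplicate(u, &x));
  for (PetscInt i = 0; i < 6; i++) PetscCall(VecSetValue(u, i, (PetscScalar)(i + 1), INSERT_VALUES));
  PetscCall(VecAssemblyBegin(u));
  PetscCall(VecAssemblyEnd(u));
  PetscCall(MatMult(A, u, b));

  /* Solve A x = b with preonly/ILU; with a single block the ILU factorization is exact */
  PetscCall(KSPCreate(PETSC_COMM_SELF, &ksp));
  PetscCall(KSPSetOperators(ksp, A, A));
  PetscCall(KSPSetType(ksp, KSPPREONLY));
  PetscCall(KSPGetPC(ksp, &pc));
  PetscCall(PCSetType(pc, PCILU));
  PetscCall(KSPSolve(ksp, b, x));

  PetscCall(VecAXPY(x, -1.0, u));
  PetscCall(VecNorm(x, NORM_INFINITY, &err));
  PetscCall(PetscPrintf(PETSC_COMM_SELF, "max |x - u| = %g (expect roughly machine precision)\n", (double)err));

  PetscCall(KSPDestroy(&ksp));
  PetscCall(VecDestroy(&u));
  PetscCall(VecDestroy(&b));
  PetscCall(VecDestroy(&x));
  PetscCall(MatDestroy(&A));
  PetscCall(PetscFinalize());
  return 0;
}
```

If the block inverse mishandled the pivot permutation, the recovered vector would come back as a permutation of [1, 2, 3, 4, 5, 6] rather than matching it.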
Thanks for looking at it, Frank From sblondel at utk.edu Fri Jun 6 17:56:10 2025 From: sblondel at utk.edu (Blondel, Sophie) Date: Fri, 6 Jun 2025 22:56:10 +0000 Subject: [petsc-users] Error installing PETSc with --download-f2cblaslapack Message-ID: Hi, I am getting an error when trying to install PETSc with --download-f2cblaslapack on Linux: ============================================================================================= Installing F2CBLASLAPACK; this may take several minutes ============================================================================================= ********************************************************************************************* UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): --------------------------------------------------------------------------------------------- Error moving /home/sophie/Workspace/xolotl-stable-source/external/petsc/arch-linux-c-opt/externalpackages/f2cblaslapack-3.8.0.q2 libraries With the error in configure.log (which is accurate, the file is not present in the folder): ============================================================================================= Installing F2CBLASLAPACK; this may take several minutes ============================================================================================= Executing: ['mkdir', '-p', '/home/sophie/Workspace/xolotl-stable-build/external/petsc_install/lib'] Executing: ['cp', '-f', 'libf2clapack.a', 'libf2cblas.a', '/home/sophie/Workspace/xolotl-stable-build/external/petsc_install/lib'] stdout: cp: cannot stat 'libf2clapack.a': No such file or directory Error moving /home/sophie/Workspace/xolotl-stable-source/external/petsc/arch-linux-c-opt/externalpackages/f2cblaslapack-3.8.0.q2 libraries: Could not execute "[['mkdir', '-p', '/home/sophie/Workspace/xolotl-stable-build/external/petsc_install/lib'], ['cp', '-f', 'libf2clapack.a', 'libf2cblas.a', '/home/sophie/Workspace/xolotl-stable-build/external/petsc_install/lib']]": cp: cannot stat 'libf2clapack.a': No such file or directory The configure command is: ./configure --prefix=/home/sophie/Workspace/xolotl-stable-build/external/petsc_install --with-fc=0 --with-cuda=0 --with-mpi --with-openmp=0 --with-debugging=0 --with-shared-libraries --with-64-bit-indices --download-kokkos --download-kokkos-kernels --download-hdf5 --download-hdf5-configure-arguments=--enable-parallel --download-boost --download-f2cblaslapack --COPTFLAGS=-O3 --CXXOPTFLAGS=-O3 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ The version of PETSc you are using is out-of-date, we recommend updating to the new release Available Version: 3.23.3 Installed Version: 3.22.2 Let me know what additional information I can provide to help identify the issue. Best, Sophie -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Fri Jun 6 18:12:38 2025 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 6 Jun 2025 19:12:38 -0400 Subject: [petsc-users] Error installing PETSc with --download-f2cblaslapack In-Reply-To: References: Message-ID: <3E85CAFE-748D-4F71-9DDF-41505C7271BF@petsc.dev> Send configure.log I assume you tried rerunning a few times? It seems like possibly a flaky filesystem problem. 
Barry > On Jun 6, 2025, at 6:56?PM, Blondel, Sophie via petsc-users wrote: > > Hi, > > I am getting an error when trying to install PETSc with --download-f2cblaslapack on Linux: > ============================================================================================= > Installing F2CBLASLAPACK; this may take several minutes > ============================================================================================= > > ********************************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > --------------------------------------------------------------------------------------------- > Error moving > /home/sophie/Workspace/xolotl-stable-source/external/petsc/arch-linux-c-opt/externalpackages/f2cblaslapack-3.8.0.q2 > libraries > > With the error in configure.log (which is accurate, the file is not present in the folder): > ============================================================================================= > Installing F2CBLASLAPACK; this may take several minutes > ============================================================================================= > Executing: ['mkdir', '-p', '/home/sophie/Workspace/xolotl-stable-build/external/petsc_install/lib'] > Executing: ['cp', '-f', 'libf2clapack.a', 'libf2cblas.a', '/home/sophie/Workspace/xolotl-stable-build/external/petsc_install/lib'] > stdout: cp: cannot stat 'libf2clapack.a': No such file or directory > Error moving /home/sophie/Workspace/xolotl-stable-source/external/petsc/arch-linux-c-opt/externalpackages/f2cblaslapack-3.8.0.q2 libraries: Could not execute "[['mkdir', '-p', '/home/sophie/Workspace/xolotl-stable-build/external/petsc_install/lib'], ['cp', '-f', 'libf2clapack.a', 'libf2cblas.a', '/home/sophie/Workspace/xolotl-stable-build/external/petsc_install/lib']]": > cp: cannot stat 'libf2clapack.a': No such file or directory > > The configure command is: > ./configure --prefix=/home/sophie/Workspace/xolotl-stable-build/external/petsc_install --with-fc=0 --with-cuda=0 --with-mpi --with-openmp=0 --with-debugging=0 --with-shared-libraries --with-64-bit-indices --download-kokkos --download-kokkos-kernels --download-hdf5 --download-hdf5-configure-arguments=--enable-parallel --download-boost --download-f2cblaslapack --COPTFLAGS=-O3 --CXXOPTFLAGS=-O3 > +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > The version of PETSc you are using is out-of-date, we recommend updating to the new release > Available Version: 3.23.3 Installed Version: 3.22.2 > > Let me know what additional information I can provide to help identify the issue. > > Best, > > Sophie -------------- next part -------------- An HTML attachment was scrubbed... URL: From sblondel at utk.edu Fri Jun 6 18:19:44 2025 From: sblondel at utk.edu (Blondel, Sophie) Date: Fri, 6 Jun 2025 23:19:44 +0000 Subject: [petsc-users] Error installing PETSc with --download-f2cblaslapack In-Reply-To: <3E85CAFE-748D-4F71-9DDF-41505C7271BF@petsc.dev> References: <3E85CAFE-748D-4F71-9DDF-41505C7271BF@petsc.dev> Message-ID: <6da6f377-936d-4e4b-b8d1-355b5c413699@email.android.com> Correct, I tried a few different python versions and mpi versions, conda is providing the external dependencies. Maybe it wrongly points to some base builds instead of the specific environment I set up. Best, Sophie On Jun 6, 2025 7:12 PM, Barry Smith wrote: Send configure.log I assume you tried rerunning a few times? It seems like possibly a flaky filesystem problem. 
Barry On Jun 6, 2025, at 6:56?PM, Blondel, Sophie via petsc-users wrote: Hi, I am getting an error when trying to install PETSc with --download-f2cblaslapack on Linux: ============================================================================================= Installing F2CBLASLAPACK; this may take several minutes ============================================================================================= ********************************************************************************************* UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): --------------------------------------------------------------------------------------------- Error moving /home/sophie/Workspace/xolotl-stable-source/external/petsc/arch-linux-c-opt/externalpackages/f2cblaslapack-3.8.0.q2 libraries With the error in configure.log (which is accurate, the file is not present in the folder): ============================================================================================= Installing F2CBLASLAPACK; this may take several minutes ============================================================================================= Executing: ['mkdir', '-p', '/home/sophie/Workspace/xolotl-stable-build/external/petsc_install/lib'] Executing: ['cp', '-f', 'libf2clapack.a', 'libf2cblas.a', '/home/sophie/Workspace/xolotl-stable-build/external/petsc_install/lib'] stdout: cp: cannot stat 'libf2clapack.a': No such file or directory Error moving /home/sophie/Workspace/xolotl-stable-source/external/petsc/arch-linux-c-opt/externalpackages/f2cblaslapack-3.8.0.q2 libraries: Could not execute "[['mkdir', '-p', '/home/sophie/Workspace/xolotl-stable-build/external/petsc_install/lib'], ['cp', '-f', 'libf2clapack.a', 'libf2cblas.a', '/home/sophie/Workspace/xolotl-stable-build/external/petsc_install/lib']]": cp: cannot stat 'libf2clapack.a': No such file or directory The configure command is: ./configure --prefix=/home/sophie/Workspace/xolotl-stable-build/external/petsc_install --with-fc=0 --with-cuda=0 --with-mpi --with-openmp=0 --with-debugging=0 --with-shared-libraries --with-64-bit-indices --download-kokkos --download-kokkos-kernels --download-hdf5 --download-hdf5-configure-arguments=--enable-parallel --download-boost --download-f2cblaslapack --COPTFLAGS=-O3 --CXXOPTFLAGS=-O3 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ The version of PETSc you are using is out-of-date, we recommend updating to the new release Available Version: 3.23.3 Installed Version: 3.22.2 Let me know what additional information I can provide to help identify the issue. Best, Sophie -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Fri Jun 6 18:27:41 2025 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 6 Jun 2025 19:27:41 -0400 Subject: [petsc-users] Error installing PETSc with --download-f2cblaslapack In-Reply-To: <6da6f377-936d-4e4b-b8d1-355b5c413699@email.android.com> References: <3E85CAFE-748D-4F71-9DDF-41505C7271BF@petsc.dev> <6da6f377-936d-4e4b-b8d1-355b5c413699@email.android.com> Message-ID: you forgot configure.log > On Jun 6, 2025, at 7:19?PM, Blondel, Sophie wrote: > > Correct, I tried a few different python versions and mpi versions, conda is providing the external dependencies. Maybe it wrongly points to some base builds instead of the specific environment I set up. > > Best, > > Sophie > > On Jun 6, 2025 7:12 PM, Barry Smith wrote: > > Send configure.log > > I assume you tried rerunning a few times? 
It seems like possibly a flaky filesystem problem. > > Barry > > >> On Jun 6, 2025, at 6:56?PM, Blondel, Sophie via petsc-users wrote: >> >> Hi, >> >> I am getting an error when trying to install PETSc with --download-f2cblaslapack on Linux: >> ============================================================================================= >> Installing F2CBLASLAPACK; this may take several minutes >> ============================================================================================= >> >> ********************************************************************************************* >> UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): >> --------------------------------------------------------------------------------------------- >> Error moving >> /home/sophie/Workspace/xolotl-stable-source/external/petsc/arch-linux-c-opt/externalpackages/f2cblaslapack-3.8.0.q2 >> libraries >> >> With the error in configure.log (which is accurate, the file is not present in the folder): >> ============================================================================================= >> Installing F2CBLASLAPACK; this may take several minutes >> ============================================================================================= >> Executing: ['mkdir', '-p', '/home/sophie/Workspace/xolotl-stable-build/external/petsc_install/lib'] >> Executing: ['cp', '-f', 'libf2clapack.a', 'libf2cblas.a', '/home/sophie/Workspace/xolotl-stable-build/external/petsc_install/lib'] >> stdout: cp: cannot stat 'libf2clapack.a': No such file or directory >> Error moving /home/sophie/Workspace/xolotl-stable-source/external/petsc/arch-linux-c-opt/externalpackages/f2cblaslapack-3.8.0.q2 libraries: Could not execute "[['mkdir', '-p', '/home/sophie/Workspace/xolotl-stable-build/external/petsc_install/lib'], ['cp', '-f', 'libf2clapack.a', 'libf2cblas.a', '/home/sophie/Workspace/xolotl-stable-build/external/petsc_install/lib']]": >> cp: cannot stat 'libf2clapack.a': No such file or directory >> >> The configure command is: >> ./configure --prefix=/home/sophie/Workspace/xolotl-stable-build/external/petsc_install --with-fc=0 --with-cuda=0 --with-mpi --with-openmp=0 --with-debugging=0 --with-shared-libraries --with-64-bit-indices --download-kokkos --download-kokkos-kernels --download-hdf5 --download-hdf5-configure-arguments=--enable-parallel --download-boost --download-f2cblaslapack --COPTFLAGS=-O3 --CXXOPTFLAGS=-O3 >> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ >> The version of PETSc you are using is out-of-date, we recommend updating to the new release >> Available Version: 3.23.3 Installed Version: 3.22.2 >> >> Let me know what additional information I can provide to help identify the issue. >> >> Best, >> >> Sophie > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefano.zampini at gmail.com Sat Jun 7 01:22:34 2025 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Sat, 7 Jun 2025 09:22:34 +0300 Subject: [petsc-users] Error installing PETSc with --download-f2cblaslapack In-Reply-To: <6da6f377-936d-4e4b-b8d1-355b5c413699@email.android.com> References: <3E85CAFE-748D-4F71-9DDF-41505C7271BF@petsc.dev> <6da6f377-936d-4e4b-b8d1-355b5c413699@email.android.com> Message-ID: If conda is providing the external dependencies, why do you want download-f2cblaslapack and not use a conda provided one? 
Mkl or openblas for example Stefano On Sat, Jun 7, 2025, 02:20 Blondel, Sophie via petsc-users < petsc-users at mcs.anl.gov> wrote: > Correct, I tried a few different python versions and mpi versions, conda > is providing the external dependencies. Maybe it wrongly points to some > base builds instead of the specific environment I set up. > > Best, > > Sophie > > On Jun 6, 2025 7:12 PM, Barry Smith wrote: > > Send configure.log > > I assume you tried rerunning a few times? It seems like possibly a flaky > filesystem problem. > > Barry > > > On Jun 6, 2025, at 6:56?PM, Blondel, Sophie via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Hi, > > I am getting an error when trying to install PETSc with > --download-f2cblaslapack on Linux: > > ============================================================================================= > Installing F2CBLASLAPACK; this may take several minutes > > ============================================================================================= > > > ********************************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for > details): > > --------------------------------------------------------------------------------------------- > Error moving > > /home/sophie/Workspace/xolotl-stable-source/external/petsc/arch-linux-c-opt/externalpackages/f2cblaslapack-3.8.0.q2 > libraries > > With the error in configure.log (which is accurate, the file is not > present in the folder): > > ============================================================================================= > Installing F2CBLASLAPACK; this may take several minutes > > ============================================================================================= > Executing: ['mkdir', '-p', > '/home/sophie/Workspace/xolotl-stable-build/external/petsc_install/lib'] > Executing: ['cp', '-f', 'libf2clapack.a', 'libf2cblas.a', > '/home/sophie/Workspace/xolotl-stable-build/external/petsc_install/lib'] > stdout: cp: cannot stat 'libf2clapack.a': No such file or directory > Error moving > /home/sophie/Workspace/xolotl-stable-source/external/petsc/arch-linux-c-opt/externalpackages/f2cblaslapack-3.8.0.q2 > libraries: Could not execute "[['mkdir', '-p', > '/home/sophie/Workspace/xolotl-stable-build/external/petsc_install/lib'], > ['cp', '-f', 'libf2clapack.a', 'libf2cblas.a', > '/home/sophie/Workspace/xolotl-stable-build/external/petsc_install/lib']]": > cp: cannot stat 'libf2clapack.a': No such file or directory > > The configure command is: > ./configure > --prefix=/home/sophie/Workspace/xolotl-stable-build/external/petsc_install > --with-fc=0 --with-cuda=0 --with-mpi --with-openmp=0 --with-debugging=0 > --with-shared-libraries --with-64-bit-indices --download-kokkos > --download-kokkos-kernels --download-hdf5 > --download-hdf5-configure-arguments=--enable-parallel --download-boost > --download-f2cblaslapack --COPTFLAGS=-O3 --CXXOPTFLAGS=-O3 > > +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > The version of PETSc you are using is out-of-date, we recommend updating > to the new release > Available Version: 3.23.3 Installed Version: 3.22.2 > > Let me know what additional information I can provide to help identify the > issue. > > Best, > > Sophie > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From sblondel at utk.edu Fri Jun 6 18:34:54 2025 From: sblondel at utk.edu (Blondel, Sophie) Date: Fri, 6 Jun 2025 23:34:54 +0000 Subject: [petsc-users] Error installing PETSc with --download-f2cblaslapack In-Reply-To: References: <3E85CAFE-748D-4F71-9DDF-41505C7271BF@petsc.dev> <6da6f377-936d-4e4b-b8d1-355b5c413699@email.android.com> Message-ID: Sorry about that, I clearly didn't read the email well... Best, Sophie ________________________________ From: Barry Smith Sent: Friday, June 6, 2025 19:27 To: Blondel, Sophie Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Error installing PETSc with --download-f2cblaslapack you forgot configure.log On Jun 6, 2025, at 7:19?PM, Blondel, Sophie wrote: Correct, I tried a few different python versions and mpi versions, conda is providing the external dependencies. Maybe it wrongly points to some base builds instead of the specific environment I set up. Best, Sophie On Jun 6, 2025 7:12 PM, Barry Smith wrote: Send configure.log I assume you tried rerunning a few times? It seems like possibly a flaky filesystem problem. Barry On Jun 6, 2025, at 6:56?PM, Blondel, Sophie via petsc-users wrote: Hi, I am getting an error when trying to install PETSc with --download-f2cblaslapack on Linux: ============================================================================================= Installing F2CBLASLAPACK; this may take several minutes ============================================================================================= ********************************************************************************************* UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): --------------------------------------------------------------------------------------------- Error moving /home/sophie/Workspace/xolotl-stable-source/external/petsc/arch-linux-c-opt/externalpackages/f2cblaslapack-3.8.0.q2 libraries With the error in configure.log (which is accurate, the file is not present in the folder): ============================================================================================= Installing F2CBLASLAPACK; this may take several minutes ============================================================================================= Executing: ['mkdir', '-p', '/home/sophie/Workspace/xolotl-stable-build/external/petsc_install/lib'] Executing: ['cp', '-f', 'libf2clapack.a', 'libf2cblas.a', '/home/sophie/Workspace/xolotl-stable-build/external/petsc_install/lib'] stdout: cp: cannot stat 'libf2clapack.a': No such file or directory Error moving /home/sophie/Workspace/xolotl-stable-source/external/petsc/arch-linux-c-opt/externalpackages/f2cblaslapack-3.8.0.q2 libraries: Could not execute "[['mkdir', '-p', '/home/sophie/Workspace/xolotl-stable-build/external/petsc_install/lib'], ['cp', '-f', 'libf2clapack.a', 'libf2cblas.a', '/home/sophie/Workspace/xolotl-stable-build/external/petsc_install/lib']]": cp: cannot stat 'libf2clapack.a': No such file or directory The configure command is: ./configure --prefix=/home/sophie/Workspace/xolotl-stable-build/external/petsc_install --with-fc=0 --with-cuda=0 --with-mpi --with-openmp=0 --with-debugging=0 --with-shared-libraries --with-64-bit-indices --download-kokkos --download-kokkos-kernels --download-hdf5 --download-hdf5-configure-arguments=--enable-parallel --download-boost --download-f2cblaslapack --COPTFLAGS=-O3 --CXXOPTFLAGS=-O3 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ The version of PETSc you are using is out-of-date, we recommend updating to 
the new release Available Version: 3.23.3 Installed Version: 3.22.2 Let me know what additional information I can provide to help identify the issue. Best, Sophie -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: text/x-log Size: 8038682 bytes Desc: configure.log URL: From balay.anl at fastmail.org Sun Jun 8 10:49:08 2025 From: balay.anl at fastmail.org (Satish Balay) Date: Sun, 8 Jun 2025 10:49:08 -0500 (CDT) Subject: [petsc-users] Error installing PETSc with --download-f2cblaslapack In-Reply-To: References: <3E85CAFE-748D-4F71-9DDF-41505C7271BF@petsc.dev> <6da6f377-936d-4e4b-b8d1-355b5c413699@email.android.com> Message-ID: <1f56df04-d14e-f459-a1c9-887089189090@fastmail.org> >>>> cgeesx.c: In function 'cgeesx_': cgeesx.c:509:27: error: too many arguments to function 'select'; expected 0, have 1 509 | bwork[i__] = (*select)(&w[i__]); | ~^~~~~~~~ ~~~~~~~ <<<<< >>> Executing: mpicc --version stdout: x86_64-conda-linux-gnu-cc (conda-forge gcc 15.1.0-2) 15.1.0 <<<< Ah - ok - this is a gcc-15 compatibility issue. (I'm not sure what the appropriate code fix here is). The workaround is to use: -std=gnu17 petsc-3.23 should automatically set/use this flag. With older versions - perhaps you can try: COPTFLAGS="-O3 -std=gnu17" Satish On Fri, 6 Jun 2025, Blondel, Sophie via petsc-users wrote: > Sorry about that, I clearly didn't read the email well... > > Best, > > Sophie > ________________________________ > From: Barry Smith > Sent: Friday, June 6, 2025 19:27 > To: Blondel, Sophie > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Error installing PETSc with --download-f2cblaslapack > > you forgot configure.log > > > > On Jun 6, 2025, at 7:19?PM, Blondel, Sophie wrote: > > Correct, I tried a few different python versions and mpi versions, conda is providing the external dependencies. Maybe it wrongly points to some base builds instead of the specific environment I set up. > > Best, > > Sophie > > On Jun 6, 2025 7:12 PM, Barry Smith wrote: > > Send configure.log > > I assume you tried rerunning a few times? It seems like possibly a flaky filesystem problem. 
> > Barry > > > On Jun 6, 2025, at 6:56?PM, Blondel, Sophie via petsc-users wrote: > > Hi, > > I am getting an error when trying to install PETSc with --download-f2cblaslapack on Linux: > ============================================================================================= > Installing F2CBLASLAPACK; this may take several minutes > ============================================================================================= > > ********************************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > --------------------------------------------------------------------------------------------- > Error moving > /home/sophie/Workspace/xolotl-stable-source/external/petsc/arch-linux-c-opt/externalpackages/f2cblaslapack-3.8.0.q2 > libraries > > With the error in configure.log (which is accurate, the file is not present in the folder): > ============================================================================================= > Installing F2CBLASLAPACK; this may take several minutes > ============================================================================================= > Executing: ['mkdir', '-p', '/home/sophie/Workspace/xolotl-stable-build/external/petsc_install/lib'] > Executing: ['cp', '-f', 'libf2clapack.a', 'libf2cblas.a', '/home/sophie/Workspace/xolotl-stable-build/external/petsc_install/lib'] > stdout: cp: cannot stat 'libf2clapack.a': No such file or directory > Error moving /home/sophie/Workspace/xolotl-stable-source/external/petsc/arch-linux-c-opt/externalpackages/f2cblaslapack-3.8.0.q2 libraries: Could not execute "[['mkdir', '-p', '/home/sophie/Workspace/xolotl-stable-build/external/petsc_install/lib'], ['cp', '-f', 'libf2clapack.a', 'libf2cblas.a', '/home/sophie/Workspace/xolotl-stable-build/external/petsc_install/lib']]": > cp: cannot stat 'libf2clapack.a': No such file or directory > > The configure command is: > ./configure --prefix=/home/sophie/Workspace/xolotl-stable-build/external/petsc_install --with-fc=0 --with-cuda=0 --with-mpi --with-openmp=0 --with-debugging=0 --with-shared-libraries --with-64-bit-indices --download-kokkos --download-kokkos-kernels --download-hdf5 --download-hdf5-configure-arguments=--enable-parallel --download-boost --download-f2cblaslapack --COPTFLAGS=-O3 --CXXOPTFLAGS=-O3 > +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > The version of PETSc you are using is out-of-date, we recommend updating to the new release > Available Version: 3.23.3 Installed Version: 3.22.2 > > Let me know what additional information I can provide to help identify the issue. > > Best, > > Sophie > > > From liufield at gmail.com Mon Jun 9 18:47:18 2025 From: liufield at gmail.com (neil liu) Date: Mon, 9 Jun 2025 19:47:18 -0400 Subject: [petsc-users] Memory leak related to MatSetValue Message-ID: Dear Petsc community, Recently, I encountered a memory leak while using Valgrind (3.25.1) with MPI (2 processes, MPICH 4.21.1) to test a PETSc-based code, which is a variant of the pestc's built-in example. 
#include <petscmat.h>

static char help[] = "Demonstrate PCFIELDSPLIT after MatZeroRowsColumns() inside PCREDISTRIBUTE";

int main(int argc, char **argv)
{
  PetscMPIInt rank, size;
  Mat         A;

  PetscCall(PetscInitialize(&argc, &argv, NULL, help));
  PetscCallMPI(MPI_Comm_size(PETSC_COMM_WORLD, &size));
  PetscCallMPI(MPI_Comm_rank(PETSC_COMM_WORLD, &rank));
  PetscCheck(size == 2, PETSC_COMM_WORLD, PETSC_ERR_WRONG_MPI_SIZE, "Must be run with 2 MPI processes");

  // Set up a small problem with 2 dofs on rank 0 and 4 on rank 1
  PetscCall(MatCreate(PETSC_COMM_WORLD, &A));
  PetscCall(MatSetSizes(A, !rank ? 2 : 4, !rank ? 2 : 4, PETSC_DETERMINE, PETSC_DETERMINE));
  PetscCall(MatSetFromOptions(A));
  if (rank == 0) {
    PetscCall(MatSetValue(A, 0, 0, 2.0, ADD_VALUES));
    PetscCall(MatSetValue(A, 0, 1, -1.0, ADD_VALUES));
    PetscCall(MatSetValue(A, 1, 1, 3.0, ADD_VALUES));
    PetscCall(MatSetValue(A, 1, 2, -1.0, ADD_VALUES));
  } else if (rank == 1) {
    PetscCall(MatSetValue(A, 1, 2, 40.0, ADD_VALUES)); // additional line added: row 1 is owned by rank 0
    PetscCall(MatSetValue(A, 2, 2, 4.0, ADD_VALUES));
    PetscCall(MatSetValue(A, 2, 3, -1.0, ADD_VALUES));
    PetscCall(MatSetValue(A, 3, 3, 5.0, ADD_VALUES));
    PetscCall(MatSetValue(A, 3, 4, -1.0, ADD_VALUES));
    PetscCall(MatSetValue(A, 4, 4, 6.0, ADD_VALUES));
    PetscCall(MatSetValue(A, 4, 5, -1.0, ADD_VALUES));
    PetscCall(MatSetValue(A, 5, 5, 7.0, ADD_VALUES));
    PetscCall(MatSetValue(A, 5, 4, -0.5, ADD_VALUES));
  }
  PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
  PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));
  PetscCall(MatView(A, PETSC_VIEWER_STDOUT_WORLD));

  PetscCall(MatDestroy(&A));
  PetscCall(PetscFinalize());
  return 0;
}

Ranks 0 and 1 own 2 (rows 0 to 1) and 4 (rows 2 to 5) local rows, respectively. I tried to add

  PetscCall(MatSetValue(A, 1, 2, 40.0, ADD_VALUES));

on rank 1 (row 1 is owned by rank 0 only, but is now also modified on rank 1). After adding this line, a memory leak occurred. Does this imply that we cannot assign values to entries owned by other processors? In my case, I am assembling a global matrix from a DMPlex. With overlap=0, it seems necessary to use MatSetValues for rows owned by other processes. I'm not certain whether these two scenarios are equivalent, but they both appear to trigger the same memory leak. Did I miss something?
Thanks a lot, Xiaodong ==3932339== 96 bytes in 1 blocks are definitely lost in loss record 2 of 3 ==3932339== at 0x4C392E1: malloc (vg_replace_malloc.c:446) ==3932339== by 0xA267A97: MPL_malloc (mpl_trmem.h:373) ==3932339== by 0xA267CE4: MPIR_Datatype_set_contents (mpir_datatype.h:420) ==3932339== by 0xA26E26E: MPIR_Type_create_struct_impl (type_create.c:919) ==3932339== by 0xA068F45: internal_Type_create_struct (c_binding.c:36491) ==3932339== by 0xA06911F: PMPI_Type_create_struct (c_binding.c:36551) ==3932339== by 0x4E78222: PMPI_Type_create_struct (libmpiwrap.c:2752) ==3932339== by 0x6F5C7D6: MatStashBlockTypeSetUp (matstash.c:772) ==3932339== by 0x6F61162: MatStashScatterBegin_BTS (matstash.c:838) ==3932339== by 0x6F54511: MatStashScatterBegin_Private (matstash.c:437) ==3932339== by 0x60BA9BD: MatAssemblyBegin_MPI_Hash (mpihashmat.h:59) ==3932339== by 0x6E8D3AE: MatAssemblyBegin (matrix.c:5749) ==3932339== ==3932339== 96 bytes in 1 blocks are definitely lost in loss record 3 of 3 ==3932339== at 0x4C392E1: malloc (vg_replace_malloc.c:446) ==3932339== by 0xA2385FF: MPL_malloc (mpl_trmem.h:373) ==3932339== by 0xA2395F8: MPII_Dataloop_alloc_and_copy (dataloop.c:400) ==3932339== by 0xA2393DC: MPII_Dataloop_alloc (dataloop.c:319) ==3932339== by 0xA23B239: MPIR_Dataloop_create_contiguous (dataloop_create_contig.c:56) ==3932339== by 0xA23BFF9: MPIR_Dataloop_create_indexed (dataloop_create_indexed.c:89) ==3932339== by 0xA23D5DC: create_basic_all_bytes_struct (dataloop_create_struct.c:252) ==3932339== by 0xA23D178: MPIR_Dataloop_create_struct (dataloop_create_struct.c:146) ==3932339== by 0xA25D4DB: MPIR_Typerep_commit (typerep_dataloop_commit.c:284) ==3932339== by 0xA261549: MPIR_Type_commit_impl (datatype_impl.c:185) ==3932339== by 0xA0624CA: internal_Type_commit (c_binding.c:34506) ==3932339== by 0xA062679: PMPI_Type_commit (c_binding.c:34553) -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Mon Jun 9 20:56:18 2025 From: bsmith at petsc.dev (Barry Smith) Date: Mon, 9 Jun 2025 21:56:18 -0400 Subject: [petsc-users] Memory leak related to MatSetValue In-Reply-To: References: Message-ID: <427D9F44-1B6C-4E8B-8A68-ED4AE4A66F33@petsc.dev> src/mat/utils/matstash.c In MatStashBlockTypeSetUp() PetscCallMPI(MPI_Type_create_resized(stype, 0, stash->blocktype_size, &stash->blocktype)); In MatStashScatterDestroy_BTS(MatStash *stash) if (stash->blocktype != MPI_DATATYPE_NULL) PetscCallMPI(MPI_Type_free(&stash->blocktype)); So either 1) PETSc logic is preventing the correct MPI_Type_free() call from being made or 2) a bug has crept into MPICH that prevents the MPI_Type_free from freeing everything it needs to You can use the debugger (or even print statements inserted in the PETSc source) to determine in your very simple code that the MPI_Type_create_resized() is called exactly once and also the matching MPI_Type_free() to determine if the problem is PETSc logic or MPICH logic. Since we test PETSc in our CI with valgrind it is unlikely a PETSc bug Barry > On Jun 9, 2025, at 7:47?PM, neil liu wrote: > > Dear Petsc community, > Recently, I encountered a memory leak while using Valgrind (3.25.1) with MPI (2 processes, MPICH 4.21.1) to test a PETSc-based code, which is a variant of the pestc's built-in example. 
> > #include > static char help[] = "Demonstrate PCFIELDSPLIT after MatZeroRowsColumns() inside PCREDISTRIBUTE"; > int main(int argc, char **argv) > { > PetscMPIInt rank, size; > Mat A; > > PetscCall(PetscInitialize(&argc, &argv, NULL, help)); > PetscCallMPI(MPI_Comm_size(PETSC_COMM_WORLD, &size)); > PetscCallMPI(MPI_Comm_rank(PETSC_COMM_WORLD, &rank)); > PetscCheck(size == 2, PETSC_COMM_WORLD, PETSC_ERR_WRONG_MPI_SIZE, "Must be run with 2 MPI processes"); > > // Set up a small problem with 2 dofs on rank 0 and 4 on rank 1 > PetscCall(MatCreate(PETSC_COMM_WORLD, &A)); > PetscCall(MatSetSizes(A, !rank ? 2 : 4, !rank ? 2 : 4, PETSC_DETERMINE, PETSC_DETERMINE)); > PetscCall(MatSetFromOptions(A)); > if (rank == 0) { > PetscCall(MatSetValue(A, 0, 0, 2.0, ADD_VALUES)); > PetscCall(MatSetValue(A, 0, 1, -1.0, ADD_VALUES)); > PetscCall(MatSetValue(A, 1, 1, 3.0, ADD_VALUES)); > PetscCall(MatSetValue(A, 1, 2, -1.0, ADD_VALUES)); > } else if (rank == 1) { > PetscCall(MatSetValue(A, 1, 2, 40.0, ADD_VALUES));//Additional line added > PetscCall(MatSetValue(A, 2, 2, 4.0, ADD_VALUES)); > PetscCall(MatSetValue(A, 2, 3, -1.0, ADD_VALUES)); > PetscCall(MatSetValue(A, 3, 3, 5.0, ADD_VALUES)); > PetscCall(MatSetValue(A, 3, 4, -1.0, ADD_VALUES)); > PetscCall(MatSetValue(A, 4, 4, 6.0, ADD_VALUES)); > PetscCall(MatSetValue(A, 4, 5, -1.0, ADD_VALUES)); > PetscCall(MatSetValue(A, 5, 5, 7.0, ADD_VALUES)); > PetscCall(MatSetValue(A, 5, 4, -0.5, ADD_VALUES)); > } > PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY)); > PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY)); > PetscCall(MatView(A, PETSC_VIEWER_STDOUT_WORLD)); > > PetscCall(MatDestroy(&A)); > PetscCall(PetscFinalize()); > return 0; > } > > Rank 0 and 1 own 2 (from 0 to 1) and 4 (from 2 to 5) local rows respectively. > I tried to add > PetscCall(MatSetValue(A, 1, 2, 40.0, ADD_VALUES)); > for rank 1. //1 is owned by rank 0 only but is also modified in rank 1 now. > After adding this line, a memory leak occurred. Does this imply that we cannot assign values to entries owned by other processors? In my case, I am assembling a global matrix from a DMPlex. With overlap=0, it seems necessary to use MatSetValues for rows owned by other processes. I'm not certain whether these two scenarios are equivalent, but they both appear to trigger the same memory leak. > Did I miss something? 
> > Thanks a lot, > > Xiaodong > > ==3932339== 96 bytes in 1 blocks are definitely lost in loss record 2 of 3 > ==3932339== at 0x4C392E1: malloc (vg_replace_malloc.c:446) > ==3932339== by 0xA267A97: MPL_malloc (mpl_trmem.h:373) > ==3932339== by 0xA267CE4: MPIR_Datatype_set_contents (mpir_datatype.h:420) > ==3932339== by 0xA26E26E: MPIR_Type_create_struct_impl (type_create.c:919) > ==3932339== by 0xA068F45: internal_Type_create_struct (c_binding.c:36491) > ==3932339== by 0xA06911F: PMPI_Type_create_struct (c_binding.c:36551) > ==3932339== by 0x4E78222: PMPI_Type_create_struct (libmpiwrap.c:2752) > ==3932339== by 0x6F5C7D6: MatStashBlockTypeSetUp (matstash.c:772) > ==3932339== by 0x6F61162: MatStashScatterBegin_BTS (matstash.c:838) > ==3932339== by 0x6F54511: MatStashScatterBegin_Private (matstash.c:437) > ==3932339== by 0x60BA9BD: MatAssemblyBegin_MPI_Hash (mpihashmat.h:59) > ==3932339== by 0x6E8D3AE: MatAssemblyBegin (matrix.c:5749) > ==3932339== > ==3932339== 96 bytes in 1 blocks are definitely lost in loss record 3 of 3 > ==3932339== at 0x4C392E1: malloc (vg_replace_malloc.c:446) > ==3932339== by 0xA2385FF: MPL_malloc (mpl_trmem.h:373) > ==3932339== by 0xA2395F8: MPII_Dataloop_alloc_and_copy (dataloop.c:400) > ==3932339== by 0xA2393DC: MPII_Dataloop_alloc (dataloop.c:319) > ==3932339== by 0xA23B239: MPIR_Dataloop_create_contiguous (dataloop_create_contig.c:56) > ==3932339== by 0xA23BFF9: MPIR_Dataloop_create_indexed (dataloop_create_indexed.c:89) > ==3932339== by 0xA23D5DC: create_basic_all_bytes_struct (dataloop_create_struct.c:252) > ==3932339== by 0xA23D178: MPIR_Dataloop_create_struct (dataloop_create_struct.c:146) > ==3932339== by 0xA25D4DB: MPIR_Typerep_commit (typerep_dataloop_commit.c:284) > ==3932339== by 0xA261549: MPIR_Type_commit_impl (datatype_impl.c:185) > ==3932339== by 0xA0624CA: internal_Type_commit (c_binding.c:34506) > ==3932339== by 0xA062679: PMPI_Type_commit (c_binding.c:34553) > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Mon Jun 9 21:35:54 2025 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Mon, 9 Jun 2025 21:35:54 -0500 Subject: [petsc-users] Memory leak related to MatSetValue In-Reply-To: References: Message-ID: On Mon, Jun 9, 2025 at 6:47?PM neil liu wrote: > Dear Petsc community, > Recently, I encountered a memory leak while using Valgrind (3.25.1) with > MPI (2 processes, MPICH 4.21.1) to test a PETSc-based code, which is a > variant of the pestc's built-in example. > > *#include * > *static char help[] = "Demonstrate PCFIELDSPLIT after MatZeroRowsColumns() > inside PCREDISTRIBUTE";* > *int main(int argc, char **argv)* > *{* > * PetscMPIInt rank, size;* > * Mat A;* > > * PetscCall(PetscInitialize(&argc, &argv, NULL, help));* > * PetscCallMPI(MPI_Comm_size(PETSC_COMM_WORLD, &size));* > * PetscCallMPI(MPI_Comm_rank(PETSC_COMM_WORLD, &rank));* > * PetscCheck(size == 2, PETSC_COMM_WORLD, PETSC_ERR_WRONG_MPI_SIZE, "Must > be run with 2 MPI processes");* > > * // Set up a small problem with 2 dofs on rank 0 and 4 on rank 1* > * PetscCall(MatCreate(PETSC_COMM_WORLD, &A));* > * PetscCall(MatSetSizes(A, !rank ? 2 : 4, !rank ? 
2 : 4, PETSC_DETERMINE, > PETSC_DETERMINE));* > * PetscCall(MatSetFromOptions(A));* > * if (rank == 0) {* > * PetscCall(MatSetValue(A, 0, 0, 2.0, ADD_VALUES));* > * PetscCall(MatSetValue(A, 0, 1, -1.0, ADD_VALUES));* > * PetscCall(MatSetValue(A, 1, 1, 3.0, ADD_VALUES));* > * PetscCall(MatSetValue(A, 1, 2, -1.0, ADD_VALUES));* > * } else if (rank == 1) {* > * PetscCall(MatSetValue(A, 1, 2, 40.0, ADD_VALUES));//Additional line > added* > * PetscCall(MatSetValue(A, 2, 2, 4.0, ADD_VALUES));* > * PetscCall(MatSetValue(A, 2, 3, -1.0, ADD_VALUES));* > * PetscCall(MatSetValue(A, 3, 3, 5.0, ADD_VALUES));* > * PetscCall(MatSetValue(A, 3, 4, -1.0, ADD_VALUES));* > * PetscCall(MatSetValue(A, 4, 4, 6.0, ADD_VALUES));* > * PetscCall(MatSetValue(A, 4, 5, -1.0, ADD_VALUES));* > * PetscCall(MatSetValue(A, 5, 5, 7.0, ADD_VALUES));* > * PetscCall(MatSetValue(A, 5, 4, -0.5, ADD_VALUES));* > * }* > * PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));* > * PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));* > * PetscCall(MatView(A, PETSC_VIEWER_STDOUT_WORLD));* > > * PetscCall(MatDestroy(&A));* > * PetscCall(PetscFinalize());* > * return 0;* > *}* > > Rank 0 and 1 own 2 (from 0 to 1) and 4 (from 2 to 5) local rows > respectively. > I tried to add > * PetscCall(MatSetValue(A, 1, 2, 40.0, ADD_VALUES));* > for rank 1. //1 is owned by rank 0 only but is also modified in rank 1 > now. > After adding this line, a memory leak occurred. Does this imply that we > cannot assign values to entries owned by other processors? > Yes, you can always assign to remote entries owned by others. BTW, I ran your small test and didn't see memory leaks. In my case, I am assembling a global matrix from a DMPlex. With overlap=0, > it seems necessary to use MatSetValues for rows owned by other processes. > I'm not certain whether these two scenarios are equivalent, but they both > appear to trigger the same memory leak. > Did I miss something? 
> > Thanks a lot, > > Xiaodong > > ==3932339== 96 bytes in 1 blocks are definitely lost in loss record 2 of 3 > ==3932339== at 0x4C392E1: malloc (vg_replace_malloc.c:446) > ==3932339== by 0xA267A97: MPL_malloc (mpl_trmem.h:373) > ==3932339== by 0xA267CE4: MPIR_Datatype_set_contents > (mpir_datatype.h:420) > ==3932339== by 0xA26E26E: MPIR_Type_create_struct_impl > (type_create.c:919) > ==3932339== by 0xA068F45: internal_Type_create_struct > (c_binding.c:36491) > ==3932339== by 0xA06911F: PMPI_Type_create_struct (c_binding.c:36551) > ==3932339== by 0x4E78222: PMPI_Type_create_struct (libmpiwrap.c:2752) > ==3932339== by 0x6F5C7D6: MatStashBlockTypeSetUp (matstash.c:772) > ==3932339== by 0x6F61162: MatStashScatterBegin_BTS (matstash.c:838) > ==3932339== by 0x6F54511: MatStashScatterBegin_Private (matstash.c:437) > ==3932339== by 0x60BA9BD: MatAssemblyBegin_MPI_Hash (mpihashmat.h:59) > ==3932339== by 0x6E8D3AE: MatAssemblyBegin (matrix.c:5749) > ==3932339== > ==3932339== 96 bytes in 1 blocks are definitely lost in loss record 3 of 3 > ==3932339== at 0x4C392E1: malloc (vg_replace_malloc.c:446) > ==3932339== by 0xA2385FF: MPL_malloc (mpl_trmem.h:373) > ==3932339== by 0xA2395F8: MPII_Dataloop_alloc_and_copy (dataloop.c:400) > ==3932339== by 0xA2393DC: MPII_Dataloop_alloc (dataloop.c:319) > ==3932339== by 0xA23B239: MPIR_Dataloop_create_contiguous > (dataloop_create_contig.c:56) > ==3932339== by 0xA23BFF9: MPIR_Dataloop_create_indexed > (dataloop_create_indexed.c:89) > ==3932339== by 0xA23D5DC: create_basic_all_bytes_struct > (dataloop_create_struct.c:252) > ==3932339== by 0xA23D178: MPIR_Dataloop_create_struct > (dataloop_create_struct.c:146) > ==3932339== by 0xA25D4DB: MPIR_Typerep_commit > (typerep_dataloop_commit.c:284) > ==3932339== by 0xA261549: MPIR_Type_commit_impl (datatype_impl.c:185) > ==3932339== by 0xA0624CA: internal_Type_commit (c_binding.c:34506) > ==3932339== by 0xA062679: PMPI_Type_commit (c_binding.c:34553) > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From liufield at gmail.com Mon Jun 9 21:48:46 2025 From: liufield at gmail.com (neil liu) Date: Mon, 9 Jun 2025 22:48:46 -0400 Subject: [petsc-users] Memory leak related to MatSetValue In-Reply-To: <427D9F44-1B6C-4E8B-8A68-ED4AE4A66F33@petsc.dev> References: <427D9F44-1B6C-4E8B-8A68-ED4AE4A66F33@petsc.dev> Message-ID: Thanks a lot for your information. Very helpful. Have a good night. On Mon, Jun 9, 2025 at 9:56?PM Barry Smith wrote: > > src/mat/utils/matstash.c > > In MatStashBlockTypeSetUp() > > PetscCallMPI(MPI_Type_create_resized(stype, 0, stash->blocktype_size, > &stash->blocktype)); > > In MatStashScatterDestroy_BTS(MatStash *stash) > > if (stash->blocktype != MPI_DATATYPE_NULL) > PetscCallMPI(MPI_Type_free(&stash->blocktype)); > > So either > > 1) PETSc logic is preventing the correct MPI_Type_free() call from being > made or > > 2) a bug has crept into MPICH that prevents the MPI_Type_free from freeing > everything it needs to > > You can use the debugger (or even print statements inserted in the PETSc > source) to determine in your very simple code > that the MPI_Type_create_resized() is called exactly once and also the > matching MPI_Type_free() to determine if the problem is PETSc logic or > MPICH logic. 
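A standalone sketch of that check, outside PETSc, using only the MPI datatype calls that appear in the valgrind stack; the Entry layout below is invented for illustration and is not the actual block type MatStashBlockTypeSetUp() builds:

```c
/* Minimal skeleton: build a struct type, resize it, free the intermediate
   type, commit and free the resized type, then run under valgrind against
   the same MPICH build. If the MPIR_Datatype_set_contents allocation is
   still reported as "definitely lost", the retention is inside MPICH; if
   not, the question is whether PETSc reaches the matching MPI_Type_free()
   in the failing run. */
#include <mpi.h>
#include <stddef.h>

typedef struct {
  int    row, col;  /* illustrative fields only */
  double val;
} Entry;

int main(int argc, char **argv)
{
  MPI_Datatype stype, btype;
  int          blocklens[2] = {2, 1};
  MPI_Aint     displs[2]    = {offsetof(Entry, row), offsetof(Entry, val)};
  MPI_Datatype types[2]     = {MPI_INT, MPI_DOUBLE};

  MPI_Init(&argc, &argv);
  MPI_Type_create_struct(2, blocklens, displs, types, &stype);
  MPI_Type_create_resized(stype, 0, (MPI_Aint)sizeof(Entry), &btype);
  MPI_Type_free(&stype);   /* the derived btype keeps its own internal reference */
  MPI_Type_commit(&btype);
  /* ... point-to-point traffic with btype would go here ... */
  MPI_Type_free(&btype);   /* the matching free referred to above */
  MPI_Finalize();
  return 0;
}
```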
> > Since we test PETSc in our CI with valgrind it is unlikely a PETSc bug > > Barry > > > On Jun 9, 2025, at 7:47?PM, neil liu wrote: > > Dear Petsc community, > Recently, I encountered a memory leak while using Valgrind (3.25.1) with > MPI (2 processes, MPICH 4.21.1) to test a PETSc-based code, which is a > variant of the pestc's built-in example. > > *#include * > *static char help[] = "Demonstrate PCFIELDSPLIT after MatZeroRowsColumns() > inside PCREDISTRIBUTE";* > *int main(int argc, char **argv)* > *{* > * PetscMPIInt rank, size;* > * Mat A;* > > * PetscCall(PetscInitialize(&argc, &argv, NULL, help));* > * PetscCallMPI(MPI_Comm_size(PETSC_COMM_WORLD, &size));* > * PetscCallMPI(MPI_Comm_rank(PETSC_COMM_WORLD, &rank));* > * PetscCheck(size == 2, PETSC_COMM_WORLD, PETSC_ERR_WRONG_MPI_SIZE, "Must > be run with 2 MPI processes");* > > * // Set up a small problem with 2 dofs on rank 0 and 4 on rank 1* > * PetscCall(MatCreate(PETSC_COMM_WORLD, &A));* > * PetscCall(MatSetSizes(A, !rank ? 2 : 4, !rank ? 2 : 4, PETSC_DETERMINE, > PETSC_DETERMINE));* > * PetscCall(MatSetFromOptions(A));* > * if (rank == 0) {* > * PetscCall(MatSetValue(A, 0, 0, 2.0, ADD_VALUES));* > * PetscCall(MatSetValue(A, 0, 1, -1.0, ADD_VALUES));* > * PetscCall(MatSetValue(A, 1, 1, 3.0, ADD_VALUES));* > * PetscCall(MatSetValue(A, 1, 2, -1.0, ADD_VALUES));* > * } else if (rank == 1) {* > * PetscCall(MatSetValue(A, 1, 2, 40.0, ADD_VALUES));//Additional line > added* > * PetscCall(MatSetValue(A, 2, 2, 4.0, ADD_VALUES));* > * PetscCall(MatSetValue(A, 2, 3, -1.0, ADD_VALUES));* > * PetscCall(MatSetValue(A, 3, 3, 5.0, ADD_VALUES));* > * PetscCall(MatSetValue(A, 3, 4, -1.0, ADD_VALUES));* > * PetscCall(MatSetValue(A, 4, 4, 6.0, ADD_VALUES));* > * PetscCall(MatSetValue(A, 4, 5, -1.0, ADD_VALUES));* > * PetscCall(MatSetValue(A, 5, 5, 7.0, ADD_VALUES));* > * PetscCall(MatSetValue(A, 5, 4, -0.5, ADD_VALUES));* > * }* > * PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));* > * PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));* > * PetscCall(MatView(A, PETSC_VIEWER_STDOUT_WORLD));* > > * PetscCall(MatDestroy(&A));* > * PetscCall(PetscFinalize());* > * return 0;* > *}* > > Rank 0 and 1 own 2 (from 0 to 1) and 4 (from 2 to 5) local rows > respectively. > I tried to add > * PetscCall(MatSetValue(A, 1, 2, 40.0, ADD_VALUES));* > for rank 1. //1 is owned by rank 0 only but is also modified in rank 1 > now. > After adding this line, a memory leak occurred. Does this imply that we > cannot assign values to entries owned by other processors? In my case, I am > assembling a global matrix from a DMPlex. With overlap=0, it seems > necessary to use MatSetValues for rows owned by other processes. I'm not > certain whether these two scenarios are equivalent, but they both appear to > trigger the same memory leak. > Did I miss something? 
> > Thanks a lot, > > Xiaodong > > ==3932339== 96 bytes in 1 blocks are definitely lost in loss record 2 of 3 > ==3932339== at 0x4C392E1: malloc (vg_replace_malloc.c:446) > ==3932339== by 0xA267A97: MPL_malloc (mpl_trmem.h:373) > ==3932339== by 0xA267CE4: MPIR_Datatype_set_contents > (mpir_datatype.h:420) > ==3932339== by 0xA26E26E: MPIR_Type_create_struct_impl > (type_create.c:919) > ==3932339== by 0xA068F45: internal_Type_create_struct > (c_binding.c:36491) > ==3932339== by 0xA06911F: PMPI_Type_create_struct (c_binding.c:36551) > ==3932339== by 0x4E78222: PMPI_Type_create_struct (libmpiwrap.c:2752) > ==3932339== by 0x6F5C7D6: MatStashBlockTypeSetUp (matstash.c:772) > ==3932339== by 0x6F61162: MatStashScatterBegin_BTS (matstash.c:838) > ==3932339== by 0x6F54511: MatStashScatterBegin_Private (matstash.c:437) > ==3932339== by 0x60BA9BD: MatAssemblyBegin_MPI_Hash (mpihashmat.h:59) > ==3932339== by 0x6E8D3AE: MatAssemblyBegin (matrix.c:5749) > ==3932339== > ==3932339== 96 bytes in 1 blocks are definitely lost in loss record 3 of 3 > ==3932339== at 0x4C392E1: malloc (vg_replace_malloc.c:446) > ==3932339== by 0xA2385FF: MPL_malloc (mpl_trmem.h:373) > ==3932339== by 0xA2395F8: MPII_Dataloop_alloc_and_copy (dataloop.c:400) > ==3932339== by 0xA2393DC: MPII_Dataloop_alloc (dataloop.c:319) > ==3932339== by 0xA23B239: MPIR_Dataloop_create_contiguous > (dataloop_create_contig.c:56) > ==3932339== by 0xA23BFF9: MPIR_Dataloop_create_indexed > (dataloop_create_indexed.c:89) > ==3932339== by 0xA23D5DC: create_basic_all_bytes_struct > (dataloop_create_struct.c:252) > ==3932339== by 0xA23D178: MPIR_Dataloop_create_struct > (dataloop_create_struct.c:146) > ==3932339== by 0xA25D4DB: MPIR_Typerep_commit > (typerep_dataloop_commit.c:284) > ==3932339== by 0xA261549: MPIR_Type_commit_impl (datatype_impl.c:185) > ==3932339== by 0xA0624CA: internal_Type_commit (c_binding.c:34506) > ==3932339== by 0xA062679: PMPI_Type_commit (c_binding.c:34553) > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexandre.scotto at irt-saintexupery.com Tue Jun 10 04:30:16 2025 From: alexandre.scotto at irt-saintexupery.com (SCOTTO Alexandre) Date: Tue, 10 Jun 2025 09:30:16 +0000 Subject: [petsc-users] Profiling objects creation Message-ID: <44a85f8ad37c480c9bbb5d98975acce2@irt-saintexupery.com> Dear PETSc community, I am using the PETSc API petsc4py and I am interested in profiling the number of PETSc objects created during the run of my script. I have used the -log_view option to get information on the run in a dedicated file, and I got this (among other infirmation): [cid:image001.png at 01DBD9FB.09669CA0] This is precisely the information I am interested in, but here only the process 0 is tracked. I is possible to have the same type of information for all the processes? I went through the documentation and was not able to figure it out. Thanks in advance, Regards. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: image001.png Type: image/png Size: 6518 bytes Desc: image001.png URL: From knepley at gmail.com Tue Jun 10 06:27:41 2025 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 10 Jun 2025 07:27:41 -0400 Subject: [petsc-users] Profiling objects creation In-Reply-To: <44a85f8ad37c480c9bbb5d98975acce2@irt-saintexupery.com> References: <44a85f8ad37c480c9bbb5d98975acce2@irt-saintexupery.com> Message-ID: On Tue, Jun 10, 2025 at 5:30?AM SCOTTO Alexandre via petsc-users < petsc-users at mcs.anl.gov> wrote: > Dear PETSc community, > > > > I am using the PETSc API petsc4py and I am interested in profiling the > number of PETSc objects created during the run of my script. > > > > I have used the -log_view option to get information on the run in a > dedicated file, and I got this (among other infirmation): > > > > > > This is precisely the information I am interested in, but here only the > process 0 is tracked. > > > > I is possible to have the same type of information for all the processes? > I went through the documentation and was not able to figure it out. > You want to access https://urldefense.us/v3/__https://petsc.org/main/manualpages/Log/PetscLogState/__;!!G_uCfscf7eWS!YfTSxLTF2TerpXGAhCfoL7bYAcosI6xV6JXwYXOjaVkt88NjAYm_-_WFn7x-IoWWTraAZ1iXE8ZCAY8QEn26$ which you get using https://urldefense.us/v3/__https://petsc.org/main/manualpages/Log/PetscLogGetState/__;!!G_uCfscf7eWS!YfTSxLTF2TerpXGAhCfoL7bYAcosI6xV6JXwYXOjaVkt88NjAYm_-_WFn7x-IoWWTraAZ1iXE8ZCASEp9DWe$ Thanks, Matt > Thanks in advance, > > Regards. > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!YfTSxLTF2TerpXGAhCfoL7bYAcosI6xV6JXwYXOjaVkt88NjAYm_-_WFn7x-IoWWTraAZ1iXE8ZCAXQMXkrF$ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 6518 bytes Desc: not available URL: From alexandre.scotto at irt-saintexupery.com Tue Jun 10 06:44:49 2025 From: alexandre.scotto at irt-saintexupery.com (SCOTTO Alexandre) Date: Tue, 10 Jun 2025 11:44:49 +0000 Subject: [petsc-users] Profiling objects creation In-Reply-To: References: <44a85f8ad37c480c9bbb5d98975acce2@irt-saintexupery.com> Message-ID: <7bb008e41faa40938127d951c73f1de5@irt-saintexupery.com> Hello Mat, Thanks for your answer. I think this object is not available in petsc4py, at least I do not see any related elements in the documentation https://urldefense.us/v3/__https://petsc.org/main/petsc4py/reference/petsc4py.PETSc.html__;!!G_uCfscf7eWS!cDXOAoWknEhkWKDx0ghSlgUUMywVuLUCTz5sHEaHAVIJB_gsVlwjIfTb3qCF-mlz87Pdu0YEgddFhM80kaKVUDU_2z4ZrUvnN4KnKmo1uA$ So maybe this is not accessible in Python. Regards, Alexandre. De : Matthew Knepley Envoy? : mardi 10 juin 2025 13:28 ? : SCOTTO Alexandre Cc : petsc-users at mcs.anl.gov Objet : Re: [petsc-users] Profiling objects creation On Tue, Jun 10, 2025 at 5:30?AM SCOTTO Alexandre via petsc-users > wrote: Dear PETSc community, I am using the PETSc API petsc4py and I am interested in profiling the number of PETSc objects created during the run of my script. 
I have used the -log_view option to get information on the run in a dedicated file, and I got this (among other infirmation): [cid:image001.png at 01DBDA0D.D4F90080] This is precisely the information I am interested in, but here only the process 0 is tracked. I is possible to have the same type of information for all the processes? I went through the documentation and was not able to figure it out. You want to access https://urldefense.us/v3/__https://petsc.org/main/manualpages/Log/PetscLogState/__;!!G_uCfscf7eWS!cDXOAoWknEhkWKDx0ghSlgUUMywVuLUCTz5sHEaHAVIJB_gsVlwjIfTb3qCF-mlz87Pdu0YEgddFhM80kaKVUDU_2z4ZrUvnN4K-TsOjuw$ which you get using https://urldefense.us/v3/__https://petsc.org/main/manualpages/Log/PetscLogGetState/__;!!G_uCfscf7eWS!cDXOAoWknEhkWKDx0ghSlgUUMywVuLUCTz5sHEaHAVIJB_gsVlwjIfTb3qCF-mlz87Pdu0YEgddFhM80kaKVUDU_2z4ZrUvnN4LDDoJ1Ng$ Thanks, Matt Thanks in advance, Regards. -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!cDXOAoWknEhkWKDx0ghSlgUUMywVuLUCTz5sHEaHAVIJB_gsVlwjIfTb3qCF-mlz87Pdu0YEgddFhM80kaKVUDU_2z4ZrUvnN4IYMkm07A$ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 6518 bytes Desc: image001.png URL: From knepley at gmail.com Tue Jun 10 08:33:41 2025 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 10 Jun 2025 09:33:41 -0400 Subject: [petsc-users] Profiling objects creation In-Reply-To: <7bb008e41faa40938127d951c73f1de5@irt-saintexupery.com> References: <44a85f8ad37c480c9bbb5d98975acce2@irt-saintexupery.com> <7bb008e41faa40938127d951c73f1de5@irt-saintexupery.com> Message-ID: On Tue, Jun 10, 2025 at 7:44?AM SCOTTO Alexandre < alexandre.scotto at irt-saintexupery.com> wrote: > Hello Mat, > > > > Thanks for your answer. I think this object is not available in petsc4py, > at least I do not see any related elements in the documentation > https://urldefense.us/v3/__https://petsc.org/main/petsc4py/reference/petsc4py.PETSc.html__;!!G_uCfscf7eWS!cksrPnF_zRr19WbGYSajXPttTjWjk6YeKgg0Gp8Lk-nyTXp1XdibretqDhGHAAPWTauMuS1SLlgxba-1ldY8$ > > > > So maybe this is not accessible in Python. > Yes, it looks like the new LogHandler interface, necessary for nested logging like flamegraphs, is not in petsc4py. We will have to add it. Thanks, Matt > > > Regards, > > Alexandre. > > > > *De :* Matthew Knepley > *Envoy? :* mardi 10 juin 2025 13:28 > *? :* SCOTTO Alexandre > *Cc :* petsc-users at mcs.anl.gov > *Objet :* Re: [petsc-users] Profiling objects creation > > > > On Tue, Jun 10, 2025 at 5:30?AM SCOTTO Alexandre via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Dear PETSc community, > > > > I am using the PETSc API petsc4py and I am interested in profiling the > number of PETSc objects created during the run of my script. > > > > I have used the -log_view option to get information on the run in a > dedicated file, and I got this (among other infirmation): > > > > > > This is precisely the information I am interested in, but here only the > process 0 is tracked. > > > > I is possible to have the same type of information for all the processes? > I went through the documentation and was not able to figure it out. 
> > > > You want to access > > > > https://urldefense.us/v3/__https://petsc.org/main/manualpages/Log/PetscLogState/__;!!G_uCfscf7eWS!cksrPnF_zRr19WbGYSajXPttTjWjk6YeKgg0Gp8Lk-nyTXp1XdibretqDhGHAAPWTauMuS1SLlgxbRt_afMq$ > > > > which you get using > > > > https://urldefense.us/v3/__https://petsc.org/main/manualpages/Log/PetscLogGetState/__;!!G_uCfscf7eWS!cksrPnF_zRr19WbGYSajXPttTjWjk6YeKgg0Gp8Lk-nyTXp1XdibretqDhGHAAPWTauMuS1SLlgxbelDML9w$ > > > > Thanks, > > > > Matt > > > > Thanks in advance, > > Regards. > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!cksrPnF_zRr19WbGYSajXPttTjWjk6YeKgg0Gp8Lk-nyTXp1XdibretqDhGHAAPWTauMuS1SLlgxbbSWAeiw$ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!cksrPnF_zRr19WbGYSajXPttTjWjk6YeKgg0Gp8Lk-nyTXp1XdibretqDhGHAAPWTauMuS1SLlgxbbSWAeiw$ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 6518 bytes Desc: not available URL: From sblondel at utk.edu Tue Jun 10 15:23:10 2025 From: sblondel at utk.edu (Blondel, Sophie) Date: Tue, 10 Jun 2025 20:23:10 +0000 Subject: [petsc-users] Error installing PETSc with --download-f2cblaslapack In-Reply-To: References: <3E85CAFE-748D-4F71-9DDF-41505C7271BF@petsc.dev> <6da6f377-936d-4e4b-b8d1-355b5c413699@email.android.com> Message-ID: Thank you for the suggestion, Getting openblas through conda fixed my issue! Best, Sophie ________________________________ From: Stefano Zampini Sent: Saturday, June 7, 2025 02:22 To: Blondel, Sophie Cc: Barry Smith ; PETSc users list Subject: Re: [petsc-users] Error installing PETSc with --download-f2cblaslapack You don't often get email from stefano.zampini at gmail.com. Learn why this is important If conda is providing the external dependencies, why do you want download-f2cblaslapack and not use a conda provided one? Mkl or openblas for example Stefano On Sat, Jun 7, 2025, 02:20 Blondel, Sophie via petsc-users > wrote: Correct, I tried a few different python versions and mpi versions, conda is providing the external dependencies. Maybe it wrongly points to some base builds instead of the specific environment I set up. Best, Sophie On Jun 6, 2025 7:12 PM, Barry Smith > wrote: Send configure.log I assume you tried rerunning a few times? It seems like possibly a flaky filesystem problem. 
Barry On Jun 6, 2025, at 6:56?PM, Blondel, Sophie via petsc-users > wrote: Hi, I am getting an error when trying to install PETSc with --download-f2cblaslapack on Linux: ============================================================================================= Installing F2CBLASLAPACK; this may take several minutes ============================================================================================= ********************************************************************************************* UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): --------------------------------------------------------------------------------------------- Error moving /home/sophie/Workspace/xolotl-stable-source/external/petsc/arch-linux-c-opt/externalpackages/f2cblaslapack-3.8.0.q2 libraries With the error in configure.log (which is accurate, the file is not present in the folder): ============================================================================================= Installing F2CBLASLAPACK; this may take several minutes ============================================================================================= Executing: ['mkdir', '-p', '/home/sophie/Workspace/xolotl-stable-build/external/petsc_install/lib'] Executing: ['cp', '-f', 'libf2clapack.a', 'libf2cblas.a', '/home/sophie/Workspace/xolotl-stable-build/external/petsc_install/lib'] stdout: cp: cannot stat 'libf2clapack.a': No such file or directory Error moving /home/sophie/Workspace/xolotl-stable-source/external/petsc/arch-linux-c-opt/externalpackages/f2cblaslapack-3.8.0.q2 libraries: Could not execute "[['mkdir', '-p', '/home/sophie/Workspace/xolotl-stable-build/external/petsc_install/lib'], ['cp', '-f', 'libf2clapack.a', 'libf2cblas.a', '/home/sophie/Workspace/xolotl-stable-build/external/petsc_install/lib']]": cp: cannot stat 'libf2clapack.a': No such file or directory The configure command is: ./configure --prefix=/home/sophie/Workspace/xolotl-stable-build/external/petsc_install --with-fc=0 --with-cuda=0 --with-mpi --with-openmp=0 --with-debugging=0 --with-shared-libraries --with-64-bit-indices --download-kokkos --download-kokkos-kernels --download-hdf5 --download-hdf5-configure-arguments=--enable-parallel --download-boost --download-f2cblaslapack --COPTFLAGS=-O3 --CXXOPTFLAGS=-O3 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ The version of PETSc you are using is out-of-date, we recommend updating to the new release Available Version: 3.23.3 Installed Version: 3.22.2 Let me know what additional information I can provide to help identify the issue. Best, Sophie -------------- next part -------------- An HTML attachment was scrubbed... URL: From ali.ali_ahmad at utt.fr Wed Jun 11 03:45:27 2025 From: ali.ali_ahmad at utt.fr (Ali ALI AHMAD) Date: Wed, 11 Jun 2025 10:45:27 +0200 (CEST) Subject: [petsc-users] norm L2 problemQuestion about changing the norm used in nonlinear solvers (L2 Euclidean vs. L2 Lebesgue) Message-ID: <414475981.6714047.1749631527145.JavaMail.zimbra@utt.fr> Dear PETSc team, I hope this message finds you well. I am currently using PETSc in a C++, where I rely on the nonlinear solvers `SNES` with either `newtonls` or `newtontr` methods. I would like to ask if it is possible to change the default norm used (typically the L2 Euclidean norm) to a custom norm, specifically the L2 norm in the sense of Lebesgue (e.g., involving cell-wise weighted integrals over the domain). 
My main goal is to define a custom residual norm that better reflects the physical quantities of interest in my simulation. Would this be feasible within the PETSc framework? If so, could you point me to the recommended approach (e.g., redefining the norm manually, using specific PETSc hooks or options)? Thank you very much in advance for your help and for the great work on PETSc! Best regards, Ali ALI AHMAD PhD Student University of Technology of Troyes - UTT - France GAMMA3 Project - Office H008 - Phone No: +33 7 67 44 68 18 12 rue Marie Curie - CS 42060 10004 TROYES Cedex -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Thu Jun 12 03:57:52 2025 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 12 Jun 2025 04:57:52 -0400 Subject: [petsc-users] Questions Regarding PETSc and Solving Constrained Structural Mechanics Problems In-Reply-To: <3BF3C1E8-0CB0-42F1-A624-8FA0DC7FD4A4@buaa.edu.cn> References: <3BF3C1E8-0CB0-42F1-A624-8FA0DC7FD4A4@buaa.edu.cn> Message-ID: Adding this to the PETSc mailing list, On Thu, Jun 12, 2025 at 3:43?AM hexioafeng wrote: > > Dear Professor, > > I hope this message finds you well. > > I am an employee at a CAE company and a heavy user of the PETSc library. I > would like to thank you for your contributions to PETSc and express my deep > appreciation for your work. > > Recently, I encountered some difficulties when using PETSc to solve > structural mechanics problems with Lagrange multiplier constraints. After > searching extensively online and reviewing several papers, I found your > previous paper titled "*Algebraic multigrid methods for constrained > linear systems with applications to contact problems in solid mechanics*" > seems to be the most relevant and helpful. > > The stiffness matrix I'm working with, *K*, is a block saddle-point > matrix of the form (A00 A01; A10 0), where *A00 is singular*?just as > described in your paper, and different from many other articles . I have a > few questions regarding your work and would greatly appreciate your > insights: > > 1. Is the *AMG/KKT* method presented in your paper available in PETSc? I > tried using *CG+GAMG* directly but received a *KSP_DIVERGED_PC_FAILED* > error. I also attempted to use *CG+PCFIELDSPLIT* with the following > options: > No > > -pc_type fieldsplit -pc_fieldsplit_detect_saddle_point > -pc_fieldsplit_type schur -pc_fieldsplit_schur_precondition selfp > -pc_fieldsplit_schur_fact_type full -fieldsplit_0_ksp_type preonly > -fieldsplit_0_pc_type gamg -fieldsplit_1_ksp_type preonly > -fieldsplit_1_pc_type bjacobi > > Unfortunately, this also resulted in a *KSP_DIVERGED_PC_FAILED* error. > Do you have any suggestions? > > 2. In your paper, you compare the method with *Uzawa*-type approaches. To > my understanding, Uzawa methods typically require A00 to be invertible. How > did you handle the singularity of A00 to construct an M-matrix that is > invertible? > > You add a regularization term like A01 * A10 (like springs). See the paper or any reference to augmented lagrange or Uzawa 3. Can i implement the AMG/KKT method in your paper using existing *AMG > APIs*? Implementing a production-level AMG solver from scratch would be > quite challenging for me, so I?m hoping to utilize existing AMG interfaces > within PETSc or other packages. > > You can do Uzawa and make the regularization matrix with matrix-matrix products. Just use AMG for the A00 block. > 4. 
For saddle-point systems where A00 is singular, can you recommend any > more robust or efficient solutions? Alternatively, are you aware of any > open-source software packages that can handle such cases out-of-the-box? > > No, and I don't think PETSc can do this out-of-the-box, but others may be able to give you a better idea of what PETSc can do. I think PETSc can do Uzawa or other similar algorithms but it will not do the regularization automatically (it is a bit more complicated than just A01 * A10) Thanks, Mark > > Thank you very much for taking the time to read my email. Looking forward > to hearing from you. > > > > Sincerely, > > Xiaofeng He > ----------------------------------------------------- > > Research Engineer > > Internet Based Engineering, Beijing, China > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hexiaofeng at buaa.edu.cn Thu Jun 12 04:08:31 2025 From: hexiaofeng at buaa.edu.cn (hexioafeng) Date: Thu, 12 Jun 2025 17:08:31 +0800 Subject: [petsc-users] Questions Regarding PETSc and Solving Constrained Structural Mechanics Problems In-Reply-To: References: <3BF3C1E8-0CB0-42F1-A624-8FA0DC7FD4A4@buaa.edu.cn> Message-ID: Dear sir, Thanks for your thorough and helpful reply, and I will try the Uzawa method first. Best regards, Xiaofeng > On Jun 12, 2025, at 16:57, Mark Adams wrote: > > Adding this to the PETSc mailing list, > > On Thu, Jun 12, 2025 at 3:43?AM hexioafeng > wrote: >> >> Dear Professor, >> >> I hope this message finds you well. >> >> I am an employee at a CAE company and a heavy user of the PETSc library. I would like to thank you for your contributions to PETSc and express my deep appreciation for your work. >> >> Recently, I encountered some difficulties when using PETSc to solve structural mechanics problems with Lagrange multiplier constraints. After searching extensively online and reviewing several papers, I found your previous paper titled "Algebraic multigrid methods for constrained linear systems with applications to contact problems in solid mechanics" seems to be the most relevant and helpful. >> >> The stiffness matrix I'm working with, K, is a block saddle-point matrix of the form (A00 A01; A10 0), where A00 is singular?just as described in your paper, and different from many other articles . I have a few questions regarding your work and would greatly appreciate your insights: >> >> 1. Is the AMG/KKT method presented in your paper available in PETSc? I tried using CG+GAMG directly but received a KSP_DIVERGED_PC_FAILED error. I also attempted to use CG+PCFIELDSPLIT with the following options: > > No > >> >> -pc_type fieldsplit -pc_fieldsplit_detect_saddle_point -pc_fieldsplit_type schur -pc_fieldsplit_schur_precondition selfp -pc_fieldsplit_schur_fact_type full -fieldsplit_0_ksp_type preonly -fieldsplit_0_pc_type gamg -fieldsplit_1_ksp_type preonly -fieldsplit_1_pc_type bjacobi >> >> Unfortunately, this also resulted in a KSP_DIVERGED_PC_FAILED error. Do you have any suggestions? >> >> 2. In your paper, you compare the method with Uzawa-type approaches. To my understanding, Uzawa methods typically require A00 to be invertible. How did you handle the singularity of A00 to construct an M-matrix that is invertible? >> > > You add a regularization term like A01 * A10 (like springs). See the paper or any reference to augmented lagrange or Uzawa > > >> 3. Can i implement the AMG/KKT method in your paper using existing AMG APIs? 
Implementing a production-level AMG solver from scratch would be quite challenging for me, so I?m hoping to utilize existing AMG interfaces within PETSc or other packages. >> > > You can do Uzawa and make the regularization matrix with matrix-matrix products. Just use AMG for the A00 block. > > >> 4. For saddle-point systems where A00 is singular, can you recommend any more robust or efficient solutions? Alternatively, are you aware of any open-source software packages that can handle such cases out-of-the-box? >> > > No, and I don't think PETSc can do this out-of-the-box, but others may be able to give you a better idea of what PETSc can do. > I think PETSc can do Uzawa or other similar algorithms but it will not do the regularization automatically (it is a bit more complicated than just A01 * A10) > > Thanks, > Mark >> >> Thank you very much for taking the time to read my email. Looking forward to hearing from you. >> >> >> >> Sincerely, >> >> Xiaofeng He >> ----------------------------------------------------- >> >> Research Engineer >> >> Internet Based Engineering, Beijing, China -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Jun 12 07:43:57 2025 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 12 Jun 2025 08:43:57 -0400 Subject: [petsc-users] Questions Regarding PETSc and Solving Constrained Structural Mechanics Problems In-Reply-To: References: <3BF3C1E8-0CB0-42F1-A624-8FA0DC7FD4A4@buaa.edu.cn> Message-ID: On Thu, Jun 12, 2025 at 4:58?AM Mark Adams wrote: > Adding this to the PETSc mailing list, > > On Thu, Jun 12, 2025 at 3:43?AM hexioafeng wrote: > >> >> Dear Professor, >> >> I hope this message finds you well. >> >> I am an employee at a CAE company and a heavy user of the PETSc library. >> I would like to thank you for your contributions to PETSc and express my >> deep appreciation for your work. >> >> Recently, I encountered some difficulties when using PETSc to solve >> structural mechanics problems with Lagrange multiplier constraints. After >> searching extensively online and reviewing several papers, I found your >> previous paper titled "*Algebraic multigrid methods for constrained >> linear systems with applications to contact problems in solid mechanics*" >> seems to be the most relevant and helpful. >> >> The stiffness matrix I'm working with, *K*, is a block saddle-point >> matrix of the form (A00 A01; A10 0), where *A00 is singular*?just as >> described in your paper, and different from many other articles . I have a >> few questions regarding your work and would greatly appreciate your >> insights: >> >> 1. Is the *AMG/KKT* method presented in your paper available in PETSc? I >> tried using *CG+GAMG* directly but received a *KSP_DIVERGED_PC_FAILED* >> error. I also attempted to use *CG+PCFIELDSPLIT* with the following >> options: >> > > No > > >> >> -pc_type fieldsplit -pc_fieldsplit_detect_saddle_point >> -pc_fieldsplit_type schur -pc_fieldsplit_schur_precondition selfp >> -pc_fieldsplit_schur_fact_type full -fieldsplit_0_ksp_type preonly >> -fieldsplit_0_pc_type gamg -fieldsplit_1_ksp_type preonly >> -fieldsplit_1_pc_type bjacobi >> >> Unfortunately, this also resulted in a *KSP_DIVERGED_PC_FAILED* >> error. Do you have any suggestions? >> >> 2. In your paper, you compare the method with *Uzawa*-type approaches. >> To my understanding, Uzawa methods typically require A00 to be invertible. >> How did you handle the singularity of A00 to construct an M-matrix that is >> invertible? 
>> >> > You add a regularization term like A01 * A10 (like springs). See the paper > or any reference to augmented lagrange or Uzawa > > > 3. Can i implement the AMG/KKT method in your paper using existing *AMG >> APIs*? Implementing a production-level AMG solver from scratch would be >> quite challenging for me, so I?m hoping to utilize existing AMG interfaces >> within PETSc or other packages. >> >> > You can do Uzawa and make the regularization matrix with matrix-matrix > products. Just use AMG for the A00 block. > > > >> 4. For saddle-point systems where A00 is singular, can you recommend any >> more robust or efficient solutions? Alternatively, are you aware of any >> open-source software packages that can handle such cases out-of-the-box? >> >> > No, and I don't think PETSc can do this out-of-the-box, but others may be > able to give you a better idea of what PETSc can do. > I think PETSc can do Uzawa or other similar algorithms but it will not do > the regularization automatically (it is a bit more complicated than just > A01 * A10) > One other trick you can use is to have -fieldsplit_0_mg_coarse_pc_type svd This will use SVD on the coarse grid of GAMG, which can handle the null space in A00 as long as the prolongation does not put it back in. I have used this for the Laplacian with Neumann conditions and for freely floating elastic problems. Thanks, Matt > Thanks, > Mark > >> >> Thank you very much for taking the time to read my email. Looking forward >> to hearing from you. >> >> >> >> Sincerely, >> >> Xiaofeng He >> ----------------------------------------------------- >> >> Research Engineer >> >> Internet Based Engineering, Beijing, China >> >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!bhA0xS1Lyq7mWlSN2W2572y18Q38WKonPljSMeGMVE5TY4Nw-ueWz0h78sUCB1AquAVUULPnogDQWoDKoLzb$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Thu Jun 12 07:50:51 2025 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 12 Jun 2025 08:50:51 -0400 Subject: [petsc-users] Questions Regarding PETSc and Solving Constrained Structural Mechanics Problems In-Reply-To: References: <3BF3C1E8-0CB0-42F1-A624-8FA0DC7FD4A4@buaa.edu.cn> Message-ID: On Thu, Jun 12, 2025 at 8:44?AM Matthew Knepley wrote: > On Thu, Jun 12, 2025 at 4:58?AM Mark Adams wrote: > >> Adding this to the PETSc mailing list, >> >> On Thu, Jun 12, 2025 at 3:43?AM hexioafeng >> wrote: >> >>> >>> Dear Professor, >>> >>> I hope this message finds you well. >>> >>> I am an employee at a CAE company and a heavy user of the PETSc library. >>> I would like to thank you for your contributions to PETSc and express my >>> deep appreciation for your work. >>> >>> Recently, I encountered some difficulties when using PETSc to solve >>> structural mechanics problems with Lagrange multiplier constraints. After >>> searching extensively online and reviewing several papers, I found your >>> previous paper titled "*Algebraic multigrid methods for constrained >>> linear systems with applications to contact problems in solid mechanics*" >>> seems to be the most relevant and helpful. >>> >>> The stiffness matrix I'm working with, *K*, is a block saddle-point >>> matrix of the form (A00 A01; A10 0), where *A00 is singular*?just as >>> described in your paper, and different from many other articles . 
I have a >>> few questions regarding your work and would greatly appreciate your >>> insights: >>> >>> 1. Is the *AMG/KKT* method presented in your paper available in PETSc? >>> I tried using *CG+GAMG* directly but received a *KSP_DIVERGED_PC_FAILED* >>> error. I also attempted to use *CG+PCFIELDSPLIT* with the following >>> options: >>> >> >> No >> >> >>> >>> -pc_type fieldsplit -pc_fieldsplit_detect_saddle_point >>> -pc_fieldsplit_type schur -pc_fieldsplit_schur_precondition selfp >>> -pc_fieldsplit_schur_fact_type full -fieldsplit_0_ksp_type preonly >>> -fieldsplit_0_pc_type gamg -fieldsplit_1_ksp_type preonly >>> -fieldsplit_1_pc_type bjacobi >>> >>> Unfortunately, this also resulted in a *KSP_DIVERGED_PC_FAILED* >>> error. Do you have any suggestions? >>> >>> 2. In your paper, you compare the method with *Uzawa*-type approaches. >>> To my understanding, Uzawa methods typically require A00 to be invertible. >>> How did you handle the singularity of A00 to construct an M-matrix that is >>> invertible? >>> >>> >> You add a regularization term like A01 * A10 (like springs). See the >> paper or any reference to augmented lagrange or Uzawa >> >> >> 3. Can i implement the AMG/KKT method in your paper using existing *AMG >>> APIs*? Implementing a production-level AMG solver from scratch would be >>> quite challenging for me, so I?m hoping to utilize existing AMG interfaces >>> within PETSc or other packages. >>> >>> >> You can do Uzawa and make the regularization matrix with matrix-matrix >> products. Just use AMG for the A00 block. >> >> >> >>> 4. For saddle-point systems where A00 is singular, can you recommend any >>> more robust or efficient solutions? Alternatively, are you aware of any >>> open-source software packages that can handle such cases out-of-the-box? >>> >>> >> No, and I don't think PETSc can do this out-of-the-box, but others may be >> able to give you a better idea of what PETSc can do. >> I think PETSc can do Uzawa or other similar algorithms but it will not do >> the regularization automatically (it is a bit more complicated than just >> A01 * A10) >> > > One other trick you can use is to have > > -fieldsplit_0_mg_coarse_pc_type svd > > This will use SVD on the coarse grid of GAMG, which can handle the null > space in A00 as long as the prolongation does not put it back in. I have > used this for the Laplacian with Neumann conditions and for freely floating > elastic problems. > > Good point. You can also use -pc_gamg_parallel_coarse_grid_solver to get GAMG to use a on level iterative solver for the coarse grid. > Thanks, > > Matt > > >> Thanks, >> Mark >> >>> >>> Thank you very much for taking the time to read my email. Looking >>> forward to hearing from you. >>> >>> >>> >>> Sincerely, >>> >>> Xiaofeng He >>> ----------------------------------------------------- >>> >>> Research Engineer >>> >>> Internet Based Engineering, Beijing, China >>> >>> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!cOziFPTYQhj3Q2QdRoQiCJ__wU-JwGk5zFk_C1QziJVJ9_DiXgLkG5d2QpCFzatvJgHtMFVSUfYjbKrho9pJS8A$ > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at petsc.dev Thu Jun 12 07:57:40 2025 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 12 Jun 2025 08:57:40 -0400 Subject: [petsc-users] [petsc-maint] norm L2 problemQuestion about changing the norm used in nonlinear solvers (L2 Euclidean vs. L2 Lebesgue) In-Reply-To: <414475981.6714047.1749631527145.JavaMail.zimbra@utt.fr> References: <414475981.6714047.1749631527145.JavaMail.zimbra@utt.fr> Message-ID: Do you wish to use a different norm 1) ONLY for displaying (printing out) the residual norms to track progress 2) in the convergence testing 3) to change the numerical algorithm (for example using the L2 inner product instead of the usual linear algebra R^N l2 inner product). For 1) use SNESMonitorSet() and in your monitor function use SNESGetSolution() to grab the solution and then VecGetArray(). Now you can compute any weighted norm you want on the solution. For 2) similar but you need to use SNESSetConvergenceTest For 3) yes, but you need to ask us specifically. Barry > On Jun 11, 2025, at 4:45?AM, Ali ALI AHMAD wrote: > > Dear PETSc team, > > I hope this message finds you well. > > I am currently using PETSc in a C++, where I rely on the nonlinear solvers `SNES` with either `newtonls` or `newtontr` methods. I would like to ask if it is possible to change the default norm used (typically the L2 Euclidean norm) to a custom norm, specifically the L2 norm in the sense of Lebesgue (e.g., involving cell-wise weighted integrals over the domain). > > My main goal is to define a custom residual norm that better reflects the physical quantities of interest in my simulation. > > Would this be feasible within the PETSc framework? If so, could you point me to the recommended approach (e.g., redefining the norm manually, using specific PETSc hooks or options)? > > Thank you very much in advance for your help and for the great work on PETSc! > > Best regards, > > Ali ALI AHMAD > PhD Student > University of Technology of Troyes - UTT - France > GAMMA3 Project - Office H008 - Phone No: +33 7 67 44 68 18 > 12 rue Marie Curie - CS 42060 10004 TROYES Cedex -------------- next part -------------- An HTML attachment was scrubbed... URL: From ali.ali_ahmad at utt.fr Thu Jun 12 08:28:02 2025 From: ali.ali_ahmad at utt.fr (Ali ALI AHMAD) Date: Thu, 12 Jun 2025 15:28:02 +0200 (CEST) Subject: [petsc-users] [petsc-maint] norm L2 problemQuestion about changing the norm used in nonlinear solvers (L2 Euclidean vs. L2 Lebesgue) In-Reply-To: References: <414475981.6714047.1749631527145.JavaMail.zimbra@utt.fr> Message-ID: <1703896473.7853283.1749734882144.JavaMail.zimbra@utt.fr> Thank you for your answer. I am currently working with the nonlinear solvers newtonls (with bt , l2 , etc.) and newtontr (using newton , cauchy , and dogleg strategies) combined with the linear solver gmres and the ILU preconditioner, since my Jacobian matrix is nonsymmetric. I also use the Eisenstat-Walker method for newtonls , as my initial guess is often very far from the exact solution. What I would like to do now is to replace the standard Euclidean L 2 norm with the L 2 norm in the Lebesgue sense , because my problem is defined on an unstructured, anisotropic triangular mesh where a weighted norm would be more physically appropriate. Would you be able to advise me on how to implement this change properly? I would deeply appreciate any guidance or suggestions you could provide. Thank you in advance for your help. 
Best regards, Ali ALI AHMAD De: "Barry Smith" ?: "Ali ALI AHMAD" Cc: "petsc-users" , "petsc-maint" Envoy?: Jeudi 12 Juin 2025 14:57:40 Objet: Re: [petsc-maint] norm L2 problemQuestion about changing the norm used in nonlinear solvers (L2 Euclidean vs. L2 Lebesgue) Do you wish to use a different norm 1) ONLY for displaying (printing out) the residual norms to track progress 2) in the convergence testing 3) to change the numerical algorithm (for example using the L2 inner product instead of the usual linear algebra R^N l2 inner product). For 1) use SNESMonitorSet() and in your monitor function use SNESGetSolution() to grab the solution and then VecGetArray(). Now you can compute any weighted norm you want on the solution. For 2) similar but you need to use SNESSetConvergenceTest For 3) yes, but you need to ask us specifically. Barry On Jun 11, 2025, at 4:45 AM, Ali ALI AHMAD wrote: Dear PETSc team, I hope this message finds you well. I am currently using PETSc in a C++, where I rely on the nonlinear solvers `SNES` with either `newtonls` or `newtontr` methods. I would like to ask if it is possible to change the default norm used (typically the L2 Euclidean norm) to a custom norm, specifically the L2 norm in the sense of Lebesgue (e.g., involving cell-wise weighted integrals over the domain). My main goal is to define a custom residual norm that better reflects the physical quantities of interest in my simulation. Would this be feasible within the PETSc framework? If so, could you point me to the recommended approach (e.g., redefining the norm manually, using specific PETSc hooks or options)? Thank you very much in advance for your help and for the great work on PETSc! Best regards, Ali ALI AHMAD PhD Student University of Technology of Troyes - UTT - France GAMMA3 Project - Office H008 - Phone No: +33 7 67 44 68 18 12 rue Marie Curie - CS 42060 10004 TROYES Cedex -------------- next part -------------- An HTML attachment was scrubbed... URL: From hexiaofeng at buaa.edu.cn Thu Jun 12 08:29:03 2025 From: hexiaofeng at buaa.edu.cn (hexiaofeng at buaa.edu.cn) Date: Thu, 12 Jun 2025 21:29:03 +0800 Subject: [petsc-users] Questions Regarding PETSc and Solving Constrained Structural Mechanics Problems In-Reply-To: References: Message-ID: <5F173B4C-8659-4D8C-BE6C-FFFE5FBF1009@buaa.edu.cn> Thanks, i?ll try those options and check whether they?re helpful. Xiaofeng > On Jun 12, 2025, at 20:51, Mark Adams wrote: > > Good point. > You can also use -pc_gamg_parallel_coarse_grid_solver to get GAMG to use a on level iterative solver for the coarse grid. From ali.ali_ahmad at utt.fr Thu Jun 12 08:42:56 2025 From: ali.ali_ahmad at utt.fr (Ali ALI AHMAD) Date: Thu, 12 Jun 2025 15:42:56 +0200 (CEST) Subject: [petsc-users] [petsc-maint] norm L2 problemQuestion about changing the norm used in nonlinear solvers (L2 Euclidean vs. L2 Lebesgue) In-Reply-To: <1703896473.7853283.1749734882144.JavaMail.zimbra@utt.fr> References: <414475981.6714047.1749631527145.JavaMail.zimbra@utt.fr> <1703896473.7853283.1749734882144.JavaMail.zimbra@utt.fr> Message-ID: <461035026.7868511.1749735776853.JavaMail.zimbra@utt.fr> Thank you for your answer. I am currently working with the nonlinear solvers newtonls (with bt , l2 , etc.) and newtontr (using newton , cauchy , and dogleg strategies) combined with the linear solver gmres and the ILU preconditioner, since my Jacobian matrix is nonsymmetric. I also use the Eisenstat-Walker method for newtonls , as my initial guess is often very far from the exact solution. 
What I would like to do now is to replace the standard Euclidean L 2 norm with the L 2 norm in the Lebesgue sense in the above numerical algorithm , because my problem is defined on an unstructured, anisotropic triangular mesh where a weighted norm would be more physically appropriate. Would you be able to advise me on how to implement this change properly? I would deeply appreciate any guidance or suggestions you could provide. Thank you in advance for your help. Best regards, Ali ALI AHMAD De: "Ali ALI AHMAD" ?: "Barry Smith" Cc: "petsc-users" , "petsc-maint" Envoy?: Jeudi 12 Juin 2025 15:28:02 Objet: Re: [petsc-maint] norm L2 problemQuestion about changing the norm used in nonlinear solvers (L2 Euclidean vs. L2 Lebesgue) Thank you for your answer. I am currently working with the nonlinear solvers newtonls (with bt , l2 , etc.) and newtontr (using newton , cauchy , and dogleg strategies) combined with the linear solver gmres and the ILU preconditioner, since my Jacobian matrix is nonsymmetric. I also use the Eisenstat-Walker method for newtonls , as my initial guess is often very far from the exact solution. What I would like to do now is to replace the standard Euclidean L 2 norm with the L 2 norm in the Lebesgue sense , because my problem is defined on an unstructured, anisotropic triangular mesh where a weighted norm would be more physically appropriate. Would you be able to advise me on how to implement this change properly? I would deeply appreciate any guidance or suggestions you could provide. Thank you in advance for your help. Best regards, Ali ALI AHMAD De: "Barry Smith" ?: "Ali ALI AHMAD" Cc: "petsc-users" , "petsc-maint" Envoy?: Jeudi 12 Juin 2025 14:57:40 Objet: Re: [petsc-maint] norm L2 problemQuestion about changing the norm used in nonlinear solvers (L2 Euclidean vs. L2 Lebesgue) Do you wish to use a different norm 1) ONLY for displaying (printing out) the residual norms to track progress 2) in the convergence testing 3) to change the numerical algorithm (for example using the L2 inner product instead of the usual linear algebra R^N l2 inner product). For 1) use SNESMonitorSet() and in your monitor function use SNESGetSolution() to grab the solution and then VecGetArray(). Now you can compute any weighted norm you want on the solution. For 2) similar but you need to use SNESSetConvergenceTest For 3) yes, but you need to ask us specifically. Barry On Jun 11, 2025, at 4:45 AM, Ali ALI AHMAD wrote: Dear PETSc team, I hope this message finds you well. I am currently using PETSc in a C++, where I rely on the nonlinear solvers `SNES` with either `newtonls` or `newtontr` methods. I would like to ask if it is possible to change the default norm used (typically the L2 Euclidean norm) to a custom norm, specifically the L2 norm in the sense of Lebesgue (e.g., involving cell-wise weighted integrals over the domain). My main goal is to define a custom residual norm that better reflects the physical quantities of interest in my simulation. Would this be feasible within the PETSc framework? If so, could you point me to the recommended approach (e.g., redefining the norm manually, using specific PETSc hooks or options)? Thank you very much in advance for your help and for the great work on PETSc! Best regards, Ali ALI AHMAD PhD Student University of Technology of Troyes - UTT - France GAMMA3 Project - Office H008 - Phone No: +33 7 67 44 68 18 12 rue Marie Curie - CS 42060 10004 TROYES Cedex -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From dontbugthedevs at proton.me Thu Jun 12 14:08:33 2025 From: dontbugthedevs at proton.me (Noam T.) Date: Thu, 12 Jun 2025 19:08:33 +0000 Subject: [petsc-users] Nodes added to Vertex Set Message-ID: Hello, In the mesh attached, a 1x1x1 cube with 8 nodes, physical groups are defined for the 6 bounding faces only. According to the GUI, and looking at the MSH file in $Entities, there are no physical groups for nodes. Using the following code: --- #include #include int main(int argc, char *argv[]) { DM dm; PetscInt v; PetscCall(PetscInitialize(&argc, &argv, NULL, NULL)); PetscCall(PetscOptionsInsertString(NULL, " -dm_plex_gmsh_mark_vertices ")); PetscCall(DMPlexCreateFromFile(PETSC_COMM_WORLD, "mesh.msh", "test", PETSC_TRUE, &dm)); PetscCall(DMGetLabelSize(dm, "Vertex Sets", &v)); printf("\nVertex set size %d\n", v); PetscCall(PetscFinalize()); } --- it says "Vertex Sets" has size 6, with IS [1, 2, 3, 4, 5, 6]. Is that intended behavior? Up until now I relied on having an empty set whenever physical tags where not explicitly defined, but this one mesh seems to be an exception. Thanks. Noam PS: The size is v = 0 when removing the flag "-dm_plex_gmsh_mark_vertices". -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: mesh.msh Type: model/mesh Size: 1778 bytes Desc: not available URL: From dontbugthedevs at proton.me Thu Jun 12 15:26:27 2025 From: dontbugthedevs at proton.me (Noam T.) Date: Thu, 12 Jun 2025 20:26:27 +0000 Subject: [petsc-users] Element connectivity of a DMPlex In-Reply-To: References: Message-ID: Thank you for the code; it provides exactly what I was looking for. Following up on this matter, does this method not work for higher order elements? For example, using an 8-node quadrilateral, exporting to a PETSC_VIEWER_HDF5_VIZ viewer provides the correct matrix of node coordinates in geometry/vertices (here a quadrilateral in [0, 10]) 5.0, 5.0 0.0, 0.0 10.0, 0.0 10.0, 10.0 0.0, 10.0 5.0, 0.0 10.0, 5.0 5.0, 10.0 0.0, 5.0 but the connectivity in viz/topology is 0 1 2 3 which are likely the corner nodes of the initial, first-order element, before adding extra nodes for the higher degree element. This connectivity values [0, 1, 2, 3, ...] are always the same, including for other elements, whereas the coordinates are correct E.g. for 3rd order triangle in [0, 1], coordinates are given left to right, bottom to top 0, 0 1/3, 0, 2/3, 0, 1, 0 0, 1/3 1/3, 1/3 2/3, 1/3 0, 2/3, 1/3, 2/3 0, 1 but the connectivity (viz/topology/cells) is [0, 1, 2]. Test meshes were created with gmsh from the python API, using gmsh.option.setNumber("Mesh.ElementOrder", n), for n = 1, 2, 3, ... Thank you. Noam On Friday, May 23rd, 2025 at 12:56 AM, Matthew Knepley wrote: > On Thu, May 22, 2025 at 12:25?PM Noam T. wrote: > >> Hello, >> >> Thank you the various options. >> >> Use case here would be obtaining the exact output generated by option 1), DMView() with PETSC_VIEWER_HDF5_VIZ; in particular, the matrix generated under /viz/topology/cells. >> >>> There are several ways you might do this. It helps to know what you are aiming for. >>> >>> 1) If you just want this output, it might be easier to just DMView() with the PETSC_VIEWER_HDF5_VIZ format, since that just outputs the cell-vertex topology and coordinates >> >> Is it possible to get this information in memory, onto a Mat, Vec or some other Int array object directly? 
it would be handy to have it in order to manipulate it and/or save it to a different format/file. Saving to an HDF5 and loading it again seems redundant. >> >>> 2) You can call DMPlexUninterpolate() to produce a mesh with just cells and vertices, and output it in any format. >>> >>> 3) If you want it in memory, but still with global indices (I don't understand this use case), then you can use DMPlexCreatePointNumbering() for an overall global numbering, or DMPlexCreateCellNumbering() and DMPlexCreateVertexNumbering() for separate global numberings. >> >> Perhaps I missed it, but getting the connectivity matrix in /viz/topology/cells/ did not seem directly trivial to me from the list of global indices returned by DMPlexGetCell/Point/VertexNumbering() (i.e. I assume all the operations done when calling DMView()). > > Something like > > DMPlexGetHeightStratum(dm, 0, &cStart, &cEnd); > DMPlexGetDepthStratum(dm, 0, &vStart, &vEnd); > DMPlexGetVertexNumbering(dm, &globalVertexNumbers); > ISGetIndices(globalVertexNumbers, &gv); > for (PetscInt c = cStart; c < cEnd; ++c) { > PetscInt *closure = NULL; > > DMPlexGetTransitiveClosure(dm, c, PETSC_TRUE, &Ncl, &closure); > for (PetscInt cl = 0; c < Ncl * 2; cl += 2) { > if (closure[cl] < vStart || closure[cl] >= vEnd) continue; > const PetscInt v = gv[closure[cl]] < 0 ? -(gv[closure[cl]] + 1) : gv[closure[cl]]; > > // Do something with v > } > DMPlexRestoreTransitiveClosure(dm, c, PETSC_TRUE, &Ncl, &closure); > } > ISRestoreIndices(globalVertexNumbers, &gv); > ISDestroy(&globalVertexNumbers); > > Thanks, > > Matt > >> Thanks, >> Noam. > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > [https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/*(http:/*www.cse.buffalo.edu/*knepley/)__;fl0vfg!!G_uCfscf7eWS!cZvIn6-WAARgI-p1R_s8_n7yI_bYSvNOSQA6c46k1dLOGF9AxVvPAYS3bARKuXY6jcvg_zko2D-CUgfpOxYtN35083cDfLZY$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From dontbugthedevs at proton.me Thu Jun 12 16:19:40 2025 From: dontbugthedevs at proton.me (Noam T.) Date: Thu, 12 Jun 2025 21:19:40 +0000 Subject: [petsc-users] Nodes added to Vertex Set In-Reply-To: References: Message-ID: I might be able to answer myself. in the $Nodes section, the mid-face nodes are included with dimension 2 a.k.a faces; e.g. the ninth node 2 6 0 1 <-- 2 is the element dimension, 6 is the tag 9 0.5 0.5 1.0 So when creating groups for the surfaces, these nodes are also included into their own physical groups, which PETSc adds into Vertex Sets. I don't quite get why these nodes have dimension 2, but that's probably a gmsh-related question. On Thursday, June 12th, 2025 at 7:08 PM, Noam T. wrote: > Hello, > > In the mesh attached, a 1x1x1 cube with 8 nodes, physical groups are defined for the 6 bounding faces only. According to the GUI, and looking at the MSH file in $Entities, there are no physical groups for nodes. 
> > Using the following code: > > --- > #include > #include > > int main(int argc, char *argv[]) { > > DM dm; > PetscInt v; > > PetscCall(PetscInitialize(&argc, &argv, NULL, NULL)); > > PetscCall(PetscOptionsInsertString(NULL, " -dm_plex_gmsh_mark_vertices ")); > PetscCall(DMPlexCreateFromFile(PETSC_COMM_WORLD, "mesh.msh", "test", PETSC_TRUE, &dm)); > PetscCall(DMGetLabelSize(dm, "Vertex Sets", &v)); > printf("\nVertex set size %d\n", v); > > PetscCall(PetscFinalize()); > } > --- > it says "Vertex Sets" has size 6, with IS [1, 2, 3, 4, 5, 6]. Is that intended behavior? Up until now I relied on having an empty set whenever physical tags where not explicitly defined, but this one mesh seems to be an exception. > > Thanks. > > Noam > > PS: The size is v = 0 when removing the flag "-dm_plex_gmsh_mark_vertices". -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Jun 12 18:15:43 2025 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 12 Jun 2025 19:15:43 -0400 Subject: [petsc-users] Nodes added to Vertex Set In-Reply-To: References: Message-ID: On Thu, Jun 12, 2025 at 5:19?PM Noam T. via petsc-users < petsc-users at mcs.anl.gov> wrote: > I might be able to answer myself. > > in the $Nodes section, the mid-face nodes are included with dimension 2 > a.k.a faces; e.g. the ninth node > > 2 6 0 1 <-- 2 is the element dimension, 6 is the tag > 9 > 0.5 0.5 1.0 > > So when creating groups for the surfaces, these nodes are also included > into their own physical groups, which PETSc adds into Vertex Sets. > > I don't quite get why these nodes have dimension 2, but that's probably a > gmsh-related question. > Yes, I believe you are right Thanks, Matt > On Thursday, June 12th, 2025 at 7:08 PM, Noam T. > wrote: > > Hello, > > In the mesh attached, a 1x1x1 cube with 8 nodes, physical groups are > defined for the 6 bounding faces only. According to the GUI, and looking at > the MSH file in $Entities, there are no physical groups for nodes. > > Using the following code: > > --- > #include > #include > > int main(int argc, char *argv[]) { > > DM dm; > PetscInt v; > > PetscCall(PetscInitialize(&argc, &argv, NULL, NULL)); > > PetscCall(PetscOptionsInsertString(NULL, " -dm_plex_gmsh_mark_vertices > ")); > PetscCall(DMPlexCreateFromFile(PETSC_COMM_WORLD, "mesh.msh", "test", > PETSC_TRUE, &dm)); > PetscCall(DMGetLabelSize(dm, "Vertex Sets", &v)); > printf("\nVertex set size %d\n", v); > > PetscCall(PetscFinalize()); > } > --- > it says "Vertex Sets" has size 6, with IS [1, 2, 3, 4, 5, 6]. Is that > intended behavior? Up until now I relied on having an empty set whenever > physical tags where not explicitly defined, but this one mesh seems to be > an exception. > > Thanks. > > Noam > > PS: The size is v = 0 when removing the flag " > -dm_plex_gmsh_mark_vertices". > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!Z4curmdejNw0AM-03QfIfgbbKhetBLW0uc73MJdIqhLFlHZPkh5P68z3Pfok2SXAMNYcwxnUCv7iJ4v8rj81$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Thu Jun 12 20:14:06 2025 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 12 Jun 2025 21:14:06 -0400 Subject: [petsc-users] [petsc-maint] norm L2 problemQuestion about changing the norm used in nonlinear solvers (L2 Euclidean vs. 
L2 Lebesgue) In-Reply-To: <461035026.7868511.1749735776853.JavaMail.zimbra@utt.fr> References: <414475981.6714047.1749631527145.JavaMail.zimbra@utt.fr> <1703896473.7853283.1749734882144.JavaMail.zimbra@utt.fr> <461035026.7868511.1749735776853.JavaMail.zimbra@utt.fr> Message-ID: You haven't answered my question. Where (conceptually) and for what purpose do you want to use the L2 norm. 1) displaying norms to observe the convergence behavior 2) in the convergence testing to determine when to stop 3) changing the "inner product" in the algorithm which amounts to preconditioning. Barry > On Jun 12, 2025, at 9:42?AM, Ali ALI AHMAD wrote: > > Thank you for your answer. > > I am currently working with the nonlinear solvers newtonls (with bt, l2, etc.) and newtontr (using newton, cauchy, and dogleg strategies) combined with the linear solver gmres and the ILU preconditioner, since my Jacobian matrix is nonsymmetric. > > I also use the Eisenstat-Walker method for newtonls, as my initial guess is often very far from the exact solution. > > What I would like to do now is to replace the standard Euclidean L2 norm with the L2 norm in the Lebesgue sense in the above numerical algorithm, because my problem is defined on an unstructured, anisotropic triangular mesh where a weighted norm would be more physically appropriate. > > Would you be able to advise me on how to implement this change properly? > > I would deeply appreciate any guidance or suggestions you could provide. > > Thank you in advance for your help. > > Best regards, > Ali ALI AHMAD > > De: "Ali ALI AHMAD" > ?: "Barry Smith" > Cc: "petsc-users" , "petsc-maint" > Envoy?: Jeudi 12 Juin 2025 15:28:02 > Objet: Re: [petsc-maint] norm L2 problemQuestion about changing the norm used in nonlinear solvers (L2 Euclidean vs. L2 Lebesgue) > > Thank you for your answer. > > I am currently working with the nonlinear solvers newtonls (with bt, l2, etc.) and newtontr (using newton, cauchy, and dogleg strategies) combined with the linear solver gmres and the ILU preconditioner, since my Jacobian matrix is nonsymmetric. > > I also use the Eisenstat-Walker method for newtonls, as my initial guess is often very far from the exact solution. > > What I would like to do now is to replace the standard Euclidean L2 norm with the L2 norm in the Lebesgue sense, because my problem is defined on an unstructured, anisotropic triangular mesh where a weighted norm would be more physically appropriate. > > Would you be able to advise me on how to implement this change properly? > > I would deeply appreciate any guidance or suggestions you could provide. > > Thank you in advance for your help. > > Best regards, > Ali ALI AHMAD > De: "Barry Smith" > ?: "Ali ALI AHMAD" > Cc: "petsc-users" , "petsc-maint" > Envoy?: Jeudi 12 Juin 2025 14:57:40 > Objet: Re: [petsc-maint] norm L2 problemQuestion about changing the norm used in nonlinear solvers (L2 Euclidean vs. L2 Lebesgue) > > Do you wish to use a different norm > > 1) ONLY for displaying (printing out) the residual norms to track progress > > 2) in the convergence testing > > 3) to change the numerical algorithm (for example using the L2 inner product instead of the usual linear algebra R^N l2 inner product). > > For 1) use SNESMonitorSet() and in your monitor function use SNESGetSolution() to grab the solution and then VecGetArray(). Now you can compute any weighted norm you want on the solution. > > For 2) similar but you need to use SNESSetConvergenceTest > > For 3) yes, but you need to ask us specifically. 
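As a concrete illustration of option 1) above: a minimal sketch of a weighted-norm monitor hooked in with SNESMonitorSet(). The weight vector, scratch vector, and context names below are placeholders for whatever the application already carries; this is not an existing PETSc interface, just one way to wire it up.

#include <petscsnes.h>

/* Placeholder context: w holds the per-DOF Lebesgue weights (e.g. cell measures),
   work is a scratch vector with the same layout as the residual. */
typedef struct {
  Vec w, work;
} WeightedNormCtx;

/* ||F||_w = sqrt( sum_i w_i F_i^2 ), assuming real-valued weights */
static PetscErrorCode ComputeWeightedL2Norm(Vec F, WeightedNormCtx *user, PetscReal *nrm)
{
  PetscScalar dot;

  PetscFunctionBeginUser;
  PetscCall(VecPointwiseMult(user->work, user->w, F));
  PetscCall(VecDot(user->work, F, &dot));
  *nrm = PetscSqrtReal(PetscRealPart(dot));
  PetscFunctionReturn(PETSC_SUCCESS);
}

static PetscErrorCode WeightedL2Monitor(SNES snes, PetscInt it, PetscReal fnorm, void *ctx)
{
  WeightedNormCtx *user = (WeightedNormCtx *)ctx;
  Vec              F;
  PetscReal        wnorm;

  PetscFunctionBeginUser;
  PetscCall(SNESGetFunction(snes, &F, NULL, NULL));
  PetscCall(ComputeWeightedL2Norm(F, user, &wnorm));
  PetscCall(PetscPrintf(PetscObjectComm((PetscObject)snes), "%3" PetscInt_FMT " SNES weighted L2 residual norm %14.12e\n", it, (double)wnorm));
  PetscFunctionReturn(PETSC_SUCCESS);
}

/* registered once after SNESCreate(), e.g.
   PetscCall(SNESMonitorSet(snes, WeightedL2Monitor, &user, NULL)); */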
> > Barry > > > On Jun 11, 2025, at 4:45?AM, Ali ALI AHMAD wrote: > > Dear PETSc team, > > I hope this message finds you well. > > I am currently using PETSc in a C++, where I rely on the nonlinear solvers `SNES` with either `newtonls` or `newtontr` methods. I would like to ask if it is possible to change the default norm used (typically the L2 Euclidean norm) to a custom norm, specifically the L2 norm in the sense of Lebesgue (e.g., involving cell-wise weighted integrals over the domain). > > My main goal is to define a custom residual norm that better reflects the physical quantities of interest in my simulation. > > Would this be feasible within the PETSc framework? If so, could you point me to the recommended approach (e.g., redefining the norm manually, using specific PETSc hooks or options)? > > Thank you very much in advance for your help and for the great work on PETSc! > > Best regards, > > Ali ALI AHMAD > PhD Student > University of Technology of Troyes - UTT - France > GAMMA3 Project - Office H008 - Phone No: +33 7 67 44 68 18 > 12 rue Marie Curie - CS 42060 10004 TROYES Cedex > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hexiaofeng at buaa.edu.cn Thu Jun 12 21:54:52 2025 From: hexiaofeng at buaa.edu.cn (hexioafeng) Date: Fri, 13 Jun 2025 10:54:52 +0800 Subject: [petsc-users] Questions Regarding PETSc and Solving Constrained Structural Mechanics Problems In-Reply-To: References: <3BF3C1E8-0CB0-42F1-A624-8FA0DC7FD4A4@buaa.edu.cn> Message-ID: Dear authors, I tried -pc_type game -pc_gamg_parallel_coarse_grid_solver and -pc_type field split -pc_fieldsplit_detect_saddle_point -fieldsplit_0_ksp_type pronely -fieldsplit_0_pc_type game -fieldsplit_0_mg_coarse_pc_type sad -fieldsplit_1_ksp_type pronely -fieldsplit_1_pc_type Jacobi _fieldsplit_1_sub_pc_type for , both options got the KSP_DIVERGE_PC_FAILED error. Thanks, Xiaofeng > On Jun 12, 2025, at 20:50, Mark Adams wrote: > > > > On Thu, Jun 12, 2025 at 8:44?AM Matthew Knepley > wrote: >> On Thu, Jun 12, 2025 at 4:58?AM Mark Adams > wrote: >>> Adding this to the PETSc mailing list, >>> >>> On Thu, Jun 12, 2025 at 3:43?AM hexioafeng > wrote: >>>> >>>> Dear Professor, >>>> >>>> I hope this message finds you well. >>>> >>>> I am an employee at a CAE company and a heavy user of the PETSc library. I would like to thank you for your contributions to PETSc and express my deep appreciation for your work. >>>> >>>> Recently, I encountered some difficulties when using PETSc to solve structural mechanics problems with Lagrange multiplier constraints. After searching extensively online and reviewing several papers, I found your previous paper titled "Algebraic multigrid methods for constrained linear systems with applications to contact problems in solid mechanics" seems to be the most relevant and helpful. >>>> >>>> The stiffness matrix I'm working with, K, is a block saddle-point matrix of the form (A00 A01; A10 0), where A00 is singular?just as described in your paper, and different from many other articles . I have a few questions regarding your work and would greatly appreciate your insights: >>>> >>>> 1. Is the AMG/KKT method presented in your paper available in PETSc? I tried using CG+GAMG directly but received a KSP_DIVERGED_PC_FAILED error. 
I also attempted to use CG+PCFIELDSPLIT with the following options: >>> >>> No >>> >>>> >>>> -pc_type fieldsplit -pc_fieldsplit_detect_saddle_point -pc_fieldsplit_type schur -pc_fieldsplit_schur_precondition selfp -pc_fieldsplit_schur_fact_type full -fieldsplit_0_ksp_type preonly -fieldsplit_0_pc_type gamg -fieldsplit_1_ksp_type preonly -fieldsplit_1_pc_type bjacobi >>>> >>>> Unfortunately, this also resulted in a KSP_DIVERGED_PC_FAILED error. Do you have any suggestions? >>>> >>>> 2. In your paper, you compare the method with Uzawa-type approaches. To my understanding, Uzawa methods typically require A00 to be invertible. How did you handle the singularity of A00 to construct an M-matrix that is invertible? >>>> >>> >>> You add a regularization term like A01 * A10 (like springs). See the paper or any reference to augmented lagrange or Uzawa >>> >>> >>>> 3. Can i implement the AMG/KKT method in your paper using existing AMG APIs? Implementing a production-level AMG solver from scratch would be quite challenging for me, so I?m hoping to utilize existing AMG interfaces within PETSc or other packages. >>>> >>> >>> You can do Uzawa and make the regularization matrix with matrix-matrix products. Just use AMG for the A00 block. >>> >>> >>>> 4. For saddle-point systems where A00 is singular, can you recommend any more robust or efficient solutions? Alternatively, are you aware of any open-source software packages that can handle such cases out-of-the-box? >>>> >>> >>> No, and I don't think PETSc can do this out-of-the-box, but others may be able to give you a better idea of what PETSc can do. >>> I think PETSc can do Uzawa or other similar algorithms but it will not do the regularization automatically (it is a bit more complicated than just A01 * A10) >> >> One other trick you can use is to have >> >> -fieldsplit_0_mg_coarse_pc_type svd >> >> This will use SVD on the coarse grid of GAMG, which can handle the null space in A00 as long as the prolongation does not put it back in. I have used this for the Laplacian with Neumann conditions and for freely floating elastic problems. >> > > Good point. > You can also use -pc_gamg_parallel_coarse_grid_solver to get GAMG to use a on level iterative solver for the coarse grid. > >> Thanks, >> >> Matt >> >>> Thanks, >>> Mark >>>> >>>> Thank you very much for taking the time to read my email. Looking forward to hearing from you. >>>> >>>> >>>> >>>> Sincerely, >>>> >>>> Xiaofeng He >>>> ----------------------------------------------------- >>>> >>>> Research Engineer >>>> >>>> Internet Based Engineering, Beijing, China >>>> >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!YnpI8yJI5A6InOexOleRZqZzYOiuR4sw1PSdo040lUsJmZvbh-i6AWIkRtbavv78rzOsshSHnj3REra61DB188ztbxlYbg$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From ali.ali_ahmad at utt.fr Fri Jun 13 03:55:45 2025 From: ali.ali_ahmad at utt.fr (Ali ALI AHMAD) Date: Fri, 13 Jun 2025 10:55:45 +0200 (CEST) Subject: [petsc-users] [petsc-maint] norm L2 problemQuestion about changing the norm used in nonlinear solvers (L2 Euclidean vs. 
L2 Lebesgue) In-Reply-To: References: <414475981.6714047.1749631527145.JavaMail.zimbra@utt.fr> <1703896473.7853283.1749734882144.JavaMail.zimbra@utt.fr> <461035026.7868511.1749735776853.JavaMail.zimbra@utt.fr> Message-ID: <323745907.8383516.1749804945465.JavaMail.zimbra@utt.fr> Thank you for your message. To answer your question: I would like to use the L 2 norm in the sense of Lebesgue for all three purposes , especially the third one . 1- For displaying residuals during the nonlinear iterations, I would like to observe the convergence behavior using a norm that better reflects the physical properties of the problem. 2- For convergence testing , I would like the stopping criterion to be based on a weighted L 2 norm that accounts for the geometry of the mesh (since I am working with unstructured, anisotropic triangular meshes). 3 - Most importantly , I would like to modify the inner product used in the algorithm so that it aligns with the weighted L 2 norm (since I am working with unstructured, anisotropic triangular meshes). Best regards, Ali ALI AHMAD De: "Barry Smith" ?: "Ali ALI AHMAD" Cc: "petsc-users" , "petsc-maint" Envoy?: Vendredi 13 Juin 2025 03:14:06 Objet: Re: [petsc-maint] norm L2 problemQuestion about changing the norm used in nonlinear solvers (L2 Euclidean vs. L2 Lebesgue) You haven't answered my question. Where (conceptually) and for what purpose do you want to use the L2 norm. 1) displaying norms to observe the convergence behavior 2) in the convergence testing to determine when to stop 3) changing the "inner product" in the algorithm which amounts to preconditioning. Barry On Jun 12, 2025, at 9:42 AM, Ali ALI AHMAD wrote: Thank you for your answer. I am currently working with the nonlinear solvers newtonls (with bt , l2 , etc.) and newtontr (using newton , cauchy , and dogleg strategies) combined with the linear solver gmres and the ILU preconditioner, since my Jacobian matrix is nonsymmetric. I also use the Eisenstat-Walker method for newtonls , as my initial guess is often very far from the exact solution. What I would like to do now is to replace the standard Euclidean L 2 norm with the L 2 norm in the Lebesgue sense in the above numerical algorithm , because my problem is defined on an unstructured, anisotropic triangular mesh where a weighted norm would be more physically appropriate. Would you be able to advise me on how to implement this change properly? I would deeply appreciate any guidance or suggestions you could provide. Thank you in advance for your help. Best regards, Ali ALI AHMAD De: "Ali ALI AHMAD" ?: "Barry Smith" Cc: "petsc-users" , "petsc-maint" Envoy?: Jeudi 12 Juin 2025 15:28:02 Objet: Re: [petsc-maint] norm L2 problemQuestion about changing the norm used in nonlinear solvers (L2 Euclidean vs. L2 Lebesgue) Thank you for your answer. I am currently working with the nonlinear solvers newtonls (with bt , l2 , etc.) and newtontr (using newton , cauchy , and dogleg strategies) combined with the linear solver gmres and the ILU preconditioner, since my Jacobian matrix is nonsymmetric. I also use the Eisenstat-Walker method for newtonls , as my initial guess is often very far from the exact solution. What I would like to do now is to replace the standard Euclidean L 2 norm with the L 2 norm in the Lebesgue sense , because my problem is defined on an unstructured, anisotropic triangular mesh where a weighted norm would be more physically appropriate. Would you be able to advise me on how to implement this change properly? 
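On where the weights themselves come from on an unstructured triangular mesh: if the unknowns are cell-centered (a P0 / finite-volume layout), one rough sketch is to fill the weight vector with the cell measures. The routine below assumes the weight vector has one entry per local cell, in the same order as the cell stratum; for a vertex- or edge-based discretization the weights would have to be assembled differently.

#include <petscdmplex.h>

static PetscErrorCode BuildCellMeasureWeights(DM dm, Vec w)
{
  PetscInt     cStart, cEnd;
  PetscScalar *a;

  PetscFunctionBeginUser;
  PetscCall(DMPlexGetHeightStratum(dm, 0, &cStart, &cEnd));
  PetscCall(VecGetArray(w, &a));
  for (PetscInt c = cStart; c < cEnd; ++c) {
    PetscReal vol, centroid[3], normal[3];

    PetscCall(DMPlexComputeCellGeometryFVM(dm, c, &vol, centroid, normal));
    a[c - cStart] = vol; /* |K|: triangle area (volume in 3d) */
  }
  PetscCall(VecRestoreArray(w, &a));
  PetscFunctionReturn(PETSC_SUCCESS);
}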
I would deeply appreciate any guidance or suggestions you could provide. Thank you in advance for your help. Best regards, Ali ALI AHMAD De: "Barry Smith" ?: "Ali ALI AHMAD" Cc: "petsc-users" , "petsc-maint" Envoy?: Jeudi 12 Juin 2025 14:57:40 Objet: Re: [petsc-maint] norm L2 problemQuestion about changing the norm used in nonlinear solvers (L2 Euclidean vs. L2 Lebesgue) Do you wish to use a different norm 1) ONLY for displaying (printing out) the residual norms to track progress 2) in the convergence testing 3) to change the numerical algorithm (for example using the L2 inner product instead of the usual linear algebra R^N l2 inner product). For 1) use SNESMonitorSet() and in your monitor function use SNESGetSolution() to grab the solution and then VecGetArray(). Now you can compute any weighted norm you want on the solution. For 2) similar but you need to use SNESSetConvergenceTest For 3) yes, but you need to ask us specifically. Barry BQ_BEGIN On Jun 11, 2025, at 4:45 AM, Ali ALI AHMAD wrote: Dear PETSc team, I hope this message finds you well. I am currently using PETSc in a C++, where I rely on the nonlinear solvers `SNES` with either `newtonls` or `newtontr` methods. I would like to ask if it is possible to change the default norm used (typically the L2 Euclidean norm) to a custom norm, specifically the L2 norm in the sense of Lebesgue (e.g., involving cell-wise weighted integrals over the domain). My main goal is to define a custom residual norm that better reflects the physical quantities of interest in my simulation. Would this be feasible within the PETSc framework? If so, could you point me to the recommended approach (e.g., redefining the norm manually, using specific PETSc hooks or options)? Thank you very much in advance for your help and for the great work on PETSc! Best regards, Ali ALI AHMAD PhD Student University of Technology of Troyes - UTT - France GAMMA3 Project - Office H008 - Phone No: +33 7 67 44 68 18 12 rue Marie Curie - CS 42060 10004 TROYES Cedex BQ_END -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexandre.scotto at irt-saintexupery.com Fri Jun 13 04:32:40 2025 From: alexandre.scotto at irt-saintexupery.com (SCOTTO Alexandre) Date: Fri, 13 Jun 2025 09:32:40 +0000 Subject: [petsc-users] Insertion mode for Scatter Message-ID: Dear PETSc community, I am currently struggling with the ADD_VALUE mode of the Scatter object. Here is a simple piece of (Python) code to illustrate the issue: vec_1 = PETSc.Vec().createMPI(size=10) vec_1.shift(2.0) vec_2 = PETSc.Vec().createMPI(size=10) vec_2.shift(1.0) index_set = PETSc.IS().createStride(10, step=1) scatter = PETSc.Scatter().create(vec_1, index_set, vec_2, index_set) scatter.scatter(vec_1, vec_2, addv=True) Vectors vec_1 and vec_2 are respectively filled-in with 2.0 and 1.0. After the scattering, I would expect to have in vec_2 the sum of the values initially in vec_2 (that is 1.0) plus the values coming from vec_1 (that is 2.0). But instead of having vec_2 filled in with 3.0 it is filled-in with 9.0. My understanding is that the number of processes (here 4 processes) plays a role since 9.0 = 1.0 (initial value) + 2.0 (values coming from vec_1) x 4 (number of processes). Is there a way to have simply 3.0 as a result? Hoping to have been clear enough, best regards, Alexandre. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From pierre at joliv.et Fri Jun 13 04:43:15 2025 From: pierre at joliv.et (Pierre Jolivet) Date: Fri, 13 Jun 2025 11:43:15 +0200 Subject: [petsc-users] Insertion mode for Scatter In-Reply-To: References: Message-ID: <88F2D763-F4BE-452B-8ACE-5BFCCCE3C733@joliv.et> > On 13 Jun 2025, at 11:32?AM, SCOTTO Alexandre via petsc-users wrote: > > Dear PETSc community, > > I am currently struggling with the ADD_VALUE mode of the Scatter object. Here is a simple piece of (Python) code to illustrate the issue: > > vec_1 = PETSc.Vec().createMPI(size=10) > vec_1.shift(2.0) > > vec_2 = PETSc.Vec().createMPI(size=10) > vec_2.shift(1.0) > > index_set = PETSc.IS().createStride(10, step=1) > > scatter = PETSc.Scatter().create(vec_1, index_set, vec_2, index_set) > scatter.scatter(vec_1, vec_2, addv=True) > > Vectors vec_1 and vec_2 are respectively filled-in with 2.0 and 1.0. After the scattering, I would expect to have in vec_2 the sum of the values initially in vec_2 (that is 1.0) plus the values coming from vec_1 (that is 2.0). > > But instead of having vec_2 filled in with 3.0 it is filled-in with 9.0. My understanding is that the number of processes (here 4 processes) plays a role since 9.0 = 1.0 (initial value) + 2.0 (values coming from vec_1) x 4 (number of processes). > > Is there a way to have simply 3.0 as a result? With the index_set you are supplying, you are basically saying that each process should scatter the complete vector (not just its local portion). If you use an index_set which does not induce communication (e.g., of size the local size of the Vec and with the same start as the first local row of the Vec), then you?ll get 3.0 as a result. Thanks, Pierre > Hoping to have been clear enough, best regards, > Alexandre. -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre at joliv.et Fri Jun 13 04:58:32 2025 From: pierre at joliv.et (Pierre Jolivet) Date: Fri, 13 Jun 2025 11:58:32 +0200 Subject: [petsc-users] Insertion mode for Scatter In-Reply-To: References: <88F2D763-F4BE-452B-8ACE-5BFCCCE3C733@joliv.et> Message-ID: <1041170C-9E78-4EFE-BC86-30139D42F2C2@joliv.et> Please always keep the list in copy. > On 13 Jun 2025, at 11:55?AM, SCOTTO Alexandre wrote: > > Hello Pierre, > > Thank you for you answer. > > I think I see the subtlety here, and it makes realized that I have not properly understood yet how index sets should be manipulated, in particular whether the provided indices are local or global. > > It seems that this code solves my issue: > > index_set_1 = PETSc.IS().createStride( > vec_1.local_size, first=vec_1.owner_range[0], step=1 > ) > index_set_2 = PETSc.IS().createStride( > vec_2.local_size, first=vec_2.owner_range[0], step=1 > ) > > scatter = PETSc.Scatter().create(vec_1, index_set_1, vec_2, index_set_2) > scatter.scatter(vec_1, vec_2, addv=True) > > Do you think that this is a decent manner of transfering the whole content of a vector to another of same dimension? > > I have a lot of this scattering to perform so if you have a better recommendation I would be pleased. The ?scattering? you are defining and using is basically a VecAXPY(), so using a scatter for such an operation is definitely _not_ the way to go. Thanks, Pierre > Best regards, > Alexandre. > > > De : Pierre Jolivet > Envoy? : vendredi 13 juin 2025 11:43 > ? 
: SCOTTO Alexandre > Cc : petsc-users at mcs.anl.gov > Objet : Re: [petsc-users] Insertion mode for Scatter > > > > > On 13 Jun 2025, at 11:32?AM, SCOTTO Alexandre via petsc-users > wrote: > > Dear PETSc community, > > I am currently struggling with the ADD_VALUE mode of the Scatter object. Here is a simple piece of (Python) code to illustrate the issue: > > vec_1 = PETSc.Vec().createMPI(size=10) > vec_1.shift(2.0) > > vec_2 = PETSc.Vec().createMPI(size=10) > vec_2.shift(1.0) > > index_set = PETSc.IS().createStride(10, step=1) > > scatter = PETSc.Scatter().create(vec_1, index_set, vec_2, index_set) > scatter.scatter(vec_1, vec_2, addv=True) > > Vectors vec_1 and vec_2 are respectively filled-in with 2.0 and 1.0. After the scattering, I would expect to have in vec_2 the sum of the values initially in vec_2 (that is 1.0) plus the values coming from vec_1 (that is 2.0). > > But instead of having vec_2 filled in with 3.0 it is filled-in with 9.0. My understanding is that the number of processes (here 4 processes) plays a role since 9.0 = 1.0 (initial value) + 2.0 (values coming from vec_1) x 4 (number of processes). > > Is there a way to have simply 3.0 as a result? > > With the index_set you are supplying, you are basically saying that each process should scatter the complete vector (not just its local portion). > If you use an index_set which does not induce communication (e.g., of size the local size of the Vec and with the same start as the first local row of the Vec), then you?ll get 3.0 as a result. > > Thanks, > Pierre > > > Hoping to have been clear enough, best regards, > Alexandre. -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Jun 13 08:05:21 2025 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 13 Jun 2025 09:05:21 -0400 Subject: [petsc-users] Element connectivity of a DMPlex In-Reply-To: References: Message-ID: On Thu, Jun 12, 2025 at 4:26?PM Noam T. wrote: > > Thank you for the code; it provides exactly what I was looking for. > > Following up on this matter, does this method not work for higher order > elements? For example, using an 8-node quadrilateral, exporting to a > PETSC_VIEWER_HDF5_VIZ viewer provides the correct matrix of node > coordinates in geometry/vertices > If you wanted to include edges/faces, you could do it. First, you would need to decide how you would number things For example, would you number all points contiguously, or separately number cells, vertices, faces and edges. Second, you would check for faces/edges in the closure loop. Right now, we only check for vertices. I would say that this is what convinced me not to do FEM this way. Thanks, Matt > (here a quadrilateral in [0, 10]) > 5.0, 5.0 > 0.0, 0.0 > 10.0, 0.0 > 10.0, 10.0 > 0.0, 10.0 > 5.0, 0.0 > 10.0, 5.0 > 5.0, 10.0 > 0.0, 5.0 > > but the connectivity in viz/topology is > > 0 1 2 3 > > which are likely the corner nodes of the initial, first-order element, > before adding extra nodes for the higher degree element. > > This connectivity values [0, 1, 2, 3, ...] are always the same, including > for other elements, whereas the coordinates are correct > > E.g. for 3rd order triangle in [0, 1], coordinates are given left to > right, bottom to top > 0, 0 > 1/3, 0, > 2/3, 0, > 1, 0 > 0, 1/3 > 1/3, 1/3 > 2/3, 1/3 > 0, 2/3, > 1/3, 2/3 > 0, 1 > > but the connectivity (viz/topology/cells) is [0, 1, 2]. > > Test meshes were created with gmsh from the python API, using > gmsh.option.setNumber("Mesh.ElementOrder", n), for n = 1, 2, 3, ... 
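If you do want all of the closure points written out yourself, here is a sketch of the contiguous-numbering variant mentioned above, using DMPlexCreatePointNumbering() so that every cell, face, edge, and vertex gets a single global index (declarations of dm and error handling outside the snippet are assumed):

IS              globalPointNumbers;
const PetscInt *gp;
PetscInt        cStart, cEnd, Ncl;

PetscCall(DMPlexGetHeightStratum(dm, 0, &cStart, &cEnd));
PetscCall(DMPlexCreatePointNumbering(dm, &globalPointNumbers));
PetscCall(ISGetIndices(globalPointNumbers, &gp));
for (PetscInt c = cStart; c < cEnd; ++c) {
  PetscInt *closure = NULL;

  PetscCall(DMPlexGetTransitiveClosure(dm, c, PETSC_TRUE, &Ncl, &closure));
  for (PetscInt cl = 0; cl < Ncl * 2; cl += 2) {
    const PetscInt p  = closure[cl];
    const PetscInt gP = gp[p] < 0 ? -(gp[p] + 1) : gp[p]; /* negative entries mark points owned elsewhere */

    /* gP is a global number valid for any point: cell, face, edge, or vertex */
  }
  PetscCall(DMPlexRestoreTransitiveClosure(dm, c, PETSC_TRUE, &Ncl, &closure));
}
PetscCall(ISRestoreIndices(globalPointNumbers, &gp));
PetscCall(ISDestroy(&globalPointNumbers));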
> > Thank you. > Noam > On Friday, May 23rd, 2025 at 12:56 AM, Matthew Knepley > wrote: > > On Thu, May 22, 2025 at 12:25?PM Noam T. wrote: > >> Hello, >> >> Thank you the various options. >> >> Use case here would be obtaining the exact output generated by option 1), >> DMView() with PETSC_VIEWER_HDF5_VIZ; in particular, the matrix generated >> under /viz/topology/cells. >> >> There are several ways you might do this. It helps to know what you are >> aiming for. >> >> 1) If you just want this output, it might be easier to just DMView() with >> the PETSC_VIEWER_HDF5_VIZ format, since that just outputs the cell-vertex >> topology and coordinates >> >> >> Is it possible to get this information in memory, onto a Mat, Vec or some >> other Int array object directly? it would be handy to have it in order to >> manipulate it and/or save it to a different format/file. Saving to an HDF5 >> and loading it again seems redundant. >> >> >> 2) You can call DMPlexUninterpolate() to produce a mesh with just cells >> and vertices, and output it in any format. >> >> 3) If you want it in memory, but still with global indices (I don't >> understand this use case), then you can use DMPlexCreatePointNumbering() >> for an overall global numbering, or DMPlexCreateCellNumbering() and >> DMPlexCreateVertexNumbering() for separate global numberings. >> >> >> Perhaps I missed it, but getting the connectivity matrix in >> /viz/topology/cells/ did not seem directly trivial to me from the list of >> global indices returned by DMPlexGetCell/Point/VertexNumbering() (i.e. I >> assume all the operations done when calling DMView()). >> > > Something like > > DMPlexGetHeightStratum(dm, 0, &cStart, &cEnd); > DMPlexGetDepthStratum(dm, 0, &vStart, &vEnd); > DMPlexGetVertexNumbering(dm, &globalVertexNumbers); > ISGetIndices(globalVertexNumbers, &gv); > for (PetscInt c = cStart; c < cEnd; ++c) { > PetscInt *closure = NULL; > > DMPlexGetTransitiveClosure(dm, c, PETSC_TRUE, &Ncl, &closure); > for (PetscInt cl = 0; c < Ncl * 2; cl += 2) { > if (closure[cl] < vStart || closure[cl] >= vEnd) continue; > const PetscInt v = gv[closure[cl]] < 0 ? -(gv[closure[cl]] + 1) : > gv[closure[cl]]; > > // Do something with v > } > DMPlexRestoreTransitiveClosure(dm, c, PETSC_TRUE, &Ncl, &closure); > } > ISRestoreIndices(globalVertexNumbers, &gv); > ISDestroy(&globalVertexNumbers); > > Thanks, > > Matt > > Thanks, >> Noam. >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!ZA5yOw1wQdQdd1QT0ndwemC5JH9XGk4qzeUI8a6lD8ieAGoEqoAcyXjE2Jf7p7NhTntf6c-fCz4hsvh5OdhC$ > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!ZA5yOw1wQdQdd1QT0ndwemC5JH9XGk4qzeUI8a6lD8ieAGoEqoAcyXjE2Jf7p7NhTntf6c-fCz4hsvh5OdhC$ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Fri Jun 13 08:09:44 2025 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 13 Jun 2025 09:09:44 -0400 Subject: [petsc-users] Questions Regarding PETSc and Solving Constrained Structural Mechanics Problems In-Reply-To: References: <3BF3C1E8-0CB0-42F1-A624-8FA0DC7FD4A4@buaa.edu.cn> Message-ID: On Thu, Jun 12, 2025 at 10:55?PM hexioafeng wrote: > Dear authors, > > I tried *-pc_type game -pc_gamg_parallel_coarse_grid_solver* and *-pc_type > field split -pc_fieldsplit_detect_saddle_point -fieldsplit_0_ksp_type > pronely -fieldsplit_0_pc_type game -fieldsplit_0_mg_coarse_pc_type sad > -fieldsplit_1_ksp_type pronely -fieldsplit_1_pc_type Jacobi > _fieldsplit_1_sub_pc_type for* , both options got the > KSP_DIVERGE_PC_FAILED error. > With any question about convergence, we need to see the output of -ksp_view -ksp_monitor_true_residual -ksp_converged_reason -fieldsplit_0_mg_levels_ksp_monitor_true_residual -fieldsplit_0_mg_levels_ksp_converged_reason -fieldsplit_1_ksp_monitor_true_residual -fieldsplit_1_ksp_converged_reason and all the error output. Thanks, Matt > Thanks, > > Xiaofeng > > > On Jun 12, 2025, at 20:50, Mark Adams wrote: > > > > On Thu, Jun 12, 2025 at 8:44?AM Matthew Knepley wrote: > >> On Thu, Jun 12, 2025 at 4:58?AM Mark Adams wrote: >> >>> Adding this to the PETSc mailing list, >>> >>> On Thu, Jun 12, 2025 at 3:43?AM hexioafeng >>> wrote: >>> >>>> >>>> Dear Professor, >>>> >>>> I hope this message finds you well. >>>> >>>> I am an employee at a CAE company and a heavy user of the PETSc >>>> library. I would like to thank you for your contributions to PETSc and >>>> express my deep appreciation for your work. >>>> >>>> Recently, I encountered some difficulties when using PETSc to solve >>>> structural mechanics problems with Lagrange multiplier constraints. After >>>> searching extensively online and reviewing several papers, I found your >>>> previous paper titled "*Algebraic multigrid methods for constrained >>>> linear systems with applications to contact problems in solid mechanics*" >>>> seems to be the most relevant and helpful. >>>> >>>> The stiffness matrix I'm working with, *K*, is a block saddle-point >>>> matrix of the form (A00 A01; A10 0), where *A00 is singular*?just as >>>> described in your paper, and different from many other articles . I have a >>>> few questions regarding your work and would greatly appreciate your >>>> insights: >>>> >>>> 1. Is the *AMG/KKT* method presented in your paper available in PETSc? >>>> I tried using *CG+GAMG* directly but received a >>>> *KSP_DIVERGED_PC_FAILED* error. I also attempted to use >>>> *CG+PCFIELDSPLIT* with the following options: >>>> >>> >>> No >>> >>> >>>> >>>> -pc_type fieldsplit -pc_fieldsplit_detect_saddle_point >>>> -pc_fieldsplit_type schur -pc_fieldsplit_schur_precondition selfp >>>> -pc_fieldsplit_schur_fact_type full -fieldsplit_0_ksp_type preonly >>>> -fieldsplit_0_pc_type gamg -fieldsplit_1_ksp_type preonly >>>> -fieldsplit_1_pc_type bjacobi >>>> >>>> Unfortunately, this also resulted in a *KSP_DIVERGED_PC_FAILED* error. >>>> Do you have any suggestions? >>>> >>>> 2. In your paper, you compare the method with *Uzawa*-type approaches. >>>> To my understanding, Uzawa methods typically require A00 to be invertible. >>>> How did you handle the singularity of A00 to construct an M-matrix that is >>>> invertible? >>>> >>>> >>> You add a regularization term like A01 * A10 (like springs). See the >>> paper or any reference to augmented lagrange or Uzawa >>> >>> >>> 3. 
Can i implement the AMG/KKT method in your paper using existing *AMG >>>> APIs*? Implementing a production-level AMG solver from scratch would >>>> be quite challenging for me, so I?m hoping to utilize existing AMG >>>> interfaces within PETSc or other packages. >>>> >>>> >>> You can do Uzawa and make the regularization matrix with matrix-matrix >>> products. Just use AMG for the A00 block. >>> >>> >>> >>>> 4. For saddle-point systems where A00 is singular, can you recommend >>>> any more robust or efficient solutions? Alternatively, are you aware of any >>>> open-source software packages that can handle such cases out-of-the-box? >>>> >>>> >>> No, and I don't think PETSc can do this out-of-the-box, but others may >>> be able to give you a better idea of what PETSc can do. >>> I think PETSc can do Uzawa or other similar algorithms but it will not >>> do the regularization automatically (it is a bit more complicated than just >>> A01 * A10) >>> >> >> One other trick you can use is to have >> >> -fieldsplit_0_mg_coarse_pc_type svd >> >> This will use SVD on the coarse grid of GAMG, which can handle the null >> space in A00 as long as the prolongation does not put it back in. I have >> used this for the Laplacian with Neumann conditions and for freely floating >> elastic problems. >> >> > Good point. > You can also use -pc_gamg_parallel_coarse_grid_solver to get GAMG to use a > on level iterative solver for the coarse grid. > > >> Thanks, >> >> Matt >> >> >>> Thanks, >>> Mark >>> >>>> >>>> Thank you very much for taking the time to read my email. Looking >>>> forward to hearing from you. >>>> >>>> >>>> >>>> Sincerely, >>>> >>>> Xiaofeng He >>>> ----------------------------------------------------- >>>> >>>> Research Engineer >>>> >>>> Internet Based Engineering, Beijing, China >>>> >>>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!f-YJSzthRa7atIa1xs1GPHW53hGIqSenvp1eO2kDsSyf4jv1_Vp0kL9Lg8pyyPeG8al4Im8XlLqGRJrGY1qm$ >> >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!f-YJSzthRa7atIa1xs1GPHW53hGIqSenvp1eO2kDsSyf4jv1_Vp0kL9Lg8pyyPeG8al4Im8XlLqGRJrGY1qm$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From richard.katz at earth.ox.ac.uk Fri Jun 13 03:34:21 2025 From: richard.katz at earth.ox.ac.uk (Richard Katz) Date: Fri, 13 Jun 2025 08:34:21 +0000 Subject: [petsc-users] Modifying an SNES/DM Jacobian Message-ID: <26ABE445-8F1E-428B-AAD1-926F595F8665@earth.ox.ac.uk> Hi all, I am solving PDEs on a 1-D domain with a physical length that varies with time as a part of the solution. By change of variables, the discretised space coordinate is in [0,1]. This introduces a coupling between the time-dependent physical domain size and the PDEs. I would like to include the domain-size variable as a DOF in my SNES solve, and this can be accommodated as one of the DM-defined DOFs. However, doing so couples this particular DOF beyond the stencil, to DOFs throughout the domain. So I would like to modify the auto-generated Jacobian to include non-zeros that capture this coupling (via finite differences). 
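Concretely, the shape I have in mind is a Jacobian routine that first fills the stencil part exactly as the DM does now, and then appends the extra column by differencing the residual with respect to the domain-size unknown. A rough sketch (the routine name, the global index iL of that DOF, and the step h are all placeholders):

/* Add column iL of the Jacobian by one-sided finite differences of the residual.
   Assumes the stencil entries of P have already been computed and assembled. */
static PetscErrorCode AddDomainSizeColumn(SNES snes, Vec X, Mat P, PetscInt iL, PetscReal h)
{
  Vec                F0, F1, Xp;
  const PetscScalar *f0, *f1;
  PetscInt           rstart, rend;

  PetscFunctionBeginUser;
  /* allow entries outside the DM-preallocated stencil pattern (slow, but fine for a first serial version) */
  PetscCall(MatSetOption(P, MAT_NEW_NONZERO_LOCATION_ERR, PETSC_FALSE));
  PetscCall(MatSetOption(P, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE));
  PetscCall(VecDuplicate(X, &F0));
  PetscCall(VecDuplicate(X, &F1));
  PetscCall(VecDuplicate(X, &Xp));
  PetscCall(SNESComputeFunction(snes, X, F0));
  PetscCall(VecCopy(X, Xp));
  PetscCall(VecSetValue(Xp, iL, h, ADD_VALUES)); /* perturb only the domain-size DOF */
  PetscCall(VecAssemblyBegin(Xp));
  PetscCall(VecAssemblyEnd(Xp));
  PetscCall(SNESComputeFunction(snes, Xp, F1));
  PetscCall(VecGetOwnershipRange(F0, &rstart, &rend));
  PetscCall(VecGetArrayRead(F0, &f0));
  PetscCall(VecGetArrayRead(F1, &f1));
  for (PetscInt i = rstart; i < rend; ++i) {
    const PetscScalar dFdL = (f1[i - rstart] - f0[i - rstart]) / h;

    /* INSERT_VALUES overwrites any stencil entry already present in this column */
    if (dFdL != 0.0) PetscCall(MatSetValue(P, i, iL, dFdL, INSERT_VALUES));
  }
  PetscCall(VecRestoreArrayRead(F1, &f1));
  PetscCall(VecRestoreArrayRead(F0, &f0));
  PetscCall(MatAssemblyBegin(P, MAT_FINAL_ASSEMBLY));
  PetscCall(MatAssemblyEnd(P, MAT_FINAL_ASSEMBLY));
  PetscCall(VecDestroy(&Xp));
  PetscCall(VecDestroy(&F1));
  PetscCall(VecDestroy(&F0));
  PetscFunctionReturn(PETSC_SUCCESS);
}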
For the moment, I am not concerned about breaking the parallelism of the code. Any advice on the approach and PETSc functions that would be best suited would be appreciated. Thanks, Richard ___________________________ Richard Foa Katz Professor of Geodynamics Dept Earth Sciences, Univ Oxford https://urldefense.us/v3/__http://foalab.earth.ox.ac.uk__;!!G_uCfscf7eWS!b_oizbw-XnXax6rasRGtsC_i3pUtzbYYkWhRQifVKLUsOk-SsyuNX4kFQfNq7mrf4hiQxhTOt7sa1y3C7LIW-yAqSN0y-aIoV9k$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Fri Jun 13 18:06:52 2025 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 13 Jun 2025 19:06:52 -0400 Subject: [petsc-users] [petsc-maint] norm L2 problemQuestion about changing the norm used in nonlinear solvers (L2 Euclidean vs. L2 Lebesgue) In-Reply-To: <323745907.8383516.1749804945465.JavaMail.zimbra@utt.fr> References: <414475981.6714047.1749631527145.JavaMail.zimbra@utt.fr> <1703896473.7853283.1749734882144.JavaMail.zimbra@utt.fr> <461035026.7868511.1749735776853.JavaMail.zimbra@utt.fr> <323745907.8383516.1749804945465.JavaMail.zimbra@utt.fr> Message-ID: <85CD2CA9-7B77-4288-87BA-9E108D40C7E8@petsc.dev> I appreciate the clarification. I would call 3) preconditioning. To increase my understanding, you are already using Newton's method? That is, you compute the Jacobian of the function and use - J^{-1}(u^n) F(u^n) as your update direction? When you switch the inner product (or precondition) how will the search direction be different? Thanks Barry The case you need support for is becoming important to PETSc so we need to understand it well and support it well which is why I am asking these (perhaps to you) trivial questions. > On Jun 13, 2025, at 4:55?AM, Ali ALI AHMAD wrote: > > Thank you for your message. > > To answer your question: I would like to use the L2 norm in the sense of Lebesgue for all three purposes, especially the third one. > > 1- For displaying residuals during the nonlinear iterations, I would like to observe the convergence behavior using a norm that better reflects the physical properties of the problem. > > 2- For convergence testing, I would like the stopping criterion to be based on a weighted L2 norm that accounts for the geometry of the mesh (since I am working with unstructured, anisotropic triangular meshes). > > 3 - Most importantly, I would like to modify the inner product used in the algorithm so that it aligns with the weighted L2 norm (since I am working with unstructured, anisotropic triangular meshes). > > Best regards, > Ali ALI AHMAD > De: "Barry Smith" > ?: "Ali ALI AHMAD" > Cc: "petsc-users" , "petsc-maint" > Envoy?: Vendredi 13 Juin 2025 03:14:06 > Objet: Re: [petsc-maint] norm L2 problemQuestion about changing the norm used in nonlinear solvers (L2 Euclidean vs. L2 Lebesgue) > > You haven't answered my question. Where (conceptually) and for what purpose do you want to use the L2 norm. > 1) displaying norms to observe the convergence behavior > > 2) in the convergence testing to determine when to stop > > 3) changing the "inner product" in the algorithm which amounts to preconditioning. > > Barry > > > On Jun 12, 2025, at 9:42?AM, Ali ALI AHMAD wrote: > > Thank you for your answer. > > I am currently working with the nonlinear solvers newtonls (with bt, l2, etc.) and newtontr (using newton, cauchy, and dogleg strategies) combined with the linear solver gmres and the ILU preconditioner, since my Jacobian matrix is nonsymmetric. 
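For reference, that combination is roughly what the following command-line options select; this is only a sketch, and the exact line-search and trust-region variants would need to be matched to the PETSc version in use. The first line is the line-search solver, the second the trust-region one:

-snes_type newtonls -snes_linesearch_type bt -ksp_type gmres -pc_type ilu
-snes_type newtontr -ksp_type gmres -pc_type ilu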
> > I also use the Eisenstat-Walker method for newtonls, as my initial guess is often very far from the exact solution. > > What I would like to do now is to replace the standard Euclidean L2 norm with the L2 norm in the Lebesgue sense in the above numerical algorithm, because my problem is defined on an unstructured, anisotropic triangular mesh where a weighted norm would be more physically appropriate. > > Would you be able to advise me on how to implement this change properly? > > I would deeply appreciate any guidance or suggestions you could provide. > > Thank you in advance for your help. > > Best regards, > Ali ALI AHMAD > > De: "Ali ALI AHMAD" > ?: "Barry Smith" > Cc: "petsc-users" , "petsc-maint" > Envoy?: Jeudi 12 Juin 2025 15:28:02 > Objet: Re: [petsc-maint] norm L2 problemQuestion about changing the norm used in nonlinear solvers (L2 Euclidean vs. L2 Lebesgue) > > Thank you for your answer. > > I am currently working with the nonlinear solvers newtonls (with bt, l2, etc.) and newtontr (using newton, cauchy, and dogleg strategies) combined with the linear solver gmres and the ILU preconditioner, since my Jacobian matrix is nonsymmetric. > > I also use the Eisenstat-Walker method for newtonls, as my initial guess is often very far from the exact solution. > > What I would like to do now is to replace the standard Euclidean L2 norm with the L2 norm in the Lebesgue sense, because my problem is defined on an unstructured, anisotropic triangular mesh where a weighted norm would be more physically appropriate. > > Would you be able to advise me on how to implement this change properly? > > I would deeply appreciate any guidance or suggestions you could provide. > > Thank you in advance for your help. > > Best regards, > Ali ALI AHMAD > De: "Barry Smith" > ?: "Ali ALI AHMAD" > Cc: "petsc-users" , "petsc-maint" > Envoy?: Jeudi 12 Juin 2025 14:57:40 > Objet: Re: [petsc-maint] norm L2 problemQuestion about changing the norm used in nonlinear solvers (L2 Euclidean vs. L2 Lebesgue) > > Do you wish to use a different norm > > 1) ONLY for displaying (printing out) the residual norms to track progress > > 2) in the convergence testing > > 3) to change the numerical algorithm (for example using the L2 inner product instead of the usual linear algebra R^N l2 inner product). > > For 1) use SNESMonitorSet() and in your monitor function use SNESGetSolution() to grab the solution and then VecGetArray(). Now you can compute any weighted norm you want on the solution. > > For 2) similar but you need to use SNESSetConvergenceTest > > For 3) yes, but you need to ask us specifically. > > Barry > > > On Jun 11, 2025, at 4:45?AM, Ali ALI AHMAD wrote: > > Dear PETSc team, > > I hope this message finds you well. > > I am currently using PETSc in a C++, where I rely on the nonlinear solvers `SNES` with either `newtonls` or `newtontr` methods. I would like to ask if it is possible to change the default norm used (typically the L2 Euclidean norm) to a custom norm, specifically the L2 norm in the sense of Lebesgue (e.g., involving cell-wise weighted integrals over the domain). > > My main goal is to define a custom residual norm that better reflects the physical quantities of interest in my simulation. > > Would this be feasible within the PETSc framework? If so, could you point me to the recommended approach (e.g., redefining the norm manually, using specific PETSc hooks or options)? > > Thank you very much in advance for your help and for the great work on PETSc! 
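And for option 2), a matching sketch of a stopping test installed with SNESSetConvergenceTest(). It reuses the weighted-norm helper from the monitor sketch earlier in this thread; the tolerance and reference-norm fields on the context are again placeholders:

static PetscErrorCode WeightedL2ConvergenceTest(SNES snes, PetscInt it, PetscReal xnorm, PetscReal snorm, PetscReal fnorm, SNESConvergedReason *reason, void *ctx)
{
  WeightedNormCtx *user = (WeightedNormCtx *)ctx; /* assumed extended with atol, rtol, wnorm0 */
  Vec              F;
  PetscReal        wnorm;

  PetscFunctionBeginUser;
  PetscCall(SNESGetFunction(snes, &F, NULL, NULL));
  PetscCall(ComputeWeightedL2Norm(F, user, &wnorm));
  if (it == 0) user->wnorm0 = wnorm;
  *reason = SNES_CONVERGED_ITERATING;
  if (wnorm <= user->atol) *reason = SNES_CONVERGED_FNORM_ABS;
  else if (wnorm <= user->rtol * user->wnorm0) *reason = SNES_CONVERGED_FNORM_RELATIVE;
  PetscFunctionReturn(PETSC_SUCCESS);
}

/* PetscCall(SNESSetConvergenceTest(snes, WeightedL2ConvergenceTest, &user, NULL)); */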
> > Best regards, > > Ali ALI AHMAD > PhD Student > University of Technology of Troyes - UTT - France > GAMMA3 Project - Office H008 - Phone No: +33 7 67 44 68 18 > 12 rue Marie Curie - CS 42060 10004 TROYES Cedex > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Fri Jun 13 18:28:57 2025 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 13 Jun 2025 19:28:57 -0400 Subject: [petsc-users] Questions Regarding PETSc and Solving Constrained Structural Mechanics Problems In-Reply-To: References: <3BF3C1E8-0CB0-42F1-A624-8FA0DC7FD4A4@buaa.edu.cn> Message-ID: <35A61411-85CF-4F48-9DD6-0409F0CFE598@petsc.dev> Matt, Perhaps we should add options -ksp_monitor_debug and -snes_monitor_debug that turn on all possible monitoring for the (possibly) nested solvers and all of their converged reasons also? Note this is not completely trivial because each preconditioner will have to supply its list based on the current solver options for it. Then we won't need to constantly list a big string of problem specific monitor options to ask the user to use. Barry > On Jun 13, 2025, at 9:09?AM, Matthew Knepley wrote: > > On Thu, Jun 12, 2025 at 10:55?PM hexioafeng > wrote: >> Dear authors, >> >> I tried -pc_type game -pc_gamg_parallel_coarse_grid_solver and -pc_type field split -pc_fieldsplit_detect_saddle_point -fieldsplit_0_ksp_type pronely -fieldsplit_0_pc_type game -fieldsplit_0_mg_coarse_pc_type sad -fieldsplit_1_ksp_type pronely -fieldsplit_1_pc_type Jacobi _fieldsplit_1_sub_pc_type for , both options got the KSP_DIVERGE_PC_FAILED error. > > With any question about convergence, we need to see the output of > > -ksp_view -ksp_monitor_true_residual -ksp_converged_reason -fieldsplit_0_mg_levels_ksp_monitor_true_residual -fieldsplit_0_mg_levels_ksp_converged_reason -fieldsplit_1_ksp_monitor_true_residual -fieldsplit_1_ksp_converged_reason > > and all the error output. > > Thanks, > > Matt > >> Thanks, >> >> Xiaofeng >> >> >>> On Jun 12, 2025, at 20:50, Mark Adams > wrote: >>> >>> >>> >>> On Thu, Jun 12, 2025 at 8:44?AM Matthew Knepley > wrote: >>>> On Thu, Jun 12, 2025 at 4:58?AM Mark Adams > wrote: >>>>> Adding this to the PETSc mailing list, >>>>> >>>>> On Thu, Jun 12, 2025 at 3:43?AM hexioafeng > wrote: >>>>>> >>>>>> Dear Professor, >>>>>> >>>>>> I hope this message finds you well. >>>>>> >>>>>> I am an employee at a CAE company and a heavy user of the PETSc library. I would like to thank you for your contributions to PETSc and express my deep appreciation for your work. >>>>>> >>>>>> Recently, I encountered some difficulties when using PETSc to solve structural mechanics problems with Lagrange multiplier constraints. After searching extensively online and reviewing several papers, I found your previous paper titled "Algebraic multigrid methods for constrained linear systems with applications to contact problems in solid mechanics" seems to be the most relevant and helpful. >>>>>> >>>>>> The stiffness matrix I'm working with, K, is a block saddle-point matrix of the form (A00 A01; A10 0), where A00 is singular?just as described in your paper, and different from many other articles . I have a few questions regarding your work and would greatly appreciate your insights: >>>>>> >>>>>> 1. Is the AMG/KKT method presented in your paper available in PETSc? I tried using CG+GAMG directly but received a KSP_DIVERGED_PC_FAILED error. 
I also attempted to use CG+PCFIELDSPLIT with the following options: >>>>> >>>>> No >>>>> >>>>>> >>>>>> -pc_type fieldsplit -pc_fieldsplit_detect_saddle_point -pc_fieldsplit_type schur -pc_fieldsplit_schur_precondition selfp -pc_fieldsplit_schur_fact_type full -fieldsplit_0_ksp_type preonly -fieldsplit_0_pc_type gamg -fieldsplit_1_ksp_type preonly -fieldsplit_1_pc_type bjacobi >>>>>> >>>>>> Unfortunately, this also resulted in a KSP_DIVERGED_PC_FAILED error. Do you have any suggestions? >>>>>> >>>>>> 2. In your paper, you compare the method with Uzawa-type approaches. To my understanding, Uzawa methods typically require A00 to be invertible. How did you handle the singularity of A00 to construct an M-matrix that is invertible? >>>>>> >>>>> >>>>> You add a regularization term like A01 * A10 (like springs). See the paper or any reference to augmented lagrange or Uzawa >>>>> >>>>> >>>>>> 3. Can i implement the AMG/KKT method in your paper using existing AMG APIs? Implementing a production-level AMG solver from scratch would be quite challenging for me, so I?m hoping to utilize existing AMG interfaces within PETSc or other packages. >>>>>> >>>>> >>>>> You can do Uzawa and make the regularization matrix with matrix-matrix products. Just use AMG for the A00 block. >>>>> >>>>> >>>>>> 4. For saddle-point systems where A00 is singular, can you recommend any more robust or efficient solutions? Alternatively, are you aware of any open-source software packages that can handle such cases out-of-the-box? >>>>>> >>>>> >>>>> No, and I don't think PETSc can do this out-of-the-box, but others may be able to give you a better idea of what PETSc can do. >>>>> I think PETSc can do Uzawa or other similar algorithms but it will not do the regularization automatically (it is a bit more complicated than just A01 * A10) >>>> >>>> One other trick you can use is to have >>>> >>>> -fieldsplit_0_mg_coarse_pc_type svd >>>> >>>> This will use SVD on the coarse grid of GAMG, which can handle the null space in A00 as long as the prolongation does not put it back in. I have used this for the Laplacian with Neumann conditions and for freely floating elastic problems. >>>> >>> >>> Good point. >>> You can also use -pc_gamg_parallel_coarse_grid_solver to get GAMG to use a on level iterative solver for the coarse grid. >>> >>>> Thanks, >>>> >>>> Matt >>>> >>>>> Thanks, >>>>> Mark >>>>>> >>>>>> Thank you very much for taking the time to read my email. Looking forward to hearing from you. >>>>>> >>>>>> >>>>>> >>>>>> Sincerely, >>>>>> >>>>>> Xiaofeng He >>>>>> ----------------------------------------------------- >>>>>> >>>>>> Research Engineer >>>>>> >>>>>> Internet Based Engineering, Beijing, China >>>>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!ZuoxH7bPBC1jOgKxEkOzrXMRMf5S0R1zN2saKcW7TzoFI_OEmGiIFpgThYwQTcRZl5Jng_QkCfFT_HbMDqES5mI$ > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!ZuoxH7bPBC1jOgKxEkOzrXMRMf5S0R1zN2saKcW7TzoFI_OEmGiIFpgThYwQTcRZl5Jng_QkCfFT_HbMDqES5mI$ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From dargaville.steven at gmail.com Sat Jun 14 11:45:52 2025 From: dargaville.steven at gmail.com (Steven Dargaville) Date: Sat, 14 Jun 2025 17:45:52 +0100 Subject: [petsc-users] dmplex, triangle and generating unstructured mesh Message-ID: Hi, I'm just wondering if it is possible to generate a simple unstructured mesh in 2D with DMPlex and one of the external generators, like triangle? I'm running snes/tutorials/ex20 which uses triangle and I can only seem to generate structured triangular meshes. It is a simple 2D box so it makes sense a Delaunay might output a structured mesh in that case. I know I could get the DM to read in an unstructured mesh from a file that I have previously generated, but for ease I'm wondering if this can be done directly. I've tried a few different things to try and force an unstructured mesh being generated, like changing the number of faces to 1 (-dm_plex_box_faces 1,1) so it only sees the corners of the box domain. I've also tried to set some of the options to triangle with DMPlexTriangleSetOptions but that doesn't seem to have any effect, so I'm guessing I'm potentially not using it correctly (and I can't seem to find any examples that use this function to help). Anyone have any suggestions? Thanks for your help Steven -------------- next part -------------- An HTML attachment was scrubbed... URL: From hexiaofeng at buaa.edu.cn Sun Jun 15 20:45:43 2025 From: hexiaofeng at buaa.edu.cn (hexioafeng) Date: Mon, 16 Jun 2025 09:45:43 +0800 Subject: [petsc-users] Questions Regarding PETSc and Solving Constrained Structural Mechanics Problems In-Reply-To: <35A61411-85CF-4F48-9DD6-0409F0CFE598@petsc.dev> References: <3BF3C1E8-0CB0-42F1-A624-8FA0DC7FD4A4@buaa.edu.cn> <35A61411-85CF-4F48-9DD6-0409F0CFE598@petsc.dev> Message-ID: <5CFF6556-4BDE-48D9-9D3A-6D8790465358@buaa.edu.cn> Hello, Here are the options and outputs: options: -ksp_type cg -pc_type gamg -pc_gamg_parallel_coarse_grid_solver -pc_fieldsplit_detect_saddle_point -pc_fieldsplit_type schur -pc_fieldsplit_schur_precondition selfp -fieldsplit_1_mat_schur_complement_ainv_type lump -pc_fieldsplit_schur_fact_type full -fieldsplit_0_ksp_type preonly -fieldsplit_0_pc_type gamg -fieldsplit_0_mg_coarse_pc_type_type svd -fieldsplit_1_ksp_type preonly -fieldsplit_1_pc_type bjacobi -fieldsplit_1_sub_pc_type sor -ksp_view -ksp_monitor_true_residual -ksp_converged_reason -fieldsplit_0_mg_levels_ksp_monitor_true_residual -fieldsplit_0_mg_levels_ksp_converged_reason -fieldsplit_1_ksp_monitor_true_residual -fieldsplit_1_ksp_converged_reason output: 0 KSP unpreconditioned resid norm 2.777777777778e+01 true resid norm 2.777777777778e+01 ||r(i)||/||b|| 1.000000000000e+00 Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 PC failed due to SUBPC_ERROR KSP Object: 1 MPI processes type: cg maximum iterations=200, initial guess is zero tolerances: relative=1e-06, absolute=1e-12, divergence=1e+30 left preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg type is MULTIPLICATIVE, levels=2 cycles=v Cycles per PCApply=1 Using externally compute Galerkin coarse grid matrices GAMG specific options Threshold for dropping small values in graph on each level = Threshold scaling factor for each level not specified = 1. 
AGG specific options Symmetric graph false Number of levels to square graph 1 Number smoothing steps 1 Complexity: grid = 1.00176 Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi number of blocks = 1 Local solver is the same for all blocks, as in the following KSP and PC objects on rank 0: KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot [INBLOCKS] matrix ordering: nd factor fill ratio given 5., needed 1. Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=7, cols=7 package used to perform factorization: petsc total: nonzeros=45, allocated nonzeros=45 using I-node routines: found 3 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=7, cols=7 total: nonzeros=45, allocated nonzeros=45 total number of mallocs used during MatSetValues calls=0 using I-node routines: found 3 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: 1 MPI processes type: mpiaij rows=7, cols=7 total: nonzeros=45, allocated nonzeros=45 total number of mallocs used during MatSetValues calls=0 using nonscalable MatPtAP() implementation using I-node (on process 0) routines: found 3 nodes, limit used is 5 Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev eigenvalue estimates used: min = 0., max = 0. eigenvalues estimate via gmres min 0., max 0. eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1] KSP Object: (mg_levels_1_esteig_) 1 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=10, initial guess is zero tolerances: relative=1e-12, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: sor type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. linear system matrix = precond matrix: Mat Object: 1 MPI processes type: mpiaij rows=624, cols=624 total: nonzeros=25536, allocated nonzeros=25536 total number of mallocs used during MatSetValues calls=0 using I-node (on process 0) routines: found 336 nodes, limit used is 5 estimating eigenvalues using noisy right hand side maximum iterations=2, nonzero initial guess tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: sor type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. 
linear system matrix = precond matrix: Mat Object: 1 MPI processes type: mpiaij rows=624, cols=624 total: nonzeros=25536, allocated nonzeros=25536 total number of mallocs used during MatSetValues calls=0 using I-node (on process 0) routines: found 336 nodes, limit used is 5 Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Mat Object: 1 MPI processes type: mpiaij rows=624, cols=624 total: nonzeros=25536, allocated nonzeros=25536 total number of mallocs used during MatSetValues calls=0 using I-node (on process 0) routines: found 336 nodes, limit used is 5 Best regards, Xiaofeng > On Jun 14, 2025, at 07:28, Barry Smith wrote: > > > Matt, > > Perhaps we should add options -ksp_monitor_debug and -snes_monitor_debug that turn on all possible monitoring for the (possibly) nested solvers and all of their converged reasons also? Note this is not completely trivial because each preconditioner will have to supply its list based on the current solver options for it. > > Then we won't need to constantly list a big string of problem specific monitor options to ask the user to use. > > Barry > > > > >> On Jun 13, 2025, at 9:09?AM, Matthew Knepley wrote: >> >> On Thu, Jun 12, 2025 at 10:55?PM hexioafeng > wrote: >>> Dear authors, >>> >>> I tried -pc_type game -pc_gamg_parallel_coarse_grid_solver and -pc_type field split -pc_fieldsplit_detect_saddle_point -fieldsplit_0_ksp_type pronely -fieldsplit_0_pc_type game -fieldsplit_0_mg_coarse_pc_type sad -fieldsplit_1_ksp_type pronely -fieldsplit_1_pc_type Jacobi _fieldsplit_1_sub_pc_type for , both options got the KSP_DIVERGE_PC_FAILED error. >> >> With any question about convergence, we need to see the output of >> >> -ksp_view -ksp_monitor_true_residual -ksp_converged_reason -fieldsplit_0_mg_levels_ksp_monitor_true_residual -fieldsplit_0_mg_levels_ksp_converged_reason -fieldsplit_1_ksp_monitor_true_residual -fieldsplit_1_ksp_converged_reason >> >> and all the error output. >> >> Thanks, >> >> Matt >> >>> Thanks, >>> >>> Xiaofeng >>> >>> >>>> On Jun 12, 2025, at 20:50, Mark Adams > wrote: >>>> >>>> >>>> >>>> On Thu, Jun 12, 2025 at 8:44?AM Matthew Knepley > wrote: >>>>> On Thu, Jun 12, 2025 at 4:58?AM Mark Adams > wrote: >>>>>> Adding this to the PETSc mailing list, >>>>>> >>>>>> On Thu, Jun 12, 2025 at 3:43?AM hexioafeng > wrote: >>>>>>> >>>>>>> Dear Professor, >>>>>>> >>>>>>> I hope this message finds you well. >>>>>>> >>>>>>> I am an employee at a CAE company and a heavy user of the PETSc library. I would like to thank you for your contributions to PETSc and express my deep appreciation for your work. >>>>>>> >>>>>>> Recently, I encountered some difficulties when using PETSc to solve structural mechanics problems with Lagrange multiplier constraints. After searching extensively online and reviewing several papers, I found your previous paper titled "Algebraic multigrid methods for constrained linear systems with applications to contact problems in solid mechanics" seems to be the most relevant and helpful. >>>>>>> >>>>>>> The stiffness matrix I'm working with, K, is a block saddle-point matrix of the form (A00 A01; A10 0), where A00 is singular?just as described in your paper, and different from many other articles . I have a few questions regarding your work and would greatly appreciate your insights: >>>>>>> >>>>>>> 1. Is the AMG/KKT method presented in your paper available in PETSc? I tried using CG+GAMG directly but received a KSP_DIVERGED_PC_FAILED error. 
I also attempted to use CG+PCFIELDSPLIT with the following options: >>>>>> >>>>>> No >>>>>> >>>>>>> >>>>>>> -pc_type fieldsplit -pc_fieldsplit_detect_saddle_point -pc_fieldsplit_type schur -pc_fieldsplit_schur_precondition selfp -pc_fieldsplit_schur_fact_type full -fieldsplit_0_ksp_type preonly -fieldsplit_0_pc_type gamg -fieldsplit_1_ksp_type preonly -fieldsplit_1_pc_type bjacobi >>>>>>> >>>>>>> Unfortunately, this also resulted in a KSP_DIVERGED_PC_FAILED error. Do you have any suggestions? >>>>>>> >>>>>>> 2. In your paper, you compare the method with Uzawa-type approaches. To my understanding, Uzawa methods typically require A00 to be invertible. How did you handle the singularity of A00 to construct an M-matrix that is invertible? >>>>>>> >>>>>> >>>>>> You add a regularization term like A01 * A10 (like springs). See the paper or any reference to augmented lagrange or Uzawa >>>>>> >>>>>> >>>>>>> 3. Can i implement the AMG/KKT method in your paper using existing AMG APIs? Implementing a production-level AMG solver from scratch would be quite challenging for me, so I?m hoping to utilize existing AMG interfaces within PETSc or other packages. >>>>>>> >>>>>> >>>>>> You can do Uzawa and make the regularization matrix with matrix-matrix products. Just use AMG for the A00 block. >>>>>> >>>>>> >>>>>>> 4. For saddle-point systems where A00 is singular, can you recommend any more robust or efficient solutions? Alternatively, are you aware of any open-source software packages that can handle such cases out-of-the-box? >>>>>>> >>>>>> >>>>>> No, and I don't think PETSc can do this out-of-the-box, but others may be able to give you a better idea of what PETSc can do. >>>>>> I think PETSc can do Uzawa or other similar algorithms but it will not do the regularization automatically (it is a bit more complicated than just A01 * A10) >>>>> >>>>> One other trick you can use is to have >>>>> >>>>> -fieldsplit_0_mg_coarse_pc_type svd >>>>> >>>>> This will use SVD on the coarse grid of GAMG, which can handle the null space in A00 as long as the prolongation does not put it back in. I have used this for the Laplacian with Neumann conditions and for freely floating elastic problems. >>>>> >>>> >>>> Good point. >>>> You can also use -pc_gamg_parallel_coarse_grid_solver to get GAMG to use a on level iterative solver for the coarse grid. >>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>>> Thanks, >>>>>> Mark >>>>>>> >>>>>>> Thank you very much for taking the time to read my email. Looking forward to hearing from you. >>>>>>> >>>>>>> >>>>>>> >>>>>>> Sincerely, >>>>>>> >>>>>>> Xiaofeng He >>>>>>> ----------------------------------------------------- >>>>>>> >>>>>>> Research Engineer >>>>>>> >>>>>>> Internet Based Engineering, Beijing, China >>>>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!e0AMQ70ZmiHfz7acc2y7bj16p9tQITJx5wjrNfKN3d2Iu_q0ghMi5CQvuisbw_fPYd8w-s_2iyDRH5xvxgTZ9JBIm5UKKQ$ >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
>> -- Norbert Wiener >> >> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!e0AMQ70ZmiHfz7acc2y7bj16p9tQITJx5wjrNfKN3d2Iu_q0ghMi5CQvuisbw_fPYd8w-s_2iyDRH5xvxgTZ9JBIm5UKKQ$ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexandre.scotto at irt-saintexupery.com Mon Jun 16 01:24:50 2025 From: alexandre.scotto at irt-saintexupery.com (SCOTTO Alexandre) Date: Mon, 16 Jun 2025 06:24:50 +0000 Subject: [petsc-users] Insertion mode for Scatter In-Reply-To: <1041170C-9E78-4EFE-BC86-30139D42F2C2@joliv.et> References: <88F2D763-F4BE-452B-8ACE-5BFCCCE3C733@joliv.et> <1041170C-9E78-4EFE-BC86-30139D42F2C2@joliv.et> Message-ID: Hello Pierre, My example is indeed not very relevant for scattering, but in practice I have vectors of same global size but distributed differently over the processes, and I need to transfer the whole data from one vector to the other. In this case, I think that the solution below may be reasonably efficient. Regards, Alexandre. De : Pierre Jolivet Envoy? : vendredi 13 juin 2025 11:59 ? : SCOTTO Alexandre Cc : petsc-users at mcs.anl.gov Objet : Re: [petsc-users] Insertion mode for Scatter Please always keep the list in copy. On 13 Jun 2025, at 11:55?AM, SCOTTO Alexandre > wrote: Hello Pierre, Thank you for you answer. I think I see the subtlety here, and it makes realized that I have not properly understood yet how index sets should be manipulated, in particular whether the provided indices are local or global. It seems that this code solves my issue: index_set_1 = PETSc.IS().createStride( vec_1.local_size, first=vec_1.owner_range[0], step=1 ) index_set_2 = PETSc.IS().createStride( vec_2.local_size, first=vec_2.owner_range[0], step=1 ) scatter = PETSc.Scatter().create(vec_1, index_set_1, vec_2, index_set_2) scatter.scatter(vec_1, vec_2, addv=True) Do you think that this is a decent manner of transfering the whole content of a vector to another of same dimension? I have a lot of this scattering to perform so if you have a better recommendation I would be pleased. The ?scattering? you are defining and using is basically a VecAXPY(), so using a scatter for such an operation is definitely _not_ the way to go. Thanks, Pierre Best regards, Alexandre. De : Pierre Jolivet > Envoy? : vendredi 13 juin 2025 11:43 ? : SCOTTO Alexandre > Cc : petsc-users at mcs.anl.gov Objet : Re: [petsc-users] Insertion mode for Scatter On 13 Jun 2025, at 11:32?AM, SCOTTO Alexandre via petsc-users > wrote: Dear PETSc community, I am currently struggling with the ADD_VALUE mode of the Scatter object. Here is a simple piece of (Python) code to illustrate the issue: vec_1 = PETSc.Vec().createMPI(size=10) vec_1.shift(2.0) vec_2 = PETSc.Vec().createMPI(size=10) vec_2.shift(1.0) index_set = PETSc.IS().createStride(10, step=1) scatter = PETSc.Scatter().create(vec_1, index_set, vec_2, index_set) scatter.scatter(vec_1, vec_2, addv=True) Vectors vec_1 and vec_2 are respectively filled-in with 2.0 and 1.0. After the scattering, I would expect to have in vec_2 the sum of the values initially in vec_2 (that is 1.0) plus the values coming from vec_1 (that is 2.0). But instead of having vec_2 filled in with 3.0 it is filled-in with 9.0. My understanding is that the number of processes (here 4 processes) plays a role since 9.0 = 1.0 (initial value) + 2.0 (values coming from vec_1) x 4 (number of processes). Is there a way to have simply 3.0 as a result? 
With the index_set you are supplying, you are basically saying that each process should scatter the complete vector (not just its local portion). If you use an index_set which does not induce communication (e.g., of size the local size of the Vec and with the same start as the first local row of the Vec), then you?ll get 3.0 as a result. Thanks, Pierre Hoping to have been clear enough, best regards, Alexandre. -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Jun 16 05:40:10 2025 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 16 Jun 2025 06:40:10 -0400 Subject: [petsc-users] Questions Regarding PETSc and Solving Constrained Structural Mechanics Problems In-Reply-To: <5CFF6556-4BDE-48D9-9D3A-6D8790465358@buaa.edu.cn> References: <3BF3C1E8-0CB0-42F1-A624-8FA0DC7FD4A4@buaa.edu.cn> <35A61411-85CF-4F48-9DD6-0409F0CFE598@petsc.dev> <5CFF6556-4BDE-48D9-9D3A-6D8790465358@buaa.edu.cn> Message-ID: On Sun, Jun 15, 2025 at 9:46?PM hexioafeng wrote: > Hello, > > Here are the options and outputs: > > options: > > -ksp_type cg -pc_type gamg -pc_gamg_parallel_coarse_grid_solver > -pc_fieldsplit_detect_saddle_point -pc_fieldsplit_type schur > -pc_fieldsplit_schur_precondition selfp > -fieldsplit_1_mat_schur_complement_ainv_type lump > -pc_fieldsplit_schur_fact_type full -fieldsplit_0_ksp_type preonly > -fieldsplit_0_pc_type gamg -fieldsplit_0_mg_coarse_pc_type_type svd > -fieldsplit_1_ksp_type preonly -fieldsplit_1_pc_type bjacobi > -fieldsplit_1_sub_pc_type sor -ksp_view -ksp_monitor_true_residual > -ksp_converged_reason -fieldsplit_0_mg_levels_ksp_monitor_true_residual > -fieldsplit_0_mg_levels_ksp_converged_reason > -fieldsplit_1_ksp_monitor_true_residual > -fieldsplit_1_ksp_converged_reason > This option was wrong: -fieldsplit_0_mg_coarse_pc_type_type svd from the output, we can see that it should have been -fieldsplit_0_mg_coarse_sub_pc_type_type svd THanks, Matt > output: > > 0 KSP unpreconditioned resid norm 2.777777777778e+01 true resid norm > 2.777777777778e+01 ||r(i)||/||b|| 1.000000000000e+00 > Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 > PC failed due to SUBPC_ERROR > KSP Object: 1 MPI processes > type: cg > maximum iterations=200, initial guess is zero > tolerances: relative=1e-06, absolute=1e-12, divergence=1e+30 > left preconditioning > using UNPRECONDITIONED norm type for convergence test > PC Object: 1 MPI processes > type: gamg > type is MULTIPLICATIVE, levels=2 cycles=v > Cycles per PCApply=1 > Using externally compute Galerkin coarse grid matrices > GAMG specific options > Threshold for dropping small values in graph on each level = > Threshold scaling factor for each level not specified = 1. > AGG specific options > Symmetric graph false > Number of levels to square graph 1 > Number smoothing steps 1 > Complexity: grid = 1.00176 > Coarse grid solver -- level ------------------------------- > KSP Object: (mg_coarse_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (mg_coarse_) 1 MPI processes > type: bjacobi > number of blocks = 1 > Local solver is the same for all blocks, as in the following KSP > and PC objects on rank 0: > KSP Object: (mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> left preconditioning > using NONE norm type for convergence test > PC Object: (mg_coarse_sub_) 1 MPI processes > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > using diagonal shift on blocks to prevent zero pivot [INBLOCKS] > matrix ordering: nd > factor fill ratio given 5., needed 1. > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=7, cols=7 > package used to perform factorization: petsc > total: nonzeros=45, allocated nonzeros=45 > using I-node routines: found 3 nodes, limit used is 5 > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=7, cols=7 > total: nonzeros=45, allocated nonzeros=45 > total number of mallocs used during MatSetValues calls=0 > using I-node routines: found 3 nodes, limit used is 5 > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: mpiaij > rows=7, cols=7 > total: nonzeros=45, allocated nonzeros=45 > total number of mallocs used during MatSetValues calls=0 > using nonscalable MatPtAP() implementation > using I-node (on process 0) routines: found 3 nodes, limit used > is 5 > Down solver (pre-smoother) on level 1 ------------------------------- > KSP Object: (mg_levels_1_) 1 MPI processes > type: chebyshev > eigenvalue estimates used: min = 0., max = 0. > eigenvalues estimate via gmres min 0., max 0. > eigenvalues estimated using gmres with translations [0. 0.1; 0. > 1.1] > KSP Object: (mg_levels_1_esteig_) 1 MPI processes > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=10, initial guess is zero > tolerances: relative=1e-12, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: (mg_levels_1_) 1 MPI processes > type: sor > type = local_symmetric, iterations = 1, local iterations = 1, > omega = 1. > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: mpiaij > rows=624, cols=624 > total: nonzeros=25536, allocated nonzeros=25536 > total number of mallocs used during MatSetValues calls=0 > using I-node (on process 0) routines: found 336 nodes, limit > used is 5 > estimating eigenvalues using noisy right hand side > maximum iterations=2, nonzero initial guess > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (mg_levels_1_) 1 MPI processes > type: sor > type = local_symmetric, iterations = 1, local iterations = 1, > omega = 1. 
linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: mpiaij > rows=624, cols=624 > total: nonzeros=25536, allocated nonzeros=25536 > total number of mallocs used during MatSetValues calls=0 > using I-node (on process 0) routines: found 336 nodes, limit > used is 5 Up solver (post-smoother) same as down solver (pre-smoother) > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: mpiaij > rows=624, cols=624 > total: nonzeros=25536, allocated nonzeros=25536 > total number of mallocs used during MatSetValues calls=0 > using I-node (on process 0) routines: found 336 nodes, limit used is > 5 > > > Best regards, > > Xiaofeng > > > On Jun 14, 2025, at 07:28, Barry Smith wrote: > > > Matt, > > Perhaps we should add options -ksp_monitor_debug and > -snes_monitor_debug that turn on all possible monitoring for the (possibly) > nested solvers and all of their converged reasons also? Note this is not > completely trivial because each preconditioner will have to supply its list > based on the current solver options for it. > > Then we won't need to constantly list a big string of problem specific > monitor options to ask the user to use. > > Barry > > > > > On Jun 13, 2025, at 9:09?AM, Matthew Knepley wrote: > > On Thu, Jun 12, 2025 at 10:55?PM hexioafeng > wrote: > >> Dear authors, >> >> I tried *-pc_type game -pc_gamg_parallel_coarse_grid_solver* and *-pc_type >> field split -pc_fieldsplit_detect_saddle_point -fieldsplit_0_ksp_type >> pronely -fieldsplit_0_pc_type game -fieldsplit_0_mg_coarse_pc_type sad >> -fieldsplit_1_ksp_type pronely -fieldsplit_1_pc_type Jacobi >> _fieldsplit_1_sub_pc_type for* , both options got the >> KSP_DIVERGE_PC_FAILED error. >> > > With any question about convergence, we need to see the output of > > -ksp_view -ksp_monitor_true_residual -ksp_converged_reason > -fieldsplit_0_mg_levels_ksp_monitor_true_residual > -fieldsplit_0_mg_levels_ksp_converged_reason > -fieldsplit_1_ksp_monitor_true_residual -fieldsplit_1_ksp_converged_reason > > and all the error output. > > Thanks, > > Matt > > >> Thanks, >> >> Xiaofeng >> >> >> On Jun 12, 2025, at 20:50, Mark Adams wrote: >> >> >> >> On Thu, Jun 12, 2025 at 8:44?AM Matthew Knepley >> wrote: >> >>> On Thu, Jun 12, 2025 at 4:58?AM Mark Adams wrote: >>> >>>> Adding this to the PETSc mailing list, >>>> >>>> On Thu, Jun 12, 2025 at 3:43?AM hexioafeng >>>> wrote: >>>> >>>>> >>>>> Dear Professor, >>>>> >>>>> I hope this message finds you well. >>>>> >>>>> I am an employee at a CAE company and a heavy user of the PETSc >>>>> library. I would like to thank you for your contributions to PETSc and >>>>> express my deep appreciation for your work. >>>>> >>>>> Recently, I encountered some difficulties when using PETSc to solve >>>>> structural mechanics problems with Lagrange multiplier constraints. After >>>>> searching extensively online and reviewing several papers, I found your >>>>> previous paper titled "*Algebraic multigrid methods for constrained >>>>> linear systems with applications to contact problems in solid mechanics*" >>>>> seems to be the most relevant and helpful. >>>>> >>>>> The stiffness matrix I'm working with, *K*, is a block saddle-point >>>>> matrix of the form (A00 A01; A10 0), where *A00 is singular*?just as >>>>> described in your paper, and different from many other articles . I have a >>>>> few questions regarding your work and would greatly appreciate your >>>>> insights: >>>>> >>>>> 1. Is the *AMG/KKT* method presented in your paper available in >>>>> PETSc? 
I tried using *CG+GAMG* directly but received a >>>>> *KSP_DIVERGED_PC_FAILED* error. I also attempted to use >>>>> *CG+PCFIELDSPLIT* with the following options: >>>>> >>>> >>>> No >>>> >>>> >>>>> >>>>> -pc_type fieldsplit -pc_fieldsplit_detect_saddle_point >>>>> -pc_fieldsplit_type schur -pc_fieldsplit_schur_precondition selfp >>>>> -pc_fieldsplit_schur_fact_type full -fieldsplit_0_ksp_type preonly >>>>> -fieldsplit_0_pc_type gamg -fieldsplit_1_ksp_type preonly >>>>> -fieldsplit_1_pc_type bjacobi >>>>> >>>>> Unfortunately, this also resulted in a *KSP_DIVERGED_PC_FAILED* error. >>>>> Do you have any suggestions? >>>>> >>>>> 2. In your paper, you compare the method with *Uzawa*-type >>>>> approaches. To my understanding, Uzawa methods typically require A00 to be >>>>> invertible. How did you handle the singularity of A00 to construct an >>>>> M-matrix that is invertible? >>>>> >>>>> >>>> You add a regularization term like A01 * A10 (like springs). See the >>>> paper or any reference to augmented lagrange or Uzawa >>>> >>>> >>>> 3. Can i implement the AMG/KKT method in your paper using existing *AMG >>>>> APIs*? Implementing a production-level AMG solver from scratch would >>>>> be quite challenging for me, so I?m hoping to utilize existing AMG >>>>> interfaces within PETSc or other packages. >>>>> >>>>> >>>> You can do Uzawa and make the regularization matrix with matrix-matrix >>>> products. Just use AMG for the A00 block. >>>> >>>> >>>> >>>>> 4. For saddle-point systems where A00 is singular, can you recommend >>>>> any more robust or efficient solutions? Alternatively, are you aware of any >>>>> open-source software packages that can handle such cases out-of-the-box? >>>>> >>>>> >>>> No, and I don't think PETSc can do this out-of-the-box, but others may >>>> be able to give you a better idea of what PETSc can do. >>>> I think PETSc can do Uzawa or other similar algorithms but it will not >>>> do the regularization automatically (it is a bit more complicated than just >>>> A01 * A10) >>>> >>> >>> One other trick you can use is to have >>> >>> -fieldsplit_0_mg_coarse_pc_type svd >>> >>> This will use SVD on the coarse grid of GAMG, which can handle the null >>> space in A00 as long as the prolongation does not put it back in. I have >>> used this for the Laplacian with Neumann conditions and for freely floating >>> elastic problems. >>> >>> >> Good point. >> You can also use -pc_gamg_parallel_coarse_grid_solver to get GAMG to use >> a on level iterative solver for the coarse grid. >> >> >>> Thanks, >>> >>> Matt >>> >>> >>>> Thanks, >>>> Mark >>>> >>>>> >>>>> Thank you very much for taking the time to read my email. Looking >>>>> forward to hearing from you. >>>>> >>>>> >>>>> >>>>> Sincerely, >>>>> >>>>> Xiaofeng He >>>>> ----------------------------------------------------- >>>>> >>>>> Research Engineer >>>>> >>>>> Internet Based Engineering, Beijing, China >>>>> >>>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!dYETsi-moODALE1tmLrk5pxFKF9l552nNiC0cBgsCQ9ebugJWHtsNYa0QBS5Gmws9J_VC_Iec3Nx0Rvlk5er$ >>> >>> >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. 
> -- Norbert Wiener > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!dYETsi-moODALE1tmLrk5pxFKF9l552nNiC0cBgsCQ9ebugJWHtsNYa0QBS5Gmws9J_VC_Iec3Nx0Rvlk5er$ > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!dYETsi-moODALE1tmLrk5pxFKF9l552nNiC0cBgsCQ9ebugJWHtsNYa0QBS5Gmws9J_VC_Iec3Nx0Rvlk5er$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From nicolas.barral at math.u-bordeaux.fr Mon Jun 16 06:37:36 2025 From: nicolas.barral at math.u-bordeaux.fr (Nicolas Barral) Date: Mon, 16 Jun 2025 13:37:36 +0200 Subject: [petsc-users] dmplex, triangle and generating unstructured mesh In-Reply-To: References: Message-ID: <32b86109-d082-42ad-8b04-a42c02f39971@math.u-bordeaux.fr> On 14/06/2025 18:45, Steven Dargaville wrote: > Hi, I'm just wondering if it is possible to generate a simple > unstructured mesh in 2D with DMPlex and one of the external generators, > like triangle? > > I'm running snes/tutorials/ex20 which uses triangle and I can only seem > to generate structured triangular meshes. It is a simple 2D box so it > makes sense a?Delaunay might output a structured mesh in that case. I > know I could get the DM to read in an unstructured mesh from a file that > I have previously generated, but for ease I'm wondering if this can be > done directly. > > I've tried a few different things to?try and force an unstructured mesh > being generated, like changing the number of faces to 1 (- > dm_plex_box_faces 1,1) so it only sees?the corners of the box domain. > > I've also tried to set some of the options to triangle with > DMPlexTriangleSetOptions but that doesn't seem to have any effect, so > I'm guessing I'm potentially not using it correctly (and I can't seem to > find any examples that use this function to help). > > Anyone have any suggestions? Thanks for your help I don't know how the mesh is generated, but a workaround would be to generate a coarse mesh then use the adaptive remeshing functions (e.g. with MMG) - with a uniform metric if you want a uniform mesh. plex/tests/ex19 should show how it works. Thanks -- Nicolas > Steven From knepley at gmail.com Mon Jun 16 06:53:19 2025 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 16 Jun 2025 07:53:19 -0400 Subject: [petsc-users] dmplex, triangle and generating unstructured mesh In-Reply-To: <32b86109-d082-42ad-8b04-a42c02f39971@math.u-bordeaux.fr> References: <32b86109-d082-42ad-8b04-a42c02f39971@math.u-bordeaux.fr> Message-ID: On Mon, Jun 16, 2025 at 7:38?AM Nicolas Barral < nicolas.barral at math.u-bordeaux.fr> wrote: > On 14/06/2025 18:45, Steven Dargaville wrote: > > Hi, I'm just wondering if it is possible to generate a simple > > unstructured mesh in 2D with DMPlex and one of the external generators, > > like triangle? > > > > I'm running snes/tutorials/ex20 which uses triangle and I can only seem > > to generate structured triangular meshes. It is a simple 2D box so it > > makes sense a Delaunay might output a structured mesh in that case. I > > know I could get the DM to read in an unstructured mesh from a file that > > I have previously generated, but for ease I'm wondering if this can be > > done directly. 
> > > > I've tried a few different things to try and force an unstructured mesh > > being generated, like changing the number of faces to 1 (- > > dm_plex_box_faces 1,1) so it only sees the corners of the box domain. > > > > I've also tried to set some of the options to triangle with > > DMPlexTriangleSetOptions but that doesn't seem to have any effect, so > > I'm guessing I'm potentially not using it correctly (and I can't seem to > > find any examples that use this function to help). > > > > Anyone have any suggestions? Thanks for your help > > I don't know how the mesh is generated, but a workaround would be to > generate a coarse mesh then use the adaptive remeshing functions (e.g. > with MMG) - with a uniform metric if you want a uniform mesh. > plex/tests/ex19 should show how it works. >

Is the idea just to get something that "looks unstructured"? You can usually do that by making the boundary more complicated. For example, here is NY. If instead you have some sort of criterion, then you might use AMR as Nicolas suggests.

Thanks,

   Matt

> Thanks > > -- > Nicolas > > Steven > > --

What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener

https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!fUgJTt40Ns7Go6ZSrWZ5h3-IUfGrvM4n144mw_tHWwYpNSbsfcGerK4VdEYw7MbkA2qL_6Z35zIY3gKGRtkt$

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: NYSMesh.png
Type: image/png
Size: 130472 bytes
Desc: not available
URL:

From shenren_xu at nwpu.edu.cn  Mon Jun 16 04:25:03 2025
From: shenren_xu at nwpu.edu.cn (shenren_xu at nwpu.edu.cn)
Date: Mon, 16 Jun 2025 17:25:03 +0800 (GMT+08:00)
Subject: [petsc-users] double orthogonalization for modified gram schmidt in KSPGMRES
Message-ID: <1d97768e.605d8.197780e77a8.Coremail.shenren_xu@nwpu.edu.cn>

Dear PETSc moderator,

I found there are classical and modified Gram-Schmidt options for the KSPGMRES solver. For classical GS, one could define additional orthogonalization sweeps, while for modified GS there is no such option. I got the impression that the PETSc GMRES implementation assumes that MGS is so robust that double orthogonalization is unnecessary. However, our recent experience indicated that even for MGS, double orthogonalization is necessary. Otherwise, GMRES would produce an increasing residual convergence history.

I'm emailing for clarification on this: was it because I did not use the option correctly, with some misunderstanding of the user guide, or is this indeed the current situation for the KSPGMRES solver implementation?

As a side note, it was written in Yousef Saad's book 'Iterative methods for sparse linear systems, second edition' (page 162) that "However, there are cases where cancellations are so severe in the orthogonalization steps that even the Modified Gram-Schmidt option is inadequate." It seems that Prof. Saad was well aware of this, which backs our finding.

Thanks and look forward to further discussion on this.

Best regards,
Shenren
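P.S. For reference, the options I am referring to (if I am reading the manual pages correctly) are

   -ksp_gmres_modifiedgramschmidt
   -ksp_gmres_classicalgramschmidt -ksp_gmres_cgs_refinement_type <refine_never,refine_ifneeded,refine_always>

i.e. the extra-sweep (refinement) choice appears to be attached only to the classical Gram-Schmidt variant, which is what prompted my question.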
Shenren Xu, PhD
Associate Professor
School of Power and Energy
Northwestern Polytechnical University
Xi'an 710129, China P.R.
Tel: +86-18762660364
Web: https://urldefense.us/v3/__https://teacher.nwpu.edu.cn/xushenren.html__;!!G_uCfscf7eWS!ZQPNzqIpQl_f9D2grHzxnPVMeys4fV6pumRzr-GoGSdYnvlc6af7refw2J_yHEC2AJewkzY4oRv6Z4fmyMum0-iAy1Lk4Q$
Email: shenren_xu at nwpu.edu.cn

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jroman at dsic.upv.es  Mon Jun 16 11:38:26 2025
From: jroman at dsic.upv.es (Jose E. Roman)
Date: Mon, 16 Jun 2025 16:38:26 +0000
Subject: [petsc-users] double orthogonalization for modified gram schmidt in KSPGMRES
In-Reply-To: <1d97768e.605d8.197780e77a8.Coremail.shenren_xu@nwpu.edu.cn>
References: <1d97768e.605d8.197780e77a8.Coremail.shenren_xu@nwpu.edu.cn>
Message-ID: <655E1402-3118-45E6-A1CB-C56310E11E6B@dsic.upv.es>

It is well known that MGS will not guarantee a fully orthogonal basis. However, the Krylov basis is usually good enough when you solve linear systems (GMRES). A different story is when you want to approximate eigenvalues, in which case the quality of the orthogonal basis is more critical.

On the other hand, both MGS with reorthogonalization and CGS with reorthogonalization will give you a similar level of orthogonality. In that scenario, CGS is preferred because its performance is much better (both sequentially and in parallel).

An implementation of MGS with reorthogonalization is available in SLEPc (for eigenvalues).

Jose

> On 16 Jun 2025, at 11:25, shenren_xu at nwpu.edu.cn wrote:
>
> Dear PETSc moderator,
>
> I found there are classical and modified Gram-Schmidt options for the KSPGMRES solver. For classical GS, one could define additional orthogonalization sweeps, while for modified GS there is no such option. I got the impression that the PETSc GMRES implementation assumes that MGS is so robust that double orthogonalization is unnecessary. However, our recent experience indicated that even for MGS, double orthogonalization is necessary. Otherwise, GMRES would produce an increasing residual convergence history.
> I'm emailing for clarification on this: was it because I did not use the option correctly, with some misunderstanding of the user guide, or is this indeed the current situation for the KSPGMRES solver implementation?
>
> As a side note, it was written in Yousef Saad's book 'Iterative methods for sparse linear systems, second edition' (page 162) that "However, there are cases where cancellations are so severe in the orthogonalization steps that even the Modified Gram-Schmidt option is inadequate." It seems that Prof. Saad was well aware of this, which backs our finding.
>
> Thanks and look forward to further discussion on this.
>
> Best regards,
> Shenren
>
> Shenren Xu, PhD
> Associate Professor
> School of Power and Energy
> Northwestern Polytechnical University
> Xi'an 710129, China P.R.
> Tel: +86-18762660364 > Web: https://urldefense.us/v3/__https://teacher.nwpu.edu.cn/xushenren.html__;!!G_uCfscf7eWS!fXV3dUwA-5jwXIF4vDoO6e_6nqAlIm_SVSvmZ8pQxdq7xnPr4btoBkBStRwtqJSR1MVayLPV4QAtsa_11iePHK6B$ > Email: shenren_xu at nwpu.edu.cn > From mac3bar at gmail.com Mon Jun 16 13:05:43 2025 From: mac3bar at gmail.com (Art) Date: Mon, 16 Jun 2025 14:05:43 -0400 Subject: [petsc-users] High Memory Usage with PETSc BDF Solver Compared to scikits.odes CVODE Message-ID: Hi everyone, I?m porting a code from scikits.odes.sundials.cvode to PETSc, since it is compatible with FEniCSx and can run in parallel with MPI. First, I used Petsc with the "rk" solver, and it worked well, both serially and in parallel for a system with 14000 nodes (42000 dofs). However, when using an implicit solver like bdf, the solver takes up all the memory (16 gb), even on a small system. To do this, use this: def ifunction(self, ts, t, y, ydot, f): y.ghostUpdate(PETSc.InsertMode.INSERT_VALUES,PETSc.ScatterMode.FORWARD) y.copy(result=self.yv.x.petsc_vec) self.yv.x.scatter_forward() dydt = self.rhs(self.yv) dydt.x.scatter_forward() ydot.copy(result=f) f.axpy(-1.0, dydt.x.petsc_vec) return 0 y = y0.petsc_vec.copy() ts.setType(ts.Type.BDF) ts.setIFunction(ifunction) ts.setTime(0.0) ts.setTimeStep(1e-14) ts.setStepLimits(1e-17,1e-12) ts.setMaxTime(1.0e-12) ts.setExactFinalTime(PETSc.TS.ExactFinalTime.STEPOVER) ts.setTolerances(rtol=1e-6, atol=1e-6) snes = ts.getSNES() ksp = snes.getKSP() ksp.setType("gmres") ksp.getPC().setType("none") ksp.setFromOptions() For the scikits.odes.sundials.cvode library, in serial mode, I have used: solver = CVODE(rhs, old_api=False, linsolver='spgmr', rtol=1e-6, atol=1e-6, max_steps=5000, order=2) In this case, the solver worked perfectly and obtained similar results to the rk solver in PETSC. I suspect the issue might be related to the way the Jacobian is built in PETSC, but scikits.odes.sundials.cvode works perfectly without requiring the Jacobian. I would greatly appreciate any suggestions or examples on how to properly set up the BDF solver with PETSc. Thanks Art -------------- next part -------------- An HTML attachment was scrubbed... URL: From shenren_xu at nwpu.edu.cn Mon Jun 16 13:32:06 2025 From: shenren_xu at nwpu.edu.cn (shenren_xu at nwpu.edu.cn) Date: Tue, 17 Jun 2025 02:32:06 +0800 (GMT+08:00) Subject: [petsc-users] double orthogonalization for modified gram schmidt in KSPGMRES In-Reply-To: <655E1402-3118-45E6-A1CB-C56310E11E6B@dsic.upv.es> References: <1d97768e.605d8.197780e77a8.Coremail.shenren_xu@nwpu.edu.cn> <655E1402-3118-45E6-A1CB-C56310E11E6B@dsic.upv.es> Message-ID: <36c9f41a.60c96.1977a034f5f.Coremail.shenren_xu@nwpu.edu.cn> Dear Prof. Roman, Thank you very much for the swift reply. I agree with your first paragraph. And your second paragraph answers my planned follow-up question. It's good to know that CGS with reorthogonalization would render a similar result compared with MGS with double orthogonalization. I never thought about that as I assuemd that MGS is always the default option in my limited experience. I really appreciate your help on this. Cheers, Shenren > -----????----- > ???: "Jose E. Roman" > ????:2025-06-17 00:38:26 (???) > ???: "shenren_xu at nwpu.edu.cn" > ??: "petsc-users at mcs.anl.gov" , ??? > ??: Re: [petsc-users] double orthogonalization for modified gram schmidt in KSPGMRES > > It is well known that MGS will not guarantee a fully orthogonal basis. However, the Krylov basis is usually good enough when you solve linear systems (GMRES). 
A different story is when you want to approximate eigenvalues, in which case the quality of the orthogonal basis is more critical. > > On the other hand, both MGS with reorthogonalization and CGS with reorthogonalization will give you a similar level of orthogonality. In that scenario, CGS is preferred because its performance is much better (both sequentially and in parallel). > > An implementation of MGS with reorthogonalization is available in SLEPc (for eigenvalues). > > Jose > > > > El 16 jun 2025, a las 11:25, shenren_xu at nwpu.edu.cn escribi?: > > > > Dear PETSC moderator, > > > > I found there are classical and modified gram-schmidt options for KSPGMRES solver. For classical GS, one could define > > additional orthogonalization sweeps, while for modified GS there is no such option. I got the impression that PETSC > > GMRES implementation assumes that the MGS is so robust that double orthogonalization is unnecessary. However, our > > recent experience indicated that even for MGS, double orth. is nenecessary. Otherwise, GMRES would produce an increase > > residual convergence history. > > I'm emailing for clarification on this: was it because I did not use the option correctly with some misunderstanding > > about the user guide, or is this indeed the current situation for the KSPGMRES solver implemention? > > > > As a side note, it was written in Yousef Saad's book 'Iterative methods for sparse linear systems, second editon' > > (page 162) that > > "However, there are cases where cancellations are so severe in the orthogonalization steps that even the Modi?ed Gram-Schmidt option is inadequate." It seems that Prof Saad was well aware of this, which backs > > our finding. > > > > Thanks and look forward to further discussion on this. > > > > Best regards, > > Shenren > > > > ??? > > ????????????? ???/?? > > ??/???18762660364 > > ?????shenren_xu at nwpu.edu.cn > > ?????https://urldefense.us/v3/__https://teacher.nwpu.edu.cn/xushenren.html__;!!G_uCfscf7eWS!YoYB4_XPhklQn6ykT0zNJh7eMh1sPOYNDSXEb4BjRYNBgoQdlRlqKreuZ2JEGhr4NAMsqsscF_EHoOIMAPn5phleh1vBxQ$ Shenren Xu, PhD > > Associate Professor > > School of Power and Energy > > Northwestern Polytechnical University > > Xi'an 710129 , China P.R. > > Tel: +86-18762660364 > > Web: https://urldefense.us/v3/__https://teacher.nwpu.edu.cn/xushenren.html__;!!G_uCfscf7eWS!YoYB4_XPhklQn6ykT0zNJh7eMh1sPOYNDSXEb4BjRYNBgoQdlRlqKreuZ2JEGhr4NAMsqsscF_EHoOIMAPn5phleh1vBxQ$ > > Email: shenren_xu at nwpu.edu.cn > > > ------------------------------ ??? ????????????? ???/?? ??/???18762660364 ?????shenren_xu at nwpu.edu.cn ?????https://urldefense.us/v3/__https://teacher.nwpu.edu.cn/xushenren.html__;!!G_uCfscf7eWS!YoYB4_XPhklQn6ykT0zNJh7eMh1sPOYNDSXEb4BjRYNBgoQdlRlqKreuZ2JEGhr4NAMsqsscF_EHoOIMAPn5phleh1vBxQ$ Shenren Xu, PhD Associate Professor School of Power and Energy Northwestern Polytechnical University Xi'an 710129 , China P.R. Tel: +86-18762660364 Web: https://urldefense.us/v3/__https://teacher.nwpu.edu.cn/xushenren.html__;!!G_uCfscf7eWS!YoYB4_XPhklQn6ykT0zNJh7eMh1sPOYNDSXEb4BjRYNBgoQdlRlqKreuZ2JEGhr4NAMsqsscF_EHoOIMAPn5phleh1vBxQ$ Email: shenren_xu at nwpu.edu.cn From bsmith at petsc.dev Mon Jun 16 19:51:38 2025 From: bsmith at petsc.dev (Barry Smith) Date: Mon, 16 Jun 2025 20:51:38 -0400 Subject: [petsc-users] High Memory Usage with PETSc BDF Solver Compared to scikits.odes CVODE In-Reply-To: References: Message-ID: <604E1BFE-1C3C-415A-BD99-1B7E5BCBDF28@petsc.dev> Can you please post the entire code so we can run it ourselves to reproduce the problem. 
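In the meantime, a cheap first check (a guess on my part, since I have not seen the code yet) is to run with

   -ts_view -snes_view -log_view

which should show what kind of Jacobian matrix the TS/SNES ends up creating and where the memory is going.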
I suspect the code may be using a dense matrix to represent the Jacobian. This will require a huge amount of memory. Sundials is likely using Jacobian-free Newton-Krylov to solve the linear system. You can do this also with PETSc, but it might not be defaulting to it. With the code I can quickly check what is happening.

   Barry

> On Jun 16, 2025, at 2:05 PM, Art wrote:
>
> Hi everyone,
>
> I'm porting a code from scikits.odes.sundials.cvode to PETSc, since it is compatible with FEniCSx and can run in parallel with MPI. First, I used PETSc with the "rk" solver, and it worked well, both serially and in parallel for a system with 14000 nodes (42000 dofs). However, when using an implicit solver like bdf, the solver takes up all the memory (16 gb), even on a small system. To do this, use this:
>
> def ifunction(self, ts, t, y, ydot, f):
>
>     y.ghostUpdate(PETSc.InsertMode.INSERT_VALUES, PETSc.ScatterMode.FORWARD)
>     y.copy(result=self.yv.x.petsc_vec)
>     self.yv.x.scatter_forward()
>
>     dydt = self.rhs(self.yv)
>     dydt.x.scatter_forward()
>
>     ydot.copy(result=f)
>     f.axpy(-1.0, dydt.x.petsc_vec)
>
>     return 0
>
> y = y0.petsc_vec.copy()
> ts.setType(ts.Type.BDF)
> ts.setIFunction(ifunction)
> ts.setTime(0.0)
> ts.setTimeStep(1e-14)
> ts.setStepLimits(1e-17, 1e-12)
> ts.setMaxTime(1.0e-12)
> ts.setExactFinalTime(PETSc.TS.ExactFinalTime.STEPOVER)
> ts.setTolerances(rtol=1e-6, atol=1e-6)
> snes = ts.getSNES()
> ksp = snes.getKSP()
> ksp.setType("gmres")
> ksp.getPC().setType("none")
> ksp.setFromOptions()
>
> For the scikits.odes.sundials.cvode library, in serial mode, I have used:
>
> solver = CVODE(rhs,
>                old_api=False,
>                linsolver='spgmr',
>                rtol=1e-6,
>                atol=1e-6,
>                max_steps=5000,
>                order=2)
>
> In this case, the solver worked perfectly and obtained similar results to the rk solver in PETSc. I suspect the issue might be related to the way the Jacobian is built in PETSc, but scikits.odes.sundials.cvode works perfectly without requiring the Jacobian. I would greatly appreciate any suggestions or examples on how to properly set up the BDF solver with PETSc.
> Thanks
> Art

From mfadams at lbl.gov  Tue Jun 17 06:05:52 2025
From: mfadams at lbl.gov (Mark Adams)
Date: Tue, 17 Jun 2025 07:05:52 -0400
Subject: [petsc-users] Questions Regarding PETSc and Solving Constrained Structural Mechanics Problems
In-Reply-To:
References: <3BF3C1E8-0CB0-42F1-A624-8FA0DC7FD4A4@buaa.edu.cn> <35A61411-85CF-4F48-9DD6-0409F0CFE598@petsc.dev> <5CFF6556-4BDE-48D9-9D3A-6D8790465358@buaa.edu.cn>
Message-ID:

And don't use -pc_gamg_parallel_coarse_grid_solver
You can use that in production but for debugging use -mg_coarse_pc_type svd
Also, use -options_left and remove anything that is not used.
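For example, a trimmed option set along the lines already suggested in this thread would be something like (untested, just to illustrate; the exact coarse-grid prefix depends on what -ksp_view reports, e.g. whether a bjacobi wrapper adds a sub_ level):

   -ksp_type cg
   -pc_type fieldsplit -pc_fieldsplit_detect_saddle_point
   -pc_fieldsplit_type schur -pc_fieldsplit_schur_fact_type full -pc_fieldsplit_schur_precondition selfp
   -fieldsplit_0_ksp_type preonly -fieldsplit_0_pc_type gamg -fieldsplit_0_mg_coarse_pc_type svd
   -fieldsplit_1_ksp_type preonly -fieldsplit_1_pc_type bjacobi -fieldsplit_1_sub_pc_type sor
   -ksp_monitor_true_residual -ksp_converged_reason -ksp_view -options_left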
(I am puzzled, I see -pc_type gamg not -pc_type fieldsplit) Mark On Mon, Jun 16, 2025 at 6:40?AM Matthew Knepley wrote: > On Sun, Jun 15, 2025 at 9:46?PM hexioafeng wrote: > >> Hello, >> >> Here are the options and outputs: >> >> options: >> >> -ksp_type cg -pc_type gamg -pc_gamg_parallel_coarse_grid_solver >> -pc_fieldsplit_detect_saddle_point -pc_fieldsplit_type schur >> -pc_fieldsplit_schur_precondition selfp >> -fieldsplit_1_mat_schur_complement_ainv_type lump >> -pc_fieldsplit_schur_fact_type full -fieldsplit_0_ksp_type preonly >> -fieldsplit_0_pc_type gamg -fieldsplit_0_mg_coarse_pc_type_type svd >> -fieldsplit_1_ksp_type preonly -fieldsplit_1_pc_type bjacobi >> -fieldsplit_1_sub_pc_type sor -ksp_view -ksp_monitor_true_residual >> -ksp_converged_reason -fieldsplit_0_mg_levels_ksp_monitor_true_residual >> -fieldsplit_0_mg_levels_ksp_converged_reason >> -fieldsplit_1_ksp_monitor_true_residual >> -fieldsplit_1_ksp_converged_reason >> > > This option was wrong: > > -fieldsplit_0_mg_coarse_pc_type_type svd > > from the output, we can see that it should have been > > -fieldsplit_0_mg_coarse_sub_pc_type_type svd > > THanks, > > Matt > > >> output: >> >> 0 KSP unpreconditioned resid norm 2.777777777778e+01 true resid norm >> 2.777777777778e+01 ||r(i)||/||b|| 1.000000000000e+00 >> Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 >> PC failed due to SUBPC_ERROR >> KSP Object: 1 MPI processes >> type: cg >> maximum iterations=200, initial guess is zero >> tolerances: relative=1e-06, absolute=1e-12, divergence=1e+30 >> left preconditioning >> using UNPRECONDITIONED norm type for convergence test >> PC Object: 1 MPI processes >> type: gamg >> type is MULTIPLICATIVE, levels=2 cycles=v >> Cycles per PCApply=1 >> Using externally compute Galerkin coarse grid matrices >> GAMG specific options >> Threshold for dropping small values in graph on each level = >> Threshold scaling factor for each level not specified = 1. >> AGG specific options >> Symmetric graph false >> Number of levels to square graph 1 >> Number smoothing steps 1 >> Complexity: grid = 1.00176 >> Coarse grid solver -- level ------------------------------- >> KSP Object: (mg_coarse_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (mg_coarse_) 1 MPI processes >> type: bjacobi >> number of blocks = 1 >> Local solver is the same for all blocks, as in the following KSP >> and PC objects on rank 0: >> KSP Object: (mg_coarse_sub_) 1 MPI processes >> type: preonly >> maximum iterations=1, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (mg_coarse_sub_) 1 MPI processes >> type: lu >> out-of-place factorization >> tolerance for zero pivot 2.22045e-14 >> using diagonal shift on blocks to prevent zero pivot [INBLOCKS] >> matrix ordering: nd >> factor fill ratio given 5., needed 1. 
>> Factored matrix follows: >> Mat Object: 1 MPI processes >> type: seqaij >> rows=7, cols=7 >> package used to perform factorization: petsc >> total: nonzeros=45, allocated nonzeros=45 >> using I-node routines: found 3 nodes, limit used is 5 >> linear system matrix = precond matrix: >> Mat Object: 1 MPI processes >> type: seqaij >> rows=7, cols=7 >> total: nonzeros=45, allocated nonzeros=45 >> total number of mallocs used during MatSetValues calls=0 >> using I-node routines: found 3 nodes, limit used is 5 >> linear system matrix = precond matrix: >> Mat Object: 1 MPI processes >> type: mpiaij >> rows=7, cols=7 >> total: nonzeros=45, allocated nonzeros=45 >> total number of mallocs used during MatSetValues calls=0 >> using nonscalable MatPtAP() implementation >> using I-node (on process 0) routines: found 3 nodes, limit used >> is 5 >> Down solver (pre-smoother) on level 1 ------------------------------- >> KSP Object: (mg_levels_1_) 1 MPI processes >> type: chebyshev >> eigenvalue estimates used: min = 0., max = 0. >> eigenvalues estimate via gmres min 0., max 0. >> eigenvalues estimated using gmres with translations [0. 0.1; 0. >> 1.1] >> KSP Object: (mg_levels_1_esteig_) 1 MPI processes >> type: gmres >> restart=30, using Classical (unmodified) Gram-Schmidt >> Orthogonalization with no iterative refinement >> happy breakdown tolerance 1e-30 >> maximum iterations=10, initial guess is zero >> tolerances: relative=1e-12, absolute=1e-50, divergence=10000. >> left preconditioning >> using PRECONDITIONED norm type for convergence test >> PC Object: (mg_levels_1_) 1 MPI processes >> type: sor >> type = local_symmetric, iterations = 1, local iterations = 1, >> omega = 1. >> linear system matrix = precond matrix: >> Mat Object: 1 MPI processes >> type: mpiaij >> rows=624, cols=624 >> total: nonzeros=25536, allocated nonzeros=25536 >> total number of mallocs used during MatSetValues calls=0 >> using I-node (on process 0) routines: found 336 nodes, >> limit used is 5 >> estimating eigenvalues using noisy right hand side >> maximum iterations=2, nonzero initial guess >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (mg_levels_1_) 1 MPI processes >> type: sor >> type = local_symmetric, iterations = 1, local iterations = 1, >> omega = 1. linear system matrix = precond matrix: >> Mat Object: 1 MPI processes >> type: mpiaij >> rows=624, cols=624 >> total: nonzeros=25536, allocated nonzeros=25536 >> total number of mallocs used during MatSetValues calls=0 >> using I-node (on process 0) routines: found 336 nodes, limit >> used is 5 Up solver (post-smoother) same as down solver (pre-smoother) >> linear system matrix = precond matrix: >> Mat Object: 1 MPI processes >> type: mpiaij >> rows=624, cols=624 >> total: nonzeros=25536, allocated nonzeros=25536 >> total number of mallocs used during MatSetValues calls=0 >> using I-node (on process 0) routines: found 336 nodes, limit used >> is 5 >> >> >> Best regards, >> >> Xiaofeng >> >> >> On Jun 14, 2025, at 07:28, Barry Smith wrote: >> >> >> Matt, >> >> Perhaps we should add options -ksp_monitor_debug and >> -snes_monitor_debug that turn on all possible monitoring for the (possibly) >> nested solvers and all of their converged reasons also? Note this is not >> completely trivial because each preconditioner will have to supply its list >> based on the current solver options for it. 
>> >> Then we won't need to constantly list a big string of problem specific >> monitor options to ask the user to use. >> >> Barry >> >> >> >> >> On Jun 13, 2025, at 9:09?AM, Matthew Knepley wrote: >> >> On Thu, Jun 12, 2025 at 10:55?PM hexioafeng >> wrote: >> >>> Dear authors, >>> >>> I tried *-pc_type game -pc_gamg_parallel_coarse_grid_solver* and *-pc_type >>> field split -pc_fieldsplit_detect_saddle_point -fieldsplit_0_ksp_type >>> pronely -fieldsplit_0_pc_type game -fieldsplit_0_mg_coarse_pc_type sad >>> -fieldsplit_1_ksp_type pronely -fieldsplit_1_pc_type Jacobi >>> _fieldsplit_1_sub_pc_type for* , both options got the >>> KSP_DIVERGE_PC_FAILED error. >>> >> >> With any question about convergence, we need to see the output of >> >> -ksp_view -ksp_monitor_true_residual -ksp_converged_reason >> -fieldsplit_0_mg_levels_ksp_monitor_true_residual >> -fieldsplit_0_mg_levels_ksp_converged_reason >> -fieldsplit_1_ksp_monitor_true_residual -fieldsplit_1_ksp_converged_reason >> >> and all the error output. >> >> Thanks, >> >> Matt >> >> >>> Thanks, >>> >>> Xiaofeng >>> >>> >>> On Jun 12, 2025, at 20:50, Mark Adams wrote: >>> >>> >>> >>> On Thu, Jun 12, 2025 at 8:44?AM Matthew Knepley >>> wrote: >>> >>>> On Thu, Jun 12, 2025 at 4:58?AM Mark Adams wrote: >>>> >>>>> Adding this to the PETSc mailing list, >>>>> >>>>> On Thu, Jun 12, 2025 at 3:43?AM hexioafeng >>>>> wrote: >>>>> >>>>>> >>>>>> Dear Professor, >>>>>> >>>>>> I hope this message finds you well. >>>>>> >>>>>> I am an employee at a CAE company and a heavy user of the PETSc >>>>>> library. I would like to thank you for your contributions to PETSc and >>>>>> express my deep appreciation for your work. >>>>>> >>>>>> Recently, I encountered some difficulties when using PETSc to solve >>>>>> structural mechanics problems with Lagrange multiplier constraints. After >>>>>> searching extensively online and reviewing several papers, I found your >>>>>> previous paper titled "*Algebraic multigrid methods for constrained >>>>>> linear systems with applications to contact problems in solid mechanics*" >>>>>> seems to be the most relevant and helpful. >>>>>> >>>>>> The stiffness matrix I'm working with, *K*, is a block saddle-point >>>>>> matrix of the form (A00 A01; A10 0), where *A00 is singular*?just as >>>>>> described in your paper, and different from many other articles . I have a >>>>>> few questions regarding your work and would greatly appreciate your >>>>>> insights: >>>>>> >>>>>> 1. Is the *AMG/KKT* method presented in your paper available in >>>>>> PETSc? I tried using *CG+GAMG* directly but received a >>>>>> *KSP_DIVERGED_PC_FAILED* error. I also attempted to use >>>>>> *CG+PCFIELDSPLIT* with the following options: >>>>>> >>>>> >>>>> No >>>>> >>>>> >>>>>> >>>>>> -pc_type fieldsplit -pc_fieldsplit_detect_saddle_point >>>>>> -pc_fieldsplit_type schur -pc_fieldsplit_schur_precondition selfp >>>>>> -pc_fieldsplit_schur_fact_type full -fieldsplit_0_ksp_type preonly >>>>>> -fieldsplit_0_pc_type gamg -fieldsplit_1_ksp_type preonly >>>>>> -fieldsplit_1_pc_type bjacobi >>>>>> >>>>>> Unfortunately, this also resulted in a *KSP_DIVERGED_PC_FAILED* error. >>>>>> Do you have any suggestions? >>>>>> >>>>>> 2. In your paper, you compare the method with *Uzawa*-type >>>>>> approaches. To my understanding, Uzawa methods typically require A00 to be >>>>>> invertible. How did you handle the singularity of A00 to construct an >>>>>> M-matrix that is invertible? >>>>>> >>>>>> >>>>> You add a regularization term like A01 * A10 (like springs). 
See the >>>>> paper or any reference to augmented lagrange or Uzawa >>>>> >>>>> >>>>> 3. Can i implement the AMG/KKT method in your paper using existing *AMG >>>>>> APIs*? Implementing a production-level AMG solver from scratch would >>>>>> be quite challenging for me, so I?m hoping to utilize existing AMG >>>>>> interfaces within PETSc or other packages. >>>>>> >>>>>> >>>>> You can do Uzawa and make the regularization matrix with matrix-matrix >>>>> products. Just use AMG for the A00 block. >>>>> >>>>> >>>>> >>>>>> 4. For saddle-point systems where A00 is singular, can you recommend >>>>>> any more robust or efficient solutions? Alternatively, are you aware of any >>>>>> open-source software packages that can handle such cases out-of-the-box? >>>>>> >>>>>> >>>>> No, and I don't think PETSc can do this out-of-the-box, but others may >>>>> be able to give you a better idea of what PETSc can do. >>>>> I think PETSc can do Uzawa or other similar algorithms but it will not >>>>> do the regularization automatically (it is a bit more complicated than just >>>>> A01 * A10) >>>>> >>>> >>>> One other trick you can use is to have >>>> >>>> -fieldsplit_0_mg_coarse_pc_type svd >>>> >>>> This will use SVD on the coarse grid of GAMG, which can handle the null >>>> space in A00 as long as the prolongation does not put it back in. I have >>>> used this for the Laplacian with Neumann conditions and for freely floating >>>> elastic problems. >>>> >>>> >>> Good point. >>> You can also use -pc_gamg_parallel_coarse_grid_solver to get GAMG to use >>> a on level iterative solver for the coarse grid. >>> >>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Thanks, >>>>> Mark >>>>> >>>>>> >>>>>> Thank you very much for taking the time to read my email. Looking >>>>>> forward to hearing from you. >>>>>> >>>>>> >>>>>> >>>>>> Sincerely, >>>>>> >>>>>> Xiaofeng He >>>>>> ----------------------------------------------------- >>>>>> >>>>>> Research Engineer >>>>>> >>>>>> Internet Based Engineering, Beijing, China >>>>>> >>>>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!fyBp94uNH39ug71rmxdPNy-fg09OxzGTv8H1_ulFiPksc9CZAblBrgYK7D2QWSJMYv-jihmxWTifwLLc1Sn2DSo$ >>>> >>>> >>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!fyBp94uNH39ug71rmxdPNy-fg09OxzGTv8H1_ulFiPksc9CZAblBrgYK7D2QWSJMYv-jihmxWTifwLLc1Sn2DSo$ >> >> >> >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!fyBp94uNH39ug71rmxdPNy-fg09OxzGTv8H1_ulFiPksc9CZAblBrgYK7D2QWSJMYv-jihmxWTifwLLc1Sn2DSo$ > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mac3bar at gmail.com Tue Jun 17 09:36:20 2025 From: mac3bar at gmail.com (Art) Date: Tue, 17 Jun 2025 10:36:20 -0400 Subject: [petsc-users] High Memory Usage with PETSc BDF Solver Compared to scikits.odes CVODE In-Reply-To: <604E1BFE-1C3C-415A-BD99-1B7E5BCBDF28@petsc.dev> References: <604E1BFE-1C3C-415A-BD99-1B7E5BCBDF28@petsc.dev> Message-ID: Hi Barry, Thank you for your assistance . Since this is an ongoing development project, I would prefer not to share the full code publicly at this stage (public petsc-users). I?d be glad to share it with you privately via email, if that's an option you?re open to. Thanks again for your support thanks Art El lun, 16 jun 2025 a las 20:51, Barry Smith () escribi?: > > Can you please post the entire code so we can run it ourselves to > reproduce the problem. > > I suspect the code maybe using a dense matrix to represent the > Jacobian. This ill require a huge amount of memory. Sundials is likely > using Jacobian free Newton Krylov to solve the linear system. You can do > this also with PETSc but it might not be defaulting to it. With the code I > can quickly check what is happening. > > Barry > > > > On Jun 16, 2025, at 2:05?PM, Art wrote: > > > > Hi everyone, > > > > I?m porting a code from scikits.odes.sundials.cvode to PETSc, since it > is compatible with FEniCSx and can run in parallel with MPI. First, I used > Petsc with the "rk" solver, and it worked well, both serially and in > parallel for a system with 14000 nodes (42000 dofs). However, when using > an implicit solver like bdf, the solver takes up all the memory (16 gb), > even on a small system. To do this, use this: > > > > > > def ifunction(self, ts, t, y, ydot, f): > > > > > y.ghostUpdate(PETSc.InsertMode.INSERT_VALUES,PETSc.ScatterMode.FORWARD) > > y.copy(result=self.yv.x.petsc_vec) > > self.yv.x.scatter_forward() > > > > dydt = self.rhs(self.yv) > > dydt.x.scatter_forward() > > > > ydot.copy(result=f) > > f.axpy(-1.0, dydt.x.petsc_vec) > > > > return 0 > > > > > > y = y0.petsc_vec.copy() > > ts.setType(ts.Type.BDF) > > ts.setIFunction(ifunction) > > ts.setTime(0.0) > > ts.setTimeStep(1e-14) > > ts.setStepLimits(1e-17,1e-12) > > ts.setMaxTime(1.0e-12) > > ts.setExactFinalTime(PETSc.TS.ExactFinalTime.STEPOVER) > > ts.setTolerances(rtol=1e-6, atol=1e-6) > > snes = ts.getSNES() > > ksp = snes.getKSP() > > ksp.setType("gmres") > > ksp.getPC().setType("none") > > ksp.setFromOptions() > > > > For the scikits.odes.sundials.cvode library, in serial mode, I have > used: > > > > solver = CVODE(rhs, > > old_api=False, > > linsolver='spgmr', > > rtol=1e-6, > > atol=1e-6, > > max_steps=5000, > > order=2) > > > > In this case, the solver worked perfectly and obtained similar results > to the rk solver in PETSC. I suspect the issue might be related to the way > the Jacobian is built in PETSC, but scikits.odes.sundials.cvode works > perfectly without requiring the Jacobian. I would greatly appreciate any > suggestions or examples on how to properly set up the BDF solver with PETSc. > > Thanks > > Art > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Tue Jun 17 09:57:07 2025 From: jed at jedbrown.org (Jed Brown) Date: Tue, 17 Jun 2025 08:57:07 -0600 Subject: [petsc-users] High Memory Usage with PETSc BDF Solver Compared to scikits.odes CVODE In-Reply-To: References: <604E1BFE-1C3C-415A-BD99-1B7E5BCBDF28@petsc.dev> Message-ID: <875xgu76ws.fsf@jedbrown.org> You can send email to petsc-maint at mcs.anl.gov, which is kept confidential. 
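(For what it is worth: if Barry's guess about an assembled finite-difference Jacobian is right, the quick thing to try is a fully matrix-free Newton-Krylov solve, which is roughly what CVODE's spgmr with no preconditioner is doing. An untested sketch, reusing the ifunction from the first message in this thread:

    from petsc4py import PETSc

    ts = PETSc.TS().create()
    ts.setType(PETSc.TS.Type.BDF)
    ts.setIFunction(ifunction)      # same IFunction as in the original code
    snes = ts.getSNES()
    snes.setUseMF(True)             # Jacobian-free J*v products, same effect as -snes_mf
    ksp = snes.getKSP()
    ksp.setType("gmres")
    ksp.getPC().setType("none")     # no preconditioner, like the CVODE spgmr setup above
    ts.setFromOptions()

No Jacobian matrix is assembled this way, so the memory footprint should stay close to that of the RK run.)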
Art writes: > Hi Barry, > > Thank you for your assistance . Since this is an ongoing development > project, I would prefer not to share the full code publicly at this stage > (public petsc-users). I?d be glad to share it with you privately via email, > if that's an option you?re open to. > > Thanks again for your support > > thanks > > Art > > El lun, 16 jun 2025 a las 20:51, Barry Smith () escribi?: > >> >> Can you please post the entire code so we can run it ourselves to >> reproduce the problem. >> >> I suspect the code maybe using a dense matrix to represent the >> Jacobian. This ill require a huge amount of memory. Sundials is likely >> using Jacobian free Newton Krylov to solve the linear system. You can do >> this also with PETSc but it might not be defaulting to it. With the code I >> can quickly check what is happening. >> >> Barry >> >> >> > On Jun 16, 2025, at 2:05?PM, Art wrote: >> > >> > Hi everyone, >> > >> > I?m porting a code from scikits.odes.sundials.cvode to PETSc, since it >> is compatible with FEniCSx and can run in parallel with MPI. First, I used >> Petsc with the "rk" solver, and it worked well, both serially and in >> parallel for a system with 14000 nodes (42000 dofs). However, when using >> an implicit solver like bdf, the solver takes up all the memory (16 gb), >> even on a small system. To do this, use this: >> > >> > >> > def ifunction(self, ts, t, y, ydot, f): >> > >> > >> y.ghostUpdate(PETSc.InsertMode.INSERT_VALUES,PETSc.ScatterMode.FORWARD) >> > y.copy(result=self.yv.x.petsc_vec) >> > self.yv.x.scatter_forward() >> > >> > dydt = self.rhs(self.yv) >> > dydt.x.scatter_forward() >> > >> > ydot.copy(result=f) >> > f.axpy(-1.0, dydt.x.petsc_vec) >> > >> > return 0 >> > >> > >> > y = y0.petsc_vec.copy() >> > ts.setType(ts.Type.BDF) >> > ts.setIFunction(ifunction) >> > ts.setTime(0.0) >> > ts.setTimeStep(1e-14) >> > ts.setStepLimits(1e-17,1e-12) >> > ts.setMaxTime(1.0e-12) >> > ts.setExactFinalTime(PETSc.TS.ExactFinalTime.STEPOVER) >> > ts.setTolerances(rtol=1e-6, atol=1e-6) >> > snes = ts.getSNES() >> > ksp = snes.getKSP() >> > ksp.setType("gmres") >> > ksp.getPC().setType("none") >> > ksp.setFromOptions() >> > >> > For the scikits.odes.sundials.cvode library, in serial mode, I have >> used: >> > >> > solver = CVODE(rhs, >> > old_api=False, >> > linsolver='spgmr', >> > rtol=1e-6, >> > atol=1e-6, >> > max_steps=5000, >> > order=2) >> > >> > In this case, the solver worked perfectly and obtained similar results >> to the rk solver in PETSC. I suspect the issue might be related to the way >> the Jacobian is built in PETSC, but scikits.odes.sundials.cvode works >> perfectly without requiring the Jacobian. I would greatly appreciate any >> suggestions or examples on how to properly set up the BDF solver with PETSc. >> > Thanks >> > Art >> >> From dontbugthedevs at proton.me Tue Jun 17 11:43:24 2025 From: dontbugthedevs at proton.me (Noam T.) Date: Tue, 17 Jun 2025 16:43:24 +0000 Subject: [petsc-users] Element connectivity of a DMPlex In-Reply-To: References: Message-ID: Thank you. For now, I am dealing with vertices only. Perhaps I did not explain myself properly, or I misunderstood your response. What I meant to say is, given an element of order higher than one, the connectivity matrix I obtain this way only contains as many entries as the first order element: 3 for a triangle, 4 for a tetrahedron, etc. Looking at the closure of any cell in the mesh, this is also the case.However, the nodes are definitely present; e.g. 
from DMPlexGetCellCoordinates(dm, cell, NULL, nc, NULL, NULL) nc returns the expected value (12 for a 2nd order 6-node planar triangle, 30 for a 2nd order 10-node tetrahedron, etc). The question is, are the indices of these extra nodes obtainable in a similar way as with the code shared before? So that one can have e.g. [0, 1, 2, 3, 4, 5] for a second order triangle, not just [0, 1, 2]. Thank you. Noam On Friday, June 13th, 2025 at 3:05 PM, Matthew Knepley wrote: > On Thu, Jun 12, 2025 at 4:26?PM Noam T. wrote: > >> Thank you for the code; it provides exactly what I was looking for. >> >> Following up on this matter, does this method not work for higher order elements? For example, using an 8-node quadrilateral, exporting to a PETSC_VIEWER_HDF5_VIZ viewer provides the correct matrix of node coordinates in geometry/vertices > > If you wanted to include edges/faces, you could do it. First, you would need to decide how you would number things For example, would you number all points contiguously, or separately number cells, vertices, faces and edges. Second, you would check for faces/edges in the closure loop. Right now, we only check for vertices. > > I would say that this is what convinced me not to do FEM this way. > > Thanks, > > Matt > >> (here a quadrilateral in [0, 10]) >> 5.0, 5.0 >> 0.0, 0.0 >> 10.0, 0.0 >> 10.0, 10.0 >> 0.0, 10.0 >> 5.0, 0.0 >> 10.0, 5.0 >> 5.0, 10.0 >> 0.0, 5.0 >> >> but the connectivity in viz/topology is >> >> 0 1 2 3 >> >> which are likely the corner nodes of the initial, first-order element, before adding extra nodes for the higher degree element. >> >> This connectivity values [0, 1, 2, 3, ...] are always the same, including for other elements, whereas the coordinates are correct >> >> E.g. for 3rd order triangle in [0, 1], coordinates are given left to right, bottom to top >> 0, 0 >> 1/3, 0, >> 2/3, 0, >> 1, 0 >> 0, 1/3 >> 1/3, 1/3 >> 2/3, 1/3 >> 0, 2/3, >> 1/3, 2/3 >> 0, 1 >> >> but the connectivity (viz/topology/cells) is [0, 1, 2]. >> >> Test meshes were created with gmsh from the python API, using >> gmsh.option.setNumber("Mesh.ElementOrder", n), for n = 1, 2, 3, ... >> >> Thank you. >> Noam >> On Friday, May 23rd, 2025 at 12:56 AM, Matthew Knepley wrote: >> >>> On Thu, May 22, 2025 at 12:25?PM Noam T. wrote: >>> >>>> Hello, >>>> >>>> Thank you the various options. >>>> >>>> Use case here would be obtaining the exact output generated by option 1), DMView() with PETSC_VIEWER_HDF5_VIZ; in particular, the matrix generated under /viz/topology/cells. >>>> >>>>> There are several ways you might do this. It helps to know what you are aiming for. >>>>> >>>>> 1) If you just want this output, it might be easier to just DMView() with the PETSC_VIEWER_HDF5_VIZ format, since that just outputs the cell-vertex topology and coordinates >>>> >>>> Is it possible to get this information in memory, onto a Mat, Vec or some other Int array object directly? it would be handy to have it in order to manipulate it and/or save it to a different format/file. Saving to an HDF5 and loading it again seems redundant. >>>> >>>>> 2) You can call DMPlexUninterpolate() to produce a mesh with just cells and vertices, and output it in any format. >>>>> >>>>> 3) If you want it in memory, but still with global indices (I don't understand this use case), then you can use DMPlexCreatePointNumbering() for an overall global numbering, or DMPlexCreateCellNumbering() and DMPlexCreateVertexNumbering() for separate global numberings. 
>>>> >>>> Perhaps I missed it, but getting the connectivity matrix in /viz/topology/cells/ did not seem directly trivial to me from the list of global indices returned by DMPlexGetCell/Point/VertexNumbering() (i.e. I assume all the operations done when calling DMView()). >>> >>> Something like >>> >>> DMPlexGetHeightStratum(dm, 0, &cStart, &cEnd); >>> DMPlexGetDepthStratum(dm, 0, &vStart, &vEnd); >>> DMPlexGetVertexNumbering(dm, &globalVertexNumbers); >>> ISGetIndices(globalVertexNumbers, &gv); >>> for (PetscInt c = cStart; c < cEnd; ++c) { >>> PetscInt *closure = NULL; >>> >>> DMPlexGetTransitiveClosure(dm, c, PETSC_TRUE, &Ncl, &closure); >>> for (PetscInt cl = 0; c < Ncl * 2; cl += 2) { >>> if (closure[cl] < vStart || closure[cl] >= vEnd) continue; >>> const PetscInt v = gv[closure[cl]] < 0 ? -(gv[closure[cl]] + 1) : gv[closure[cl]]; >>> >>> // Do something with v >>> } >>> DMPlexRestoreTransitiveClosure(dm, c, PETSC_TRUE, &Ncl, &closure); >>> } >>> ISRestoreIndices(globalVertexNumbers, &gv); >>> ISDestroy(&globalVertexNumbers); >>> >>> Thanks, >>> >>> Matt >>> >>>> Thanks, >>>> Noam. >>> >>> -- >>> >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >>> >>> [https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/*(http:/*www.cse.buffalo.edu/*knepley/)__;fl0vfg!!G_uCfscf7eWS!auwV8wTSwLcsJ8sDHyW_i-VTpb-2nu5SMxYOBxNVljOhAIQEeIy8yRyp5hfOho00flBooHSX92kCWgO6wrTmYwQKaJd1QT1x$ > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > [https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/*(http:/*www.cse.buffalo.edu/*knepley/)__;fl0vfg!!G_uCfscf7eWS!auwV8wTSwLcsJ8sDHyW_i-VTpb-2nu5SMxYOBxNVljOhAIQEeIy8yRyp5hfOho00flBooHSX92kCWgO6wrTmYwQKaJd1QT1x$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Jun 17 12:42:15 2025 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 17 Jun 2025 13:42:15 -0400 Subject: [petsc-users] Element connectivity of a DMPlex In-Reply-To: References: Message-ID: On Tue, Jun 17, 2025 at 12:43?PM Noam T. wrote: > Thank you. For now, I am dealing with vertices only. > > Perhaps I did not explain myself properly, or I misunderstood your > response. > What I meant to say is, given an element of order higher than one, the > connectivity matrix I obtain this way only contains as many entries as the > first order element: 3 for a triangle, 4 for a tetrahedron, etc. > > Looking at the closure of any cell in the mesh, this is also the > case.However, the nodes are definitely present; e.g. from > > DMPlexGetCellCoordinates(dm, cell, NULL, nc, NULL, NULL) > > nc returns the expected value (12 for a 2nd order 6-node planar triangle, > 30 for a 2nd order 10-node tetrahedron, etc). > > The question is, are the indices of these extra nodes obtainable in a > similar way as with the code shared before? So that one can have e.g. [0, > 1, 2, 3, 4, 5] for a second order triangle, not just [0, 1, 2]. > I am having a hard time understanding what you are after. I think this is because many FEM approaches confuse topology with analysis. The Plex stores topology, and you can retrieve adjacencies between any two mesh points. The PetscSection maps mesh points (cells, faces, edges , vertices) to sets of dofs. 
This is how higher order elements are implemented. Thus, we do not have to change topology to get different function spaces. The intended interface is for you to call DMPlexVecGetClosure() to get the closure of a cell (or face, or edge). You can also call DMPlexGetClosureIndices(), but index wrangling is what I intended to eliminate. What exactly are you looking for here? Thanks, Matt > Thank you. > Noam > On Friday, June 13th, 2025 at 3:05 PM, Matthew Knepley > wrote: > > On Thu, Jun 12, 2025 at 4:26?PM Noam T. wrote: > >> >> Thank you for the code; it provides exactly what I was looking for. >> >> Following up on this matter, does this method not work for higher order >> elements? For example, using an 8-node quadrilateral, exporting to a >> PETSC_VIEWER_HDF5_VIZ viewer provides the correct matrix of node >> coordinates in geometry/vertices >> > > If you wanted to include edges/faces, you could do it. First, you would > need to decide how you would number things For example, would you number > all points contiguously, or separately number cells, vertices, faces and > edges. Second, you would check for faces/edges in the closure loop. Right > now, we only check for vertices. > > I would say that this is what convinced me not to do FEM this way. > > Thanks, > > Matt > >> (here a quadrilateral in [0, 10]) >> 5.0, 5.0 >> 0.0, 0.0 >> 10.0, 0.0 >> 10.0, 10.0 >> 0.0, 10.0 >> 5.0, 0.0 >> 10.0, 5.0 >> 5.0, 10.0 >> 0.0, 5.0 >> >> but the connectivity in viz/topology is >> >> 0 1 2 3 >> >> which are likely the corner nodes of the initial, first-order element, >> before adding extra nodes for the higher degree element. >> >> This connectivity values [0, 1, 2, 3, ...] are always the same, including >> for other elements, whereas the coordinates are correct >> >> E.g. for 3rd order triangle in [0, 1], coordinates are given left to >> right, bottom to top >> 0, 0 >> 1/3, 0, >> 2/3, 0, >> 1, 0 >> 0, 1/3 >> 1/3, 1/3 >> 2/3, 1/3 >> 0, 2/3, >> 1/3, 2/3 >> 0, 1 >> >> but the connectivity (viz/topology/cells) is [0, 1, 2]. >> >> Test meshes were created with gmsh from the python API, using >> gmsh.option.setNumber("Mesh.ElementOrder", n), for n = 1, 2, 3, ... >> >> Thank you. >> Noam >> On Friday, May 23rd, 2025 at 12:56 AM, Matthew Knepley >> wrote: >> >> On Thu, May 22, 2025 at 12:25?PM Noam T. >> wrote: >> >>> Hello, >>> >>> Thank you the various options. >>> >>> Use case here would be obtaining the exact output generated by option >>> 1), DMView() with PETSC_VIEWER_HDF5_VIZ; in particular, the matrix >>> generated under /viz/topology/cells. >>> >>> There are several ways you might do this. It helps to know what you are >>> aiming for. >>> >>> 1) If you just want this output, it might be easier to just DMView() >>> with the PETSC_VIEWER_HDF5_VIZ format, since that just outputs the >>> cell-vertex topology and coordinates >>> >>> >>> Is it possible to get this information in memory, onto a Mat, Vec or >>> some other Int array object directly? it would be handy to have it in order >>> to manipulate it and/or save it to a different format/file. Saving to an >>> HDF5 and loading it again seems redundant. >>> >>> >>> 2) You can call DMPlexUninterpolate() to produce a mesh with just cells >>> and vertices, and output it in any format. 
>>> >>> 3) If you want it in memory, but still with global indices (I don't >>> understand this use case), then you can use DMPlexCreatePointNumbering() >>> for an overall global numbering, or DMPlexCreateCellNumbering() and >>> DMPlexCreateVertexNumbering() for separate global numberings. >>> >>> >>> Perhaps I missed it, but getting the connectivity matrix in >>> /viz/topology/cells/ did not seem directly trivial to me from the list of >>> global indices returned by DMPlexGetCell/Point/VertexNumbering() (i.e. I >>> assume all the operations done when calling DMView()). >>> >> >> Something like >> >> DMPlexGetHeightStratum(dm, 0, &cStart, &cEnd); >> DMPlexGetDepthStratum(dm, 0, &vStart, &vEnd); >> DMPlexGetVertexNumbering(dm, &globalVertexNumbers); >> ISGetIndices(globalVertexNumbers, &gv); >> for (PetscInt c = cStart; c < cEnd; ++c) { >> PetscInt *closure = NULL; >> >> DMPlexGetTransitiveClosure(dm, c, PETSC_TRUE, &Ncl, &closure); >> for (PetscInt cl = 0; c < Ncl * 2; cl += 2) { >> if (closure[cl] < vStart || closure[cl] >= vEnd) continue; >> const PetscInt v = gv[closure[cl]] < 0 ? -(gv[closure[cl]] + 1) : >> gv[closure[cl]]; >> >> // Do something with v >> } >> DMPlexRestoreTransitiveClosure(dm, c, PETSC_TRUE, &Ncl, &closure); >> } >> ISRestoreIndices(globalVertexNumbers, &gv); >> ISDestroy(&globalVertexNumbers); >> >> Thanks, >> >> Matt >> >> Thanks, >>> Noam. >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!buGhhRYyQDY3kig5GA6tIzeIOCoCAtlZzIz_UTAH1bZ-05GXENI3VZos9r4s6fXhktg7wUl--lMKoG-FNuUV$ >> >> >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!buGhhRYyQDY3kig5GA6tIzeIOCoCAtlZzIz_UTAH1bZ-05GXENI3VZos9r4s6fXhktg7wUl--lMKoG-FNuUV$ > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!buGhhRYyQDY3kig5GA6tIzeIOCoCAtlZzIz_UTAH1bZ-05GXENI3VZos9r4s6fXhktg7wUl--lMKoG-FNuUV$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From rlmackie862 at gmail.com Tue Jun 17 17:39:32 2025 From: rlmackie862 at gmail.com (Randall Mackie) Date: Tue, 17 Jun 2025 15:39:32 -0700 Subject: [petsc-users] Problem with composite DM index sets Message-ID: <45D7BA59-F137-4033-88C1-F1707FB4CA95@gmail.com> Dear Petsc users - I am trying to upgrade my code to petsc-3.23 (from 3.19), and I seem to have run into a problem with DMCompositeGetGlobalISs. The example program listed on the man page for DMCompositeGetGlobalISs, https://urldefense.us/v3/__https://petsc.org/release/src/snes/tutorials/ex73f90t.F90.html__;!!G_uCfscf7eWS!eBXtwfVGvvppsy1_lM2f-Z61YMsb439eVY8V9SYWa0x6VjvTiJ-rXAnhEbi08ogvqtF3s2AZJt4pk9gNeM9yAjmHvA$ , seems to indicate that a call to DMCompositeGetGlobalISs does not need to allocate the IS pointer and you just pass it directly to DMCompositeGetGlobalISs. 
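For comparison, my understanding of the C calling sequence (a sketch based on the manual page, where pack is a placeholder name for the DMComposite; the array of index sets is allocated by the call and destroyed/freed by the caller) is:

  IS      *is;
  PetscInt nDM;

  PetscCall(DMCompositeGetNumberDM(pack, &nDM));
  PetscCall(DMCompositeGetGlobalISs(pack, &is)); /* is[] is allocated inside the call */
  /* ... use is[0..nDM-1] ... */
  for (PetscInt i = 0; i < nDM; ++i) PetscCall(ISDestroy(&is[i]));
  PetscCall(PetscFree(is));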
If I compile and run the simple attached test program (say on 2 processes), I get the following error: [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see https://urldefense.us/v3/__https://petsc.org/release/faq/*valgrind__;Iw!!G_uCfscf7eWS!eBXtwfVGvvppsy1_lM2f-Z61YMsb439eVY8V9SYWa0x6VjvTiJ-rXAnhEbi08ogvqtF3s2AZJt4pk9gNeM-7MIW3KA$ and https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!eBXtwfVGvvppsy1_lM2f-Z61YMsb439eVY8V9SYWa0x6VjvTiJ-rXAnhEbi08ogvqtF3s2AZJt4pk9gNeM_a7QtraQ$ [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [0]PETSC ERROR: The line numbers in the error traceback may not be exact. [0]PETSC ERROR: #1 F90Array1dCreate() at /home/rmackie/PETSc/petsc-3.23.3/src/sys/ftn-custom/f90_cwrap.c:123 application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 If I uncomment the line to allocate the pointer, I get a very long traceback with lots of error messages. What is the correct way to use DMCompositeGetGlobalISs in Fortran? With or without the pointer allocation, and what is the right way to do this without the errors it seems to generate? Thanks, Randy Mackie ? -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: test.F90 Type: application/octet-stream Size: 1201 bytes Desc: not available URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: From rlmackie862 at gmail.com Wed Jun 18 16:44:02 2025 From: rlmackie862 at gmail.com (Randall Mackie) Date: Wed, 18 Jun 2025 14:44:02 -0700 Subject: [petsc-users] Problem with composite DM index sets In-Reply-To: <45D7BA59-F137-4033-88C1-F1707FB4CA95@gmail.com> References: <45D7BA59-F137-4033-88C1-F1707FB4CA95@gmail.com> Message-ID: Follow up: running with valgrind shows the following issues?.is this a bug in PETSc? ==5216== Use of uninitialised value of size 8 ==5216== at 0x49ED69C: f90array1dcreatefortranaddr_ (f90_fwrap.F90:52) ==5216== by 0x4D7EA94: F90Array1dCreate (f90_cwrap.c:140) ==5216== by 0x6B7295A: dmcompositegetglobaliss_ (zfddaf.c:70) ==5216== by 0x1095C8: MAIN__ (test.F90:35) ==5216== by 0x1096AC: main (test.F90:4) ==5216== Uninitialised value was created by a stack allocation ==5216== at 0x109219: MAIN__ (test.F90:1) ==5216== ==5216== Invalid write of size 8 ==5216== at 0x49ED69C: f90array1dcreatefortranaddr_ (f90_fwrap.F90:52) ==5216== by 0x4D7EA94: F90Array1dCreate (f90_cwrap.c:140) ==5216== by 0x6B7295A: dmcompositegetglobaliss_ (zfddaf.c:70) ==5216== by 0x1095C8: MAIN__ (test.F90:35) ==5216== by 0x1096AC: main (test.F90:4) ==5216== Address 0x20 is not stack'd, malloc'd or (recently) free'd ==5216== Thanks, Randyt > On Jun 17, 2025, at 3:39?PM, Randall Mackie wrote: > > Dear Petsc users - > > I am trying to upgrade my code to petsc-3.23 (from 3.19), and I seem to have run into a problem with DMCompositeGetGlobalISs. 
> > The example program listed on the man page for DMCompositeGetGlobalISs, https://urldefense.us/v3/__https://petsc.org/release/src/snes/tutorials/ex73f90t.F90.html__;!!G_uCfscf7eWS!bm0O_rjMssXigJJ70W-MVHCpKJyZbomZ-493v0TAJHAGdJgTS8zhtJTfPhN84ty778IefJ_7k2D6EqiHLBIIsoR5QA$ , seems to indicate that a call to DMCompositeGetGlobalISs does not need to allocate the IS pointer and you just pass it directly to DMCompositeGetGlobalISs. > > If I compile and run the simple attached test program (say on 2 processes), I get the following error: > > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see https://urldefense.us/v3/__https://petsc.org/release/faq/*valgrind__;Iw!!G_uCfscf7eWS!bm0O_rjMssXigJJ70W-MVHCpKJyZbomZ-493v0TAJHAGdJgTS8zhtJTfPhN84ty778IefJ_7k2D6EqiHLBJU3XEgSQ$ and https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!bm0O_rjMssXigJJ70W-MVHCpKJyZbomZ-493v0TAJHAGdJgTS8zhtJTfPhN84ty778IefJ_7k2D6EqiHLBJR1nT-VA$ > [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > [0]PETSC ERROR: The line numbers in the error traceback may not be exact. > [0]PETSC ERROR: #1 F90Array1dCreate() at /home/rmackie/PETSc/petsc-3.23.3/src/sys/ftn-custom/f90_cwrap.c:123 > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > > If I uncomment the line to allocate the pointer, I get a very long traceback with lots of error messages. > > What is the correct way to use DMCompositeGetGlobalISs in Fortran? With or without the pointer allocation, and what is the right way to do this without the errors it seems to generate? > > Thanks, > > Randy Mackie > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dontbugthedevs at proton.me Wed Jun 18 17:49:40 2025 From: dontbugthedevs at proton.me (Noam T.) Date: Wed, 18 Jun 2025 22:49:40 +0000 Subject: [petsc-users] Element connectivity of a DMPlex In-Reply-To: References: Message-ID: See image attached. Connectivity of the top mesh (first order triangle), can be obtained with the code shared before. Connectivity of the bottom mesh (second order triangle) is what I would be interested in obtaining. However, given your clarification on what the Plex and the PetscSection handle, it might not work; I am trying to get form the Plex what's only available from the PetscSection. The purpose of this extended connectivity is plotting; in particular, using VTU files, where the "connectivity" of cells is required, and the extra nodes would be needed when using higher-order elements (e.g. VTK_QUADRATIC_TRIANGLE, VTK_QUADRATIC_QUAD, etc). Perhaps I am over complicating things, and all this information can be obtained in a different, simpler way. Thanks. Noam On Tuesday, June 17th, 2025 at 5:42 PM, Matthew Knepley wrote: > On Tue, Jun 17, 2025 at 12:43?PM Noam T. wrote: > >> Thank you. For now, I am dealing with vertices only. >> >> Perhaps I did not explain myself properly, or I misunderstood your response. >> What I meant to say is, given an element of order higher than one, the connectivity matrix I obtain this way only contains as many entries as the first order element: 3 for a triangle, 4 for a tetrahedron, etc. >> >> Looking at the closure of any cell in the mesh, this is also the case.However, the nodes are definitely present; e.g. 
from >> >> DMPlexGetCellCoordinates(dm, cell, NULL, nc, NULL, NULL) >> >> nc returns the expected value (12 for a 2nd order 6-node planar triangle, 30 for a 2nd order 10-node tetrahedron, etc). >> >> The question is, are the indices of these extra nodes obtainable in a similar way as with the code shared before? So that one can have e.g. [0, 1, 2, 3, 4, 5] for a second order triangle, not just [0, 1, 2]. > > I am having a hard time understanding what you are after. I think this is because many FEM approaches confuse topology with analysis. > > The Plex stores topology, and you can retrieve adjacencies between any two mesh points. > > The PetscSection maps mesh points (cells, faces, edges , vertices) to sets of dofs. This is how higher order elements are implemented. Thus, we do not have to change topology to get different function spaces. > > The intended interface is for you to call DMPlexVecGetClosure() to get the closure of a cell (or face, or edge). You can also call DMPlexGetClosureIndices(), but index wrangling is what I intended to eliminate. > > What exactly are you looking for here? > > Thanks, > > Matt > >> Thank you. >> Noam >> On Friday, June 13th, 2025 at 3:05 PM, Matthew Knepley wrote: >> >>> On Thu, Jun 12, 2025 at 4:26?PM Noam T. wrote: >>> >>>> Thank you for the code; it provides exactly what I was looking for. >>>> >>>> Following up on this matter, does this method not work for higher order elements? For example, using an 8-node quadrilateral, exporting to a PETSC_VIEWER_HDF5_VIZ viewer provides the correct matrix of node coordinates in geometry/vertices >>> >>> If you wanted to include edges/faces, you could do it. First, you would need to decide how you would number things For example, would you number all points contiguously, or separately number cells, vertices, faces and edges. Second, you would check for faces/edges in the closure loop. Right now, we only check for vertices. >>> >>> I would say that this is what convinced me not to do FEM this way. >>> >>> Thanks, >>> >>> Matt >>> >>>> (here a quadrilateral in [0, 10]) >>>> 5.0, 5.0 >>>> 0.0, 0.0 >>>> 10.0, 0.0 >>>> 10.0, 10.0 >>>> 0.0, 10.0 >>>> 5.0, 0.0 >>>> 10.0, 5.0 >>>> 5.0, 10.0 >>>> 0.0, 5.0 >>>> >>>> but the connectivity in viz/topology is >>>> >>>> 0 1 2 3 >>>> >>>> which are likely the corner nodes of the initial, first-order element, before adding extra nodes for the higher degree element. >>>> >>>> This connectivity values [0, 1, 2, 3, ...] are always the same, including for other elements, whereas the coordinates are correct >>>> >>>> E.g. for 3rd order triangle in [0, 1], coordinates are given left to right, bottom to top >>>> 0, 0 >>>> 1/3, 0, >>>> 2/3, 0, >>>> 1, 0 >>>> 0, 1/3 >>>> 1/3, 1/3 >>>> 2/3, 1/3 >>>> 0, 2/3, >>>> 1/3, 2/3 >>>> 0, 1 >>>> >>>> but the connectivity (viz/topology/cells) is [0, 1, 2]. >>>> >>>> Test meshes were created with gmsh from the python API, using >>>> gmsh.option.setNumber("Mesh.ElementOrder", n), for n = 1, 2, 3, ... >>>> >>>> Thank you. >>>> Noam >>>> On Friday, May 23rd, 2025 at 12:56 AM, Matthew Knepley wrote: >>>> >>>>> On Thu, May 22, 2025 at 12:25?PM Noam T. wrote: >>>>> >>>>>> Hello, >>>>>> >>>>>> Thank you the various options. >>>>>> >>>>>> Use case here would be obtaining the exact output generated by option 1), DMView() with PETSC_VIEWER_HDF5_VIZ; in particular, the matrix generated under /viz/topology/cells. >>>>>> >>>>>>> There are several ways you might do this. It helps to know what you are aiming for. 
>>>>>>> >>>>>>> 1) If you just want this output, it might be easier to just DMView() with the PETSC_VIEWER_HDF5_VIZ format, since that just outputs the cell-vertex topology and coordinates >>>>>> >>>>>> Is it possible to get this information in memory, onto a Mat, Vec or some other Int array object directly? it would be handy to have it in order to manipulate it and/or save it to a different format/file. Saving to an HDF5 and loading it again seems redundant. >>>>>> >>>>>>> 2) You can call DMPlexUninterpolate() to produce a mesh with just cells and vertices, and output it in any format. >>>>>>> >>>>>>> 3) If you want it in memory, but still with global indices (I don't understand this use case), then you can use DMPlexCreatePointNumbering() for an overall global numbering, or DMPlexCreateCellNumbering() and DMPlexCreateVertexNumbering() for separate global numberings. >>>>>> >>>>>> Perhaps I missed it, but getting the connectivity matrix in /viz/topology/cells/ did not seem directly trivial to me from the list of global indices returned by DMPlexGetCell/Point/VertexNumbering() (i.e. I assume all the operations done when calling DMView()). >>>>> >>>>> Something like >>>>> >>>>> DMPlexGetHeightStratum(dm, 0, &cStart, &cEnd); >>>>> DMPlexGetDepthStratum(dm, 0, &vStart, &vEnd); >>>>> DMPlexGetVertexNumbering(dm, &globalVertexNumbers); >>>>> ISGetIndices(globalVertexNumbers, &gv); >>>>> for (PetscInt c = cStart; c < cEnd; ++c) { >>>>> PetscInt *closure = NULL; >>>>> >>>>> DMPlexGetTransitiveClosure(dm, c, PETSC_TRUE, &Ncl, &closure); >>>>> for (PetscInt cl = 0; c < Ncl * 2; cl += 2) { >>>>> if (closure[cl] < vStart || closure[cl] >= vEnd) continue; >>>>> const PetscInt v = gv[closure[cl]] < 0 ? -(gv[closure[cl]] + 1) : gv[closure[cl]]; >>>>> >>>>> // Do something with v >>>>> } >>>>> DMPlexRestoreTransitiveClosure(dm, c, PETSC_TRUE, &Ncl, &closure); >>>>> } >>>>> ISRestoreIndices(globalVertexNumbers, &gv); >>>>> ISDestroy(&globalVertexNumbers); >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>>> Thanks, >>>>>> Noam. >>>>> >>>>> -- >>>>> >>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> [https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/*(http:/*www.cse.buffalo.edu/*knepley/)__;fl0vfg!!G_uCfscf7eWS!fRqwbCoztVyyzAJCKwByl_ypV2r7m5XCOxm8fu0m5Tu0Fp8Ghj8I_m2t3XrOU9m7STyykqh6j29GaFAsb7TDQa2L3jVTMgl9$ >>> >>> -- >>> >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >>> >>> [https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/*(http:/*www.cse.buffalo.edu/*knepley/)__;fl0vfg!!G_uCfscf7eWS!fRqwbCoztVyyzAJCKwByl_ypV2r7m5XCOxm8fu0m5Tu0Fp8Ghj8I_m2t3XrOU9m7STyykqh6j29GaFAsb7TDQa2L3jVTMgl9$ > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > [https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/*(http:/*www.cse.buffalo.edu/*knepley/)__;fl0vfg!!G_uCfscf7eWS!fRqwbCoztVyyzAJCKwByl_ypV2r7m5XCOxm8fu0m5Tu0Fp8Ghj8I_m2t3XrOU9m7STyykqh6j29GaFAsb7TDQa2L3jVTMgl9$ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: connectivity.pdf Type: application/pdf Size: 12595 bytes Desc: not available URL: From knepley at gmail.com Wed Jun 18 19:43:14 2025 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 18 Jun 2025 20:43:14 -0400 Subject: [petsc-users] Element connectivity of a DMPlex In-Reply-To: References: Message-ID: On Wed, Jun 18, 2025 at 6:49?PM Noam T. wrote: > See image attached. > Connectivity of the top mesh (first order triangle), can be obtained with > the code shared before. > Connectivity of the bottom mesh (second order triangle) is what I would > be interested in obtaining. > > However, given your clarification on what the Plex and the PetscSection > handle, it might not work; I am trying to get form the Plex what's only > available from the PetscSection. > > The purpose of this extended connectivity is plotting; in particular, > using VTU files, where the "connectivity" of cells is required, and the > extra nodes would be needed when using higher-order elements (e.g. > VTK_QUADRATIC_TRIANGLE, VTK_QUADRATIC_QUAD, etc). > Oh yes. VTK does this in a particularly ugly and backward way. Sigh. There is nothing we can do about this now, but someone should replace VTK with a proper interface at some point. So I understand why you want it and it is a defensible case, so here is how you get that (with some explanation). Those locations, I think, should not be understood as topological things, but rather as the locations of point evaluation functionals constituting a basis for the dual space (to your approximation space). I would call DMPlexGetClosureIndices() ( https://urldefense.us/v3/__https://petsc.org/main/manualpages/DMPlex/DMPlexGetClosureIndices/__;!!G_uCfscf7eWS!YK9T8_rEaNbJIndfx3tWHNlZlZ5GTj_KRDshR-nc5atZltZCcw73PtSNcrtvjWXZY73l5jQsAUx6qbM0Fge5$ ) with a Section having the layout of P2 or Q2. This is the easy way to make that PetscSection gs; PetscFE fe; DMPolytopeType ct; PetscInt dim, cStart; PetscCall(DMGetDimension(dm, &dim)); PetscCall(DMPlexGetHeightStratum(dm, 0, &cStart, NULL)); PetscCall(DMPlexGetCellType(dm, cStart, &ct)); PetscCall(PetscFECreateLagrangeByCell(PETSC_COMM_SELF, dim, 1, ct, 2, PETSC_DETERMINE, &fe)); PetscCall(DMSetField(dm, 0, NULL, (PetscObject)fe)); PetscCall(PetscFEDestroy(&fe)); PetscCall(DMCreateDS(dm)); PetscCall(DMGetGlobalSection(dm, &gs)); PetscInt *indices = NULL; PetscInt Nidx; PetscCall(DMPlexGetClosureIndices(dm, gs, gs, cell, PETSC_TRUE, &Nidx, &indices, NULL, NULL)); Thanks, MAtt > Perhaps I am over complicating things, and all this information can be > obtained in a different, simpler way. > > Thanks. > Noam > On Tuesday, June 17th, 2025 at 5:42 PM, Matthew Knepley > wrote: > > On Tue, Jun 17, 2025 at 12:43?PM Noam T. wrote: > >> Thank you. For now, I am dealing with vertices only. >> >> Perhaps I did not explain myself properly, or I misunderstood your >> response. >> What I meant to say is, given an element of order higher than one, the >> connectivity matrix I obtain this way only contains as many entries as the >> first order element: 3 for a triangle, 4 for a tetrahedron, etc. >> >> Looking at the closure of any cell in the mesh, this is also the >> case.However, the nodes are definitely present; e.g. from >> >> DMPlexGetCellCoordinates(dm, cell, NULL, nc, NULL, NULL) >> >> nc returns the expected value (12 for a 2nd order 6-node planar triangle, >> 30 for a 2nd order 10-node tetrahedron, etc). >> >> The question is, are the indices of these extra nodes obtainable in a >> similar way as with the code shared before? 
So that one can have e.g. [0, >> 1, 2, 3, 4, 5] for a second order triangle, not just [0, 1, 2]. >> > > I am having a hard time understanding what you are after. I think this is > because many FEM approaches confuse topology with analysis. > > The Plex stores topology, and you can retrieve adjacencies between any two > mesh points. > > The PetscSection maps mesh points (cells, faces, edges , vertices) to sets > of dofs. This is how higher order elements are implemented. Thus, we do not > have to change topology to get different function spaces. > > The intended interface is for you to call DMPlexVecGetClosure() to get the > closure of a cell (or face, or edge). You can also call > DMPlexGetClosureIndices(), but index wrangling is what I intended to > eliminate. > > What exactly are you looking for here? > > Thanks, > > Matt > >> Thank you. >> Noam >> On Friday, June 13th, 2025 at 3:05 PM, Matthew Knepley >> wrote: >> >> On Thu, Jun 12, 2025 at 4:26?PM Noam T. wrote: >> >>> >>> Thank you for the code; it provides exactly what I was looking for. >>> >>> Following up on this matter, does this method not work for higher order >>> elements? For example, using an 8-node quadrilateral, exporting to a >>> PETSC_VIEWER_HDF5_VIZ viewer provides the correct matrix of node >>> coordinates in geometry/vertices >>> >> >> If you wanted to include edges/faces, you could do it. First, you would >> need to decide how you would number things For example, would you number >> all points contiguously, or separately number cells, vertices, faces and >> edges. Second, you would check for faces/edges in the closure loop. Right >> now, we only check for vertices. >> >> I would say that this is what convinced me not to do FEM this way. >> >> Thanks, >> >> Matt >> >>> (here a quadrilateral in [0, 10]) >>> 5.0, 5.0 >>> 0.0, 0.0 >>> 10.0, 0.0 >>> 10.0, 10.0 >>> 0.0, 10.0 >>> 5.0, 0.0 >>> 10.0, 5.0 >>> 5.0, 10.0 >>> 0.0, 5.0 >>> >>> but the connectivity in viz/topology is >>> >>> 0 1 2 3 >>> >>> which are likely the corner nodes of the initial, first-order element, >>> before adding extra nodes for the higher degree element. >>> >>> This connectivity values [0, 1, 2, 3, ...] are always the same, >>> including for other elements, whereas the coordinates are correct >>> >>> E.g. for 3rd order triangle in [0, 1], coordinates are given left to >>> right, bottom to top >>> 0, 0 >>> 1/3, 0, >>> 2/3, 0, >>> 1, 0 >>> 0, 1/3 >>> 1/3, 1/3 >>> 2/3, 1/3 >>> 0, 2/3, >>> 1/3, 2/3 >>> 0, 1 >>> >>> but the connectivity (viz/topology/cells) is [0, 1, 2]. >>> >>> Test meshes were created with gmsh from the python API, using >>> gmsh.option.setNumber("Mesh.ElementOrder", n), for n = 1, 2, 3, ... >>> >>> Thank you. >>> Noam >>> On Friday, May 23rd, 2025 at 12:56 AM, Matthew Knepley < >>> knepley at gmail.com> wrote: >>> >>> On Thu, May 22, 2025 at 12:25?PM Noam T. >>> wrote: >>> >>>> Hello, >>>> >>>> Thank you the various options. >>>> >>>> Use case here would be obtaining the exact output generated by option >>>> 1), DMView() with PETSC_VIEWER_HDF5_VIZ; in particular, the matrix >>>> generated under /viz/topology/cells. >>>> >>>> There are several ways you might do this. It helps to know what you are >>>> aiming for. >>>> >>>> 1) If you just want this output, it might be easier to just DMView() >>>> with the PETSC_VIEWER_HDF5_VIZ format, since that just outputs the >>>> cell-vertex topology and coordinates >>>> >>>> >>>> Is it possible to get this information in memory, onto a Mat, Vec or >>>> some other Int array object directly? 
it would be handy to have it in order >>>> to manipulate it and/or save it to a different format/file. Saving to an >>>> HDF5 and loading it again seems redundant. >>>> >>>> >>>> 2) You can call DMPlexUninterpolate() to produce a mesh with just cells >>>> and vertices, and output it in any format. >>>> >>>> 3) If you want it in memory, but still with global indices (I don't >>>> understand this use case), then you can use DMPlexCreatePointNumbering() >>>> for an overall global numbering, or DMPlexCreateCellNumbering() and >>>> DMPlexCreateVertexNumbering() for separate global numberings. >>>> >>>> >>>> Perhaps I missed it, but getting the connectivity matrix in >>>> /viz/topology/cells/ did not seem directly trivial to me from the list of >>>> global indices returned by DMPlexGetCell/Point/VertexNumbering() (i.e. I >>>> assume all the operations done when calling DMView()). >>>> >>> >>> Something like >>> >>> DMPlexGetHeightStratum(dm, 0, &cStart, &cEnd); >>> DMPlexGetDepthStratum(dm, 0, &vStart, &vEnd); >>> DMPlexGetVertexNumbering(dm, &globalVertexNumbers); >>> ISGetIndices(globalVertexNumbers, &gv); >>> for (PetscInt c = cStart; c < cEnd; ++c) { >>> PetscInt *closure = NULL; >>> >>> DMPlexGetTransitiveClosure(dm, c, PETSC_TRUE, &Ncl, &closure); >>> for (PetscInt cl = 0; c < Ncl * 2; cl += 2) { >>> if (closure[cl] < vStart || closure[cl] >= vEnd) continue; >>> const PetscInt v = gv[closure[cl]] < 0 ? -(gv[closure[cl]] + 1) : >>> gv[closure[cl]]; >>> >>> // Do something with v >>> } >>> DMPlexRestoreTransitiveClosure(dm, c, PETSC_TRUE, &Ncl, &closure); >>> } >>> ISRestoreIndices(globalVertexNumbers, &gv); >>> ISDestroy(&globalVertexNumbers); >>> >>> Thanks, >>> >>> Matt >>> >>> Thanks, >>>> Noam. >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!YK9T8_rEaNbJIndfx3tWHNlZlZ5GTj_KRDshR-nc5atZltZCcw73PtSNcrtvjWXZY73l5jQsAUx6qcjKXTR_$ >>> >>> >>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!YK9T8_rEaNbJIndfx3tWHNlZlZ5GTj_KRDshR-nc5atZltZCcw73PtSNcrtvjWXZY73l5jQsAUx6qcjKXTR_$ >> >> >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!YK9T8_rEaNbJIndfx3tWHNlZlZ5GTj_KRDshR-nc5atZltZCcw73PtSNcrtvjWXZY73l5jQsAUx6qcjKXTR_$ > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!YK9T8_rEaNbJIndfx3tWHNlZlZ5GTj_KRDshR-nc5atZltZCcw73PtSNcrtvjWXZY73l5jQsAUx6qcjKXTR_$ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From hexiaofeng at buaa.edu.cn Thu Jun 19 06:10:41 2025 From: hexiaofeng at buaa.edu.cn (hexioafeng) Date: Thu, 19 Jun 2025 19:10:41 +0800 Subject: [petsc-users] Questions Regarding PETSc and Solving Constrained Structural Mechanics Problems In-Reply-To: References: <3BF3C1E8-0CB0-42F1-A624-8FA0DC7FD4A4@buaa.edu.cn> <35A61411-85CF-4F48-9DD6-0409F0CFE598@petsc.dev> <5CFF6556-4BDE-48D9-9D3A-6D8790465358@buaa.edu.cn> Message-ID: <87A02E48-DDE4-4DCD-8C52-D2DAF975EF01@buaa.edu.cn> Dear authors, Here are the options passed with fieldsplit preconditioner: -ksp_type cg -pc_type fieldsplit -pc_fieldsplit_detect_saddle_point -pc_fieldsplit_type schur -pc_fieldsplit_schur_precondition selfp -pc_fieldsplit_schur_fact_type full -fieldsplit_0_ksp_type preonly -fieldsplit_0_pc_type gamg -fieldsplit_0_mg_coarse_sub_pc_type_type svd -fieldsplit_1_ksp_type preonly -fieldsplit_1_pc_type bjacobi -ksp_view -ksp_monitor_true_residual -ksp_converged_reason -fieldsplit_0_mg_levels_ksp_monitor_true_residual -fieldsplit_0_mg_levels_ksp_converged_reason -fieldsplit_1_ksp_monitor_true_residual -fieldsplit_1_ksp_converged_reason and the output: 0 KSP unpreconditioned resid norm 2.777777777778e+01 true resid norm 2.777777777778e+01 ||r(i)||/||b|| 1.000000000000e+00 Linear fieldsplit_0_mg_levels_1_ solve converged due to CONVERGED_ITS iterations 2 Linear fieldsplit_0_mg_levels_1_ solve converged due to CONVERGED_ITS iterations 2 Linear fieldsplit_1_ solve did not converge due to DIVERGED_PC_FAILED iterations 0 PC failed due to SUBPC_ERROR Linear fieldsplit_0_mg_levels_1_ solve converged due to CONVERGED_ITS iterations 2 Linear fieldsplit_0_mg_levels_1_ solve converged due to CONVERGED_ITS iterations 2 Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 PC failed due to SUBPC_ERROR KSP Object: 1 MPI processes type: cg maximum iterations=200, initial guess is zero tolerances: relative=1e-06, absolute=1e-12, divergence=1e+30 left preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: fieldsplit FieldSplit with Schur preconditioner, blocksize = 1, factorization FULL Preconditioner for the Schur complement formed from Sp, an assembled approximation to S, which uses A00's diagonal's inverse Split info: Split number 0 Defined by IS Split number 1 Defined by IS KSP solver for A00 block KSP Object: (fieldsplit_0_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_0_) 1 MPI processes type: gamg type is MULTIPLICATIVE, levels=2 cycles=v Cycles per PCApply=1 Using externally compute Galerkin coarse grid matrices GAMG specific options Threshold for dropping small values in graph on each level = Threshold scaling factor for each level not specified = 1. AGG specific options Symmetric graph false Number of levels to square graph 1 Number smoothing steps 1 Complexity: grid = 1.00222 Coarse grid solver -- level ------------------------------- KSP Object: (fieldsplit_0_mg_coarse_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_0_mg_coarse_) 1 MPI processes type: bjacobi number of blocks = 1 Local solver is the same for all blocks, as in the following KSP and PC objects on rank 0: KSP Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI processes type: lu out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot [INBLOCKS] matrix ordering: nd factor fill ratio given 5., needed 1. Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=8, cols=8 package used to perform factorization: petsc total: nonzeros=56, allocated nonzeros=56 using I-node routines: found 3 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=8, cols=8 total: nonzeros=56, allocated nonzeros=56 total number of mallocs used during MatSetValues calls=0 using I-node routines: found 3 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: 1 MPI processes type: mpiaij rows=8, cols=8 total: nonzeros=56, allocated nonzeros=56 total number of mallocs used during MatSetValues calls=0 using nonscalable MatPtAP() implementation using I-node (on process 0) routines: found 3 nodes, limit used is 5 Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes type: chebyshev eigenvalue estimates used: min = 0.0998145, max = 1.09796 eigenvalues estimate via gmres min 0.00156735, max 0.998145 eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1] KSP Object: (fieldsplit_0_mg_levels_1_esteig_) 1 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=10, initial guess is zero tolerances: relative=1e-12, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test estimating eigenvalues using noisy right hand side maximum iterations=2, nonzero initial guess tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes type: sor type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. linear system matrix = precond matrix: Mat Object: (fieldsplit_0_) 1 MPI processes type: mpiaij rows=480, cols=480 total: nonzeros=25200, allocated nonzeros=25200 total number of mallocs used during MatSetValues calls=0 using I-node (on process 0) routines: found 160 nodes, limit used is 5 Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Mat Object: (fieldsplit_0_) 1 MPI processes type: mpiaij rows=480, cols=480 total: nonzeros=25200, allocated nonzeros=25200 total number of mallocs used during MatSetValues calls=0 using I-node (on process 0) routines: found 160 nodes, limit used is 5 KSP solver for S = A11 - A10 inv(A00) A01 KSP Object: (fieldsplit_1_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_1_) 1 MPI processes type: bjacobi number of blocks = 1 Local solver is the same for all blocks, as in the following KSP and PC objects on rank 0: KSP Object: (fieldsplit_1_sub_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_1_sub_) 1 MPI processes type: bjacobi number of blocks = 1 Local solver is the same for all blocks, as in the following KSP and PC objects on rank 0: KSP Object: (fieldsplit_1_sub_sub_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_1_sub_sub_) 1 MPI processes type: ilu out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 matrix ordering: natural factor fill ratio given 1., needed 1. Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=144, cols=144 package used to perform factorization: petsc total: nonzeros=240, allocated nonzeros=240 not using I-node routines linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=144, cols=144 total: nonzeros=240, allocated nonzeros=240 total number of mallocs used during MatSetValues calls=0 not using I-node routines linear system matrix = precond matrix: Mat Object: 1 MPI processes type: mpiaij rows=144, cols=144 total: nonzeros=240, allocated nonzeros=240 total number of mallocs used during MatSetValues calls=0 not using I-node (on process 0) routines linear system matrix followed by preconditioner matrix: Mat Object: (fieldsplit_1_) 1 MPI processes type: schurcomplement rows=144, cols=144 Schur complement A11 - A10 inv(A00) A01 A11 Mat Object: (fieldsplit_1_) 1 MPI processes type: mpiaij rows=144, cols=144 total: nonzeros=240, allocated nonzeros=240 total number of mallocs used during MatSetValues calls=0 not using I-node (on process 0) routines A10 Mat Object: 1 MPI processes type: mpiaij rows=144, cols=480 total: nonzeros=48, allocated nonzeros=48 total number of mallocs used during MatSetValues calls=0 using I-node (on process 0) routines: found 74 nodes, limit used is 5 KSP of A00 KSP Object: (fieldsplit_0_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_0_) 1 MPI processes type: gamg type is MULTIPLICATIVE, levels=2 cycles=v Cycles per PCApply=1 Using externally compute Galerkin coarse grid matrices GAMG specific options Threshold for dropping small values in graph on each level = Threshold scaling factor for each level not specified = 1. AGG specific options Symmetric graph false Number of levels to square graph 1 Number smoothing steps 1 Complexity: grid = 1.00222 Coarse grid solver -- level ------------------------------- KSP Object: (fieldsplit_0_mg_coarse_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_0_mg_coarse_) 1 MPI processes type: bjacobi number of blocks = 1 Local solver is the same for all blocks, as in the following KSP and PC objects on rank 0: KSP Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI processes type: lu out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot [INBLOCKS] matrix ordering: nd factor fill ratio given 5., needed 1. Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=8, cols=8 package used to perform factorization: petsc total: nonzeros=56, allocated nonzeros=56 using I-node routines: found 3 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=8, cols=8 total: nonzeros=56, allocated nonzeros=56 total number of mallocs used during MatSetValues calls=0 using I-node routines: found 3 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: 1 MPI processes type: mpiaij rows=8, cols=8 total: nonzeros=56, allocated nonzeros=56 total number of mallocs used during MatSetValues calls=0 using nonscalable MatPtAP() implementation using I-node (on process 0) routines: found 3 nodes, limit used is 5 Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes type: chebyshev eigenvalue estimates used: min = 0.0998145, max = 1.09796 eigenvalues estimate via gmres min 0.00156735, max 0.998145 eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1] KSP Object: (fieldsplit_0_mg_levels_1_esteig_) 1 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=10, initial guess is zero tolerances: relative=1e-12, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test estimating eigenvalues using noisy right hand side maximum iterations=2, nonzero initial guess tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes type: sor type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. 
linear system matrix = precond matrix: Mat Object: (fieldsplit_0_) 1 MPI processes type: mpiaij rows=480, cols=480 total: nonzeros=25200, allocated nonzeros=25200 total number of mallocs used during MatSetValues calls=0 using I-node (on process 0) routines: found 160 nodes, limit used is 5 Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Mat Object: (fieldsplit_0_) 1 MPI processes type: mpiaij rows=480, cols=480 total: nonzeros=25200, allocated nonzeros=25200 total number of mallocs used during MatSetValues calls=0 using I-node (on process 0) routines: found 160 nodes, limit used is 5 A01 Mat Object: 1 MPI processes type: mpiaij rows=480, cols=144 total: nonzeros=48, allocated nonzeros=48 total number of mallocs used during MatSetValues calls=0 using I-node (on process 0) routines: found 135 nodes, limit used is 5 Mat Object: 1 MPI processes type: mpiaij rows=144, cols=144 total: nonzeros=240, allocated nonzeros=240 total number of mallocs used during MatSetValues calls=0 not using I-node (on process 0) routines linear system matrix = precond matrix: Mat Object: 1 MPI processes type: mpiaij rows=624, cols=624 total: nonzeros=25536, allocated nonzeros=25536 total number of mallocs used during MatSetValues calls=0 using I-node (on process 0) routines: found 336 nodes, limit used is 5 Thanks, Xiaofeng > On Jun 17, 2025, at 19:05, Mark Adams wrote: > > And don't use -pc_gamg_parallel_coarse_grid_solver > You can use that in production but for debugging use -mg_coarse_pc_type svd > Also, use -options_left and remove anything that is not used. > (I am puzzled, I see -pc_type gamg not -pc_type fieldsplit) > > Mark > > > On Mon, Jun 16, 2025 at 6:40?AM Matthew Knepley > wrote: >> On Sun, Jun 15, 2025 at 9:46?PM hexioafeng > wrote: >>> Hello, >>> >>> Here are the options and outputs: >>> >>> options: >>> >>> -ksp_type cg -pc_type gamg -pc_gamg_parallel_coarse_grid_solver -pc_fieldsplit_detect_saddle_point -pc_fieldsplit_type schur -pc_fieldsplit_schur_precondition selfp -fieldsplit_1_mat_schur_complement_ainv_type lump -pc_fieldsplit_schur_fact_type full -fieldsplit_0_ksp_type preonly -fieldsplit_0_pc_type gamg -fieldsplit_0_mg_coarse_pc_type_type svd -fieldsplit_1_ksp_type preonly -fieldsplit_1_pc_type bjacobi -fieldsplit_1_sub_pc_type sor -ksp_view -ksp_monitor_true_residual -ksp_converged_reason -fieldsplit_0_mg_levels_ksp_monitor_true_residual -fieldsplit_0_mg_levels_ksp_converged_reason -fieldsplit_1_ksp_monitor_true_residual -fieldsplit_1_ksp_converged_reason >> >> This option was wrong: >> >> -fieldsplit_0_mg_coarse_pc_type_type svd >> >> from the output, we can see that it should have been >> >> -fieldsplit_0_mg_coarse_sub_pc_type_type svd >> >> THanks, >> >> Matt >> >>> output: >>> >>> 0 KSP unpreconditioned resid norm 2.777777777778e+01 true resid norm 2.777777777778e+01 ||r(i)||/||b|| 1.000000000000e+00 >>> Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 >>> PC failed due to SUBPC_ERROR >>> KSP Object: 1 MPI processes >>> type: cg >>> maximum iterations=200, initial guess is zero >>> tolerances: relative=1e-06, absolute=1e-12, divergence=1e+30 >>> left preconditioning >>> using UNPRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI processes >>> type: gamg >>> type is MULTIPLICATIVE, levels=2 cycles=v >>> Cycles per PCApply=1 >>> Using externally compute Galerkin coarse grid matrices >>> GAMG specific options >>> Threshold for dropping small values in graph on each level = >>> Threshold scaling factor 
for each level not specified = 1. >>> AGG specific options >>> Symmetric graph false >>> Number of levels to square graph 1 >>> Number smoothing steps 1 >>> Complexity: grid = 1.00176 >>> Coarse grid solver -- level ------------------------------- >>> KSP Object: (mg_coarse_) 1 MPI processes >>> type: preonly >>> maximum iterations=10000, initial guess is zero >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>> left preconditioning >>> using NONE norm type for convergence test >>> PC Object: (mg_coarse_) 1 MPI processes >>> type: bjacobi >>> number of blocks = 1 >>> Local solver is the same for all blocks, as in the following KSP and PC objects on rank 0: >>> KSP Object: (mg_coarse_sub_) 1 MPI processes >>> type: preonly >>> maximum iterations=1, initial guess is zero >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>> left preconditioning >>> using NONE norm type for convergence test >>> PC Object: (mg_coarse_sub_) 1 MPI processes >>> type: lu >>> out-of-place factorization >>> tolerance for zero pivot 2.22045e-14 >>> using diagonal shift on blocks to prevent zero pivot [INBLOCKS] >>> matrix ordering: nd >>> factor fill ratio given 5., needed 1. >>> Factored matrix follows: >>> Mat Object: 1 MPI processes >>> type: seqaij >>> rows=7, cols=7 >>> package used to perform factorization: petsc >>> total: nonzeros=45, allocated nonzeros=45 >>> using I-node routines: found 3 nodes, limit used is 5 >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI processes >>> type: seqaij >>> rows=7, cols=7 >>> total: nonzeros=45, allocated nonzeros=45 >>> total number of mallocs used during MatSetValues calls=0 >>> using I-node routines: found 3 nodes, limit used is 5 >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI processes >>> type: mpiaij >>> rows=7, cols=7 >>> total: nonzeros=45, allocated nonzeros=45 >>> total number of mallocs used during MatSetValues calls=0 >>> using nonscalable MatPtAP() implementation >>> using I-node (on process 0) routines: found 3 nodes, limit used is 5 >>> Down solver (pre-smoother) on level 1 ------------------------------- >>> KSP Object: (mg_levels_1_) 1 MPI processes >>> type: chebyshev >>> eigenvalue estimates used: min = 0., max = 0. >>> eigenvalues estimate via gmres min 0., max 0. >>> eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1] >>> KSP Object: (mg_levels_1_esteig_) 1 MPI processes >>> type: gmres >>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=10, initial guess is zero >>> tolerances: relative=1e-12, absolute=1e-50, divergence=10000. >>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: (mg_levels_1_) 1 MPI processes >>> type: sor >>> type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI processes >>> type: mpiaij >>> rows=624, cols=624 >>> total: nonzeros=25536, allocated nonzeros=25536 >>> total number of mallocs used during MatSetValues calls=0 >>> using I-node (on process 0) routines: found 336 nodes, limit used is 5 >>> estimating eigenvalues using noisy right hand side >>> maximum iterations=2, nonzero initial guess >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
>>> left preconditioning >>> using NONE norm type for convergence test >>> PC Object: (mg_levels_1_) 1 MPI processes >>> type: sor >>> type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. linear system matrix = precond matrix: >>> Mat Object: 1 MPI processes >>> type: mpiaij >>> rows=624, cols=624 >>> total: nonzeros=25536, allocated nonzeros=25536 >>> total number of mallocs used during MatSetValues calls=0 >>> using I-node (on process 0) routines: found 336 nodes, limit used is 5 Up solver (post-smoother) same as down solver (pre-smoother) >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI processes >>> type: mpiaij >>> rows=624, cols=624 >>> total: nonzeros=25536, allocated nonzeros=25536 >>> total number of mallocs used during MatSetValues calls=0 >>> using I-node (on process 0) routines: found 336 nodes, limit used is 5 >>> >>> >>> Best regards, >>> >>> Xiaofeng >>> >>> >>>> On Jun 14, 2025, at 07:28, Barry Smith > wrote: >>>> >>>> >>>> Matt, >>>> >>>> Perhaps we should add options -ksp_monitor_debug and -snes_monitor_debug that turn on all possible monitoring for the (possibly) nested solvers and all of their converged reasons also? Note this is not completely trivial because each preconditioner will have to supply its list based on the current solver options for it. >>>> >>>> Then we won't need to constantly list a big string of problem specific monitor options to ask the user to use. >>>> >>>> Barry >>>> >>>> >>>> >>>> >>>>> On Jun 13, 2025, at 9:09?AM, Matthew Knepley > wrote: >>>>> >>>>> On Thu, Jun 12, 2025 at 10:55?PM hexioafeng > wrote: >>>>>> Dear authors, >>>>>> >>>>>> I tried -pc_type game -pc_gamg_parallel_coarse_grid_solver and -pc_type field split -pc_fieldsplit_detect_saddle_point -fieldsplit_0_ksp_type pronely -fieldsplit_0_pc_type game -fieldsplit_0_mg_coarse_pc_type sad -fieldsplit_1_ksp_type pronely -fieldsplit_1_pc_type Jacobi _fieldsplit_1_sub_pc_type for , both options got the KSP_DIVERGE_PC_FAILED error. >>>>> >>>>> With any question about convergence, we need to see the output of >>>>> >>>>> -ksp_view -ksp_monitor_true_residual -ksp_converged_reason -fieldsplit_0_mg_levels_ksp_monitor_true_residual -fieldsplit_0_mg_levels_ksp_converged_reason -fieldsplit_1_ksp_monitor_true_residual -fieldsplit_1_ksp_converged_reason >>>>> >>>>> and all the error output. >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>>> Thanks, >>>>>> >>>>>> Xiaofeng >>>>>> >>>>>> >>>>>>> On Jun 12, 2025, at 20:50, Mark Adams > wrote: >>>>>>> >>>>>>> >>>>>>> >>>>>>> On Thu, Jun 12, 2025 at 8:44?AM Matthew Knepley > wrote: >>>>>>>> On Thu, Jun 12, 2025 at 4:58?AM Mark Adams > wrote: >>>>>>>>> Adding this to the PETSc mailing list, >>>>>>>>> >>>>>>>>> On Thu, Jun 12, 2025 at 3:43?AM hexioafeng > wrote: >>>>>>>>>> >>>>>>>>>> Dear Professor, >>>>>>>>>> >>>>>>>>>> I hope this message finds you well. >>>>>>>>>> >>>>>>>>>> I am an employee at a CAE company and a heavy user of the PETSc library. I would like to thank you for your contributions to PETSc and express my deep appreciation for your work. >>>>>>>>>> >>>>>>>>>> Recently, I encountered some difficulties when using PETSc to solve structural mechanics problems with Lagrange multiplier constraints. After searching extensively online and reviewing several papers, I found your previous paper titled "Algebraic multigrid methods for constrained linear systems with applications to contact problems in solid mechanics" seems to be the most relevant and helpful. 
>>>>>>>>>> >>>>>>>>>> The stiffness matrix I'm working with, K, is a block saddle-point matrix of the form (A00 A01; A10 0), where A00 is singular?just as described in your paper, and different from many other articles . I have a few questions regarding your work and would greatly appreciate your insights: >>>>>>>>>> >>>>>>>>>> 1. Is the AMG/KKT method presented in your paper available in PETSc? I tried using CG+GAMG directly but received a KSP_DIVERGED_PC_FAILED error. I also attempted to use CG+PCFIELDSPLIT with the following options: >>>>>>>>> >>>>>>>>> No >>>>>>>>> >>>>>>>>>> >>>>>>>>>> -pc_type fieldsplit -pc_fieldsplit_detect_saddle_point -pc_fieldsplit_type schur -pc_fieldsplit_schur_precondition selfp -pc_fieldsplit_schur_fact_type full -fieldsplit_0_ksp_type preonly -fieldsplit_0_pc_type gamg -fieldsplit_1_ksp_type preonly -fieldsplit_1_pc_type bjacobi >>>>>>>>>> >>>>>>>>>> Unfortunately, this also resulted in a KSP_DIVERGED_PC_FAILED error. Do you have any suggestions? >>>>>>>>>> >>>>>>>>>> 2. In your paper, you compare the method with Uzawa-type approaches. To my understanding, Uzawa methods typically require A00 to be invertible. How did you handle the singularity of A00 to construct an M-matrix that is invertible? >>>>>>>>>> >>>>>>>>> >>>>>>>>> You add a regularization term like A01 * A10 (like springs). See the paper or any reference to augmented lagrange or Uzawa >>>>>>>>> >>>>>>>>> >>>>>>>>>> 3. Can i implement the AMG/KKT method in your paper using existing AMG APIs? Implementing a production-level AMG solver from scratch would be quite challenging for me, so I?m hoping to utilize existing AMG interfaces within PETSc or other packages. >>>>>>>>>> >>>>>>>>> >>>>>>>>> You can do Uzawa and make the regularization matrix with matrix-matrix products. Just use AMG for the A00 block. >>>>>>>>> >>>>>>>>> >>>>>>>>>> 4. For saddle-point systems where A00 is singular, can you recommend any more robust or efficient solutions? Alternatively, are you aware of any open-source software packages that can handle such cases out-of-the-box? >>>>>>>>>> >>>>>>>>> >>>>>>>>> No, and I don't think PETSc can do this out-of-the-box, but others may be able to give you a better idea of what PETSc can do. >>>>>>>>> I think PETSc can do Uzawa or other similar algorithms but it will not do the regularization automatically (it is a bit more complicated than just A01 * A10) >>>>>>>> >>>>>>>> One other trick you can use is to have >>>>>>>> >>>>>>>> -fieldsplit_0_mg_coarse_pc_type svd >>>>>>>> >>>>>>>> This will use SVD on the coarse grid of GAMG, which can handle the null space in A00 as long as the prolongation does not put it back in. I have used this for the Laplacian with Neumann conditions and for freely floating elastic problems. >>>>>>>> >>>>>>> >>>>>>> Good point. >>>>>>> You can also use -pc_gamg_parallel_coarse_grid_solver to get GAMG to use a on level iterative solver for the coarse grid. >>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Matt >>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Mark >>>>>>>>>> >>>>>>>>>> Thank you very much for taking the time to read my email. Looking forward to hearing from you. 
>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Sincerely, >>>>>>>>>> Xiaofeng He >>>>>>>>>> ----------------------------------------------------- >>>>>>>>>> Research Engineer >>>>>>>>>> Internet Based Engineering, Beijing, China >>>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>>>> -- Norbert Wiener >>>>>>>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!byYxhIBJt_hYquyBdn7tu8cOQZhdItp2R6lnYK5xb64Ums47URE9JMIkS73yNRFqYf6smAJ-0ss8pTCDv4S8C8te60bgDg$ >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!byYxhIBJt_hYquyBdn7tu8cOQZhdItp2R6lnYK5xb64Ums47URE9JMIkS73yNRFqYf6smAJ-0ss8pTCDv4S8C8te60bgDg$ >>>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!byYxhIBJt_hYquyBdn7tu8cOQZhdItp2R6lnYK5xb64Ums47URE9JMIkS73yNRFqYf6smAJ-0ss8pTCDv4S8C8te60bgDg$ From knepley at gmail.com Thu Jun 19 06:45:47 2025 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 19 Jun 2025 07:45:47 -0400 Subject: [petsc-users] Questions Regarding PETSc and Solving Constrained Structural Mechanics Problems In-Reply-To: <87A02E48-DDE4-4DCD-8C52-D2DAF975EF01@buaa.edu.cn> References: <3BF3C1E8-0CB0-42F1-A624-8FA0DC7FD4A4@buaa.edu.cn> <35A61411-85CF-4F48-9DD6-0409F0CFE598@petsc.dev> <5CFF6556-4BDE-48D9-9D3A-6D8790465358@buaa.edu.cn> <87A02E48-DDE4-4DCD-8C52-D2DAF975EF01@buaa.edu.cn> Message-ID: This option is wrong -fieldsplit_0_mg_coarse_sub_pc_type_type svd Notice that "_type" is repeated.
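With the duplicated suffix removed, the option presumably intended is

    -fieldsplit_0_mg_coarse_sub_pc_type svd

which would replace the LU factorization on the (fieldsplit_0_mg_coarse_sub_) block shown in the -ksp_view output above with an SVD solve on GAMG's coarse grid.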
Thanks, Matt On Thu, Jun 19, 2025 at 7:10?AM hexioafeng wrote: > Dear authors, > > Here are the options passed with fieldsplit preconditioner: > > -ksp_type cg -pc_type fieldsplit -pc_fieldsplit_detect_saddle_point > -pc_fieldsplit_type schur -pc_fieldsplit_schur_precondition selfp > -pc_fieldsplit_schur_fact_type full -fieldsplit_0_ksp_type preonly > -fieldsplit_0_pc_type gamg -fieldsplit_0_mg_coarse_sub_pc_type_type svd > -fieldsplit_1_ksp_type preonly -fieldsplit_1_pc_type bjacobi -ksp_view > -ksp_monitor_true_residual -ksp_converged_reason > -fieldsplit_0_mg_levels_ksp_monitor_true_residual > -fieldsplit_0_mg_levels_ksp_converged_reason > -fieldsplit_1_ksp_monitor_true_residual > -fieldsplit_1_ksp_converged_reason > > and the output: > > 0 KSP unpreconditioned resid norm 2.777777777778e+01 true resid norm > 2.777777777778e+01 ||r(i)||/||b|| 1.000000000000e+00 > Linear fieldsplit_0_mg_levels_1_ solve converged due to CONVERGED_ITS > iterations 2 > Linear fieldsplit_0_mg_levels_1_ solve converged due to CONVERGED_ITS > iterations 2 > Linear fieldsplit_1_ solve did not converge due to DIVERGED_PC_FAILED > iterations 0 > PC failed due to SUBPC_ERROR > Linear fieldsplit_0_mg_levels_1_ solve converged due to CONVERGED_ITS > iterations 2 > Linear fieldsplit_0_mg_levels_1_ solve converged due to CONVERGED_ITS > iterations 2 > Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 > PC failed due to SUBPC_ERROR > KSP Object: 1 MPI processes > type: cg > maximum iterations=200, initial guess is zero > tolerances: relative=1e-06, absolute=1e-12, divergence=1e+30 > left preconditioning > using UNPRECONDITIONED norm type for convergence test > PC Object: 1 MPI processes > type: fieldsplit > FieldSplit with Schur preconditioner, blocksize = 1, factorization FULL > Preconditioner for the Schur complement formed from Sp, an assembled > approximation to S, which uses A00's diagonal's inverse > Split info: > Split number 0 Defined by IS > Split number 1 Defined by IS > KSP solver for A00 block > KSP Object: (fieldsplit_0_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_0_) 1 MPI processes > type: gamg > type is MULTIPLICATIVE, levels=2 cycles=v > Cycles per PCApply=1 > Using externally compute Galerkin coarse grid matrices > GAMG specific options > Threshold for dropping small values in graph on each level = > > Threshold scaling factor for each level not specified = 1. > AGG specific options > Symmetric graph false > Number of levels to square graph 1 > Number smoothing steps 1 > Complexity: grid = 1.00222 > Coarse grid solver -- level ------------------------------- > KSP Object: (fieldsplit_0_mg_coarse_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_0_mg_coarse_) 1 MPI processes > type: bjacobi > number of blocks = 1 > Local solver is the same for all blocks, as in the following > KSP and PC objects on rank 0: > KSP Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, > divergence=10000. 
> left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI processes > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > using diagonal shift on blocks to prevent zero pivot > [INBLOCKS] > matrix ordering: nd > factor fill ratio given 5., needed 1. > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=8, cols=8 > package used to perform factorization: petsc > total: nonzeros=56, allocated nonzeros=56 > using I-node routines: found 3 nodes, limit used > is 5 > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=8, cols=8 > total: nonzeros=56, allocated nonzeros=56 > total number of mallocs used during MatSetValues calls=0 > using I-node routines: found 3 nodes, limit used is 5 > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: mpiaij > rows=8, cols=8 > total: nonzeros=56, allocated nonzeros=56 > total number of mallocs used during MatSetValues calls=0 > using nonscalable MatPtAP() implementation > using I-node (on process 0) routines: found 3 nodes, limit > used is 5 > Down solver (pre-smoother) on level 1 > ------------------------------- > KSP Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes > type: chebyshev > eigenvalue estimates used: min = 0.0998145, max = 1.09796 > eigenvalues estimate via gmres min 0.00156735, max 0.998145 > eigenvalues estimated using gmres with translations [0. > 0.1; 0. 1.1] > KSP Object: (fieldsplit_0_mg_levels_1_esteig_) 1 MPI > processes > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=10, initial guess is zero > tolerances: relative=1e-12, absolute=1e-50, > divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > estimating eigenvalues using noisy right hand side > maximum iterations=2, nonzero initial guess > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes > type: sor > type = local_symmetric, iterations = 1, local iterations = > 1, omega = 1. > linear system matrix = precond matrix: > Mat Object: (fieldsplit_0_) 1 MPI processes > type: mpiaij > rows=480, cols=480 > total: nonzeros=25200, allocated nonzeros=25200 > total number of mallocs used during MatSetValues calls=0 > using I-node (on process 0) routines: found 160 nodes, > limit used is 5 > Up solver (post-smoother) same as down solver (pre-smoother) > linear system matrix = precond matrix: > Mat Object: (fieldsplit_0_) 1 MPI processes > type: mpiaij > rows=480, cols=480 > total: nonzeros=25200, allocated nonzeros=25200 > total number of mallocs used during MatSetValues calls=0 > using I-node (on process 0) routines: found 160 nodes, limit > used is 5 > KSP solver for S = A11 - A10 inv(A00) A01 > KSP Object: (fieldsplit_1_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_1_) 1 MPI processes > type: bjacobi > number of blocks = 1 > Local solver is the same for all blocks, as in the following KSP > and PC objects on rank 0: > KSP Object: (fieldsplit_1_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_1_sub_) 1 MPI processes > type: bjacobi > number of blocks = 1 > Local solver is the same for all blocks, as in the following > KSP and PC objects on rank 0: > KSP Object: (fieldsplit_1_sub_sub_) > 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, > divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_1_sub_sub_) > 1 MPI processes > type: ilu > out-of-place factorization > 0 levels of fill > tolerance for zero pivot 2.22045e-14 > matrix ordering: natural > factor fill ratio given 1., needed 1. > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=144, cols=144 > package used to perform factorization: petsc > total: nonzeros=240, allocated nonzeros=240 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=144, cols=144 > total: nonzeros=240, allocated nonzeros=240 > total number of mallocs used during MatSetValues > calls=0 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: mpiaij > rows=144, cols=144 > total: nonzeros=240, allocated nonzeros=240 > total number of mallocs used during MatSetValues calls=0 > not using I-node (on process 0) routines > linear system matrix followed by preconditioner matrix: > Mat Object: (fieldsplit_1_) 1 MPI processes > type: schurcomplement > rows=144, cols=144 > Schur complement A11 - A10 inv(A00) A01 > A11 > Mat Object: (fieldsplit_1_) 1 MPI processes > type: mpiaij > rows=144, cols=144 > total: nonzeros=240, allocated nonzeros=240 > total number of mallocs used during MatSetValues calls=0 > not using I-node (on process 0) routines > A10 > Mat Object: 1 MPI processes > type: mpiaij > rows=144, cols=480 > total: nonzeros=48, allocated nonzeros=48 > total number of mallocs used during MatSetValues calls=0 > using I-node (on process 0) routines: found 74 nodes, > limit used is 5 > KSP of A00 > KSP Object: (fieldsplit_0_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, > divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_0_) 1 MPI processes > type: gamg > type is MULTIPLICATIVE, levels=2 cycles=v > Cycles per PCApply=1 > Using externally compute Galerkin coarse grid matrices > GAMG specific options > Threshold for dropping small values in graph on each > level = > Threshold scaling factor for each level not > specified = 1. > AGG specific options > Symmetric graph false > Number of levels to square graph 1 > Number smoothing steps 1 > Complexity: grid = 1.00222 > Coarse grid solver -- level ------------------------------- > KSP Object: (fieldsplit_0_mg_coarse_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, > divergence=10000. 
> left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_0_mg_coarse_) 1 MPI processes > type: bjacobi > number of blocks = 1 > Local solver is the same for all blocks, as in the > following KSP and PC objects on rank 0: > KSP Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI > processes > type: preonly > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, > divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI > processes > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > using diagonal shift on blocks to prevent zero > pivot [INBLOCKS] > matrix ordering: nd > factor fill ratio given 5., needed 1. > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=8, cols=8 > package used to perform factorization: petsc > total: nonzeros=56, allocated nonzeros=56 > using I-node routines: found 3 nodes, > limit used is 5 > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=8, cols=8 > total: nonzeros=56, allocated nonzeros=56 > total number of mallocs used during MatSetValues > calls=0 > using I-node routines: found 3 nodes, limit used > is 5 > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: mpiaij > rows=8, cols=8 > total: nonzeros=56, allocated nonzeros=56 > total number of mallocs used during MatSetValues > calls=0 > using nonscalable MatPtAP() implementation > using I-node (on process 0) routines: found 3 > nodes, limit used is 5 > Down solver (pre-smoother) on level 1 > ------------------------------- > KSP Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes > type: chebyshev > eigenvalue estimates used: min = 0.0998145, max = > 1.09796 > eigenvalues estimate via gmres min 0.00156735, max > 0.998145 > eigenvalues estimated using gmres with translations > [0. 0.1; 0. 1.1] > KSP Object: (fieldsplit_0_mg_levels_1_esteig_) 1 MPI > processes > type: gmres > restart=30, using Classical (unmodified) > Gram-Schmidt Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=10, initial guess is zero > tolerances: relative=1e-12, absolute=1e-50, > divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > estimating eigenvalues using noisy right hand side > maximum iterations=2, nonzero initial guess > tolerances: relative=1e-05, absolute=1e-50, > divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes > type: sor > type = local_symmetric, iterations = 1, local > iterations = 1, omega = 1. 
> linear system matrix = precond matrix: > Mat Object: (fieldsplit_0_) 1 MPI processes > type: mpiaij > rows=480, cols=480 > total: nonzeros=25200, allocated nonzeros=25200 > total number of mallocs used during MatSetValues > calls=0 > using I-node (on process 0) routines: found 160 > nodes, limit used is 5 > Up solver (post-smoother) same as down solver > (pre-smoother) > linear system matrix = precond matrix: > Mat Object: (fieldsplit_0_) 1 MPI processes > type: mpiaij > rows=480, cols=480 > total: nonzeros=25200, allocated nonzeros=25200 > total number of mallocs used during MatSetValues calls=0 > using I-node (on process 0) routines: found 160 nodes, > limit used is 5 > A01 > Mat Object: 1 MPI processes > type: mpiaij > rows=480, cols=144 > total: nonzeros=48, allocated nonzeros=48 > total number of mallocs used during MatSetValues calls=0 > using I-node (on process 0) routines: found 135 nodes, > limit used is 5 > Mat Object: 1 MPI processes > type: mpiaij > rows=144, cols=144 > total: nonzeros=240, allocated nonzeros=240 > total number of mallocs used during MatSetValues calls=0 > not using I-node (on process 0) routines > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: mpiaij > rows=624, cols=624 > total: nonzeros=25536, allocated nonzeros=25536 > total number of mallocs used during MatSetValues calls=0 > using I-node (on process 0) routines: found 336 nodes, limit used is > 5 > > > Thanks, > Xiaofeng > > > > On Jun 17, 2025, at 19:05, Mark Adams wrote: > > And don't use -pc_gamg_parallel_coarse_grid_solver > You can use that in production but for debugging use -mg_coarse_pc_type svd > Also, use -options_left and remove anything that is not used. > (I am puzzled, I see -pc_type gamg not -pc_type fieldsplit) > > Mark > > > On Mon, Jun 16, 2025 at 6:40?AM Matthew Knepley wrote: > >> On Sun, Jun 15, 2025 at 9:46?PM hexioafeng >> wrote: >> >>> Hello, >>> >>> Here are the options and outputs: >>> >>> options: >>> >>> -ksp_type cg -pc_type gamg -pc_gamg_parallel_coarse_grid_solver >>> -pc_fieldsplit_detect_saddle_point -pc_fieldsplit_type schur >>> -pc_fieldsplit_schur_precondition selfp >>> -fieldsplit_1_mat_schur_complement_ainv_type lump >>> -pc_fieldsplit_schur_fact_type full -fieldsplit_0_ksp_type preonly >>> -fieldsplit_0_pc_type gamg -fieldsplit_0_mg_coarse_pc_type_type svd >>> -fieldsplit_1_ksp_type preonly -fieldsplit_1_pc_type bjacobi >>> -fieldsplit_1_sub_pc_type sor -ksp_view -ksp_monitor_true_residual >>> -ksp_converged_reason -fieldsplit_0_mg_levels_ksp_monitor_true_residual >>> -fieldsplit_0_mg_levels_ksp_converged_reason >>> -fieldsplit_1_ksp_monitor_true_residual >>> -fieldsplit_1_ksp_converged_reason >>> >> >> This option was wrong: >> >> -fieldsplit_0_mg_coarse_pc_type_type svd >> >> from the output, we can see that it should have been >> >> -fieldsplit_0_mg_coarse_sub_pc_type_type svd >> >> THanks, >> >> Matt >> >> >>> output: >>> >>> 0 KSP unpreconditioned resid norm 2.777777777778e+01 true resid norm >>> 2.777777777778e+01 ||r(i)||/||b|| 1.000000000000e+00 >>> Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 >>> PC failed due to SUBPC_ERROR >>> KSP Object: 1 MPI processes >>> type: cg >>> maximum iterations=200, initial guess is zero >>> tolerances: relative=1e-06, absolute=1e-12, divergence=1e+30 >>> left preconditioning >>> using UNPRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI processes >>> type: gamg >>> type is MULTIPLICATIVE, levels=2 cycles=v >>> Cycles per PCApply=1 >>> Using externally 
compute Galerkin coarse grid matrices >>> GAMG specific options >>> Threshold for dropping small values in graph on each level = >>> Threshold scaling factor for each level not specified = 1. >>> AGG specific options >>> Symmetric graph false >>> Number of levels to square graph 1 >>> Number smoothing steps 1 >>> Complexity: grid = 1.00176 >>> Coarse grid solver -- level ------------------------------- >>> KSP Object: (mg_coarse_) 1 MPI processes >>> type: preonly >>> maximum iterations=10000, initial guess is zero >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>> left preconditioning >>> using NONE norm type for convergence test >>> PC Object: (mg_coarse_) 1 MPI processes >>> type: bjacobi >>> number of blocks = 1 >>> Local solver is the same for all blocks, as in the following KSP >>> and PC objects on rank 0: >>> KSP Object: (mg_coarse_sub_) 1 MPI processes >>> type: preonly >>> maximum iterations=1, initial guess is zero >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>> left preconditioning >>> using NONE norm type for convergence test >>> PC Object: (mg_coarse_sub_) 1 MPI processes >>> type: lu >>> out-of-place factorization >>> tolerance for zero pivot 2.22045e-14 >>> using diagonal shift on blocks to prevent zero pivot [INBLOCKS] >>> matrix ordering: nd >>> factor fill ratio given 5., needed 1. >>> Factored matrix follows: >>> Mat Object: 1 MPI processes >>> type: seqaij >>> rows=7, cols=7 >>> package used to perform factorization: petsc >>> total: nonzeros=45, allocated nonzeros=45 >>> using I-node routines: found 3 nodes, limit used is 5 >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI processes >>> type: seqaij >>> rows=7, cols=7 >>> total: nonzeros=45, allocated nonzeros=45 >>> total number of mallocs used during MatSetValues calls=0 >>> using I-node routines: found 3 nodes, limit used is 5 >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI processes >>> type: mpiaij >>> rows=7, cols=7 >>> total: nonzeros=45, allocated nonzeros=45 >>> total number of mallocs used during MatSetValues calls=0 >>> using nonscalable MatPtAP() implementation >>> using I-node (on process 0) routines: found 3 nodes, limit >>> used is 5 >>> Down solver (pre-smoother) on level 1 ------------------------------- >>> KSP Object: (mg_levels_1_) 1 MPI processes >>> type: chebyshev >>> eigenvalue estimates used: min = 0., max = 0. >>> eigenvalues estimate via gmres min 0., max 0. >>> eigenvalues estimated using gmres with translations [0. 0.1; 0. >>> 1.1] >>> KSP Object: (mg_levels_1_esteig_) 1 MPI processes >>> type: gmres >>> restart=30, using Classical (unmodified) Gram-Schmidt >>> Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=10, initial guess is zero >>> tolerances: relative=1e-12, absolute=1e-50, divergence=10000. >>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: (mg_levels_1_) 1 MPI processes >>> type: sor >>> type = local_symmetric, iterations = 1, local iterations = >>> 1, omega = 1. 
>>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI processes >>> type: mpiaij >>> rows=624, cols=624 >>> total: nonzeros=25536, allocated nonzeros=25536 >>> total number of mallocs used during MatSetValues calls=0 >>> using I-node (on process 0) routines: found 336 nodes, >>> limit used is 5 >>> estimating eigenvalues using noisy right hand side >>> maximum iterations=2, nonzero initial guess >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>> left preconditioning >>> using NONE norm type for convergence test >>> PC Object: (mg_levels_1_) 1 MPI processes >>> type: sor >>> type = local_symmetric, iterations = 1, local iterations = 1, >>> omega = 1. linear system matrix = precond matrix: >>> Mat Object: 1 MPI processes >>> type: mpiaij >>> rows=624, cols=624 >>> total: nonzeros=25536, allocated nonzeros=25536 >>> total number of mallocs used during MatSetValues calls=0 >>> using I-node (on process 0) routines: found 336 nodes, limit >>> used is 5 Up solver (post-smoother) same as down solver (pre-smoother) >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI processes >>> type: mpiaij >>> rows=624, cols=624 >>> total: nonzeros=25536, allocated nonzeros=25536 >>> total number of mallocs used during MatSetValues calls=0 >>> using I-node (on process 0) routines: found 336 nodes, limit used >>> is 5 >>> >>> >>> Best regards, >>> >>> Xiaofeng >>> >>> >>> On Jun 14, 2025, at 07:28, Barry Smith wrote: >>> >>> >>> Matt, >>> >>> Perhaps we should add options -ksp_monitor_debug and >>> -snes_monitor_debug that turn on all possible monitoring for the (possibly) >>> nested solvers and all of their converged reasons also? Note this is not >>> completely trivial because each preconditioner will have to supply its list >>> based on the current solver options for it. >>> >>> Then we won't need to constantly list a big string of problem >>> specific monitor options to ask the user to use. >>> >>> Barry >>> >>> >>> >>> >>> On Jun 13, 2025, at 9:09?AM, Matthew Knepley wrote: >>> >>> On Thu, Jun 12, 2025 at 10:55?PM hexioafeng >>> wrote: >>> >>>> Dear authors, >>>> >>>> I tried *-pc_type game -pc_gamg_parallel_coarse_grid_solver* and *-pc_type >>>> field split -pc_fieldsplit_detect_saddle_point -fieldsplit_0_ksp_type >>>> pronely -fieldsplit_0_pc_type game -fieldsplit_0_mg_coarse_pc_type sad >>>> -fieldsplit_1_ksp_type pronely -fieldsplit_1_pc_type Jacobi >>>> _fieldsplit_1_sub_pc_type for* , both options got the >>>> KSP_DIVERGE_PC_FAILED error. >>>> >>> >>> With any question about convergence, we need to see the output of >>> >>> -ksp_view -ksp_monitor_true_residual -ksp_converged_reason >>> -fieldsplit_0_mg_levels_ksp_monitor_true_residual >>> -fieldsplit_0_mg_levels_ksp_converged_reason >>> -fieldsplit_1_ksp_monitor_true_residual -fieldsplit_1_ksp_converged_reason >>> >>> and all the error output. >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Thanks, >>>> >>>> Xiaofeng >>>> >>>> >>>> On Jun 12, 2025, at 20:50, Mark Adams wrote: >>>> >>>> >>>> >>>> On Thu, Jun 12, 2025 at 8:44?AM Matthew Knepley >>>> wrote: >>>> >>>>> On Thu, Jun 12, 2025 at 4:58?AM Mark Adams wrote: >>>>> >>>>>> Adding this to the PETSc mailing list, >>>>>> >>>>>> On Thu, Jun 12, 2025 at 3:43?AM hexioafeng >>>>>> wrote: >>>>>> >>>>>>> >>>>>>> Dear Professor, >>>>>>> >>>>>>> I hope this message finds you well. >>>>>>> >>>>>>> I am an employee at a CAE company and a heavy user of the PETSc >>>>>>> library. 
I would like to thank you for your contributions to PETSc and >>>>>>> express my deep appreciation for your work. >>>>>>> >>>>>>> Recently, I encountered some difficulties when using PETSc to solve >>>>>>> structural mechanics problems with Lagrange multiplier constraints. After >>>>>>> searching extensively online and reviewing several papers, I found your >>>>>>> previous paper titled "*Algebraic multigrid methods for constrained >>>>>>> linear systems with applications to contact problems in solid mechanics*" >>>>>>> seems to be the most relevant and helpful. >>>>>>> >>>>>>> The stiffness matrix I'm working with, *K*, is a block saddle-point >>>>>>> matrix of the form (A00 A01; A10 0), where *A00 is singular*?just >>>>>>> as described in your paper, and different from many other articles . I have >>>>>>> a few questions regarding your work and would greatly appreciate your >>>>>>> insights: >>>>>>> >>>>>>> 1. Is the *AMG/KKT* method presented in your paper available in >>>>>>> PETSc? I tried using *CG+GAMG* directly but received a >>>>>>> *KSP_DIVERGED_PC_FAILED* error. I also attempted to use >>>>>>> *CG+PCFIELDSPLIT* with the following options: >>>>>>> >>>>>> >>>>>> No >>>>>> >>>>>> >>>>>>> >>>>>>> -pc_type fieldsplit -pc_fieldsplit_detect_saddle_point >>>>>>> -pc_fieldsplit_type schur -pc_fieldsplit_schur_precondition selfp >>>>>>> -pc_fieldsplit_schur_fact_type full -fieldsplit_0_ksp_type preonly >>>>>>> -fieldsplit_0_pc_type gamg -fieldsplit_1_ksp_type preonly >>>>>>> -fieldsplit_1_pc_type bjacobi >>>>>>> >>>>>>> Unfortunately, this also resulted in a *KSP_DIVERGED_PC_FAILED* error. >>>>>>> Do you have any suggestions? >>>>>>> >>>>>>> 2. In your paper, you compare the method with *Uzawa*-type >>>>>>> approaches. To my understanding, Uzawa methods typically require A00 to be >>>>>>> invertible. How did you handle the singularity of A00 to construct an >>>>>>> M-matrix that is invertible? >>>>>>> >>>>>>> >>>>>> You add a regularization term like A01 * A10 (like springs). See the >>>>>> paper or any reference to augmented lagrange or Uzawa >>>>>> >>>>>> >>>>>> 3. Can i implement the AMG/KKT method in your paper using existing *AMG >>>>>>> APIs*? Implementing a production-level AMG solver from scratch >>>>>>> would be quite challenging for me, so I?m hoping to utilize existing AMG >>>>>>> interfaces within PETSc or other packages. >>>>>>> >>>>>>> >>>>>> You can do Uzawa and make the regularization matrix with >>>>>> matrix-matrix products. Just use AMG for the A00 block. >>>>>> >>>>>> >>>>>> >>>>>>> 4. For saddle-point systems where A00 is singular, can you recommend >>>>>>> any more robust or efficient solutions? Alternatively, are you aware of any >>>>>>> open-source software packages that can handle such cases out-of-the-box? >>>>>>> >>>>>>> >>>>>> No, and I don't think PETSc can do this out-of-the-box, but others >>>>>> may be able to give you a better idea of what PETSc can do. >>>>>> I think PETSc can do Uzawa or other similar algorithms but it will >>>>>> not do the regularization automatically (it is a bit more complicated than >>>>>> just A01 * A10) >>>>>> >>>>> >>>>> One other trick you can use is to have >>>>> >>>>> -fieldsplit_0_mg_coarse_pc_type svd >>>>> >>>>> This will use SVD on the coarse grid of GAMG, which can handle the >>>>> null space in A00 as long as the prolongation does not put it back in. I >>>>> have used this for the Laplacian with Neumann conditions and for freely >>>>> floating elastic problems. >>>>> >>>>> >>>> Good point. 
>>>> You can also use -pc_gamg_parallel_coarse_grid_solver to get GAMG to >>>> use a on level iterative solver for the coarse grid. >>>> >>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> Thanks, >>>>>> Mark >>>>>> >>>>>>> >>>>>>> Thank you very much for taking the time to read my email. Looking >>>>>>> forward to hearing from you. >>>>>>> >>>>>>> >>>>>>> >>>>>>> Sincerely, >>>>>>> >>>>>>> Xiaofeng He >>>>>>> ----------------------------------------------------- >>>>>>> >>>>>>> Research Engineer >>>>>>> >>>>>>> Internet Based Engineering, Beijing, China >>>>>>> >>>>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!egZBIZkxo3gzmVhbpj-LqC0RWijjneLGmQ3sGX354yBmpAP5IhzpECOVON-QT9cwOy5aX1SSdofeEKFrRNlQ$ >>>>> >>>>> >>>> >>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!egZBIZkxo3gzmVhbpj-LqC0RWijjneLGmQ3sGX354yBmpAP5IhzpECOVON-QT9cwOy5aX1SSdofeEKFrRNlQ$ >>> >>> >>> >>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!egZBIZkxo3gzmVhbpj-LqC0RWijjneLGmQ3sGX354yBmpAP5IhzpECOVON-QT9cwOy5aX1SSdofeEKFrRNlQ$ >> >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!egZBIZkxo3gzmVhbpj-LqC0RWijjneLGmQ3sGX354yBmpAP5IhzpECOVON-QT9cwOy5aX1SSdofeEKFrRNlQ$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From hexiaofeng at buaa.edu.cn Thu Jun 19 06:58:59 2025 From: hexiaofeng at buaa.edu.cn (hexioafeng) Date: Thu, 19 Jun 2025 19:58:59 +0800 Subject: [petsc-users] Questions Regarding PETSc and Solving Constrained Structural Mechanics Problems In-Reply-To: References: <3BF3C1E8-0CB0-42F1-A624-8FA0DC7FD4A4@buaa.edu.cn> <35A61411-85CF-4F48-9DD6-0409F0CFE598@petsc.dev> <5CFF6556-4BDE-48D9-9D3A-6D8790465358@buaa.edu.cn> <87A02E48-DDE4-4DCD-8C52-D2DAF975EF01@buaa.edu.cn> Message-ID: <96AB5047-4A35-49A2-B948-86656A1CFB5B@buaa.edu.cn> Hello sir, I remove the duplicated "_type", and get the same error and output. Best regards, Xiaofeng > On Jun 19, 2025, at 19:45, Matthew Knepley wrote: > > This options is wrong > > -fieldsplit_0_mg_coarse_sub_pc_type_type svd > > Notice that "_type" is repeated. 
> > Thanks, > > Matt > > On Thu, Jun 19, 2025 at 7:10?AM hexioafeng > wrote: >> Dear authors, >> >> Here are the options passed with fieldsplit preconditioner: >> >> -ksp_type cg -pc_type fieldsplit -pc_fieldsplit_detect_saddle_point -pc_fieldsplit_type schur -pc_fieldsplit_schur_precondition selfp -pc_fieldsplit_schur_fact_type full -fieldsplit_0_ksp_type preonly -fieldsplit_0_pc_type gamg -fieldsplit_0_mg_coarse_sub_pc_type_type svd -fieldsplit_1_ksp_type preonly -fieldsplit_1_pc_type bjacobi -ksp_view -ksp_monitor_true_residual -ksp_converged_reason -fieldsplit_0_mg_levels_ksp_monitor_true_residual -fieldsplit_0_mg_levels_ksp_converged_reason -fieldsplit_1_ksp_monitor_true_residual -fieldsplit_1_ksp_converged_reason >> >> and the output: >> >> 0 KSP unpreconditioned resid norm 2.777777777778e+01 true resid norm 2.777777777778e+01 ||r(i)||/||b|| 1.000000000000e+00 >> Linear fieldsplit_0_mg_levels_1_ solve converged due to CONVERGED_ITS iterations 2 >> Linear fieldsplit_0_mg_levels_1_ solve converged due to CONVERGED_ITS iterations 2 >> Linear fieldsplit_1_ solve did not converge due to DIVERGED_PC_FAILED iterations 0 >> PC failed due to SUBPC_ERROR >> Linear fieldsplit_0_mg_levels_1_ solve converged due to CONVERGED_ITS iterations 2 >> Linear fieldsplit_0_mg_levels_1_ solve converged due to CONVERGED_ITS iterations 2 >> Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 >> PC failed due to SUBPC_ERROR >> KSP Object: 1 MPI processes >> type: cg >> maximum iterations=200, initial guess is zero >> tolerances: relative=1e-06, absolute=1e-12, divergence=1e+30 >> left preconditioning >> using UNPRECONDITIONED norm type for convergence test >> PC Object: 1 MPI processes >> type: fieldsplit >> FieldSplit with Schur preconditioner, blocksize = 1, factorization FULL >> Preconditioner for the Schur complement formed from Sp, an assembled approximation to S, which uses A00's diagonal's inverse >> Split info: >> Split number 0 Defined by IS >> Split number 1 Defined by IS >> KSP solver for A00 block >> KSP Object: (fieldsplit_0_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_0_) 1 MPI processes >> type: gamg >> type is MULTIPLICATIVE, levels=2 cycles=v >> Cycles per PCApply=1 >> Using externally compute Galerkin coarse grid matrices >> GAMG specific options >> Threshold for dropping small values in graph on each level = >> Threshold scaling factor for each level not specified = 1. >> AGG specific options >> Symmetric graph false >> Number of levels to square graph 1 >> Number smoothing steps 1 >> Complexity: grid = 1.00222 >> Coarse grid solver -- level ------------------------------- >> KSP Object: (fieldsplit_0_mg_coarse_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_0_mg_coarse_) 1 MPI processes >> type: bjacobi >> number of blocks = 1 >> Local solver is the same for all blocks, as in the following KSP and PC objects on rank 0: >> KSP Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI processes >> type: preonly >> maximum iterations=1, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
>> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI processes >> type: lu >> out-of-place factorization >> tolerance for zero pivot 2.22045e-14 >> using diagonal shift on blocks to prevent zero pivot [INBLOCKS] >> matrix ordering: nd >> factor fill ratio given 5., needed 1. >> Factored matrix follows: >> Mat Object: 1 MPI processes >> type: seqaij >> rows=8, cols=8 >> package used to perform factorization: petsc >> total: nonzeros=56, allocated nonzeros=56 >> using I-node routines: found 3 nodes, limit used is 5 >> linear system matrix = precond matrix: >> Mat Object: 1 MPI processes >> type: seqaij >> rows=8, cols=8 >> total: nonzeros=56, allocated nonzeros=56 >> total number of mallocs used during MatSetValues calls=0 >> using I-node routines: found 3 nodes, limit used is 5 >> linear system matrix = precond matrix: >> Mat Object: 1 MPI processes >> type: mpiaij >> rows=8, cols=8 >> total: nonzeros=56, allocated nonzeros=56 >> total number of mallocs used during MatSetValues calls=0 >> using nonscalable MatPtAP() implementation >> using I-node (on process 0) routines: found 3 nodes, limit used is 5 >> Down solver (pre-smoother) on level 1 ------------------------------- >> KSP Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes >> type: chebyshev >> eigenvalue estimates used: min = 0.0998145, max = 1.09796 >> eigenvalues estimate via gmres min 0.00156735, max 0.998145 >> eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1] >> KSP Object: (fieldsplit_0_mg_levels_1_esteig_) 1 MPI processes >> type: gmres >> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >> happy breakdown tolerance 1e-30 >> maximum iterations=10, initial guess is zero >> tolerances: relative=1e-12, absolute=1e-50, divergence=10000. >> left preconditioning >> using PRECONDITIONED norm type for convergence test >> estimating eigenvalues using noisy right hand side >> maximum iterations=2, nonzero initial guess >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes >> type: sor >> type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. >> linear system matrix = precond matrix: >> Mat Object: (fieldsplit_0_) 1 MPI processes >> type: mpiaij >> rows=480, cols=480 >> total: nonzeros=25200, allocated nonzeros=25200 >> total number of mallocs used during MatSetValues calls=0 >> using I-node (on process 0) routines: found 160 nodes, limit used is 5 >> Up solver (post-smoother) same as down solver (pre-smoother) >> linear system matrix = precond matrix: >> Mat Object: (fieldsplit_0_) 1 MPI processes >> type: mpiaij >> rows=480, cols=480 >> total: nonzeros=25200, allocated nonzeros=25200 >> total number of mallocs used during MatSetValues calls=0 >> using I-node (on process 0) routines: found 160 nodes, limit used is 5 >> KSP solver for S = A11 - A10 inv(A00) A01 >> KSP Object: (fieldsplit_1_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
>> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_1_) 1 MPI processes >> type: bjacobi >> number of blocks = 1 >> Local solver is the same for all blocks, as in the following KSP and PC objects on rank 0: >> KSP Object: (fieldsplit_1_sub_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_1_sub_) 1 MPI processes >> type: bjacobi >> number of blocks = 1 >> Local solver is the same for all blocks, as in the following KSP and PC objects on rank 0: >> KSP Object: (fieldsplit_1_sub_sub_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_1_sub_sub_) 1 MPI processes >> type: ilu >> out-of-place factorization >> 0 levels of fill >> tolerance for zero pivot 2.22045e-14 >> matrix ordering: natural >> factor fill ratio given 1., needed 1. >> Factored matrix follows: >> Mat Object: 1 MPI processes >> type: seqaij >> rows=144, cols=144 >> package used to perform factorization: petsc >> total: nonzeros=240, allocated nonzeros=240 >> not using I-node routines >> linear system matrix = precond matrix: >> Mat Object: 1 MPI processes >> type: seqaij >> rows=144, cols=144 >> total: nonzeros=240, allocated nonzeros=240 >> total number of mallocs used during MatSetValues calls=0 >> not using I-node routines >> linear system matrix = precond matrix: >> Mat Object: 1 MPI processes >> type: mpiaij >> rows=144, cols=144 >> total: nonzeros=240, allocated nonzeros=240 >> total number of mallocs used during MatSetValues calls=0 >> not using I-node (on process 0) routines >> linear system matrix followed by preconditioner matrix: >> Mat Object: (fieldsplit_1_) 1 MPI processes >> type: schurcomplement >> rows=144, cols=144 >> Schur complement A11 - A10 inv(A00) A01 >> A11 >> Mat Object: (fieldsplit_1_) 1 MPI processes >> type: mpiaij >> rows=144, cols=144 >> total: nonzeros=240, allocated nonzeros=240 >> total number of mallocs used during MatSetValues calls=0 >> not using I-node (on process 0) routines >> A10 >> Mat Object: 1 MPI processes >> type: mpiaij >> rows=144, cols=480 >> total: nonzeros=48, allocated nonzeros=48 >> total number of mallocs used during MatSetValues calls=0 >> using I-node (on process 0) routines: found 74 nodes, limit used is 5 >> KSP of A00 >> KSP Object: (fieldsplit_0_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_0_) 1 MPI processes >> type: gamg >> type is MULTIPLICATIVE, levels=2 cycles=v >> Cycles per PCApply=1 >> Using externally compute Galerkin coarse grid matrices >> GAMG specific options >> Threshold for dropping small values in graph on each level = >> Threshold scaling factor for each level not specified = 1. 
>> AGG specific options >> Symmetric graph false >> Number of levels to square graph 1 >> Number smoothing steps 1 >> Complexity: grid = 1.00222 >> Coarse grid solver -- level ------------------------------- >> KSP Object: (fieldsplit_0_mg_coarse_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_0_mg_coarse_) 1 MPI processes >> type: bjacobi >> number of blocks = 1 >> Local solver is the same for all blocks, as in the following KSP and PC objects on rank 0: >> KSP Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI processes >> type: preonly >> maximum iterations=1, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI processes >> type: lu >> out-of-place factorization >> tolerance for zero pivot 2.22045e-14 >> using diagonal shift on blocks to prevent zero pivot [INBLOCKS] >> matrix ordering: nd >> factor fill ratio given 5., needed 1. >> Factored matrix follows: >> Mat Object: 1 MPI processes >> type: seqaij >> rows=8, cols=8 >> package used to perform factorization: petsc >> total: nonzeros=56, allocated nonzeros=56 >> using I-node routines: found 3 nodes, limit used is 5 >> linear system matrix = precond matrix: >> Mat Object: 1 MPI processes >> type: seqaij >> rows=8, cols=8 >> total: nonzeros=56, allocated nonzeros=56 >> total number of mallocs used during MatSetValues calls=0 >> using I-node routines: found 3 nodes, limit used is 5 >> linear system matrix = precond matrix: >> Mat Object: 1 MPI processes >> type: mpiaij >> rows=8, cols=8 >> total: nonzeros=56, allocated nonzeros=56 >> total number of mallocs used during MatSetValues calls=0 >> using nonscalable MatPtAP() implementation >> using I-node (on process 0) routines: found 3 nodes, limit used is 5 >> Down solver (pre-smoother) on level 1 ------------------------------- >> KSP Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes >> type: chebyshev >> eigenvalue estimates used: min = 0.0998145, max = 1.09796 >> eigenvalues estimate via gmres min 0.00156735, max 0.998145 >> eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1] >> KSP Object: (fieldsplit_0_mg_levels_1_esteig_) 1 MPI processes >> type: gmres >> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >> happy breakdown tolerance 1e-30 >> maximum iterations=10, initial guess is zero >> tolerances: relative=1e-12, absolute=1e-50, divergence=10000. >> left preconditioning >> using PRECONDITIONED norm type for convergence test >> estimating eigenvalues using noisy right hand side >> maximum iterations=2, nonzero initial guess >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes >> type: sor >> type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. 
>> linear system matrix = precond matrix: >> Mat Object: (fieldsplit_0_) 1 MPI processes >> type: mpiaij >> rows=480, cols=480 >> total: nonzeros=25200, allocated nonzeros=25200 >> total number of mallocs used during MatSetValues calls=0 >> using I-node (on process 0) routines: found 160 nodes, limit used is 5 >> Up solver (post-smoother) same as down solver (pre-smoother) >> linear system matrix = precond matrix: >> Mat Object: (fieldsplit_0_) 1 MPI processes >> type: mpiaij >> rows=480, cols=480 >> total: nonzeros=25200, allocated nonzeros=25200 >> total number of mallocs used during MatSetValues calls=0 >> using I-node (on process 0) routines: found 160 nodes, limit used is 5 >> A01 >> Mat Object: 1 MPI processes >> type: mpiaij >> rows=480, cols=144 >> total: nonzeros=48, allocated nonzeros=48 >> total number of mallocs used during MatSetValues calls=0 >> using I-node (on process 0) routines: found 135 nodes, limit used is 5 >> Mat Object: 1 MPI processes >> type: mpiaij >> rows=144, cols=144 >> total: nonzeros=240, allocated nonzeros=240 >> total number of mallocs used during MatSetValues calls=0 >> not using I-node (on process 0) routines >> linear system matrix = precond matrix: >> Mat Object: 1 MPI processes >> type: mpiaij >> rows=624, cols=624 >> total: nonzeros=25536, allocated nonzeros=25536 >> total number of mallocs used during MatSetValues calls=0 >> using I-node (on process 0) routines: found 336 nodes, limit used is 5 >> >> >> Thanks, >> Xiaofeng >> >> >> >>> On Jun 17, 2025, at 19:05, Mark Adams > wrote: >>> >>> And don't use -pc_gamg_parallel_coarse_grid_solver >>> You can use that in production but for debugging use -mg_coarse_pc_type svd >>> Also, use -options_left and remove anything that is not used. >>> (I am puzzled, I see -pc_type gamg not -pc_type fieldsplit) >>> >>> Mark >>> >>> >>> On Mon, Jun 16, 2025 at 6:40?AM Matthew Knepley > wrote: >>>> On Sun, Jun 15, 2025 at 9:46?PM hexioafeng > wrote: >>>>> Hello, >>>>> >>>>> Here are the options and outputs: >>>>> >>>>> options: >>>>> >>>>> -ksp_type cg -pc_type gamg -pc_gamg_parallel_coarse_grid_solver -pc_fieldsplit_detect_saddle_point -pc_fieldsplit_type schur -pc_fieldsplit_schur_precondition selfp -fieldsplit_1_mat_schur_complement_ainv_type lump -pc_fieldsplit_schur_fact_type full -fieldsplit_0_ksp_type preonly -fieldsplit_0_pc_type gamg -fieldsplit_0_mg_coarse_pc_type_type svd -fieldsplit_1_ksp_type preonly -fieldsplit_1_pc_type bjacobi -fieldsplit_1_sub_pc_type sor -ksp_view -ksp_monitor_true_residual -ksp_converged_reason -fieldsplit_0_mg_levels_ksp_monitor_true_residual -fieldsplit_0_mg_levels_ksp_converged_reason -fieldsplit_1_ksp_monitor_true_residual -fieldsplit_1_ksp_converged_reason >>>> >>>> This option was wrong: >>>> >>>> -fieldsplit_0_mg_coarse_pc_type_type svd >>>> >>>> from the output, we can see that it should have been >>>> >>>> -fieldsplit_0_mg_coarse_sub_pc_type_type svd >>>> >>>> THanks, >>>> >>>> Matt >>>> >>>>> output: >>>>> >>>>> 0 KSP unpreconditioned resid norm 2.777777777778e+01 true resid norm 2.777777777778e+01 ||r(i)||/||b|| 1.000000000000e+00 >>>>> Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 >>>>> PC failed due to SUBPC_ERROR >>>>> KSP Object: 1 MPI processes >>>>> type: cg >>>>> maximum iterations=200, initial guess is zero >>>>> tolerances: relative=1e-06, absolute=1e-12, divergence=1e+30 >>>>> left preconditioning >>>>> using UNPRECONDITIONED norm type for convergence test >>>>> PC Object: 1 MPI processes >>>>> type: gamg >>>>> type is MULTIPLICATIVE, 
levels=2 cycles=v >>>>> Cycles per PCApply=1 >>>>> Using externally compute Galerkin coarse grid matrices >>>>> GAMG specific options >>>>> Threshold for dropping small values in graph on each level = >>>>> Threshold scaling factor for each level not specified = 1. >>>>> AGG specific options >>>>> Symmetric graph false >>>>> Number of levels to square graph 1 >>>>> Number smoothing steps 1 >>>>> Complexity: grid = 1.00176 >>>>> Coarse grid solver -- level ------------------------------- >>>>> KSP Object: (mg_coarse_) 1 MPI processes >>>>> type: preonly >>>>> maximum iterations=10000, initial guess is zero >>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>> left preconditioning >>>>> using NONE norm type for convergence test >>>>> PC Object: (mg_coarse_) 1 MPI processes >>>>> type: bjacobi >>>>> number of blocks = 1 >>>>> Local solver is the same for all blocks, as in the following KSP and PC objects on rank 0: >>>>> KSP Object: (mg_coarse_sub_) 1 MPI processes >>>>> type: preonly >>>>> maximum iterations=1, initial guess is zero >>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>> left preconditioning >>>>> using NONE norm type for convergence test >>>>> PC Object: (mg_coarse_sub_) 1 MPI processes >>>>> type: lu >>>>> out-of-place factorization >>>>> tolerance for zero pivot 2.22045e-14 >>>>> using diagonal shift on blocks to prevent zero pivot [INBLOCKS] >>>>> matrix ordering: nd >>>>> factor fill ratio given 5., needed 1. >>>>> Factored matrix follows: >>>>> Mat Object: 1 MPI processes >>>>> type: seqaij >>>>> rows=7, cols=7 >>>>> package used to perform factorization: petsc >>>>> total: nonzeros=45, allocated nonzeros=45 >>>>> using I-node routines: found 3 nodes, limit used is 5 >>>>> linear system matrix = precond matrix: >>>>> Mat Object: 1 MPI processes >>>>> type: seqaij >>>>> rows=7, cols=7 >>>>> total: nonzeros=45, allocated nonzeros=45 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> using I-node routines: found 3 nodes, limit used is 5 >>>>> linear system matrix = precond matrix: >>>>> Mat Object: 1 MPI processes >>>>> type: mpiaij >>>>> rows=7, cols=7 >>>>> total: nonzeros=45, allocated nonzeros=45 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> using nonscalable MatPtAP() implementation >>>>> using I-node (on process 0) routines: found 3 nodes, limit used is 5 >>>>> Down solver (pre-smoother) on level 1 ------------------------------- >>>>> KSP Object: (mg_levels_1_) 1 MPI processes >>>>> type: chebyshev >>>>> eigenvalue estimates used: min = 0., max = 0. >>>>> eigenvalues estimate via gmres min 0., max 0. >>>>> eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1] >>>>> KSP Object: (mg_levels_1_esteig_) 1 MPI processes >>>>> type: gmres >>>>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>>>> happy breakdown tolerance 1e-30 >>>>> maximum iterations=10, initial guess is zero >>>>> tolerances: relative=1e-12, absolute=1e-50, divergence=10000. >>>>> left preconditioning >>>>> using PRECONDITIONED norm type for convergence test >>>>> PC Object: (mg_levels_1_) 1 MPI processes >>>>> type: sor >>>>> type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. 
>>>>> linear system matrix = precond matrix: >>>>> Mat Object: 1 MPI processes >>>>> type: mpiaij >>>>> rows=624, cols=624 >>>>> total: nonzeros=25536, allocated nonzeros=25536 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> using I-node (on process 0) routines: found 336 nodes, limit used is 5 >>>>> estimating eigenvalues using noisy right hand side >>>>> maximum iterations=2, nonzero initial guess >>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>> left preconditioning >>>>> using NONE norm type for convergence test >>>>> PC Object: (mg_levels_1_) 1 MPI processes >>>>> type: sor >>>>> type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. linear system matrix = precond matrix: >>>>> Mat Object: 1 MPI processes >>>>> type: mpiaij >>>>> rows=624, cols=624 >>>>> total: nonzeros=25536, allocated nonzeros=25536 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> using I-node (on process 0) routines: found 336 nodes, limit used is 5 Up solver (post-smoother) same as down solver (pre-smoother) >>>>> linear system matrix = precond matrix: >>>>> Mat Object: 1 MPI processes >>>>> type: mpiaij >>>>> rows=624, cols=624 >>>>> total: nonzeros=25536, allocated nonzeros=25536 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> using I-node (on process 0) routines: found 336 nodes, limit used is 5 >>>>> >>>>> >>>>> Best regards, >>>>> >>>>> Xiaofeng >>>>> >>>>> >>>>>> On Jun 14, 2025, at 07:28, Barry Smith > wrote: >>>>>> >>>>>> >>>>>> Matt, >>>>>> >>>>>> Perhaps we should add options -ksp_monitor_debug and -snes_monitor_debug that turn on all possible monitoring for the (possibly) nested solvers and all of their converged reasons also? Note this is not completely trivial because each preconditioner will have to supply its list based on the current solver options for it. >>>>>> >>>>>> Then we won't need to constantly list a big string of problem specific monitor options to ask the user to use. >>>>>> >>>>>> Barry >>>>>> >>>>>> >>>>>> >>>>>> >>>>>>> On Jun 13, 2025, at 9:09?AM, Matthew Knepley > wrote: >>>>>>> >>>>>>> On Thu, Jun 12, 2025 at 10:55?PM hexioafeng > wrote: >>>>>>>> Dear authors, >>>>>>>> >>>>>>>> I tried -pc_type game -pc_gamg_parallel_coarse_grid_solver and -pc_type field split -pc_fieldsplit_detect_saddle_point -fieldsplit_0_ksp_type pronely -fieldsplit_0_pc_type game -fieldsplit_0_mg_coarse_pc_type sad -fieldsplit_1_ksp_type pronely -fieldsplit_1_pc_type Jacobi _fieldsplit_1_sub_pc_type for , both options got the KSP_DIVERGE_PC_FAILED error. >>>>>>> >>>>>>> With any question about convergence, we need to see the output of >>>>>>> >>>>>>> -ksp_view -ksp_monitor_true_residual -ksp_converged_reason -fieldsplit_0_mg_levels_ksp_monitor_true_residual -fieldsplit_0_mg_levels_ksp_converged_reason -fieldsplit_1_ksp_monitor_true_residual -fieldsplit_1_ksp_converged_reason >>>>>>> >>>>>>> and all the error output. >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Matt >>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Xiaofeng >>>>>>>> >>>>>>>> >>>>>>>>> On Jun 12, 2025, at 20:50, Mark Adams > wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Thu, Jun 12, 2025 at 8:44?AM Matthew Knepley > wrote: >>>>>>>>>> On Thu, Jun 12, 2025 at 4:58?AM Mark Adams > wrote: >>>>>>>>>>> Adding this to the PETSc mailing list, >>>>>>>>>>> >>>>>>>>>>> On Thu, Jun 12, 2025 at 3:43?AM hexioafeng > wrote: >>>>>>>>>>>> >>>>>>>>>>>> Dear Professor, >>>>>>>>>>>> >>>>>>>>>>>> I hope this message finds you well. 
>>>>>>>>>>>> >>>>>>>>>>>> I am an employee at a CAE company and a heavy user of the PETSc library. I would like to thank you for your contributions to PETSc and express my deep appreciation for your work. >>>>>>>>>>>> >>>>>>>>>>>> Recently, I encountered some difficulties when using PETSc to solve structural mechanics problems with Lagrange multiplier constraints. After searching extensively online and reviewing several papers, I found your previous paper titled "Algebraic multigrid methods for constrained linear systems with applications to contact problems in solid mechanics" seems to be the most relevant and helpful. >>>>>>>>>>>> >>>>>>>>>>>> The stiffness matrix I'm working with, K, is a block saddle-point matrix of the form (A00 A01; A10 0), where A00 is singular?just as described in your paper, and different from many other articles . I have a few questions regarding your work and would greatly appreciate your insights: >>>>>>>>>>>> >>>>>>>>>>>> 1. Is the AMG/KKT method presented in your paper available in PETSc? I tried using CG+GAMG directly but received a KSP_DIVERGED_PC_FAILED error. I also attempted to use CG+PCFIELDSPLIT with the following options: >>>>>>>>>>> >>>>>>>>>>> No >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> -pc_type fieldsplit -pc_fieldsplit_detect_saddle_point -pc_fieldsplit_type schur -pc_fieldsplit_schur_precondition selfp -pc_fieldsplit_schur_fact_type full -fieldsplit_0_ksp_type preonly -fieldsplit_0_pc_type gamg -fieldsplit_1_ksp_type preonly -fieldsplit_1_pc_type bjacobi >>>>>>>>>>>> >>>>>>>>>>>> Unfortunately, this also resulted in a KSP_DIVERGED_PC_FAILED error. Do you have any suggestions? >>>>>>>>>>>> >>>>>>>>>>>> 2. In your paper, you compare the method with Uzawa-type approaches. To my understanding, Uzawa methods typically require A00 to be invertible. How did you handle the singularity of A00 to construct an M-matrix that is invertible? >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> You add a regularization term like A01 * A10 (like springs). See the paper or any reference to augmented lagrange or Uzawa >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> 3. Can i implement the AMG/KKT method in your paper using existing AMG APIs? Implementing a production-level AMG solver from scratch would be quite challenging for me, so I?m hoping to utilize existing AMG interfaces within PETSc or other packages. >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> You can do Uzawa and make the regularization matrix with matrix-matrix products. Just use AMG for the A00 block. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> 4. For saddle-point systems where A00 is singular, can you recommend any more robust or efficient solutions? Alternatively, are you aware of any open-source software packages that can handle such cases out-of-the-box? >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> No, and I don't think PETSc can do this out-of-the-box, but others may be able to give you a better idea of what PETSc can do. >>>>>>>>>>> I think PETSc can do Uzawa or other similar algorithms but it will not do the regularization automatically (it is a bit more complicated than just A01 * A10) >>>>>>>>>> >>>>>>>>>> One other trick you can use is to have >>>>>>>>>> >>>>>>>>>> -fieldsplit_0_mg_coarse_pc_type svd >>>>>>>>>> >>>>>>>>>> This will use SVD on the coarse grid of GAMG, which can handle the null space in A00 as long as the prolongation does not put it back in. I have used this for the Laplacian with Neumann conditions and for freely floating elastic problems. >>>>>>>>>> >>>>>>>>> >>>>>>>>> Good point. 
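A minimal sketch of the matrix-matrix-product route Mark describes, assuming a recent PETSc C API (PetscCall etc.) and that the blocks A00, A01, and A10 have already been pulled out of the assembled saddle-point matrix (for example with MatCreateSubMatrix); the helper name RegularizeA00 and the scaling gamma are illustrative assumptions, and the paper's actual regularization is more involved than a plain A01 * A10:

#include <petscksp.h>

/* Build A00reg = A00 + gamma * A01 * A10 without modifying A00 itself.
   A00reg can then be handed to a KSP (e.g. with -pc_type gamg) as the
   solvable operator inside an Uzawa-type outer iteration. */
static PetscErrorCode RegularizeA00(Mat A00, Mat A01, Mat A10, PetscReal gamma, Mat *A00reg)
{
  Mat R;

  PetscFunctionBeginUser;
  /* R = A01 * A10, the "spring"-like coupling term */
  PetscCall(MatMatMult(A01, A10, MAT_INITIAL_MATRIX, PETSC_DETERMINE, &R));
  /* A00reg starts as a copy of A00 ... */
  PetscCall(MatDuplicate(A00, MAT_COPY_VALUES, A00reg));
  /* ... and the scaled regularization is added in */
  PetscCall(MatAXPY(*A00reg, gamma, R, DIFFERENT_NONZERO_PATTERN));
  PetscCall(MatDestroy(&R));
  PetscFunctionReturn(PETSC_SUCCESS);
}

With the block sizes shown in the -ksp_view output in this thread (A00 is 480x480, A01 is 480x144, A10 is 144x480), A01 * A10 is again 480x480, so the sum is well defined; gamma plays the role of the augmented-Lagrange penalty and is a user choice.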
>>>>>>>>> You can also use -pc_gamg_parallel_coarse_grid_solver to get GAMG to use a on level iterative solver for the coarse grid. >>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> Matt >>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Mark >>>>>>>>>>>> >>>>>>>>>>>> Thank you very much for taking the time to read my email. Looking forward to hearing from you. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Sincerely, >>>>>>>>>>>> >>>>>>>>>>>> Xiaofeng He >>>>>>>>>>>> ----------------------------------------------------- >>>>>>>>>>>> >>>>>>>>>>>> Research Engineer >>>>>>>>>>>> >>>>>>>>>>>> Internet Based Engineering, Beijing, China >>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>>>>>> -- Norbert Wiener >>>>>>>>>> >>>>>>>>>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!ZB1cNFj48U8eb5ckEu57heuMiZw2CzbgK4EOR8laI0ulI607DgVUe-wDatJn7gTSAcRZv8Bcw3BmbXZ0gJOgxQicaGoXrw$ >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>>> -- Norbert Wiener >>>>>>> >>>>>>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!ZB1cNFj48U8eb5ckEu57heuMiZw2CzbgK4EOR8laI0ulI607DgVUe-wDatJn7gTSAcRZv8Bcw3BmbXZ0gJOgxQicaGoXrw$ >>>>>> >>>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!ZB1cNFj48U8eb5ckEu57heuMiZw2CzbgK4EOR8laI0ulI607DgVUe-wDatJn7gTSAcRZv8Bcw3BmbXZ0gJOgxQicaGoXrw$ >> > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!ZB1cNFj48U8eb5ckEu57heuMiZw2CzbgK4EOR8laI0ulI607DgVUe-wDatJn7gTSAcRZv8Bcw3BmbXZ0gJOgxQicaGoXrw$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Jun 19 07:06:46 2025 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 19 Jun 2025 08:06:46 -0400 Subject: [petsc-users] Questions Regarding PETSc and Solving Constrained Structural Mechanics Problems In-Reply-To: <96AB5047-4A35-49A2-B948-86656A1CFB5B@buaa.edu.cn> References: <3BF3C1E8-0CB0-42F1-A624-8FA0DC7FD4A4@buaa.edu.cn> <35A61411-85CF-4F48-9DD6-0409F0CFE598@petsc.dev> <5CFF6556-4BDE-48D9-9D3A-6D8790465358@buaa.edu.cn> <87A02E48-DDE4-4DCD-8C52-D2DAF975EF01@buaa.edu.cn> <96AB5047-4A35-49A2-B948-86656A1CFB5B@buaa.edu.cn> Message-ID: On Thu, Jun 19, 2025 at 7:59?AM hexioafeng wrote: > Hello sir, > > I remove the duplicated "_type", and get the same error and output. > The output cannot be the same. Please send it. Thanks, Matt > Best regards, > Xiaofeng > > > On Jun 19, 2025, at 19:45, Matthew Knepley wrote: > > This options is wrong > > -fieldsplit_0_mg_coarse_sub_pc_type_type svd > > Notice that "_type" is repeated. 
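Spelled out with a single "_type", and keeping the sub_ prefix that the coarse-level block solver reports in the -ksp_view output above, the intended option would presumably be

-fieldsplit_0_mg_coarse_sub_pc_type svd

and running once with -options_left, as Mark suggested earlier in the thread, is a cheap way to confirm that no misspelled option is being silently ignored.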
> > Thanks, > > Matt > > On Thu, Jun 19, 2025 at 7:10?AM hexioafeng wrote: > >> Dear authors, >> >> Here are the options passed with fieldsplit preconditioner: >> >> -ksp_type cg -pc_type fieldsplit -pc_fieldsplit_detect_saddle_point >> -pc_fieldsplit_type schur -pc_fieldsplit_schur_precondition selfp >> -pc_fieldsplit_schur_fact_type full -fieldsplit_0_ksp_type preonly >> -fieldsplit_0_pc_type gamg -fieldsplit_0_mg_coarse_sub_pc_type_type svd >> -fieldsplit_1_ksp_type preonly -fieldsplit_1_pc_type bjacobi -ksp_view >> -ksp_monitor_true_residual -ksp_converged_reason >> -fieldsplit_0_mg_levels_ksp_monitor_true_residual >> -fieldsplit_0_mg_levels_ksp_converged_reason >> -fieldsplit_1_ksp_monitor_true_residual >> -fieldsplit_1_ksp_converged_reason >> >> and the output: >> >> 0 KSP unpreconditioned resid norm 2.777777777778e+01 true resid norm >> 2.777777777778e+01 ||r(i)||/||b|| 1.000000000000e+00 >> Linear fieldsplit_0_mg_levels_1_ solve converged due to CONVERGED_ITS >> iterations 2 >> Linear fieldsplit_0_mg_levels_1_ solve converged due to CONVERGED_ITS >> iterations 2 >> Linear fieldsplit_1_ solve did not converge due to DIVERGED_PC_FAILED >> iterations 0 >> PC failed due to SUBPC_ERROR >> Linear fieldsplit_0_mg_levels_1_ solve converged due to CONVERGED_ITS >> iterations 2 >> Linear fieldsplit_0_mg_levels_1_ solve converged due to CONVERGED_ITS >> iterations 2 >> Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 >> PC failed due to SUBPC_ERROR >> KSP Object: 1 MPI processes >> type: cg >> maximum iterations=200, initial guess is zero >> tolerances: relative=1e-06, absolute=1e-12, divergence=1e+30 >> left preconditioning >> using UNPRECONDITIONED norm type for convergence test >> PC Object: 1 MPI processes >> type: fieldsplit >> FieldSplit with Schur preconditioner, blocksize = 1, factorization >> FULL >> Preconditioner for the Schur complement formed from Sp, an assembled >> approximation to S, which uses A00's diagonal's inverse >> Split info: >> Split number 0 Defined by IS >> Split number 1 Defined by IS >> KSP solver for A00 block >> KSP Object: (fieldsplit_0_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_0_) 1 MPI processes >> type: gamg >> type is MULTIPLICATIVE, levels=2 cycles=v >> Cycles per PCApply=1 >> Using externally compute Galerkin coarse grid matrices >> GAMG specific options >> Threshold for dropping small values in graph on each level >> = >> Threshold scaling factor for each level not specified = 1. >> AGG specific options >> Symmetric graph false >> Number of levels to square graph 1 >> Number smoothing steps 1 >> Complexity: grid = 1.00222 >> Coarse grid solver -- level ------------------------------- >> KSP Object: (fieldsplit_0_mg_coarse_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_0_mg_coarse_) 1 MPI processes >> type: bjacobi >> number of blocks = 1 >> Local solver is the same for all blocks, as in the >> following KSP and PC objects on rank 0: >> KSP Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI processes >> type: preonly >> maximum iterations=1, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, >> divergence=10000. 
>> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI processes >> type: lu >> out-of-place factorization >> tolerance for zero pivot 2.22045e-14 >> using diagonal shift on blocks to prevent zero pivot >> [INBLOCKS] >> matrix ordering: nd >> factor fill ratio given 5., needed 1. >> Factored matrix follows: >> Mat Object: 1 MPI processes >> type: seqaij >> rows=8, cols=8 >> package used to perform factorization: petsc >> total: nonzeros=56, allocated nonzeros=56 >> using I-node routines: found 3 nodes, limit used >> is 5 >> linear system matrix = precond matrix: >> Mat Object: 1 MPI processes >> type: seqaij >> rows=8, cols=8 >> total: nonzeros=56, allocated nonzeros=56 >> total number of mallocs used during MatSetValues calls=0 >> using I-node routines: found 3 nodes, limit used is 5 >> linear system matrix = precond matrix: >> Mat Object: 1 MPI processes >> type: mpiaij >> rows=8, cols=8 >> total: nonzeros=56, allocated nonzeros=56 >> total number of mallocs used during MatSetValues calls=0 >> using nonscalable MatPtAP() implementation >> using I-node (on process 0) routines: found 3 nodes, >> limit used is 5 >> Down solver (pre-smoother) on level 1 >> ------------------------------- >> KSP Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes >> type: chebyshev >> eigenvalue estimates used: min = 0.0998145, max = 1.09796 >> eigenvalues estimate via gmres min 0.00156735, max 0.998145 >> eigenvalues estimated using gmres with translations [0. >> 0.1; 0. 1.1] >> KSP Object: (fieldsplit_0_mg_levels_1_esteig_) 1 MPI >> processes >> type: gmres >> restart=30, using Classical (unmodified) Gram-Schmidt >> Orthogonalization with no iterative refinement >> happy breakdown tolerance 1e-30 >> maximum iterations=10, initial guess is zero >> tolerances: relative=1e-12, absolute=1e-50, >> divergence=10000. >> left preconditioning >> using PRECONDITIONED norm type for convergence test >> estimating eigenvalues using noisy right hand side >> maximum iterations=2, nonzero initial guess >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes >> type: sor >> type = local_symmetric, iterations = 1, local iterations = >> 1, omega = 1. >> linear system matrix = precond matrix: >> Mat Object: (fieldsplit_0_) 1 MPI processes >> type: mpiaij >> rows=480, cols=480 >> total: nonzeros=25200, allocated nonzeros=25200 >> total number of mallocs used during MatSetValues calls=0 >> using I-node (on process 0) routines: found 160 nodes, >> limit used is 5 >> Up solver (post-smoother) same as down solver (pre-smoother) >> linear system matrix = precond matrix: >> Mat Object: (fieldsplit_0_) 1 MPI processes >> type: mpiaij >> rows=480, cols=480 >> total: nonzeros=25200, allocated nonzeros=25200 >> total number of mallocs used during MatSetValues calls=0 >> using I-node (on process 0) routines: found 160 nodes, limit >> used is 5 >> KSP solver for S = A11 - A10 inv(A00) A01 >> KSP Object: (fieldsplit_1_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
>> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_1_) 1 MPI processes >> type: bjacobi >> number of blocks = 1 >> Local solver is the same for all blocks, as in the following >> KSP and PC objects on rank 0: >> KSP Object: (fieldsplit_1_sub_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_1_sub_) 1 MPI processes >> type: bjacobi >> number of blocks = 1 >> Local solver is the same for all blocks, as in the following >> KSP and PC objects on rank 0: >> KSP Object: (fieldsplit_1_sub_sub_) >> 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, >> divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_1_sub_sub_) >> 1 MPI processes >> type: ilu >> out-of-place factorization >> 0 levels of fill >> tolerance for zero pivot 2.22045e-14 >> matrix ordering: natural >> factor fill ratio given 1., needed 1. >> Factored matrix follows: >> Mat Object: 1 MPI processes >> type: seqaij >> rows=144, cols=144 >> package used to perform factorization: petsc >> total: nonzeros=240, allocated nonzeros=240 >> not using I-node routines >> linear system matrix = precond matrix: >> Mat Object: 1 MPI processes >> type: seqaij >> rows=144, cols=144 >> total: nonzeros=240, allocated nonzeros=240 >> total number of mallocs used during MatSetValues >> calls=0 >> not using I-node routines >> linear system matrix = precond matrix: >> Mat Object: 1 MPI processes >> type: mpiaij >> rows=144, cols=144 >> total: nonzeros=240, allocated nonzeros=240 >> total number of mallocs used during MatSetValues calls=0 >> not using I-node (on process 0) routines >> linear system matrix followed by preconditioner matrix: >> Mat Object: (fieldsplit_1_) 1 MPI processes >> type: schurcomplement >> rows=144, cols=144 >> Schur complement A11 - A10 inv(A00) A01 >> A11 >> Mat Object: (fieldsplit_1_) 1 MPI processes >> type: mpiaij >> rows=144, cols=144 >> total: nonzeros=240, allocated nonzeros=240 >> total number of mallocs used during MatSetValues calls=0 >> not using I-node (on process 0) routines >> A10 >> Mat Object: 1 MPI processes >> type: mpiaij >> rows=144, cols=480 >> total: nonzeros=48, allocated nonzeros=48 >> total number of mallocs used during MatSetValues calls=0 >> using I-node (on process 0) routines: found 74 nodes, >> limit used is 5 >> KSP of A00 >> KSP Object: (fieldsplit_0_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, >> divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_0_) 1 MPI processes >> type: gamg >> type is MULTIPLICATIVE, levels=2 cycles=v >> Cycles per PCApply=1 >> Using externally compute Galerkin coarse grid matrices >> GAMG specific options >> Threshold for dropping small values in graph on >> each level = >> Threshold scaling factor for each level not >> specified = 1. 
>> AGG specific options >> Symmetric graph false >> Number of levels to square graph 1 >> Number smoothing steps 1 >> Complexity: grid = 1.00222 >> Coarse grid solver -- level >> ------------------------------- >> KSP Object: (fieldsplit_0_mg_coarse_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, >> divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_0_mg_coarse_) 1 MPI processes >> type: bjacobi >> number of blocks = 1 >> Local solver is the same for all blocks, as in the >> following KSP and PC objects on rank 0: >> KSP Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI >> processes >> type: preonly >> maximum iterations=1, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, >> divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI >> processes >> type: lu >> out-of-place factorization >> tolerance for zero pivot 2.22045e-14 >> using diagonal shift on blocks to prevent zero >> pivot [INBLOCKS] >> matrix ordering: nd >> factor fill ratio given 5., needed 1. >> Factored matrix follows: >> Mat Object: 1 MPI processes >> type: seqaij >> rows=8, cols=8 >> package used to perform factorization: petsc >> total: nonzeros=56, allocated nonzeros=56 >> using I-node routines: found 3 nodes, >> limit used is 5 >> linear system matrix = precond matrix: >> Mat Object: 1 MPI processes >> type: seqaij >> rows=8, cols=8 >> total: nonzeros=56, allocated nonzeros=56 >> total number of mallocs used during MatSetValues >> calls=0 >> using I-node routines: found 3 nodes, limit >> used is 5 >> linear system matrix = precond matrix: >> Mat Object: 1 MPI processes >> type: mpiaij >> rows=8, cols=8 >> total: nonzeros=56, allocated nonzeros=56 >> total number of mallocs used during MatSetValues >> calls=0 >> using nonscalable MatPtAP() implementation >> using I-node (on process 0) routines: found 3 >> nodes, limit used is 5 >> Down solver (pre-smoother) on level 1 >> ------------------------------- >> KSP Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes >> type: chebyshev >> eigenvalue estimates used: min = 0.0998145, max = >> 1.09796 >> eigenvalues estimate via gmres min 0.00156735, max >> 0.998145 >> eigenvalues estimated using gmres with translations >> [0. 0.1; 0. 1.1] >> KSP Object: (fieldsplit_0_mg_levels_1_esteig_) 1 >> MPI processes >> type: gmres >> restart=30, using Classical (unmodified) >> Gram-Schmidt Orthogonalization with no iterative refinement >> happy breakdown tolerance 1e-30 >> maximum iterations=10, initial guess is zero >> tolerances: relative=1e-12, absolute=1e-50, >> divergence=10000. >> left preconditioning >> using PRECONDITIONED norm type for convergence >> test >> estimating eigenvalues using noisy right hand side >> maximum iterations=2, nonzero initial guess >> tolerances: relative=1e-05, absolute=1e-50, >> divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes >> type: sor >> type = local_symmetric, iterations = 1, local >> iterations = 1, omega = 1. 
>> linear system matrix = precond matrix: >> Mat Object: (fieldsplit_0_) 1 MPI processes >> type: mpiaij >> rows=480, cols=480 >> total: nonzeros=25200, allocated nonzeros=25200 >> total number of mallocs used during MatSetValues >> calls=0 >> using I-node (on process 0) routines: found 160 >> nodes, limit used is 5 >> Up solver (post-smoother) same as down solver >> (pre-smoother) >> linear system matrix = precond matrix: >> Mat Object: (fieldsplit_0_) 1 MPI processes >> type: mpiaij >> rows=480, cols=480 >> total: nonzeros=25200, allocated nonzeros=25200 >> total number of mallocs used during MatSetValues calls=0 >> using I-node (on process 0) routines: found 160 >> nodes, limit used is 5 >> A01 >> Mat Object: 1 MPI processes >> type: mpiaij >> rows=480, cols=144 >> total: nonzeros=48, allocated nonzeros=48 >> total number of mallocs used during MatSetValues calls=0 >> using I-node (on process 0) routines: found 135 nodes, >> limit used is 5 >> Mat Object: 1 MPI processes >> type: mpiaij >> rows=144, cols=144 >> total: nonzeros=240, allocated nonzeros=240 >> total number of mallocs used during MatSetValues calls=0 >> not using I-node (on process 0) routines >> linear system matrix = precond matrix: >> Mat Object: 1 MPI processes >> type: mpiaij >> rows=624, cols=624 >> total: nonzeros=25536, allocated nonzeros=25536 >> total number of mallocs used during MatSetValues calls=0 >> using I-node (on process 0) routines: found 336 nodes, limit used >> is 5 >> >> >> Thanks, >> Xiaofeng >> >> >> >> On Jun 17, 2025, at 19:05, Mark Adams wrote: >> >> And don't use -pc_gamg_parallel_coarse_grid_solver >> You can use that in production but for debugging use -mg_coarse_pc_type >> svd >> Also, use -options_left and remove anything that is not used. >> (I am puzzled, I see -pc_type gamg not -pc_type fieldsplit) >> >> Mark >> >> >> On Mon, Jun 16, 2025 at 6:40?AM Matthew Knepley >> wrote: >> >>> On Sun, Jun 15, 2025 at 9:46?PM hexioafeng >>> wrote: >>> >>>> Hello, >>>> >>>> Here are the options and outputs: >>>> >>>> options: >>>> >>>> -ksp_type cg -pc_type gamg -pc_gamg_parallel_coarse_grid_solver >>>> -pc_fieldsplit_detect_saddle_point -pc_fieldsplit_type schur >>>> -pc_fieldsplit_schur_precondition selfp >>>> -fieldsplit_1_mat_schur_complement_ainv_type lump >>>> -pc_fieldsplit_schur_fact_type full -fieldsplit_0_ksp_type preonly >>>> -fieldsplit_0_pc_type gamg -fieldsplit_0_mg_coarse_pc_type_type svd >>>> -fieldsplit_1_ksp_type preonly -fieldsplit_1_pc_type bjacobi >>>> -fieldsplit_1_sub_pc_type sor -ksp_view -ksp_monitor_true_residual >>>> -ksp_converged_reason -fieldsplit_0_mg_levels_ksp_monitor_true_residual >>>> -fieldsplit_0_mg_levels_ksp_converged_reason >>>> -fieldsplit_1_ksp_monitor_true_residual >>>> -fieldsplit_1_ksp_converged_reason >>>> >>> >>> This option was wrong: >>> >>> -fieldsplit_0_mg_coarse_pc_type_type svd >>> >>> from the output, we can see that it should have been >>> >>> -fieldsplit_0_mg_coarse_sub_pc_type_type svd >>> >>> THanks, >>> >>> Matt >>> >>> >>>> output: >>>> >>>> 0 KSP unpreconditioned resid norm 2.777777777778e+01 true resid norm >>>> 2.777777777778e+01 ||r(i)||/||b|| 1.000000000000e+00 >>>> Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 >>>> PC failed due to SUBPC_ERROR >>>> KSP Object: 1 MPI processes >>>> type: cg >>>> maximum iterations=200, initial guess is zero >>>> tolerances: relative=1e-06, absolute=1e-12, divergence=1e+30 >>>> left preconditioning >>>> using UNPRECONDITIONED norm type for convergence test >>>> PC Object: 1 MPI 
processes >>>> type: gamg >>>> type is MULTIPLICATIVE, levels=2 cycles=v >>>> Cycles per PCApply=1 >>>> Using externally compute Galerkin coarse grid matrices >>>> GAMG specific options >>>> Threshold for dropping small values in graph on each level = >>>> Threshold scaling factor for each level not specified = 1. >>>> AGG specific options >>>> Symmetric graph false >>>> Number of levels to square graph 1 >>>> Number smoothing steps 1 >>>> Complexity: grid = 1.00176 >>>> Coarse grid solver -- level ------------------------------- >>>> KSP Object: (mg_coarse_) 1 MPI processes >>>> type: preonly >>>> maximum iterations=10000, initial guess is zero >>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>> left preconditioning >>>> using NONE norm type for convergence test >>>> PC Object: (mg_coarse_) 1 MPI processes >>>> type: bjacobi >>>> number of blocks = 1 >>>> Local solver is the same for all blocks, as in the following >>>> KSP and PC objects on rank 0: >>>> KSP Object: (mg_coarse_sub_) 1 MPI processes >>>> type: preonly >>>> maximum iterations=1, initial guess is zero >>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>> left preconditioning >>>> using NONE norm type for convergence test >>>> PC Object: (mg_coarse_sub_) 1 MPI processes >>>> type: lu >>>> out-of-place factorization >>>> tolerance for zero pivot 2.22045e-14 >>>> using diagonal shift on blocks to prevent zero pivot >>>> [INBLOCKS] >>>> matrix ordering: nd >>>> factor fill ratio given 5., needed 1. >>>> Factored matrix follows: >>>> Mat Object: 1 MPI processes >>>> type: seqaij >>>> rows=7, cols=7 >>>> package used to perform factorization: petsc >>>> total: nonzeros=45, allocated nonzeros=45 >>>> using I-node routines: found 3 nodes, limit used is 5 >>>> linear system matrix = precond matrix: >>>> Mat Object: 1 MPI processes >>>> type: seqaij >>>> rows=7, cols=7 >>>> total: nonzeros=45, allocated nonzeros=45 >>>> total number of mallocs used during MatSetValues calls=0 >>>> using I-node routines: found 3 nodes, limit used is 5 >>>> linear system matrix = precond matrix: >>>> Mat Object: 1 MPI processes >>>> type: mpiaij >>>> rows=7, cols=7 >>>> total: nonzeros=45, allocated nonzeros=45 >>>> total number of mallocs used during MatSetValues calls=0 >>>> using nonscalable MatPtAP() implementation >>>> using I-node (on process 0) routines: found 3 nodes, limit >>>> used is 5 >>>> Down solver (pre-smoother) on level 1 ------------------------------- >>>> KSP Object: (mg_levels_1_) 1 MPI processes >>>> type: chebyshev >>>> eigenvalue estimates used: min = 0., max = 0. >>>> eigenvalues estimate via gmres min 0., max 0. >>>> eigenvalues estimated using gmres with translations [0. 0.1; >>>> 0. 1.1] >>>> KSP Object: (mg_levels_1_esteig_) 1 MPI processes >>>> type: gmres >>>> restart=30, using Classical (unmodified) Gram-Schmidt >>>> Orthogonalization with no iterative refinement >>>> happy breakdown tolerance 1e-30 >>>> maximum iterations=10, initial guess is zero >>>> tolerances: relative=1e-12, absolute=1e-50, divergence=10000. >>>> left preconditioning >>>> using PRECONDITIONED norm type for convergence test >>>> PC Object: (mg_levels_1_) 1 MPI processes >>>> type: sor >>>> type = local_symmetric, iterations = 1, local iterations = >>>> 1, omega = 1. 
>>>> linear system matrix = precond matrix: >>>> Mat Object: 1 MPI processes >>>> type: mpiaij >>>> rows=624, cols=624 >>>> total: nonzeros=25536, allocated nonzeros=25536 >>>> total number of mallocs used during MatSetValues calls=0 >>>> using I-node (on process 0) routines: found 336 nodes, >>>> limit used is 5 >>>> estimating eigenvalues using noisy right hand side >>>> maximum iterations=2, nonzero initial guess >>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>> left preconditioning >>>> using NONE norm type for convergence test >>>> PC Object: (mg_levels_1_) 1 MPI processes >>>> type: sor >>>> type = local_symmetric, iterations = 1, local iterations = 1, >>>> omega = 1. linear system matrix = precond matrix: >>>> Mat Object: 1 MPI processes >>>> type: mpiaij >>>> rows=624, cols=624 >>>> total: nonzeros=25536, allocated nonzeros=25536 >>>> total number of mallocs used during MatSetValues calls=0 >>>> using I-node (on process 0) routines: found 336 nodes, limit >>>> used is 5 Up solver (post-smoother) same as down solver (pre-smoother) >>>> linear system matrix = precond matrix: >>>> Mat Object: 1 MPI processes >>>> type: mpiaij >>>> rows=624, cols=624 >>>> total: nonzeros=25536, allocated nonzeros=25536 >>>> total number of mallocs used during MatSetValues calls=0 >>>> using I-node (on process 0) routines: found 336 nodes, limit used >>>> is 5 >>>> >>>> >>>> Best regards, >>>> >>>> Xiaofeng >>>> >>>> >>>> On Jun 14, 2025, at 07:28, Barry Smith wrote: >>>> >>>> >>>> Matt, >>>> >>>> Perhaps we should add options -ksp_monitor_debug and >>>> -snes_monitor_debug that turn on all possible monitoring for the (possibly) >>>> nested solvers and all of their converged reasons also? Note this is not >>>> completely trivial because each preconditioner will have to supply its list >>>> based on the current solver options for it. >>>> >>>> Then we won't need to constantly list a big string of problem >>>> specific monitor options to ask the user to use. >>>> >>>> Barry >>>> >>>> >>>> >>>> >>>> On Jun 13, 2025, at 9:09?AM, Matthew Knepley wrote: >>>> >>>> On Thu, Jun 12, 2025 at 10:55?PM hexioafeng >>>> wrote: >>>> >>>>> Dear authors, >>>>> >>>>> I tried *-pc_type game -pc_gamg_parallel_coarse_grid_solver* and *-pc_type >>>>> field split -pc_fieldsplit_detect_saddle_point -fieldsplit_0_ksp_type >>>>> pronely -fieldsplit_0_pc_type game -fieldsplit_0_mg_coarse_pc_type sad >>>>> -fieldsplit_1_ksp_type pronely -fieldsplit_1_pc_type Jacobi >>>>> _fieldsplit_1_sub_pc_type for* , both options got the >>>>> KSP_DIVERGE_PC_FAILED error. >>>>> >>>> >>>> With any question about convergence, we need to see the output of >>>> >>>> -ksp_view -ksp_monitor_true_residual -ksp_converged_reason >>>> -fieldsplit_0_mg_levels_ksp_monitor_true_residual >>>> -fieldsplit_0_mg_levels_ksp_converged_reason >>>> -fieldsplit_1_ksp_monitor_true_residual -fieldsplit_1_ksp_converged_reason >>>> >>>> and all the error output. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Thanks, >>>>> >>>>> Xiaofeng >>>>> >>>>> >>>>> On Jun 12, 2025, at 20:50, Mark Adams wrote: >>>>> >>>>> >>>>> >>>>> On Thu, Jun 12, 2025 at 8:44?AM Matthew Knepley >>>>> wrote: >>>>> >>>>>> On Thu, Jun 12, 2025 at 4:58?AM Mark Adams wrote: >>>>>> >>>>>>> Adding this to the PETSc mailing list, >>>>>>> >>>>>>> On Thu, Jun 12, 2025 at 3:43?AM hexioafeng >>>>>>> wrote: >>>>>>> >>>>>>>> >>>>>>>> Dear Professor, >>>>>>>> >>>>>>>> I hope this message finds you well. 
>>>>>>>> >>>>>>>> I am an employee at a CAE company and a heavy user of the PETSc >>>>>>>> library. I would like to thank you for your contributions to PETSc and >>>>>>>> express my deep appreciation for your work. >>>>>>>> >>>>>>>> Recently, I encountered some difficulties when using PETSc to solve >>>>>>>> structural mechanics problems with Lagrange multiplier constraints. After >>>>>>>> searching extensively online and reviewing several papers, I found your >>>>>>>> previous paper titled "*Algebraic multigrid methods for >>>>>>>> constrained linear systems with applications to contact problems in solid >>>>>>>> mechanics*" seems to be the most relevant and helpful. >>>>>>>> >>>>>>>> The stiffness matrix I'm working with, *K*, is a block >>>>>>>> saddle-point matrix of the form (A00 A01; A10 0), where *A00 is >>>>>>>> singular*?just as described in your paper, and different from many >>>>>>>> other articles . I have a few questions regarding your work and would >>>>>>>> greatly appreciate your insights: >>>>>>>> >>>>>>>> 1. Is the *AMG/KKT* method presented in your paper available in >>>>>>>> PETSc? I tried using *CG+GAMG* directly but received a >>>>>>>> *KSP_DIVERGED_PC_FAILED* error. I also attempted to use >>>>>>>> *CG+PCFIELDSPLIT* with the following options: >>>>>>>> >>>>>>> >>>>>>> No >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> -pc_type fieldsplit -pc_fieldsplit_detect_saddle_point >>>>>>>> -pc_fieldsplit_type schur -pc_fieldsplit_schur_precondition selfp >>>>>>>> -pc_fieldsplit_schur_fact_type full -fieldsplit_0_ksp_type preonly >>>>>>>> -fieldsplit_0_pc_type gamg -fieldsplit_1_ksp_type preonly >>>>>>>> -fieldsplit_1_pc_type bjacobi >>>>>>>> >>>>>>>> Unfortunately, this also resulted in a *KSP_DIVERGED_PC_FAILED* error. >>>>>>>> Do you have any suggestions? >>>>>>>> >>>>>>>> 2. In your paper, you compare the method with *Uzawa*-type >>>>>>>> approaches. To my understanding, Uzawa methods typically require A00 to be >>>>>>>> invertible. How did you handle the singularity of A00 to construct an >>>>>>>> M-matrix that is invertible? >>>>>>>> >>>>>>>> >>>>>>> You add a regularization term like A01 * A10 (like springs). See the >>>>>>> paper or any reference to augmented lagrange or Uzawa >>>>>>> >>>>>>> >>>>>>> 3. Can i implement the AMG/KKT method in your paper using existing *AMG >>>>>>>> APIs*? Implementing a production-level AMG solver from scratch >>>>>>>> would be quite challenging for me, so I?m hoping to utilize existing AMG >>>>>>>> interfaces within PETSc or other packages. >>>>>>>> >>>>>>>> >>>>>>> You can do Uzawa and make the regularization matrix with >>>>>>> matrix-matrix products. Just use AMG for the A00 block. >>>>>>> >>>>>>> >>>>>>> >>>>>>>> 4. For saddle-point systems where A00 is singular, can you >>>>>>>> recommend any more robust or efficient solutions? Alternatively, are you >>>>>>>> aware of any open-source software packages that can handle such cases >>>>>>>> out-of-the-box? >>>>>>>> >>>>>>>> >>>>>>> No, and I don't think PETSc can do this out-of-the-box, but others >>>>>>> may be able to give you a better idea of what PETSc can do. 
>>>>>>> I think PETSc can do Uzawa or other similar algorithms but it will >>>>>>> not do the regularization automatically (it is a bit more complicated than >>>>>>> just A01 * A10) >>>>>>> >>>>>> >>>>>> One other trick you can use is to have >>>>>> >>>>>> -fieldsplit_0_mg_coarse_pc_type svd >>>>>> >>>>>> This will use SVD on the coarse grid of GAMG, which can handle the >>>>>> null space in A00 as long as the prolongation does not put it back in. I >>>>>> have used this for the Laplacian with Neumann conditions and for freely >>>>>> floating elastic problems. >>>>>> >>>>>> >>>>> Good point. >>>>> You can also use -pc_gamg_parallel_coarse_grid_solver to get GAMG to >>>>> use a on level iterative solver for the coarse grid. >>>>> >>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>>> Thanks, >>>>>>> Mark >>>>>>> >>>>>>>> >>>>>>>> Thank you very much for taking the time to read my email. Looking >>>>>>>> forward to hearing from you. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Sincerely, >>>>>>>> >>>>>>>> Xiaofeng He >>>>>>>> ----------------------------------------------------- >>>>>>>> >>>>>>>> Research Engineer >>>>>>>> >>>>>>>> Internet Based Engineering, Beijing, China >>>>>>>> >>>>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!b073klPm2VmdcoyZSanTzrb7QKM0FLY76BIb6aSF6UzIRqOnap2eiE5edCKKSGWHd5jcpBqQi9rBidaVTxD5$ >>>>>> >>>>>> >>>>> >>>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!b073klPm2VmdcoyZSanTzrb7QKM0FLY76BIb6aSF6UzIRqOnap2eiE5edCKKSGWHd5jcpBqQi9rBidaVTxD5$ >>>> >>>> >>>> >>>> >>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!b073klPm2VmdcoyZSanTzrb7QKM0FLY76BIb6aSF6UzIRqOnap2eiE5edCKKSGWHd5jcpBqQi9rBidaVTxD5$ >>> >>> >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!b073klPm2VmdcoyZSanTzrb7QKM0FLY76BIb6aSF6UzIRqOnap2eiE5edCKKSGWHd5jcpBqQi9rBidaVTxD5$ > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!b073klPm2VmdcoyZSanTzrb7QKM0FLY76BIb6aSF6UzIRqOnap2eiE5edCKKSGWHd5jcpBqQi9rBidaVTxD5$ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mfadams at lbl.gov Thu Jun 19 11:56:15 2025 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 19 Jun 2025 12:56:15 -0400 Subject: [petsc-users] Questions Regarding PETSc and Solving Constrained Structural Mechanics Problems In-Reply-To: References: <3BF3C1E8-0CB0-42F1-A624-8FA0DC7FD4A4@buaa.edu.cn> <35A61411-85CF-4F48-9DD6-0409F0CFE598@petsc.dev> <5CFF6556-4BDE-48D9-9D3A-6D8790465358@buaa.edu.cn> <87A02E48-DDE4-4DCD-8C52-D2DAF975EF01@buaa.edu.cn> <96AB5047-4A35-49A2-B948-86656A1CFB5B@buaa.edu.cn> Message-ID: This is what Matt is looking at: PC Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI processes type: lu This should be svd, not lu If you had used -options_left you would have caught this mistake(s) On Thu, Jun 19, 2025 at 8:06?AM Matthew Knepley wrote: > On Thu, Jun 19, 2025 at 7:59?AM hexioafeng wrote: > >> Hello sir, >> >> I remove the duplicated "_type", and get the same error and output. >> > > The output cannot be the same. Please send it. > > Thanks, > > Matt > > >> Best regards, >> Xiaofeng >> >> >> On Jun 19, 2025, at 19:45, Matthew Knepley wrote: >> >> This options is wrong >> >> -fieldsplit_0_mg_coarse_sub_pc_type_type svd >> >> Notice that "_type" is repeated. >> >> Thanks, >> >> Matt >> >> On Thu, Jun 19, 2025 at 7:10?AM hexioafeng >> wrote: >> >>> Dear authors, >>> >>> Here are the options passed with fieldsplit preconditioner: >>> >>> -ksp_type cg -pc_type fieldsplit -pc_fieldsplit_detect_saddle_point >>> -pc_fieldsplit_type schur -pc_fieldsplit_schur_precondition selfp >>> -pc_fieldsplit_schur_fact_type full -fieldsplit_0_ksp_type preonly >>> -fieldsplit_0_pc_type gamg -fieldsplit_0_mg_coarse_sub_pc_type_type svd >>> -fieldsplit_1_ksp_type preonly -fieldsplit_1_pc_type bjacobi -ksp_view >>> -ksp_monitor_true_residual -ksp_converged_reason >>> -fieldsplit_0_mg_levels_ksp_monitor_true_residual >>> -fieldsplit_0_mg_levels_ksp_converged_reason >>> -fieldsplit_1_ksp_monitor_true_residual >>> -fieldsplit_1_ksp_converged_reason >>> >>> and the output: >>> >>> 0 KSP unpreconditioned resid norm 2.777777777778e+01 true resid norm >>> 2.777777777778e+01 ||r(i)||/||b|| 1.000000000000e+00 >>> Linear fieldsplit_0_mg_levels_1_ solve converged due to >>> CONVERGED_ITS iterations 2 >>> Linear fieldsplit_0_mg_levels_1_ solve converged due to >>> CONVERGED_ITS iterations 2 >>> Linear fieldsplit_1_ solve did not converge due to DIVERGED_PC_FAILED >>> iterations 0 >>> PC failed due to SUBPC_ERROR >>> Linear fieldsplit_0_mg_levels_1_ solve converged due to >>> CONVERGED_ITS iterations 2 >>> Linear fieldsplit_0_mg_levels_1_ solve converged due to >>> CONVERGED_ITS iterations 2 >>> Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 >>> PC failed due to SUBPC_ERROR >>> KSP Object: 1 MPI processes >>> type: cg >>> maximum iterations=200, initial guess is zero >>> tolerances: relative=1e-06, absolute=1e-12, divergence=1e+30 >>> left preconditioning >>> using UNPRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI processes >>> type: fieldsplit >>> FieldSplit with Schur preconditioner, blocksize = 1, factorization >>> FULL >>> Preconditioner for the Schur complement formed from Sp, an assembled >>> approximation to S, which uses A00's diagonal's inverse >>> Split info: >>> Split number 0 Defined by IS >>> Split number 1 Defined by IS >>> KSP solver for A00 block >>> KSP Object: (fieldsplit_0_) 1 MPI processes >>> type: preonly >>> maximum iterations=10000, initial guess is zero >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
>>> left preconditioning >>> using NONE norm type for convergence test >>> PC Object: (fieldsplit_0_) 1 MPI processes >>> type: gamg >>> type is MULTIPLICATIVE, levels=2 cycles=v >>> Cycles per PCApply=1 >>> Using externally compute Galerkin coarse grid matrices >>> GAMG specific options >>> Threshold for dropping small values in graph on each level >>> = >>> Threshold scaling factor for each level not specified = 1. >>> AGG specific options >>> Symmetric graph false >>> Number of levels to square graph 1 >>> Number smoothing steps 1 >>> Complexity: grid = 1.00222 >>> Coarse grid solver -- level ------------------------------- >>> KSP Object: (fieldsplit_0_mg_coarse_) 1 MPI processes >>> type: preonly >>> maximum iterations=10000, initial guess is zero >>> tolerances: relative=1e-05, absolute=1e-50, >>> divergence=10000. >>> left preconditioning >>> using NONE norm type for convergence test >>> PC Object: (fieldsplit_0_mg_coarse_) 1 MPI processes >>> type: bjacobi >>> number of blocks = 1 >>> Local solver is the same for all blocks, as in the >>> following KSP and PC objects on rank 0: >>> KSP Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI processes >>> type: preonly >>> maximum iterations=1, initial guess is zero >>> tolerances: relative=1e-05, absolute=1e-50, >>> divergence=10000. >>> left preconditioning >>> using NONE norm type for convergence test >>> PC Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI processes >>> type: lu >>> out-of-place factorization >>> tolerance for zero pivot 2.22045e-14 >>> using diagonal shift on blocks to prevent zero pivot >>> [INBLOCKS] >>> matrix ordering: nd >>> factor fill ratio given 5., needed 1. >>> Factored matrix follows: >>> Mat Object: 1 MPI processes >>> type: seqaij >>> rows=8, cols=8 >>> package used to perform factorization: petsc >>> total: nonzeros=56, allocated nonzeros=56 >>> using I-node routines: found 3 nodes, limit used >>> is 5 >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI processes >>> type: seqaij >>> rows=8, cols=8 >>> total: nonzeros=56, allocated nonzeros=56 >>> total number of mallocs used during MatSetValues calls=0 >>> using I-node routines: found 3 nodes, limit used is 5 >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI processes >>> type: mpiaij >>> rows=8, cols=8 >>> total: nonzeros=56, allocated nonzeros=56 >>> total number of mallocs used during MatSetValues calls=0 >>> using nonscalable MatPtAP() implementation >>> using I-node (on process 0) routines: found 3 nodes, >>> limit used is 5 >>> Down solver (pre-smoother) on level 1 >>> ------------------------------- >>> KSP Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes >>> type: chebyshev >>> eigenvalue estimates used: min = 0.0998145, max = 1.09796 >>> eigenvalues estimate via gmres min 0.00156735, max 0.998145 >>> eigenvalues estimated using gmres with translations [0. >>> 0.1; 0. 1.1] >>> KSP Object: (fieldsplit_0_mg_levels_1_esteig_) 1 MPI >>> processes >>> type: gmres >>> restart=30, using Classical (unmodified) Gram-Schmidt >>> Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=10, initial guess is zero >>> tolerances: relative=1e-12, absolute=1e-50, >>> divergence=10000. >>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> estimating eigenvalues using noisy right hand side >>> maximum iterations=2, nonzero initial guess >>> tolerances: relative=1e-05, absolute=1e-50, >>> divergence=10000. 
>>> left preconditioning >>> using NONE norm type for convergence test >>> PC Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes >>> type: sor >>> type = local_symmetric, iterations = 1, local iterations = >>> 1, omega = 1. >>> linear system matrix = precond matrix: >>> Mat Object: (fieldsplit_0_) 1 MPI processes >>> type: mpiaij >>> rows=480, cols=480 >>> total: nonzeros=25200, allocated nonzeros=25200 >>> total number of mallocs used during MatSetValues calls=0 >>> using I-node (on process 0) routines: found 160 nodes, >>> limit used is 5 >>> Up solver (post-smoother) same as down solver (pre-smoother) >>> linear system matrix = precond matrix: >>> Mat Object: (fieldsplit_0_) 1 MPI processes >>> type: mpiaij >>> rows=480, cols=480 >>> total: nonzeros=25200, allocated nonzeros=25200 >>> total number of mallocs used during MatSetValues calls=0 >>> using I-node (on process 0) routines: found 160 nodes, limit >>> used is 5 >>> KSP solver for S = A11 - A10 inv(A00) A01 >>> KSP Object: (fieldsplit_1_) 1 MPI processes >>> type: preonly >>> maximum iterations=10000, initial guess is zero >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>> left preconditioning >>> using NONE norm type for convergence test >>> PC Object: (fieldsplit_1_) 1 MPI processes >>> type: bjacobi >>> number of blocks = 1 >>> Local solver is the same for all blocks, as in the following >>> KSP and PC objects on rank 0: >>> KSP Object: (fieldsplit_1_sub_) 1 MPI processes >>> type: preonly >>> maximum iterations=10000, initial guess is zero >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>> left preconditioning >>> using NONE norm type for convergence test >>> PC Object: (fieldsplit_1_sub_) 1 MPI processes >>> type: bjacobi >>> number of blocks = 1 >>> Local solver is the same for all blocks, as in the following >>> KSP and PC objects on rank 0: >>> KSP Object: (fieldsplit_1_sub_sub_) >>> 1 MPI processes >>> type: preonly >>> maximum iterations=10000, initial guess is zero >>> tolerances: relative=1e-05, absolute=1e-50, >>> divergence=10000. >>> left preconditioning >>> using NONE norm type for convergence test >>> PC Object: (fieldsplit_1_sub_sub_) >>> 1 MPI processes >>> type: ilu >>> out-of-place factorization >>> 0 levels of fill >>> tolerance for zero pivot 2.22045e-14 >>> matrix ordering: natural >>> factor fill ratio given 1., needed 1. 
>>> Factored matrix follows: >>> Mat Object: 1 MPI processes >>> type: seqaij >>> rows=144, cols=144 >>> package used to perform factorization: >>> petsc >>> total: nonzeros=240, allocated nonzeros=240 >>> not using I-node routines >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI processes >>> type: seqaij >>> rows=144, cols=144 >>> total: nonzeros=240, allocated nonzeros=240 >>> total number of mallocs used during MatSetValues >>> calls=0 >>> not using I-node routines >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI processes >>> type: mpiaij >>> rows=144, cols=144 >>> total: nonzeros=240, allocated nonzeros=240 >>> total number of mallocs used during MatSetValues calls=0 >>> not using I-node (on process 0) routines >>> linear system matrix followed by preconditioner matrix: >>> Mat Object: (fieldsplit_1_) 1 MPI processes >>> type: schurcomplement >>> rows=144, cols=144 >>> Schur complement A11 - A10 inv(A00) A01 >>> A11 >>> Mat Object: (fieldsplit_1_) 1 MPI processes >>> type: mpiaij >>> rows=144, cols=144 >>> total: nonzeros=240, allocated nonzeros=240 >>> total number of mallocs used during MatSetValues calls=0 >>> not using I-node (on process 0) routines >>> A10 >>> Mat Object: 1 MPI processes >>> type: mpiaij >>> rows=144, cols=480 >>> total: nonzeros=48, allocated nonzeros=48 >>> total number of mallocs used during MatSetValues calls=0 >>> using I-node (on process 0) routines: found 74 nodes, >>> limit used is 5 >>> KSP of A00 >>> KSP Object: (fieldsplit_0_) 1 MPI processes >>> type: preonly >>> maximum iterations=10000, initial guess is zero >>> tolerances: relative=1e-05, absolute=1e-50, >>> divergence=10000. >>> left preconditioning >>> using NONE norm type for convergence test >>> PC Object: (fieldsplit_0_) 1 MPI processes >>> type: gamg >>> type is MULTIPLICATIVE, levels=2 cycles=v >>> Cycles per PCApply=1 >>> Using externally compute Galerkin coarse grid >>> matrices >>> GAMG specific options >>> Threshold for dropping small values in graph on >>> each level = >>> Threshold scaling factor for each level not >>> specified = 1. >>> AGG specific options >>> Symmetric graph false >>> Number of levels to square graph 1 >>> Number smoothing steps 1 >>> Complexity: grid = 1.00222 >>> Coarse grid solver -- level >>> ------------------------------- >>> KSP Object: (fieldsplit_0_mg_coarse_) 1 MPI processes >>> type: preonly >>> maximum iterations=10000, initial guess is zero >>> tolerances: relative=1e-05, absolute=1e-50, >>> divergence=10000. >>> left preconditioning >>> using NONE norm type for convergence test >>> PC Object: (fieldsplit_0_mg_coarse_) 1 MPI processes >>> type: bjacobi >>> number of blocks = 1 >>> Local solver is the same for all blocks, as in the >>> following KSP and PC objects on rank 0: >>> KSP Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI >>> processes >>> type: preonly >>> maximum iterations=1, initial guess is zero >>> tolerances: relative=1e-05, absolute=1e-50, >>> divergence=10000. >>> left preconditioning >>> using NONE norm type for convergence test >>> PC Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI >>> processes >>> type: lu >>> out-of-place factorization >>> tolerance for zero pivot 2.22045e-14 >>> using diagonal shift on blocks to prevent zero >>> pivot [INBLOCKS] >>> matrix ordering: nd >>> factor fill ratio given 5., needed 1. 
>>> Factored matrix follows: >>> Mat Object: 1 MPI processes >>> type: seqaij >>> rows=8, cols=8 >>> package used to perform factorization: >>> petsc >>> total: nonzeros=56, allocated nonzeros=56 >>> using I-node routines: found 3 nodes, >>> limit used is 5 >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI processes >>> type: seqaij >>> rows=8, cols=8 >>> total: nonzeros=56, allocated nonzeros=56 >>> total number of mallocs used during MatSetValues >>> calls=0 >>> using I-node routines: found 3 nodes, limit >>> used is 5 >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI processes >>> type: mpiaij >>> rows=8, cols=8 >>> total: nonzeros=56, allocated nonzeros=56 >>> total number of mallocs used during MatSetValues >>> calls=0 >>> using nonscalable MatPtAP() implementation >>> using I-node (on process 0) routines: found 3 >>> nodes, limit used is 5 >>> Down solver (pre-smoother) on level 1 >>> ------------------------------- >>> KSP Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes >>> type: chebyshev >>> eigenvalue estimates used: min = 0.0998145, max = >>> 1.09796 >>> eigenvalues estimate via gmres min 0.00156735, max >>> 0.998145 >>> eigenvalues estimated using gmres with >>> translations [0. 0.1; 0. 1.1] >>> KSP Object: (fieldsplit_0_mg_levels_1_esteig_) 1 >>> MPI processes >>> type: gmres >>> restart=30, using Classical (unmodified) >>> Gram-Schmidt Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=10, initial guess is zero >>> tolerances: relative=1e-12, absolute=1e-50, >>> divergence=10000. >>> left preconditioning >>> using PRECONDITIONED norm type for convergence >>> test >>> estimating eigenvalues using noisy right hand side >>> maximum iterations=2, nonzero initial guess >>> tolerances: relative=1e-05, absolute=1e-50, >>> divergence=10000. >>> left preconditioning >>> using NONE norm type for convergence test >>> PC Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes >>> type: sor >>> type = local_symmetric, iterations = 1, local >>> iterations = 1, omega = 1. 
>>> linear system matrix = precond matrix: >>> Mat Object: (fieldsplit_0_) 1 MPI processes >>> type: mpiaij >>> rows=480, cols=480 >>> total: nonzeros=25200, allocated nonzeros=25200 >>> total number of mallocs used during MatSetValues >>> calls=0 >>> using I-node (on process 0) routines: found 160 >>> nodes, limit used is 5 >>> Up solver (post-smoother) same as down solver >>> (pre-smoother) >>> linear system matrix = precond matrix: >>> Mat Object: (fieldsplit_0_) 1 MPI processes >>> type: mpiaij >>> rows=480, cols=480 >>> total: nonzeros=25200, allocated nonzeros=25200 >>> total number of mallocs used during MatSetValues >>> calls=0 >>> using I-node (on process 0) routines: found 160 >>> nodes, limit used is 5 >>> A01 >>> Mat Object: 1 MPI processes >>> type: mpiaij >>> rows=480, cols=144 >>> total: nonzeros=48, allocated nonzeros=48 >>> total number of mallocs used during MatSetValues calls=0 >>> using I-node (on process 0) routines: found 135 nodes, >>> limit used is 5 >>> Mat Object: 1 MPI processes >>> type: mpiaij >>> rows=144, cols=144 >>> total: nonzeros=240, allocated nonzeros=240 >>> total number of mallocs used during MatSetValues calls=0 >>> not using I-node (on process 0) routines >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI processes >>> type: mpiaij >>> rows=624, cols=624 >>> total: nonzeros=25536, allocated nonzeros=25536 >>> total number of mallocs used during MatSetValues calls=0 >>> using I-node (on process 0) routines: found 336 nodes, limit used >>> is 5 >>> >>> >>> Thanks, >>> Xiaofeng >>> >>> >>> >>> On Jun 17, 2025, at 19:05, Mark Adams wrote: >>> >>> And don't use -pc_gamg_parallel_coarse_grid_solver >>> You can use that in production but for debugging use -mg_coarse_pc_type >>> svd >>> Also, use -options_left and remove anything that is not used. 
>>> (I am puzzled, I see -pc_type gamg not -pc_type fieldsplit) >>> >>> Mark >>> >>> >>> On Mon, Jun 16, 2025 at 6:40?AM Matthew Knepley >>> wrote: >>> >>>> On Sun, Jun 15, 2025 at 9:46?PM hexioafeng >>>> wrote: >>>> >>>>> Hello, >>>>> >>>>> Here are the options and outputs: >>>>> >>>>> options: >>>>> >>>>> -ksp_type cg -pc_type gamg -pc_gamg_parallel_coarse_grid_solver >>>>> -pc_fieldsplit_detect_saddle_point -pc_fieldsplit_type schur >>>>> -pc_fieldsplit_schur_precondition selfp >>>>> -fieldsplit_1_mat_schur_complement_ainv_type lump >>>>> -pc_fieldsplit_schur_fact_type full -fieldsplit_0_ksp_type preonly >>>>> -fieldsplit_0_pc_type gamg -fieldsplit_0_mg_coarse_pc_type_type svd >>>>> -fieldsplit_1_ksp_type preonly -fieldsplit_1_pc_type bjacobi >>>>> -fieldsplit_1_sub_pc_type sor -ksp_view -ksp_monitor_true_residual >>>>> -ksp_converged_reason -fieldsplit_0_mg_levels_ksp_monitor_true_residual >>>>> -fieldsplit_0_mg_levels_ksp_converged_reason >>>>> -fieldsplit_1_ksp_monitor_true_residual >>>>> -fieldsplit_1_ksp_converged_reason >>>>> >>>> >>>> This option was wrong: >>>> >>>> -fieldsplit_0_mg_coarse_pc_type_type svd >>>> >>>> from the output, we can see that it should have been >>>> >>>> -fieldsplit_0_mg_coarse_sub_pc_type_type svd >>>> >>>> THanks, >>>> >>>> Matt >>>> >>>> >>>>> output: >>>>> >>>>> 0 KSP unpreconditioned resid norm 2.777777777778e+01 true resid norm >>>>> 2.777777777778e+01 ||r(i)||/||b|| 1.000000000000e+00 >>>>> Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 >>>>> PC failed due to SUBPC_ERROR >>>>> KSP Object: 1 MPI processes >>>>> type: cg >>>>> maximum iterations=200, initial guess is zero >>>>> tolerances: relative=1e-06, absolute=1e-12, divergence=1e+30 >>>>> left preconditioning >>>>> using UNPRECONDITIONED norm type for convergence test >>>>> PC Object: 1 MPI processes >>>>> type: gamg >>>>> type is MULTIPLICATIVE, levels=2 cycles=v >>>>> Cycles per PCApply=1 >>>>> Using externally compute Galerkin coarse grid matrices >>>>> GAMG specific options >>>>> Threshold for dropping small values in graph on each level = >>>>> Threshold scaling factor for each level not specified = 1. >>>>> AGG specific options >>>>> Symmetric graph false >>>>> Number of levels to square graph 1 >>>>> Number smoothing steps 1 >>>>> Complexity: grid = 1.00176 >>>>> Coarse grid solver -- level ------------------------------- >>>>> KSP Object: (mg_coarse_) 1 MPI processes >>>>> type: preonly >>>>> maximum iterations=10000, initial guess is zero >>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>> left preconditioning >>>>> using NONE norm type for convergence test >>>>> PC Object: (mg_coarse_) 1 MPI processes >>>>> type: bjacobi >>>>> number of blocks = 1 >>>>> Local solver is the same for all blocks, as in the following >>>>> KSP and PC objects on rank 0: >>>>> KSP Object: (mg_coarse_sub_) 1 MPI processes >>>>> type: preonly >>>>> maximum iterations=1, initial guess is zero >>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>> left preconditioning >>>>> using NONE norm type for convergence test >>>>> PC Object: (mg_coarse_sub_) 1 MPI processes >>>>> type: lu >>>>> out-of-place factorization >>>>> tolerance for zero pivot 2.22045e-14 >>>>> using diagonal shift on blocks to prevent zero pivot >>>>> [INBLOCKS] >>>>> matrix ordering: nd >>>>> factor fill ratio given 5., needed 1. 
>>>>> Factored matrix follows: >>>>> Mat Object: 1 MPI processes >>>>> type: seqaij >>>>> rows=7, cols=7 >>>>> package used to perform factorization: petsc >>>>> total: nonzeros=45, allocated nonzeros=45 >>>>> using I-node routines: found 3 nodes, limit used is 5 >>>>> linear system matrix = precond matrix: >>>>> Mat Object: 1 MPI processes >>>>> type: seqaij >>>>> rows=7, cols=7 >>>>> total: nonzeros=45, allocated nonzeros=45 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> using I-node routines: found 3 nodes, limit used is 5 >>>>> linear system matrix = precond matrix: >>>>> Mat Object: 1 MPI processes >>>>> type: mpiaij >>>>> rows=7, cols=7 >>>>> total: nonzeros=45, allocated nonzeros=45 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> using nonscalable MatPtAP() implementation >>>>> using I-node (on process 0) routines: found 3 nodes, limit >>>>> used is 5 >>>>> Down solver (pre-smoother) on level 1 ------------------------------- >>>>> KSP Object: (mg_levels_1_) 1 MPI processes >>>>> type: chebyshev >>>>> eigenvalue estimates used: min = 0., max = 0. >>>>> eigenvalues estimate via gmres min 0., max 0. >>>>> eigenvalues estimated using gmres with translations [0. 0.1; >>>>> 0. 1.1] >>>>> KSP Object: (mg_levels_1_esteig_) 1 MPI processes >>>>> type: gmres >>>>> restart=30, using Classical (unmodified) Gram-Schmidt >>>>> Orthogonalization with no iterative refinement >>>>> happy breakdown tolerance 1e-30 >>>>> maximum iterations=10, initial guess is zero >>>>> tolerances: relative=1e-12, absolute=1e-50, >>>>> divergence=10000. >>>>> left preconditioning >>>>> using PRECONDITIONED norm type for convergence test >>>>> PC Object: (mg_levels_1_) 1 MPI processes >>>>> type: sor >>>>> type = local_symmetric, iterations = 1, local iterations = >>>>> 1, omega = 1. >>>>> linear system matrix = precond matrix: >>>>> Mat Object: 1 MPI processes >>>>> type: mpiaij >>>>> rows=624, cols=624 >>>>> total: nonzeros=25536, allocated nonzeros=25536 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> using I-node (on process 0) routines: found 336 nodes, >>>>> limit used is 5 >>>>> estimating eigenvalues using noisy right hand side >>>>> maximum iterations=2, nonzero initial guess >>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>> left preconditioning >>>>> using NONE norm type for convergence test >>>>> PC Object: (mg_levels_1_) 1 MPI processes >>>>> type: sor >>>>> type = local_symmetric, iterations = 1, local iterations = 1, >>>>> omega = 1. 
linear system matrix = precond matrix: >>>>> Mat Object: 1 MPI processes >>>>> type: mpiaij >>>>> rows=624, cols=624 >>>>> total: nonzeros=25536, allocated nonzeros=25536 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> using I-node (on process 0) routines: found 336 nodes, limit >>>>> used is 5 Up solver (post-smoother) same as down solver (pre-smoother) >>>>> linear system matrix = precond matrix: >>>>> Mat Object: 1 MPI processes >>>>> type: mpiaij >>>>> rows=624, cols=624 >>>>> total: nonzeros=25536, allocated nonzeros=25536 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> using I-node (on process 0) routines: found 336 nodes, limit >>>>> used is 5 >>>>> >>>>> >>>>> Best regards, >>>>> >>>>> Xiaofeng >>>>> >>>>> >>>>> On Jun 14, 2025, at 07:28, Barry Smith wrote: >>>>> >>>>> >>>>> Matt, >>>>> >>>>> Perhaps we should add options -ksp_monitor_debug and >>>>> -snes_monitor_debug that turn on all possible monitoring for the (possibly) >>>>> nested solvers and all of their converged reasons also? Note this is not >>>>> completely trivial because each preconditioner will have to supply its list >>>>> based on the current solver options for it. >>>>> >>>>> Then we won't need to constantly list a big string of problem >>>>> specific monitor options to ask the user to use. >>>>> >>>>> Barry >>>>> >>>>> >>>>> >>>>> >>>>> On Jun 13, 2025, at 9:09?AM, Matthew Knepley >>>>> wrote: >>>>> >>>>> On Thu, Jun 12, 2025 at 10:55?PM hexioafeng >>>>> wrote: >>>>> >>>>>> Dear authors, >>>>>> >>>>>> I tried *-pc_type game -pc_gamg_parallel_coarse_grid_solver* and *-pc_type >>>>>> field split -pc_fieldsplit_detect_saddle_point -fieldsplit_0_ksp_type >>>>>> pronely -fieldsplit_0_pc_type game -fieldsplit_0_mg_coarse_pc_type sad >>>>>> -fieldsplit_1_ksp_type pronely -fieldsplit_1_pc_type Jacobi >>>>>> _fieldsplit_1_sub_pc_type for* , both options got the >>>>>> KSP_DIVERGE_PC_FAILED error. >>>>>> >>>>> >>>>> With any question about convergence, we need to see the output of >>>>> >>>>> -ksp_view -ksp_monitor_true_residual -ksp_converged_reason >>>>> -fieldsplit_0_mg_levels_ksp_monitor_true_residual >>>>> -fieldsplit_0_mg_levels_ksp_converged_reason >>>>> -fieldsplit_1_ksp_monitor_true_residual -fieldsplit_1_ksp_converged_reason >>>>> >>>>> and all the error output. >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> Thanks, >>>>>> >>>>>> Xiaofeng >>>>>> >>>>>> >>>>>> On Jun 12, 2025, at 20:50, Mark Adams wrote: >>>>>> >>>>>> >>>>>> >>>>>> On Thu, Jun 12, 2025 at 8:44?AM Matthew Knepley >>>>>> wrote: >>>>>> >>>>>>> On Thu, Jun 12, 2025 at 4:58?AM Mark Adams wrote: >>>>>>> >>>>>>>> Adding this to the PETSc mailing list, >>>>>>>> >>>>>>>> On Thu, Jun 12, 2025 at 3:43?AM hexioafeng >>>>>>>> wrote: >>>>>>>> >>>>>>>>> >>>>>>>>> Dear Professor, >>>>>>>>> >>>>>>>>> I hope this message finds you well. >>>>>>>>> >>>>>>>>> I am an employee at a CAE company and a heavy user of the PETSc >>>>>>>>> library. I would like to thank you for your contributions to PETSc and >>>>>>>>> express my deep appreciation for your work. >>>>>>>>> >>>>>>>>> Recently, I encountered some difficulties when using PETSc to >>>>>>>>> solve structural mechanics problems with Lagrange multiplier constraints. >>>>>>>>> After searching extensively online and reviewing several papers, I found >>>>>>>>> your previous paper titled "*Algebraic multigrid methods for >>>>>>>>> constrained linear systems with applications to contact problems in solid >>>>>>>>> mechanics*" seems to be the most relevant and helpful. 
>>>>>>>>> >>>>>>>>> The stiffness matrix I'm working with, *K*, is a block >>>>>>>>> saddle-point matrix of the form (A00 A01; A10 0), where *A00 is >>>>>>>>> singular*?just as described in your paper, and different from >>>>>>>>> many other articles . I have a few questions regarding your work and would >>>>>>>>> greatly appreciate your insights: >>>>>>>>> >>>>>>>>> 1. Is the *AMG/KKT* method presented in your paper available in >>>>>>>>> PETSc? I tried using *CG+GAMG* directly but received a >>>>>>>>> *KSP_DIVERGED_PC_FAILED* error. I also attempted to use >>>>>>>>> *CG+PCFIELDSPLIT* with the following options: >>>>>>>>> >>>>>>>> >>>>>>>> No >>>>>>>> >>>>>>>> >>>>>>>>> >>>>>>>>> -pc_type fieldsplit -pc_fieldsplit_detect_saddle_point >>>>>>>>> -pc_fieldsplit_type schur -pc_fieldsplit_schur_precondition selfp >>>>>>>>> -pc_fieldsplit_schur_fact_type full -fieldsplit_0_ksp_type preonly >>>>>>>>> -fieldsplit_0_pc_type gamg -fieldsplit_1_ksp_type preonly >>>>>>>>> -fieldsplit_1_pc_type bjacobi >>>>>>>>> >>>>>>>>> Unfortunately, this also resulted in a *KSP_DIVERGED_PC_FAILED* >>>>>>>>> error. Do you have any suggestions? >>>>>>>>> >>>>>>>>> 2. In your paper, you compare the method with *Uzawa*-type >>>>>>>>> approaches. To my understanding, Uzawa methods typically require A00 to be >>>>>>>>> invertible. How did you handle the singularity of A00 to construct an >>>>>>>>> M-matrix that is invertible? >>>>>>>>> >>>>>>>>> >>>>>>>> You add a regularization term like A01 * A10 (like springs). See >>>>>>>> the paper or any reference to augmented lagrange or Uzawa >>>>>>>> >>>>>>>> >>>>>>>> 3. Can i implement the AMG/KKT method in your paper using existing *AMG >>>>>>>>> APIs*? Implementing a production-level AMG solver from scratch >>>>>>>>> would be quite challenging for me, so I?m hoping to utilize existing AMG >>>>>>>>> interfaces within PETSc or other packages. >>>>>>>>> >>>>>>>>> >>>>>>>> You can do Uzawa and make the regularization matrix with >>>>>>>> matrix-matrix products. Just use AMG for the A00 block. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> 4. For saddle-point systems where A00 is singular, can you >>>>>>>>> recommend any more robust or efficient solutions? Alternatively, are you >>>>>>>>> aware of any open-source software packages that can handle such cases >>>>>>>>> out-of-the-box? >>>>>>>>> >>>>>>>>> >>>>>>>> No, and I don't think PETSc can do this out-of-the-box, but others >>>>>>>> may be able to give you a better idea of what PETSc can do. >>>>>>>> I think PETSc can do Uzawa or other similar algorithms but it will >>>>>>>> not do the regularization automatically (it is a bit more complicated than >>>>>>>> just A01 * A10) >>>>>>>> >>>>>>> >>>>>>> One other trick you can use is to have >>>>>>> >>>>>>> -fieldsplit_0_mg_coarse_pc_type svd >>>>>>> >>>>>>> This will use SVD on the coarse grid of GAMG, which can handle the >>>>>>> null space in A00 as long as the prolongation does not put it back in. I >>>>>>> have used this for the Laplacian with Neumann conditions and for freely >>>>>>> floating elastic problems. >>>>>>> >>>>>>> >>>>>> Good point. >>>>>> You can also use -pc_gamg_parallel_coarse_grid_solver to get GAMG to >>>>>> use a on level iterative solver for the coarse grid. >>>>>> >>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Matt >>>>>>> >>>>>>> >>>>>>>> Thanks, >>>>>>>> Mark >>>>>>>> >>>>>>>>> >>>>>>>>> Thank you very much for taking the time to read my email. Looking >>>>>>>>> forward to hearing from you. 
>>>>>>>>> >>>>>>>>> >>>>>>>>> Sincerely, >>>>>>>>> >>>>>>>>> Xiaofeng He >>>>>>>>> ----------------------------------------------------- >>>>>>>>> >>>>>>>>> Research Engineer >>>>>>>>> >>>>>>>>> Internet Based Engineering, Beijing, China >>>>>>>>> -------------- next part -------------- An HTML attachment was scrubbed... 
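A minimal sketch of the regularization Mark describes above, i.e. forming A01 * A10 with matrix-matrix products and adding it to the singular A00 block before handing it to AMG or an Uzawa-type outer iteration. The function name, the Mat variables, and the scalar gamma are placeholders rather than code from the thread, and Mark notes that a production regularization is more involved than this bare product:

#include <petscmat.h>

/* Form A00reg = A00 + gamma * A01 * A10, a spring-like regularization of a
   singular A00 block; A00reg can then be given to GAMG or used inside an
   Uzawa-type outer iteration.  All Mats are assumed already assembled. */
PetscErrorCode FormRegularizedA00(Mat A00, Mat A01, Mat A10, PetscScalar gamma, Mat *A00reg)
{
  Mat R;

  PetscFunctionBeginUser;
  PetscCall(MatMatMult(A01, A10, MAT_INITIAL_MATRIX, PETSC_DEFAULT, &R)); /* R = A01 * A10 */
  PetscCall(MatDuplicate(A00, MAT_COPY_VALUES, A00reg));                  /* A00reg = A00  */
  PetscCall(MatAXPY(*A00reg, gamma, R, DIFFERENT_NONZERO_PATTERN));       /* += gamma * R  */
  PetscCall(MatDestroy(&R));
  PetscFunctionReturn(PETSC_SUCCESS);
}
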
URL: From hexiaofeng at buaa.edu.cn Thu Jun 19 20:18:34 2025 From: hexiaofeng at buaa.edu.cn (hexioafeng) Date: Fri, 20 Jun 2025 09:18:34 +0800 Subject: [petsc-users] Questions Regarding PETSc and Solving Constrained Structural Mechanics Problems In-Reply-To: References: <3BF3C1E8-0CB0-42F1-A624-8FA0DC7FD4A4@buaa.edu.cn> <35A61411-85CF-4F48-9DD6-0409F0CFE598@petsc.dev> <5CFF6556-4BDE-48D9-9D3A-6D8790465358@buaa.edu.cn> <87A02E48-DDE4-4DCD-8C52-D2DAF975EF01@buaa.edu.cn> <96AB5047-4A35-49A2-B948-86656A1CFB5B@buaa.edu.cn> Message-ID: <82A9172B-C0DE-45F8-8DBD-9F351389541B@buaa.edu.cn> Hello, Here are the outputs with svd: 0 KSP unpreconditioned resid norm 2.777777777778e+01 true resid norm 2.777777777778e+01 ||r(i)||/||b|| 1.000000000000e+00 Linear fieldsplit_0_mg_levels_1_ solve converged due to CONVERGED_ITS iterations 2 Linear fieldsplit_0_mg_levels_1_ solve converged due to CONVERGED_ITS iterations 2 Linear fieldsplit_1_ solve did not converge due to DIVERGED_PC_FAILED iterations 0 PC failed due to SUBPC_ERROR Linear fieldsplit_0_mg_levels_1_ solve converged due to CONVERGED_ITS iterations 2 Linear fieldsplit_0_mg_levels_1_ solve converged due to CONVERGED_ITS iterations 2 Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 PC failed due to SUBPC_ERROR KSP Object: 1 MPI processes type: cg maximum iterations=200, initial guess is zero tolerances: relative=1e-06, absolute=1e-12, divergence=1e+30 left preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: fieldsplit FieldSplit with Schur preconditioner, blocksize = 1, factorization FULL Preconditioner for the Schur complement formed from Sp, an assembled approximation to S, which uses A00's diagonal's inverse Split info: Split number 0 Defined by IS Split number 1 Defined by IS KSP solver for A00 block KSP Object: (fieldsplit_0_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_0_) 1 MPI processes type: gamg type is MULTIPLICATIVE, levels=2 cycles=v Cycles per PCApply=1 Using externally compute Galerkin coarse grid matrices GAMG specific options Threshold for dropping small values in graph on each level = Threshold scaling factor for each level not specified = 1. AGG specific options Symmetric graph false Number of levels to square graph 1 Number smoothing steps 1 Complexity: grid = 1.00222 Coarse grid solver -- level ------------------------------- KSP Object: (fieldsplit_0_mg_coarse_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_0_mg_coarse_) 1 MPI processes type: bjacobi number of blocks = 1 Local solver is the same for all blocks, as in the following KSP and PC objects on rank 0: KSP Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI processes type: svd All singular values smaller than 1e-12 treated as zero Provided essential rank of the matrix 0 (all other eigenvalues are zeroed) linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=8, cols=8 total: nonzeros=56, allocated nonzeros=56 total number of mallocs used during MatSetValues calls=0 using I-node routines: found 3 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: 1 MPI processes type: mpiaij rows=8, cols=8 total: nonzeros=56, allocated nonzeros=56 total number of mallocs used during MatSetValues calls=0 using nonscalable MatPtAP() implementation using I-node (on process 0) routines: found 3 nodes, limit used is 5 Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes type: chebyshev eigenvalue estimates used: min = 0.0998145, max = 1.09796 eigenvalues estimate via gmres min 0.00156735, max 0.998145 eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1] KSP Object: (fieldsplit_0_mg_levels_1_esteig_) 1 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=10, initial guess is zero tolerances: relative=1e-12, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test estimating eigenvalues using noisy right hand side maximum iterations=2, nonzero initial guess tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes type: sor type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. linear system matrix = precond matrix: Mat Object: (fieldsplit_0_) 1 MPI processes type: mpiaij rows=480, cols=480 total: nonzeros=25200, allocated nonzeros=25200 total number of mallocs used during MatSetValues calls=0 using I-node (on process 0) routines: found 160 nodes, limit used is 5 Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Mat Object: (fieldsplit_0_) 1 MPI processes type: mpiaij rows=480, cols=480 total: nonzeros=25200, allocated nonzeros=25200 total number of mallocs used during MatSetValues calls=0 using I-node (on process 0) routines: found 160 nodes, limit used is 5 KSP solver for S = A11 - A10 inv(A00) A01 KSP Object: (fieldsplit_1_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_1_) 1 MPI processes type: bjacobi number of blocks = 1 Local solver is the same for all blocks, as in the following KSP and PC objects on rank 0: KSP Object: (fieldsplit_1_sub_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_1_sub_) 1 MPI processes type: bjacobi number of blocks = 1 Local solver is the same for all blocks, as in the following KSP and PC objects on rank 0: KSP Object: (fieldsplit_1_sub_sub_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_1_sub_sub_) 1 MPI processes type: ilu out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 matrix ordering: natural factor fill ratio given 1., needed 1. Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=144, cols=144 package used to perform factorization: petsc total: nonzeros=240, allocated nonzeros=240 not using I-node routines linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=144, cols=144 total: nonzeros=240, allocated nonzeros=240 total number of mallocs used during MatSetValues calls=0 not using I-node routines linear system matrix = precond matrix: Mat Object: 1 MPI processes type: mpiaij rows=144, cols=144 total: nonzeros=240, allocated nonzeros=240 total number of mallocs used during MatSetValues calls=0 not using I-node (on process 0) routines linear system matrix followed by preconditioner matrix: Mat Object: (fieldsplit_1_) 1 MPI processes type: schurcomplement rows=144, cols=144 Schur complement A11 - A10 inv(A00) A01 A11 Mat Object: (fieldsplit_1_) 1 MPI processes type: mpiaij rows=144, cols=144 total: nonzeros=240, allocated nonzeros=240 total number of mallocs used during MatSetValues calls=0 not using I-node (on process 0) routines A10 Mat Object: 1 MPI processes type: mpiaij rows=144, cols=480 total: nonzeros=48, allocated nonzeros=48 total number of mallocs used during MatSetValues calls=0 using I-node (on process 0) routines: found 74 nodes, limit used is 5 KSP of A00 KSP Object: (fieldsplit_0_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_0_) 1 MPI processes type: gamg type is MULTIPLICATIVE, levels=2 cycles=v Cycles per PCApply=1 Using externally compute Galerkin coarse grid matrices GAMG specific options Threshold for dropping small values in graph on each level = Threshold scaling factor for each level not specified = 1. AGG specific options Symmetric graph false Number of levels to square graph 1 Number smoothing steps 1 Complexity: grid = 1.00222 Coarse grid solver -- level ------------------------------- KSP Object: (fieldsplit_0_mg_coarse_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_0_mg_coarse_) 1 MPI processes type: bjacobi number of blocks = 1 Local solver is the same for all blocks, as in the following KSP and PC objects on rank 0: KSP Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI processes type: svd All singular values smaller than 1e-12 treated as zero Provided essential rank of the matrix 0 (all other eigenvalues are zeroed) linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=8, cols=8 total: nonzeros=56, allocated nonzeros=56 total number of mallocs used during MatSetValues calls=0 using I-node routines: found 3 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: 1 MPI processes type: mpiaij rows=8, cols=8 total: nonzeros=56, allocated nonzeros=56 total number of mallocs used during MatSetValues calls=0 using nonscalable MatPtAP() implementation using I-node (on process 0) routines: found 3 nodes, limit used is 5 Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes type: chebyshev eigenvalue estimates used: min = 0.0998145, max = 1.09796 eigenvalues estimate via gmres min 0.00156735, max 0.998145 eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1] KSP Object: (fieldsplit_0_mg_levels_1_esteig_) 1 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=10, initial guess is zero tolerances: relative=1e-12, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test estimating eigenvalues using noisy right hand side maximum iterations=2, nonzero initial guess tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes type: sor type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. 
linear system matrix = precond matrix: Mat Object: (fieldsplit_0_) 1 MPI processes type: mpiaij rows=480, cols=480 total: nonzeros=25200, allocated nonzeros=25200 total number of mallocs used during MatSetValues calls=0 using I-node (on process 0) routines: found 160 nodes, limit used is 5 Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Mat Object: (fieldsplit_0_) 1 MPI processes type: mpiaij rows=480, cols=480 total: nonzeros=25200, allocated nonzeros=25200 total number of mallocs used during MatSetValues calls=0 using I-node (on process 0) routines: found 160 nodes, limit used is 5 A01 Mat Object: 1 MPI processes type: mpiaij rows=480, cols=144 total: nonzeros=48, allocated nonzeros=48 total number of mallocs used during MatSetValues calls=0 using I-node (on process 0) routines: found 135 nodes, limit used is 5 Mat Object: 1 MPI processes type: mpiaij rows=144, cols=144 total: nonzeros=240, allocated nonzeros=240 total number of mallocs used during MatSetValues calls=0 not using I-node (on process 0) routines linear system matrix = precond matrix: Mat Object: 1 MPI processes type: mpiaij rows=624, cols=624 total: nonzeros=25536, allocated nonzeros=25536 total number of mallocs used during MatSetValues calls=0 using I-node (on process 0) routines: found 336 nodes, limit used is 5 Thanks, Xiaofeng > On Jun 20, 2025, at 00:56, Mark Adams wrote: > > This is what Matt is looking at: > > PC Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI processes > type: lu > > This should be svd, not lu > > If you had used -options_left you would have caught this mistake(s) > > On Thu, Jun 19, 2025 at 8:06?AM Matthew Knepley > wrote: >> On Thu, Jun 19, 2025 at 7:59?AM hexioafeng > wrote: >>> Hello sir, >>> >>> I remove the duplicated "_type", and get the same error and output. >> >> The output cannot be the same. Please send it. >> >> Thanks, >> >> Matt >> >>> Best regards, >>> Xiaofeng >>> >>> >>>> On Jun 19, 2025, at 19:45, Matthew Knepley > wrote: >>>> >>>> This options is wrong >>>> >>>> -fieldsplit_0_mg_coarse_sub_pc_type_type svd >>>> >>>> Notice that "_type" is repeated. 
>>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> On Thu, Jun 19, 2025 at 7:10?AM hexioafeng > wrote: >>>>> Dear authors, >>>>> >>>>> Here are the options passed with fieldsplit preconditioner: >>>>> >>>>> -ksp_type cg -pc_type fieldsplit -pc_fieldsplit_detect_saddle_point -pc_fieldsplit_type schur -pc_fieldsplit_schur_precondition selfp -pc_fieldsplit_schur_fact_type full -fieldsplit_0_ksp_type preonly -fieldsplit_0_pc_type gamg -fieldsplit_0_mg_coarse_sub_pc_type_type svd -fieldsplit_1_ksp_type preonly -fieldsplit_1_pc_type bjacobi -ksp_view -ksp_monitor_true_residual -ksp_converged_reason -fieldsplit_0_mg_levels_ksp_monitor_true_residual -fieldsplit_0_mg_levels_ksp_converged_reason -fieldsplit_1_ksp_monitor_true_residual -fieldsplit_1_ksp_converged_reason >>>>> >>>>> and the output: >>>>> >>>>> 0 KSP unpreconditioned resid norm 2.777777777778e+01 true resid norm 2.777777777778e+01 ||r(i)||/||b|| 1.000000000000e+00 >>>>> Linear fieldsplit_0_mg_levels_1_ solve converged due to CONVERGED_ITS iterations 2 >>>>> Linear fieldsplit_0_mg_levels_1_ solve converged due to CONVERGED_ITS iterations 2 >>>>> Linear fieldsplit_1_ solve did not converge due to DIVERGED_PC_FAILED iterations 0 >>>>> PC failed due to SUBPC_ERROR >>>>> Linear fieldsplit_0_mg_levels_1_ solve converged due to CONVERGED_ITS iterations 2 >>>>> Linear fieldsplit_0_mg_levels_1_ solve converged due to CONVERGED_ITS iterations 2 >>>>> Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 >>>>> PC failed due to SUBPC_ERROR >>>>> KSP Object: 1 MPI processes >>>>> type: cg >>>>> maximum iterations=200, initial guess is zero >>>>> tolerances: relative=1e-06, absolute=1e-12, divergence=1e+30 >>>>> left preconditioning >>>>> using UNPRECONDITIONED norm type for convergence test >>>>> PC Object: 1 MPI processes >>>>> type: fieldsplit >>>>> FieldSplit with Schur preconditioner, blocksize = 1, factorization FULL >>>>> Preconditioner for the Schur complement formed from Sp, an assembled approximation to S, which uses A00's diagonal's inverse >>>>> Split info: >>>>> Split number 0 Defined by IS >>>>> Split number 1 Defined by IS >>>>> KSP solver for A00 block >>>>> KSP Object: (fieldsplit_0_) 1 MPI processes >>>>> type: preonly >>>>> maximum iterations=10000, initial guess is zero >>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>> left preconditioning >>>>> using NONE norm type for convergence test >>>>> PC Object: (fieldsplit_0_) 1 MPI processes >>>>> type: gamg >>>>> type is MULTIPLICATIVE, levels=2 cycles=v >>>>> Cycles per PCApply=1 >>>>> Using externally compute Galerkin coarse grid matrices >>>>> GAMG specific options >>>>> Threshold for dropping small values in graph on each level = >>>>> Threshold scaling factor for each level not specified = 1. >>>>> AGG specific options >>>>> Symmetric graph false >>>>> Number of levels to square graph 1 >>>>> Number smoothing steps 1 >>>>> Complexity: grid = 1.00222 >>>>> Coarse grid solver -- level ------------------------------- >>>>> KSP Object: (fieldsplit_0_mg_coarse_) 1 MPI processes >>>>> type: preonly >>>>> maximum iterations=10000, initial guess is zero >>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
>>>>> left preconditioning >>>>> using NONE norm type for convergence test >>>>> PC Object: (fieldsplit_0_mg_coarse_) 1 MPI processes >>>>> type: bjacobi >>>>> number of blocks = 1 >>>>> Local solver is the same for all blocks, as in the following KSP and PC objects on rank 0: >>>>> KSP Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI processes >>>>> type: preonly >>>>> maximum iterations=1, initial guess is zero >>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>> left preconditioning >>>>> using NONE norm type for convergence test >>>>> PC Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI processes >>>>> type: lu >>>>> out-of-place factorization >>>>> tolerance for zero pivot 2.22045e-14 >>>>> using diagonal shift on blocks to prevent zero pivot [INBLOCKS] >>>>> matrix ordering: nd >>>>> factor fill ratio given 5., needed 1. >>>>> Factored matrix follows: >>>>> Mat Object: 1 MPI processes >>>>> type: seqaij >>>>> rows=8, cols=8 >>>>> package used to perform factorization: petsc >>>>> total: nonzeros=56, allocated nonzeros=56 >>>>> using I-node routines: found 3 nodes, limit used is 5 >>>>> linear system matrix = precond matrix: >>>>> Mat Object: 1 MPI processes >>>>> type: seqaij >>>>> rows=8, cols=8 >>>>> total: nonzeros=56, allocated nonzeros=56 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> using I-node routines: found 3 nodes, limit used is 5 >>>>> linear system matrix = precond matrix: >>>>> Mat Object: 1 MPI processes >>>>> type: mpiaij >>>>> rows=8, cols=8 >>>>> total: nonzeros=56, allocated nonzeros=56 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> using nonscalable MatPtAP() implementation >>>>> using I-node (on process 0) routines: found 3 nodes, limit used is 5 >>>>> Down solver (pre-smoother) on level 1 ------------------------------- >>>>> KSP Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes >>>>> type: chebyshev >>>>> eigenvalue estimates used: min = 0.0998145, max = 1.09796 >>>>> eigenvalues estimate via gmres min 0.00156735, max 0.998145 >>>>> eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1] >>>>> KSP Object: (fieldsplit_0_mg_levels_1_esteig_) 1 MPI processes >>>>> type: gmres >>>>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>>>> happy breakdown tolerance 1e-30 >>>>> maximum iterations=10, initial guess is zero >>>>> tolerances: relative=1e-12, absolute=1e-50, divergence=10000. >>>>> left preconditioning >>>>> using PRECONDITIONED norm type for convergence test >>>>> estimating eigenvalues using noisy right hand side >>>>> maximum iterations=2, nonzero initial guess >>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>> left preconditioning >>>>> using NONE norm type for convergence test >>>>> PC Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes >>>>> type: sor >>>>> type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. 
>>>>> linear system matrix = precond matrix: >>>>> Mat Object: (fieldsplit_0_) 1 MPI processes >>>>> type: mpiaij >>>>> rows=480, cols=480 >>>>> total: nonzeros=25200, allocated nonzeros=25200 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> using I-node (on process 0) routines: found 160 nodes, limit used is 5 >>>>> Up solver (post-smoother) same as down solver (pre-smoother) >>>>> linear system matrix = precond matrix: >>>>> Mat Object: (fieldsplit_0_) 1 MPI processes >>>>> type: mpiaij >>>>> rows=480, cols=480 >>>>> total: nonzeros=25200, allocated nonzeros=25200 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> using I-node (on process 0) routines: found 160 nodes, limit used is 5 >>>>> KSP solver for S = A11 - A10 inv(A00) A01 >>>>> KSP Object: (fieldsplit_1_) 1 MPI processes >>>>> type: preonly >>>>> maximum iterations=10000, initial guess is zero >>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>> left preconditioning >>>>> using NONE norm type for convergence test >>>>> PC Object: (fieldsplit_1_) 1 MPI processes >>>>> type: bjacobi >>>>> number of blocks = 1 >>>>> Local solver is the same for all blocks, as in the following KSP and PC objects on rank 0: >>>>> KSP Object: (fieldsplit_1_sub_) 1 MPI processes >>>>> type: preonly >>>>> maximum iterations=10000, initial guess is zero >>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>> left preconditioning >>>>> using NONE norm type for convergence test >>>>> PC Object: (fieldsplit_1_sub_) 1 MPI processes >>>>> type: bjacobi >>>>> number of blocks = 1 >>>>> Local solver is the same for all blocks, as in the following KSP and PC objects on rank 0: >>>>> KSP Object: (fieldsplit_1_sub_sub_) 1 MPI processes >>>>> type: preonly >>>>> maximum iterations=10000, initial guess is zero >>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>> left preconditioning >>>>> using NONE norm type for convergence test >>>>> PC Object: (fieldsplit_1_sub_sub_) 1 MPI processes >>>>> type: ilu >>>>> out-of-place factorization >>>>> 0 levels of fill >>>>> tolerance for zero pivot 2.22045e-14 >>>>> matrix ordering: natural >>>>> factor fill ratio given 1., needed 1. 
>>>>> Factored matrix follows: >>>>> Mat Object: 1 MPI processes >>>>> type: seqaij >>>>> rows=144, cols=144 >>>>> package used to perform factorization: petsc >>>>> total: nonzeros=240, allocated nonzeros=240 >>>>> not using I-node routines >>>>> linear system matrix = precond matrix: >>>>> Mat Object: 1 MPI processes >>>>> type: seqaij >>>>> rows=144, cols=144 >>>>> total: nonzeros=240, allocated nonzeros=240 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> not using I-node routines >>>>> linear system matrix = precond matrix: >>>>> Mat Object: 1 MPI processes >>>>> type: mpiaij >>>>> rows=144, cols=144 >>>>> total: nonzeros=240, allocated nonzeros=240 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> not using I-node (on process 0) routines >>>>> linear system matrix followed by preconditioner matrix: >>>>> Mat Object: (fieldsplit_1_) 1 MPI processes >>>>> type: schurcomplement >>>>> rows=144, cols=144 >>>>> Schur complement A11 - A10 inv(A00) A01 >>>>> A11 >>>>> Mat Object: (fieldsplit_1_) 1 MPI processes >>>>> type: mpiaij >>>>> rows=144, cols=144 >>>>> total: nonzeros=240, allocated nonzeros=240 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> not using I-node (on process 0) routines >>>>> A10 >>>>> Mat Object: 1 MPI processes >>>>> type: mpiaij >>>>> rows=144, cols=480 >>>>> total: nonzeros=48, allocated nonzeros=48 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> using I-node (on process 0) routines: found 74 nodes, limit used is 5 >>>>> KSP of A00 >>>>> KSP Object: (fieldsplit_0_) 1 MPI processes >>>>> type: preonly >>>>> maximum iterations=10000, initial guess is zero >>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>> left preconditioning >>>>> using NONE norm type for convergence test >>>>> PC Object: (fieldsplit_0_) 1 MPI processes >>>>> type: gamg >>>>> type is MULTIPLICATIVE, levels=2 cycles=v >>>>> Cycles per PCApply=1 >>>>> Using externally compute Galerkin coarse grid matrices >>>>> GAMG specific options >>>>> Threshold for dropping small values in graph on each level = >>>>> Threshold scaling factor for each level not specified = 1. >>>>> AGG specific options >>>>> Symmetric graph false >>>>> Number of levels to square graph 1 >>>>> Number smoothing steps 1 >>>>> Complexity: grid = 1.00222 >>>>> Coarse grid solver -- level ------------------------------- >>>>> KSP Object: (fieldsplit_0_mg_coarse_) 1 MPI processes >>>>> type: preonly >>>>> maximum iterations=10000, initial guess is zero >>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>> left preconditioning >>>>> using NONE norm type for convergence test >>>>> PC Object: (fieldsplit_0_mg_coarse_) 1 MPI processes >>>>> type: bjacobi >>>>> number of blocks = 1 >>>>> Local solver is the same for all blocks, as in the following KSP and PC objects on rank 0: >>>>> KSP Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI processes >>>>> type: preonly >>>>> maximum iterations=1, initial guess is zero >>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>> left preconditioning >>>>> using NONE norm type for convergence test >>>>> PC Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI processes >>>>> type: lu >>>>> out-of-place factorization >>>>> tolerance for zero pivot 2.22045e-14 >>>>> using diagonal shift on blocks to prevent zero pivot [INBLOCKS] >>>>> matrix ordering: nd >>>>> factor fill ratio given 5., needed 1. 
>>>>> Factored matrix follows: >>>>> Mat Object: 1 MPI processes >>>>> type: seqaij >>>>> rows=8, cols=8 >>>>> package used to perform factorization: petsc >>>>> total: nonzeros=56, allocated nonzeros=56 >>>>> using I-node routines: found 3 nodes, limit used is 5 >>>>> linear system matrix = precond matrix: >>>>> Mat Object: 1 MPI processes >>>>> type: seqaij >>>>> rows=8, cols=8 >>>>> total: nonzeros=56, allocated nonzeros=56 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> using I-node routines: found 3 nodes, limit used is 5 >>>>> linear system matrix = precond matrix: >>>>> Mat Object: 1 MPI processes >>>>> type: mpiaij >>>>> rows=8, cols=8 >>>>> total: nonzeros=56, allocated nonzeros=56 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> using nonscalable MatPtAP() implementation >>>>> using I-node (on process 0) routines: found 3 nodes, limit used is 5 >>>>> Down solver (pre-smoother) on level 1 ------------------------------- >>>>> KSP Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes >>>>> type: chebyshev >>>>> eigenvalue estimates used: min = 0.0998145, max = 1.09796 >>>>> eigenvalues estimate via gmres min 0.00156735, max 0.998145 >>>>> eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1] >>>>> KSP Object: (fieldsplit_0_mg_levels_1_esteig_) 1 MPI processes >>>>> type: gmres >>>>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>>>> happy breakdown tolerance 1e-30 >>>>> maximum iterations=10, initial guess is zero >>>>> tolerances: relative=1e-12, absolute=1e-50, divergence=10000. >>>>> left preconditioning >>>>> using PRECONDITIONED norm type for convergence test >>>>> estimating eigenvalues using noisy right hand side >>>>> maximum iterations=2, nonzero initial guess >>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>> left preconditioning >>>>> using NONE norm type for convergence test >>>>> PC Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes >>>>> type: sor >>>>> type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. 
>>>>> linear system matrix = precond matrix: >>>>> Mat Object: (fieldsplit_0_) 1 MPI processes >>>>> type: mpiaij >>>>> rows=480, cols=480 >>>>> total: nonzeros=25200, allocated nonzeros=25200 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> using I-node (on process 0) routines: found 160 nodes, limit used is 5 >>>>> Up solver (post-smoother) same as down solver (pre-smoother) >>>>> linear system matrix = precond matrix: >>>>> Mat Object: (fieldsplit_0_) 1 MPI processes >>>>> type: mpiaij >>>>> rows=480, cols=480 >>>>> total: nonzeros=25200, allocated nonzeros=25200 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> using I-node (on process 0) routines: found 160 nodes, limit used is 5 >>>>> A01 >>>>> Mat Object: 1 MPI processes >>>>> type: mpiaij >>>>> rows=480, cols=144 >>>>> total: nonzeros=48, allocated nonzeros=48 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> using I-node (on process 0) routines: found 135 nodes, limit used is 5 >>>>> Mat Object: 1 MPI processes >>>>> type: mpiaij >>>>> rows=144, cols=144 >>>>> total: nonzeros=240, allocated nonzeros=240 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> not using I-node (on process 0) routines >>>>> linear system matrix = precond matrix: >>>>> Mat Object: 1 MPI processes >>>>> type: mpiaij >>>>> rows=624, cols=624 >>>>> total: nonzeros=25536, allocated nonzeros=25536 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> using I-node (on process 0) routines: found 336 nodes, limit used is 5 >>>>> >>>>> >>>>> Thanks, >>>>> Xiaofeng >>>>> >>>>> >>>>> >>>>>> On Jun 17, 2025, at 19:05, Mark Adams > wrote: >>>>>> >>>>>> And don't use -pc_gamg_parallel_coarse_grid_solver >>>>>> You can use that in production but for debugging use -mg_coarse_pc_type svd >>>>>> Also, use -options_left and remove anything that is not used. 
>>>>>> (I am puzzled, I see -pc_type gamg not -pc_type fieldsplit) >>>>>> >>>>>> Mark >>>>>> >>>>>> >>>>>> On Mon, Jun 16, 2025 at 6:40?AM Matthew Knepley > wrote: >>>>>>> On Sun, Jun 15, 2025 at 9:46?PM hexioafeng > wrote: >>>>>>>> Hello, >>>>>>>> >>>>>>>> Here are the options and outputs: >>>>>>>> >>>>>>>> options: >>>>>>>> >>>>>>>> -ksp_type cg -pc_type gamg -pc_gamg_parallel_coarse_grid_solver -pc_fieldsplit_detect_saddle_point -pc_fieldsplit_type schur -pc_fieldsplit_schur_precondition selfp -fieldsplit_1_mat_schur_complement_ainv_type lump -pc_fieldsplit_schur_fact_type full -fieldsplit_0_ksp_type preonly -fieldsplit_0_pc_type gamg -fieldsplit_0_mg_coarse_pc_type_type svd -fieldsplit_1_ksp_type preonly -fieldsplit_1_pc_type bjacobi -fieldsplit_1_sub_pc_type sor -ksp_view -ksp_monitor_true_residual -ksp_converged_reason -fieldsplit_0_mg_levels_ksp_monitor_true_residual -fieldsplit_0_mg_levels_ksp_converged_reason -fieldsplit_1_ksp_monitor_true_residual -fieldsplit_1_ksp_converged_reason >>>>>>> >>>>>>> This option was wrong: >>>>>>> >>>>>>> -fieldsplit_0_mg_coarse_pc_type_type svd >>>>>>> >>>>>>> from the output, we can see that it should have been >>>>>>> >>>>>>> -fieldsplit_0_mg_coarse_sub_pc_type_type svd >>>>>>> >>>>>>> THanks, >>>>>>> >>>>>>> Matt >>>>>>> >>>>>>>> output: >>>>>>>> >>>>>>>> 0 KSP unpreconditioned resid norm 2.777777777778e+01 true resid norm 2.777777777778e+01 ||r(i)||/||b|| 1.000000000000e+00 >>>>>>>> Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 >>>>>>>> PC failed due to SUBPC_ERROR >>>>>>>> KSP Object: 1 MPI processes >>>>>>>> type: cg >>>>>>>> maximum iterations=200, initial guess is zero >>>>>>>> tolerances: relative=1e-06, absolute=1e-12, divergence=1e+30 >>>>>>>> left preconditioning >>>>>>>> using UNPRECONDITIONED norm type for convergence test >>>>>>>> PC Object: 1 MPI processes >>>>>>>> type: gamg >>>>>>>> type is MULTIPLICATIVE, levels=2 cycles=v >>>>>>>> Cycles per PCApply=1 >>>>>>>> Using externally compute Galerkin coarse grid matrices >>>>>>>> GAMG specific options >>>>>>>> Threshold for dropping small values in graph on each level = >>>>>>>> Threshold scaling factor for each level not specified = 1. >>>>>>>> AGG specific options >>>>>>>> Symmetric graph false >>>>>>>> Number of levels to square graph 1 >>>>>>>> Number smoothing steps 1 >>>>>>>> Complexity: grid = 1.00176 >>>>>>>> Coarse grid solver -- level ------------------------------- >>>>>>>> KSP Object: (mg_coarse_) 1 MPI processes >>>>>>>> type: preonly >>>>>>>> maximum iterations=10000, initial guess is zero >>>>>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>>>>> left preconditioning >>>>>>>> using NONE norm type for convergence test >>>>>>>> PC Object: (mg_coarse_) 1 MPI processes >>>>>>>> type: bjacobi >>>>>>>> number of blocks = 1 >>>>>>>> Local solver is the same for all blocks, as in the following KSP and PC objects on rank 0: >>>>>>>> KSP Object: (mg_coarse_sub_) 1 MPI processes >>>>>>>> type: preonly >>>>>>>> maximum iterations=1, initial guess is zero >>>>>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>>>>> left preconditioning >>>>>>>> using NONE norm type for convergence test >>>>>>>> PC Object: (mg_coarse_sub_) 1 MPI processes >>>>>>>> type: lu >>>>>>>> out-of-place factorization >>>>>>>> tolerance for zero pivot 2.22045e-14 >>>>>>>> using diagonal shift on blocks to prevent zero pivot [INBLOCKS] >>>>>>>> matrix ordering: nd >>>>>>>> factor fill ratio given 5., needed 1. 
>>>>>>>> Factored matrix follows: >>>>>>>> Mat Object: 1 MPI processes >>>>>>>> type: seqaij >>>>>>>> rows=7, cols=7 >>>>>>>> package used to perform factorization: petsc >>>>>>>> total: nonzeros=45, allocated nonzeros=45 >>>>>>>> using I-node routines: found 3 nodes, limit used is 5 >>>>>>>> linear system matrix = precond matrix: >>>>>>>> Mat Object: 1 MPI processes >>>>>>>> type: seqaij >>>>>>>> rows=7, cols=7 >>>>>>>> total: nonzeros=45, allocated nonzeros=45 >>>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>>> using I-node routines: found 3 nodes, limit used is 5 >>>>>>>> linear system matrix = precond matrix: >>>>>>>> Mat Object: 1 MPI processes >>>>>>>> type: mpiaij >>>>>>>> rows=7, cols=7 >>>>>>>> total: nonzeros=45, allocated nonzeros=45 >>>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>>> using nonscalable MatPtAP() implementation >>>>>>>> using I-node (on process 0) routines: found 3 nodes, limit used is 5 >>>>>>>> Down solver (pre-smoother) on level 1 ------------------------------- >>>>>>>> KSP Object: (mg_levels_1_) 1 MPI processes >>>>>>>> type: chebyshev >>>>>>>> eigenvalue estimates used: min = 0., max = 0. >>>>>>>> eigenvalues estimate via gmres min 0., max 0. >>>>>>>> eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1] >>>>>>>> KSP Object: (mg_levels_1_esteig_) 1 MPI processes >>>>>>>> type: gmres >>>>>>>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>>>>>>> happy breakdown tolerance 1e-30 >>>>>>>> maximum iterations=10, initial guess is zero >>>>>>>> tolerances: relative=1e-12, absolute=1e-50, divergence=10000. >>>>>>>> left preconditioning >>>>>>>> using PRECONDITIONED norm type for convergence test >>>>>>>> PC Object: (mg_levels_1_) 1 MPI processes >>>>>>>> type: sor >>>>>>>> type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. >>>>>>>> linear system matrix = precond matrix: >>>>>>>> Mat Object: 1 MPI processes >>>>>>>> type: mpiaij >>>>>>>> rows=624, cols=624 >>>>>>>> total: nonzeros=25536, allocated nonzeros=25536 >>>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>>> using I-node (on process 0) routines: found 336 nodes, limit used is 5 >>>>>>>> estimating eigenvalues using noisy right hand side >>>>>>>> maximum iterations=2, nonzero initial guess >>>>>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>>>>> left preconditioning >>>>>>>> using NONE norm type for convergence test >>>>>>>> PC Object: (mg_levels_1_) 1 MPI processes >>>>>>>> type: sor >>>>>>>> type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. 
linear system matrix = precond matrix: >>>>>>>> Mat Object: 1 MPI processes >>>>>>>> type: mpiaij >>>>>>>> rows=624, cols=624 >>>>>>>> total: nonzeros=25536, allocated nonzeros=25536 >>>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>>> using I-node (on process 0) routines: found 336 nodes, limit used is 5 Up solver (post-smoother) same as down solver (pre-smoother) >>>>>>>> linear system matrix = precond matrix: >>>>>>>> Mat Object: 1 MPI processes >>>>>>>> type: mpiaij >>>>>>>> rows=624, cols=624 >>>>>>>> total: nonzeros=25536, allocated nonzeros=25536 >>>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>>> using I-node (on process 0) routines: found 336 nodes, limit used is 5 >>>>>>>> >>>>>>>> >>>>>>>> Best regards, >>>>>>>> >>>>>>>> Xiaofeng >>>>>>>> >>>>>>>> >>>>>>>>> On Jun 14, 2025, at 07:28, Barry Smith > wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> Matt, >>>>>>>>> >>>>>>>>> Perhaps we should add options -ksp_monitor_debug and -snes_monitor_debug that turn on all possible monitoring for the (possibly) nested solvers and all of their converged reasons also? Note this is not completely trivial because each preconditioner will have to supply its list based on the current solver options for it. >>>>>>>>> >>>>>>>>> Then we won't need to constantly list a big string of problem specific monitor options to ask the user to use. >>>>>>>>> >>>>>>>>> Barry >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>>> On Jun 13, 2025, at 9:09?AM, Matthew Knepley > wrote: >>>>>>>>>> >>>>>>>>>> On Thu, Jun 12, 2025 at 10:55?PM hexioafeng > wrote: >>>>>>>>>>> Dear authors, >>>>>>>>>>> >>>>>>>>>>> I tried -pc_type game -pc_gamg_parallel_coarse_grid_solver and -pc_type field split -pc_fieldsplit_detect_saddle_point -fieldsplit_0_ksp_type pronely -fieldsplit_0_pc_type game -fieldsplit_0_mg_coarse_pc_type sad -fieldsplit_1_ksp_type pronely -fieldsplit_1_pc_type Jacobi _fieldsplit_1_sub_pc_type for , both options got the KSP_DIVERGE_PC_FAILED error. >>>>>>>>>> >>>>>>>>>> With any question about convergence, we need to see the output of >>>>>>>>>> >>>>>>>>>> -ksp_view -ksp_monitor_true_residual -ksp_converged_reason -fieldsplit_0_mg_levels_ksp_monitor_true_residual -fieldsplit_0_mg_levels_ksp_converged_reason -fieldsplit_1_ksp_monitor_true_residual -fieldsplit_1_ksp_converged_reason >>>>>>>>>> >>>>>>>>>> and all the error output. >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> Matt >>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> >>>>>>>>>>> Xiaofeng >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> On Jun 12, 2025, at 20:50, Mark Adams > wrote: >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Thu, Jun 12, 2025 at 8:44?AM Matthew Knepley > wrote: >>>>>>>>>>>>> On Thu, Jun 12, 2025 at 4:58?AM Mark Adams > wrote: >>>>>>>>>>>>>> Adding this to the PETSc mailing list, >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Thu, Jun 12, 2025 at 3:43?AM hexioafeng > wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Dear Professor, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I hope this message finds you well. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I am an employee at a CAE company and a heavy user of the PETSc library. I would like to thank you for your contributions to PETSc and express my deep appreciation for your work. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Recently, I encountered some difficulties when using PETSc to solve structural mechanics problems with Lagrange multiplier constraints. 
After searching extensively online and reviewing several papers, I found your previous paper titled "Algebraic multigrid methods for constrained linear systems with applications to contact problems in solid mechanics" seems to be the most relevant and helpful. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> The stiffness matrix I'm working with, K, is a block saddle-point matrix of the form (A00 A01; A10 0), where A00 is singular?just as described in your paper, and different from many other articles . I have a few questions regarding your work and would greatly appreciate your insights: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 1. Is the AMG/KKT method presented in your paper available in PETSc? I tried using CG+GAMG directly but received a KSP_DIVERGED_PC_FAILED error. I also attempted to use CG+PCFIELDSPLIT with the following options: >>>>>>>>>>>>>> >>>>>>>>>>>>>> No >>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -pc_type fieldsplit -pc_fieldsplit_detect_saddle_point -pc_fieldsplit_type schur -pc_fieldsplit_schur_precondition selfp -pc_fieldsplit_schur_fact_type full -fieldsplit_0_ksp_type preonly -fieldsplit_0_pc_type gamg -fieldsplit_1_ksp_type preonly -fieldsplit_1_pc_type bjacobi >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Unfortunately, this also resulted in a KSP_DIVERGED_PC_FAILED error. Do you have any suggestions? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 2. In your paper, you compare the method with Uzawa-type approaches. To my understanding, Uzawa methods typically require A00 to be invertible. How did you handle the singularity of A00 to construct an M-matrix that is invertible? >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> You add a regularization term like A01 * A10 (like springs). See the paper or any reference to augmented lagrange or Uzawa >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> 3. Can i implement the AMG/KKT method in your paper using existing AMG APIs? Implementing a production-level AMG solver from scratch would be quite challenging for me, so I?m hoping to utilize existing AMG interfaces within PETSc or other packages. >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> You can do Uzawa and make the regularization matrix with matrix-matrix products. Just use AMG for the A00 block. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> 4. For saddle-point systems where A00 is singular, can you recommend any more robust or efficient solutions? Alternatively, are you aware of any open-source software packages that can handle such cases out-of-the-box? >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> No, and I don't think PETSc can do this out-of-the-box, but others may be able to give you a better idea of what PETSc can do. >>>>>>>>>>>>>> I think PETSc can do Uzawa or other similar algorithms but it will not do the regularization automatically (it is a bit more complicated than just A01 * A10) >>>>>>>>>>>>> >>>>>>>>>>>>> One other trick you can use is to have >>>>>>>>>>>>> >>>>>>>>>>>>> -fieldsplit_0_mg_coarse_pc_type svd >>>>>>>>>>>>> >>>>>>>>>>>>> This will use SVD on the coarse grid of GAMG, which can handle the null space in A00 as long as the prolongation does not put it back in. I have used this for the Laplacian with Neumann conditions and for freely floating elastic problems. >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Good point. >>>>>>>>>>>> You can also use -pc_gamg_parallel_coarse_grid_solver to get GAMG to use a on level iterative solver for the coarse grid. 
>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> >>>>>>>>>>>>> Matt >>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> Mark >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thank you very much for taking the time to read my email. Looking forward to hearing from you. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Sincerely, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Xiaofeng He >>>>>>>>>>>>>>> ----------------------------------------------------- >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Research Engineer >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Internet Based Engineering, Beijing, China >>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> -- >>>>>>>>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>>>>>>>>> -- Norbert Wiener >>>>>>>>>>>>> >>>>>>>>>>>>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!aDVdbJudFjso7jHS1DFJhOOh76fS5qSwrvzSb1t5hCFZ2rdj3XFfsHLlDjne7FWFjeweHRXBAbd-5KPlb0st3T_PyxNcNQ$ >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>>>>>> -- Norbert Wiener >>>>>>>>>> >>>>>>>>>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!aDVdbJudFjso7jHS1DFJhOOh76fS5qSwrvzSb1t5hCFZ2rdj3XFfsHLlDjne7FWFjeweHRXBAbd-5KPlb0st3T_PyxNcNQ$ >>>>>>>>> >>>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>>> -- Norbert Wiener >>>>>>> >>>>>>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!aDVdbJudFjso7jHS1DFJhOOh76fS5qSwrvzSb1t5hCFZ2rdj3XFfsHLlDjne7FWFjeweHRXBAbd-5KPlb0st3T_PyxNcNQ$ >>>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!aDVdbJudFjso7jHS1DFJhOOh76fS5qSwrvzSb1t5hCFZ2rdj3XFfsHLlDjne7FWFjeweHRXBAbd-5KPlb0st3T_PyxNcNQ$ >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!aDVdbJudFjso7jHS1DFJhOOh76fS5qSwrvzSb1t5hCFZ2rdj3XFfsHLlDjne7FWFjeweHRXBAbd-5KPlb0st3T_PyxNcNQ$ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Thu Jun 19 21:07:28 2025 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 19 Jun 2025 22:07:28 -0400 Subject: [petsc-users] Questions Regarding PETSc and Solving Constrained Structural Mechanics Problems In-Reply-To: <82A9172B-C0DE-45F8-8DBD-9F351389541B@buaa.edu.cn> References: <3BF3C1E8-0CB0-42F1-A624-8FA0DC7FD4A4@buaa.edu.cn> <35A61411-85CF-4F48-9DD6-0409F0CFE598@petsc.dev> <5CFF6556-4BDE-48D9-9D3A-6D8790465358@buaa.edu.cn> <87A02E48-DDE4-4DCD-8C52-D2DAF975EF01@buaa.edu.cn> <96AB5047-4A35-49A2-B948-86656A1CFB5B@buaa.edu.cn> <82A9172B-C0DE-45F8-8DBD-9F351389541B@buaa.edu.cn> Message-ID: On Thu, Jun 19, 2025 at 9:18?PM hexioafeng wrote: > Hello, > > Here are the outputs with svd: > > 0 KSP unpreconditioned resid norm 2.777777777778e+01 true resid norm > 2.777777777778e+01 ||r(i)||/||b|| 1.000000000000e+00 > Linear fieldsplit_0_mg_levels_1_ solve converged due to CONVERGED_ITS > iterations 2 > Linear fieldsplit_0_mg_levels_1_ solve converged due to CONVERGED_ITS > iterations 2 > Linear fieldsplit_1_ solve did not converge due to DIVERGED_PC_FAILED > You are running ILU(0) on your Schur complement, but it looks like it is rank-deficient. You will have to use something that works for that (like maybe GAMG again with SVD on the coarse grid). Is S elliptic? Thanks, Matt > iterations 0 > PC failed due to SUBPC_ERROR > Linear fieldsplit_0_mg_levels_1_ solve converged due to CONVERGED_ITS > iterations 2 > Linear fieldsplit_0_mg_levels_1_ solve converged due to CONVERGED_ITS > iterations 2 > Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 > PC failed due to SUBPC_ERROR > KSP Object: 1 MPI processes > type: cg > maximum iterations=200, initial guess is zero > tolerances: relative=1e-06, absolute=1e-12, divergence=1e+30 > left preconditioning > using UNPRECONDITIONED norm type for convergence test > PC Object: 1 MPI processes > type: fieldsplit > FieldSplit with Schur preconditioner, blocksize = 1, factorization FULL > Preconditioner for the Schur complement formed from Sp, an assembled > approximation to S, which uses A00's diagonal's inverse > Split info: > Split number 0 Defined by IS > Split number 1 Defined by IS > KSP solver for A00 block > KSP Object: (fieldsplit_0_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_0_) 1 MPI processes > type: gamg > type is MULTIPLICATIVE, levels=2 cycles=v > Cycles per PCApply=1 > Using externally compute Galerkin coarse grid matrices > GAMG specific options > Threshold for dropping small values in graph on each level = > > Threshold scaling factor for each level not specified = 1. > AGG specific options > Symmetric graph false > Number of levels to square graph 1 > Number smoothing steps 1 > Complexity: grid = 1.00222 > Coarse grid solver -- level ------------------------------- > KSP Object: (fieldsplit_0_mg_coarse_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_0_mg_coarse_) 1 MPI processes > type: bjacobi > number of blocks = 1 > Local solver is the same for all blocks, as in the following > KSP and PC objects on rank 0: > KSP Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, > divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI processes > type: svd > All singular values smaller than 1e-12 treated as zero > Provided essential rank of the matrix 0 (all other > eigenvalues are zeroed) > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=8, cols=8 > total: nonzeros=56, allocated nonzeros=56 > total number of mallocs used during MatSetValues calls=0 > using I-node routines: found 3 nodes, limit used is 5 > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: mpiaij > rows=8, cols=8 > total: nonzeros=56, allocated nonzeros=56 > total number of mallocs used during MatSetValues calls=0 > using nonscalable MatPtAP() implementation > using I-node (on process 0) routines: found 3 nodes, limit > used is 5 > Down solver (pre-smoother) on level 1 > ------------------------------- > KSP Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes > type: chebyshev > eigenvalue estimates used: min = 0.0998145, max = 1.09796 > eigenvalues estimate via gmres min 0.00156735, max 0.998145 > eigenvalues estimated using gmres with translations [0. > 0.1; 0. 1.1] > KSP Object: (fieldsplit_0_mg_levels_1_esteig_) 1 MPI > processes > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=10, initial guess is zero > tolerances: relative=1e-12, absolute=1e-50, > divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > estimating eigenvalues using noisy right hand side > maximum iterations=2, nonzero initial guess > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes > type: sor > type = local_symmetric, iterations = 1, local iterations = > 1, omega = 1. > linear system matrix = precond matrix: > Mat Object: (fieldsplit_0_) 1 MPI processes > type: mpiaij > rows=480, cols=480 > total: nonzeros=25200, allocated nonzeros=25200 > total number of mallocs used during MatSetValues calls=0 > using I-node (on process 0) routines: found 160 nodes, > limit used is 5 > Up solver (post-smoother) same as down solver (pre-smoother) > linear system matrix = precond matrix: > Mat Object: (fieldsplit_0_) 1 MPI processes > type: mpiaij > rows=480, cols=480 > total: nonzeros=25200, allocated nonzeros=25200 > total number of mallocs used during MatSetValues calls=0 > using I-node (on process 0) routines: found 160 nodes, limit > used is 5 > KSP solver for S = A11 - A10 inv(A00) A01 > KSP Object: (fieldsplit_1_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_1_) 1 MPI processes > type: bjacobi > number of blocks = 1 > Local solver is the same for all blocks, as in the following KSP > and PC objects on rank 0: > KSP Object: (fieldsplit_1_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_1_sub_) 1 MPI processes > type: bjacobi > number of blocks = 1 > Local solver is the same for all blocks, as in the following > KSP and PC objects on rank 0: > KSP Object: (fieldsplit_1_sub_sub_) > 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, > divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_1_sub_sub_) > 1 MPI processes > type: ilu > out-of-place factorization > 0 levels of fill > tolerance for zero pivot 2.22045e-14 > matrix ordering: natural > factor fill ratio given 1., needed 1. > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=144, cols=144 > package used to perform factorization: petsc > total: nonzeros=240, allocated nonzeros=240 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=144, cols=144 > total: nonzeros=240, allocated nonzeros=240 > total number of mallocs used during MatSetValues > calls=0 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: mpiaij > rows=144, cols=144 > total: nonzeros=240, allocated nonzeros=240 > total number of mallocs used during MatSetValues calls=0 > not using I-node (on process 0) routines > linear system matrix followed by preconditioner matrix: > Mat Object: (fieldsplit_1_) 1 MPI processes > type: schurcomplement > rows=144, cols=144 > Schur complement A11 - A10 inv(A00) A01 > A11 > Mat Object: (fieldsplit_1_) 1 MPI processes > type: mpiaij > rows=144, cols=144 > total: nonzeros=240, allocated nonzeros=240 > total number of mallocs used during MatSetValues calls=0 > not using I-node (on process 0) routines > A10 > Mat Object: 1 MPI processes > type: mpiaij > rows=144, cols=480 > total: nonzeros=48, allocated nonzeros=48 > total number of mallocs used during MatSetValues calls=0 > using I-node (on process 0) routines: found 74 nodes, > limit used is 5 > KSP of A00 > KSP Object: (fieldsplit_0_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, > divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_0_) 1 MPI processes > type: gamg > type is MULTIPLICATIVE, levels=2 cycles=v > Cycles per PCApply=1 > Using externally compute Galerkin coarse grid matrices > GAMG specific options > Threshold for dropping small values in graph on each > level = > Threshold scaling factor for each level not > specified = 1. > AGG specific options > Symmetric graph false > Number of levels to square graph 1 > Number smoothing steps 1 > Complexity: grid = 1.00222 > Coarse grid solver -- level ------------------------------- > KSP Object: (fieldsplit_0_mg_coarse_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, > divergence=10000. 
> left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_0_mg_coarse_) 1 MPI processes > type: bjacobi > number of blocks = 1 > Local solver is the same for all blocks, as in the > following KSP and PC objects on rank 0: > KSP Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI > processes > type: preonly > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, > divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI > processes > type: svd > All singular values smaller than 1e-12 treated as > zero > Provided essential rank of the matrix 0 (all other > eigenvalues are zeroed) > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=8, cols=8 > total: nonzeros=56, allocated nonzeros=56 > total number of mallocs used during MatSetValues > calls=0 > using I-node routines: found 3 nodes, limit used > is 5 > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: mpiaij > rows=8, cols=8 > total: nonzeros=56, allocated nonzeros=56 > total number of mallocs used during MatSetValues > calls=0 > using nonscalable MatPtAP() implementation > using I-node (on process 0) routines: found 3 > nodes, limit used is 5 > Down solver (pre-smoother) on level 1 > ------------------------------- > KSP Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes > type: chebyshev > eigenvalue estimates used: min = 0.0998145, max = > 1.09796 > eigenvalues estimate via gmres min 0.00156735, max > 0.998145 > eigenvalues estimated using gmres with translations > [0. 0.1; 0. 1.1] > KSP Object: (fieldsplit_0_mg_levels_1_esteig_) 1 MPI > processes > type: gmres > restart=30, using Classical (unmodified) > Gram-Schmidt Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=10, initial guess is zero > tolerances: relative=1e-12, absolute=1e-50, > divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > estimating eigenvalues using noisy right hand side > maximum iterations=2, nonzero initial guess > tolerances: relative=1e-05, absolute=1e-50, > divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes > type: sor > type = local_symmetric, iterations = 1, local > iterations = 1, omega = 1. 
> linear system matrix = precond matrix: > Mat Object: (fieldsplit_0_) 1 MPI processes > type: mpiaij > rows=480, cols=480 > total: nonzeros=25200, allocated nonzeros=25200 > total number of mallocs used during MatSetValues > calls=0 > using I-node (on process 0) routines: found 160 > nodes, limit used is 5 > Up solver (post-smoother) same as down solver > (pre-smoother) > linear system matrix = precond matrix: > Mat Object: (fieldsplit_0_) 1 MPI processes > type: mpiaij > rows=480, cols=480 > total: nonzeros=25200, allocated nonzeros=25200 > total number of mallocs used during MatSetValues calls=0 > using I-node (on process 0) routines: found 160 nodes, > limit used is 5 > A01 > Mat Object: 1 MPI processes > type: mpiaij > rows=480, cols=144 > total: nonzeros=48, allocated nonzeros=48 > total number of mallocs used during MatSetValues calls=0 > using I-node (on process 0) routines: found 135 nodes, > limit used is 5 > Mat Object: 1 MPI processes > type: mpiaij > rows=144, cols=144 > total: nonzeros=240, allocated nonzeros=240 > total number of mallocs used during MatSetValues calls=0 > not using I-node (on process 0) routines > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: mpiaij > rows=624, cols=624 > total: nonzeros=25536, allocated nonzeros=25536 > total number of mallocs used during MatSetValues calls=0 > using I-node (on process 0) routines: found 336 nodes, limit used is > 5 > > > Thanks, > Xiaofeng > > > > On Jun 20, 2025, at 00:56, Mark Adams wrote: > > This is what Matt is looking at: > > PC Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI processes > type: lu > > This should be svd, not lu > > If you had used -options_left you would have caught this mistake(s) > > On Thu, Jun 19, 2025 at 8:06?AM Matthew Knepley wrote: > >> On Thu, Jun 19, 2025 at 7:59?AM hexioafeng >> wrote: >> >>> Hello sir, >>> >>> I remove the duplicated "_type", and get the same error and output. >>> >> >> The output cannot be the same. Please send it. >> >> Thanks, >> >> Matt >> >> >>> Best regards, >>> Xiaofeng >>> >>> >>> On Jun 19, 2025, at 19:45, Matthew Knepley wrote: >>> >>> This options is wrong >>> >>> -fieldsplit_0_mg_coarse_sub_pc_type_type svd >>> >>> Notice that "_type" is repeated. 
>>> >>> Thanks, >>> >>> Matt >>> >>> On Thu, Jun 19, 2025 at 7:10?AM hexioafeng >>> wrote: >>> >>>> Dear authors, >>>> >>>> Here are the options passed with fieldsplit preconditioner: >>>> >>>> -ksp_type cg -pc_type fieldsplit -pc_fieldsplit_detect_saddle_point >>>> -pc_fieldsplit_type schur -pc_fieldsplit_schur_precondition selfp >>>> -pc_fieldsplit_schur_fact_type full -fieldsplit_0_ksp_type preonly >>>> -fieldsplit_0_pc_type gamg -fieldsplit_0_mg_coarse_sub_pc_type_type svd >>>> -fieldsplit_1_ksp_type preonly -fieldsplit_1_pc_type bjacobi -ksp_view >>>> -ksp_monitor_true_residual -ksp_converged_reason >>>> -fieldsplit_0_mg_levels_ksp_monitor_true_residual >>>> -fieldsplit_0_mg_levels_ksp_converged_reason >>>> -fieldsplit_1_ksp_monitor_true_residual >>>> -fieldsplit_1_ksp_converged_reason >>>> >>>> and the output: >>>> >>>> 0 KSP unpreconditioned resid norm 2.777777777778e+01 true resid norm >>>> 2.777777777778e+01 ||r(i)||/||b|| 1.000000000000e+00 >>>> Linear fieldsplit_0_mg_levels_1_ solve converged due to >>>> CONVERGED_ITS iterations 2 >>>> Linear fieldsplit_0_mg_levels_1_ solve converged due to >>>> CONVERGED_ITS iterations 2 >>>> Linear fieldsplit_1_ solve did not converge due to DIVERGED_PC_FAILED >>>> iterations 0 >>>> PC failed due to SUBPC_ERROR >>>> Linear fieldsplit_0_mg_levels_1_ solve converged due to >>>> CONVERGED_ITS iterations 2 >>>> Linear fieldsplit_0_mg_levels_1_ solve converged due to >>>> CONVERGED_ITS iterations 2 >>>> Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 >>>> PC failed due to SUBPC_ERROR >>>> KSP Object: 1 MPI processes >>>> type: cg >>>> maximum iterations=200, initial guess is zero >>>> tolerances: relative=1e-06, absolute=1e-12, divergence=1e+30 >>>> left preconditioning >>>> using UNPRECONDITIONED norm type for convergence test >>>> PC Object: 1 MPI processes >>>> type: fieldsplit >>>> FieldSplit with Schur preconditioner, blocksize = 1, factorization >>>> FULL >>>> Preconditioner for the Schur complement formed from Sp, an >>>> assembled approximation to S, which uses A00's diagonal's inverse >>>> Split info: >>>> Split number 0 Defined by IS >>>> Split number 1 Defined by IS >>>> KSP solver for A00 block >>>> KSP Object: (fieldsplit_0_) 1 MPI processes >>>> type: preonly >>>> maximum iterations=10000, initial guess is zero >>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>> left preconditioning >>>> using NONE norm type for convergence test >>>> PC Object: (fieldsplit_0_) 1 MPI processes >>>> type: gamg >>>> type is MULTIPLICATIVE, levels=2 cycles=v >>>> Cycles per PCApply=1 >>>> Using externally compute Galerkin coarse grid matrices >>>> GAMG specific options >>>> Threshold for dropping small values in graph on each >>>> level = >>>> Threshold scaling factor for each level not specified = 1. >>>> AGG specific options >>>> Symmetric graph false >>>> Number of levels to square graph 1 >>>> Number smoothing steps 1 >>>> Complexity: grid = 1.00222 >>>> Coarse grid solver -- level ------------------------------- >>>> KSP Object: (fieldsplit_0_mg_coarse_) 1 MPI processes >>>> type: preonly >>>> maximum iterations=10000, initial guess is zero >>>> tolerances: relative=1e-05, absolute=1e-50, >>>> divergence=10000. 
>>>> left preconditioning >>>> using NONE norm type for convergence test >>>> PC Object: (fieldsplit_0_mg_coarse_) 1 MPI processes >>>> type: bjacobi >>>> number of blocks = 1 >>>> Local solver is the same for all blocks, as in the >>>> following KSP and PC objects on rank 0: >>>> KSP Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI processes >>>> type: preonly >>>> maximum iterations=1, initial guess is zero >>>> tolerances: relative=1e-05, absolute=1e-50, >>>> divergence=10000. >>>> left preconditioning >>>> using NONE norm type for convergence test >>>> PC Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI processes >>>> type: lu >>>> out-of-place factorization >>>> tolerance for zero pivot 2.22045e-14 >>>> using diagonal shift on blocks to prevent zero pivot >>>> [INBLOCKS] >>>> matrix ordering: nd >>>> factor fill ratio given 5., needed 1. >>>> Factored matrix follows: >>>> Mat Object: 1 MPI processes >>>> type: seqaij >>>> rows=8, cols=8 >>>> package used to perform factorization: petsc >>>> total: nonzeros=56, allocated nonzeros=56 >>>> using I-node routines: found 3 nodes, limit >>>> used is 5 >>>> linear system matrix = precond matrix: >>>> Mat Object: 1 MPI processes >>>> type: seqaij >>>> rows=8, cols=8 >>>> total: nonzeros=56, allocated nonzeros=56 >>>> total number of mallocs used during MatSetValues calls=0 >>>> using I-node routines: found 3 nodes, limit used is 5 >>>> linear system matrix = precond matrix: >>>> Mat Object: 1 MPI processes >>>> type: mpiaij >>>> rows=8, cols=8 >>>> total: nonzeros=56, allocated nonzeros=56 >>>> total number of mallocs used during MatSetValues calls=0 >>>> using nonscalable MatPtAP() implementation >>>> using I-node (on process 0) routines: found 3 nodes, >>>> limit used is 5 >>>> Down solver (pre-smoother) on level 1 >>>> ------------------------------- >>>> KSP Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes >>>> type: chebyshev >>>> eigenvalue estimates used: min = 0.0998145, max = 1.09796 >>>> eigenvalues estimate via gmres min 0.00156735, max >>>> 0.998145 >>>> eigenvalues estimated using gmres with translations [0. >>>> 0.1; 0. 1.1] >>>> KSP Object: (fieldsplit_0_mg_levels_1_esteig_) 1 MPI >>>> processes >>>> type: gmres >>>> restart=30, using Classical (unmodified) Gram-Schmidt >>>> Orthogonalization with no iterative refinement >>>> happy breakdown tolerance 1e-30 >>>> maximum iterations=10, initial guess is zero >>>> tolerances: relative=1e-12, absolute=1e-50, >>>> divergence=10000. >>>> left preconditioning >>>> using PRECONDITIONED norm type for convergence test >>>> estimating eigenvalues using noisy right hand side >>>> maximum iterations=2, nonzero initial guess >>>> tolerances: relative=1e-05, absolute=1e-50, >>>> divergence=10000. >>>> left preconditioning >>>> using NONE norm type for convergence test >>>> PC Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes >>>> type: sor >>>> type = local_symmetric, iterations = 1, local iterations >>>> = 1, omega = 1. 
>>>> linear system matrix = precond matrix: >>>> Mat Object: (fieldsplit_0_) 1 MPI processes >>>> type: mpiaij >>>> rows=480, cols=480 >>>> total: nonzeros=25200, allocated nonzeros=25200 >>>> total number of mallocs used during MatSetValues calls=0 >>>> using I-node (on process 0) routines: found 160 nodes, >>>> limit used is 5 >>>> Up solver (post-smoother) same as down solver (pre-smoother) >>>> linear system matrix = precond matrix: >>>> Mat Object: (fieldsplit_0_) 1 MPI processes >>>> type: mpiaij >>>> rows=480, cols=480 >>>> total: nonzeros=25200, allocated nonzeros=25200 >>>> total number of mallocs used during MatSetValues calls=0 >>>> using I-node (on process 0) routines: found 160 nodes, >>>> limit used is 5 >>>> KSP solver for S = A11 - A10 inv(A00) A01 >>>> KSP Object: (fieldsplit_1_) 1 MPI processes >>>> type: preonly >>>> maximum iterations=10000, initial guess is zero >>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>> left preconditioning >>>> using NONE norm type for convergence test >>>> PC Object: (fieldsplit_1_) 1 MPI processes >>>> type: bjacobi >>>> number of blocks = 1 >>>> Local solver is the same for all blocks, as in the following >>>> KSP and PC objects on rank 0: >>>> KSP Object: (fieldsplit_1_sub_) 1 MPI processes >>>> type: preonly >>>> maximum iterations=10000, initial guess is zero >>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>> left preconditioning >>>> using NONE norm type for convergence test >>>> PC Object: (fieldsplit_1_sub_) 1 MPI processes >>>> type: bjacobi >>>> number of blocks = 1 >>>> Local solver is the same for all blocks, as in the >>>> following KSP and PC objects on rank 0: >>>> KSP Object: (fieldsplit_1_sub_sub_) >>>> 1 MPI processes >>>> type: preonly >>>> maximum iterations=10000, initial guess is zero >>>> tolerances: relative=1e-05, absolute=1e-50, >>>> divergence=10000. >>>> left preconditioning >>>> using NONE norm type for convergence test >>>> PC Object: (fieldsplit_1_sub_sub_) >>>> 1 MPI processes >>>> type: ilu >>>> out-of-place factorization >>>> 0 levels of fill >>>> tolerance for zero pivot 2.22045e-14 >>>> matrix ordering: natural >>>> factor fill ratio given 1., needed 1. 
>>>> Factored matrix follows: >>>> Mat Object: 1 MPI processes >>>> type: seqaij >>>> rows=144, cols=144 >>>> package used to perform factorization: >>>> petsc >>>> total: nonzeros=240, allocated >>>> nonzeros=240 >>>> not using I-node routines >>>> linear system matrix = precond matrix: >>>> Mat Object: 1 MPI processes >>>> type: seqaij >>>> rows=144, cols=144 >>>> total: nonzeros=240, allocated nonzeros=240 >>>> total number of mallocs used during >>>> MatSetValues calls=0 >>>> not using I-node routines >>>> linear system matrix = precond matrix: >>>> Mat Object: 1 MPI processes >>>> type: mpiaij >>>> rows=144, cols=144 >>>> total: nonzeros=240, allocated nonzeros=240 >>>> total number of mallocs used during MatSetValues calls=0 >>>> not using I-node (on process 0) routines >>>> linear system matrix followed by preconditioner matrix: >>>> Mat Object: (fieldsplit_1_) 1 MPI processes >>>> type: schurcomplement >>>> rows=144, cols=144 >>>> Schur complement A11 - A10 inv(A00) A01 >>>> A11 >>>> Mat Object: (fieldsplit_1_) 1 MPI processes >>>> type: mpiaij >>>> rows=144, cols=144 >>>> total: nonzeros=240, allocated nonzeros=240 >>>> total number of mallocs used during MatSetValues calls=0 >>>> not using I-node (on process 0) routines >>>> A10 >>>> Mat Object: 1 MPI processes >>>> type: mpiaij >>>> rows=144, cols=480 >>>> total: nonzeros=48, allocated nonzeros=48 >>>> total number of mallocs used during MatSetValues calls=0 >>>> using I-node (on process 0) routines: found 74 nodes, >>>> limit used is 5 >>>> KSP of A00 >>>> KSP Object: (fieldsplit_0_) 1 MPI processes >>>> type: preonly >>>> maximum iterations=10000, initial guess is zero >>>> tolerances: relative=1e-05, absolute=1e-50, >>>> divergence=10000. >>>> left preconditioning >>>> using NONE norm type for convergence test >>>> PC Object: (fieldsplit_0_) 1 MPI processes >>>> type: gamg >>>> type is MULTIPLICATIVE, levels=2 cycles=v >>>> Cycles per PCApply=1 >>>> Using externally compute Galerkin coarse grid >>>> matrices >>>> GAMG specific options >>>> Threshold for dropping small values in graph on >>>> each level = >>>> Threshold scaling factor for each level not >>>> specified = 1. >>>> AGG specific options >>>> Symmetric graph false >>>> Number of levels to square graph 1 >>>> Number smoothing steps 1 >>>> Complexity: grid = 1.00222 >>>> Coarse grid solver -- level >>>> ------------------------------- >>>> KSP Object: (fieldsplit_0_mg_coarse_) 1 MPI processes >>>> type: preonly >>>> maximum iterations=10000, initial guess is zero >>>> tolerances: relative=1e-05, absolute=1e-50, >>>> divergence=10000. >>>> left preconditioning >>>> using NONE norm type for convergence test >>>> PC Object: (fieldsplit_0_mg_coarse_) 1 MPI processes >>>> type: bjacobi >>>> number of blocks = 1 >>>> Local solver is the same for all blocks, as in >>>> the following KSP and PC objects on rank 0: >>>> KSP Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI >>>> processes >>>> type: preonly >>>> maximum iterations=1, initial guess is zero >>>> tolerances: relative=1e-05, absolute=1e-50, >>>> divergence=10000. >>>> left preconditioning >>>> using NONE norm type for convergence test >>>> PC Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI >>>> processes >>>> type: lu >>>> out-of-place factorization >>>> tolerance for zero pivot 2.22045e-14 >>>> using diagonal shift on blocks to prevent zero >>>> pivot [INBLOCKS] >>>> matrix ordering: nd >>>> factor fill ratio given 5., needed 1. 
>>>> Factored matrix follows: >>>> Mat Object: 1 MPI processes >>>> type: seqaij >>>> rows=8, cols=8 >>>> package used to perform factorization: >>>> petsc >>>> total: nonzeros=56, allocated nonzeros=56 >>>> using I-node routines: found 3 nodes, >>>> limit used is 5 >>>> linear system matrix = precond matrix: >>>> Mat Object: 1 MPI processes >>>> type: seqaij >>>> rows=8, cols=8 >>>> total: nonzeros=56, allocated nonzeros=56 >>>> total number of mallocs used during >>>> MatSetValues calls=0 >>>> using I-node routines: found 3 nodes, limit >>>> used is 5 >>>> linear system matrix = precond matrix: >>>> Mat Object: 1 MPI processes >>>> type: mpiaij >>>> rows=8, cols=8 >>>> total: nonzeros=56, allocated nonzeros=56 >>>> total number of mallocs used during MatSetValues >>>> calls=0 >>>> using nonscalable MatPtAP() implementation >>>> using I-node (on process 0) routines: found 3 >>>> nodes, limit used is 5 >>>> Down solver (pre-smoother) on level 1 >>>> ------------------------------- >>>> KSP Object: (fieldsplit_0_mg_levels_1_) 1 MPI >>>> processes >>>> type: chebyshev >>>> eigenvalue estimates used: min = 0.0998145, max >>>> = 1.09796 >>>> eigenvalues estimate via gmres min 0.00156735, >>>> max 0.998145 >>>> eigenvalues estimated using gmres with >>>> translations [0. 0.1; 0. 1.1] >>>> KSP Object: (fieldsplit_0_mg_levels_1_esteig_) 1 >>>> MPI processes >>>> type: gmres >>>> restart=30, using Classical (unmodified) >>>> Gram-Schmidt Orthogonalization with no iterative refinement >>>> happy breakdown tolerance 1e-30 >>>> maximum iterations=10, initial guess is zero >>>> tolerances: relative=1e-12, absolute=1e-50, >>>> divergence=10000. >>>> left preconditioning >>>> using PRECONDITIONED norm type for convergence >>>> test >>>> estimating eigenvalues using noisy right hand side >>>> maximum iterations=2, nonzero initial guess >>>> tolerances: relative=1e-05, absolute=1e-50, >>>> divergence=10000. >>>> left preconditioning >>>> using NONE norm type for convergence test >>>> PC Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes >>>> type: sor >>>> type = local_symmetric, iterations = 1, local >>>> iterations = 1, omega = 1. 
>>>> linear system matrix = precond matrix: >>>> Mat Object: (fieldsplit_0_) 1 MPI processes >>>> type: mpiaij >>>> rows=480, cols=480 >>>> total: nonzeros=25200, allocated nonzeros=25200 >>>> total number of mallocs used during MatSetValues >>>> calls=0 >>>> using I-node (on process 0) routines: found 160 >>>> nodes, limit used is 5 >>>> Up solver (post-smoother) same as down solver >>>> (pre-smoother) >>>> linear system matrix = precond matrix: >>>> Mat Object: (fieldsplit_0_) 1 MPI processes >>>> type: mpiaij >>>> rows=480, cols=480 >>>> total: nonzeros=25200, allocated nonzeros=25200 >>>> total number of mallocs used during MatSetValues >>>> calls=0 >>>> using I-node (on process 0) routines: found 160 >>>> nodes, limit used is 5 >>>> A01 >>>> Mat Object: 1 MPI processes >>>> type: mpiaij >>>> rows=480, cols=144 >>>> total: nonzeros=48, allocated nonzeros=48 >>>> total number of mallocs used during MatSetValues calls=0 >>>> using I-node (on process 0) routines: found 135 >>>> nodes, limit used is 5 >>>> Mat Object: 1 MPI processes >>>> type: mpiaij >>>> rows=144, cols=144 >>>> total: nonzeros=240, allocated nonzeros=240 >>>> total number of mallocs used during MatSetValues calls=0 >>>> not using I-node (on process 0) routines >>>> linear system matrix = precond matrix: >>>> Mat Object: 1 MPI processes >>>> type: mpiaij >>>> rows=624, cols=624 >>>> total: nonzeros=25536, allocated nonzeros=25536 >>>> total number of mallocs used during MatSetValues calls=0 >>>> using I-node (on process 0) routines: found 336 nodes, limit used >>>> is 5 >>>> >>>> >>>> Thanks, >>>> Xiaofeng >>>> >>>> >>>> >>>> On Jun 17, 2025, at 19:05, Mark Adams wrote: >>>> >>>> And don't use -pc_gamg_parallel_coarse_grid_solver >>>> You can use that in production but for debugging use -mg_coarse_pc_type >>>> svd >>>> Also, use -options_left and remove anything that is not used. 
>>>> (I am puzzled, I see -pc_type gamg not -pc_type fieldsplit) >>>> >>>> Mark >>>> >>>> >>>> On Mon, Jun 16, 2025 at 6:40?AM Matthew Knepley >>>> wrote: >>>> >>>>> On Sun, Jun 15, 2025 at 9:46?PM hexioafeng >>>>> wrote: >>>>> >>>>>> Hello, >>>>>> >>>>>> Here are the options and outputs: >>>>>> >>>>>> options: >>>>>> >>>>>> -ksp_type cg -pc_type gamg -pc_gamg_parallel_coarse_grid_solver >>>>>> -pc_fieldsplit_detect_saddle_point -pc_fieldsplit_type schur >>>>>> -pc_fieldsplit_schur_precondition selfp >>>>>> -fieldsplit_1_mat_schur_complement_ainv_type lump >>>>>> -pc_fieldsplit_schur_fact_type full -fieldsplit_0_ksp_type preonly >>>>>> -fieldsplit_0_pc_type gamg -fieldsplit_0_mg_coarse_pc_type_type svd >>>>>> -fieldsplit_1_ksp_type preonly -fieldsplit_1_pc_type bjacobi >>>>>> -fieldsplit_1_sub_pc_type sor -ksp_view -ksp_monitor_true_residual >>>>>> -ksp_converged_reason -fieldsplit_0_mg_levels_ksp_monitor_true_residual >>>>>> -fieldsplit_0_mg_levels_ksp_converged_reason >>>>>> -fieldsplit_1_ksp_monitor_true_residual >>>>>> -fieldsplit_1_ksp_converged_reason >>>>>> >>>>> >>>>> This option was wrong: >>>>> >>>>> -fieldsplit_0_mg_coarse_pc_type_type svd >>>>> >>>>> from the output, we can see that it should have been >>>>> >>>>> -fieldsplit_0_mg_coarse_sub_pc_type_type svd >>>>> >>>>> THanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> output: >>>>>> >>>>>> 0 KSP unpreconditioned resid norm 2.777777777778e+01 true resid norm >>>>>> 2.777777777778e+01 ||r(i)||/||b|| 1.000000000000e+00 >>>>>> Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 >>>>>> PC failed due to SUBPC_ERROR >>>>>> KSP Object: 1 MPI processes >>>>>> type: cg >>>>>> maximum iterations=200, initial guess is zero >>>>>> tolerances: relative=1e-06, absolute=1e-12, divergence=1e+30 >>>>>> left preconditioning >>>>>> using UNPRECONDITIONED norm type for convergence test >>>>>> PC Object: 1 MPI processes >>>>>> type: gamg >>>>>> type is MULTIPLICATIVE, levels=2 cycles=v >>>>>> Cycles per PCApply=1 >>>>>> Using externally compute Galerkin coarse grid matrices >>>>>> GAMG specific options >>>>>> Threshold for dropping small values in graph on each level = >>>>>> Threshold scaling factor for each level not specified = 1. >>>>>> AGG specific options >>>>>> Symmetric graph false >>>>>> Number of levels to square graph 1 >>>>>> Number smoothing steps 1 >>>>>> Complexity: grid = 1.00176 >>>>>> Coarse grid solver -- level ------------------------------- >>>>>> KSP Object: (mg_coarse_) 1 MPI processes >>>>>> type: preonly >>>>>> maximum iterations=10000, initial guess is zero >>>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>>> left preconditioning >>>>>> using NONE norm type for convergence test >>>>>> PC Object: (mg_coarse_) 1 MPI processes >>>>>> type: bjacobi >>>>>> number of blocks = 1 >>>>>> Local solver is the same for all blocks, as in the following >>>>>> KSP and PC objects on rank 0: >>>>>> KSP Object: (mg_coarse_sub_) 1 MPI processes >>>>>> type: preonly >>>>>> maximum iterations=1, initial guess is zero >>>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>>> left preconditioning >>>>>> using NONE norm type for convergence test >>>>>> PC Object: (mg_coarse_sub_) 1 MPI processes >>>>>> type: lu >>>>>> out-of-place factorization >>>>>> tolerance for zero pivot 2.22045e-14 >>>>>> using diagonal shift on blocks to prevent zero pivot >>>>>> [INBLOCKS] >>>>>> matrix ordering: nd >>>>>> factor fill ratio given 5., needed 1. 
>>>>>> Factored matrix follows: >>>>>> Mat Object: 1 MPI processes >>>>>> type: seqaij >>>>>> rows=7, cols=7 >>>>>> package used to perform factorization: petsc >>>>>> total: nonzeros=45, allocated nonzeros=45 >>>>>> using I-node routines: found 3 nodes, limit used is >>>>>> 5 >>>>>> linear system matrix = precond matrix: >>>>>> Mat Object: 1 MPI processes >>>>>> type: seqaij >>>>>> rows=7, cols=7 >>>>>> total: nonzeros=45, allocated nonzeros=45 >>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>> using I-node routines: found 3 nodes, limit used is 5 >>>>>> linear system matrix = precond matrix: >>>>>> Mat Object: 1 MPI processes >>>>>> type: mpiaij >>>>>> rows=7, cols=7 >>>>>> total: nonzeros=45, allocated nonzeros=45 >>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>> using nonscalable MatPtAP() implementation >>>>>> using I-node (on process 0) routines: found 3 nodes, limit >>>>>> used is 5 >>>>>> Down solver (pre-smoother) on level 1 >>>>>> ------------------------------- >>>>>> KSP Object: (mg_levels_1_) 1 MPI processes >>>>>> type: chebyshev >>>>>> eigenvalue estimates used: min = 0., max = 0. >>>>>> eigenvalues estimate via gmres min 0., max 0. >>>>>> eigenvalues estimated using gmres with translations [0. 0.1; >>>>>> 0. 1.1] >>>>>> KSP Object: (mg_levels_1_esteig_) 1 MPI processes >>>>>> type: gmres >>>>>> restart=30, using Classical (unmodified) Gram-Schmidt >>>>>> Orthogonalization with no iterative refinement >>>>>> happy breakdown tolerance 1e-30 >>>>>> maximum iterations=10, initial guess is zero >>>>>> tolerances: relative=1e-12, absolute=1e-50, >>>>>> divergence=10000. >>>>>> left preconditioning >>>>>> using PRECONDITIONED norm type for convergence test >>>>>> PC Object: (mg_levels_1_) 1 MPI processes >>>>>> type: sor >>>>>> type = local_symmetric, iterations = 1, local iterations >>>>>> = 1, omega = 1. >>>>>> linear system matrix = precond matrix: >>>>>> Mat Object: 1 MPI processes >>>>>> type: mpiaij >>>>>> rows=624, cols=624 >>>>>> total: nonzeros=25536, allocated nonzeros=25536 >>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>> using I-node (on process 0) routines: found 336 nodes, >>>>>> limit used is 5 >>>>>> estimating eigenvalues using noisy right hand side >>>>>> maximum iterations=2, nonzero initial guess >>>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>>> left preconditioning >>>>>> using NONE norm type for convergence test >>>>>> PC Object: (mg_levels_1_) 1 MPI processes >>>>>> type: sor >>>>>> type = local_symmetric, iterations = 1, local iterations = 1, >>>>>> omega = 1. 
linear system matrix = precond matrix: >>>>>> Mat Object: 1 MPI processes >>>>>> type: mpiaij >>>>>> rows=624, cols=624 >>>>>> total: nonzeros=25536, allocated nonzeros=25536 >>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>> using I-node (on process 0) routines: found 336 nodes, >>>>>> limit used is 5 Up solver (post-smoother) same as down solver >>>>>> (pre-smoother) >>>>>> linear system matrix = precond matrix: >>>>>> Mat Object: 1 MPI processes >>>>>> type: mpiaij >>>>>> rows=624, cols=624 >>>>>> total: nonzeros=25536, allocated nonzeros=25536 >>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>> using I-node (on process 0) routines: found 336 nodes, limit >>>>>> used is 5 >>>>>> >>>>>> >>>>>> Best regards, >>>>>> >>>>>> Xiaofeng >>>>>> >>>>>> >>>>>> On Jun 14, 2025, at 07:28, Barry Smith wrote: >>>>>> >>>>>> >>>>>> Matt, >>>>>> >>>>>> Perhaps we should add options -ksp_monitor_debug and >>>>>> -snes_monitor_debug that turn on all possible monitoring for the (possibly) >>>>>> nested solvers and all of their converged reasons also? Note this is not >>>>>> completely trivial because each preconditioner will have to supply its list >>>>>> based on the current solver options for it. >>>>>> >>>>>> Then we won't need to constantly list a big string of problem >>>>>> specific monitor options to ask the user to use. >>>>>> >>>>>> Barry >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> On Jun 13, 2025, at 9:09?AM, Matthew Knepley >>>>>> wrote: >>>>>> >>>>>> On Thu, Jun 12, 2025 at 10:55?PM hexioafeng >>>>>> wrote: >>>>>> >>>>>>> Dear authors, >>>>>>> >>>>>>> I tried *-pc_type game -pc_gamg_parallel_coarse_grid_solver* and *-pc_type >>>>>>> field split -pc_fieldsplit_detect_saddle_point -fieldsplit_0_ksp_type >>>>>>> pronely -fieldsplit_0_pc_type game -fieldsplit_0_mg_coarse_pc_type sad >>>>>>> -fieldsplit_1_ksp_type pronely -fieldsplit_1_pc_type Jacobi >>>>>>> _fieldsplit_1_sub_pc_type for* , both options got the >>>>>>> KSP_DIVERGE_PC_FAILED error. >>>>>>> >>>>>> >>>>>> With any question about convergence, we need to see the output of >>>>>> >>>>>> -ksp_view -ksp_monitor_true_residual -ksp_converged_reason >>>>>> -fieldsplit_0_mg_levels_ksp_monitor_true_residual >>>>>> -fieldsplit_0_mg_levels_ksp_converged_reason >>>>>> -fieldsplit_1_ksp_monitor_true_residual -fieldsplit_1_ksp_converged_reason >>>>>> >>>>>> and all the error output. >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Xiaofeng >>>>>>> >>>>>>> >>>>>>> On Jun 12, 2025, at 20:50, Mark Adams wrote: >>>>>>> >>>>>>> >>>>>>> >>>>>>> On Thu, Jun 12, 2025 at 8:44?AM Matthew Knepley >>>>>>> wrote: >>>>>>> >>>>>>>> On Thu, Jun 12, 2025 at 4:58?AM Mark Adams wrote: >>>>>>>> >>>>>>>>> Adding this to the PETSc mailing list, >>>>>>>>> >>>>>>>>> On Thu, Jun 12, 2025 at 3:43?AM hexioafeng >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> >>>>>>>>>> Dear Professor, >>>>>>>>>> >>>>>>>>>> I hope this message finds you well. >>>>>>>>>> >>>>>>>>>> I am an employee at a CAE company and a heavy user of the PETSc >>>>>>>>>> library. I would like to thank you for your contributions to PETSc and >>>>>>>>>> express my deep appreciation for your work. >>>>>>>>>> >>>>>>>>>> Recently, I encountered some difficulties when using PETSc to >>>>>>>>>> solve structural mechanics problems with Lagrange multiplier constraints. 
>>>>>>>>>> After searching extensively online and reviewing several papers, I found >>>>>>>>>> your previous paper titled "*Algebraic multigrid methods for >>>>>>>>>> constrained linear systems with applications to contact problems in solid >>>>>>>>>> mechanics*" seems to be the most relevant and helpful. >>>>>>>>>> >>>>>>>>>> The stiffness matrix I'm working with, *K*, is a block >>>>>>>>>> saddle-point matrix of the form (A00 A01; A10 0), where *A00 is >>>>>>>>>> singular*?just as described in your paper, and different from >>>>>>>>>> many other articles . I have a few questions regarding your work and would >>>>>>>>>> greatly appreciate your insights: >>>>>>>>>> >>>>>>>>>> 1. Is the *AMG/KKT* method presented in your paper available in >>>>>>>>>> PETSc? I tried using *CG+GAMG* directly but received a >>>>>>>>>> *KSP_DIVERGED_PC_FAILED* error. I also attempted to use >>>>>>>>>> *CG+PCFIELDSPLIT* with the following options: >>>>>>>>>> >>>>>>>>> >>>>>>>>> No >>>>>>>>> >>>>>>>>> >>>>>>>>>> >>>>>>>>>> -pc_type fieldsplit -pc_fieldsplit_detect_saddle_point >>>>>>>>>> -pc_fieldsplit_type schur -pc_fieldsplit_schur_precondition selfp >>>>>>>>>> -pc_fieldsplit_schur_fact_type full -fieldsplit_0_ksp_type preonly >>>>>>>>>> -fieldsplit_0_pc_type gamg -fieldsplit_1_ksp_type preonly >>>>>>>>>> -fieldsplit_1_pc_type bjacobi >>>>>>>>>> >>>>>>>>>> Unfortunately, this also resulted in a >>>>>>>>>> *KSP_DIVERGED_PC_FAILED* error. Do you have any suggestions? >>>>>>>>>> >>>>>>>>>> 2. In your paper, you compare the method with *Uzawa*-type >>>>>>>>>> approaches. To my understanding, Uzawa methods typically require A00 to be >>>>>>>>>> invertible. How did you handle the singularity of A00 to construct an >>>>>>>>>> M-matrix that is invertible? >>>>>>>>>> >>>>>>>>>> >>>>>>>>> You add a regularization term like A01 * A10 (like springs). See >>>>>>>>> the paper or any reference to augmented lagrange or Uzawa >>>>>>>>> >>>>>>>>> >>>>>>>>> 3. Can i implement the AMG/KKT method in your paper using existing >>>>>>>>>> *AMG APIs*? Implementing a production-level AMG solver from >>>>>>>>>> scratch would be quite challenging for me, so I?m hoping to utilize >>>>>>>>>> existing AMG interfaces within PETSc or other packages. >>>>>>>>>> >>>>>>>>>> >>>>>>>>> You can do Uzawa and make the regularization matrix with >>>>>>>>> matrix-matrix products. Just use AMG for the A00 block. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>>> 4. For saddle-point systems where A00 is singular, can you >>>>>>>>>> recommend any more robust or efficient solutions? Alternatively, are you >>>>>>>>>> aware of any open-source software packages that can handle such cases >>>>>>>>>> out-of-the-box? >>>>>>>>>> >>>>>>>>>> >>>>>>>>> No, and I don't think PETSc can do this out-of-the-box, but others >>>>>>>>> may be able to give you a better idea of what PETSc can do. >>>>>>>>> I think PETSc can do Uzawa or other similar algorithms but it will >>>>>>>>> not do the regularization automatically (it is a bit more complicated than >>>>>>>>> just A01 * A10) >>>>>>>>> >>>>>>>> >>>>>>>> One other trick you can use is to have >>>>>>>> >>>>>>>> -fieldsplit_0_mg_coarse_pc_type svd >>>>>>>> >>>>>>>> This will use SVD on the coarse grid of GAMG, which can handle the >>>>>>>> null space in A00 as long as the prolongation does not put it back in. I >>>>>>>> have used this for the Laplacian with Neumann conditions and for freely >>>>>>>> floating elastic problems. >>>>>>>> >>>>>>>> >>>>>>> Good point. 
>>>>>>> You can also use -pc_gamg_parallel_coarse_grid_solver to get GAMG to >>>>>>> use a on level iterative solver for the coarse grid. >>>>>>> >>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Matt >>>>>>>> >>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Mark >>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thank you very much for taking the time to read my email. Looking >>>>>>>>>> forward to hearing from you. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Sincerely, >>>>>>>>>> >>>>>>>>>> Xiaofeng He >>>>>>>>>> ----------------------------------------------------- >>>>>>>>>> >>>>>>>>>> Research Engineer >>>>>>>>>> >>>>>>>>>> Internet Based Engineering, Beijing, China >>>>>>>>>> >>>>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> What most experimenters take for granted before they begin their >>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>> experiments lead. >>>>>>>> -- Norbert Wiener >>>>>>>> >>>>>>>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!bCEQ75t2tb346ZO4z8MiHCNG9f8IWujBRyEK8EbJqLTQfpfGIn5H_0ZXA_V7K7Y7Csps7k35GiSVrqnpTYvh$ >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!bCEQ75t2tb346ZO4z8MiHCNG9f8IWujBRyEK8EbJqLTQfpfGIn5H_0ZXA_V7K7Y7Csps7k35GiSVrqnpTYvh$ >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!bCEQ75t2tb346ZO4z8MiHCNG9f8IWujBRyEK8EbJqLTQfpfGIn5H_0ZXA_V7K7Y7Csps7k35GiSVrqnpTYvh$ >>>>> >>>>> >>>> >>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!bCEQ75t2tb346ZO4z8MiHCNG9f8IWujBRyEK8EbJqLTQfpfGIn5H_0ZXA_V7K7Y7Csps7k35GiSVrqnpTYvh$ >>> >>> >>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!bCEQ75t2tb346ZO4z8MiHCNG9f8IWujBRyEK8EbJqLTQfpfGIn5H_0ZXA_V7K7Y7Csps7k35GiSVrqnpTYvh$ >> >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!bCEQ75t2tb346ZO4z8MiHCNG9f8IWujBRyEK8EbJqLTQfpfGIn5H_0ZXA_V7K7Y7Csps7k35GiSVrqnpTYvh$ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 
From hexiaofeng at buaa.edu.cn Thu Jun 19 21:49:24 2025
From: hexiaofeng at buaa.edu.cn (hexioafeng)
Date: Fri, 20 Jun 2025 10:49:24 +0800
Subject: [petsc-users] Questions Regarding PETSc and Solving Constrained Structural Mechanics Problems
In-Reply-To: 
References: <3BF3C1E8-0CB0-42F1-A624-8FA0DC7FD4A4@buaa.edu.cn> <35A61411-85CF-4F48-9DD6-0409F0CFE598@petsc.dev> <5CFF6556-4BDE-48D9-9D3A-6D8790465358@buaa.edu.cn> <87A02E48-DDE4-4DCD-8C52-D2DAF975EF01@buaa.edu.cn> <96AB5047-4A35-49A2-B948-86656A1CFB5B@buaa.edu.cn> <82A9172B-C0DE-45F8-8DBD-9F351389541B@buaa.edu.cn>
Message-ID: 

I tried to solve S with gamg and use svd on the coarse grid, then I got the error: Arguments are incompatible, Zero diagonal on row 0. In my opinion, S should be rank-deficient, but not elliptic.

Best regards,
Xiaofeng

> On Jun 20, 2025, at 10:07, Matthew Knepley wrote:
>
> On Thu, Jun 19, 2025 at 9:18 PM hexioafeng wrote:
>> Hello,
>>
>> Here are the outputs with svd:
>>
>> 0 KSP unpreconditioned resid norm 2.777777777778e+01 true resid norm 2.777777777778e+01 ||r(i)||/||b|| 1.000000000000e+00
>> Linear fieldsplit_0_mg_levels_1_ solve converged due to CONVERGED_ITS iterations 2
>> Linear fieldsplit_0_mg_levels_1_ solve converged due to CONVERGED_ITS iterations 2
>> Linear fieldsplit_1_ solve did not converge due to DIVERGED_PC_FAILED
>
> You are running ILU(0) on your Schur complement, but it looks like it is rank-deficient. You will have to use something that works for that (like maybe GAMG again with SVD on the coarse grid). Is S elliptic?
>
> Thanks,
>
> Matt
>
>> iterations 0
>> PC failed due to SUBPC_ERROR
>> Linear fieldsplit_0_mg_levels_1_ solve converged due to CONVERGED_ITS iterations 2
>> Linear fieldsplit_0_mg_levels_1_ solve converged due to CONVERGED_ITS iterations 2
>> Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0
>> PC failed due to SUBPC_ERROR
>> KSP Object: 1 MPI processes
>> type: cg
>> maximum iterations=200, initial guess is zero
>> tolerances: relative=1e-06, absolute=1e-12, divergence=1e+30
>> left preconditioning
>> using UNPRECONDITIONED norm type for convergence test
>> PC Object: 1 MPI processes
>> type: fieldsplit
>> FieldSplit with Schur preconditioner, blocksize = 1, factorization FULL
>> Preconditioner for the Schur complement formed from Sp, an assembled approximation to S, which uses A00's diagonal's inverse
>> Split info:
>> Split number 0 Defined by IS
>> Split number 1 Defined by IS
>> KSP solver for A00 block
>> KSP Object: (fieldsplit_0_) 1 MPI processes
>> type: preonly
>> maximum iterations=10000, initial guess is zero
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
>> left preconditioning
>> using NONE norm type for convergence test
>> PC Object: (fieldsplit_0_) 1 MPI processes
>> type: gamg
>> type is MULTIPLICATIVE, levels=2 cycles=v
>> Cycles per PCApply=1
>> Using externally compute Galerkin coarse grid matrices
>> GAMG specific options
>> Threshold for dropping small values in graph on each level =
>> Threshold scaling factor for each level not specified = 1.
>> AGG specific options
>> Symmetric graph false
>> Number of levels to square graph 1
>> Number smoothing steps 1
>> Complexity: grid = 1.00222
>> Coarse grid solver -- level -------------------------------
>> KSP Object: (fieldsplit_0_mg_coarse_) 1 MPI processes
>> type: preonly
>> maximum iterations=10000, initial guess is zero
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
>> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_0_mg_coarse_) 1 MPI processes >> type: bjacobi >> number of blocks = 1 >> Local solver is the same for all blocks, as in the following KSP and PC objects on rank 0: >> KSP Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI processes >> type: preonly >> maximum iterations=1, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI processes >> type: svd >> All singular values smaller than 1e-12 treated as zero >> Provided essential rank of the matrix 0 (all other eigenvalues are zeroed) >> linear system matrix = precond matrix: >> Mat Object: 1 MPI processes >> type: seqaij >> rows=8, cols=8 >> total: nonzeros=56, allocated nonzeros=56 >> total number of mallocs used during MatSetValues calls=0 >> using I-node routines: found 3 nodes, limit used is 5 >> linear system matrix = precond matrix: >> Mat Object: 1 MPI processes >> type: mpiaij >> rows=8, cols=8 >> total: nonzeros=56, allocated nonzeros=56 >> total number of mallocs used during MatSetValues calls=0 >> using nonscalable MatPtAP() implementation >> using I-node (on process 0) routines: found 3 nodes, limit used is 5 >> Down solver (pre-smoother) on level 1 ------------------------------- >> KSP Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes >> type: chebyshev >> eigenvalue estimates used: min = 0.0998145, max = 1.09796 >> eigenvalues estimate via gmres min 0.00156735, max 0.998145 >> eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1] >> KSP Object: (fieldsplit_0_mg_levels_1_esteig_) 1 MPI processes >> type: gmres >> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >> happy breakdown tolerance 1e-30 >> maximum iterations=10, initial guess is zero >> tolerances: relative=1e-12, absolute=1e-50, divergence=10000. >> left preconditioning >> using PRECONDITIONED norm type for convergence test >> estimating eigenvalues using noisy right hand side >> maximum iterations=2, nonzero initial guess >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes >> type: sor >> type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. >> linear system matrix = precond matrix: >> Mat Object: (fieldsplit_0_) 1 MPI processes >> type: mpiaij >> rows=480, cols=480 >> total: nonzeros=25200, allocated nonzeros=25200 >> total number of mallocs used during MatSetValues calls=0 >> using I-node (on process 0) routines: found 160 nodes, limit used is 5 >> Up solver (post-smoother) same as down solver (pre-smoother) >> linear system matrix = precond matrix: >> Mat Object: (fieldsplit_0_) 1 MPI processes >> type: mpiaij >> rows=480, cols=480 >> total: nonzeros=25200, allocated nonzeros=25200 >> total number of mallocs used during MatSetValues calls=0 >> using I-node (on process 0) routines: found 160 nodes, limit used is 5 >> KSP solver for S = A11 - A10 inv(A00) A01 >> KSP Object: (fieldsplit_1_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
>> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_1_) 1 MPI processes >> type: bjacobi >> number of blocks = 1 >> Local solver is the same for all blocks, as in the following KSP and PC objects on rank 0: >> KSP Object: (fieldsplit_1_sub_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_1_sub_) 1 MPI processes >> type: bjacobi >> number of blocks = 1 >> Local solver is the same for all blocks, as in the following KSP and PC objects on rank 0: >> KSP Object: (fieldsplit_1_sub_sub_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_1_sub_sub_) 1 MPI processes >> type: ilu >> out-of-place factorization >> 0 levels of fill >> tolerance for zero pivot 2.22045e-14 >> matrix ordering: natural >> factor fill ratio given 1., needed 1. >> Factored matrix follows: >> Mat Object: 1 MPI processes >> type: seqaij >> rows=144, cols=144 >> package used to perform factorization: petsc >> total: nonzeros=240, allocated nonzeros=240 >> not using I-node routines >> linear system matrix = precond matrix: >> Mat Object: 1 MPI processes >> type: seqaij >> rows=144, cols=144 >> total: nonzeros=240, allocated nonzeros=240 >> total number of mallocs used during MatSetValues calls=0 >> not using I-node routines >> linear system matrix = precond matrix: >> Mat Object: 1 MPI processes >> type: mpiaij >> rows=144, cols=144 >> total: nonzeros=240, allocated nonzeros=240 >> total number of mallocs used during MatSetValues calls=0 >> not using I-node (on process 0) routines >> linear system matrix followed by preconditioner matrix: >> Mat Object: (fieldsplit_1_) 1 MPI processes >> type: schurcomplement >> rows=144, cols=144 >> Schur complement A11 - A10 inv(A00) A01 >> A11 >> Mat Object: (fieldsplit_1_) 1 MPI processes >> type: mpiaij >> rows=144, cols=144 >> total: nonzeros=240, allocated nonzeros=240 >> total number of mallocs used during MatSetValues calls=0 >> not using I-node (on process 0) routines >> A10 >> Mat Object: 1 MPI processes >> type: mpiaij >> rows=144, cols=480 >> total: nonzeros=48, allocated nonzeros=48 >> total number of mallocs used during MatSetValues calls=0 >> using I-node (on process 0) routines: found 74 nodes, limit used is 5 >> KSP of A00 >> KSP Object: (fieldsplit_0_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_0_) 1 MPI processes >> type: gamg >> type is MULTIPLICATIVE, levels=2 cycles=v >> Cycles per PCApply=1 >> Using externally compute Galerkin coarse grid matrices >> GAMG specific options >> Threshold for dropping small values in graph on each level = >> Threshold scaling factor for each level not specified = 1. 
>> AGG specific options >> Symmetric graph false >> Number of levels to square graph 1 >> Number smoothing steps 1 >> Complexity: grid = 1.00222 >> Coarse grid solver -- level ------------------------------- >> KSP Object: (fieldsplit_0_mg_coarse_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_0_mg_coarse_) 1 MPI processes >> type: bjacobi >> number of blocks = 1 >> Local solver is the same for all blocks, as in the following KSP and PC objects on rank 0: >> KSP Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI processes >> type: preonly >> maximum iterations=1, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI processes >> type: svd >> All singular values smaller than 1e-12 treated as zero >> Provided essential rank of the matrix 0 (all other eigenvalues are zeroed) >> linear system matrix = precond matrix: >> Mat Object: 1 MPI processes >> type: seqaij >> rows=8, cols=8 >> total: nonzeros=56, allocated nonzeros=56 >> total number of mallocs used during MatSetValues calls=0 >> using I-node routines: found 3 nodes, limit used is 5 >> linear system matrix = precond matrix: >> Mat Object: 1 MPI processes >> type: mpiaij >> rows=8, cols=8 >> total: nonzeros=56, allocated nonzeros=56 >> total number of mallocs used during MatSetValues calls=0 >> using nonscalable MatPtAP() implementation >> using I-node (on process 0) routines: found 3 nodes, limit used is 5 >> Down solver (pre-smoother) on level 1 ------------------------------- >> KSP Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes >> type: chebyshev >> eigenvalue estimates used: min = 0.0998145, max = 1.09796 >> eigenvalues estimate via gmres min 0.00156735, max 0.998145 >> eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1] >> KSP Object: (fieldsplit_0_mg_levels_1_esteig_) 1 MPI processes >> type: gmres >> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >> happy breakdown tolerance 1e-30 >> maximum iterations=10, initial guess is zero >> tolerances: relative=1e-12, absolute=1e-50, divergence=10000. >> left preconditioning >> using PRECONDITIONED norm type for convergence test >> estimating eigenvalues using noisy right hand side >> maximum iterations=2, nonzero initial guess >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes >> type: sor >> type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. 
>> linear system matrix = precond matrix: >> Mat Object: (fieldsplit_0_) 1 MPI processes >> type: mpiaij >> rows=480, cols=480 >> total: nonzeros=25200, allocated nonzeros=25200 >> total number of mallocs used during MatSetValues calls=0 >> using I-node (on process 0) routines: found 160 nodes, limit used is 5 >> Up solver (post-smoother) same as down solver (pre-smoother) >> linear system matrix = precond matrix: >> Mat Object: (fieldsplit_0_) 1 MPI processes >> type: mpiaij >> rows=480, cols=480 >> total: nonzeros=25200, allocated nonzeros=25200 >> total number of mallocs used during MatSetValues calls=0 >> using I-node (on process 0) routines: found 160 nodes, limit used is 5 >> A01 >> Mat Object: 1 MPI processes >> type: mpiaij >> rows=480, cols=144 >> total: nonzeros=48, allocated nonzeros=48 >> total number of mallocs used during MatSetValues calls=0 >> using I-node (on process 0) routines: found 135 nodes, limit used is 5 >> Mat Object: 1 MPI processes >> type: mpiaij >> rows=144, cols=144 >> total: nonzeros=240, allocated nonzeros=240 >> total number of mallocs used during MatSetValues calls=0 >> not using I-node (on process 0) routines >> linear system matrix = precond matrix: >> Mat Object: 1 MPI processes >> type: mpiaij >> rows=624, cols=624 >> total: nonzeros=25536, allocated nonzeros=25536 >> total number of mallocs used during MatSetValues calls=0 >> using I-node (on process 0) routines: found 336 nodes, limit used is 5 >> >> >> Thanks, >> Xiaofeng >> >> >> >>> On Jun 20, 2025, at 00:56, Mark Adams > wrote: >>> >>> This is what Matt is looking at: >>> >>> PC Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI processes >>> type: lu >>> >>> This should be svd, not lu >>> >>> If you had used -options_left you would have caught this mistake(s) >>> >>> On Thu, Jun 19, 2025 at 8:06?AM Matthew Knepley > wrote: >>>> On Thu, Jun 19, 2025 at 7:59?AM hexioafeng > wrote: >>>>> Hello sir, >>>>> >>>>> I remove the duplicated "_type", and get the same error and output. >>>> >>>> The output cannot be the same. Please send it. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>>> Best regards, >>>>> Xiaofeng >>>>> >>>>> >>>>>> On Jun 19, 2025, at 19:45, Matthew Knepley > wrote: >>>>>> >>>>>> This options is wrong >>>>>> >>>>>> -fieldsplit_0_mg_coarse_sub_pc_type_type svd >>>>>> >>>>>> Notice that "_type" is repeated. 
>>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> On Thu, Jun 19, 2025 at 7:10?AM hexioafeng > wrote: >>>>>>> Dear authors, >>>>>>> >>>>>>> Here are the options passed with fieldsplit preconditioner: >>>>>>> >>>>>>> -ksp_type cg -pc_type fieldsplit -pc_fieldsplit_detect_saddle_point -pc_fieldsplit_type schur -pc_fieldsplit_schur_precondition selfp -pc_fieldsplit_schur_fact_type full -fieldsplit_0_ksp_type preonly -fieldsplit_0_pc_type gamg -fieldsplit_0_mg_coarse_sub_pc_type_type svd -fieldsplit_1_ksp_type preonly -fieldsplit_1_pc_type bjacobi -ksp_view -ksp_monitor_true_residual -ksp_converged_reason -fieldsplit_0_mg_levels_ksp_monitor_true_residual -fieldsplit_0_mg_levels_ksp_converged_reason -fieldsplit_1_ksp_monitor_true_residual -fieldsplit_1_ksp_converged_reason >>>>>>> >>>>>>> and the output: >>>>>>> >>>>>>> 0 KSP unpreconditioned resid norm 2.777777777778e+01 true resid norm 2.777777777778e+01 ||r(i)||/||b|| 1.000000000000e+00 >>>>>>> Linear fieldsplit_0_mg_levels_1_ solve converged due to CONVERGED_ITS iterations 2 >>>>>>> Linear fieldsplit_0_mg_levels_1_ solve converged due to CONVERGED_ITS iterations 2 >>>>>>> Linear fieldsplit_1_ solve did not converge due to DIVERGED_PC_FAILED iterations 0 >>>>>>> PC failed due to SUBPC_ERROR >>>>>>> Linear fieldsplit_0_mg_levels_1_ solve converged due to CONVERGED_ITS iterations 2 >>>>>>> Linear fieldsplit_0_mg_levels_1_ solve converged due to CONVERGED_ITS iterations 2 >>>>>>> Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 >>>>>>> PC failed due to SUBPC_ERROR >>>>>>> KSP Object: 1 MPI processes >>>>>>> type: cg >>>>>>> maximum iterations=200, initial guess is zero >>>>>>> tolerances: relative=1e-06, absolute=1e-12, divergence=1e+30 >>>>>>> left preconditioning >>>>>>> using UNPRECONDITIONED norm type for convergence test >>>>>>> PC Object: 1 MPI processes >>>>>>> type: fieldsplit >>>>>>> FieldSplit with Schur preconditioner, blocksize = 1, factorization FULL >>>>>>> Preconditioner for the Schur complement formed from Sp, an assembled approximation to S, which uses A00's diagonal's inverse >>>>>>> Split info: >>>>>>> Split number 0 Defined by IS >>>>>>> Split number 1 Defined by IS >>>>>>> KSP solver for A00 block >>>>>>> KSP Object: (fieldsplit_0_) 1 MPI processes >>>>>>> type: preonly >>>>>>> maximum iterations=10000, initial guess is zero >>>>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>>>> left preconditioning >>>>>>> using NONE norm type for convergence test >>>>>>> PC Object: (fieldsplit_0_) 1 MPI processes >>>>>>> type: gamg >>>>>>> type is MULTIPLICATIVE, levels=2 cycles=v >>>>>>> Cycles per PCApply=1 >>>>>>> Using externally compute Galerkin coarse grid matrices >>>>>>> GAMG specific options >>>>>>> Threshold for dropping small values in graph on each level = >>>>>>> Threshold scaling factor for each level not specified = 1. >>>>>>> AGG specific options >>>>>>> Symmetric graph false >>>>>>> Number of levels to square graph 1 >>>>>>> Number smoothing steps 1 >>>>>>> Complexity: grid = 1.00222 >>>>>>> Coarse grid solver -- level ------------------------------- >>>>>>> KSP Object: (fieldsplit_0_mg_coarse_) 1 MPI processes >>>>>>> type: preonly >>>>>>> maximum iterations=10000, initial guess is zero >>>>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
>>>>>>> left preconditioning >>>>>>> using NONE norm type for convergence test >>>>>>> PC Object: (fieldsplit_0_mg_coarse_) 1 MPI processes >>>>>>> type: bjacobi >>>>>>> number of blocks = 1 >>>>>>> Local solver is the same for all blocks, as in the following KSP and PC objects on rank 0: >>>>>>> KSP Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI processes >>>>>>> type: preonly >>>>>>> maximum iterations=1, initial guess is zero >>>>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>>>> left preconditioning >>>>>>> using NONE norm type for convergence test >>>>>>> PC Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI processes >>>>>>> type: lu >>>>>>> out-of-place factorization >>>>>>> tolerance for zero pivot 2.22045e-14 >>>>>>> using diagonal shift on blocks to prevent zero pivot [INBLOCKS] >>>>>>> matrix ordering: nd >>>>>>> factor fill ratio given 5., needed 1. >>>>>>> Factored matrix follows: >>>>>>> Mat Object: 1 MPI processes >>>>>>> type: seqaij >>>>>>> rows=8, cols=8 >>>>>>> package used to perform factorization: petsc >>>>>>> total: nonzeros=56, allocated nonzeros=56 >>>>>>> using I-node routines: found 3 nodes, limit used is 5 >>>>>>> linear system matrix = precond matrix: >>>>>>> Mat Object: 1 MPI processes >>>>>>> type: seqaij >>>>>>> rows=8, cols=8 >>>>>>> total: nonzeros=56, allocated nonzeros=56 >>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>> using I-node routines: found 3 nodes, limit used is 5 >>>>>>> linear system matrix = precond matrix: >>>>>>> Mat Object: 1 MPI processes >>>>>>> type: mpiaij >>>>>>> rows=8, cols=8 >>>>>>> total: nonzeros=56, allocated nonzeros=56 >>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>> using nonscalable MatPtAP() implementation >>>>>>> using I-node (on process 0) routines: found 3 nodes, limit used is 5 >>>>>>> Down solver (pre-smoother) on level 1 ------------------------------- >>>>>>> KSP Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes >>>>>>> type: chebyshev >>>>>>> eigenvalue estimates used: min = 0.0998145, max = 1.09796 >>>>>>> eigenvalues estimate via gmres min 0.00156735, max 0.998145 >>>>>>> eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1] >>>>>>> KSP Object: (fieldsplit_0_mg_levels_1_esteig_) 1 MPI processes >>>>>>> type: gmres >>>>>>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>>>>>> happy breakdown tolerance 1e-30 >>>>>>> maximum iterations=10, initial guess is zero >>>>>>> tolerances: relative=1e-12, absolute=1e-50, divergence=10000. >>>>>>> left preconditioning >>>>>>> using PRECONDITIONED norm type for convergence test >>>>>>> estimating eigenvalues using noisy right hand side >>>>>>> maximum iterations=2, nonzero initial guess >>>>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>>>> left preconditioning >>>>>>> using NONE norm type for convergence test >>>>>>> PC Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes >>>>>>> type: sor >>>>>>> type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. 
>>>>>>> linear system matrix = precond matrix: >>>>>>> Mat Object: (fieldsplit_0_) 1 MPI processes >>>>>>> type: mpiaij >>>>>>> rows=480, cols=480 >>>>>>> total: nonzeros=25200, allocated nonzeros=25200 >>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>> using I-node (on process 0) routines: found 160 nodes, limit used is 5 >>>>>>> Up solver (post-smoother) same as down solver (pre-smoother) >>>>>>> linear system matrix = precond matrix: >>>>>>> Mat Object: (fieldsplit_0_) 1 MPI processes >>>>>>> type: mpiaij >>>>>>> rows=480, cols=480 >>>>>>> total: nonzeros=25200, allocated nonzeros=25200 >>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>> using I-node (on process 0) routines: found 160 nodes, limit used is 5 >>>>>>> KSP solver for S = A11 - A10 inv(A00) A01 >>>>>>> KSP Object: (fieldsplit_1_) 1 MPI processes >>>>>>> type: preonly >>>>>>> maximum iterations=10000, initial guess is zero >>>>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>>>> left preconditioning >>>>>>> using NONE norm type for convergence test >>>>>>> PC Object: (fieldsplit_1_) 1 MPI processes >>>>>>> type: bjacobi >>>>>>> number of blocks = 1 >>>>>>> Local solver is the same for all blocks, as in the following KSP and PC objects on rank 0: >>>>>>> KSP Object: (fieldsplit_1_sub_) 1 MPI processes >>>>>>> type: preonly >>>>>>> maximum iterations=10000, initial guess is zero >>>>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>>>> left preconditioning >>>>>>> using NONE norm type for convergence test >>>>>>> PC Object: (fieldsplit_1_sub_) 1 MPI processes >>>>>>> type: bjacobi >>>>>>> number of blocks = 1 >>>>>>> Local solver is the same for all blocks, as in the following KSP and PC objects on rank 0: >>>>>>> KSP Object: (fieldsplit_1_sub_sub_) 1 MPI processes >>>>>>> type: preonly >>>>>>> maximum iterations=10000, initial guess is zero >>>>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>>>> left preconditioning >>>>>>> using NONE norm type for convergence test >>>>>>> PC Object: (fieldsplit_1_sub_sub_) 1 MPI processes >>>>>>> type: ilu >>>>>>> out-of-place factorization >>>>>>> 0 levels of fill >>>>>>> tolerance for zero pivot 2.22045e-14 >>>>>>> matrix ordering: natural >>>>>>> factor fill ratio given 1., needed 1. 
>>>>>>> Factored matrix follows: >>>>>>> Mat Object: 1 MPI processes >>>>>>> type: seqaij >>>>>>> rows=144, cols=144 >>>>>>> package used to perform factorization: petsc >>>>>>> total: nonzeros=240, allocated nonzeros=240 >>>>>>> not using I-node routines >>>>>>> linear system matrix = precond matrix: >>>>>>> Mat Object: 1 MPI processes >>>>>>> type: seqaij >>>>>>> rows=144, cols=144 >>>>>>> total: nonzeros=240, allocated nonzeros=240 >>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>> not using I-node routines >>>>>>> linear system matrix = precond matrix: >>>>>>> Mat Object: 1 MPI processes >>>>>>> type: mpiaij >>>>>>> rows=144, cols=144 >>>>>>> total: nonzeros=240, allocated nonzeros=240 >>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>> not using I-node (on process 0) routines >>>>>>> linear system matrix followed by preconditioner matrix: >>>>>>> Mat Object: (fieldsplit_1_) 1 MPI processes >>>>>>> type: schurcomplement >>>>>>> rows=144, cols=144 >>>>>>> Schur complement A11 - A10 inv(A00) A01 >>>>>>> A11 >>>>>>> Mat Object: (fieldsplit_1_) 1 MPI processes >>>>>>> type: mpiaij >>>>>>> rows=144, cols=144 >>>>>>> total: nonzeros=240, allocated nonzeros=240 >>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>> not using I-node (on process 0) routines >>>>>>> A10 >>>>>>> Mat Object: 1 MPI processes >>>>>>> type: mpiaij >>>>>>> rows=144, cols=480 >>>>>>> total: nonzeros=48, allocated nonzeros=48 >>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>> using I-node (on process 0) routines: found 74 nodes, limit used is 5 >>>>>>> KSP of A00 >>>>>>> KSP Object: (fieldsplit_0_) 1 MPI processes >>>>>>> type: preonly >>>>>>> maximum iterations=10000, initial guess is zero >>>>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>>>> left preconditioning >>>>>>> using NONE norm type for convergence test >>>>>>> PC Object: (fieldsplit_0_) 1 MPI processes >>>>>>> type: gamg >>>>>>> type is MULTIPLICATIVE, levels=2 cycles=v >>>>>>> Cycles per PCApply=1 >>>>>>> Using externally compute Galerkin coarse grid matrices >>>>>>> GAMG specific options >>>>>>> Threshold for dropping small values in graph on each level = >>>>>>> Threshold scaling factor for each level not specified = 1. >>>>>>> AGG specific options >>>>>>> Symmetric graph false >>>>>>> Number of levels to square graph 1 >>>>>>> Number smoothing steps 1 >>>>>>> Complexity: grid = 1.00222 >>>>>>> Coarse grid solver -- level ------------------------------- >>>>>>> KSP Object: (fieldsplit_0_mg_coarse_) 1 MPI processes >>>>>>> type: preonly >>>>>>> maximum iterations=10000, initial guess is zero >>>>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>>>> left preconditioning >>>>>>> using NONE norm type for convergence test >>>>>>> PC Object: (fieldsplit_0_mg_coarse_) 1 MPI processes >>>>>>> type: bjacobi >>>>>>> number of blocks = 1 >>>>>>> Local solver is the same for all blocks, as in the following KSP and PC objects on rank 0: >>>>>>> KSP Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI processes >>>>>>> type: preonly >>>>>>> maximum iterations=1, initial guess is zero >>>>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
>>>>>>> left preconditioning >>>>>>> using NONE norm type for convergence test >>>>>>> PC Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI processes >>>>>>> type: lu >>>>>>> out-of-place factorization >>>>>>> tolerance for zero pivot 2.22045e-14 >>>>>>> using diagonal shift on blocks to prevent zero pivot [INBLOCKS] >>>>>>> matrix ordering: nd >>>>>>> factor fill ratio given 5., needed 1. >>>>>>> Factored matrix follows: >>>>>>> Mat Object: 1 MPI processes >>>>>>> type: seqaij >>>>>>> rows=8, cols=8 >>>>>>> package used to perform factorization: petsc >>>>>>> total: nonzeros=56, allocated nonzeros=56 >>>>>>> using I-node routines: found 3 nodes, limit used is 5 >>>>>>> linear system matrix = precond matrix: >>>>>>> Mat Object: 1 MPI processes >>>>>>> type: seqaij >>>>>>> rows=8, cols=8 >>>>>>> total: nonzeros=56, allocated nonzeros=56 >>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>> using I-node routines: found 3 nodes, limit used is 5 >>>>>>> linear system matrix = precond matrix: >>>>>>> Mat Object: 1 MPI processes >>>>>>> type: mpiaij >>>>>>> rows=8, cols=8 >>>>>>> total: nonzeros=56, allocated nonzeros=56 >>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>> using nonscalable MatPtAP() implementation >>>>>>> using I-node (on process 0) routines: found 3 nodes, limit used is 5 >>>>>>> Down solver (pre-smoother) on level 1 ------------------------------- >>>>>>> KSP Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes >>>>>>> type: chebyshev >>>>>>> eigenvalue estimates used: min = 0.0998145, max = 1.09796 >>>>>>> eigenvalues estimate via gmres min 0.00156735, max 0.998145 >>>>>>> eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1] >>>>>>> KSP Object: (fieldsplit_0_mg_levels_1_esteig_) 1 MPI processes >>>>>>> type: gmres >>>>>>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>>>>>> happy breakdown tolerance 1e-30 >>>>>>> maximum iterations=10, initial guess is zero >>>>>>> tolerances: relative=1e-12, absolute=1e-50, divergence=10000. >>>>>>> left preconditioning >>>>>>> using PRECONDITIONED norm type for convergence test >>>>>>> estimating eigenvalues using noisy right hand side >>>>>>> maximum iterations=2, nonzero initial guess >>>>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>>>> left preconditioning >>>>>>> using NONE norm type for convergence test >>>>>>> PC Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes >>>>>>> type: sor >>>>>>> type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. 
>>>>>>> linear system matrix = precond matrix: >>>>>>> Mat Object: (fieldsplit_0_) 1 MPI processes >>>>>>> type: mpiaij >>>>>>> rows=480, cols=480 >>>>>>> total: nonzeros=25200, allocated nonzeros=25200 >>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>> using I-node (on process 0) routines: found 160 nodes, limit used is 5 >>>>>>> Up solver (post-smoother) same as down solver (pre-smoother) >>>>>>> linear system matrix = precond matrix: >>>>>>> Mat Object: (fieldsplit_0_) 1 MPI processes >>>>>>> type: mpiaij >>>>>>> rows=480, cols=480 >>>>>>> total: nonzeros=25200, allocated nonzeros=25200 >>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>> using I-node (on process 0) routines: found 160 nodes, limit used is 5 >>>>>>> A01 >>>>>>> Mat Object: 1 MPI processes >>>>>>> type: mpiaij >>>>>>> rows=480, cols=144 >>>>>>> total: nonzeros=48, allocated nonzeros=48 >>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>> using I-node (on process 0) routines: found 135 nodes, limit used is 5 >>>>>>> Mat Object: 1 MPI processes >>>>>>> type: mpiaij >>>>>>> rows=144, cols=144 >>>>>>> total: nonzeros=240, allocated nonzeros=240 >>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>> not using I-node (on process 0) routines >>>>>>> linear system matrix = precond matrix: >>>>>>> Mat Object: 1 MPI processes >>>>>>> type: mpiaij >>>>>>> rows=624, cols=624 >>>>>>> total: nonzeros=25536, allocated nonzeros=25536 >>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>> using I-node (on process 0) routines: found 336 nodes, limit used is 5 >>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> Xiaofeng >>>>>>> >>>>>>> >>>>>>> >>>>>>>> On Jun 17, 2025, at 19:05, Mark Adams > wrote: >>>>>>>> >>>>>>>> And don't use -pc_gamg_parallel_coarse_grid_solver >>>>>>>> You can use that in production but for debugging use -mg_coarse_pc_type svd >>>>>>>> Also, use -options_left and remove anything that is not used. 
>>>>>>>> (I am puzzled, I see -pc_type gamg not -pc_type fieldsplit) >>>>>>>> >>>>>>>> Mark >>>>>>>> >>>>>>>> >>>>>>>> On Mon, Jun 16, 2025 at 6:40?AM Matthew Knepley > wrote: >>>>>>>>> On Sun, Jun 15, 2025 at 9:46?PM hexioafeng > wrote: >>>>>>>>>> Hello, >>>>>>>>>> >>>>>>>>>> Here are the options and outputs: >>>>>>>>>> >>>>>>>>>> options: >>>>>>>>>> >>>>>>>>>> -ksp_type cg -pc_type gamg -pc_gamg_parallel_coarse_grid_solver -pc_fieldsplit_detect_saddle_point -pc_fieldsplit_type schur -pc_fieldsplit_schur_precondition selfp -fieldsplit_1_mat_schur_complement_ainv_type lump -pc_fieldsplit_schur_fact_type full -fieldsplit_0_ksp_type preonly -fieldsplit_0_pc_type gamg -fieldsplit_0_mg_coarse_pc_type_type svd -fieldsplit_1_ksp_type preonly -fieldsplit_1_pc_type bjacobi -fieldsplit_1_sub_pc_type sor -ksp_view -ksp_monitor_true_residual -ksp_converged_reason -fieldsplit_0_mg_levels_ksp_monitor_true_residual -fieldsplit_0_mg_levels_ksp_converged_reason -fieldsplit_1_ksp_monitor_true_residual -fieldsplit_1_ksp_converged_reason >>>>>>>>> >>>>>>>>> This option was wrong: >>>>>>>>> >>>>>>>>> -fieldsplit_0_mg_coarse_pc_type_type svd >>>>>>>>> >>>>>>>>> from the output, we can see that it should have been >>>>>>>>> >>>>>>>>> -fieldsplit_0_mg_coarse_sub_pc_type_type svd >>>>>>>>> >>>>>>>>> THanks, >>>>>>>>> >>>>>>>>> Matt >>>>>>>>> >>>>>>>>>> output: >>>>>>>>>> >>>>>>>>>> 0 KSP unpreconditioned resid norm 2.777777777778e+01 true resid norm 2.777777777778e+01 ||r(i)||/||b|| 1.000000000000e+00 >>>>>>>>>> Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 >>>>>>>>>> PC failed due to SUBPC_ERROR >>>>>>>>>> KSP Object: 1 MPI processes >>>>>>>>>> type: cg >>>>>>>>>> maximum iterations=200, initial guess is zero >>>>>>>>>> tolerances: relative=1e-06, absolute=1e-12, divergence=1e+30 >>>>>>>>>> left preconditioning >>>>>>>>>> using UNPRECONDITIONED norm type for convergence test >>>>>>>>>> PC Object: 1 MPI processes >>>>>>>>>> type: gamg >>>>>>>>>> type is MULTIPLICATIVE, levels=2 cycles=v >>>>>>>>>> Cycles per PCApply=1 >>>>>>>>>> Using externally compute Galerkin coarse grid matrices >>>>>>>>>> GAMG specific options >>>>>>>>>> Threshold for dropping small values in graph on each level = >>>>>>>>>> Threshold scaling factor for each level not specified = 1. >>>>>>>>>> AGG specific options >>>>>>>>>> Symmetric graph false >>>>>>>>>> Number of levels to square graph 1 >>>>>>>>>> Number smoothing steps 1 >>>>>>>>>> Complexity: grid = 1.00176 >>>>>>>>>> Coarse grid solver -- level ------------------------------- >>>>>>>>>> KSP Object: (mg_coarse_) 1 MPI processes >>>>>>>>>> type: preonly >>>>>>>>>> maximum iterations=10000, initial guess is zero >>>>>>>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>>>>>>> left preconditioning >>>>>>>>>> using NONE norm type for convergence test >>>>>>>>>> PC Object: (mg_coarse_) 1 MPI processes >>>>>>>>>> type: bjacobi >>>>>>>>>> number of blocks = 1 >>>>>>>>>> Local solver is the same for all blocks, as in the following KSP and PC objects on rank 0: >>>>>>>>>> KSP Object: (mg_coarse_sub_) 1 MPI processes >>>>>>>>>> type: preonly >>>>>>>>>> maximum iterations=1, initial guess is zero >>>>>>>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
>>>>>>>>>> left preconditioning >>>>>>>>>> using NONE norm type for convergence test >>>>>>>>>> PC Object: (mg_coarse_sub_) 1 MPI processes >>>>>>>>>> type: lu >>>>>>>>>> out-of-place factorization >>>>>>>>>> tolerance for zero pivot 2.22045e-14 >>>>>>>>>> using diagonal shift on blocks to prevent zero pivot [INBLOCKS] >>>>>>>>>> matrix ordering: nd >>>>>>>>>> factor fill ratio given 5., needed 1. >>>>>>>>>> Factored matrix follows: >>>>>>>>>> Mat Object: 1 MPI processes >>>>>>>>>> type: seqaij >>>>>>>>>> rows=7, cols=7 >>>>>>>>>> package used to perform factorization: petsc >>>>>>>>>> total: nonzeros=45, allocated nonzeros=45 >>>>>>>>>> using I-node routines: found 3 nodes, limit used is 5 >>>>>>>>>> linear system matrix = precond matrix: >>>>>>>>>> Mat Object: 1 MPI processes >>>>>>>>>> type: seqaij >>>>>>>>>> rows=7, cols=7 >>>>>>>>>> total: nonzeros=45, allocated nonzeros=45 >>>>>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>>>>> using I-node routines: found 3 nodes, limit used is 5 >>>>>>>>>> linear system matrix = precond matrix: >>>>>>>>>> Mat Object: 1 MPI processes >>>>>>>>>> type: mpiaij >>>>>>>>>> rows=7, cols=7 >>>>>>>>>> total: nonzeros=45, allocated nonzeros=45 >>>>>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>>>>> using nonscalable MatPtAP() implementation >>>>>>>>>> using I-node (on process 0) routines: found 3 nodes, limit used is 5 >>>>>>>>>> Down solver (pre-smoother) on level 1 ------------------------------- >>>>>>>>>> KSP Object: (mg_levels_1_) 1 MPI processes >>>>>>>>>> type: chebyshev >>>>>>>>>> eigenvalue estimates used: min = 0., max = 0. >>>>>>>>>> eigenvalues estimate via gmres min 0., max 0. >>>>>>>>>> eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1] >>>>>>>>>> KSP Object: (mg_levels_1_esteig_) 1 MPI processes >>>>>>>>>> type: gmres >>>>>>>>>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>>>>>>>>> happy breakdown tolerance 1e-30 >>>>>>>>>> maximum iterations=10, initial guess is zero >>>>>>>>>> tolerances: relative=1e-12, absolute=1e-50, divergence=10000. >>>>>>>>>> left preconditioning >>>>>>>>>> using PRECONDITIONED norm type for convergence test >>>>>>>>>> PC Object: (mg_levels_1_) 1 MPI processes >>>>>>>>>> type: sor >>>>>>>>>> type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. >>>>>>>>>> linear system matrix = precond matrix: >>>>>>>>>> Mat Object: 1 MPI processes >>>>>>>>>> type: mpiaij >>>>>>>>>> rows=624, cols=624 >>>>>>>>>> total: nonzeros=25536, allocated nonzeros=25536 >>>>>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>>>>> using I-node (on process 0) routines: found 336 nodes, limit used is 5 >>>>>>>>>> estimating eigenvalues using noisy right hand side >>>>>>>>>> maximum iterations=2, nonzero initial guess >>>>>>>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>>>>>>> left preconditioning >>>>>>>>>> using NONE norm type for convergence test >>>>>>>>>> PC Object: (mg_levels_1_) 1 MPI processes >>>>>>>>>> type: sor >>>>>>>>>> type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. 
linear system matrix = precond matrix: >>>>>>>>>> Mat Object: 1 MPI processes >>>>>>>>>> type: mpiaij >>>>>>>>>> rows=624, cols=624 >>>>>>>>>> total: nonzeros=25536, allocated nonzeros=25536 >>>>>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>>>>> using I-node (on process 0) routines: found 336 nodes, limit used is 5 Up solver (post-smoother) same as down solver (pre-smoother) >>>>>>>>>> linear system matrix = precond matrix: >>>>>>>>>> Mat Object: 1 MPI processes >>>>>>>>>> type: mpiaij >>>>>>>>>> rows=624, cols=624 >>>>>>>>>> total: nonzeros=25536, allocated nonzeros=25536 >>>>>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>>>>> using I-node (on process 0) routines: found 336 nodes, limit used is 5 >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Best regards, >>>>>>>>>> >>>>>>>>>> Xiaofeng >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> On Jun 14, 2025, at 07:28, Barry Smith > wrote: >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Matt, >>>>>>>>>>> >>>>>>>>>>> Perhaps we should add options -ksp_monitor_debug and -snes_monitor_debug that turn on all possible monitoring for the (possibly) nested solvers and all of their converged reasons also? Note this is not completely trivial because each preconditioner will have to supply its list based on the current solver options for it. >>>>>>>>>>> >>>>>>>>>>> Then we won't need to constantly list a big string of problem specific monitor options to ask the user to use. >>>>>>>>>>> >>>>>>>>>>> Barry >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> On Jun 13, 2025, at 9:09?AM, Matthew Knepley > wrote: >>>>>>>>>>>> >>>>>>>>>>>> On Thu, Jun 12, 2025 at 10:55?PM hexioafeng > wrote: >>>>>>>>>>>>> Dear authors, >>>>>>>>>>>>> >>>>>>>>>>>>> I tried -pc_type game -pc_gamg_parallel_coarse_grid_solver and -pc_type field split -pc_fieldsplit_detect_saddle_point -fieldsplit_0_ksp_type pronely -fieldsplit_0_pc_type game -fieldsplit_0_mg_coarse_pc_type sad -fieldsplit_1_ksp_type pronely -fieldsplit_1_pc_type Jacobi _fieldsplit_1_sub_pc_type for , both options got the KSP_DIVERGE_PC_FAILED error. >>>>>>>>>>>> >>>>>>>>>>>> With any question about convergence, we need to see the output of >>>>>>>>>>>> >>>>>>>>>>>> -ksp_view -ksp_monitor_true_residual -ksp_converged_reason -fieldsplit_0_mg_levels_ksp_monitor_true_residual -fieldsplit_0_mg_levels_ksp_converged_reason -fieldsplit_1_ksp_monitor_true_residual -fieldsplit_1_ksp_converged_reason >>>>>>>>>>>> >>>>>>>>>>>> and all the error output. >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> >>>>>>>>>>>> Matt >>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> >>>>>>>>>>>>> Xiaofeng >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> On Jun 12, 2025, at 20:50, Mark Adams > wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Thu, Jun 12, 2025 at 8:44?AM Matthew Knepley > wrote: >>>>>>>>>>>>>>> On Thu, Jun 12, 2025 at 4:58?AM Mark Adams > wrote: >>>>>>>>>>>>>>>> Adding this to the PETSc mailing list, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Thu, Jun 12, 2025 at 3:43?AM hexioafeng > wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Dear Professor, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I hope this message finds you well. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I am an employee at a CAE company and a heavy user of the PETSc library. I would like to thank you for your contributions to PETSc and express my deep appreciation for your work. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Recently, I encountered some difficulties when using PETSc to solve structural mechanics problems with Lagrange multiplier constraints. 
After searching extensively online and reviewing several papers, I found your previous paper titled "Algebraic multigrid methods for constrained linear systems with applications to contact problems in solid mechanics" seems to be the most relevant and helpful. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> The stiffness matrix I'm working with, K, is a block saddle-point matrix of the form (A00 A01; A10 0), where A00 is singular?just as described in your paper, and different from many other articles . I have a few questions regarding your work and would greatly appreciate your insights: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> 1. Is the AMG/KKT method presented in your paper available in PETSc? I tried using CG+GAMG directly but received a KSP_DIVERGED_PC_FAILED error. I also attempted to use CG+PCFIELDSPLIT with the following options: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> No >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> -pc_type fieldsplit -pc_fieldsplit_detect_saddle_point -pc_fieldsplit_type schur -pc_fieldsplit_schur_precondition selfp -pc_fieldsplit_schur_fact_type full -fieldsplit_0_ksp_type preonly -fieldsplit_0_pc_type gamg -fieldsplit_1_ksp_type preonly -fieldsplit_1_pc_type bjacobi >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Unfortunately, this also resulted in a KSP_DIVERGED_PC_FAILED error. Do you have any suggestions? >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> 2. In your paper, you compare the method with Uzawa-type approaches. To my understanding, Uzawa methods typically require A00 to be invertible. How did you handle the singularity of A00 to construct an M-matrix that is invertible? >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> You add a regularization term like A01 * A10 (like springs). See the paper or any reference to augmented lagrange or Uzawa >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> 3. Can i implement the AMG/KKT method in your paper using existing AMG APIs? Implementing a production-level AMG solver from scratch would be quite challenging for me, so I?m hoping to utilize existing AMG interfaces within PETSc or other packages. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> You can do Uzawa and make the regularization matrix with matrix-matrix products. Just use AMG for the A00 block. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> 4. For saddle-point systems where A00 is singular, can you recommend any more robust or efficient solutions? Alternatively, are you aware of any open-source software packages that can handle such cases out-of-the-box? >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> No, and I don't think PETSc can do this out-of-the-box, but others may be able to give you a better idea of what PETSc can do. >>>>>>>>>>>>>>>> I think PETSc can do Uzawa or other similar algorithms but it will not do the regularization automatically (it is a bit more complicated than just A01 * A10) >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> One other trick you can use is to have >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -fieldsplit_0_mg_coarse_pc_type svd >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> This will use SVD on the coarse grid of GAMG, which can handle the null space in A00 as long as the prolongation does not put it back in. I have used this for the Laplacian with Neumann conditions and for freely floating elastic problems. >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Good point. >>>>>>>>>>>>>> You can also use -pc_gamg_parallel_coarse_grid_solver to get GAMG to use a on level iterative solver for the coarse grid. 
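A minimal sketch of the matrix-matrix product route mentioned above, assuming the blocks A00 and A01 are already assembled as Mat objects, that A10 = A01^T, and that gamma is an illustrative regularization weight (the scaling used in practice is more involved, as noted above):

    Mat         A00, A01, A10, R, A00reg;  /* A00, A01 assumed assembled elsewhere */
    PetscScalar gamma = 1.0;               /* illustrative weight, problem dependent */

    PetscCall(MatTranspose(A01, MAT_INITIAL_MATRIX, &A10));                 /* A10 = A01^T (assumption) */
    PetscCall(MatMatMult(A01, A10, MAT_INITIAL_MATRIX, PETSC_DEFAULT, &R)); /* R = A01 * A10 */
    PetscCall(MatDuplicate(A00, MAT_COPY_VALUES, &A00reg));
    PetscCall(MatAXPY(A00reg, gamma, R, DIFFERENT_NONZERO_PATTERN));        /* A00reg = A00 + gamma * R */

A00reg is then the invertible operator one would hand to AMG (e.g. -pc_type gamg) for the A00 block inside an Uzawa-type iteration; the saddle-point coupling itself still has to be handled by the outer iteration or by the fieldsplit machinery discussed above.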
>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> Matt >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> Mark >>>>>>>>>>>>>>>>> Thank you very much for taking the time to read my email. Looking forward to hearing from you. >>>>>>>>>>>>>>>>> Sincerely, >>>>>>>>>>>>>>>>> Xiaofeng He >>>>>>>>>>>>>>>>> ----------------------------------------------------- >>>>>>>>>>>>>>>>> Research Engineer >>>>>>>>>>>>>>>>> Internet Based Engineering, Beijing, China > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!dkaeuWgD0uDnpmdLS-Ef11c7lZ27yu7cYz1ISaMuYc5msucIYwTLq06ZlgwFE0jVH2skpeKWXSQEpath_7CngObn_lmBCQ$ -------------- next part -------------- An HTML attachment was scrubbed...
URL: From ali.ali_ahmad at utt.fr Fri Jun 20 04:10:24 2025 From: ali.ali_ahmad at utt.fr (Ali ALI AHMAD) Date: Fri, 20 Jun 2025 11:10:24 +0200 (CEST) Subject: [petsc-users] [petsc-maint] norm L2 problemQuestion about changing the norm used in nonlinear solvers (L2 Euclidean vs. L2 Lebesgue) In-Reply-To: <85CD2CA9-7B77-4288-87BA-9E108D40C7E8@petsc.dev> References: <414475981.6714047.1749631527145.JavaMail.zimbra@utt.fr> <1703896473.7853283.1749734882144.JavaMail.zimbra@utt.fr> <461035026.7868511.1749735776853.JavaMail.zimbra@utt.fr> <323745907.8383516.1749804945465.JavaMail.zimbra@utt.fr> <85CD2CA9-7B77-4288-87BA-9E108D40C7E8@petsc.dev> Message-ID: <82133477.13270009.1750410624370.JavaMail.zimbra@utt.fr>

* Yes, I am indeed using an inexact Newton method in my code. The descent direction is computed by solving a linear system involving the Jacobian, so the update follows the classical relation J(u^n) d(u^n) = -F(u^n), i.e. d(u^n) = -J(u^n)^{-1} F(u^n). I'm also trying to use a line search strategy based on a weighted L2 norm (in the Lebesgue sense), which a priori should lead to better accuracy and faster convergence in anisotropic settings.

* During the subsequent iterations, I apply the Eisenstat-Walker method to adapt the tolerance, which should also involve modifying the norm used in the algorithm.

* The current implementation still uses the standard Euclidean L2 norm in PETSc's linear solver and in GMRES. I believe this should ideally be replaced by a weighted L2 norm consistent with the discretization. However, I haven't yet succeeded in modifying the norm used internally by the linear solver in PETSc, so I'm not yet sure how much impact this change would have on the overall convergence, but I suspect it could improve robustness, especially for highly anisotropic problems.

I would greatly appreciate any guidance on how to implement this properly in PETSc. Do not hesitate to contact me again if anything remains unclear or if you need further information.

Best regards,
Ali ALI AHMAD

From: "Barry Smith" To: "Ali ALI AHMAD" Cc: "petsc-users", "petsc-maint" Sent: Saturday, 14 June 2025 01:06:52 Subject: Re: [petsc-maint] norm L2 problemQuestion about changing the norm used in nonlinear solvers (L2 Euclidean vs. L2 Lebesgue)

I appreciate the clarification. I would call 3) preconditioning. To increase my understanding, you are already using Newton's method? That is, you compute the Jacobian of the function and use - J^{-1}(u^n) F(u^n) as your update direction? When you switch the inner product (or precondition) how will the search direction be different?

Thanks

Barry

The case you need support for is becoming important to PETSc so we need to understand it well and support it well which is why I am asking these (perhaps to you) trivial questions.

On Jun 13, 2025, at 4:55 AM, Ali ALI AHMAD wrote:

Thank you for your message. To answer your question: I would like to use the L2 norm in the sense of Lebesgue for all three purposes, especially the third one.

1- For displaying residuals during the nonlinear iterations, I would like to observe the convergence behavior using a norm that better reflects the physical properties of the problem.

2- For convergence testing, I would like the stopping criterion to be based on a weighted L2 norm that accounts for the geometry of the mesh (since I am working with unstructured, anisotropic triangular meshes).
3- Most importantly, I would like to modify the inner product used in the algorithm so that it aligns with the weighted L2 norm (since I am working with unstructured, anisotropic triangular meshes).

Best regards,
Ali ALI AHMAD

From: "Barry Smith" To: "Ali ALI AHMAD" Cc: "petsc-users", "petsc-maint" Sent: Friday, 13 June 2025 03:14:06 Subject: Re: [petsc-maint] norm L2 problemQuestion about changing the norm used in nonlinear solvers (L2 Euclidean vs. L2 Lebesgue)

You haven't answered my question. Where (conceptually) and for what purpose do you want to use the L2 norm.

1) displaying norms to observe the convergence behavior

2) in the convergence testing to determine when to stop

3) changing the "inner product" in the algorithm which amounts to preconditioning.

Barry

BQ_BEGIN

On Jun 12, 2025, at 9:42 AM, Ali ALI AHMAD wrote:

Thank you for your answer. I am currently working with the nonlinear solvers newtonls (with bt, l2, etc.) and newtontr (using newton, cauchy, and dogleg strategies) combined with the linear solver gmres and the ILU preconditioner, since my Jacobian matrix is nonsymmetric. I also use the Eisenstat-Walker method for newtonls, as my initial guess is often very far from the exact solution. What I would like to do now is to replace the standard Euclidean L2 norm with the L2 norm in the Lebesgue sense in the above numerical algorithm, because my problem is defined on an unstructured, anisotropic triangular mesh where a weighted norm would be more physically appropriate. Would you be able to advise me on how to implement this change properly? I would deeply appreciate any guidance or suggestions you could provide. Thank you in advance for your help.

Best regards,
Ali ALI AHMAD

From: "Ali ALI AHMAD" To: "Barry Smith" Cc: "petsc-users", "petsc-maint" Sent: Thursday, 12 June 2025 15:28:02 Subject: Re: [petsc-maint] norm L2 problemQuestion about changing the norm used in nonlinear solvers (L2 Euclidean vs. L2 Lebesgue)

Thank you for your answer. I am currently working with the nonlinear solvers newtonls (with bt, l2, etc.) and newtontr (using newton, cauchy, and dogleg strategies) combined with the linear solver gmres and the ILU preconditioner, since my Jacobian matrix is nonsymmetric. I also use the Eisenstat-Walker method for newtonls, as my initial guess is often very far from the exact solution. What I would like to do now is to replace the standard Euclidean L2 norm with the L2 norm in the Lebesgue sense, because my problem is defined on an unstructured, anisotropic triangular mesh where a weighted norm would be more physically appropriate. Would you be able to advise me on how to implement this change properly? I would deeply appreciate any guidance or suggestions you could provide. Thank you in advance for your help.

Best regards,
Ali ALI AHMAD

From: "Barry Smith" To: "Ali ALI AHMAD" Cc: "petsc-users", "petsc-maint" Sent: Thursday, 12 June 2025 14:57:40 Subject: Re: [petsc-maint] norm L2 problemQuestion about changing the norm used in nonlinear solvers (L2 Euclidean vs. L2 Lebesgue)

Do you wish to use a different norm

1) ONLY for displaying (printing out) the residual norms to track progress

2) in the convergence testing

3) to change the numerical algorithm (for example using the L2 inner product instead of the usual linear algebra R^N l2 inner product).

For 1) use SNESMonitorSet() and in your monitor function use SNESGetSolution() to grab the solution and then VecGetArray(). Now you can compute any weighted norm you want on the solution.
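A minimal sketch of that "For 1)" recipe in C, against a recent PETSc, could look like the following. The context struct, the weight vector wts (for example lumped cell measures attached to each owned degree of freedom), and the function name WeightedNormMonitor are illustrative assumptions, not PETSc API; only SNESMonitorSet(), SNESGetSolution(), and the Vec array accessors are.

    #include <petscsnes.h>

    typedef struct {
      Vec wts; /* assumed: per-dof Lebesgue weights, e.g. lumped cell measures */
    } WeightedNormCtx;

    /* Monitor callback: grab the current iterate, form a weighted L2 norm, print it */
    static PetscErrorCode WeightedNormMonitor(SNES snes, PetscInt it, PetscReal fnorm, void *mctx)
    {
      WeightedNormCtx   *ctx = (WeightedNormCtx *)mctx;
      Vec                u;
      const PetscScalar *ua, *wa;
      PetscInt           n;
      PetscReal          local = 0.0, wnorm;
      MPI_Comm           comm;

      PetscFunctionBeginUser;
      PetscCall(SNESGetSolution(snes, &u));
      PetscCall(VecGetLocalSize(u, &n));
      PetscCall(VecGetArrayRead(u, &ua));
      PetscCall(VecGetArrayRead(ctx->wts, &wa));
      for (PetscInt i = 0; i < n; i++) local += PetscRealPart(wa[i]) * PetscSqr(PetscAbsScalar(ua[i]));
      PetscCall(VecRestoreArrayRead(ctx->wts, &wa));
      PetscCall(VecRestoreArrayRead(u, &ua));
      PetscCall(PetscObjectGetComm((PetscObject)snes, &comm));
      PetscCallMPI(MPI_Allreduce(&local, &wnorm, 1, MPIU_REAL, MPI_SUM, comm));
      wnorm = PetscSqrtReal(wnorm);
      PetscCall(PetscPrintf(comm, "%3d SNES weighted L2 norm %g (Euclidean residual norm %g)\n", (int)it, (double)wnorm, (double)fnorm));
      PetscFunctionReturn(PETSC_SUCCESS);
    }

Registration is a single call after the SNES is set up, e.g. PetscCall(SNESMonitorSet(snes, WeightedNormMonitor, &ctx, NULL)). The same context and weighting could be reused in a custom convergence test, which is the SNESSetConvergenceTest() route described next.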
For 2) similar but you need to use SNESSetConvergenceTest For 3) yes, but you need to ask us specifically. Barry BQ_BEGIN On Jun 11, 2025, at 4:45 AM, Ali ALI AHMAD wrote: Dear PETSc team, I hope this message finds you well. I am currently using PETSc in a C++, where I rely on the nonlinear solvers `SNES` with either `newtonls` or `newtontr` methods. I would like to ask if it is possible to change the default norm used (typically the L2 Euclidean norm) to a custom norm, specifically the L2 norm in the sense of Lebesgue (e.g., involving cell-wise weighted integrals over the domain). My main goal is to define a custom residual norm that better reflects the physical quantities of interest in my simulation. Would this be feasible within the PETSc framework? If so, could you point me to the recommended approach (e.g., redefining the norm manually, using specific PETSc hooks or options)? Thank you very much in advance for your help and for the great work on PETSc! Best regards, Ali ALI AHMAD PhD Student University of Technology of Troyes - UTT - France GAMMA3 Project - Office H008 - Phone No: +33 7 67 44 68 18 12 rue Marie Curie - CS 42060 10004 TROYES Cedex BQ_END BQ_END -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Jun 20 06:55:33 2025 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 20 Jun 2025 07:55:33 -0400 Subject: [petsc-users] Questions Regarding PETSc and Solving Constrained Structural Mechanics Problems In-Reply-To: References: <3BF3C1E8-0CB0-42F1-A624-8FA0DC7FD4A4@buaa.edu.cn> <35A61411-85CF-4F48-9DD6-0409F0CFE598@petsc.dev> <5CFF6556-4BDE-48D9-9D3A-6D8790465358@buaa.edu.cn> <87A02E48-DDE4-4DCD-8C52-D2DAF975EF01@buaa.edu.cn> <96AB5047-4A35-49A2-B948-86656A1CFB5B@buaa.edu.cn> <82A9172B-C0DE-45F8-8DBD-9F351389541B@buaa.edu.cn> Message-ID: On Thu, Jun 19, 2025 at 10:49?PM hexioafeng wrote: > I tried to solve S with game and use svd on the coarse grid, then I got > the error: Arugments are incompatible, Zero diagonal on row 0. > You should not get that error, so I suspect something is wrong with your setup. Please always send the full output. > In my opinion, S should be rank-efficient, but not elliptic. > Do you mean full rank? That is unlikely since A_00 is rank deficient. Thanks, Matt > Best regards, > Xiaofeng > > > On Jun 20, 2025, at 10:07, Matthew Knepley wrote: > > On Thu, Jun 19, 2025 at 9:18?PM hexioafeng wrote: > >> Hello, >> >> Here are the outputs with svd: >> >> 0 KSP unpreconditioned resid norm 2.777777777778e+01 true resid norm >> 2.777777777778e+01 ||r(i)||/||b|| 1.000000000000e+00 >> Linear fieldsplit_0_mg_levels_1_ solve converged due to >> CONVERGED_ITS iterations 2 >> Linear fieldsplit_0_mg_levels_1_ solve converged due to >> CONVERGED_ITS iterations 2 >> Linear fieldsplit_1_ solve did not converge due to DIVERGED_PC_FAILED >> > > You are running ILU(0) on your Schur complement, but it looks like it is > rank-deficient. You will have to use something that works for that (like > maybe GAMG again with SVD on the coarse grid). Is S elliptic? 
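A minimal sketch of the options that suggestion might translate to, keeping the prefixes from the output above (on one rank the GAMG coarse solve sits inside a bjacobi wrapper, hence the extra sub_ segment, exactly as in the fieldsplit_0 coarse-grid discussion earlier in the thread):

    -fieldsplit_1_ksp_type preonly
    -fieldsplit_1_pc_type gamg
    -fieldsplit_1_mg_coarse_sub_pc_type svd

With -pc_fieldsplit_schur_precondition selfp the assembled approximation Sp is what GAMG would be built on. Whether this is a good idea depends on the answer to the question above, so treat it as a starting point for experiments rather than a recommendation.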
> > Thanks, > > Matt > > >> iterations 0 >> PC failed due to SUBPC_ERROR >> Linear fieldsplit_0_mg_levels_1_ solve converged due to >> CONVERGED_ITS iterations 2 >> Linear fieldsplit_0_mg_levels_1_ solve converged due to >> CONVERGED_ITS iterations 2 >> Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 >> PC failed due to SUBPC_ERROR >> KSP Object: 1 MPI processes >> type: cg >> maximum iterations=200, initial guess is zero >> tolerances: relative=1e-06, absolute=1e-12, divergence=1e+30 >> left preconditioning >> using UNPRECONDITIONED norm type for convergence test >> PC Object: 1 MPI processes >> type: fieldsplit >> FieldSplit with Schur preconditioner, blocksize = 1, factorization >> FULL >> Preconditioner for the Schur complement formed from Sp, an assembled >> approximation to S, which uses A00's diagonal's inverse >> Split info: >> Split number 0 Defined by IS >> Split number 1 Defined by IS >> KSP solver for A00 block >> KSP Object: (fieldsplit_0_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_0_) 1 MPI processes >> type: gamg >> type is MULTIPLICATIVE, levels=2 cycles=v >> Cycles per PCApply=1 >> Using externally compute Galerkin coarse grid matrices >> GAMG specific options >> Threshold for dropping small values in graph on each level >> = >> Threshold scaling factor for each level not specified = 1. >> AGG specific options >> Symmetric graph false >> Number of levels to square graph 1 >> Number smoothing steps 1 >> Complexity: grid = 1.00222 >> Coarse grid solver -- level ------------------------------- >> KSP Object: (fieldsplit_0_mg_coarse_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, >> divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_0_mg_coarse_) 1 MPI processes >> type: bjacobi >> number of blocks = 1 >> Local solver is the same for all blocks, as in the >> following KSP and PC objects on rank 0: >> KSP Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI processes >> type: preonly >> maximum iterations=1, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, >> divergence=10000. 
>> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI processes >> type: svd >> All singular values smaller than 1e-12 treated as zero >> Provided essential rank of the matrix 0 (all other >> eigenvalues are zeroed) >> linear system matrix = precond matrix: >> Mat Object: 1 MPI processes >> type: seqaij >> rows=8, cols=8 >> total: nonzeros=56, allocated nonzeros=56 >> total number of mallocs used during MatSetValues calls=0 >> using I-node routines: found 3 nodes, limit used is 5 >> linear system matrix = precond matrix: >> Mat Object: 1 MPI processes >> type: mpiaij >> rows=8, cols=8 >> total: nonzeros=56, allocated nonzeros=56 >> total number of mallocs used during MatSetValues calls=0 >> using nonscalable MatPtAP() implementation >> using I-node (on process 0) routines: found 3 nodes, >> limit used is 5 >> Down solver (pre-smoother) on level 1 >> ------------------------------- >> KSP Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes >> type: chebyshev >> eigenvalue estimates used: min = 0.0998145, max = 1.09796 >> eigenvalues estimate via gmres min 0.00156735, max 0.998145 >> eigenvalues estimated using gmres with translations [0. >> 0.1; 0. 1.1] >> KSP Object: (fieldsplit_0_mg_levels_1_esteig_) 1 MPI >> processes >> type: gmres >> restart=30, using Classical (unmodified) Gram-Schmidt >> Orthogonalization with no iterative refinement >> happy breakdown tolerance 1e-30 >> maximum iterations=10, initial guess is zero >> tolerances: relative=1e-12, absolute=1e-50, >> divergence=10000. >> left preconditioning >> using PRECONDITIONED norm type for convergence test >> estimating eigenvalues using noisy right hand side >> maximum iterations=2, nonzero initial guess >> tolerances: relative=1e-05, absolute=1e-50, >> divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes >> type: sor >> type = local_symmetric, iterations = 1, local iterations = >> 1, omega = 1. >> linear system matrix = precond matrix: >> Mat Object: (fieldsplit_0_) 1 MPI processes >> type: mpiaij >> rows=480, cols=480 >> total: nonzeros=25200, allocated nonzeros=25200 >> total number of mallocs used during MatSetValues calls=0 >> using I-node (on process 0) routines: found 160 nodes, >> limit used is 5 >> Up solver (post-smoother) same as down solver (pre-smoother) >> linear system matrix = precond matrix: >> Mat Object: (fieldsplit_0_) 1 MPI processes >> type: mpiaij >> rows=480, cols=480 >> total: nonzeros=25200, allocated nonzeros=25200 >> total number of mallocs used during MatSetValues calls=0 >> using I-node (on process 0) routines: found 160 nodes, limit >> used is 5 >> KSP solver for S = A11 - A10 inv(A00) A01 >> KSP Object: (fieldsplit_1_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_1_) 1 MPI processes >> type: bjacobi >> number of blocks = 1 >> Local solver is the same for all blocks, as in the following >> KSP and PC objects on rank 0: >> KSP Object: (fieldsplit_1_sub_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
>> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_1_sub_) 1 MPI processes >> type: bjacobi >> number of blocks = 1 >> Local solver is the same for all blocks, as in the following >> KSP and PC objects on rank 0: >> KSP Object: (fieldsplit_1_sub_sub_) >> 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, >> divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_1_sub_sub_) >> 1 MPI processes >> type: ilu >> out-of-place factorization >> 0 levels of fill >> tolerance for zero pivot 2.22045e-14 >> matrix ordering: natural >> factor fill ratio given 1., needed 1. >> Factored matrix follows: >> Mat Object: 1 MPI processes >> type: seqaij >> rows=144, cols=144 >> package used to perform factorization: >> petsc >> total: nonzeros=240, allocated nonzeros=240 >> not using I-node routines >> linear system matrix = precond matrix: >> Mat Object: 1 MPI processes >> type: seqaij >> rows=144, cols=144 >> total: nonzeros=240, allocated nonzeros=240 >> total number of mallocs used during MatSetValues >> calls=0 >> not using I-node routines >> linear system matrix = precond matrix: >> Mat Object: 1 MPI processes >> type: mpiaij >> rows=144, cols=144 >> total: nonzeros=240, allocated nonzeros=240 >> total number of mallocs used during MatSetValues calls=0 >> not using I-node (on process 0) routines >> linear system matrix followed by preconditioner matrix: >> Mat Object: (fieldsplit_1_) 1 MPI processes >> type: schurcomplement >> rows=144, cols=144 >> Schur complement A11 - A10 inv(A00) A01 >> A11 >> Mat Object: (fieldsplit_1_) 1 MPI processes >> type: mpiaij >> rows=144, cols=144 >> total: nonzeros=240, allocated nonzeros=240 >> total number of mallocs used during MatSetValues calls=0 >> not using I-node (on process 0) routines >> A10 >> Mat Object: 1 MPI processes >> type: mpiaij >> rows=144, cols=480 >> total: nonzeros=48, allocated nonzeros=48 >> total number of mallocs used during MatSetValues calls=0 >> using I-node (on process 0) routines: found 74 nodes, >> limit used is 5 >> KSP of A00 >> KSP Object: (fieldsplit_0_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, >> divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_0_) 1 MPI processes >> type: gamg >> type is MULTIPLICATIVE, levels=2 cycles=v >> Cycles per PCApply=1 >> Using externally compute Galerkin coarse grid >> matrices >> GAMG specific options >> Threshold for dropping small values in graph on >> each level = >> Threshold scaling factor for each level not >> specified = 1. >> AGG specific options >> Symmetric graph false >> Number of levels to square graph 1 >> Number smoothing steps 1 >> Complexity: grid = 1.00222 >> Coarse grid solver -- level >> ------------------------------- >> KSP Object: (fieldsplit_0_mg_coarse_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, >> divergence=10000. 
>> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_0_mg_coarse_) 1 MPI processes >> type: bjacobi >> number of blocks = 1 >> Local solver is the same for all blocks, as in the >> following KSP and PC objects on rank 0: >> KSP Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI >> processes >> type: preonly >> maximum iterations=1, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, >> divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI >> processes >> type: svd >> All singular values smaller than 1e-12 treated >> as zero >> Provided essential rank of the matrix 0 (all >> other eigenvalues are zeroed) >> linear system matrix = precond matrix: >> Mat Object: 1 MPI processes >> type: seqaij >> rows=8, cols=8 >> total: nonzeros=56, allocated nonzeros=56 >> total number of mallocs used during MatSetValues >> calls=0 >> using I-node routines: found 3 nodes, limit >> used is 5 >> linear system matrix = precond matrix: >> Mat Object: 1 MPI processes >> type: mpiaij >> rows=8, cols=8 >> total: nonzeros=56, allocated nonzeros=56 >> total number of mallocs used during MatSetValues >> calls=0 >> using nonscalable MatPtAP() implementation >> using I-node (on process 0) routines: found 3 >> nodes, limit used is 5 >> Down solver (pre-smoother) on level 1 >> ------------------------------- >> KSP Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes >> type: chebyshev >> eigenvalue estimates used: min = 0.0998145, max = >> 1.09796 >> eigenvalues estimate via gmres min 0.00156735, max >> 0.998145 >> eigenvalues estimated using gmres with >> translations [0. 0.1; 0. 1.1] >> KSP Object: (fieldsplit_0_mg_levels_1_esteig_) 1 >> MPI processes >> type: gmres >> restart=30, using Classical (unmodified) >> Gram-Schmidt Orthogonalization with no iterative refinement >> happy breakdown tolerance 1e-30 >> maximum iterations=10, initial guess is zero >> tolerances: relative=1e-12, absolute=1e-50, >> divergence=10000. >> left preconditioning >> using PRECONDITIONED norm type for convergence >> test >> estimating eigenvalues using noisy right hand side >> maximum iterations=2, nonzero initial guess >> tolerances: relative=1e-05, absolute=1e-50, >> divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes >> type: sor >> type = local_symmetric, iterations = 1, local >> iterations = 1, omega = 1. 
>> linear system matrix = precond matrix: >> Mat Object: (fieldsplit_0_) 1 MPI processes >> type: mpiaij >> rows=480, cols=480 >> total: nonzeros=25200, allocated nonzeros=25200 >> total number of mallocs used during MatSetValues >> calls=0 >> using I-node (on process 0) routines: found 160 >> nodes, limit used is 5 >> Up solver (post-smoother) same as down solver >> (pre-smoother) >> linear system matrix = precond matrix: >> Mat Object: (fieldsplit_0_) 1 MPI processes >> type: mpiaij >> rows=480, cols=480 >> total: nonzeros=25200, allocated nonzeros=25200 >> total number of mallocs used during MatSetValues >> calls=0 >> using I-node (on process 0) routines: found 160 >> nodes, limit used is 5 >> A01 >> Mat Object: 1 MPI processes >> type: mpiaij >> rows=480, cols=144 >> total: nonzeros=48, allocated nonzeros=48 >> total number of mallocs used during MatSetValues calls=0 >> using I-node (on process 0) routines: found 135 nodes, >> limit used is 5 >> Mat Object: 1 MPI processes >> type: mpiaij >> rows=144, cols=144 >> total: nonzeros=240, allocated nonzeros=240 >> total number of mallocs used during MatSetValues calls=0 >> not using I-node (on process 0) routines >> linear system matrix = precond matrix: >> Mat Object: 1 MPI processes >> type: mpiaij >> rows=624, cols=624 >> total: nonzeros=25536, allocated nonzeros=25536 >> total number of mallocs used during MatSetValues calls=0 >> using I-node (on process 0) routines: found 336 nodes, limit used >> is 5 >> >> >> Thanks, >> Xiaofeng >> >> >> >> On Jun 20, 2025, at 00:56, Mark Adams wrote: >> >> This is what Matt is looking at: >> >> PC Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI processes >> type: lu >> >> This should be svd, not lu >> >> If you had used -options_left you would have caught this mistake(s) >> >> On Thu, Jun 19, 2025 at 8:06?AM Matthew Knepley >> wrote: >> >>> On Thu, Jun 19, 2025 at 7:59?AM hexioafeng >>> wrote: >>> >>>> Hello sir, >>>> >>>> I remove the duplicated "_type", and get the same error and output. >>>> >>> >>> The output cannot be the same. Please send it. >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Best regards, >>>> Xiaofeng >>>> >>>> >>>> On Jun 19, 2025, at 19:45, Matthew Knepley wrote: >>>> >>>> This options is wrong >>>> >>>> -fieldsplit_0_mg_coarse_sub_pc_type_type svd >>>> >>>> Notice that "_type" is repeated. 
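With the duplicated "_type" removed, the coarse-grid option discussed here reads

    -fieldsplit_0_mg_coarse_sub_pc_type svd

(the sub_ part of the prefix comes from the block-Jacobi wrapper GAMG puts around its serial coarse solve, visible in the -ksp_view output above). A corrected version of the option set quoted below would therefore look, for example, like

    -ksp_type cg -pc_type fieldsplit -pc_fieldsplit_detect_saddle_point
      -pc_fieldsplit_type schur -pc_fieldsplit_schur_precondition selfp
      -pc_fieldsplit_schur_fact_type full
      -fieldsplit_0_ksp_type preonly -fieldsplit_0_pc_type gamg
      -fieldsplit_0_mg_coarse_sub_pc_type svd
      -fieldsplit_1_ksp_type preonly -fieldsplit_1_pc_type bjacobi

and, as Mark notes above, running with -options_left would have flagged the mistyped entry as an unused option.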
>>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> On Thu, Jun 19, 2025 at 7:10?AM hexioafeng >>>> wrote: >>>> >>>>> Dear authors, >>>>> >>>>> Here are the options passed with fieldsplit preconditioner: >>>>> >>>>> -ksp_type cg -pc_type fieldsplit -pc_fieldsplit_detect_saddle_point >>>>> -pc_fieldsplit_type schur -pc_fieldsplit_schur_precondition selfp >>>>> -pc_fieldsplit_schur_fact_type full -fieldsplit_0_ksp_type preonly >>>>> -fieldsplit_0_pc_type gamg -fieldsplit_0_mg_coarse_sub_pc_type_type svd >>>>> -fieldsplit_1_ksp_type preonly -fieldsplit_1_pc_type bjacobi -ksp_view >>>>> -ksp_monitor_true_residual -ksp_converged_reason >>>>> -fieldsplit_0_mg_levels_ksp_monitor_true_residual >>>>> -fieldsplit_0_mg_levels_ksp_converged_reason >>>>> -fieldsplit_1_ksp_monitor_true_residual >>>>> -fieldsplit_1_ksp_converged_reason >>>>> >>>>> and the output: >>>>> >>>>> 0 KSP unpreconditioned resid norm 2.777777777778e+01 true resid norm >>>>> 2.777777777778e+01 ||r(i)||/||b|| 1.000000000000e+00 >>>>> Linear fieldsplit_0_mg_levels_1_ solve converged due to >>>>> CONVERGED_ITS iterations 2 >>>>> Linear fieldsplit_0_mg_levels_1_ solve converged due to >>>>> CONVERGED_ITS iterations 2 >>>>> Linear fieldsplit_1_ solve did not converge due to >>>>> DIVERGED_PC_FAILED iterations 0 >>>>> PC failed due to SUBPC_ERROR >>>>> Linear fieldsplit_0_mg_levels_1_ solve converged due to >>>>> CONVERGED_ITS iterations 2 >>>>> Linear fieldsplit_0_mg_levels_1_ solve converged due to >>>>> CONVERGED_ITS iterations 2 >>>>> Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 >>>>> PC failed due to SUBPC_ERROR >>>>> KSP Object: 1 MPI processes >>>>> type: cg >>>>> maximum iterations=200, initial guess is zero >>>>> tolerances: relative=1e-06, absolute=1e-12, divergence=1e+30 >>>>> left preconditioning >>>>> using UNPRECONDITIONED norm type for convergence test >>>>> PC Object: 1 MPI processes >>>>> type: fieldsplit >>>>> FieldSplit with Schur preconditioner, blocksize = 1, >>>>> factorization FULL >>>>> Preconditioner for the Schur complement formed from Sp, an >>>>> assembled approximation to S, which uses A00's diagonal's inverse >>>>> Split info: >>>>> Split number 0 Defined by IS >>>>> Split number 1 Defined by IS >>>>> KSP solver for A00 block >>>>> KSP Object: (fieldsplit_0_) 1 MPI processes >>>>> type: preonly >>>>> maximum iterations=10000, initial guess is zero >>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>> left preconditioning >>>>> using NONE norm type for convergence test >>>>> PC Object: (fieldsplit_0_) 1 MPI processes >>>>> type: gamg >>>>> type is MULTIPLICATIVE, levels=2 cycles=v >>>>> Cycles per PCApply=1 >>>>> Using externally compute Galerkin coarse grid matrices >>>>> GAMG specific options >>>>> Threshold for dropping small values in graph on each >>>>> level = >>>>> Threshold scaling factor for each level not specified = >>>>> 1. >>>>> AGG specific options >>>>> Symmetric graph false >>>>> Number of levels to square graph 1 >>>>> Number smoothing steps 1 >>>>> Complexity: grid = 1.00222 >>>>> Coarse grid solver -- level ------------------------------- >>>>> KSP Object: (fieldsplit_0_mg_coarse_) 1 MPI processes >>>>> type: preonly >>>>> maximum iterations=10000, initial guess is zero >>>>> tolerances: relative=1e-05, absolute=1e-50, >>>>> divergence=10000. 
>>>>> left preconditioning >>>>> using NONE norm type for convergence test >>>>> PC Object: (fieldsplit_0_mg_coarse_) 1 MPI processes >>>>> type: bjacobi >>>>> number of blocks = 1 >>>>> Local solver is the same for all blocks, as in the >>>>> following KSP and PC objects on rank 0: >>>>> KSP Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI processes >>>>> type: preonly >>>>> maximum iterations=1, initial guess is zero >>>>> tolerances: relative=1e-05, absolute=1e-50, >>>>> divergence=10000. >>>>> left preconditioning >>>>> using NONE norm type for convergence test >>>>> PC Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI processes >>>>> type: lu >>>>> out-of-place factorization >>>>> tolerance for zero pivot 2.22045e-14 >>>>> using diagonal shift on blocks to prevent zero pivot >>>>> [INBLOCKS] >>>>> matrix ordering: nd >>>>> factor fill ratio given 5., needed 1. >>>>> Factored matrix follows: >>>>> Mat Object: 1 MPI processes >>>>> type: seqaij >>>>> rows=8, cols=8 >>>>> package used to perform factorization: petsc >>>>> total: nonzeros=56, allocated nonzeros=56 >>>>> using I-node routines: found 3 nodes, limit >>>>> used is 5 >>>>> linear system matrix = precond matrix: >>>>> Mat Object: 1 MPI processes >>>>> type: seqaij >>>>> rows=8, cols=8 >>>>> total: nonzeros=56, allocated nonzeros=56 >>>>> total number of mallocs used during MatSetValues >>>>> calls=0 >>>>> using I-node routines: found 3 nodes, limit used is >>>>> 5 >>>>> linear system matrix = precond matrix: >>>>> Mat Object: 1 MPI processes >>>>> type: mpiaij >>>>> rows=8, cols=8 >>>>> total: nonzeros=56, allocated nonzeros=56 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> using nonscalable MatPtAP() implementation >>>>> using I-node (on process 0) routines: found 3 nodes, >>>>> limit used is 5 >>>>> Down solver (pre-smoother) on level 1 >>>>> ------------------------------- >>>>> KSP Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes >>>>> type: chebyshev >>>>> eigenvalue estimates used: min = 0.0998145, max = >>>>> 1.09796 >>>>> eigenvalues estimate via gmres min 0.00156735, max >>>>> 0.998145 >>>>> eigenvalues estimated using gmres with translations >>>>> [0. 0.1; 0. 1.1] >>>>> KSP Object: (fieldsplit_0_mg_levels_1_esteig_) 1 MPI >>>>> processes >>>>> type: gmres >>>>> restart=30, using Classical (unmodified) >>>>> Gram-Schmidt Orthogonalization with no iterative refinement >>>>> happy breakdown tolerance 1e-30 >>>>> maximum iterations=10, initial guess is zero >>>>> tolerances: relative=1e-12, absolute=1e-50, >>>>> divergence=10000. >>>>> left preconditioning >>>>> using PRECONDITIONED norm type for convergence test >>>>> estimating eigenvalues using noisy right hand side >>>>> maximum iterations=2, nonzero initial guess >>>>> tolerances: relative=1e-05, absolute=1e-50, >>>>> divergence=10000. >>>>> left preconditioning >>>>> using NONE norm type for convergence test >>>>> PC Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes >>>>> type: sor >>>>> type = local_symmetric, iterations = 1, local >>>>> iterations = 1, omega = 1. 
>>>>> linear system matrix = precond matrix: >>>>> Mat Object: (fieldsplit_0_) 1 MPI processes >>>>> type: mpiaij >>>>> rows=480, cols=480 >>>>> total: nonzeros=25200, allocated nonzeros=25200 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> using I-node (on process 0) routines: found 160 >>>>> nodes, limit used is 5 >>>>> Up solver (post-smoother) same as down solver (pre-smoother) >>>>> linear system matrix = precond matrix: >>>>> Mat Object: (fieldsplit_0_) 1 MPI processes >>>>> type: mpiaij >>>>> rows=480, cols=480 >>>>> total: nonzeros=25200, allocated nonzeros=25200 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> using I-node (on process 0) routines: found 160 nodes, >>>>> limit used is 5 >>>>> KSP solver for S = A11 - A10 inv(A00) A01 >>>>> KSP Object: (fieldsplit_1_) 1 MPI processes >>>>> type: preonly >>>>> maximum iterations=10000, initial guess is zero >>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>> left preconditioning >>>>> using NONE norm type for convergence test >>>>> PC Object: (fieldsplit_1_) 1 MPI processes >>>>> type: bjacobi >>>>> number of blocks = 1 >>>>> Local solver is the same for all blocks, as in the >>>>> following KSP and PC objects on rank 0: >>>>> KSP Object: (fieldsplit_1_sub_) 1 MPI processes >>>>> type: preonly >>>>> maximum iterations=10000, initial guess is zero >>>>> tolerances: relative=1e-05, absolute=1e-50, >>>>> divergence=10000. >>>>> left preconditioning >>>>> using NONE norm type for convergence test >>>>> PC Object: (fieldsplit_1_sub_) 1 MPI processes >>>>> type: bjacobi >>>>> number of blocks = 1 >>>>> Local solver is the same for all blocks, as in the >>>>> following KSP and PC objects on rank 0: >>>>> KSP Object: (fieldsplit_1_sub_sub_) >>>>> 1 MPI processes >>>>> type: preonly >>>>> maximum iterations=10000, initial guess is zero >>>>> tolerances: relative=1e-05, absolute=1e-50, >>>>> divergence=10000. >>>>> left preconditioning >>>>> using NONE norm type for convergence test >>>>> PC Object: (fieldsplit_1_sub_sub_) >>>>> 1 MPI processes >>>>> type: ilu >>>>> out-of-place factorization >>>>> 0 levels of fill >>>>> tolerance for zero pivot 2.22045e-14 >>>>> matrix ordering: natural >>>>> factor fill ratio given 1., needed 1. 
>>>>> Factored matrix follows: >>>>> Mat Object: 1 MPI processes >>>>> type: seqaij >>>>> rows=144, cols=144 >>>>> package used to perform factorization: >>>>> petsc >>>>> total: nonzeros=240, allocated >>>>> nonzeros=240 >>>>> not using I-node routines >>>>> linear system matrix = precond matrix: >>>>> Mat Object: 1 MPI processes >>>>> type: seqaij >>>>> rows=144, cols=144 >>>>> total: nonzeros=240, allocated nonzeros=240 >>>>> total number of mallocs used during >>>>> MatSetValues calls=0 >>>>> not using I-node routines >>>>> linear system matrix = precond matrix: >>>>> Mat Object: 1 MPI processes >>>>> type: mpiaij >>>>> rows=144, cols=144 >>>>> total: nonzeros=240, allocated nonzeros=240 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> not using I-node (on process 0) routines >>>>> linear system matrix followed by preconditioner matrix: >>>>> Mat Object: (fieldsplit_1_) 1 MPI processes >>>>> type: schurcomplement >>>>> rows=144, cols=144 >>>>> Schur complement A11 - A10 inv(A00) A01 >>>>> A11 >>>>> Mat Object: (fieldsplit_1_) 1 MPI processes >>>>> type: mpiaij >>>>> rows=144, cols=144 >>>>> total: nonzeros=240, allocated nonzeros=240 >>>>> total number of mallocs used during MatSetValues >>>>> calls=0 >>>>> not using I-node (on process 0) routines >>>>> A10 >>>>> Mat Object: 1 MPI processes >>>>> type: mpiaij >>>>> rows=144, cols=480 >>>>> total: nonzeros=48, allocated nonzeros=48 >>>>> total number of mallocs used during MatSetValues >>>>> calls=0 >>>>> using I-node (on process 0) routines: found 74 >>>>> nodes, limit used is 5 >>>>> KSP of A00 >>>>> KSP Object: (fieldsplit_0_) 1 MPI processes >>>>> type: preonly >>>>> maximum iterations=10000, initial guess is zero >>>>> tolerances: relative=1e-05, absolute=1e-50, >>>>> divergence=10000. >>>>> left preconditioning >>>>> using NONE norm type for convergence test >>>>> PC Object: (fieldsplit_0_) 1 MPI processes >>>>> type: gamg >>>>> type is MULTIPLICATIVE, levels=2 cycles=v >>>>> Cycles per PCApply=1 >>>>> Using externally compute Galerkin coarse grid >>>>> matrices >>>>> GAMG specific options >>>>> Threshold for dropping small values in graph on >>>>> each level = >>>>> Threshold scaling factor for each level not >>>>> specified = 1. >>>>> AGG specific options >>>>> Symmetric graph false >>>>> Number of levels to square graph 1 >>>>> Number smoothing steps 1 >>>>> Complexity: grid = 1.00222 >>>>> Coarse grid solver -- level >>>>> ------------------------------- >>>>> KSP Object: (fieldsplit_0_mg_coarse_) 1 MPI >>>>> processes >>>>> type: preonly >>>>> maximum iterations=10000, initial guess is zero >>>>> tolerances: relative=1e-05, absolute=1e-50, >>>>> divergence=10000. >>>>> left preconditioning >>>>> using NONE norm type for convergence test >>>>> PC Object: (fieldsplit_0_mg_coarse_) 1 MPI processes >>>>> type: bjacobi >>>>> number of blocks = 1 >>>>> Local solver is the same for all blocks, as in >>>>> the following KSP and PC objects on rank 0: >>>>> KSP Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI >>>>> processes >>>>> type: preonly >>>>> maximum iterations=1, initial guess is zero >>>>> tolerances: relative=1e-05, absolute=1e-50, >>>>> divergence=10000. 
>>>>> left preconditioning >>>>> using NONE norm type for convergence test >>>>> PC Object: (fieldsplit_0_mg_coarse_sub_) 1 MPI >>>>> processes >>>>> type: lu >>>>> out-of-place factorization >>>>> tolerance for zero pivot 2.22045e-14 >>>>> using diagonal shift on blocks to prevent >>>>> zero pivot [INBLOCKS] >>>>> matrix ordering: nd >>>>> factor fill ratio given 5., needed 1. >>>>> Factored matrix follows: >>>>> Mat Object: 1 MPI processes >>>>> type: seqaij >>>>> rows=8, cols=8 >>>>> package used to perform factorization: >>>>> petsc >>>>> total: nonzeros=56, allocated >>>>> nonzeros=56 >>>>> using I-node routines: found 3 nodes, >>>>> limit used is 5 >>>>> linear system matrix = precond matrix: >>>>> Mat Object: 1 MPI processes >>>>> type: seqaij >>>>> rows=8, cols=8 >>>>> total: nonzeros=56, allocated nonzeros=56 >>>>> total number of mallocs used during >>>>> MatSetValues calls=0 >>>>> using I-node routines: found 3 nodes, limit >>>>> used is 5 >>>>> linear system matrix = precond matrix: >>>>> Mat Object: 1 MPI processes >>>>> type: mpiaij >>>>> rows=8, cols=8 >>>>> total: nonzeros=56, allocated nonzeros=56 >>>>> total number of mallocs used during >>>>> MatSetValues calls=0 >>>>> using nonscalable MatPtAP() implementation >>>>> using I-node (on process 0) routines: found 3 >>>>> nodes, limit used is 5 >>>>> Down solver (pre-smoother) on level 1 >>>>> ------------------------------- >>>>> KSP Object: (fieldsplit_0_mg_levels_1_) 1 MPI >>>>> processes >>>>> type: chebyshev >>>>> eigenvalue estimates used: min = 0.0998145, >>>>> max = 1.09796 >>>>> eigenvalues estimate via gmres min 0.00156735, >>>>> max 0.998145 >>>>> eigenvalues estimated using gmres with >>>>> translations [0. 0.1; 0. 1.1] >>>>> KSP Object: (fieldsplit_0_mg_levels_1_esteig_) >>>>> 1 MPI processes >>>>> type: gmres >>>>> restart=30, using Classical (unmodified) >>>>> Gram-Schmidt Orthogonalization with no iterative refinement >>>>> happy breakdown tolerance 1e-30 >>>>> maximum iterations=10, initial guess is zero >>>>> tolerances: relative=1e-12, absolute=1e-50, >>>>> divergence=10000. >>>>> left preconditioning >>>>> using PRECONDITIONED norm type for >>>>> convergence test >>>>> estimating eigenvalues using noisy right hand >>>>> side >>>>> maximum iterations=2, nonzero initial guess >>>>> tolerances: relative=1e-05, absolute=1e-50, >>>>> divergence=10000. >>>>> left preconditioning >>>>> using NONE norm type for convergence test >>>>> PC Object: (fieldsplit_0_mg_levels_1_) 1 MPI >>>>> processes >>>>> type: sor >>>>> type = local_symmetric, iterations = 1, local >>>>> iterations = 1, omega = 1. 
>>>>> linear system matrix = precond matrix: >>>>> Mat Object: (fieldsplit_0_) 1 MPI processes >>>>> type: mpiaij >>>>> rows=480, cols=480 >>>>> total: nonzeros=25200, allocated nonzeros=25200 >>>>> total number of mallocs used during >>>>> MatSetValues calls=0 >>>>> using I-node (on process 0) routines: found >>>>> 160 nodes, limit used is 5 >>>>> Up solver (post-smoother) same as down solver >>>>> (pre-smoother) >>>>> linear system matrix = precond matrix: >>>>> Mat Object: (fieldsplit_0_) 1 MPI processes >>>>> type: mpiaij >>>>> rows=480, cols=480 >>>>> total: nonzeros=25200, allocated nonzeros=25200 >>>>> total number of mallocs used during MatSetValues >>>>> calls=0 >>>>> using I-node (on process 0) routines: found 160 >>>>> nodes, limit used is 5 >>>>> A01 >>>>> Mat Object: 1 MPI processes >>>>> type: mpiaij >>>>> rows=480, cols=144 >>>>> total: nonzeros=48, allocated nonzeros=48 >>>>> total number of mallocs used during MatSetValues >>>>> calls=0 >>>>> using I-node (on process 0) routines: found 135 >>>>> nodes, limit used is 5 >>>>> Mat Object: 1 MPI processes >>>>> type: mpiaij >>>>> rows=144, cols=144 >>>>> total: nonzeros=240, allocated nonzeros=240 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> not using I-node (on process 0) routines >>>>> linear system matrix = precond matrix: >>>>> Mat Object: 1 MPI processes >>>>> type: mpiaij >>>>> rows=624, cols=624 >>>>> total: nonzeros=25536, allocated nonzeros=25536 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> using I-node (on process 0) routines: found 336 nodes, limit >>>>> used is 5 >>>>> >>>>> >>>>> Thanks, >>>>> Xiaofeng >>>>> >>>>> >>>>> >>>>> On Jun 17, 2025, at 19:05, Mark Adams wrote: >>>>> >>>>> And don't use -pc_gamg_parallel_coarse_grid_solver >>>>> You can use that in production but for debugging use >>>>> -mg_coarse_pc_type svd >>>>> Also, use -options_left and remove anything that is not used. 
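With -options_left, PetscFinalize() prints a summary of every option in the database that was never queried, so a run that still contained the mistyped -fieldsplit_0_mg_coarse_sub_pc_type_type entry would end with a warning along the lines of "There is one unused database option. It is: Option left: name:-fieldsplit_0_mg_coarse_sub_pc_type_type value: svd", which is the quickest way to spot options PETSc silently ignored.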
>>>>> (I am puzzled, I see -pc_type gamg not -pc_type fieldsplit) >>>>> >>>>> Mark >>>>> >>>>> >>>>> On Mon, Jun 16, 2025 at 6:40?AM Matthew Knepley >>>>> wrote: >>>>> >>>>>> On Sun, Jun 15, 2025 at 9:46?PM hexioafeng >>>>>> wrote: >>>>>> >>>>>>> Hello, >>>>>>> >>>>>>> Here are the options and outputs: >>>>>>> >>>>>>> options: >>>>>>> >>>>>>> -ksp_type cg -pc_type gamg -pc_gamg_parallel_coarse_grid_solver >>>>>>> -pc_fieldsplit_detect_saddle_point -pc_fieldsplit_type schur >>>>>>> -pc_fieldsplit_schur_precondition selfp >>>>>>> -fieldsplit_1_mat_schur_complement_ainv_type lump >>>>>>> -pc_fieldsplit_schur_fact_type full -fieldsplit_0_ksp_type preonly >>>>>>> -fieldsplit_0_pc_type gamg -fieldsplit_0_mg_coarse_pc_type_type svd >>>>>>> -fieldsplit_1_ksp_type preonly -fieldsplit_1_pc_type bjacobi >>>>>>> -fieldsplit_1_sub_pc_type sor -ksp_view -ksp_monitor_true_residual >>>>>>> -ksp_converged_reason -fieldsplit_0_mg_levels_ksp_monitor_true_residual >>>>>>> -fieldsplit_0_mg_levels_ksp_converged_reason >>>>>>> -fieldsplit_1_ksp_monitor_true_residual >>>>>>> -fieldsplit_1_ksp_converged_reason >>>>>>> >>>>>> >>>>>> This option was wrong: >>>>>> >>>>>> -fieldsplit_0_mg_coarse_pc_type_type svd >>>>>> >>>>>> from the output, we can see that it should have been >>>>>> >>>>>> -fieldsplit_0_mg_coarse_sub_pc_type_type svd >>>>>> >>>>>> THanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>>> output: >>>>>>> >>>>>>> 0 KSP unpreconditioned resid norm 2.777777777778e+01 true resid norm >>>>>>> 2.777777777778e+01 ||r(i)||/||b|| 1.000000000000e+00 >>>>>>> Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 >>>>>>> PC failed due to SUBPC_ERROR >>>>>>> KSP Object: 1 MPI processes >>>>>>> type: cg >>>>>>> maximum iterations=200, initial guess is zero >>>>>>> tolerances: relative=1e-06, absolute=1e-12, divergence=1e+30 >>>>>>> left preconditioning >>>>>>> using UNPRECONDITIONED norm type for convergence test >>>>>>> PC Object: 1 MPI processes >>>>>>> type: gamg >>>>>>> type is MULTIPLICATIVE, levels=2 cycles=v >>>>>>> Cycles per PCApply=1 >>>>>>> Using externally compute Galerkin coarse grid matrices >>>>>>> GAMG specific options >>>>>>> Threshold for dropping small values in graph on each level = >>>>>>> Threshold scaling factor for each level not specified = 1. >>>>>>> AGG specific options >>>>>>> Symmetric graph false >>>>>>> Number of levels to square graph 1 >>>>>>> Number smoothing steps 1 >>>>>>> Complexity: grid = 1.00176 >>>>>>> Coarse grid solver -- level ------------------------------- >>>>>>> KSP Object: (mg_coarse_) 1 MPI processes >>>>>>> type: preonly >>>>>>> maximum iterations=10000, initial guess is zero >>>>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>>>> left preconditioning >>>>>>> using NONE norm type for convergence test >>>>>>> PC Object: (mg_coarse_) 1 MPI processes >>>>>>> type: bjacobi >>>>>>> number of blocks = 1 >>>>>>> Local solver is the same for all blocks, as in the >>>>>>> following KSP and PC objects on rank 0: >>>>>>> KSP Object: (mg_coarse_sub_) 1 MPI processes >>>>>>> type: preonly >>>>>>> maximum iterations=1, initial guess is zero >>>>>>> tolerances: relative=1e-05, absolute=1e-50, >>>>>>> divergence=10000. 
>>>>>>> left preconditioning >>>>>>> using NONE norm type for convergence test >>>>>>> PC Object: (mg_coarse_sub_) 1 MPI processes >>>>>>> type: lu >>>>>>> out-of-place factorization >>>>>>> tolerance for zero pivot 2.22045e-14 >>>>>>> using diagonal shift on blocks to prevent zero pivot >>>>>>> [INBLOCKS] >>>>>>> matrix ordering: nd >>>>>>> factor fill ratio given 5., needed 1. >>>>>>> Factored matrix follows: >>>>>>> Mat Object: 1 MPI processes >>>>>>> type: seqaij >>>>>>> rows=7, cols=7 >>>>>>> package used to perform factorization: petsc >>>>>>> total: nonzeros=45, allocated nonzeros=45 >>>>>>> using I-node routines: found 3 nodes, limit used >>>>>>> is 5 >>>>>>> linear system matrix = precond matrix: >>>>>>> Mat Object: 1 MPI processes >>>>>>> type: seqaij >>>>>>> rows=7, cols=7 >>>>>>> total: nonzeros=45, allocated nonzeros=45 >>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>> using I-node routines: found 3 nodes, limit used is 5 >>>>>>> linear system matrix = precond matrix: >>>>>>> Mat Object: 1 MPI processes >>>>>>> type: mpiaij >>>>>>> rows=7, cols=7 >>>>>>> total: nonzeros=45, allocated nonzeros=45 >>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>> using nonscalable MatPtAP() implementation >>>>>>> using I-node (on process 0) routines: found 3 nodes, >>>>>>> limit used is 5 >>>>>>> Down solver (pre-smoother) on level 1 >>>>>>> ------------------------------- >>>>>>> KSP Object: (mg_levels_1_) 1 MPI processes >>>>>>> type: chebyshev >>>>>>> eigenvalue estimates used: min = 0., max = 0. >>>>>>> eigenvalues estimate via gmres min 0., max 0. >>>>>>> eigenvalues estimated using gmres with translations [0. >>>>>>> 0.1; 0. 1.1] >>>>>>> KSP Object: (mg_levels_1_esteig_) 1 MPI processes >>>>>>> type: gmres >>>>>>> restart=30, using Classical (unmodified) Gram-Schmidt >>>>>>> Orthogonalization with no iterative refinement >>>>>>> happy breakdown tolerance 1e-30 >>>>>>> maximum iterations=10, initial guess is zero >>>>>>> tolerances: relative=1e-12, absolute=1e-50, >>>>>>> divergence=10000. >>>>>>> left preconditioning >>>>>>> using PRECONDITIONED norm type for convergence test >>>>>>> PC Object: (mg_levels_1_) 1 MPI processes >>>>>>> type: sor >>>>>>> type = local_symmetric, iterations = 1, local >>>>>>> iterations = 1, omega = 1. >>>>>>> linear system matrix = precond matrix: >>>>>>> Mat Object: 1 MPI processes >>>>>>> type: mpiaij >>>>>>> rows=624, cols=624 >>>>>>> total: nonzeros=25536, allocated nonzeros=25536 >>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>> using I-node (on process 0) routines: found 336 >>>>>>> nodes, limit used is 5 >>>>>>> estimating eigenvalues using noisy right hand side >>>>>>> maximum iterations=2, nonzero initial guess >>>>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>>>> left preconditioning >>>>>>> using NONE norm type for convergence test >>>>>>> PC Object: (mg_levels_1_) 1 MPI processes >>>>>>> type: sor >>>>>>> type = local_symmetric, iterations = 1, local iterations = >>>>>>> 1, omega = 1. 
linear system matrix = precond matrix: >>>>>>> Mat Object: 1 MPI processes >>>>>>> type: mpiaij >>>>>>> rows=624, cols=624 >>>>>>> total: nonzeros=25536, allocated nonzeros=25536 >>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>> using I-node (on process 0) routines: found 336 nodes, >>>>>>> limit used is 5 Up solver (post-smoother) same as down solver >>>>>>> (pre-smoother) >>>>>>> linear system matrix = precond matrix: >>>>>>> Mat Object: 1 MPI processes >>>>>>> type: mpiaij >>>>>>> rows=624, cols=624 >>>>>>> total: nonzeros=25536, allocated nonzeros=25536 >>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>> using I-node (on process 0) routines: found 336 nodes, limit >>>>>>> used is 5 >>>>>>> >>>>>>> >>>>>>> Best regards, >>>>>>> >>>>>>> Xiaofeng >>>>>>> >>>>>>> >>>>>>> On Jun 14, 2025, at 07:28, Barry Smith wrote: >>>>>>> >>>>>>> >>>>>>> Matt, >>>>>>> >>>>>>> Perhaps we should add options -ksp_monitor_debug and >>>>>>> -snes_monitor_debug that turn on all possible monitoring for the (possibly) >>>>>>> nested solvers and all of their converged reasons also? Note this is not >>>>>>> completely trivial because each preconditioner will have to supply its list >>>>>>> based on the current solver options for it. >>>>>>> >>>>>>> Then we won't need to constantly list a big string of problem >>>>>>> specific monitor options to ask the user to use. >>>>>>> >>>>>>> Barry >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> On Jun 13, 2025, at 9:09?AM, Matthew Knepley >>>>>>> wrote: >>>>>>> >>>>>>> On Thu, Jun 12, 2025 at 10:55?PM hexioafeng >>>>>>> wrote: >>>>>>> >>>>>>>> Dear authors, >>>>>>>> >>>>>>>> I tried *-pc_type game -pc_gamg_parallel_coarse_grid_solver* and *-pc_type >>>>>>>> field split -pc_fieldsplit_detect_saddle_point -fieldsplit_0_ksp_type >>>>>>>> pronely -fieldsplit_0_pc_type game -fieldsplit_0_mg_coarse_pc_type sad >>>>>>>> -fieldsplit_1_ksp_type pronely -fieldsplit_1_pc_type Jacobi >>>>>>>> _fieldsplit_1_sub_pc_type for* , both options got the >>>>>>>> KSP_DIVERGE_PC_FAILED error. >>>>>>>> >>>>>>> >>>>>>> With any question about convergence, we need to see the output of >>>>>>> >>>>>>> -ksp_view -ksp_monitor_true_residual -ksp_converged_reason >>>>>>> -fieldsplit_0_mg_levels_ksp_monitor_true_residual >>>>>>> -fieldsplit_0_mg_levels_ksp_converged_reason >>>>>>> -fieldsplit_1_ksp_monitor_true_residual -fieldsplit_1_ksp_converged_reason >>>>>>> >>>>>>> and all the error output. >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Matt >>>>>>> >>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Xiaofeng >>>>>>>> >>>>>>>> >>>>>>>> On Jun 12, 2025, at 20:50, Mark Adams wrote: >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Thu, Jun 12, 2025 at 8:44?AM Matthew Knepley >>>>>>>> wrote: >>>>>>>> >>>>>>>>> On Thu, Jun 12, 2025 at 4:58?AM Mark Adams >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> Adding this to the PETSc mailing list, >>>>>>>>>> >>>>>>>>>> On Thu, Jun 12, 2025 at 3:43?AM hexioafeng < >>>>>>>>>> hexiaofeng at buaa.edu.cn> wrote: >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Dear Professor, >>>>>>>>>>> >>>>>>>>>>> I hope this message finds you well. >>>>>>>>>>> >>>>>>>>>>> I am an employee at a CAE company and a heavy user of the PETSc >>>>>>>>>>> library. I would like to thank you for your contributions to PETSc and >>>>>>>>>>> express my deep appreciation for your work. >>>>>>>>>>> >>>>>>>>>>> Recently, I encountered some difficulties when using PETSc to >>>>>>>>>>> solve structural mechanics problems with Lagrange multiplier constraints. 
>>>>>>>>>>> After searching extensively online and reviewing several papers, I found >>>>>>>>>>> your previous paper titled "*Algebraic multigrid methods for >>>>>>>>>>> constrained linear systems with applications to contact problems in solid >>>>>>>>>>> mechanics*" seems to be the most relevant and helpful. >>>>>>>>>>> >>>>>>>>>>> The stiffness matrix I'm working with, *K*, is a block >>>>>>>>>>> saddle-point matrix of the form (A00 A01; A10 0), where *A00 is >>>>>>>>>>> singular*?just as described in your paper, and different from >>>>>>>>>>> many other articles . I have a few questions regarding your work and would >>>>>>>>>>> greatly appreciate your insights: >>>>>>>>>>> >>>>>>>>>>> 1. Is the *AMG/KKT* method presented in your paper available in >>>>>>>>>>> PETSc? I tried using *CG+GAMG* directly but received a >>>>>>>>>>> *KSP_DIVERGED_PC_FAILED* error. I also attempted to use >>>>>>>>>>> *CG+PCFIELDSPLIT* with the following options: >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> No >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -pc_type fieldsplit -pc_fieldsplit_detect_saddle_point >>>>>>>>>>> -pc_fieldsplit_type schur -pc_fieldsplit_schur_precondition selfp >>>>>>>>>>> -pc_fieldsplit_schur_fact_type full -fieldsplit_0_ksp_type preonly >>>>>>>>>>> -fieldsplit_0_pc_type gamg -fieldsplit_1_ksp_type preonly >>>>>>>>>>> -fieldsplit_1_pc_type bjacobi >>>>>>>>>>> >>>>>>>>>>> Unfortunately, this also resulted in a >>>>>>>>>>> *KSP_DIVERGED_PC_FAILED* error. Do you have any suggestions? >>>>>>>>>>> >>>>>>>>>>> 2. In your paper, you compare the method with *Uzawa*-type >>>>>>>>>>> approaches. To my understanding, Uzawa methods typically require A00 to be >>>>>>>>>>> invertible. How did you handle the singularity of A00 to construct an >>>>>>>>>>> M-matrix that is invertible? >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> You add a regularization term like A01 * A10 (like springs). See >>>>>>>>>> the paper or any reference to augmented lagrange or Uzawa >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> 3. Can i implement the AMG/KKT method in your paper using existing >>>>>>>>>>> *AMG APIs*? Implementing a production-level AMG solver from >>>>>>>>>>> scratch would be quite challenging for me, so I?m hoping to utilize >>>>>>>>>>> existing AMG interfaces within PETSc or other packages. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> You can do Uzawa and make the regularization matrix with >>>>>>>>>> matrix-matrix products. Just use AMG for the A00 block. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> 4. For saddle-point systems where A00 is singular, can you >>>>>>>>>>> recommend any more robust or efficient solutions? Alternatively, are you >>>>>>>>>>> aware of any open-source software packages that can handle such cases >>>>>>>>>>> out-of-the-box? >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> No, and I don't think PETSc can do this out-of-the-box, but >>>>>>>>>> others may be able to give you a better idea of what PETSc can do. >>>>>>>>>> I think PETSc can do Uzawa or other similar algorithms but it >>>>>>>>>> will not do the regularization automatically (it is a bit more complicated >>>>>>>>>> than just A01 * A10) >>>>>>>>>> >>>>>>>>> >>>>>>>>> One other trick you can use is to have >>>>>>>>> >>>>>>>>> -fieldsplit_0_mg_coarse_pc_type svd >>>>>>>>> >>>>>>>>> This will use SVD on the coarse grid of GAMG, which can handle the >>>>>>>>> null space in A00 as long as the prolongation does not put it back in. I >>>>>>>>> have used this for the Laplacian with Neumann conditions and for freely >>>>>>>>> floating elastic problems. 
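Both remedies address the singular A00 block. As a concrete illustration of the regularization route Mark describes above (an Uzawa / augmented-Lagrangian style term built from the constraint blocks with matrix-matrix products), a minimal sketch in PETSc could look like the following; here A00, A01, A10 are assumed to be available as separate Mat objects (for example extracted with MatCreateSubMatrix() from the fieldsplit index sets), gamma is an assumed, problem-dependent scaling, and Areg is just a name used for the sketch; as Mark notes, the algorithm in the paper is more involved than this bare product:

    Mat         R, Areg;
    PetscScalar gamma = 1.0;   /* assumed scaling, to be chosen for the problem */

    /* R = A01 * A10, the "spring" term acting on the constrained dofs */
    PetscCall(MatMatMult(A01, A10, MAT_INITIAL_MATRIX, PETSC_DETERMINE, &R));
    /* Areg = A00 + gamma * R, an invertible (0,0) block to hand to AMG / Uzawa */
    PetscCall(MatDuplicate(A00, MAT_COPY_VALUES, &Areg));
    PetscCall(MatAXPY(Areg, gamma, R, DIFFERENT_NONZERO_PATTERN));

Areg would then be the block on which the AMG preconditioner is set up inside an Uzawa-type iteration.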
>>>>>>>>> >>>>>>>>> >>>>>>>> Good point. >>>>>>>> You can also use -pc_gamg_parallel_coarse_grid_solver to get GAMG >>>>>>>> to use a on level iterative solver for the coarse grid. >>>>>>>> >>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Matt >>>>>>>>> >>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Mark >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Thank you very much for taking the time to read my email. >>>>>>>>>>> Looking forward to hearing from you. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Sincerely, >>>>>>>>>>> >>>>>>>>>>> Xiaofeng He >>>>>>>>>>> ----------------------------------------------------- >>>>>>>>>>> >>>>>>>>>>> Research Engineer >>>>>>>>>>> >>>>>>>>>>> Internet Based Engineering, Beijing, China >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>> experiments lead. >>>>>>>>> -- Norbert Wiener >>>>>>>>> >>>>>>>>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!bau_jDLNP3cONGtpUOuWfnaKXWFEHxy4m79iA3uPSfuIiIK0MstOrmbKllq58EumfsF7a8pP4SX-onxxySnZ$ >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their >>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>> experiments lead. >>>>>>> -- Norbert Wiener >>>>>>> >>>>>>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!bau_jDLNP3cONGtpUOuWfnaKXWFEHxy4m79iA3uPSfuIiIK0MstOrmbKllq58EumfsF7a8pP4SX-onxxySnZ$ >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!bau_jDLNP3cONGtpUOuWfnaKXWFEHxy4m79iA3uPSfuIiIK0MstOrmbKllq58EumfsF7a8pP4SX-onxxySnZ$ >>>>>> >>>>>> >>>>> >>>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!bau_jDLNP3cONGtpUOuWfnaKXWFEHxy4m79iA3uPSfuIiIK0MstOrmbKllq58EumfsF7a8pP4SX-onxxySnZ$ >>>> >>>> >>>> >>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!bau_jDLNP3cONGtpUOuWfnaKXWFEHxy4m79iA3uPSfuIiIK0MstOrmbKllq58EumfsF7a8pP4SX-onxxySnZ$ >>> >>> >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!bau_jDLNP3cONGtpUOuWfnaKXWFEHxy4m79iA3uPSfuIiIK0MstOrmbKllq58EumfsF7a8pP4SX-onxxySnZ$ > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!bau_jDLNP3cONGtpUOuWfnaKXWFEHxy4m79iA3uPSfuIiIK0MstOrmbKllq58EumfsF7a8pP4SX-onxxySnZ$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Fri Jun 20 08:50:17 2025 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 20 Jun 2025 09:50:17 -0400 Subject: [petsc-users] [petsc-maint] norm L2 problemQuestion about changing the norm used in nonlinear solvers (L2 Euclidean vs. L2 Lebesgue) In-Reply-To: <82133477.13270009.1750410624370.JavaMail.zimbra@utt.fr> References: <414475981.6714047.1749631527145.JavaMail.zimbra@utt.fr> <1703896473.7853283.1749734882144.JavaMail.zimbra@utt.fr> <461035026.7868511.1749735776853.JavaMail.zimbra@utt.fr> <323745907.8383516.1749804945465.JavaMail.zimbra@utt.fr> <85CD2CA9-7B77-4288-87BA-9E108D40C7E8@petsc.dev> <82133477.13270009.1750410624370.JavaMail.zimbra@utt.fr> Message-ID: <86F87C6A-DEB3-4125-AF51-2B2E577EBFDD@petsc.dev> > On Jun 20, 2025, at 5:10?AM, Ali ALI AHMAD wrote: > > > * Yes, I am indeed using an inexact Newton method in my code. The descent direction is computed by solving a linear system involving the Jacobian, so the update follows the classical formula "J(un)^{- > 1}d(un)=-F(un)" > I'm also trying to use a line search strategy based on a weighted L2 norm (in the Lebesgue sense), which a priori should lead to better accuracy and faster convergence in anisotropic settings. Ok, could you point to sample code (any language) or written algorithms where a different norm is used in the line search? > > * During the subsequent iterations, I apply the Eisenstat?Walker method to adapt the tolerance, which should also involve modifying the norm used in the algorithm. > > * The current implementation still uses the standard Euclidean L2 norm in PETSc's linear solver and in GMRES. I believe this should ideally be replaced by a weighted L2 norm consistent with the discretization. However, I haven't yet succeeded in modifying the norm used internally by the linear solver in PETSc, so, I'm not yet sure how much impact this change would have on the overall convergence, but I suspect it could improve robustness, especially for highly anisotropic problems. I would greatly appreciate any guidance on how to implement this properly in PETSc. Norms are used in multiple ways in GMRES. 1) defining convergence 2) as part of preconditioning Again can you point to sample code (any language) or written algorithms that describe exactly what you would like to accomplish. Barry > > Do not hesitate to contact me again if anything remains unclear or if you need further information. > > Best regards, > Ali ALI AHMAD > > De: "Barry Smith" > ?: "Ali ALI AHMAD" > Cc: "petsc-users" , "petsc-maint" > Envoy?: Samedi 14 Juin 2025 01:06:52 > Objet: Re: [petsc-maint] norm L2 problemQuestion about changing the norm used in nonlinear solvers (L2 Euclidean vs. L2 Lebesgue) > > I appreciate the clarification. I would call 3) preconditioning. > To increase my understanding, you are already using Newton's method? That is, you compute the Jacobian of the function and use - J^{-1}(u^n) F(u^n) as your update direction? > > When you switch the inner product (or precondition) how will the search direction be different? > > Thanks > > Barry > > The case you need support for is becoming important to PETSc so we need to understand it well and support it well which is why I am asking these (perhaps to you) trivial questions. 
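For the monitoring and convergence-test uses of a weighted norm (items 1 and 2 in Barry's earlier message quoted further below, which points at SNESMonitorSet() with SNESGetSolution()/VecGetArray(), and at SNESSetConvergenceTest()), a rough sketch could look like the following; the vector w holding the Lebesgue/quadrature weights is an assumption of the sketch, not something PETSc provides:

    static PetscErrorCode WeightedL2Monitor(SNES snes, PetscInt it, PetscReal rnorm, void *ctx)
    {
      Vec         u, wu, w = *(Vec *)ctx;   /* w: assumed vector of Lebesgue weights */
      PetscScalar dot;

      PetscFunctionBeginUser;
      PetscCall(SNESGetSolution(snes, &u));
      PetscCall(VecDuplicate(u, &wu));
      PetscCall(VecPointwiseMult(wu, w, u));   /* wu = W u   */
      PetscCall(VecDot(wu, u, &dot));          /* u^T W u    */
      PetscCall(PetscPrintf(PetscObjectComm((PetscObject)snes), "%3" PetscInt_FMT " SNES weighted norm %g (Euclidean %g)\n", it, (double)PetscSqrtReal(PetscAbsScalar(dot)), (double)rnorm));
      PetscCall(VecDestroy(&wu));
      PetscFunctionReturn(PETSC_SUCCESS);
    }

registered with SNESMonitorSet(snes, WeightedL2Monitor, &w, NULL). Evaluating the same weighted norm inside a SNESSetConvergenceTest() callback covers the stopping criterion, while changing the norm the linear solver itself uses (case 3) amounts to preconditioning, as Barry says.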
> > > > On Jun 13, 2025, at 4:55?AM, Ali ALI AHMAD wrote: > > Thank you for your message. > > To answer your question: I would like to use the L2 norm in the sense of Lebesgue for all three purposes, especially the third one. > > 1- For displaying residuals during the nonlinear iterations, I would like to observe the convergence behavior using a norm that better reflects the physical properties of the problem. > > 2- For convergence testing, I would like the stopping criterion to be based on a weighted L2 norm that accounts for the geometry of the mesh (since I am working with unstructured, anisotropic triangular meshes). > > 3 - Most importantly, I would like to modify the inner product used in the algorithm so that it aligns with the weighted L2 norm (since I am working with unstructured, anisotropic triangular meshes). > > Best regards, > Ali ALI AHMAD > De: "Barry Smith" > ?: "Ali ALI AHMAD" > Cc: "petsc-users" , "petsc-maint" > Envoy?: Vendredi 13 Juin 2025 03:14:06 > Objet: Re: [petsc-maint] norm L2 problemQuestion about changing the norm used in nonlinear solvers (L2 Euclidean vs. L2 Lebesgue) > > You haven't answered my question. Where (conceptually) and for what purpose do you want to use the L2 norm. > 1) displaying norms to observe the convergence behavior > > 2) in the convergence testing to determine when to stop > > 3) changing the "inner product" in the algorithm which amounts to preconditioning. > > Barry > > > On Jun 12, 2025, at 9:42?AM, Ali ALI AHMAD wrote: > > Thank you for your answer. > > I am currently working with the nonlinear solvers newtonls (with bt, l2, etc.) and newtontr (using newton, cauchy, and dogleg strategies) combined with the linear solver gmres and the ILU preconditioner, since my Jacobian matrix is nonsymmetric. > > I also use the Eisenstat-Walker method for newtonls, as my initial guess is often very far from the exact solution. > > What I would like to do now is to replace the standard Euclidean L2 norm with the L2 norm in the Lebesgue sense in the above numerical algorithm, because my problem is defined on an unstructured, anisotropic triangular mesh where a weighted norm would be more physically appropriate. > > Would you be able to advise me on how to implement this change properly? > > I would deeply appreciate any guidance or suggestions you could provide. > > Thank you in advance for your help. > > Best regards, > Ali ALI AHMAD > > De: "Ali ALI AHMAD" > ?: "Barry Smith" > Cc: "petsc-users" , "petsc-maint" > Envoy?: Jeudi 12 Juin 2025 15:28:02 > Objet: Re: [petsc-maint] norm L2 problemQuestion about changing the norm used in nonlinear solvers (L2 Euclidean vs. L2 Lebesgue) > > Thank you for your answer. > > I am currently working with the nonlinear solvers newtonls (with bt, l2, etc.) and newtontr (using newton, cauchy, and dogleg strategies) combined with the linear solver gmres and the ILU preconditioner, since my Jacobian matrix is nonsymmetric. > > I also use the Eisenstat-Walker method for newtonls, as my initial guess is often very far from the exact solution. > > What I would like to do now is to replace the standard Euclidean L2 norm with the L2 norm in the Lebesgue sense, because my problem is defined on an unstructured, anisotropic triangular mesh where a weighted norm would be more physically appropriate. > > Would you be able to advise me on how to implement this change properly? > > I would deeply appreciate any guidance or suggestions you could provide. > > Thank you in advance for your help. 
> > Best regards, > Ali ALI AHMAD > De: "Barry Smith" > ?: "Ali ALI AHMAD" > Cc: "petsc-users" , "petsc-maint" > Envoy?: Jeudi 12 Juin 2025 14:57:40 > Objet: Re: [petsc-maint] norm L2 problemQuestion about changing the norm used in nonlinear solvers (L2 Euclidean vs. L2 Lebesgue) > > Do you wish to use a different norm > > 1) ONLY for displaying (printing out) the residual norms to track progress > > 2) in the convergence testing > > 3) to change the numerical algorithm (for example using the L2 inner product instead of the usual linear algebra R^N l2 inner product). > > For 1) use SNESMonitorSet() and in your monitor function use SNESGetSolution() to grab the solution and then VecGetArray(). Now you can compute any weighted norm you want on the solution. > > For 2) similar but you need to use SNESSetConvergenceTest > > For 3) yes, but you need to ask us specifically. > > Barry > > > On Jun 11, 2025, at 4:45?AM, Ali ALI AHMAD wrote: > > Dear PETSc team, > > I hope this message finds you well. > > I am currently using PETSc in a C++, where I rely on the nonlinear solvers `SNES` with either `newtonls` or `newtontr` methods. I would like to ask if it is possible to change the default norm used (typically the L2 Euclidean norm) to a custom norm, specifically the L2 norm in the sense of Lebesgue (e.g., involving cell-wise weighted integrals over the domain). > > My main goal is to define a custom residual norm that better reflects the physical quantities of interest in my simulation. > > Would this be feasible within the PETSc framework? If so, could you point me to the recommended approach (e.g., redefining the norm manually, using specific PETSc hooks or options)? > > Thank you very much in advance for your help and for the great work on PETSc! > > Best regards, > > Ali ALI AHMAD > PhD Student > University of Technology of Troyes - UTT - France > GAMMA3 Project - Office H008 - Phone No: +33 7 67 44 68 18 > 12 rue Marie Curie - CS 42060 10004 TROYES Cedex > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rlmackie862 at gmail.com Fri Jun 20 09:57:41 2025 From: rlmackie862 at gmail.com (Randall Mackie) Date: Fri, 20 Jun 2025 07:57:41 -0700 Subject: [petsc-users] Problem with composite DM index sets In-Reply-To: References: <45D7BA59-F137-4033-88C1-F1707FB4CA95@gmail.com> Message-ID: Dear PETSc, On Tuesday I sent a small reproducer program that shows a problem with the Fortran version of DMCompositeGetGlobalISs, with the valgrind stack trace below. Not having yet received a reply, I?m just wondering if perhaps my email slipped through the cracks. Would appreciate some guidance here so we can continue our code upgrade from petsc 3.19 to 3.23. Thanks, Randy > On Jun 18, 2025, at 2:44?PM, Randall Mackie wrote: > > Follow up: running with valgrind shows the following issues?.is this a bug in PETSc? 
> > ==5216== Use of uninitialised value of size 8 > ==5216== at 0x49ED69C: f90array1dcreatefortranaddr_ (f90_fwrap.F90:52) > ==5216== by 0x4D7EA94: F90Array1dCreate (f90_cwrap.c:140) > ==5216== by 0x6B7295A: dmcompositegetglobaliss_ (zfddaf.c:70) > ==5216== by 0x1095C8: MAIN__ (test.F90:35) > ==5216== by 0x1096AC: main (test.F90:4) > ==5216== Uninitialised value was created by a stack allocation > ==5216== at 0x109219: MAIN__ (test.F90:1) > ==5216== > ==5216== Invalid write of size 8 > ==5216== at 0x49ED69C: f90array1dcreatefortranaddr_ (f90_fwrap.F90:52) > ==5216== by 0x4D7EA94: F90Array1dCreate (f90_cwrap.c:140) > ==5216== by 0x6B7295A: dmcompositegetglobaliss_ (zfddaf.c:70) > ==5216== by 0x1095C8: MAIN__ (test.F90:35) > ==5216== by 0x1096AC: main (test.F90:4) > ==5216== Address 0x20 is not stack'd, malloc'd or (recently) free'd > ==5216== > > > Thanks, Randyt > > > >> On Jun 17, 2025, at 3:39?PM, Randall Mackie wrote: >> >> Dear Petsc users - >> >> I am trying to upgrade my code to petsc-3.23 (from 3.19), and I seem to have run into a problem with DMCompositeGetGlobalISs. >> >> The example program listed on the man page for DMCompositeGetGlobalISs, https://urldefense.us/v3/__https://petsc.org/release/src/snes/tutorials/ex73f90t.F90.html__;!!G_uCfscf7eWS!Z2gkoRqGs_off3mdHshweKBv_JQzRrYyYBMN7hIRUHEZIypl2PnZFuidgVHE4iuK9FNlN1y4PtlNyahPNhcFiO1S2w$ , seems to indicate that a call to DMCompositeGetGlobalISs does not need to allocate the IS pointer and you just pass it directly to DMCompositeGetGlobalISs. >> >> If I compile and run the simple attached test program (say on 2 processes), I get the following error: >> >> [0]PETSC ERROR: ------------------------------------------------------------------------ >> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range >> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [0]PETSC ERROR: or see https://urldefense.us/v3/__https://petsc.org/release/faq/*valgrind__;Iw!!G_uCfscf7eWS!Z2gkoRqGs_off3mdHshweKBv_JQzRrYyYBMN7hIRUHEZIypl2PnZFuidgVHE4iuK9FNlN1y4PtlNyahPNhdiZKy11Q$ and https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!Z2gkoRqGs_off3mdHshweKBv_JQzRrYyYBMN7hIRUHEZIypl2PnZFuidgVHE4iuK9FNlN1y4PtlNyahPNhdzXsvxLQ$ >> [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >> [0]PETSC ERROR: The line numbers in the error traceback may not be exact. >> [0]PETSC ERROR: #1 F90Array1dCreate() at /home/rmackie/PETSc/petsc-3.23.3/src/sys/ftn-custom/f90_cwrap.c:123 >> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 >> >> >> If I uncomment the line to allocate the pointer, I get a very long traceback with lots of error messages. >> >> What is the correct way to use DMCompositeGetGlobalISs in Fortran? With or without the pointer allocation, and what is the right way to do this without the errors it seems to generate? >> >> Thanks, >> >> Randy Mackie >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Fri Jun 20 12:21:35 2025 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 20 Jun 2025 13:21:35 -0400 Subject: [petsc-users] Workshop at ICERM Sep 13-14 2025 in honor of Bill Gropp's 70th birthday Message-ID: Bill Gropp was extremely influential in the early days of PETSc. I learned numerical analysis from him as an undergraduate and was later his post-doc. 
ICERM is hosting a meeting, From Modeling to Learning with HPC, Sep 13-14 2025 in honor of Bill Gropp's 70th birthday. You can find more information at https://urldefense.us/v3/__https://app.icerm.brown.edu/Cube/apply__;!!G_uCfscf7eWS!fumvXSupRQHE-gnpd_eqLI601O9rUP73OmUFZwrAcwjAnWrKE4Uyjf8WJtxTBK3uqboZQKSHaBtg3gZSEcOyy-o$ Barry -------------- next part -------------- An HTML attachment was scrubbed... URL: From dontbugthedevs at proton.me Fri Jun 20 16:13:57 2025 From: dontbugthedevs at proton.me (Noam T.) Date: Fri, 20 Jun 2025 21:13:57 +0000 Subject: [petsc-users] Element connectivity of a DMPlex In-Reply-To: References: Message-ID: Thank you once again, the code provides exactly what needed. An alternative for the VTK use was subdividing cells using corner nodes and integration points, such that all cells were first order. Any "better" alternative format/visualization software for this purpose? Possibly the last question in relation to this matter. We use first order meshes only, as you suggest, and let PETSc handle everything high-order through the approximation space. Hence, when retrieving node coordinates with DMGetCoordinates(Local) or DMPlexGetCellCoordinates, one gets the corner nodes only. Is the list of additional, high-order nodes coordinates readily available (stored) somewhere to be retrieved? They can be computed (e.g. using DMPlexReferenceToCoordinates knowing their position in the reference cell; or using corner nodes coordinates), but this will result in shared nodes being computed possibly several times; the large the mesh, the worse. E.g. in the example of the image attached before, DMPlexGetClosureIndices returns [4, 5, 6, 0, 1, 3] for the first cell [7, 8, 5, 1, 2, 3] for the second cell so that 3 nodes (5, 1, 3) will be computed twice if done naively, cell by cell. Thank you, Noam On Thursday, June 19th, 2025 at 12:43 AM, Matthew Knepley wrote: > On Wed, Jun 18, 2025 at 6:49?PM Noam T. wrote: > >> See image attached. >> Connectivity of the top mesh (first order triangle), can be obtained with the code shared before. >> Connectivity of the bottom mesh (second order triangle) is what I would be interested in obtaining. >> >> However, given your clarification on what the Plex and the PetscSection handle, it might not work; I am trying to get form the Plex what's only available from the PetscSection. >> >> The purpose of this extended connectivity is plotting; in particular, using VTU files, where the "connectivity" of cells is required, and the extra nodes would be needed when using higher-order elements (e.g. VTK_QUADRATIC_TRIANGLE, VTK_QUADRATIC_QUAD, etc). > > Oh yes. VTK does this in a particularly ugly and backward way. Sigh. There is nothing we can do about this now, but someone should replace VTK with a proper interface at some point. > > So I understand why you want it and it is a defensible case, so here is how you get that (with some explanation). Those locations, I think, should not be understood as topological things, but rather as the locations of point evaluation functionals constituting a basis for the dual space (to your approximation space). I would call DMPlexGetClosureIndices() (https://urldefense.us/v3/__https://petsc.org/main/manualpages/DMPlex/DMPlexGetClosureIndices/__;!!G_uCfscf7eWS!YswUx8vrAy00gLbhVHtR6nHAcUoAPFp5cyc5dEZ_bGmq7P3nb2BmpMlM28YywJBsvA4etxZNR_ARL3vqfTK42Cu61h7-Anfk$ ) with a Section having the layout of P2 or Q2. 
This is the easy way to make that > > PetscSection gs; > PetscFE fe; > DMPolytopeType ct; > PetscInt dim, cStart; > > PetscCall(DMGetDimension(dm, &dim)); > PetscCall(DMPlexGetHeightStratum(dm, 0, &cStart, NULL)); > PetscCall(DMPlexGetCellType(dm, cStart, &ct)); > PetscCall(PetscFECreateLagrangeByCell(PETSC_COMM_SELF, dim, 1, ct, 2, PETSC_DETERMINE, &fe)); > PetscCall(DMSetField(dm, 0, NULL, (PetscObject)fe)); > PetscCall(PetscFEDestroy(&fe)); > PetscCall(DMCreateDS(dm)); > PetscCall(DMGetGlobalSection(dm, &gs)); > > PetscInt *indices = NULL; > PetscInt Nidx; > > PetscCall(DMPlexGetClosureIndices(dm, gs, gs, cell, PETSC_TRUE, &Nidx, &indices, NULL, NULL)); > > Thanks, > > MAtt > >> Perhaps I am over complicating things, and all this information can be obtained in a different, simpler way. >> >> Thanks. >> Noam >> On Tuesday, June 17th, 2025 at 5:42 PM, Matthew Knepley wrote: >> >>> On Tue, Jun 17, 2025 at 12:43?PM Noam T. wrote: >>> >>>> Thank you. For now, I am dealing with vertices only. >>>> >>>> Perhaps I did not explain myself properly, or I misunderstood your response. >>>> What I meant to say is, given an element of order higher than one, the connectivity matrix I obtain this way only contains as many entries as the first order element: 3 for a triangle, 4 for a tetrahedron, etc. >>>> >>>> Looking at the closure of any cell in the mesh, this is also the case.However, the nodes are definitely present; e.g. from >>>> >>>> DMPlexGetCellCoordinates(dm, cell, NULL, nc, NULL, NULL) >>>> >>>> nc returns the expected value (12 for a 2nd order 6-node planar triangle, 30 for a 2nd order 10-node tetrahedron, etc). >>>> >>>> The question is, are the indices of these extra nodes obtainable in a similar way as with the code shared before? So that one can have e.g. [0, 1, 2, 3, 4, 5] for a second order triangle, not just [0, 1, 2]. >>> >>> I am having a hard time understanding what you are after. I think this is because many FEM approaches confuse topology with analysis. >>> >>> The Plex stores topology, and you can retrieve adjacencies between any two mesh points. >>> >>> The PetscSection maps mesh points (cells, faces, edges , vertices) to sets of dofs. This is how higher order elements are implemented. Thus, we do not have to change topology to get different function spaces. >>> >>> The intended interface is for you to call DMPlexVecGetClosure() to get the closure of a cell (or face, or edge). You can also call DMPlexGetClosureIndices(), but index wrangling is what I intended to eliminate. >>> >>> What exactly are you looking for here? >>> >>> Thanks, >>> >>> Matt >>> >>>> Thank you. >>>> Noam >>>> On Friday, June 13th, 2025 at 3:05 PM, Matthew Knepley wrote: >>>> >>>>> On Thu, Jun 12, 2025 at 4:26?PM Noam T. wrote: >>>>> >>>>>> Thank you for the code; it provides exactly what I was looking for. >>>>>> >>>>>> Following up on this matter, does this method not work for higher order elements? For example, using an 8-node quadrilateral, exporting to a PETSC_VIEWER_HDF5_VIZ viewer provides the correct matrix of node coordinates in geometry/vertices >>>>> >>>>> If you wanted to include edges/faces, you could do it. First, you would need to decide how you would number things For example, would you number all points contiguously, or separately number cells, vertices, faces and edges. Second, you would check for faces/edges in the closure loop. Right now, we only check for vertices. >>>>> >>>>> I would say that this is what convinced me not to do FEM this way. 
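Concretely, the "check for faces/edges in the closure loop" step could look like the sketch below, which extends the vertex-only loop quoted further down: vertices are numbered first and edge points after them (an assumed convention), with plain local indices to keep the sketch short; the resulting per-cell list still has to be permuted into the VTK_QUADRATIC_* node ordering, and in parallel the local indices would be replaced by the global numbering from DMPlexCreatePointNumbering() mentioned later in the thread.

    PetscInt cStart, cEnd, vStart, vEnd, eStart, eEnd, nV;

    PetscCall(DMPlexGetHeightStratum(dm, 0, &cStart, &cEnd));   /* cells    */
    PetscCall(DMPlexGetDepthStratum(dm, 0, &vStart, &vEnd));    /* vertices */
    PetscCall(DMPlexGetDepthStratum(dm, 1, &eStart, &eEnd));    /* edges    */
    nV = vEnd - vStart;
    for (PetscInt c = cStart; c < cEnd; ++c) {
      PetscInt *closure = NULL, Ncl;

      PetscCall(DMPlexGetTransitiveClosure(dm, c, PETSC_TRUE, &Ncl, &closure));
      for (PetscInt cl = 0; cl < Ncl * 2; cl += 2) {            /* note: the bound uses cl */
        const PetscInt p = closure[cl];

        if (p >= vStart && p < vEnd) {
          /* corner node of the cell: local index p - vStart */
        } else if (p >= eStart && p < eEnd) {
          /* mid-edge node (the extra P2 point): local index nV + (p - eStart) */
        }
      }
      PetscCall(DMPlexRestoreTransitiveClosure(dm, c, PETSC_TRUE, &Ncl, &closure));
    }

The same loop bound (cl, not c) applies to the vertex-only version quoted below.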
>>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>>> (here a quadrilateral in [0, 10]) >>>>>> 5.0, 5.0 >>>>>> 0.0, 0.0 >>>>>> 10.0, 0.0 >>>>>> 10.0, 10.0 >>>>>> 0.0, 10.0 >>>>>> 5.0, 0.0 >>>>>> 10.0, 5.0 >>>>>> 5.0, 10.0 >>>>>> 0.0, 5.0 >>>>>> >>>>>> but the connectivity in viz/topology is >>>>>> >>>>>> 0 1 2 3 >>>>>> >>>>>> which are likely the corner nodes of the initial, first-order element, before adding extra nodes for the higher degree element. >>>>>> >>>>>> This connectivity values [0, 1, 2, 3, ...] are always the same, including for other elements, whereas the coordinates are correct >>>>>> >>>>>> E.g. for 3rd order triangle in [0, 1], coordinates are given left to right, bottom to top >>>>>> 0, 0 >>>>>> 1/3, 0, >>>>>> 2/3, 0, >>>>>> 1, 0 >>>>>> 0, 1/3 >>>>>> 1/3, 1/3 >>>>>> 2/3, 1/3 >>>>>> 0, 2/3, >>>>>> 1/3, 2/3 >>>>>> 0, 1 >>>>>> >>>>>> but the connectivity (viz/topology/cells) is [0, 1, 2]. >>>>>> >>>>>> Test meshes were created with gmsh from the python API, using >>>>>> gmsh.option.setNumber("Mesh.ElementOrder", n), for n = 1, 2, 3, ... >>>>>> >>>>>> Thank you. >>>>>> Noam >>>>>> On Friday, May 23rd, 2025 at 12:56 AM, Matthew Knepley wrote: >>>>>> >>>>>>> On Thu, May 22, 2025 at 12:25?PM Noam T. wrote: >>>>>>> >>>>>>>> Hello, >>>>>>>> >>>>>>>> Thank you the various options. >>>>>>>> >>>>>>>> Use case here would be obtaining the exact output generated by option 1), DMView() with PETSC_VIEWER_HDF5_VIZ; in particular, the matrix generated under /viz/topology/cells. >>>>>>>> >>>>>>>>> There are several ways you might do this. It helps to know what you are aiming for. >>>>>>>>> >>>>>>>>> 1) If you just want this output, it might be easier to just DMView() with the PETSC_VIEWER_HDF5_VIZ format, since that just outputs the cell-vertex topology and coordinates >>>>>>>> >>>>>>>> Is it possible to get this information in memory, onto a Mat, Vec or some other Int array object directly? it would be handy to have it in order to manipulate it and/or save it to a different format/file. Saving to an HDF5 and loading it again seems redundant. >>>>>>>> >>>>>>>>> 2) You can call DMPlexUninterpolate() to produce a mesh with just cells and vertices, and output it in any format. >>>>>>>>> >>>>>>>>> 3) If you want it in memory, but still with global indices (I don't understand this use case), then you can use DMPlexCreatePointNumbering() for an overall global numbering, or DMPlexCreateCellNumbering() and DMPlexCreateVertexNumbering() for separate global numberings. >>>>>>>> >>>>>>>> Perhaps I missed it, but getting the connectivity matrix in /viz/topology/cells/ did not seem directly trivial to me from the list of global indices returned by DMPlexGetCell/Point/VertexNumbering() (i.e. I assume all the operations done when calling DMView()). >>>>>>> >>>>>>> Something like >>>>>>> >>>>>>> DMPlexGetHeightStratum(dm, 0, &cStart, &cEnd); >>>>>>> DMPlexGetDepthStratum(dm, 0, &vStart, &vEnd); >>>>>>> DMPlexGetVertexNumbering(dm, &globalVertexNumbers); >>>>>>> ISGetIndices(globalVertexNumbers, &gv); >>>>>>> for (PetscInt c = cStart; c < cEnd; ++c) { >>>>>>> PetscInt *closure = NULL; >>>>>>> >>>>>>> DMPlexGetTransitiveClosure(dm, c, PETSC_TRUE, &Ncl, &closure); >>>>>>> for (PetscInt cl = 0; c < Ncl * 2; cl += 2) { >>>>>>> if (closure[cl] < vStart || closure[cl] >= vEnd) continue; >>>>>>> const PetscInt v = gv[closure[cl]] < 0 ? 
-(gv[closure[cl]] + 1) : gv[closure[cl]]; >>>>>>> >>>>>>> // Do something with v >>>>>>> } >>>>>>> DMPlexRestoreTransitiveClosure(dm, c, PETSC_TRUE, &Ncl, &closure); >>>>>>> } >>>>>>> ISRestoreIndices(globalVertexNumbers, &gv); >>>>>>> ISDestroy(&globalVertexNumbers); >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Matt >>>>>>> >>>>>>>> Thanks, >>>>>>>> Noam. >>>>>>> >>>>>>> -- >>>>>>> >>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>>> -- Norbert Wiener >>>>>>> >>>>>>> [https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/*(http:/*www.cse.buffalo.edu/*knepley/)__;fl0vfg!!G_uCfscf7eWS!YswUx8vrAy00gLbhVHtR6nHAcUoAPFp5cyc5dEZ_bGmq7P3nb2BmpMlM28YywJBsvA4etxZNR_ARL3vqfTK42Cu61uefq04u$ >>>>> >>>>> -- >>>>> >>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> [https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/*(http:/*www.cse.buffalo.edu/*knepley/)__;fl0vfg!!G_uCfscf7eWS!YswUx8vrAy00gLbhVHtR6nHAcUoAPFp5cyc5dEZ_bGmq7P3nb2BmpMlM28YywJBsvA4etxZNR_ARL3vqfTK42Cu61uefq04u$ >>> >>> -- >>> >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >>> >>> [https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/*(http:/*www.cse.buffalo.edu/*knepley/)__;fl0vfg!!G_uCfscf7eWS!YswUx8vrAy00gLbhVHtR6nHAcUoAPFp5cyc5dEZ_bGmq7P3nb2BmpMlM28YywJBsvA4etxZNR_ARL3vqfTK42Cu61uefq04u$ > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > [https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/*(http:/*www.cse.buffalo.edu/*knepley/)__;fl0vfg!!G_uCfscf7eWS!YswUx8vrAy00gLbhVHtR6nHAcUoAPFp5cyc5dEZ_bGmq7P3nb2BmpMlM28YywJBsvA4etxZNR_ARL3vqfTK42Cu61uefq04u$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Jun 20 16:37:24 2025 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 20 Jun 2025 17:37:24 -0400 Subject: [petsc-users] Element connectivity of a DMPlex In-Reply-To: References: Message-ID: On Fri, Jun 20, 2025 at 5:14?PM Noam T. wrote: > Thank you once again, the code provides exactly what needed. > An alternative for the VTK use was subdividing cells using corner nodes > and integration points, such that all cells were first order. Any "better" > alternative format/visualization software for this purpose? > I do have this implemented. If you give -dm_plex_high_order_view, it will refine the grid and project into the linear space. You can see that the code is pretty simple: https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/blob/main/src/dm/impls/plex/plex.c?ref_type=heads*L2021__;Iw!!G_uCfscf7eWS!Yrsqxk3Y_G1ua_QR9o4DIqJAMyqlrZjtniZoHJqKFNV8FYYjw0XmvDvvukBBCZwlSNnbLDWBwQlY-wmRROXs$ It uses DMPlexInterpolate() to connect the spaces. > Possibly the last question in relation to this matter. We use first order > meshes only, as you suggest, and let PETSc handle everything high-order > through the approximation space. Hence, when retrieving node coordinates > with DMGetCoordinates(Local) or DMPlexGetCellCoordinates, one gets the > corner nodes only. 
> > Is the list of additional, high-order nodes coordinates readily available > (stored) somewhere to be retrieved? > Yes. The idea is to think of coordinates as a discretized field on the mesh, exactly as the solution field. Thus if you want higher order coordinates, you choose a higher order coordinate space. I give the coordinate space prefix cdm_, so you could say -cdm_petscspace_degree 2 to get quadratic coordinates. There are some tests in Plex tests ex33.c Thanks, Matt > They can be computed (e.g. using DMPlexReferenceToCoordinates knowing > their position in the reference cell; or using corner nodes coordinates), > but this will result in shared nodes being computed possibly several times; > the large the mesh, the worse. > > E.g. in the example of the image attached before, DMPlexGetClosureIndices > returns > [4, 5, 6, 0, 1, 3] for the first cell > [7, 8, 5, 1, 2, 3] for the second cell > so that 3 nodes (5, 1, 3) will be computed twice if done naively, cell by > cell. > > Thank you, > Noam > On Thursday, June 19th, 2025 at 12:43 AM, Matthew Knepley < > knepley at gmail.com> wrote: > > On Wed, Jun 18, 2025 at 6:49?PM Noam T. wrote: > >> See image attached. >> Connectivity of the top mesh (first order triangle), can be obtained with >> the code shared before. >> Connectivity of the bottom mesh (second order triangle) is what I would >> be interested in obtaining. >> >> However, given your clarification on what the Plex and the PetscSection >> handle, it might not work; I am trying to get form the Plex what's only >> available from the PetscSection. >> >> The purpose of this extended connectivity is plotting; in particular, >> using VTU files, where the "connectivity" of cells is required, and the >> extra nodes would be needed when using higher-order elements (e.g. >> VTK_QUADRATIC_TRIANGLE, VTK_QUADRATIC_QUAD, etc). >> > > Oh yes. VTK does this in a particularly ugly and backward way. Sigh. There > is nothing we can do about this now, but someone should replace VTK with a > proper interface at some point. > > So I understand why you want it and it is a defensible case, so here is > how you get that (with some explanation). Those locations, I think, should > not be understood as topological things, but rather as the locations of > point evaluation functionals constituting a basis for the dual space (to > your approximation space). I would call DMPlexGetClosureIndices() ( > https://urldefense.us/v3/__https://petsc.org/main/manualpages/DMPlex/DMPlexGetClosureIndices/__;!!G_uCfscf7eWS!Yrsqxk3Y_G1ua_QR9o4DIqJAMyqlrZjtniZoHJqKFNV8FYYjw0XmvDvvukBBCZwlSNnbLDWBwQlY-0p5NBkz$ ) with > a Section having the layout of P2 or Q2. This is the easy way to make that > > PetscSection gs; > PetscFE fe; > DMPolytopeType ct; > PetscInt dim, cStart; > > PetscCall(DMGetDimension(dm, &dim)); > PetscCall(DMPlexGetHeightStratum(dm, 0, &cStart, NULL)); > PetscCall(DMPlexGetCellType(dm, cStart, &ct)); > PetscCall(PetscFECreateLagrangeByCell(PETSC_COMM_SELF, dim, 1, ct, 2, > PETSC_DETERMINE, &fe)); > PetscCall(DMSetField(dm, 0, NULL, (PetscObject)fe)); > PetscCall(PetscFEDestroy(&fe)); > PetscCall(DMCreateDS(dm)); > PetscCall(DMGetGlobalSection(dm, &gs)); > > PetscInt *indices = NULL; > PetscInt Nidx; > > PetscCall(DMPlexGetClosureIndices(dm, gs, gs, cell, PETSC_TRUE, &Nidx, > &indices, NULL, NULL)); > > Thanks, > > MAtt > >> Perhaps I am over complicating things, and all this information can be >> obtained in a different, simpler way. >> >> Thanks. 
>> Noam >> On Tuesday, June 17th, 2025 at 5:42 PM, Matthew Knepley < >> knepley at gmail.com> wrote: >> >> On Tue, Jun 17, 2025 at 12:43?PM Noam T. >> wrote: >> >>> Thank you. For now, I am dealing with vertices only. >>> >>> Perhaps I did not explain myself properly, or I misunderstood your >>> response. >>> What I meant to say is, given an element of order higher than one, the >>> connectivity matrix I obtain this way only contains as many entries as the >>> first order element: 3 for a triangle, 4 for a tetrahedron, etc. >>> >>> Looking at the closure of any cell in the mesh, this is also the >>> case.However, the nodes are definitely present; e.g. from >>> >>> DMPlexGetCellCoordinates(dm, cell, NULL, nc, NULL, NULL) >>> >>> nc returns the expected value (12 for a 2nd order 6-node planar >>> triangle, 30 for a 2nd order 10-node tetrahedron, etc). >>> >>> The question is, are the indices of these extra nodes obtainable in a >>> similar way as with the code shared before? So that one can have e.g. [0, >>> 1, 2, 3, 4, 5] for a second order triangle, not just [0, 1, 2]. >>> >> >> I am having a hard time understanding what you are after. I think this is >> because many FEM approaches confuse topology with analysis. >> >> The Plex stores topology, and you can retrieve adjacencies between any >> two mesh points. >> >> The PetscSection maps mesh points (cells, faces, edges , vertices) to >> sets of dofs. This is how higher order elements are implemented. Thus, we >> do not have to change topology to get different function spaces. >> >> The intended interface is for you to call DMPlexVecGetClosure() to get >> the closure of a cell (or face, or edge). You can also call >> DMPlexGetClosureIndices(), but index wrangling is what I intended to >> eliminate. >> >> What exactly are you looking for here? >> >> Thanks, >> >> Matt >> >>> Thank you. >>> Noam >>> On Friday, June 13th, 2025 at 3:05 PM, Matthew Knepley < >>> knepley at gmail.com> wrote: >>> >>> On Thu, Jun 12, 2025 at 4:26?PM Noam T. >>> wrote: >>> >>>> >>>> Thank you for the code; it provides exactly what I was looking for. >>>> >>>> Following up on this matter, does this method not work for higher order >>>> elements? For example, using an 8-node quadrilateral, exporting to a >>>> PETSC_VIEWER_HDF5_VIZ viewer provides the correct matrix of node >>>> coordinates in geometry/vertices >>>> >>> >>> If you wanted to include edges/faces, you could do it. First, you would >>> need to decide how you would number things For example, would you number >>> all points contiguously, or separately number cells, vertices, faces and >>> edges. Second, you would check for faces/edges in the closure loop. Right >>> now, we only check for vertices. >>> >>> I would say that this is what convinced me not to do FEM this way. >>> >>> Thanks, >>> >>> Matt >>> >>>> (here a quadrilateral in [0, 10]) >>>> 5.0, 5.0 >>>> 0.0, 0.0 >>>> 10.0, 0.0 >>>> 10.0, 10.0 >>>> 0.0, 10.0 >>>> 5.0, 0.0 >>>> 10.0, 5.0 >>>> 5.0, 10.0 >>>> 0.0, 5.0 >>>> >>>> but the connectivity in viz/topology is >>>> >>>> 0 1 2 3 >>>> >>>> which are likely the corner nodes of the initial, first-order element, >>>> before adding extra nodes for the higher degree element. >>>> >>>> This connectivity values [0, 1, 2, 3, ...] are always the same, >>>> including for other elements, whereas the coordinates are correct >>>> >>>> E.g. 
for 3rd order triangle in [0, 1], coordinates are given left to >>>> right, bottom to top >>>> 0, 0 >>>> 1/3, 0, >>>> 2/3, 0, >>>> 1, 0 >>>> 0, 1/3 >>>> 1/3, 1/3 >>>> 2/3, 1/3 >>>> 0, 2/3, >>>> 1/3, 2/3 >>>> 0, 1 >>>> >>>> but the connectivity (viz/topology/cells) is [0, 1, 2]. >>>> >>>> Test meshes were created with gmsh from the python API, using >>>> gmsh.option.setNumber("Mesh.ElementOrder", n), for n = 1, 2, 3, ... >>>> >>>> Thank you. >>>> Noam >>>> On Friday, May 23rd, 2025 at 12:56 AM, Matthew Knepley < >>>> knepley at gmail.com> wrote: >>>> >>>> On Thu, May 22, 2025 at 12:25?PM Noam T. >>>> wrote: >>>> >>>>> Hello, >>>>> >>>>> Thank you the various options. >>>>> >>>>> Use case here would be obtaining the exact output generated by option >>>>> 1), DMView() with PETSC_VIEWER_HDF5_VIZ; in particular, the matrix >>>>> generated under /viz/topology/cells. >>>>> >>>>> There are several ways you might do this. It helps to know what you >>>>> are aiming for. >>>>> >>>>> 1) If you just want this output, it might be easier to just DMView() >>>>> with the PETSC_VIEWER_HDF5_VIZ format, since that just outputs the >>>>> cell-vertex topology and coordinates >>>>> >>>>> >>>>> Is it possible to get this information in memory, onto a Mat, Vec or >>>>> some other Int array object directly? it would be handy to have it in order >>>>> to manipulate it and/or save it to a different format/file. Saving to an >>>>> HDF5 and loading it again seems redundant. >>>>> >>>>> >>>>> 2) You can call DMPlexUninterpolate() to produce a mesh with just >>>>> cells and vertices, and output it in any format. >>>>> >>>>> 3) If you want it in memory, but still with global indices (I don't >>>>> understand this use case), then you can use DMPlexCreatePointNumbering() >>>>> for an overall global numbering, or DMPlexCreateCellNumbering() and >>>>> DMPlexCreateVertexNumbering() for separate global numberings. >>>>> >>>>> >>>>> Perhaps I missed it, but getting the connectivity matrix in >>>>> /viz/topology/cells/ did not seem directly trivial to me from the list of >>>>> global indices returned by DMPlexGetCell/Point/VertexNumbering() (i.e. I >>>>> assume all the operations done when calling DMView()). >>>>> >>>> >>>> Something like >>>> >>>> DMPlexGetHeightStratum(dm, 0, &cStart, &cEnd); >>>> DMPlexGetDepthStratum(dm, 0, &vStart, &vEnd); >>>> DMPlexGetVertexNumbering(dm, &globalVertexNumbers); >>>> ISGetIndices(globalVertexNumbers, &gv); >>>> for (PetscInt c = cStart; c < cEnd; ++c) { >>>> PetscInt *closure = NULL; >>>> >>>> DMPlexGetTransitiveClosure(dm, c, PETSC_TRUE, &Ncl, &closure); >>>> for (PetscInt cl = 0; c < Ncl * 2; cl += 2) { >>>> if (closure[cl] < vStart || closure[cl] >= vEnd) continue; >>>> const PetscInt v = gv[closure[cl]] < 0 ? -(gv[closure[cl]] + 1) : >>>> gv[closure[cl]]; >>>> >>>> // Do something with v >>>> } >>>> DMPlexRestoreTransitiveClosure(dm, c, PETSC_TRUE, &Ncl, &closure); >>>> } >>>> ISRestoreIndices(globalVertexNumbers, &gv); >>>> ISDestroy(&globalVertexNumbers); >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> Thanks, >>>>> Noam. >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. 
>>>> -- Norbert Wiener >>>> >>>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!Yrsqxk3Y_G1ua_QR9o4DIqJAMyqlrZjtniZoHJqKFNV8FYYjw0XmvDvvukBBCZwlSNnbLDWBwQlY--vGRsSC$ >>>> >>>> >>>> >>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!Yrsqxk3Y_G1ua_QR9o4DIqJAMyqlrZjtniZoHJqKFNV8FYYjw0XmvDvvukBBCZwlSNnbLDWBwQlY--vGRsSC$ >>> >>> >>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!Yrsqxk3Y_G1ua_QR9o4DIqJAMyqlrZjtniZoHJqKFNV8FYYjw0XmvDvvukBBCZwlSNnbLDWBwQlY--vGRsSC$ >> >> >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!Yrsqxk3Y_G1ua_QR9o4DIqJAMyqlrZjtniZoHJqKFNV8FYYjw0XmvDvvukBBCZwlSNnbLDWBwQlY--vGRsSC$ > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!Yrsqxk3Y_G1ua_QR9o4DIqJAMyqlrZjtniZoHJqKFNV8FYYjw0XmvDvvukBBCZwlSNnbLDWBwQlY--vGRsSC$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Fri Jun 20 21:53:04 2025 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 20 Jun 2025 22:53:04 -0400 Subject: [petsc-users] Workshop at ICERM Sep 13-14 2025 in honor of Bill Gropp's 70th birthday In-Reply-To: References: Message-ID: Sorry I sent an unhelpful URL in my initial email. The URL https://urldefense.us/v3/__https://icerm.brown.edu/program/hot_topics_workshop/htw-25-mlhpc__;!!G_uCfscf7eWS!doqDCkZN1F-L4k91ieXDC4kQH_1_YxokRdafY_IZ6QueVlTF3HD9Lf3GXhW4sHAb7zujHIMCEhRM73FgO8FaTPo$ will be more useful Barry > On Jun 20, 2025, at 1:21?PM, Barry Smith wrote: > > > Bill Gropp was extremely influential in the early days of PETSc. I learned numerical analysis from him as an undergraduate and was later his post-doc. > > ICERM is hosting a meeting, From Modeling to Learning with HPC, Sep 13-14 2025 in honor of Bill Gropp's 70th birthday. You can find more information at https://urldefense.us/v3/__https://app.icerm.brown.edu/Cube/apply__;!!G_uCfscf7eWS!doqDCkZN1F-L4k91ieXDC4kQH_1_YxokRdafY_IZ6QueVlTF3HD9Lf3GXhW4sHAb7zujHIMCEhRM73FgT2NmYcU$ > > Barry > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Sun Jun 22 18:50:55 2025 From: bsmith at petsc.dev (Barry Smith) Date: Sun, 22 Jun 2025 19:50:55 -0400 Subject: [petsc-users] Problem with composite DM index sets In-Reply-To: <45D7BA59-F137-4033-88C1-F1707FB4CA95@gmail.com> References: <45D7BA59-F137-4033-88C1-F1707FB4CA95@gmail.com> Message-ID: <49ED9181-FE44-4E24-93E4-4375FBFDED2C@petsc.dev> I'm sorry for not getting back to you sooner. I have attached a working version of your code. 
Since you were missing use petscdmcomposite the compiler could not generate the correct call to DMCompositeGetGlobalISs() Barry ? > On Jun 17, 2025, at 6:39?PM, Randall Mackie wrote: > > Dear Petsc users - > > I am trying to upgrade my code to petsc-3.23 (from 3.19), and I seem to have run into a problem with DMCompositeGetGlobalISs. > > The example program listed on the man page for DMCompositeGetGlobalISs, https://urldefense.us/v3/__https://petsc.org/release/src/snes/tutorials/ex73f90t.F90.html__;!!G_uCfscf7eWS!YAc7RXgD2pTlFNG8iFwVi1TLBPWnKA5evixlwBL203lkD1DSCns_LprWtUTmZyeLJT_8Fvd6stiDX66wf5t6rfw$ , seems to indicate that a call to DMCompositeGetGlobalISs does not need to allocate the IS pointer and you just pass it directly to DMCompositeGetGlobalISs. > > If I compile and run the simple attached test program (say on 2 processes), I get the following error: > > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see https://urldefense.us/v3/__https://petsc.org/release/faq/*valgrind__;Iw!!G_uCfscf7eWS!YAc7RXgD2pTlFNG8iFwVi1TLBPWnKA5evixlwBL203lkD1DSCns_LprWtUTmZyeLJT_8Fvd6stiDX66w3acojyM$ and https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!YAc7RXgD2pTlFNG8iFwVi1TLBPWnKA5evixlwBL203lkD1DSCns_LprWtUTmZyeLJT_8Fvd6stiDX66wXMLFE2g$ > [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > [0]PETSC ERROR: The line numbers in the error traceback may not be exact. > [0]PETSC ERROR: #1 F90Array1dCreate() at /home/rmackie/PETSc/petsc-3.23.3/src/sys/ftn-custom/f90_cwrap.c:123 > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > > If I uncomment the line to allocate the pointer, I get a very long traceback with lots of error messages. > > What is the correct way to use DMCompositeGetGlobalISs in Fortran? With or without the pointer allocation, and what is the right way to do this without the errors it seems to generate? > > Thanks, > > Randy Mackie > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: test.F90 Type: application/octet-stream Size: 1312 bytes Desc: not available URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: From rlmackie862 at gmail.com Sun Jun 22 19:06:09 2025 From: rlmackie862 at gmail.com (Randall Mackie) Date: Sun, 22 Jun 2025 17:06:09 -0700 Subject: [petsc-users] Problem with composite DM index sets In-Reply-To: <49ED9181-FE44-4E24-93E4-4375FBFDED2C@petsc.dev> References: <45D7BA59-F137-4033-88C1-F1707FB4CA95@gmail.com> <49ED9181-FE44-4E24-93E4-4375FBFDED2C@petsc.dev> Message-ID: <219C534B-35E3-4D72-9B4D-7B6702A4BFBF@gmail.com> Thanks Barry! > On Jun 22, 2025, at 4:50?PM, Barry Smith wrote: > > > I'm sorry for not getting back to you sooner. I have attached a working version of your code. Since you were missing > > use petscdmcomposite > > the compiler could not generate the correct call to DMCompositeGetGlobalISs() > > Barry > > > > >> On Jun 17, 2025, at 6:39?PM, Randall Mackie wrote: >> >> Dear Petsc users - >> >> I am trying to upgrade my code to petsc-3.23 (from 3.19), and I seem to have run into a problem with DMCompositeGetGlobalISs. 
>> >> The example program listed on the man page for DMCompositeGetGlobalISs, https://urldefense.us/v3/__https://petsc.org/release/src/snes/tutorials/ex73f90t.F90.html__;!!G_uCfscf7eWS!ZyuqN-aqT36-BdaEGEwwCK3DP9cdT5SGsfHT9nMWiXXRDCdzcYU2-Ns-3lpwwEqrEu3XthkjLensAwqjSn0zoF_GSw$ , seems to indicate that a call to DMCompositeGetGlobalISs does not need to allocate the IS pointer and you just pass it directly to DMCompositeGetGlobalISs. >> >> If I compile and run the simple attached test program (say on 2 processes), I get the following error: >> >> [0]PETSC ERROR: ------------------------------------------------------------------------ >> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range >> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [0]PETSC ERROR: or see https://urldefense.us/v3/__https://petsc.org/release/faq/*valgrind__;Iw!!G_uCfscf7eWS!ZyuqN-aqT36-BdaEGEwwCK3DP9cdT5SGsfHT9nMWiXXRDCdzcYU2-Ns-3lpwwEqrEu3XthkjLensAwqjSn2EvwnboA$ and https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!ZyuqN-aqT36-BdaEGEwwCK3DP9cdT5SGsfHT9nMWiXXRDCdzcYU2-Ns-3lpwwEqrEu3XthkjLensAwqjSn0Hqvgizg$ >> [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >> [0]PETSC ERROR: The line numbers in the error traceback may not be exact. >> [0]PETSC ERROR: #1 F90Array1dCreate() at /home/rmackie/PETSc/petsc-3.23.3/src/sys/ftn-custom/f90_cwrap.c:123 >> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 >> >> >> If I uncomment the line to allocate the pointer, I get a very long traceback with lots of error messages. >> >> What is the correct way to use DMCompositeGetGlobalISs in Fortran? With or without the pointer allocation, and what is the right way to do this without the errors it seems to generate? >> >> Thanks, >> >> Randy Mackie >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dontbugthedevs at proton.me Sun Jun 22 19:46:46 2025 From: dontbugthedevs at proton.me (Noam T.) Date: Mon, 23 Jun 2025 00:46:46 +0000 Subject: [petsc-users] Element connectivity of a DMPlex In-Reply-To: References: Message-ID: <6-SC6e7RrMxCHHKW0RHH7hBQ9Pj3LO7wXSOQayzfWSWnYTQ-WXmS5az-tfyADwLofuhTEzQFld9joR3R-4ug4oQUiGVoDmc4QbgKMJHUtfU=@proton.me> On Friday, June 20th, 2025 at 9:37 PM, Matthew Knepley wrote: > I do have this implemented. If you give -dm_plex_high_order_view, it will refine the grid and project into the linear space. > You can see that the code is pretty simple: Thanks ,that will be handy. Perhaps this whole idea of using higher-order elements will offer no benefit for visualization and we;ll end up using linear elements only. On Friday, June 20th, 2025 at 9:37 PM, Matthew Knepley wrote: > Yes. The idea is to think of coordinates as a discretized field on the mesh, exactly as the solution field. Thus if you want higher order coordinates, you choose a higher order coordinate space. I give the coordinate space prefix cdm_, so you could say > > -cdm_petscspace_degree 2 The use of the flag "-..._petscspace_degree N" for higher order approximation space is indeed something we use. However, you mention this being a field, which is the part confusing me; I am not quite sure how to get coordinates values form it; I am only aware of functions such as where DMGetCoordinates / DMPlexGetCellCoordinates, which won't do the job here. 
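For what it's worth, a minimal sketch of reading such a coordinate field cell by cell, assuming the program was started with -cdm_petscspace_degree 2 so that the coordinates already live in a P2 space (dm is the mesh; the other names are placeholders):

  DM           cdm;
  PetscSection csec;
  Vec          coords;
  PetscInt     cStart, cEnd;

  PetscCall(DMGetCoordinateDM(dm, &cdm));
  PetscCall(DMGetCoordinateSection(dm, &csec));
  PetscCall(DMGetCoordinatesLocal(dm, &coords));
  PetscCall(DMPlexGetHeightStratum(dm, 0, &cStart, &cEnd));
  for (PetscInt c = cStart; c < cEnd; ++c) {
    PetscScalar *xc = NULL;
    PetscInt     nc;

    /* closure of the coordinate field on this cell: 12 values for a P2 triangle in 2D (6 nodes x 2) */
    PetscCall(DMPlexVecGetClosure(cdm, csec, coords, c, &nc, &xc));
    /* ... use xc[0 .. nc-1] ... */
    PetscCall(DMPlexVecRestoreClosure(cdm, csec, coords, c, &nc, &xc));
  }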
The higher order space is indeed created, as shown with -dm_petscds_view --- Discrete System with 1 fields cell total dim 6 total comp 1 Field P2 FEM 1 component (implicit) (Nq 6 Nqc 1) 1-jet PetscFE Object: P2 (cdm_) 1 MPI process type: basic Basic Finite Element in 2 dimensions with 1 components PetscSpace Object: P2 (cdm_) 1 MPI process type: poly Space in 2 variables with 1 components, size 6 Polynomial space of degree 2 PetscDualSpace Object: P2 (cdm_) 1 MPI process type: lagrange Dual space with 1 components, size 6 Continuous Lagrange dual space Quadrature on a triangle of order 4 on 6 points (dim 2) Weak Form System with 1 fields --- Thanks, Noam ------- On Friday, June 20th, 2025 at 9:37 PM, Matthew Knepley wrote: > On Fri, Jun 20, 2025 at 5:14?PM Noam T. wrote: > >> Thank you once again, the code provides exactly what needed. >> An alternative for the VTK use was subdividing cells using corner nodes and integration points, such that all cells were first order. Any "better" alternative format/visualization software for this purpose? > > I do have this implemented. If you give -dm_plex_high_order_view, it will refine the grid and project into the linear space. > You can see that the code is pretty simple: > > https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/blob/main/src/dm/impls/plex/plex.c?ref_type=heads*L2021__;Iw!!G_uCfscf7eWS!f8aNNKGwSGN8vZeoKzWun79cKAhlzOkGLwSR4km2j_SaiJJhvUBmTsk094zqOSUvigxdI4J6WmRkIJN617sEAidRPshVSYYl$ > > It uses DMPlexInterpolate() to connect the spaces. > >> Possibly the last question in relation to this matter. We use first order meshes only, as you suggest, and let PETSc handle everything high-order through the approximation space. Hence, when retrieving node coordinates with DMGetCoordinates(Local) or DMPlexGetCellCoordinates, one gets the corner nodes only. >> >> Is the list of additional, high-order nodes coordinates readily available (stored) somewhere to be retrieved? > > Yes. The idea is to think of coordinates as a discretized field on the mesh, exactly as the solution field. Thus if you want higher order coordinates, you choose a higher order coordinate space. I give the coordinate space prefix cdm_, so you could say > > -cdm_petscspace_degree 2 > > to get quadratic coordinates. There are some tests in Plex tests ex33.c > > Thanks, > > Matt > >> They can be computed (e.g. using DMPlexReferenceToCoordinates knowing their position in the reference cell; or using corner nodes coordinates), but this will result in shared nodes being computed possibly several times; the large the mesh, the worse. >> >> E.g. in the example of the image attached before, DMPlexGetClosureIndices returns >> [4, 5, 6, 0, 1, 3] for the first cell >> [7, 8, 5, 1, 2, 3] for the second cell >> so that 3 nodes (5, 1, 3) will be computed twice if done naively, cell by cell. >> >> Thank you, >> Noam >> On Thursday, June 19th, 2025 at 12:43 AM, Matthew Knepley wrote: >> >>> On Wed, Jun 18, 2025 at 6:49?PM Noam T. wrote: >>> >>>> See image attached. >>>> Connectivity of the top mesh (first order triangle), can be obtained with the code shared before. >>>> Connectivity of the bottom mesh (second order triangle) is what I would be interested in obtaining. >>>> >>>> However, given your clarification on what the Plex and the PetscSection handle, it might not work; I am trying to get form the Plex what's only available from the PetscSection. 
>>>> >>>> The purpose of this extended connectivity is plotting; in particular, using VTU files, where the "connectivity" of cells is required, and the extra nodes would be needed when using higher-order elements (e.g. VTK_QUADRATIC_TRIANGLE, VTK_QUADRATIC_QUAD, etc). >>> >>> Oh yes. VTK does this in a particularly ugly and backward way. Sigh. There is nothing we can do about this now, but someone should replace VTK with a proper interface at some point. >>> >>> So I understand why you want it and it is a defensible case, so here is how you get that (with some explanation). Those locations, I think, should not be understood as topological things, but rather as the locations of point evaluation functionals constituting a basis for the dual space (to your approximation space). I would call DMPlexGetClosureIndices() (https://urldefense.us/v3/__https://petsc.org/main/manualpages/DMPlex/DMPlexGetClosureIndices/__;!!G_uCfscf7eWS!f8aNNKGwSGN8vZeoKzWun79cKAhlzOkGLwSR4km2j_SaiJJhvUBmTsk094zqOSUvigxdI4J6WmRkIJN617sEAidRPhIhe3Gt$ ) with a Section having the layout of P2 or Q2. This is the easy way to make that >>> >>> PetscSection gs; >>> PetscFE fe; >>> DMPolytopeType ct; >>> PetscInt dim, cStart; >>> >>> PetscCall(DMGetDimension(dm, &dim)); >>> PetscCall(DMPlexGetHeightStratum(dm, 0, &cStart, NULL)); >>> PetscCall(DMPlexGetCellType(dm, cStart, &ct)); >>> PetscCall(PetscFECreateLagrangeByCell(PETSC_COMM_SELF, dim, 1, ct, 2, PETSC_DETERMINE, &fe)); >>> PetscCall(DMSetField(dm, 0, NULL, (PetscObject)fe)); >>> PetscCall(PetscFEDestroy(&fe)); >>> PetscCall(DMCreateDS(dm)); >>> PetscCall(DMGetGlobalSection(dm, &gs)); >>> >>> PetscInt *indices = NULL; >>> PetscInt Nidx; >>> >>> PetscCall(DMPlexGetClosureIndices(dm, gs, gs, cell, PETSC_TRUE, &Nidx, &indices, NULL, NULL)); >>> >>> Thanks, >>> >>> MAtt >>> >>>> Perhaps I am over complicating things, and all this information can be obtained in a different, simpler way. >>>> >>>> Thanks. >>>> Noam >>>> On Tuesday, June 17th, 2025 at 5:42 PM, Matthew Knepley wrote: >>>> >>>>> On Tue, Jun 17, 2025 at 12:43?PM Noam T. wrote: >>>>> >>>>>> Thank you. For now, I am dealing with vertices only. >>>>>> >>>>>> Perhaps I did not explain myself properly, or I misunderstood your response. >>>>>> What I meant to say is, given an element of order higher than one, the connectivity matrix I obtain this way only contains as many entries as the first order element: 3 for a triangle, 4 for a tetrahedron, etc. >>>>>> >>>>>> Looking at the closure of any cell in the mesh, this is also the case.However, the nodes are definitely present; e.g. from >>>>>> >>>>>> DMPlexGetCellCoordinates(dm, cell, NULL, nc, NULL, NULL) >>>>>> >>>>>> nc returns the expected value (12 for a 2nd order 6-node planar triangle, 30 for a 2nd order 10-node tetrahedron, etc). >>>>>> >>>>>> The question is, are the indices of these extra nodes obtainable in a similar way as with the code shared before? So that one can have e.g. [0, 1, 2, 3, 4, 5] for a second order triangle, not just [0, 1, 2]. >>>>> >>>>> I am having a hard time understanding what you are after. I think this is because many FEM approaches confuse topology with analysis. >>>>> >>>>> The Plex stores topology, and you can retrieve adjacencies between any two mesh points. >>>>> >>>>> The PetscSection maps mesh points (cells, faces, edges , vertices) to sets of dofs. This is how higher order elements are implemented. Thus, we do not have to change topology to get different function spaces. 
>>>>> >>>>> The intended interface is for you to call DMPlexVecGetClosure() to get the closure of a cell (or face, or edge). You can also call DMPlexGetClosureIndices(), but index wrangling is what I intended to eliminate. >>>>> >>>>> What exactly are you looking for here? >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>>> Thank you. >>>>>> Noam >>>>>> On Friday, June 13th, 2025 at 3:05 PM, Matthew Knepley wrote: >>>>>> >>>>>>> On Thu, Jun 12, 2025 at 4:26?PM Noam T. wrote: >>>>>>> >>>>>>>> Thank you for the code; it provides exactly what I was looking for. >>>>>>>> >>>>>>>> Following up on this matter, does this method not work for higher order elements? For example, using an 8-node quadrilateral, exporting to a PETSC_VIEWER_HDF5_VIZ viewer provides the correct matrix of node coordinates in geometry/vertices >>>>>>> >>>>>>> If you wanted to include edges/faces, you could do it. First, you would need to decide how you would number things For example, would you number all points contiguously, or separately number cells, vertices, faces and edges. Second, you would check for faces/edges in the closure loop. Right now, we only check for vertices. >>>>>>> >>>>>>> I would say that this is what convinced me not to do FEM this way. >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Matt >>>>>>> >>>>>>>> (here a quadrilateral in [0, 10]) >>>>>>>> 5.0, 5.0 >>>>>>>> 0.0, 0.0 >>>>>>>> 10.0, 0.0 >>>>>>>> 10.0, 10.0 >>>>>>>> 0.0, 10.0 >>>>>>>> 5.0, 0.0 >>>>>>>> 10.0, 5.0 >>>>>>>> 5.0, 10.0 >>>>>>>> 0.0, 5.0 >>>>>>>> >>>>>>>> but the connectivity in viz/topology is >>>>>>>> >>>>>>>> 0 1 2 3 >>>>>>>> >>>>>>>> which are likely the corner nodes of the initial, first-order element, before adding extra nodes for the higher degree element. >>>>>>>> >>>>>>>> This connectivity values [0, 1, 2, 3, ...] are always the same, including for other elements, whereas the coordinates are correct >>>>>>>> >>>>>>>> E.g. for 3rd order triangle in [0, 1], coordinates are given left to right, bottom to top >>>>>>>> 0, 0 >>>>>>>> 1/3, 0, >>>>>>>> 2/3, 0, >>>>>>>> 1, 0 >>>>>>>> 0, 1/3 >>>>>>>> 1/3, 1/3 >>>>>>>> 2/3, 1/3 >>>>>>>> 0, 2/3, >>>>>>>> 1/3, 2/3 >>>>>>>> 0, 1 >>>>>>>> >>>>>>>> but the connectivity (viz/topology/cells) is [0, 1, 2]. >>>>>>>> >>>>>>>> Test meshes were created with gmsh from the python API, using >>>>>>>> gmsh.option.setNumber("Mesh.ElementOrder", n), for n = 1, 2, 3, ... >>>>>>>> >>>>>>>> Thank you. >>>>>>>> Noam >>>>>>>> On Friday, May 23rd, 2025 at 12:56 AM, Matthew Knepley wrote: >>>>>>>> >>>>>>>>> On Thu, May 22, 2025 at 12:25?PM Noam T. wrote: >>>>>>>>> >>>>>>>>>> Hello, >>>>>>>>>> >>>>>>>>>> Thank you the various options. >>>>>>>>>> >>>>>>>>>> Use case here would be obtaining the exact output generated by option 1), DMView() with PETSC_VIEWER_HDF5_VIZ; in particular, the matrix generated under /viz/topology/cells. >>>>>>>>>> >>>>>>>>>>> There are several ways you might do this. It helps to know what you are aiming for. >>>>>>>>>>> >>>>>>>>>>> 1) If you just want this output, it might be easier to just DMView() with the PETSC_VIEWER_HDF5_VIZ format, since that just outputs the cell-vertex topology and coordinates >>>>>>>>>> >>>>>>>>>> Is it possible to get this information in memory, onto a Mat, Vec or some other Int array object directly? it would be handy to have it in order to manipulate it and/or save it to a different format/file. Saving to an HDF5 and loading it again seems redundant. 
>>>>>>>>>> >>>>>>>>>>> 2) You can call DMPlexUninterpolate() to produce a mesh with just cells and vertices, and output it in any format. >>>>>>>>>>> >>>>>>>>>>> 3) If you want it in memory, but still with global indices (I don't understand this use case), then you can use DMPlexCreatePointNumbering() for an overall global numbering, or DMPlexCreateCellNumbering() and DMPlexCreateVertexNumbering() for separate global numberings. >>>>>>>>>> >>>>>>>>>> Perhaps I missed it, but getting the connectivity matrix in /viz/topology/cells/ did not seem directly trivial to me from the list of global indices returned by DMPlexGetCell/Point/VertexNumbering() (i.e. I assume all the operations done when calling DMView()). >>>>>>>>> >>>>>>>>> Something like >>>>>>>>> >>>>>>>>> DMPlexGetHeightStratum(dm, 0, &cStart, &cEnd); >>>>>>>>> DMPlexGetDepthStratum(dm, 0, &vStart, &vEnd); >>>>>>>>> DMPlexGetVertexNumbering(dm, &globalVertexNumbers); >>>>>>>>> ISGetIndices(globalVertexNumbers, &gv); >>>>>>>>> for (PetscInt c = cStart; c < cEnd; ++c) { >>>>>>>>> PetscInt *closure = NULL; >>>>>>>>> >>>>>>>>> DMPlexGetTransitiveClosure(dm, c, PETSC_TRUE, &Ncl, &closure); >>>>>>>>> for (PetscInt cl = 0; c < Ncl * 2; cl += 2) { >>>>>>>>> if (closure[cl] < vStart || closure[cl] >= vEnd) continue; >>>>>>>>> const PetscInt v = gv[closure[cl]] < 0 ? -(gv[closure[cl]] + 1) : gv[closure[cl]]; >>>>>>>>> >>>>>>>>> // Do something with v >>>>>>>>> } >>>>>>>>> DMPlexRestoreTransitiveClosure(dm, c, PETSC_TRUE, &Ncl, &closure); >>>>>>>>> } >>>>>>>>> ISRestoreIndices(globalVertexNumbers, &gv); >>>>>>>>> ISDestroy(&globalVertexNumbers); >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Matt >>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Noam. >>>>>>>>> >>>>>>>>> -- >>>>>>>>> >>>>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>>>>> -- Norbert Wiener >>>>>>>>> >>>>>>>>> [https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/*(http:/*www.cse.buffalo.edu/*knepley/)__;fl0vfg!!G_uCfscf7eWS!f8aNNKGwSGN8vZeoKzWun79cKAhlzOkGLwSR4km2j_SaiJJhvUBmTsk094zqOSUvigxdI4J6WmRkIJN617sEAidRPnN3rfK9$ >>>>>>> >>>>>>> -- >>>>>>> >>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>>> -- Norbert Wiener >>>>>>> >>>>>>> [https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/*(http:/*www.cse.buffalo.edu/*knepley/)__;fl0vfg!!G_uCfscf7eWS!f8aNNKGwSGN8vZeoKzWun79cKAhlzOkGLwSR4km2j_SaiJJhvUBmTsk094zqOSUvigxdI4J6WmRkIJN617sEAidRPnN3rfK9$ >>>>> >>>>> -- >>>>> >>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> [https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/*(http:/*www.cse.buffalo.edu/*knepley/)__;fl0vfg!!G_uCfscf7eWS!f8aNNKGwSGN8vZeoKzWun79cKAhlzOkGLwSR4km2j_SaiJJhvUBmTsk094zqOSUvigxdI4J6WmRkIJN617sEAidRPnN3rfK9$ >>> >>> -- >>> >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
>>> -- Norbert Wiener >>> >>> [https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/*(http:/*www.cse.buffalo.edu/*knepley/)__;fl0vfg!!G_uCfscf7eWS!f8aNNKGwSGN8vZeoKzWun79cKAhlzOkGLwSR4km2j_SaiJJhvUBmTsk094zqOSUvigxdI4J6WmRkIJN617sEAidRPnN3rfK9$ > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > [https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/*(http:/*www.cse.buffalo.edu/*knepley/)__;fl0vfg!!G_uCfscf7eWS!f8aNNKGwSGN8vZeoKzWun79cKAhlzOkGLwSR4km2j_SaiJJhvUBmTsk094zqOSUvigxdI4J6WmRkIJN617sEAidRPnN3rfK9$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Sun Jun 22 20:27:20 2025 From: bsmith at petsc.dev (Barry Smith) Date: Sun, 22 Jun 2025 21:27:20 -0400 Subject: [petsc-users] Problem with composite DM index sets In-Reply-To: <219C534B-35E3-4D72-9B4D-7B6702A4BFBF@gmail.com> References: <45D7BA59-F137-4033-88C1-F1707FB4CA95@gmail.com> <49ED9181-FE44-4E24-93E4-4375FBFDED2C@petsc.dev> <219C534B-35E3-4D72-9B4D-7B6702A4BFBF@gmail.com> Message-ID: <3F363791-D3CC-47FD-8F19-0850B39AD5EE@petsc.dev> implicit none prevents using undeclared variables. Is there a way to avoid calling any functions/suboutines that don't have an interface declared in a module? That would have found this problem. Barry > On Jun 22, 2025, at 8:06?PM, Randall Mackie wrote: > > Thanks Barry! > > > >> On Jun 22, 2025, at 4:50?PM, Barry Smith wrote: >> >> >> I'm sorry for not getting back to you sooner. I have attached a working version of your code. Since you were missing >> >> use petscdmcomposite >> >> the compiler could not generate the correct call to DMCompositeGetGlobalISs() >> >> Barry >> >> >> >> >>> On Jun 17, 2025, at 6:39?PM, Randall Mackie wrote: >>> >>> Dear Petsc users - >>> >>> I am trying to upgrade my code to petsc-3.23 (from 3.19), and I seem to have run into a problem with DMCompositeGetGlobalISs. >>> >>> The example program listed on the man page for DMCompositeGetGlobalISs, https://urldefense.us/v3/__https://petsc.org/release/src/snes/tutorials/ex73f90t.F90.html__;!!G_uCfscf7eWS!f0L9a3sH41EKMm27K03wzM-zl9LjhI-XIHKVLQdkjm5v_R98tKT8zlw6goKrNPjT8PommbiIyWOxn1xaWVXMKjM$ , seems to indicate that a call to DMCompositeGetGlobalISs does not need to allocate the IS pointer and you just pass it directly to DMCompositeGetGlobalISs. >>> >>> If I compile and run the simple attached test program (say on 2 processes), I get the following error: >>> >>> [0]PETSC ERROR: ------------------------------------------------------------------------ >>> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range >>> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>> [0]PETSC ERROR: or see https://urldefense.us/v3/__https://petsc.org/release/faq/*valgrind__;Iw!!G_uCfscf7eWS!f0L9a3sH41EKMm27K03wzM-zl9LjhI-XIHKVLQdkjm5v_R98tKT8zlw6goKrNPjT8PommbiIyWOxn1xa4KzHtdM$ and https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!f0L9a3sH41EKMm27K03wzM-zl9LjhI-XIHKVLQdkjm5v_R98tKT8zlw6goKrNPjT8PommbiIyWOxn1xaB1ieH9I$ >>> [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >>> [0]PETSC ERROR: The line numbers in the error traceback may not be exact. 
>>> [0]PETSC ERROR: #1 F90Array1dCreate() at /home/rmackie/PETSc/petsc-3.23.3/src/sys/ftn-custom/f90_cwrap.c:123 >>> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 >>> >>> >>> If I uncomment the line to allocate the pointer, I get a very long traceback with lots of error messages. >>> >>> What is the correct way to use DMCompositeGetGlobalISs in Fortran? With or without the pointer allocation, and what is the right way to do this without the errors it seems to generate? >>> >>> Thanks, >>> >>> Randy Mackie >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sun Jun 22 20:35:05 2025 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 22 Jun 2025 21:35:05 -0400 Subject: [petsc-users] Element connectivity of a DMPlex In-Reply-To: <6-SC6e7RrMxCHHKW0RHH7hBQ9Pj3LO7wXSOQayzfWSWnYTQ-WXmS5az-tfyADwLofuhTEzQFld9joR3R-4ug4oQUiGVoDmc4QbgKMJHUtfU=@proton.me> References: <6-SC6e7RrMxCHHKW0RHH7hBQ9Pj3LO7wXSOQayzfWSWnYTQ-WXmS5az-tfyADwLofuhTEzQFld9joR3R-4ug4oQUiGVoDmc4QbgKMJHUtfU=@proton.me> Message-ID: On Sun, Jun 22, 2025 at 8:46?PM Noam T. wrote: > On Friday, June 20th, 2025 at 9:37 PM, Matthew Knepley > wrote: > > > I do have this implemented. If you give -dm_plex_high_order_view, it will > refine the grid and project into the linear space. > You can see that the code is pretty simple: > > > Thanks ,that will be handy. Perhaps this whole idea of using higher-order > elements will offer no benefit for visualization and we;ll end up using > linear elements only. > > On Friday, June 20th, 2025 at 9:37 PM, Matthew Knepley > wrote: > > > Yes. The idea is to think of coordinates as a discretized field on the > mesh, exactly as the solution field. Thus if you want higher order > coordinates, you choose a higher order coordinate space. I give the > coordinate space prefix cdm_, so you could say > > -cdm_petscspace_degree 2 > > > The use of the flag "-..._petscspace_degree N" for higher order > approximation space is indeed something we use. However, you mention this > being a field, which is the part confusing me; I am not quite sure how to > get coordinates values form it; I am only aware of functions such as where > DMGetCoordinates / DMPlexGetCellCoordinates, which won't do the job here. > 1. You could get the values the same way you get values from a higher order solution field. That would probably use DMGetCoordinates() and the coordinate Section. 2. You could call DMPlexGetCellCoordinates(), which will mess around and give you the coordinates on the closure of the cell. Thanks, Matt > The higher order space is indeed created, as shown with -dm_petscds_view > > --- > Discrete System with 1 fields > cell total dim 6 total comp 1 > Field P2 FEM 1 component (implicit) (Nq 6 Nqc 1) 1-jet > PetscFE Object: P2 (cdm_) 1 MPI process > type: basic > Basic Finite Element in 2 dimensions with 1 components > PetscSpace Object: P2 (cdm_) 1 MPI process > type: poly > Space in 2 variables with 1 components, size 6 > Polynomial space of degree 2 > PetscDualSpace Object: P2 (cdm_) 1 MPI process > type: lagrange > Dual space with 1 components, size 6 > Continuous Lagrange dual space > Quadrature on a triangle of order 4 on 6 points (dim 2) > Weak Form System with 1 fields > --- > > Thanks, > Noam > > ------- > On Friday, June 20th, 2025 at 9:37 PM, Matthew Knepley > wrote: > > On Fri, Jun 20, 2025 at 5:14?PM Noam T. wrote: > >> Thank you once again, the code provides exactly what needed. 
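A minimal sketch of the second option above, using the same six-argument form of DMPlexGetCellCoordinates() that appeared earlier in this thread (dm and cell are placeholders; the count Nc depends on the degree of the coordinate space):

  PetscBool          isDG;
  PetscInt           Nc;
  const PetscScalar *array;
  PetscScalar       *coords;

  PetscCall(DMPlexGetCellCoordinates(dm, cell, &isDG, &Nc, &array, &coords));
  /* coords[0 .. Nc-1] are the coordinates on the closure of this cell,
     e.g. Nc = 12 for a triangle in 2D once the coordinate space is P2 */
  PetscCall(DMPlexRestoreCellCoordinates(dm, cell, &isDG, &Nc, &array, &coords));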
>> An alternative for the VTK use was subdividing cells using corner nodes >> and integration points, such that all cells were first order. Any "better" >> alternative format/visualization software for this purpose? >> > > I do have this implemented. If you give -dm_plex_high_order_view, it will > refine the grid and project into the linear space. > You can see that the code is pretty simple: > > > https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/blob/main/src/dm/impls/plex/plex.c?ref_type=heads*L2021__;Iw!!G_uCfscf7eWS!dp4Qq6UBZAvh4t0Grm8-HusDmQH03-Azcq6ZI_rLYL-jWzYxJO2Hydno1EYzrv9yjLb7YEUrBkc9OwL-_rO_$ > > It uses DMPlexInterpolate() to connect the spaces. > >> Possibly the last question in relation to this matter. We use first order >> meshes only, as you suggest, and let PETSc handle everything high-order >> through the approximation space. Hence, when retrieving node coordinates >> with DMGetCoordinates(Local) or DMPlexGetCellCoordinates, one gets the >> corner nodes only. >> >> Is the list of additional, high-order nodes coordinates readily available >> (stored) somewhere to be retrieved? >> > > Yes. The idea is to think of coordinates as a discretized field on the > mesh, exactly as the solution field. Thus if you want higher order > coordinates, you choose a higher order coordinate space. I give the > coordinate space prefix cdm_, so you could say > > -cdm_petscspace_degree 2 > > to get quadratic coordinates. There are some tests in Plex tests ex33.c > > Thanks, > > Matt > >> They can be computed (e.g. using DMPlexReferenceToCoordinates knowing >> their position in the reference cell; or using corner nodes coordinates), >> but this will result in shared nodes being computed possibly several times; >> the large the mesh, the worse. >> >> E.g. in the example of the image attached before, DMPlexGetClosureIndices >> returns >> [4, 5, 6, 0, 1, 3] for the first cell >> [7, 8, 5, 1, 2, 3] for the second cell >> so that 3 nodes (5, 1, 3) will be computed twice if done naively, cell by >> cell. >> >> Thank you, >> Noam >> On Thursday, June 19th, 2025 at 12:43 AM, Matthew Knepley < >> knepley at gmail.com> wrote: >> >> On Wed, Jun 18, 2025 at 6:49?PM Noam T. wrote: >> >>> See image attached. >>> Connectivity of the top mesh (first order triangle), can be obtained >>> with the code shared before. >>> Connectivity of the bottom mesh (second order triangle) is what I would >>> be interested in obtaining. >>> >>> However, given your clarification on what the Plex and the PetscSection >>> handle, it might not work; I am trying to get form the Plex what's only >>> available from the PetscSection. >>> >>> The purpose of this extended connectivity is plotting; in particular, >>> using VTU files, where the "connectivity" of cells is required, and the >>> extra nodes would be needed when using higher-order elements (e.g. >>> VTK_QUADRATIC_TRIANGLE, VTK_QUADRATIC_QUAD, etc). >>> >> >> Oh yes. VTK does this in a particularly ugly and backward way. Sigh. >> There is nothing we can do about this now, but someone should replace VTK >> with a proper interface at some point. >> >> So I understand why you want it and it is a defensible case, so here is >> how you get that (with some explanation). Those locations, I think, should >> not be understood as topological things, but rather as the locations of >> point evaluation functionals constituting a basis for the dual space (to >> your approximation space). 
I would call DMPlexGetClosureIndices() ( >> https://urldefense.us/v3/__https://petsc.org/main/manualpages/DMPlex/DMPlexGetClosureIndices/__;!!G_uCfscf7eWS!dp4Qq6UBZAvh4t0Grm8-HusDmQH03-Azcq6ZI_rLYL-jWzYxJO2Hydno1EYzrv9yjLb7YEUrBkc9O6jYpeQt$ ) with >> a Section having the layout of P2 or Q2. This is the easy way to make that >> >> PetscSection gs; >> PetscFE fe; >> DMPolytopeType ct; >> PetscInt dim, cStart; >> >> PetscCall(DMGetDimension(dm, &dim)); >> PetscCall(DMPlexGetHeightStratum(dm, 0, &cStart, NULL)); >> PetscCall(DMPlexGetCellType(dm, cStart, &ct)); >> PetscCall(PetscFECreateLagrangeByCell(PETSC_COMM_SELF, dim, 1, ct, 2, >> PETSC_DETERMINE, &fe)); >> PetscCall(DMSetField(dm, 0, NULL, (PetscObject)fe)); >> PetscCall(PetscFEDestroy(&fe)); >> PetscCall(DMCreateDS(dm)); >> PetscCall(DMGetGlobalSection(dm, &gs)); >> >> PetscInt *indices = NULL; >> PetscInt Nidx; >> >> PetscCall(DMPlexGetClosureIndices(dm, gs, gs, cell, PETSC_TRUE, &Nidx, >> &indices, NULL, NULL)); >> >> Thanks, >> >> MAtt >> >>> Perhaps I am over complicating things, and all this information can be >>> obtained in a different, simpler way. >>> >>> Thanks. >>> Noam >>> On Tuesday, June 17th, 2025 at 5:42 PM, Matthew Knepley < >>> knepley at gmail.com> wrote: >>> >>> On Tue, Jun 17, 2025 at 12:43?PM Noam T. >>> wrote: >>> >>>> Thank you. For now, I am dealing with vertices only. >>>> >>>> Perhaps I did not explain myself properly, or I misunderstood your >>>> response. >>>> What I meant to say is, given an element of order higher than one, the >>>> connectivity matrix I obtain this way only contains as many entries as the >>>> first order element: 3 for a triangle, 4 for a tetrahedron, etc. >>>> >>>> Looking at the closure of any cell in the mesh, this is also the >>>> case.However, the nodes are definitely present; e.g. from >>>> >>>> DMPlexGetCellCoordinates(dm, cell, NULL, nc, NULL, NULL) >>>> >>>> nc returns the expected value (12 for a 2nd order 6-node planar >>>> triangle, 30 for a 2nd order 10-node tetrahedron, etc). >>>> >>>> The question is, are the indices of these extra nodes obtainable in a >>>> similar way as with the code shared before? So that one can have e.g. [0, >>>> 1, 2, 3, 4, 5] for a second order triangle, not just [0, 1, 2]. >>>> >>> >>> I am having a hard time understanding what you are after. I think this >>> is because many FEM approaches confuse topology with analysis. >>> >>> The Plex stores topology, and you can retrieve adjacencies between any >>> two mesh points. >>> >>> The PetscSection maps mesh points (cells, faces, edges , vertices) to >>> sets of dofs. This is how higher order elements are implemented. Thus, we >>> do not have to change topology to get different function spaces. >>> >>> The intended interface is for you to call DMPlexVecGetClosure() to get >>> the closure of a cell (or face, or edge). You can also call >>> DMPlexGetClosureIndices(), but index wrangling is what I intended to >>> eliminate. >>> >>> What exactly are you looking for here? >>> >>> Thanks, >>> >>> Matt >>> >>>> Thank you. >>>> Noam >>>> On Friday, June 13th, 2025 at 3:05 PM, Matthew Knepley < >>>> knepley at gmail.com> wrote: >>>> >>>> On Thu, Jun 12, 2025 at 4:26?PM Noam T. >>>> wrote: >>>> >>>>> >>>>> Thank you for the code; it provides exactly what I was looking for. >>>>> >>>>> Following up on this matter, does this method not work for higher >>>>> order elements? 
For example, using an 8-node quadrilateral, exporting to a >>>>> PETSC_VIEWER_HDF5_VIZ viewer provides the correct matrix of node >>>>> coordinates in geometry/vertices >>>>> >>>> >>>> If you wanted to include edges/faces, you could do it. First, you would >>>> need to decide how you would number things For example, would you number >>>> all points contiguously, or separately number cells, vertices, faces and >>>> edges. Second, you would check for faces/edges in the closure loop. Right >>>> now, we only check for vertices. >>>> >>>> I would say that this is what convinced me not to do FEM this way. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>>> (here a quadrilateral in [0, 10]) >>>>> 5.0, 5.0 >>>>> 0.0, 0.0 >>>>> 10.0, 0.0 >>>>> 10.0, 10.0 >>>>> 0.0, 10.0 >>>>> 5.0, 0.0 >>>>> 10.0, 5.0 >>>>> 5.0, 10.0 >>>>> 0.0, 5.0 >>>>> >>>>> but the connectivity in viz/topology is >>>>> >>>>> 0 1 2 3 >>>>> >>>>> which are likely the corner nodes of the initial, first-order element, >>>>> before adding extra nodes for the higher degree element. >>>>> >>>>> This connectivity values [0, 1, 2, 3, ...] are always the same, >>>>> including for other elements, whereas the coordinates are correct >>>>> >>>>> E.g. for 3rd order triangle in [0, 1], coordinates are given left to >>>>> right, bottom to top >>>>> 0, 0 >>>>> 1/3, 0, >>>>> 2/3, 0, >>>>> 1, 0 >>>>> 0, 1/3 >>>>> 1/3, 1/3 >>>>> 2/3, 1/3 >>>>> 0, 2/3, >>>>> 1/3, 2/3 >>>>> 0, 1 >>>>> >>>>> but the connectivity (viz/topology/cells) is [0, 1, 2]. >>>>> >>>>> Test meshes were created with gmsh from the python API, using >>>>> gmsh.option.setNumber("Mesh.ElementOrder", n), for n = 1, 2, 3, ... >>>>> >>>>> Thank you. >>>>> Noam >>>>> On Friday, May 23rd, 2025 at 12:56 AM, Matthew Knepley < >>>>> knepley at gmail.com> wrote: >>>>> >>>>> On Thu, May 22, 2025 at 12:25?PM Noam T. >>>>> wrote: >>>>> >>>>>> Hello, >>>>>> >>>>>> Thank you the various options. >>>>>> >>>>>> Use case here would be obtaining the exact output generated by option >>>>>> 1), DMView() with PETSC_VIEWER_HDF5_VIZ; in particular, the matrix >>>>>> generated under /viz/topology/cells. >>>>>> >>>>>> There are several ways you might do this. It helps to know what you >>>>>> are aiming for. >>>>>> >>>>>> 1) If you just want this output, it might be easier to just DMView() >>>>>> with the PETSC_VIEWER_HDF5_VIZ format, since that just outputs the >>>>>> cell-vertex topology and coordinates >>>>>> >>>>>> >>>>>> Is it possible to get this information in memory, onto a Mat, Vec or >>>>>> some other Int array object directly? it would be handy to have it in order >>>>>> to manipulate it and/or save it to a different format/file. Saving to an >>>>>> HDF5 and loading it again seems redundant. >>>>>> >>>>>> >>>>>> 2) You can call DMPlexUninterpolate() to produce a mesh with just >>>>>> cells and vertices, and output it in any format. >>>>>> >>>>>> 3) If you want it in memory, but still with global indices (I don't >>>>>> understand this use case), then you can use DMPlexCreatePointNumbering() >>>>>> for an overall global numbering, or DMPlexCreateCellNumbering() and >>>>>> DMPlexCreateVertexNumbering() for separate global numberings. >>>>>> >>>>>> >>>>>> Perhaps I missed it, but getting the connectivity matrix in >>>>>> /viz/topology/cells/ did not seem directly trivial to me from the list of >>>>>> global indices returned by DMPlexGetCell/Point/VertexNumbering() (i.e. I >>>>>> assume all the operations done when calling DMView()). 
>>>>>>
>>>>> Something like
>>>>>
>>>>> DMPlexGetHeightStratum(dm, 0, &cStart, &cEnd);
>>>>> DMPlexGetDepthStratum(dm, 0, &vStart, &vEnd);
>>>>> DMPlexGetVertexNumbering(dm, &globalVertexNumbers);
>>>>> ISGetIndices(globalVertexNumbers, &gv);
>>>>> for (PetscInt c = cStart; c < cEnd; ++c) {
>>>>>   PetscInt *closure = NULL;
>>>>>
>>>>>   DMPlexGetTransitiveClosure(dm, c, PETSC_TRUE, &Ncl, &closure);
>>>>>   for (PetscInt cl = 0; cl < Ncl * 2; cl += 2) {
>>>>>     if (closure[cl] < vStart || closure[cl] >= vEnd) continue;
>>>>>     const PetscInt v = gv[closure[cl] - vStart] < 0 ? -(gv[closure[cl] - vStart] + 1) : gv[closure[cl] - vStart];
>>>>>
>>>>>     // Do something with v
>>>>>   }
>>>>>   DMPlexRestoreTransitiveClosure(dm, c, PETSC_TRUE, &Ncl, &closure);
>>>>> }
>>>>> ISRestoreIndices(globalVertexNumbers, &gv);
>>>>> ISDestroy(&globalVertexNumbers);
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Matt
>>>>>
>>>>> Thanks,
>>>>>> Noam.
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead.
>>>>> -- Norbert Wiener
>>>>>
>>>>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!dp4Qq6UBZAvh4t0Grm8-HusDmQH03-Azcq6ZI_rLYL-jWzYxJO2Hydno1EYzrv9yjLb7YEUrBkc9O2GHzuqj$
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>> --
>>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead.
>>>> -- Norbert Wiener
>>>>
>>>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!dp4Qq6UBZAvh4t0Grm8-HusDmQH03-Azcq6ZI_rLYL-jWzYxJO2Hydno1EYzrv9yjLb7YEUrBkc9O2GHzuqj$
>>>>
>>>>
>>>>
>>>
>>> --
>>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead.
>>> -- Norbert Wiener
>>>
>>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!dp4Qq6UBZAvh4t0Grm8-HusDmQH03-Azcq6ZI_rLYL-jWzYxJO2Hydno1EYzrv9yjLb7YEUrBkc9O2GHzuqj$
>>>
>>>
>>>
>>
>> --
>> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead.
>> -- Norbert Wiener
>>
>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!dp4Qq6UBZAvh4t0Grm8-HusDmQH03-Azcq6ZI_rLYL-jWzYxJO2Hydno1EYzrv9yjLb7YEUrBkc9O2GHzuqj$
>>
>>
>>
>
> --
> What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead.
> -- Norbert Wiener
>
> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!dp4Qq6UBZAvh4t0Grm8-HusDmQH03-Azcq6ZI_rLYL-jWzYxJO2Hydno1EYzrv9yjLb7YEUrBkc9O2GHzuqj$
>
>
>
-- 
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener

https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!dp4Qq6UBZAvh4t0Grm8-HusDmQH03-Azcq6ZI_rLYL-jWzYxJO2Hydno1EYzrv9yjLb7YEUrBkc9O2GHzuqj$
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
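A self-contained sketch of the same idea, collecting the per-cell global vertex numbers into a flat connectivity array. This is only an illustration of the loop above, not the exact code that DMView() runs; it assumes an interpolated DMPlex called dm whose cells all have the same number of corners, and that the vertex numbering IS is indexed relative to vStart:

    PetscInt        cStart, cEnd, vStart, vEnd, Ncl, nCorners = 0, off = 0, *cells = NULL;
    IS              globalVertexNumbers;
    const PetscInt *gv;

    PetscCall(DMPlexGetHeightStratum(dm, 0, &cStart, &cEnd));
    PetscCall(DMPlexGetDepthStratum(dm, 0, &vStart, &vEnd));
    PetscCall(DMPlexGetVertexNumbering(dm, &globalVertexNumbers));
    PetscCall(ISGetIndices(globalVertexNumbers, &gv));
    for (PetscInt c = cStart; c < cEnd; ++c) {
      PetscInt *closure = NULL;

      PetscCall(DMPlexGetTransitiveClosure(dm, c, PETSC_TRUE, &Ncl, &closure));
      if (!cells) { /* size the array once, from the first cell */
        for (PetscInt cl = 0; cl < Ncl * 2; cl += 2)
          if (closure[cl] >= vStart && closure[cl] < vEnd) ++nCorners;
        PetscCall(PetscMalloc1((cEnd - cStart) * nCorners, &cells));
      }
      for (PetscInt cl = 0; cl < Ncl * 2; cl += 2) {
        const PetscInt p = closure[cl];

        if (p < vStart || p >= vEnd) continue;
        /* negative entries mark unowned points, encoded as -(global + 1) */
        cells[off++] = gv[p - vStart] < 0 ? -(gv[p - vStart] + 1) : gv[p - vStart];
      }
      PetscCall(DMPlexRestoreTransitiveClosure(dm, c, PETSC_TRUE, &Ncl, &closure));
    }
    PetscCall(ISRestoreIndices(globalVertexNumbers, &gv));
    /* cells now holds (cEnd - cStart) rows of nCorners global vertex numbers; free it with PetscFree(cells) */

Here the numbering IS is treated as owned by the DM and is therefore not destroyed; if you instead create one with DMPlexCreateVertexNumbering(), destroy it when you are done.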
From rlmackie862 at gmail.com Mon Jun 23 09:29:27 2025
From: rlmackie862 at gmail.com (Randall Mackie)
Date: Mon, 23 Jun 2025 07:29:27 -0700
Subject: [petsc-users] Problem with composite DM index sets
In-Reply-To: <3F363791-D3CC-47FD-8F19-0850B39AD5EE@petsc.dev>
References: <45D7BA59-F137-4033-88C1-F1707FB4CA95@gmail.com> <49ED9181-FE44-4E24-93E4-4375FBFDED2C@petsc.dev> <219C534B-35E3-4D72-9B4D-7B6702A4BFBF@gmail.com> <3F363791-D3CC-47FD-8F19-0850B39AD5EE@petsc.dev>
Message-ID: 

Hi Barry,

That was a good question, and I didn't know the answer, but I did a little digging, and in fact, in the 2018 Fortran standard, an extended form of implicit none was introduced that also disables implicit procedures:

implicit none (type, external)

Indeed, when adding that to the little code, the compiler correctly flagged the issue.

Thank you for the suggestion!

Randy

> On Jun 22, 2025, at 6:27 PM, Barry Smith wrote:
> 
> 
> implicit none prevents using undeclared variables. Is there a way to avoid calling any functions/subroutines that don't have an interface declared in a module? That would have found this problem.
> 
> Barry
> 
> 
>> On Jun 22, 2025, at 8:06 PM, Randall Mackie wrote:
>> 
>> Thanks Barry!
>> 
>> 
>> 
>>> On Jun 22, 2025, at 4:50 PM, Barry Smith wrote:
>>> 
>>> 
>>> I'm sorry for not getting back to you sooner. I have attached a working version of your code. Since you were missing
>>> 
>>> use petscdmcomposite
>>> 
>>> the compiler could not generate the correct call to DMCompositeGetGlobalISs().
>>> 
>>> Barry
>>> 
>>> 
>>> 
>>> 
>>>> On Jun 17, 2025, at 6:39 PM, Randall Mackie wrote:
>>>> 
>>>> Dear Petsc users -
>>>> 
>>>> I am trying to upgrade my code to petsc-3.23 (from 3.19), and I seem to have run into a problem with DMCompositeGetGlobalISs.
>>>> 
>>>> The example program listed on the man page for DMCompositeGetGlobalISs, https://urldefense.us/v3/__https://petsc.org/release/src/snes/tutorials/ex73f90t.F90.html__;!!G_uCfscf7eWS!cox-AqD0ZFTWb2Dte6AKZBGOrpXh9j2X73Xnry3XF8giEpfLk-toSdK6pFKr0OeA68ZdmvQ7SS5dSU6zZY6lr2TgZQ$ , seems to indicate that a call to DMCompositeGetGlobalISs does not need to allocate the IS pointer and you just pass it directly to DMCompositeGetGlobalISs.
>>>> 
>>>> If I compile and run the simple attached test program (say on 2 processes), I get the following error:
>>>> 
>>>> [0]PETSC ERROR: ------------------------------------------------------------------------
>>>> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
>>>> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
>>>> [0]PETSC ERROR: or see https://urldefense.us/v3/__https://petsc.org/release/faq/*valgrind__;Iw!!G_uCfscf7eWS!cox-AqD0ZFTWb2Dte6AKZBGOrpXh9j2X73Xnry3XF8giEpfLk-toSdK6pFKr0OeA68ZdmvQ7SS5dSU6zZY4s9NJ4yg$ and https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!cox-AqD0ZFTWb2Dte6AKZBGOrpXh9j2X73Xnry3XF8giEpfLk-toSdK6pFKr0OeA68ZdmvQ7SS5dSU6zZY7G_dOMCA$
>>>> [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------
>>>> [0]PETSC ERROR: The line numbers in the error traceback may not be exact.
>>>> [0]PETSC ERROR: #1 F90Array1dCreate() at /home/rmackie/PETSc/petsc-3.23.3/src/sys/ftn-custom/f90_cwrap.c:123
>>>> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0
>>>> 
>>>> 
>>>> If I uncomment the line to allocate the pointer, I get a very long traceback with lots of error messages.
>>>> 
>>>> What is the correct way to use DMCompositeGetGlobalISs in Fortran?
>>>> With or without the pointer allocation, and what is the right way to do this without the errors it seems to generate?
>>>> 
>>>> Thanks,
>>>> 
>>>> Randy Mackie
>>>> 
>>>> 
>>> 
>> 
> 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From rlmackie862 at gmail.com Mon Jun 23 12:37:03 2025
From: rlmackie862 at gmail.com (Randall Mackie)
Date: Mon, 23 Jun 2025 10:37:03 -0700
Subject: [petsc-users] missing routines in the Fortran interfaces
Message-ID: <23946890-0946-4824-AC82-4C75D2BC9599@gmail.com>

After discovery of the implicit none (type, external) statement for disabling implicit procedures (thanks to Barry for the suggestion), I used that in my code to good effect.

However, it also showed that there were some routines I call that did not have explicit Fortran interfaces, and maybe they exist but I just didn't include the right modules.

Here are the ones I'm missing:

MatShellSetOperation
PetscOptionGetString
PetscViewerASCIIPrintF
VecSetValue
KSPMonitorSet

Do the Fortran interfaces exist for these? If not, can they be added?


Thanks, Randy

From rlmackie862 at gmail.com Mon Jun 23 13:46:52 2025
From: rlmackie862 at gmail.com (Randall Mackie)
Date: Mon, 23 Jun 2025 11:46:52 -0700
Subject: [petsc-users] Fortran, KSPMonitorSet, and KSPMonitorTrueResidual
Message-ID: <5BFFB970-9483-4AF5-83D3-906EF3BC94C8@gmail.com>

In previous versions of PETSc we used to be able to call KSPMonitorTrueResidual from within our custom KSPMonitor, using an approach that is now commented out in the example found at https://urldefense.us/v3/__https://petsc.org/release/src/ksp/ksp/tutorials/ex2f.F90.html__;!!G_uCfscf7eWS!ft8A0uxwHvTcBa5FwETVQYcImMgoOOueBeZT-7zP9jlaLFCDZ6ApRkBFUgoUzMRWCLbyU1FcxTgFoYMm8twfJqLvXA$ :

214:  ! Cannot also use the default KSP monitor routine showing how it may be used from Fortran
215:  ! since the Fortran compiler thinks the calling arguments are different in the two cases
216:  !
217:  ! PetscCallA(PetscViewerAndFormatCreate(PETSC_VIEWER_STDOUT_WORLD,PETSC_VIEWER_DEFAULT,vf,ierr))
218:  ! PetscCallA(KSPMonitorSet(ksp,KSPMonitorResidual,vf,PetscViewerAndFormatDestroy,ierr))

Instead, that example uses:

210:  if (flg) then
211:    vzero = 0
212:    PetscCallA(KSPMonitorSet(ksp,MyKSPMonitor,vzero,PETSC_NULL_FUNCTION,ierr))
213:  !

Regardless of which of these approaches I try, I cannot use KSPMonitorTrueResidual in the MyKSPMonitor routine. I get the following error:

[0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
[0]PETSC ERROR: Null argument, when expecting valid pointer
[0]PETSC ERROR: Null Pointer: Parameter # 4
[0]PETSC ERROR: See https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!ft8A0uxwHvTcBa5FwETVQYcImMgoOOueBeZT-7zP9jlaLFCDZ6ApRkBFUgoUzMRWCLbyU1FcxTgFoYMm8tzxt44McA$ for trouble shooting.
[0]PETSC ERROR: PETSc Release Version 3.23.3, May 30, 2025
[0]PETSC ERROR: ./test with 2 MPI process(es) and PETSC_ARCH linux-gfortran-complex-debug on rmackie-VirtualBox-2024 by rmackie Mon Jun 23 11:34:04 2025
[0]PETSC ERROR: Configure options: --with-clean=1 --with-scalar-type=complex --with-debugging=1 --with-fortran=1 --download-mpich=1
[0]PETSC ERROR: #1 KSPMonitorTrueResidual() at /home/rmackie/PETSc/petsc-3.23.3/src/ksp/ksp/interface/iterativ.c:400
[0]PETSC ERROR: #2 test.F90:303

I attach a slightly modified version of the example that demonstrates this behavior.

Thanks for the help,

Randy

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test.F90
Type: application/octet-stream
Size: 13409 bytes
Desc: not available
URL: 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From bsmith at petsc.dev Mon Jun 23 13:56:52 2025
From: bsmith at petsc.dev (Barry Smith)
Date: Mon, 23 Jun 2025 14:56:52 -0400
Subject: [petsc-users] missing routines in the Fortran interfaces
In-Reply-To: <23946890-0946-4824-AC82-4C75D2BC9599@gmail.com>
References: <23946890-0946-4824-AC82-4C75D2BC9599@gmail.com>
Message-ID: <6B2180D1-2A96-4E52-9924-240F517AA932@petsc.dev>

VecSetValue should now have an interface (generated automatically in the main branch), I'll check the others.

BTW: Since you are working on this you should switch to the main branch (until the next release); we are busy with a Summer of Code project improving the Fortran bindings and have lots of new stuff in main that is not in release.

Barry

> On Jun 23, 2025, at 1:37 PM, Randall Mackie wrote:
> 
> After discovery of the implicit none (type, external) statement for disabling implicit procedures (thanks to Barry for the suggestion), I used that in my code to good effect.
> 
> However, it also showed that there were some routines I call that did not have explicit Fortran interfaces, and maybe they exist but I just didn't include the right modules.
> 
> Here are the ones I'm missing:
> 
> MatShellSetOperation
> PetscOptionGetString
> PetscViewerASCIIPrintF
> VecSetValue
> KSPMonitorSet
> 
> Do the Fortran interfaces exist for these? If not, can they be added?
> 
> 
> Thanks, Randy
> 

From Olivier.JAMOND at cea.fr Fri Jun 27 05:26:28 2025
From: Olivier.JAMOND at cea.fr (JAMOND Olivier)
Date: Fri, 27 Jun 2025 10:26:28 +0000
Subject: [petsc-users] Efficient handling of missing diagonal entities
Message-ID: 

Hello,

I am working on a PDE solver which uses petsc to solve its sparse distributed linear systems. I am mainly dealing with MPIAIJ matrices.

In some situations, it may happen that the matrices considered do not have non-zero terms on the diagonal. For instance, I work on a case which has a Stokes-like saddle-point structure (in an MPIAIJ, not a MATNEST):

[A Bt][U] = [F]
[B 0 ][L]   [0]

I do not insert null terms in the zero block.

In some cases, I use the function `MatZeroRowsColumns` to handle "Dirichlet" boundary conditions. In this particular case, I apply Dirichlet BCs only on dofs of "U". But I get an error `Matrix is missing diagonal entry in row X` from the function `MatZeroRowsColumns`, where X is a row related to "L".

My first question is: is it normal that I get an error for a missing diagonal entry in the function `MatZeroRowsColumns` for a dof that is not involved in the list of dofs that I pass to `MatZeroRowsColumns`?

I then tried to make my code detect that there are some missing diagonal entries and add an explicit zero to them. My code which adds the missing diagonal entries looks like what follows. This is certainly not the best way to do that, as in my test case about ~80% of the total computation time is spent in this piece of code (more precisely in `MatSetValue(D, k, k, 0., ADD_VALUES)`).
So my second question is: what would be the most efficient way to detect the missing diagonal entries, and add explicit zeros on the diagonal at these places?

Many thanks,
Olivier

...
MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);

Mat D;
MatGetDiagonalBlock(A, &D);

PetscBool missing;
MatMissingDiagonal(D, &missing, NULL);

if (missing) {

  IS missingDiagEntryRows;
  MatFindZeroDiagonals(D, &missingDiagEntryRows);

  PetscInt size;
  ISGetLocalSize(missingDiagEntryRows, &size);
  const PetscInt *ptr;
  ISGetIndices(missingDiagEntryRows, &ptr);

  for (PetscInt i = 0; i < size; ++i) {
    PetscInt k = ptr[i];
    MatSetValue(D, k, k, 0., ADD_VALUES);
  }
  MatAssemblyBegin(D, MAT_FINAL_ASSEMBLY);
  MatAssemblyEnd(D, MAT_FINAL_ASSEMBLY);

  ISRestoreIndices(missingDiagEntryRows, &ptr);
}

_________________________________________
Olivier Jamond
Research Engineer
French Atomic Energy and Alternative Energies Commission
DES/ISAS/DM2S/SEMT/DYN
91191 Gif sur Yvette, Cedex, France
Email: olivier.jamond at cea.fr Phone: +336.78.18.18.25
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From bsmith at petsc.dev Fri Jun 27 06:32:38 2025
From: bsmith at petsc.dev (Barry Smith)
Date: Fri, 27 Jun 2025 07:32:38 -0400
Subject: [petsc-users] Efficient handling of missing diagonal entities
In-Reply-To: 
References: 
Message-ID: <44FE4BB3-7FD6-4024-858B-F8648FE96688@petsc.dev>

Handling empty diagonal entries on matrices is often problematic, just as you describe.

I suggest placing explicit zeros on the diagonal first before providing the other entries, which might be the cleanest and most efficient approach. So have each MPI rank loop over its local rows and call MatSetValue() for each diagonal entry and then continue with your other MatSetValues(). Do not call MatAssemblyBegin/End() after you have provided the zeros on the diagonal; just chug straight into setting the other values.

Barry

As you observed, trying to add the zero entries in the matrix after it is assembled is terribly inefficient and not the way to go.

I've considered adding a matrix option to force zero entries on the diagonal, but I never completed my consideration. For example, MatSetOption(A, MAT_NONEMPTY_DIAGONAL, PETSC_TRUE); and when this option is set, MatAssemblyBegin fills up any empty diagonal entries automatically.
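A minimal sketch of that suggestion, assuming an already preallocated MPIAIJ matrix A whose preallocation includes the diagonal entries; it uses ADD_VALUES for the zeros on the assumption that the rest of the assembly also adds values (use whichever InsertMode the rest of your assembly uses, so that ADD_VALUES and INSERT_VALUES are not mixed within one assembly):

    PetscInt rstart, rend;

    PetscCall(MatGetOwnershipRange(A, &rstart, &rend));
    /* put an explicit (zero) entry on every locally owned diagonal position first */
    for (PetscInt i = rstart; i < rend; ++i) PetscCall(MatSetValue(A, i, i, 0.0, ADD_VALUES));
    /* ... then continue with the usual MatSetValues() calls for the actual entries and
       call MatAssemblyBegin()/MatAssemblyEnd() once, after everything has been set */

Adding 0.0 this way creates the nonzero location without changing whatever is later accumulated into it.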
> On Jun 27, 2025, at 6:26 AM, JAMOND Olivier wrote:
> 
> Hello,
> 
> I am working on a PDE solver which uses petsc to solve its sparse distributed linear systems. I am mainly dealing with MPIAIJ matrices.
> 
> In some situations, it may happen that the matrices considered do not have non-zero terms on the diagonal. For instance, I work on a case which has a Stokes-like saddle-point structure (in an MPIAIJ, not a MATNEST):
> 
> [A Bt][U] = [F]
> [B 0 ][L]   [0]
> 
> I do not insert null terms in the zero block.
> 
> In some cases, I use the function `MatZeroRowsColumns` to handle "Dirichlet" boundary conditions. In this particular case, I apply Dirichlet BCs only on dofs of "U". But I get an error `Matrix is missing diagonal entry in row X` from the function `MatZeroRowsColumns`, where X is a row related to "L".
> 
> My first question is: is it normal that I get an error for a missing diagonal entry in the function `MatZeroRowsColumns` for a dof that is not involved in the list of dofs that I pass to `MatZeroRowsColumns`?
> 
> I then tried to make my code detect that there are some missing diagonal entries and add an explicit zero to them. My code which adds the missing diagonal entries looks like what follows. This is certainly not the best way to do that, as in my test case about ~80% of the total computation time is spent in this piece of code (more precisely in `MatSetValue(D, k, k, 0., ADD_VALUES)`).
> So my second question is: what would be the most efficient way to detect the missing diagonal entries, and add explicit zeros on the diagonal at these places?
> 
> Many thanks,
> Olivier
> 
> ...
> MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
> MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);
> 
> Mat D;
> MatGetDiagonalBlock(A, &D);
> 
> PetscBool missing;
> MatMissingDiagonal(D, &missing, NULL);
> 
> if (missing) {
> 
> IS missingDiagEntryRows;
> MatFindZeroDiagonals(D, &missingDiagEntryRows);
> 
> PetscInt size;
> ISGetLocalSize(missingDiagEntryRows, &size);
> const PetscInt *ptr;
> ISGetIndices(missingDiagEntryRows, &ptr);
> 
> for (PetscInt i = 0; i < size; ++i) {
> PetscInt k = ptr[i];
> MatSetValue(D, k, k, 0., ADD_VALUES);
> }
> MatAssemblyBegin(D, MAT_FINAL_ASSEMBLY);
> MatAssemblyEnd(D, MAT_FINAL_ASSEMBLY);
> 
> ISRestoreIndices(missingDiagEntryRows, &ptr);
> }
> 
> 
> _________________________________________
> Olivier Jamond
> Research Engineer
> French Atomic Energy and Alternative Energies Commission
> DES/ISAS/DM2S/SEMT/DYN
> 91191 Gif sur Yvette, Cedex, France
> Email: olivier.jamond @cea.fr Phone: +336.78.18.18.25
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From pierre at joliv.et Fri Jun 27 06:49:35 2025
From: pierre at joliv.et (Pierre Jolivet)
Date: Fri, 27 Jun 2025 13:49:35 +0200
Subject: [petsc-users] Efficient handling of missing diagonal entities
In-Reply-To: <44FE4BB3-7FD6-4024-858B-F8648FE96688@petsc.dev>
References: <44FE4BB3-7FD6-4024-858B-F8648FE96688@petsc.dev>
Message-ID: <1B1D2240-D2B3-41D0-9EE4-73EED251B530@joliv.et>

An HTML attachment was scrubbed...
URL: 

From bsmith at petsc.dev Fri Jun 27 08:20:27 2025
From: bsmith at petsc.dev (Barry Smith)
Date: Fri, 27 Jun 2025 09:20:27 -0400
Subject: [petsc-users] Efficient handling of missing diagonal entities
In-Reply-To: <1B1D2240-D2B3-41D0-9EE4-73EED251B530@joliv.et>
References: <44FE4BB3-7FD6-4024-858B-F8648FE96688@petsc.dev> <1B1D2240-D2B3-41D0-9EE4-73EED251B530@joliv.et>
Message-ID: <3FE25D55-AA1E-4CBF-B122-A98072D354C4@petsc.dev>

Because I completely forgot that this option existed, and the LLM didn't save me from embarrassing myself.

I see that this option sets mat->force_diagonals, but this variable is never used in the mat assembly routines, meaning it will not help in this situation. Presumably, MatAssemblyXXX_YYY() could/should be fixed to respect this flag? Then it would help Olivier.

Barry

> On Jun 27, 2025, at 7:49 AM, Pierre Jolivet wrote:
> 
> 
>> On 27 Jun 2025, at 1:33 PM, Barry Smith wrote:
>> 
>> 
>> Handling empty diagonal entries on matrices is often problematic, just as you describe.
>> 
>> I suggest placing explicit zeros on the diagonal first before providing the other entries, which might be the cleanest and most efficient approach. So have each MPI rank loop over its local rows and call MatSetValue() for each diagonal entry and then continue with your other MatSetValues(). Do not call MatAssemblyBegin/End() after you have provided the zeros on the diagonal; just chug straight into setting the other values.
>> 
>> Barry
>> 
>> As you observed, trying to add the zero entries in the matrix after it is assembled is terribly inefficient and not the way to go.
>> 
>> I've considered adding a matrix option to force zero entries on the diagonal, but I never completed my consideration.
For example, MatSetOption(A, MAT_NONEMPTY_DIAGONAL,PETSC_TRUE); > > Why would you need another option when there is already MAT_FORCE_DIAGONAL_ENTRIES? > > Thanks, > Pierre > >> and when this option is set, MatAssemblyBegin fills up any empty diagonal entries automatically. >> >> >> >>> On Jun 27, 2025, at 6:26?AM, JAMOND Olivier wrote: >>> >>> Hello, >>> >>> I am working on a PDE solver which uses petsc to solve its sparse distributed linear systems. I am mainly dealing with MPIAIJ matrices. >>> >>> In some situations, it may happen that the matrices considered does not have non-zero term on the diagonal. For instance I work on a case which have a stokes like saddle-point structure (in a MPIAIJ, not a MATNEST): >>> >>> [A Bt][U]=[F] >>> [B 0 ][L] [0] >>> >>> I do not insert null terms in the zero block. >>> >>> In some cases, I use the function `MatZeroRowsColumns` to handle "Dirichlet" boundary conditions. In this particular case, I apply Dirichlet BCs only on dofs of "U". But I get an error `Matrix is missing diagonal entry in row X` from the function `MatZeroRowsColumns`, where X is a row related to "L". >>> >>> My first question is: is it normal that I get an error for a missing diagonal in the function `MatZeroRowsColumns`entry for a dof that is not involved in the list of dofs that I pass to `MatZeroRowsColumns`? >>> >>> I then tried to make my code to detect that there are some missing diagonal entries, and add an explicit zero to them. My code which adds the missing diagonal entries looks like what follows. This is certainly not the best way to do that, as in my test case about ~80% of the total computation time is spent in this piece of code (more precisely in `MatSetValue(D, k, k, 0., ADD_VALUES)`). >>> So my second question is: what would be the most efficient way to detect the missing diagonal entries, and ad explicit zeros on the diagonal at these places? >>> >>> Many thanks, >>> Olivier >>> >>> ... >>> MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY); >>> MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY); >>> >>> Mat D; >>> MatGetDiagonalBlock(A, &D); >>> >>> PetscBool missing; >>> MatMissingDiagonal(D, &missing, NULL); >>> >>> if (missing) { >>> >>> IS missingDiagEntryRows; >>> MatFindZeroDiagonals(D, &missingDiagEntryRows) >>> >>> PetscInt size; >>> ISGetLocalSize(missingDiagEntryRows, &size); >>> const PetscInt *ptr; >>> ISGetIndices(missingDiagEntryRows, &ptr); >>> >>> for (Index i = 0; i < size; ++i) { >>> PetscInt k = ptr[i]; >>> MatSetValue(D, k, k, 0., ADD_VALUES); >>> } >>> MatAssemblyBegin(D, MAT_FINAL_ASSEMBLY); >>> MatAssemblyEnd(D, MAT_FINAL_ASSEMBLY); >>> >>> ISRestoreIndices(missingDiagEntryRows, &ptr); >>> } >>> >>> >>> _________________________________________ >>> Olivier Jamond >>> Research Engineer >>> French Atomic Energy and Alternative Energies Commission >>> DES/ISAS/DM2S/SEMT/DYN >>> 91191 Gif sur Yvette, Cedex, France >>> Email: olivier.jamond @cea.fr Phone: +336.78.18.18.25 >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre at joliv.et Fri Jun 27 08:50:29 2025 From: pierre at joliv.et (Pierre Jolivet) Date: Fri, 27 Jun 2025 15:50:29 +0200 Subject: [petsc-users] Efficient handling of missing diagonal entities In-Reply-To: <3FE25D55-AA1E-4CBF-B122-A98072D354C4@petsc.dev> References: <3FE25D55-AA1E-4CBF-B122-A98072D354C4@petsc.dev> Message-ID: <3D2FB177-9C1F-4EDE-97B4-C6F6B92C0D52@joliv.et> An HTML attachment was scrubbed... 
URL: 

From bsmith at petsc.dev Fri Jun 27 16:03:16 2025
From: bsmith at petsc.dev (Barry Smith)
Date: Fri, 27 Jun 2025 17:03:16 -0400
Subject: [petsc-users] missing routines in the Fortran interfaces
In-Reply-To: <23946890-0946-4824-AC82-4C75D2BC9599@gmail.com>
References: <23946890-0946-4824-AC82-4C75D2BC9599@gmail.com>
Message-ID: <8137397A-81CB-4D0B-9767-53332606AB80@petsc.dev>

Randy,

Thanks for trying this. I have started a merge request https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/8501__;!!G_uCfscf7eWS!bceR7fahh0spEc8XRbLMfFNdOqIhH-v-WLSyDf6tadn0DTdqAVEV7w00DMInOOXFlqEF6dE7-UGW-sfYmO3DUMU$ with this requirement on all Fortran source code and found many missing interfaces. I have fixed some, but there are still some outstanding that need to be resolved in the MR, including some you pointed out below.

Barry

> On Jun 23, 2025, at 1:37 PM, Randall Mackie wrote:
> 
> After discovery of the implicit none (type, external) statement for disabling implicit procedures (thanks to Barry for the suggestion), I used that in my code to good effect.
> 
> However, it also showed that there were some routines I call that did not have explicit Fortran interfaces, and maybe they exist but I just didn't include the right modules.
> 
> Here are the ones I'm missing:
> 
> MatShellSetOperation
> PetscOptionGetString
> PetscViewerASCIIPrintF
> VecSetValue
> KSPMonitorSet
> 
> Do the Fortran interfaces exist for these? If not, can they be added?
> 
> 
> Thanks, Randy
> 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: