From myoung.space.science at gmail.com Mon May 1 20:57:39 2023 From: myoung.space.science at gmail.com (Matthew Young) Date: Mon, 1 May 2023 21:57:39 -0400 Subject: [petsc-users] DMSWARM with DMDA and KSP In-Reply-To: References: Message-ID: Thanks for the suggestion to keep DMs separate, and for pointing me toward that example. I now have a DM for the particle quantities (i.e., density and flux) and another for the potential. I'm hoping to use KSPSetComputeOperators with PCGAMG, so I packed the density DM into the application context and set the potential DM on the KSP, but I'm not sure how to communicate changes in the KSP DM (e.g., coarsening) to the density DM inside my operator function. --Matt ========================== Matthew Young, PhD (he/him) Research Scientist II Space Science Center University of New Hampshire Matthew.Young at unh.edu ========================== On Sun, Apr 30, 2023 at 1:52?PM Matthew Knepley wrote: > On Sun, Apr 30, 2023 at 1:12?PM Matthew Young < > myoung.space.science at gmail.com> wrote: > >> Hi all, >> >> I am developing a particle-in-cell code that models ions as particles and >> electrons as an inertialess fluid. I use a PIC DMSWARM for the ions, which >> I gather into density and flux before solving a linear system for the >> electrostatic potential (phi). I currently have one DMDA with 5 degrees of >> freedom -- one each for density, 3 flux components, and phi. >> >> When setting up the linear system to solve for phi, I've been following >> examples like KSP ex34.c and ex42.c when writing the KSP operator and RHS >> functions but I'm not sure I have the right approach, since 4 of the DOFs >> are known and 1 is unknown. >> >> I saw this thread >> >> that recommended using DMDAGetReducedDMDA, which I gather has been >> deprecated in favor of DMDACreateCompatibleDMDA. Is that a good approach >> for managing a regular grid with known and unknown quantities on each node? >> Could a composite DM be useful? Has anyone else worked on a problem like >> this? >> > > I recommend making a different DM for each kind of solve you want. > DMDACreateCompatibleDMDA() should be the implementation of DMClone(), but > we have yet to harmonize all things for all DMs. I would create one DM for > your Vlasov components and one for the Poisson. > We follow this strategy in our Vlasov-Poisson test for Landau damping: > https://gitlab.com/petsc/petsc/-/blob/main/src/dm/impls/swarm/tests/ex9.c > > Thanks, > > Matt > > >> --Matt >> ========================== >> Matthew Young, PhD (he/him) >> Research Scientist II >> Space Science Center >> University of New Hampshire >> Matthew.Young at unh.edu >> ========================== >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Mon May 1 21:51:12 2023 From: dave.mayhem23 at gmail.com (Dave May) Date: Mon, 1 May 2023 19:51:12 -0700 Subject: [petsc-users] DMSWARM with DMDA and KSP In-Reply-To: References: Message-ID: On Mon 1. May 2023 at 18:57, Matthew Young wrote: > Thanks for the suggestion to keep DMs separate, and for pointing me toward > that example. I now have a DM for the particle quantities (i.e., density > and flux) and another for the potential. 
I'm hoping to use > KSPSetComputeOperators with PCGAMG, so I packed the density DM into the > application context and set the potential DM on the KSP, but I'm not sure > how to communicate changes in the KSP DM (e.g., coarsening) to the density > DM inside my operator function. > I don?t think you need to. GAMG only requires the fine grid operator - this will be the matrix assembled from KSPSetComputeOperators. Hence density DM and potential DM fields only need to be managed by you on the finest level. However, if you wanted to use PCMG with rediscretized operators on every level, then you would need the density DM field defined on each level of your geometric multigrid hierarchy. This could be done (possibly less than ideally) by calling DMCreateInterpolation() and then using the Mat to interpolate the density from the finest level to next coarsest level (and so on). Thanks, Dave > > --Matt > ========================== > Matthew Young, PhD (he/him) > Research Scientist II > Space Science Center > University of New Hampshire > Matthew.Young at unh.edu > ========================== > > > On Sun, Apr 30, 2023 at 1:52?PM Matthew Knepley wrote: > >> On Sun, Apr 30, 2023 at 1:12?PM Matthew Young < >> myoung.space.science at gmail.com> wrote: >> >>> Hi all, >>> >>> I am developing a particle-in-cell code that models ions as particles >>> and electrons as an inertialess fluid. I use a PIC DMSWARM for the ions, >>> which I gather into density and flux before solving a linear system for the >>> electrostatic potential (phi). I currently have one DMDA with 5 degrees of >>> freedom -- one each for density, 3 flux components, and phi. >>> >>> When setting up the linear system to solve for phi, I've been following >>> examples like KSP ex34.c and ex42.c when writing the KSP operator and RHS >>> functions but I'm not sure I have the right approach, since 4 of the DOFs >>> are known and 1 is unknown. >>> >>> I saw this thread >>> >>> that recommended using DMDAGetReducedDMDA, which I gather has been >>> deprecated in favor of DMDACreateCompatibleDMDA. Is that a good approach >>> for managing a regular grid with known and unknown quantities on each node? >>> Could a composite DM be useful? Has anyone else worked on a problem like >>> this? >>> >> >> I recommend making a different DM for each kind of solve you want. >> DMDACreateCompatibleDMDA() should be the implementation of DMClone(), but >> we have yet to harmonize all things for all DMs. I would create one DM for >> your Vlasov components and one for the Poisson. >> We follow this strategy in our Vlasov-Poisson test for Landau damping: >> https://gitlab.com/petsc/petsc/-/blob/main/src/dm/impls/swarm/tests/ex9.c >> >> Thanks, >> >> Matt >> >> >>> --Matt >>> ========================== >>> Matthew Young, PhD (he/him) >>> Research Scientist II >>> Space Science Center >>> University of New Hampshire >>> Matthew.Young at unh.edu >>> ========================== >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
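For the rediscretized-PCMG variant Dave describes above, a rough and untested sketch of moving the density from a fine DMDA to the next coarser one could look as follows; the function and variable names (RestrictDensity, da_coarse, rho_fine, ...) are hypothetical and not part of this thread:

    /* Sketch only: restrict a fine-level density field to the next coarser
       DMDA using the interpolation between the two levels. */
    static PetscErrorCode RestrictDensity(DM da_coarse, DM da_fine, Vec rho_fine, Vec rho_coarse)
    {
      Mat interp; /* coarse-to-fine interpolation */
      Vec scale;  /* scaling that makes the transpose action preserve constants */

      PetscFunctionBeginUser;
      PetscCall(DMCreateInterpolation(da_coarse, da_fine, &interp, &scale));
      PetscCall(MatRestrict(interp, rho_fine, rho_coarse)); /* applies the transpose, fine -> coarse */
      PetscCall(VecPointwiseMult(rho_coarse, rho_coarse, scale));
      PetscCall(MatDestroy(&interp));
      PetscCall(VecDestroy(&scale));
      PetscFunctionReturn(0);
    }

Repeating this level by level gives the density on every grid of the hierarchy, as Dave suggests.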
URL: From ksl7912 at snu.ac.kr Tue May 2 01:56:16 2023 From: ksl7912 at snu.ac.kr (=?UTF-8?B?wq3qtozsirnrpqwgLyDtlZnsg50gLyDtla3qs7XsmrDso7zqs7XtlZnqs7w=?=) Date: Tue, 2 May 2023 15:56:16 +0900 Subject: [petsc-users] 'mpirun' run not found error Message-ID: Dear developers I'm trying to use the mpi, but I'm encountering error messages like below: //////// Command 'mpirun' not found, but can be installed with: sudo apt install lam-runtime # version 7.1.4-6build2, or sudo apt install mpich # version 3.3.2-2build1 sudo apt install openmpi-bin # version 4.0.3-0ubuntu1 sudo apt install slurm-wlm-torque # version 19.05.5-1 ////////// However, I've already installed the mpich. cd $PETSC_DIR ./configure --download-mpich --with-debugging=0 COPTFLAGS='-O3 -march=native -mtune=native' CXXOPTFLAGS='-O3 -march=native -mtune=native' FOPTFLAGS='-O3 -march=native -mtune=native' --download-mumps --download-scalapack --download-parmetis --download-metis --download-parmetis --download-hpddm --download-slepc Could you recommend some advice related to this? Best, Seung Lee Kwon -- Seung Lee Kwon, Ph.D.Candidate Aerospace Structures and Materials Laboratory Department of Mechanical and Aerospace Engineering Seoul National University Building 300 Rm 503, Gwanak-ro 1, Gwanak-gu, Seoul, South Korea, 08826 E-mail : ksl7912 at snu.ac.kr Office : +82-2-880-7389 C. P : +82-10-4695-1062 -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre.jolivet at lip6.fr Tue May 2 02:01:11 2023 From: pierre.jolivet at lip6.fr (Pierre Jolivet) Date: Tue, 2 May 2023 09:01:11 +0200 Subject: [petsc-users] 'mpirun' run not found error In-Reply-To: References: Message-ID: > On 2 May 2023, at 8:56 AM, ???? / ?? / ??????? wrote: > > Dear developers > > I'm trying to use the mpi, but I'm encountering error messages like below: > > //////// > Command 'mpirun' not found, but can be installed with: > sudo apt install lam-runtime # version 7.1.4-6build2, or > sudo apt install mpich # version 3.3.2-2build1 > sudo apt install openmpi-bin # version 4.0.3-0ubuntu1 > sudo apt install slurm-wlm-torque # version 19.05.5-1 > ////////// > > However, I've already installed the mpich. > cd $PETSC_DIR > ./configure --download-mpich --with-debugging=0 COPTFLAGS='-O3 -march=native -mtune=native' CXXOPTFLAGS='-O3 -march=native -mtune=native' FOPTFLAGS='-O3 -march=native -mtune=native' --download-mumps --download-scalapack --download-parmetis --download-metis --download-parmetis --download-hpddm --download-slepc > > Could you recommend some advice related to this? Most likely you do not want to run mpirun, but ${PETSC_DIR}/${PETSC_ARCH}/bin/mpirun instead. Or add ${PETSC_DIR}/${PETSC_ARCH}/bin to your PATH environment variable. Thanks, Pierre > Best, > Seung Lee Kwon > -- > Seung Lee Kwon, Ph.D.Candidate > Aerospace Structures and Materials Laboratory > Department of Mechanical and Aerospace Engineering > Seoul National University > Building 300 Rm 503, Gwanak-ro 1, Gwanak-gu, Seoul, South Korea, 08826 > E-mail : ksl7912 at snu.ac.kr > Office : +82-2-880-7389 > C. P : +82-10-4695-1062 -------------- next part -------------- An HTML attachment was scrubbed... URL: From karthikeyan.chockalingam at stfc.ac.uk Tue May 2 07:25:06 2023 From: karthikeyan.chockalingam at stfc.ac.uk (Karthikeyan Chockalingam - STFC UKRI) Date: Tue, 2 May 2023 12:25:06 +0000 Subject: [petsc-users] Node numbering in parallel partitioned mesh Message-ID: Hello, This is not exactly a PETSc question. I have a parallel partitioned finite element mesh. 
What are the steps involved in having a contiguous but unique set of node numbering from one partition to the next? There are nodes which are shared between different partitions. Moreover, this partition has to coincide parallel partition of PETSc Vec/Mat, which ensures data locality. If you can post the algorithm or cite a reference, it will prove helpful. Many thanks. Kind regards, Karthik. -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue May 2 07:34:41 2023 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 2 May 2023 08:34:41 -0400 Subject: [petsc-users] Node numbering in parallel partitioned mesh In-Reply-To: References: Message-ID: On Tue, May 2, 2023 at 8:25?AM Karthikeyan Chockalingam - STFC UKRI via petsc-users wrote: > Hello, > > > > This is not exactly a PETSc question. I have a parallel partitioned finite > element mesh. What are the steps involved in having a contiguous but unique > set of node numbering from one partition to the next? There are nodes which > are shared between different partitions. Moreover, this partition has to > coincide parallel partition of PETSc Vec/Mat, which ensures data locality. > > > > If you can post the algorithm or cite a reference, it will prove helpful. > Somehow, you have to know what "nodes" are shared. Once you know this, you can make a rule for numbering, such as "the lowest rank gets the shared nodes". We encapsulate this ownership relation in the PetscSF. Roots are owned, and leaves are not owned. The rule above is not great for load balance, so we have an optimization routine for the simple PetscSF: https://petsc.org/main/manualpages/DMPlex/DMPlexRebalanceSharedPoints/ Thanks, Matt > Many thanks. > > > > Kind regards, > > Karthik. > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian.blauth at itwm.fraunhofer.de Tue May 2 03:52:16 2023 From: sebastian.blauth at itwm.fraunhofer.de (Blauth, Sebastian) Date: Tue, 2 May 2023 08:52:16 +0000 Subject: [petsc-users] Scalable Solver for Incompressible Flow Message-ID: Hello, I am having a problem using / configuring PETSc to obtain a scalable solver for the incompressible Navier Stokes equations. I am discretizing the equations using FEM (with the library fenics) and I am using the stable P2-P1 Taylor-Hood elements. I have read and tried a lot regarding preconditioners for incompressible Navier Stokes and I am aware that this is very much an active research field, but maybe I can get some hints / tips. I am interested in solving large-scale 3D problems, but I cannot even set up a scaleable 2D solver for the problems. All of my approaches at the moment are trying to use a Schur Complement approach, but I cannot get a ?good? preconditioner for the Schur complement matrix. For the velocity block, I am using the AMG provided by hypre (which seems to work fine and is most likely not the problem). To test the solver, I am using a simple 2D channel flow problem with do-nothing conditions at the outlet. I am facing the following difficulties at the moment: - First, I am having trouble with using -pc_fieldsplit_schur_precondition selfp. 
With this setup, the cost for solving the Schur complement part in the fieldsplit preconditioner (approximately) increase when the mesh is refined. I am using the following options for this setup (note that I am using exact solves for the velocity part to debug, but using, e.g., gmres with hypre boomeramg reaches a given tolerance with a number of iterations that is independent of the mesh) -ksp_type fgmres -ksp_rtol 1e-6 -ksp_atol 1e-30 -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_schur_fact_type full -pc_fieldsplit_schur_precondition selfp -fieldsplit_0_ksp_type preonly -fieldsplit_0_pc_type lu -fieldsplit_1_ksp_type gmres -fieldsplit_1_ksp_pc_side right -fieldsplit_1_ksp_max_it 1000 -fieldsplit_1_ksp_rtol 1e-1 -fieldsplit_1_ksp_atol 1e-30 -fieldsplit_1_pc_type lu -fieldsplit_1_ksp_converged_reason -ksp_converged_reason Note, that I use direct solvers for the subproblems to get an ?ideal? convergence. Even if I replace the direct solver with boomeramg, the behavior is the same and the number of iterations does not change much. In particular, I get the following behavior: For a 8x8 mesh, I need, on average, 25 iterations to solve fieldsplit_1 For a 16x16 mesh, I need 40 iterations For a 32x32 mesh, I need 70 iterations For a 64x64 mesh, I need 100 iterations However, the outer fgmres requires, as expected, always the same number of iterations to reach convergence (as expected). I do understand that the selfp preconditioner for the Schur complement is expected to deteriorate as the Reynolds number increases and the problem becomes more convective in nature, but I had hoped that I can at least get a scaleable preconditioner with respect to the mesh size out of it. Are there any tips on how to achieve this? My second problem is concerning the LSC preconditioner. When I am using this, again both with exact solves of the linear problems or when using boomeramg, I do not get a scalable solver with respect to the mesh size. On the contrary, here the number of solves required for solving fieldsplit_1 to a fixed relative tolerance seem to behave linearly w.r.t. the problem size. For this problem, I suspect that the issue lies in the scaling of the LSC preconditioner matrices (in the book of Elman, Sylvester and Wathen, the matrices are scaled with the inverse of the diagonal velocity mass matrix). Is it possible to achieve this with PETSc? I started experimenting with supplying the velocity mass matrix as preconditioner matrix and using ?use_amat?, but I am not sure where / how to do it this way. And finally, more of an observation and question: I noticed that the AMG approximations for the velocity block became worse with increase of the Reynolds number when using the default options. However, when using -pc_hypre_boomeramg_relax_weight_all 0.0 I noticed that boomeramg performed way more robustly w.r.t. the Reynolds number. Are there any other ways to improve the AMG performance in this regard? Thanks a lot in advance and I am looking forward to your reply, Sebastian -- Dr. Sebastian Blauth Fraunhofer-Institut f?r Techno- und Wirtschaftsmathematik ITWM Abteilung Transportvorg?nge Fraunhofer-Platz 1, 67663 Kaiserslautern Telefon: +49 631 31600-4968 sebastian.blauth at itwm.fraunhofer.de https://www.itwm.fraunhofer.de -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: smime.p7s Type: application/pkcs7-signature Size: 7943 bytes Desc: not available URL: From knepley at gmail.com Tue May 2 08:12:25 2023 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 2 May 2023 09:12:25 -0400 Subject: [petsc-users] Scalable Solver for Incompressible Flow In-Reply-To: References: Message-ID: On Tue, May 2, 2023 at 9:07?AM Blauth, Sebastian < sebastian.blauth at itwm.fraunhofer.de> wrote: > Hello, > > > > I am having a problem using / configuring PETSc to obtain a scalable > solver for the incompressible Navier Stokes equations. I am discretizing > the equations using FEM (with the library fenics) and I am using the stable > P2-P1 Taylor-Hood elements. I have read and tried a lot regarding > preconditioners for incompressible Navier Stokes and I am aware that this > is very much an active research field, but maybe I can get some hints / > tips. > > I am interested in solving large-scale 3D problems, but I cannot even set > up a scaleable 2D solver for the problems. All of my approaches at the > moment are trying to use a Schur Complement approach, but I cannot get a > ?good? preconditioner for the Schur complement matrix. For the velocity > block, I am using the AMG provided by hypre (which seems to work fine and > is most likely not the problem). > > > > To test the solver, I am using a simple 2D channel flow problem with > do-nothing conditions at the outlet. > > > > I am facing the following difficulties at the moment: > > > > - First, I am having trouble with using -pc_fieldsplit_schur_precondition > selfp. With this setup, the cost for solving the Schur complement part in > the fieldsplit preconditioner (approximately) increase when the mesh is > refined. I am using the following options for this setup (note that I am > using exact solves for the velocity part to debug, but using, e.g., gmres > with hypre boomeramg reaches a given tolerance with a number of iterations > that is independent of the mesh) > The diagonal of the momentum block is a bad preconditioner for the Schur complement, because S is spectrally equivalent to the mass matrix. You should build the mass matrix and use that as the preconditioning matrix for the Schur part. The FEniCS people can show you how to do that. This will provide mesh-independent convergence (you can see me doing this in SNES ex69). Thanks, Matt > -ksp_type fgmres > > -ksp_rtol 1e-6 > > -ksp_atol 1e-30 > > -pc_type fieldsplit > > -pc_fieldsplit_type schur > > -pc_fieldsplit_schur_fact_type full > > -pc_fieldsplit_schur_precondition selfp > > -fieldsplit_0_ksp_type preonly > > -fieldsplit_0_pc_type lu > > -fieldsplit_1_ksp_type gmres > > -fieldsplit_1_ksp_pc_side right > > -fieldsplit_1_ksp_max_it 1000 > > -fieldsplit_1_ksp_rtol 1e-1 > > -fieldsplit_1_ksp_atol 1e-30 > > -fieldsplit_1_pc_type lu > > -fieldsplit_1_ksp_converged_reason > > -ksp_converged_reason > > > > Note, that I use direct solvers for the subproblems to get an ?ideal? > convergence. Even if I replace the direct solver with boomeramg, the > behavior is the same and the number of iterations does not change much. > > In particular, I get the following behavior: > > For a 8x8 mesh, I need, on average, 25 iterations to solve fieldsplit_1 > > For a 16x16 mesh, I need 40 iterations > > For a 32x32 mesh, I need 70 iterations > > For a 64x64 mesh, I need 100 iterations > > > > However, the outer fgmres requires, as expected, always the same number of > iterations to reach convergence (as expected). 
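For reference, a minimal and untested sketch of the mass-matrix preconditioning suggested earlier in this reply, assuming the pressure mass matrix Qp has already been assembled (e.g. in FEniCS) and ksp is the outer solver:

    PC pc;
    PetscCall(KSPGetPC(ksp, &pc));
    PetscCall(PCSetType(pc, PCFIELDSPLIT));
    PetscCall(PCFieldSplitSetType(pc, PC_COMPOSITE_SCHUR));
    /* use the user-supplied pressure mass matrix Qp, instead of selfp,
       as the preconditioning matrix for the fieldsplit_1 (Schur) solve */
    PetscCall(PCFieldSplitSetSchurPre(pc, PC_FIELDSPLIT_SCHUR_PRE_USER, Qp));

The corresponding command-line switch is -pc_fieldsplit_schur_precondition user.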
> > I do understand that the selfp preconditioner for the Schur complement is > expected to deteriorate as the Reynolds number increases and the problem > becomes more convective in nature, but I had hoped that I can at least get > a scaleable preconditioner with respect to the mesh size out of it. Are > there any tips on how to achieve this? > > > > My second problem is concerning the LSC preconditioner. When I am using > this, again both with exact solves of the linear problems or when using > boomeramg, I do not get a scalable solver with respect to the mesh size. On > the contrary, here the number of solves required for solving fieldsplit_1 > to a fixed relative tolerance seem to behave linearly w.r.t. the problem > size. For this problem, I suspect that the issue lies in the scaling of the > LSC preconditioner matrices (in the book of Elman, Sylvester and Wathen, > the matrices are scaled with the inverse of the diagonal velocity mass > matrix). Is it possible to achieve this with PETSc? I started experimenting > with supplying the velocity mass matrix as preconditioner matrix and using > ?use_amat?, but I am not sure where / how to do it this way. > > > > And finally, more of an observation and question: I noticed that the AMG > approximations for the velocity block became worse with increase of the > Reynolds number when using the default options. However, when using > -pc_hypre_boomeramg_relax_weight_all 0.0 I noticed that boomeramg performed > way more robustly w.r.t. the Reynolds number. Are there any other ways to > improve the AMG performance in this regard? > > > > Thanks a lot in advance and I am looking forward to your reply, > > Sebastian > > > > -- > > Dr. Sebastian Blauth > > Fraunhofer-Institut f?r > > Techno- und Wirtschaftsmathematik ITWM > > Abteilung Transportvorg?nge > > Fraunhofer-Platz 1, 67663 Kaiserslautern > > Telefon: +49 631 31600-4968 > > sebastian.blauth at itwm.fraunhofer.de > > https://www.itwm.fraunhofer.de > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Tue May 2 09:44:50 2023 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 2 May 2023 10:44:50 -0400 Subject: [petsc-users] 'mpirun' run not found error In-Reply-To: References: Message-ID: <602ED6C1-B587-4B0E-94B6-59C69164E011@petsc.dev> For any PETSc install you can run make getmpiexec in the PETSC_DIR directory to see how to use mpiexec for that PETSc install. Barry > On May 2, 2023, at 2:56 AM, ???? / ?? / ??????? wrote: > > Dear developers > > I'm trying to use the mpi, but I'm encountering error messages like below: > > //////// > Command 'mpirun' not found, but can be installed with: > sudo apt install lam-runtime # version 7.1.4-6build2, or > sudo apt install mpich # version 3.3.2-2build1 > sudo apt install openmpi-bin # version 4.0.3-0ubuntu1 > sudo apt install slurm-wlm-torque # version 19.05.5-1 > ////////// > > However, I've already installed the mpich. 
> cd $PETSC_DIR > ./configure --download-mpich --with-debugging=0 COPTFLAGS='-O3 -march=native -mtune=native' CXXOPTFLAGS='-O3 -march=native -mtune=native' FOPTFLAGS='-O3 -march=native -mtune=native' --download-mumps --download-scalapack --download-parmetis --download-metis --download-parmetis --download-hpddm --download-slepc > > Could you recommend some advice related to this? > > Best, > Seung Lee Kwon > -- > Seung Lee Kwon, Ph.D.Candidate > Aerospace Structures and Materials Laboratory > Department of Mechanical and Aerospace Engineering > Seoul National University > Building 300 Rm 503, Gwanak-ro 1, Gwanak-gu, Seoul, South Korea, 08826 > E-mail : ksl7912 at snu.ac.kr > Office : +82-2-880-7389 > C. P : +82-10-4695-1062 -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Tue May 2 09:50:00 2023 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 2 May 2023 10:50:00 -0400 Subject: [petsc-users] Node numbering in parallel partitioned mesh In-Reply-To: References: Message-ID: <3B1A8F0A-62FC-4B10-A5BF-50AE6B0BC2DF@petsc.dev> Assuming you have generated your renumbering, you can use https://petsc.org/release/manualpages/AO/AO/#ao to convert lists in the old (or new) numbering to the new (or old) numbering. Barry > On May 2, 2023, at 8:34 AM, Matthew Knepley wrote: > > On Tue, May 2, 2023 at 8:25?AM Karthikeyan Chockalingam - STFC UKRI via petsc-users > wrote: >> Hello, >> >> >> >> This is not exactly a PETSc question. I have a parallel partitioned finite element mesh. What are the steps involved in having a contiguous but unique set of node numbering from one partition to the next? There are nodes which are shared between different partitions. Moreover, this partition has to coincide parallel partition of PETSc Vec/Mat, which ensures data locality. >> >> >> >> If you can post the algorithm or cite a reference, it will prove helpful. >> > > Somehow, you have to know what "nodes" are shared. Once you know this, you can make a rule for numbering, such > as "the lowest rank gets the shared nodes". We encapsulate this ownership relation in the PetscSF. Roots are owned, > and leaves are not owned. The rule above is not great for load balance, so we have an optimization routine for the > simple PetscSF: https://petsc.org/main/manualpages/DMPlex/DMPlexRebalanceSharedPoints/ > > Thanks, > > Matt > >> Many thanks. >> >> >> >> Kind regards, >> >> Karthik. >> >> >> > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From karthikeyan.chockalingam at stfc.ac.uk Tue May 2 10:03:26 2023 From: karthikeyan.chockalingam at stfc.ac.uk (Karthikeyan Chockalingam - STFC UKRI) Date: Tue, 2 May 2023 15:03:26 +0000 Subject: [petsc-users] Node numbering in parallel partitioned mesh In-Reply-To: References: Message-ID: Thank you Matt. I will look to find out those shared nodes. Sorry, I didn?t get it when you say ?Roots are owned, and leaves are not owned? My question was specifically related to numbering ? how do I start numbering in a partition from where I left off from the previous partition without double counting so that the node numbers are unique? Let's say I have a VECMPI which is distributed among the partitions. 
When I try to retrieve the data using VecGetValues, I often run into problems accessing non-local data (so, for now, I scatter the vector). When some nodes are shared, will I not always have this problem accessing those nodes from the wrong partition unless those nodes are ghosted? Maybe I am not thinking about it correctly. Kind regards, Karthik. From: Matthew Knepley Date: Tuesday, 2 May 2023 at 13:35 To: Chockalingam, Karthikeyan (STFC,DL,HC) Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Node numbering in parallel partitioned mesh On Tue, May 2, 2023 at 8:25?AM Karthikeyan Chockalingam - STFC UKRI via petsc-users > wrote: Hello, This is not exactly a PETSc question. I have a parallel partitioned finite element mesh. What are the steps involved in having a contiguous but unique set of node numbering from one partition to the next? There are nodes which are shared between different partitions. Moreover, this partition has to coincide parallel partition of PETSc Vec/Mat, which ensures data locality. If you can post the algorithm or cite a reference, it will prove helpful. Somehow, you have to know what "nodes" are shared. Once you know this, you can make a rule for numbering, such as "the lowest rank gets the shared nodes". We encapsulate this ownership relation in the PetscSF. Roots are owned, and leaves are not owned. The rule above is not great for load balance, so we have an optimization routine for the simple PetscSF: https://petsc.org/main/manualpages/DMPlex/DMPlexRebalanceSharedPoints/ Thanks, Matt Many thanks. Kind regards, Karthik. -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue May 2 10:25:42 2023 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 2 May 2023 11:25:42 -0400 Subject: [petsc-users] Node numbering in parallel partitioned mesh In-Reply-To: References: Message-ID: On Tue, May 2, 2023 at 11:03?AM Karthikeyan Chockalingam - STFC UKRI < karthikeyan.chockalingam at stfc.ac.uk> wrote: > Thank you Matt. > > > > I will look to find out those shared nodes. Sorry, I didn?t get it when > you say ?Roots are owned, and leaves are not owned? > That is the nomenclature from PetscSF. > > > My question was specifically related to numbering ? how do I start > numbering in a partition from where I left off from the previous partition > without double counting so that the node numbers are unique? > 1) Determine the local sizes Run over the local nodes. If any are not owned, do not count them. 2) Get the local offset nStart Add up the local sizes to get the offset for each process using MPI_Scan() 3) Number locally Run over local nodes and number each owned node, starting with nStart Thanks, Matt > > Let's say I have a VECMPI which is distributed among the partitions. When I > try to retrieve the data using VecGetValues, I often run into problems > accessing non-local data (so, for now, I scatter the vector). When some > nodes are shared, will I not always have this problem accessing those nodes > from the wrong partition unless those nodes are ghosted? Maybe I am not > thinking about it correctly. > > > > Kind regards, > > Karthik. 
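For reference, a bare-bones sketch of steps 1)-3) above; the ownership test (isOwned) and output array (globalNum) are hypothetical placeholders for however the mesh stores that information:

    PetscInt nOwned = 0, scanSum = 0, nStart, next;

    /* 1) local size: count only the nodes this rank owns */
    for (PetscInt i = 0; i < nLocalNodes; ++i)
      if (isOwned[i]) ++nOwned;

    /* 2) offset: prefix sum of the owned counts over the ranks */
    MPI_Scan(&nOwned, &scanSum, 1, MPIU_INT, MPI_SUM, PETSC_COMM_WORLD);
    nStart = scanSum - nOwned; /* exclusive offset for this rank */

    /* 3) number the owned nodes contiguously, starting at nStart */
    next = nStart;
    for (PetscInt i = 0; i < nLocalNodes; ++i)
      if (isOwned[i]) globalNum[i] = next++;

    /* shared, non-owned nodes then receive their numbers from the owning
       rank, e.g. through a ghost exchange or a PetscSF broadcast */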
> > > > > > *From: *Matthew Knepley > *Date: *Tuesday, 2 May 2023 at 13:35 > *To: *Chockalingam, Karthikeyan (STFC,DL,HC) < > karthikeyan.chockalingam at stfc.ac.uk> > *Cc: *petsc-users at mcs.anl.gov > *Subject: *Re: [petsc-users] Node numbering in parallel partitioned mesh > > On Tue, May 2, 2023 at 8:25?AM Karthikeyan Chockalingam - STFC UKRI via > petsc-users wrote: > > Hello, > > > > This is not exactly a PETSc question. I have a parallel partitioned finite > element mesh. What are the steps involved in having a contiguous but unique > set of node numbering from one partition to the next? There are nodes which > are shared between different partitions. Moreover, this partition has to > coincide parallel partition of PETSc Vec/Mat, which ensures data locality. > > > > If you can post the algorithm or cite a reference, it will prove helpful. > > > > Somehow, you have to know what "nodes" are shared. Once you know this, you > can make a rule for numbering, such > > as "the lowest rank gets the shared nodes". We encapsulate this ownership > relation in the PetscSF. Roots are owned, > > and leaves are not owned. The rule above is not great for load balance, so > we have an optimization routine for the > > simple PetscSF: > https://petsc.org/main/manualpages/DMPlex/DMPlexRebalanceSharedPoints/ > > > > Thanks, > > > > Matt > > > > Many thanks. > > > > Kind regards, > > Karthik. > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From myoung.space.science at gmail.com Tue May 2 10:26:51 2023 From: myoung.space.science at gmail.com (Matthew Young) Date: Tue, 2 May 2023 11:26:51 -0400 Subject: [petsc-users] DMSWARM with DMDA and KSP In-Reply-To: References: Message-ID: Yup -- I realized that I had '-pc_type mg' in the script I was using to build and run as I developed. I guess that was causing the KSP to coarsen its DM, which made me think I had to force the density DM to be consistent. Still, refactoring my original grid DM into one for Vlasov components and one for the potential was useful. Thanks for the help! --Matt ========================== Matthew Young, PhD (he/him) Research Scientist II Space Science Center University of New Hampshire Matthew.Young at unh.edu ========================== On Mon, May 1, 2023 at 10:51?PM Dave May wrote: > > > On Mon 1. May 2023 at 18:57, Matthew Young > wrote: > >> Thanks for the suggestion to keep DMs separate, and for pointing me >> toward that example. I now have a DM for the particle quantities (i.e., >> density and flux) and another for the potential. I'm hoping to use >> KSPSetComputeOperators with PCGAMG, so I packed the density DM into the >> application context and set the potential DM on the KSP, but I'm not sure >> how to communicate changes in the KSP DM (e.g., coarsening) to the density >> DM inside my operator function. >> > > I don?t think you need to. > > GAMG only requires the fine grid operator - this will be the matrix > assembled from KSPSetComputeOperators. 
Hence density DM and potential DM > fields only need to be managed by you on the finest level. > > However, if you wanted to use PCMG with rediscretized operators on every > level, then you would need the density DM field defined on each level of > your geometric multigrid hierarchy. This could be done (possibly less than > ideally) by calling DMCreateInterpolation() and then using the Mat to > interpolate the density from the finest level to next coarsest level (and > so on). > > Thanks, > Dave > > >> >> --Matt >> ========================== >> Matthew Young, PhD (he/him) >> Research Scientist II >> Space Science Center >> University of New Hampshire >> Matthew.Young at unh.edu >> ========================== >> >> >> On Sun, Apr 30, 2023 at 1:52?PM Matthew Knepley >> wrote: >> >>> On Sun, Apr 30, 2023 at 1:12?PM Matthew Young < >>> myoung.space.science at gmail.com> wrote: >>> >>>> Hi all, >>>> >>>> I am developing a particle-in-cell code that models ions as particles >>>> and electrons as an inertialess fluid. I use a PIC DMSWARM for the ions, >>>> which I gather into density and flux before solving a linear system for the >>>> electrostatic potential (phi). I currently have one DMDA with 5 degrees of >>>> freedom -- one each for density, 3 flux components, and phi. >>>> >>>> When setting up the linear system to solve for phi, I've been following >>>> examples like KSP ex34.c and ex42.c when writing the KSP operator and RHS >>>> functions but I'm not sure I have the right approach, since 4 of the DOFs >>>> are known and 1 is unknown. >>>> >>>> I saw this thread >>>> >>>> that recommended using DMDAGetReducedDMDA, which I gather has been >>>> deprecated in favor of DMDACreateCompatibleDMDA. Is that a good approach >>>> for managing a regular grid with known and unknown quantities on each node? >>>> Could a composite DM be useful? Has anyone else worked on a problem like >>>> this? >>>> >>> >>> I recommend making a different DM for each kind of solve you want. >>> DMDACreateCompatibleDMDA() should be the implementation of DMClone(), but >>> we have yet to harmonize all things for all DMs. I would create one DM for >>> your Vlasov components and one for the Poisson. >>> We follow this strategy in our Vlasov-Poisson test for Landau damping: >>> https://gitlab.com/petsc/petsc/-/blob/main/src/dm/impls/swarm/tests/ex9.c >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> --Matt >>>> ========================== >>>> Matthew Young, PhD (he/him) >>>> Research Scientist II >>>> Space Science Center >>>> University of New Hampshire >>>> Matthew.Young at unh.edu >>>> ========================== >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian.blauth at itwm.fraunhofer.de Tue May 2 12:26:42 2023 From: sebastian.blauth at itwm.fraunhofer.de (Sebastian Blauth) Date: Tue, 2 May 2023 19:26:42 +0200 Subject: [petsc-users] Scalable Solver for Incompressible Flow In-Reply-To: References: Message-ID: On 02.05.2023 15:12, Matthew Knepley wrote: > On Tue, May 2, 2023 at 9:07?AM Blauth, Sebastian > > wrote: > > Hello,____ > > __ __ > > I am having a problem using / configuring PETSc to obtain a scalable > solver for the incompressible Navier Stokes equations. 
I am > discretizing the equations using FEM (with the library fenics) and I > am using the stable P2-P1 Taylor-Hood elements. I have read and > tried a lot regarding preconditioners for incompressible Navier > Stokes and I am aware that this is very much an active research > field, but maybe I can get some hints / tips. ____ > > I am interested in solving large-scale 3D problems, but I cannot > even set up a scaleable 2D solver for the problems. All of my > approaches at the moment are trying to use a Schur Complement > approach, but I cannot get a ?good? preconditioner for the Schur > complement matrix. For the velocity block, I am using the AMG > provided by hypre (which seems to work fine and is most likely not > the problem).____ > > __ __ > > To test the solver, I am using a simple 2D channel flow problem with > do-nothing conditions at the outlet.____ > > __ __ > > I am facing the following difficulties at the moment:____ > > __ __ > > - First, I am having trouble with using > -pc_fieldsplit_schur_precondition selfp. With this setup, the cost > for solving the Schur complement part in the fieldsplit > preconditioner (approximately) increase when the mesh is refined. I > am using the following options for this setup (note that I am using > exact solves for the velocity part to debug, but using, e.g., gmres > with hypre boomeramg reaches a given tolerance with a number of > iterations that is independent of the mesh) > > > The diagonal of the momentum block is a bad preconditioner for the Schur > complement, because S is spectrally equivalent to the mass matrix. You > should build the mass matrix and use that as the preconditioning matrix > for the Schur part. The FEniCS people can show you how to do that. This > will provide mesh-independent convergence (you can see me doing this in > SNES ex69). > > ? Thanks, > > ? ? ?Matt I agree with your comment for the Stokes equations - for these, I have already tried and used the pressure mass matrix as part of a (additive) block preconditioner and it gave mesh independent results. However, for the Navier Stokes equations, is the Schur complement really spectrally equivalent to the pressure mass matrix? And even if it is, the convergence is only good for small Reynolds numbers, for moderately high ones the convergence really deteriorates. This is why I am trying to make fieldsplit_schur_precondition selfp work better (this is, if I understand it correctly, a SIMPLE type preconditioner). Best regards, Sebastian > > ??? -ksp_type fgmres____ > > ??? -ksp_rtol 1e-6____ > > -ksp_atol 1e-30____ > > ??? -pc_type fieldsplit____ > > ??? -pc_fieldsplit_type schur____ > > ??? -pc_fieldsplit_schur_fact_type full____ > > ??? -pc_fieldsplit_schur_precondition selfp____ > > ??? -fieldsplit_0_ksp_type preonly____ > > ??? -fieldsplit_0_pc_type lu____ > > ??? -fieldsplit_1_ksp_type gmres____ > > ??? -fieldsplit_1_ksp_pc_side right____ > > ??? -fieldsplit_1_ksp_max_it 1000____ > > ??? -fieldsplit_1_ksp_rtol 1e-1____ > > ??? -fieldsplit_1_ksp_atol 1e-30____ > > ??? -fieldsplit_1_pc_type lu____ > > ??? -fieldsplit_1_ksp_converged_reason____ > > ??? -ksp_converged_reason____ > > __ __ > > Note, that I use direct solvers for the subproblems to get an > ?ideal? convergence. Even if I replace the direct solver with > boomeramg, the behavior is the same and the number of iterations > does not change much. 
____ > > In particular, I get the following behavior:____ > > For a 8x8 mesh, I need, on average, 25 iterations to solve > fieldsplit_1____ > > For a 16x16 mesh, I need 40 iterations____ > > For a 32x32 mesh, I need 70 iterations____ > > For a 64x64 mesh, I need 100 iterations____ > > __ __ > > However, the outer fgmres requires, as expected, always the same > number of iterations to reach convergence (as expected).____ > > I do understand that the selfp preconditioner for the Schur > complement is expected to deteriorate as the Reynolds number > increases and the problem becomes more convective in nature, but I > had hoped that I can at least get a scaleable preconditioner with > respect to the mesh size out of it. Are there any tips on how to > achieve this?____ > > __ __ > > My second problem is concerning the LSC preconditioner. When I am > using this, again both with exact solves of the linear problems or > when using boomeramg, I do not get a scalable solver with respect to > the mesh size. On the contrary, here the number of solves required > for solving fieldsplit_1 to a fixed relative tolerance seem to > behave linearly w.r.t. the problem size. For this problem, I suspect > that the issue lies in the scaling of the LSC preconditioner > matrices (in the book of Elman, Sylvester and Wathen, the matrices > are scaled with the inverse of the diagonal velocity mass matrix). > Is it possible to achieve this with PETSc? I started experimenting > with supplying the velocity mass matrix as preconditioner matrix and > using ?use_amat?, but I am not sure where / how to do it this way.____ > > __ __ > > And finally, more of an observation and question: I noticed that the > AMG approximations for the velocity block became worse with increase > of the Reynolds number when using the default options. However, when > using -pc_hypre_boomeramg_relax_weight_all 0.0 I noticed that > boomeramg performed way more robustly w.r.t. the Reynolds number. > Are there any other ways to improve the AMG performance in this > regard?____ > > __ __ > > Thanks a lot in advance and I am looking forward to your reply,____ > > Sebastian____ > > __ __ > > --____ > > Dr. Sebastian Blauth____ > > Fraunhofer-Institut f?r____ > > Techno- und Wirtschaftsmathematik ITWM____ > > Abteilung Transportvorg?nge____ > > Fraunhofer-Platz 1, 67663 Kaiserslautern____ > > Telefon: +49 631 31600-4968____ > > sebastian.blauth at itwm.fraunhofer.de > ____ > > https://www.itwm.fraunhofer.de ____ > > __ __ > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ -- Dr. Sebastian Blauth Fraunhofer-Institut f?r Techno- und Wirtschaftsmathematik ITWM Abteilung Transportvorg?nge Fraunhofer-Platz 1, 67663 Kaiserslautern Telefon: +49 631 31600-4968 sebastian.blauth at itwm.fraunhofer.de www.itwm.fraunhofer.de From jed at jedbrown.org Tue May 2 13:29:54 2023 From: jed at jedbrown.org (Jed Brown) Date: Tue, 02 May 2023 12:29:54 -0600 Subject: [petsc-users] Scalable Solver for Incompressible Flow In-Reply-To: References: Message-ID: <87cz3i7fj1.fsf@jedbrown.org> Sebastian Blauth writes: > I agree with your comment for the Stokes equations - for these, I have > already tried and used the pressure mass matrix as part of a (additive) > block preconditioner and it gave mesh independent results. 
> > However, for the Navier Stokes equations, is the Schur complement really > spectrally equivalent to the pressure mass matrix? No, it's not. You'd want something like PCD (better, but not algebraic) or LSC. > And even if it is, the convergence is only good for small Reynolds numbers, for moderately high ones the convergence really deteriorates. This is why I am trying to make fieldsplit_schur_precondition selfp work better (this is, if I understand it correctly, a SIMPLE type preconditioner). SIMPLE is for short time steps (not too far from resolving CFL) and bad for steady. This taxonomy is useful, though the problems are super academic and they don't use high aspect ratio. https://doi.org/10.1016/j.jcp.2007.09.026 From knepley at gmail.com Tue May 2 14:33:31 2023 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 2 May 2023 15:33:31 -0400 Subject: [petsc-users] Scalable Solver for Incompressible Flow In-Reply-To: <87cz3i7fj1.fsf@jedbrown.org> References: <87cz3i7fj1.fsf@jedbrown.org> Message-ID: On Tue, May 2, 2023 at 2:29?PM Jed Brown wrote: > Sebastian Blauth writes: > > > I agree with your comment for the Stokes equations - for these, I have > > already tried and used the pressure mass matrix as part of a (additive) > > block preconditioner and it gave mesh independent results. > > > > However, for the Navier Stokes equations, is the Schur complement really > > spectrally equivalent to the pressure mass matrix? > > No, it's not. You'd want something like PCD (better, but not algebraic) or > LSC. > I think you can do a better job than that using something like https://arxiv.org/abs/1810.03315 Basically, you use an augmented Lagrangian thing to make the Schur complement well-conditioned, and then use a special smoother to handle that perturbation. > > And even if it is, the convergence is only good for small Reynolds > numbers, for moderately high ones the convergence really deteriorates. This > is why I am trying to make fieldsplit_schur_precondition selfp work better > (this is, if I understand it correctly, a SIMPLE type preconditioner). > > SIMPLE is for short time steps (not too far from resolving CFL) and bad > for steady. This taxonomy is useful, though the problems are super academic > and they don't use high aspect ratio. > > https://doi.org/10.1016/j.jcp.2007.09.026 > Thanks, Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From tt73 at njit.edu Tue May 2 16:40:37 2023 From: tt73 at njit.edu (Takahashi, Tadanaga) Date: Tue, 2 May 2023 17:40:37 -0400 Subject: [petsc-users] Help using FAS as an initial guess Message-ID: Hi, I want to know how to configure the FAS so that it solves a problem on a coarse grid of size 4h, interpolate the solution, and then stop. Here is the context: I am using Newton LS to solve a problem on square domain discretized with DMDA meshed with step size h. I have a subroutine to compute the initial guess. I want this subroutine to first do a Newton solve on a coarse grid of size 4h. Then it interpolates the solution to the main mesh. I think this is achievable by using one iteration of FAS. 
Below is the gist of what my subroutine looks like: PetscErrorCode InitialState(DM da, Vec u) { // SNES snes; SNESCreate(PETSC_COMM_WORLD,&snes); SNESSetDM(snes,da); SNESSetType(snes,SNESFAS); // solve with multigrid SNESSetTolerances(snes,PETSC_DEFAULT,PETSC_DEFAULT,PETSC_DEFAULT,1,PETSC_DEFAULT); // just one iteration VecSet(u,0.0); // start with zeros SNESSolve(snes,NULL,u); // cheap solve SNESGetSolution(snes,&u); // extract solution } For some reason, my initial guess is too accurate. The initial guess produced by this subroutine looks exactly like the final solution. My guess is that it's doing more than what I want it to do. I'm still new to FAS. How can I tell FAS to do just one crude solve and an interpolation? -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Tue May 2 17:10:33 2023 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 2 May 2023 18:10:33 -0400 Subject: [petsc-users] Help using FAS as an initial guess In-Reply-To: References: Message-ID: <71A1CBDC-9BB6-4D12-AEC4-63435AD88922@petsc.dev> You might consider https://petsc.org/release/manualpages/SNES/SNESSetGridSequence/ it does exactly what I think you want to do. FAS is a bit more subtle than that. The "coarse grid problem" that FAS builds and solves are dependent on the current fine grid solution so you need an "approximate" fine grid solution already in order to create a FAS coarse problem. Of course FAS can, and often should, be used also with grid sequencing to boot-strap the fine grid solutions. Barry > On May 2, 2023, at 5:40 PM, Takahashi, Tadanaga wrote: > > Hi, > > I want to know how to configure the FAS so that it solves a problem on a coarse grid of size 4h, interpolate the solution, and then stop. > > Here is the context: I am using Newton LS to solve a problem on square domain discretized with DMDA meshed with step size h. I have a subroutine to compute the initial guess. I want this subroutine to first do a Newton solve on a coarse grid of size 4h. Then it interpolates the solution to the main mesh. I think this is achievable by using one iteration of FAS. Below is the gist of what my subroutine looks like: > > PetscErrorCode InitialState(DM da, Vec u) { // > SNES snes; > SNESCreate(PETSC_COMM_WORLD,&snes); > SNESSetDM(snes,da); > SNESSetType(snes,SNESFAS); // solve with multigrid > SNESSetTolerances(snes,PETSC_DEFAULT,PETSC_DEFAULT,PETSC_DEFAULT,1,PETSC_DEFAULT); // just one iteration > VecSet(u,0.0); // start with zeros > SNESSolve(snes,NULL,u); // cheap solve > SNESGetSolution(snes,&u); // extract solution > } > > For some reason, my initial guess is too accurate. The initial guess produced by this subroutine looks exactly like the final solution. My guess is that it's doing more than what I want it to do. I'm still new to FAS. How can I tell FAS to do just one crude solve and an interpolation? -------------- next part -------------- An HTML attachment was scrubbed... URL: From tt73 at njit.edu Tue May 2 17:41:30 2023 From: tt73 at njit.edu (tt73) Date: Tue, 02 May 2023 18:41:30 -0400 Subject: [petsc-users] Help using FAS as an initial guess In-Reply-To: <71A1CBDC-9BB6-4D12-AEC4-63435AD88922@petsc.dev> Message-ID: <6451919a.050a0220.d6304.d03e@mx.google.com> Thanks, Barry. I'll look into it.? -------- Original message --------From: Barry Smith Date: 5/2/23 6:10 PM (GMT-05:00) To: "Takahashi, Tadanaga" Cc: PETSc Subject: Re: [petsc-users] Help using FAS as an initial guess ? 
You might consider https://petsc.org/release/manualpages/SNES/SNESSetGridSequence/ it does exactly what I think you want to do. FAS is a bit more subtle than that. The "coarse grid problem" that FAS builds and solves are dependent on the current fine grid solution so you need an "approximate" fine grid solution already in order to create a FAS coarse problem. Of course FAS can, and often should, be used also with grid sequencing to boot-strap the fine grid solutions.

Barry

On May 2, 2023, at 5:40 PM, Takahashi, Tadanaga wrote:

Hi, I want to know how to configure the FAS so that it solves a problem on a coarse grid of size 4h, interpolate the solution, and then stop. Here is the context: I am using Newton LS to solve a problem on square domain discretized with DMDA meshed with step size h. I have a subroutine to compute the initial guess. I want this subroutine to first do a Newton solve on a coarse grid of size 4h. Then it interpolates the solution to the main mesh. I think this is achievable by using one iteration of FAS. Below is the gist of what my subroutine looks like:

PetscErrorCode InitialState(DM da, Vec u) { //
SNES snes;
SNESCreate(PETSC_COMM_WORLD,&snes);
SNESSetDM(snes,da);
SNESSetType(snes,SNESFAS); // solve with multigrid
SNESSetTolerances(snes,PETSC_DEFAULT,PETSC_DEFAULT,PETSC_DEFAULT,1,PETSC_DEFAULT); // just one iteration
VecSet(u,0.0);            // start with zeros
SNESSolve(snes,NULL,u);   // cheap solve
SNESGetSolution(snes,&u); // extract solution
}

For some reason, my initial guess is too accurate. The initial guess produced by this subroutine looks exactly like the final solution. My guess is that it's doing more than what I want it to do. I'm still new to FAS. How can I tell FAS to do just one crude solve and an interpolation?
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From sebastian.blauth at itwm.fraunhofer.de  Wed May  3 02:07:09 2023
From: sebastian.blauth at itwm.fraunhofer.de (Sebastian Blauth)
Date: Wed, 3 May 2023 09:07:09 +0200
Subject: [petsc-users] Scalable Solver for Incompressible Flow
In-Reply-To: 
References: <87cz3i7fj1.fsf@jedbrown.org>
Message-ID: <3287ff5f-5ac1-fdff-52d1-97888568c098@itwm.fraunhofer.de>

First of all, yes you are correct that I am trying to solve the stationary incompressible Navier Stokes equations.

On 02.05.2023 21:33, Matthew Knepley wrote: > On Tue, May 2, 2023 at 2:29 PM Jed Brown > wrote: > > Sebastian Blauth > writes: > > > I agree with your comment for the Stokes equations - for these, I > have > > already tried and used the pressure mass matrix as part of a > (additive) > > block preconditioner and it gave mesh independent results. > > > > However, for the Navier Stokes equations, is the Schur complement > really > > spectrally equivalent to the pressure mass matrix? > > No, it's not. You'd want something like PCD (better, but not > algebraic) or LSC. >

I would like to take a look at the LSC preconditioner. For this, I did also not achieve mesh-independent results. I am using the following options (I know that the tolerances are too high at the moment, but it should just illustrate the behavior w.r.t. mesh refinement). Again, I am using a simple 2D channel problem for testing purposes.
I am using the following options -ksp_type fgmres -ksp_gmres_restart 100 -ksp_gmres_cgs_refinement_type refine_ifneeded -ksp_max_it 1000 -ksp_rtol 1e-10 -ksp_atol 1e-30 -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_schur_fact_type full -pc_fieldsplit_schur_precondition self -fieldsplit_0_ksp_type preonly -fieldsplit_0_pc_type lu -fieldsplit_1_ksp_type gmres -fieldsplit_1_ksp_pc_side right -fieldsplit_1_ksp_gmres_restart 100 -fieldsplit_1_ksp_gmres_cgs_refinement_type refine_ifneeded -fieldsplit_1_ksp_max_it 1000 -fieldsplit_1_ksp_rtol 1e-10 -fieldsplit_1_ksp_atol 1e-30 -fieldsplit_1_pc_type lsc -fieldsplit_1_lsc_ksp_type preonly -fieldsplit_1_lsc_pc_type lu -fieldsplit_1_ksp_converged_reason Again, the direct solvers are used so that only the influence of the LSC preconditioner is seen. I have suitable preconditioners for all of these available (using boomeramg). At the bottom, I attach the output for different discretizations. As you can see there, the number of iterations increases nearly linearly with the problem size. I think that the problem could occur due to a wrong scaling. In your docs https://petsc.org/release/manualpages/PC/PCLSC/ , you write that the LSC preconditioner is implemented as inv(S) \approx inv(A10 A01) A10 A00 A01 inv(A10 A01) However, in the book of Elman, Sylvester and Wathen (Finite Elements and Fast Iterative Solvers), the LSC preconditioner is defined as inv(S) \approx inv(A10 inv(T) A01) A10 inv(T) A00 inv(T) A01 inv(A10 inv(T) A01) where T = diag(Q) and Q is the velocity mass matrix. There is an options -pc_lsc_scale_diag, which states that it uses the diagonal of A for scaling. I suppose, that this means, that the diagonal of the A00 block is used for scaling - however, this is not the right scaling, is it? Even in the source code for the LSC preconditioner, in /src/ksp/pc/impls/lsc/lsc.c it is mentioned, that a mass matrix should be used... Is there any way to implement this in PETSc? Maybe by supplying the mass matrix as Pmat? Thanks a lot in advance, Sebastian > > I think you can do a better job than that using something like > > https://arxiv.org/abs/1810.03315 > > Basically, you use an augmented Lagrangian thing to make the Schur > complement well-conditioned, > and then use a special smoother to handle that perturbation. > > > And even if it is, the convergence is only good for small > Reynolds numbers, for moderately high ones the convergence really > deteriorates. This is why I am trying to make > fieldsplit_schur_precondition selfp work better (this is, if I > understand it correctly, a SIMPLE type preconditioner). > > SIMPLE is for short time steps (not too far from resolving CFL) and > bad for steady. This taxonomy is useful, though the problems are > super academic and they don't use high aspect ratio. > Okay, I get that I cannot expect the SIMPLE preconditioning (schur_precondition selfp) to work efficiently. I guess the reason it works for small time steps (for the instationary equations) is that the velocity block becomes diagonally dominant in this case, so that diag(A) is a good approximation of A. > https://doi.org/10.1016/j.jcp.2007.09.026 > > > > ? ?Thanks, > > ? ? ? Matt > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ And here is the output of my scaling tests 8x8 discretization Newton solver: iter, abs. residual (abs. tol), rel. 
residual (rel. tol) Newton solver: 0, 1.023e+03 (1.00e-30), 1.000e+00 (1.00e-10) Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 38 Newton solver: 1, 1.313e+03 (1.00e-30), 1.283e+00 (1.00e-10) Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 76 Newton solver: 2, 1.198e+02 (1.00e-30), 1.171e-01 (1.00e-10) Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 74 Newton solver: 3, 7.249e-01 (1.00e-30), 7.084e-04 (1.00e-10) Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 74 Newton solver: 4, 3.883e-05 (1.00e-30), 3.795e-08 (1.00e-10) Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 74 Newton solver: 5, 2.778e-12 (1.00e-30), 2.714e-15 (1.00e-10) 16x16 discretization Newton solver: iter, abs. residual (abs. tol), rel. residual (rel. tol) Newton solver: 0, 1.113e+03 (1.00e-30), 1.000e+00 (1.00e-10) Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 62 Newton solver: 1, 8.316e+02 (1.00e-30), 7.475e-01 (1.00e-10) Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 141 Newton solver: 2, 5.806e+01 (1.00e-30), 5.218e-02 (1.00e-10) Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 119 Newton solver: 3, 3.309e-01 (1.00e-30), 2.974e-04 (1.00e-10) Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 118 Newton solver: 4, 9.085e-06 (1.00e-30), 8.166e-09 (1.00e-10) Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 120 Newton solver: 5, 3.475e-12 (1.00e-30), 3.124e-15 (1.00e-10) 32x32 discretization Newton solver: iter, abs. residual (abs. tol), rel. residual (rel. tol) Newton solver: 0, 1.330e+03 (1.00e-30), 1.000e+00 (1.00e-10) Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 98 Newton solver: 1, 5.913e+02 (1.00e-30), 4.445e-01 (1.00e-10) Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 183 Newton solver: 2, 3.214e+01 (1.00e-30), 2.416e-02 (1.00e-10) Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 152 Newton solver: 3, 2.059e-01 (1.00e-30), 1.547e-04 (1.00e-10) Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 151 Newton solver: 4, 6.949e-06 (1.00e-30), 5.223e-09 (1.00e-10) Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 149 Newton solver: 5, 5.300e-12 (1.00e-30), 3.983e-15 (1.00e-10) 64x64 discretization Newton solver: iter, abs. residual (abs. tol), rel. residual (rel. tol) Newton solver: 0, 1.707e+03 (1.00e-30), 1.000e+00 (1.00e-10) Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 198 Newton solver: 1, 4.259e+02 (1.00e-30), 2.494e-01 (1.00e-10) Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 357 Newton solver: 2, 1.706e+01 (1.00e-30), 9.993e-03 (1.00e-10) Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 266 Newton solver: 3, 1.134e-01 (1.00e-30), 6.639e-05 (1.00e-10) Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 261 Newton solver: 4, 4.285e-06 (1.00e-30), 2.510e-09 (1.00e-10) Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 263 Newton solver: 5, 9.650e-12 (1.00e-30), 5.652e-15 (1.00e-10) -- Dr. 
Sebastian Blauth Fraunhofer-Institut f?r Techno- und Wirtschaftsmathematik ITWM Abteilung Transportvorg?nge Fraunhofer-Platz 1, 67663 Kaiserslautern Telefon: +49 631 31600-4968 sebastian.blauth at itwm.fraunhofer.de www.itwm.fraunhofer.de From ksl7912 at snu.ac.kr Wed May 3 05:04:35 2023 From: ksl7912 at snu.ac.kr (=?UTF-8?B?wq3qtozsirnrpqwgLyDtlZnsg50gLyDtla3qs7XsmrDso7zqs7XtlZnqs7w=?=) Date: Wed, 3 May 2023 19:04:35 +0900 Subject: [petsc-users] parallel computing error Message-ID: Dear developers I'm trying to use parallel computing and I ran the command 'mpirun -np 4 ./app' In this case, there are two problems. *First,* I encountered error message /// [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [1]PETSC ERROR: Invalid argument [1]PETSC ERROR: Comm must be of size 1 /// The code on the error position is MatCreateSeqDense(PETSC_COMM_SELF, nns, ns, NULL, &Kns)); Could "MatCreateSeqDense" not be used in parallel computing? *Second*, the same error message is repeated as many times as the number of cores. if I use command -np 4, then the error message is repeated 4 times. Could you recommend some advice related to this? Best, Seung Lee Kwon -- Seung Lee Kwon, Ph.D.Candidate Aerospace Structures and Materials Laboratory Department of Mechanical and Aerospace Engineering Seoul National University Building 300 Rm 503, Gwanak-ro 1, Gwanak-gu, Seoul, South Korea, 08826 E-mail : ksl7912 at snu.ac.kr Office : +82-2-880-7389 C. P : +82-10-4695-1062 -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed May 3 05:30:01 2023 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 3 May 2023 06:30:01 -0400 Subject: [petsc-users] parallel computing error In-Reply-To: References: Message-ID: On Wed, May 3, 2023 at 6:05?AM ???? / ?? / ??????? wrote: > Dear developers > > I'm trying to use parallel computing and I ran the command 'mpirun -np 4 > ./app' > > In this case, there are two problems. > > *First,* I encountered error message > /// > [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [1]PETSC ERROR: Invalid argument > [1]PETSC ERROR: Comm must be of size 1 > /// > The code on the error position is > MatCreateSeqDense(PETSC_COMM_SELF, nns, ns, NULL, &Kns)); > 1) "Seq" means sequential, that is "not parallel". 2) This line should still be fine since PETSC_COMM_SELF is a serial communicator 3) You should be checking the error code for each call, maybe using the CHKERRQ() macro 4) Please always send the entire error message, not a snippet THanks Matt > Could "MatCreateSeqDense" not be used in parallel computing? > > *Second*, the same error message is repeated as many times as the number > of cores. > if I use command -np 4, then the error message is repeated 4 times. > Could you recommend some advice related to this? > > Best, > Seung Lee Kwon > > -- > Seung Lee Kwon, Ph.D.Candidate > Aerospace Structures and Materials Laboratory > Department of Mechanical and Aerospace Engineering > Seoul National University > Building 300 Rm 503, Gwanak-ro 1, Gwanak-gu, Seoul, South Korea, 08826 > E-mail : ksl7912 at snu.ac.kr > Office : +82-2-880-7389 > C. P : +82-10-4695-1062 > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mlohry at gmail.com Wed May 3 19:42:27 2023 From: mlohry at gmail.com (Mark Lohry) Date: Wed, 3 May 2023 20:42:27 -0400 Subject: [petsc-users] sources of floating point randomness in JFNK in serial Message-ID: I'm running multiple iterations of newtonls with an MFFD/JFNK nonlinear solver where I give it the sparsity. PC asm, KSP gmres, with SNESSetLagJacobian -2 (compute once and then frozen jacobian). I'm seeing slight (<1%) but nonzero differences in residuals from run to run. I'm wondering where randomness might enter here -- does the jacobian coloring use a random seed? -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed May 3 19:50:27 2023 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 3 May 2023 20:50:27 -0400 Subject: [petsc-users] sources of floating point randomness in JFNK in serial In-Reply-To: References: Message-ID: <5318727D-B9F9-48BC-A7CE-94EBDB08566F@petsc.dev> No, the coloring should be identical every time. Do you see differences with 1 MPI rank? (Or much smaller ones?). > On May 3, 2023, at 8:42 PM, Mark Lohry wrote: > > I'm running multiple iterations of newtonls with an MFFD/JFNK nonlinear solver where I give it the sparsity. PC asm, KSP gmres, with SNESSetLagJacobian -2 (compute once and then frozen jacobian). > > I'm seeing slight (<1%) but nonzero differences in residuals from run to run. I'm wondering where randomness might enter here -- does the jacobian coloring use a random seed? From mlohry at gmail.com Wed May 3 20:07:49 2023 From: mlohry at gmail.com (Mark Lohry) Date: Wed, 3 May 2023 21:07:49 -0400 Subject: [petsc-users] sources of floating point randomness in JFNK in serial In-Reply-To: <5318727D-B9F9-48BC-A7CE-94EBDB08566F@petsc.dev> References: <5318727D-B9F9-48BC-A7CE-94EBDB08566F@petsc.dev> Message-ID: This is on a single MPI rank. I haven't checked the coloring, was just guessing there. But the solutions/residuals are slightly different from run to run. Fair to say that for serial JFNK/asm ilu0/gmres we should expect bitwise identical results? On Wed, May 3, 2023, 8:50 PM Barry Smith wrote: > > No, the coloring should be identical every time. Do you see differences > with 1 MPI rank? (Or much smaller ones?). > > > > > On May 3, 2023, at 8:42 PM, Mark Lohry wrote: > > > > I'm running multiple iterations of newtonls with an MFFD/JFNK nonlinear > solver where I give it the sparsity. PC asm, KSP gmres, with > SNESSetLagJacobian -2 (compute once and then frozen jacobian). > > > > I'm seeing slight (<1%) but nonzero differences in residuals from run to > run. I'm wondering where randomness might enter here -- does the jacobian > coloring use a random seed? > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ksl7912 at snu.ac.kr Wed May 3 21:29:04 2023 From: ksl7912 at snu.ac.kr (=?UTF-8?B?wq3qtozsirnrpqwgLyDtlZnsg50gLyDtla3qs7XsmrDso7zqs7XtlZnqs7w=?=) Date: Thu, 4 May 2023 11:29:04 +0900 Subject: [petsc-users] parallel computing error In-Reply-To: References: Message-ID: Dear developers Thank you for your explanation. But I should use the MatCreateSeqDense because I want to use the MatMatSolve that B matrix must be a SeqDense matrix. Using MatMatSolve is an inevitable part of my code. Could you give me a comment to avoid this error? Best, Seung Lee Kwon 2023? 5? 3? (?) ?? 7:30, Matthew Knepley ?? 
??: > On Wed, May 3, 2023 at 6:05?AM ???? / ?? / ??????? > wrote: > >> Dear developers >> >> I'm trying to use parallel computing and I ran the command 'mpirun -np 4 >> ./app' >> >> In this case, there are two problems. >> >> *First,* I encountered error message >> /// >> [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> [1]PETSC ERROR: Invalid argument >> [1]PETSC ERROR: Comm must be of size 1 >> /// >> The code on the error position is >> MatCreateSeqDense(PETSC_COMM_SELF, nns, ns, NULL, &Kns)); >> > > 1) "Seq" means sequential, that is "not parallel". > > 2) This line should still be fine since PETSC_COMM_SELF is a serial > communicator > > 3) You should be checking the error code for each call, maybe using the > CHKERRQ() macro > > 4) Please always send the entire error message, not a snippet > > THanks > > Matt > > >> Could "MatCreateSeqDense" not be used in parallel computing? >> >> *Second*, the same error message is repeated as many times as the number >> of cores. >> if I use command -np 4, then the error message is repeated 4 times. >> Could you recommend some advice related to this? >> >> Best, >> Seung Lee Kwon >> >> -- >> Seung Lee Kwon, Ph.D.Candidate >> Aerospace Structures and Materials Laboratory >> Department of Mechanical and Aerospace Engineering >> Seoul National University >> Building 300 Rm 503, Gwanak-ro 1, Gwanak-gu, Seoul, South Korea, 08826 >> E-mail : ksl7912 at snu.ac.kr >> Office : +82-2-880-7389 >> C. P : +82-10-4695-1062 >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- Seung Lee Kwon, Ph.D.Candidate Aerospace Structures and Materials Laboratory Department of Mechanical and Aerospace Engineering Seoul National University Building 300 Rm 503, Gwanak-ro 1, Gwanak-gu, Seoul, South Korea, 08826 E-mail : ksl7912 at snu.ac.kr Office : +82-2-880-7389 C. P : +82-10-4695-1062 -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed May 3 22:05:05 2023 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 3 May 2023 23:05:05 -0400 Subject: [petsc-users] sources of floating point randomness in JFNK in serial In-Reply-To: References: <5318727D-B9F9-48BC-A7CE-94EBDB08566F@petsc.dev> Message-ID: Do they start very similarly and then slowly drift further apart? That is the first couple of KSP iterations they are almost identical but then for each iteration get a bit further. Similar for the SNES iterations, starting close and then for more iterations and more solves they start moving apart. Or do they suddenly jump to be very different? You can run with -snes_monitor -ksp_monitor > On May 3, 2023, at 9:07 PM, Mark Lohry wrote: > > This is on a single MPI rank. I haven't checked the coloring, was just guessing there. But the solutions/residuals are slightly different from run to run. > > Fair to say that for serial JFNK/asm ilu0/gmres we should expect bitwise identical results? > > > On Wed, May 3, 2023, 8:50 PM Barry Smith > wrote: >> >> No, the coloring should be identical every time. Do you see differences with 1 MPI rank? (Or much smaller ones?). >> >> >> >> > On May 3, 2023, at 8:42 PM, Mark Lohry > wrote: >> > >> > I'm running multiple iterations of newtonls with an MFFD/JFNK nonlinear solver where I give it the sparsity. 
PC asm, KSP gmres, with SNESSetLagJacobian -2 (compute once and then frozen jacobian). >> > >> > I'm seeing slight (<1%) but nonzero differences in residuals from run to run. I'm wondering where randomness might enter here -- does the jacobian coloring use a random seed? >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed May 3 22:08:33 2023 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 3 May 2023 23:08:33 -0400 Subject: [petsc-users] parallel computing error In-Reply-To: References: Message-ID: You can configure with MUMPS ./configure --download-mumps --download-scalapack --download-ptscotch --download-metis --download-parmetis And then use MatMatSolve() as in src/mat/tests/ex125.c with parallel MatMatSolve() using MUMPS as the solver. Barry > On May 3, 2023, at 10:29 PM, ???? / ?? / ??????? wrote: > > Dear developers > > Thank you for your explanation. > > But I should use the MatCreateSeqDense because I want to use the MatMatSolve that B matrix must be a SeqDense matrix. > > Using MatMatSolve is an inevitable part of my code. > > Could you give me a comment to avoid this error? > > Best, > > Seung Lee Kwon > > 2023? 5? 3? (?) ?? 7:30, Matthew Knepley >?? ??: >> On Wed, May 3, 2023 at 6:05?AM ???? / ?? / ??????? > wrote: >>> Dear developers >>> >>> I'm trying to use parallel computing and I ran the command 'mpirun -np 4 ./app' >>> >>> In this case, there are two problems. >>> >>> First, I encountered error message >>> /// >>> [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >>> [1]PETSC ERROR: Invalid argument >>> [1]PETSC ERROR: Comm must be of size 1 >>> /// >>> The code on the error position is >>> MatCreateSeqDense(PETSC_COMM_SELF, nns, ns, NULL, &Kns)); >> >> 1) "Seq" means sequential, that is "not parallel". >> >> 2) This line should still be fine since PETSC_COMM_SELF is a serial communicator >> >> 3) You should be checking the error code for each call, maybe using the CHKERRQ() macro >> >> 4) Please always send the entire error message, not a snippet >> >> THanks >> >> Matt >> >>> Could "MatCreateSeqDense" not be used in parallel computing? >>> >>> Second, the same error message is repeated as many times as the number of cores. >>> if I use command -np 4, then the error message is repeated 4 times. >>> Could you recommend some advice related to this? >>> >>> Best, >>> Seung Lee Kwon >>> >>> -- >>> Seung Lee Kwon, Ph.D.Candidate >>> Aerospace Structures and Materials Laboratory >>> Department of Mechanical and Aerospace Engineering >>> Seoul National University >>> Building 300 Rm 503, Gwanak-ro 1, Gwanak-gu, Seoul, South Korea, 08826 >>> E-mail : ksl7912 at snu.ac.kr >>> Office : +82-2-880-7389 >>> C. P : +82-10-4695-1062 >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ > > > -- > Seung Lee Kwon, Ph.D.Candidate > Aerospace Structures and Materials Laboratory > Department of Mechanical and Aerospace Engineering > Seoul National University > Building 300 Rm 503, Gwanak-ro 1, Gwanak-gu, Seoul, South Korea, 08826 > E-mail : ksl7912 at snu.ac.kr > Office : +82-2-880-7389 > C. P : +82-10-4695-1062 -------------- next part -------------- An HTML attachment was scrubbed... 
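To make the ex125.c pointer above a little more concrete, a rough sketch of a parallel multi-right-hand-side direct solve could look like the following. This is not the test's actual code: it goes through KSP/PCLU instead of calling the factorization routines directly, the function and variable names are made up, and it assumes A is a parallel AIJ matrix and that PETSc was configured with MUMPS as described above.

#include <petscksp.h>

/* Solve A X = B for many right-hand sides at once with MUMPS.
   A is a parallel MATAIJ matrix; B holds the right-hand sides as the
   columns of a (parallel) dense matrix, created e.g. with MatCreateDense(). */
PetscErrorCode SolveManyRHS(Mat A, Mat B, Mat *X)
{
  KSP ksp;
  PC  pc;
  Mat F; /* factored matrix held inside PCLU */

  PetscFunctionBeginUser;
  PetscCall(KSPCreate(PetscObjectComm((PetscObject)A), &ksp));
  PetscCall(KSPSetOperators(ksp, A, A));
  PetscCall(KSPSetType(ksp, KSPPREONLY));
  PetscCall(KSPGetPC(ksp, &pc));
  PetscCall(PCSetType(pc, PCLU));
  PetscCall(PCFactorSetMatSolverType(pc, MATSOLVERMUMPS));
  PetscCall(KSPSetFromOptions(ksp));
  PetscCall(KSPSetUp(ksp));                     /* triggers the MUMPS factorization */

  PetscCall(PCFactorGetMatrix(pc, &F));         /* access the factored matrix */
  PetscCall(MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, X));
  PetscCall(MatMatSolve(F, B, *X));             /* X = inv(A) * B, all columns at once */

  PetscCall(KSPDestroy(&ksp));
  PetscFunctionReturn(PETSC_SUCCESS);
}

The right-hand side only needs to be dense (MATDENSE); on a single process that is the SeqDense case from the earlier error message, while in parallel it is the MPI dense format, which is, as far as I can tell, what ex125.c creates with MatCreateDense() on PETSC_COMM_WORLD.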
URL: From ksl7912 at snu.ac.kr Thu May 4 02:20:24 2023 From: ksl7912 at snu.ac.kr (=?UTF-8?B?wq3qtozsirnrpqwgLyDtlZnsg50gLyDtla3qs7XsmrDso7zqs7XtlZnqs7w=?=) Date: Thu, 4 May 2023 16:20:24 +0900 Subject: [petsc-users] parallel computing error In-Reply-To: References: Message-ID: Dear Barry Smith Thank you for your reply. I've already installed MUMPS. And I checked the example you said (ex125.c), I don't understand why the RHS matrix becomes the SeqDense matrix. Could you explain in more detail? Best regards Seung Lee Kwon 2023? 5? 4? (?) ?? 12:08, Barry Smith ?? ??: > > You can configure with MUMPS ./configure --download-mumps > --download-scalapack --download-ptscotch --download-metis > --download-parmetis > > And then use MatMatSolve() as in src/mat/tests/ex125.c with parallel > MatMatSolve() using MUMPS as the solver. > > Barry > > > On May 3, 2023, at 10:29 PM, ???? / ?? / ??????? > wrote: > > Dear developers > > Thank you for your explanation. > > But I should use the MatCreateSeqDense because I want to use the > MatMatSolve that B matrix must be a SeqDense matrix. > > Using MatMatSolve is an inevitable part of my code. > > Could you give me a comment to avoid this error? > > Best, > > Seung Lee Kwon > > 2023? 5? 3? (?) ?? 7:30, Matthew Knepley ?? ??: > >> On Wed, May 3, 2023 at 6:05?AM ???? / ?? / ??????? >> wrote: >> >>> Dear developers >>> >>> I'm trying to use parallel computing and I ran the command 'mpirun -np 4 >>> ./app' >>> >>> In this case, there are two problems. >>> >>> *First,* I encountered error message >>> /// >>> [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message >>> -------------------------------------------------------------- >>> [1]PETSC ERROR: Invalid argument >>> [1]PETSC ERROR: Comm must be of size 1 >>> /// >>> The code on the error position is >>> MatCreateSeqDense(PETSC_COMM_SELF, nns, ns, NULL, &Kns)); >>> >> >> 1) "Seq" means sequential, that is "not parallel". >> >> 2) This line should still be fine since PETSC_COMM_SELF is a serial >> communicator >> >> 3) You should be checking the error code for each call, maybe using the >> CHKERRQ() macro >> >> 4) Please always send the entire error message, not a snippet >> >> THanks >> >> Matt >> >> >>> Could "MatCreateSeqDense" not be used in parallel computing? >>> >>> *Second*, the same error message is repeated as many times as the >>> number of cores. >>> if I use command -np 4, then the error message is repeated 4 times. >>> Could you recommend some advice related to this? >>> >>> Best, >>> Seung Lee Kwon >>> >>> -- >>> Seung Lee Kwon, Ph.D.Candidate >>> Aerospace Structures and Materials Laboratory >>> Department of Mechanical and Aerospace Engineering >>> Seoul National University >>> Building 300 Rm 503, Gwanak-ro 1, Gwanak-gu, Seoul, South Korea, 08826 >>> E-mail : ksl7912 at snu.ac.kr >>> Office : +82-2-880-7389 >>> C. P : +82-10-4695-1062 >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > > > -- > Seung Lee Kwon, Ph.D.Candidate > Aerospace Structures and Materials Laboratory > Department of Mechanical and Aerospace Engineering > Seoul National University > Building 300 Rm 503, Gwanak-ro 1, Gwanak-gu, Seoul, South Korea, 08826 > E-mail : ksl7912 at snu.ac.kr > Office : +82-2-880-7389 > C. 
P : +82-10-4695-1062 > > > -- Seung Lee Kwon, Ph.D.Candidate Aerospace Structures and Materials Laboratory Department of Mechanical and Aerospace Engineering Seoul National University Building 300 Rm 503, Gwanak-ro 1, Gwanak-gu, Seoul, South Korea, 08826 E-mail : ksl7912 at snu.ac.kr Office : +82-2-880-7389 C. P : +82-10-4695-1062 -------------- next part -------------- An HTML attachment was scrubbed... URL: From mlohry at gmail.com Thu May 4 07:13:07 2023 From: mlohry at gmail.com (Mark Lohry) Date: Thu, 4 May 2023 08:13:07 -0400 Subject: [petsc-users] sources of floating point randomness in JFNK in serial In-Reply-To: References: <5318727D-B9F9-48BC-A7CE-94EBDB08566F@petsc.dev> Message-ID: > > Do they start very similarly and then slowly drift further apart? Yes, this. I take it this sounds familiar? See these two examples with 20 fixed iterations pasted at the end. The difference for one solve is slight (final SNES norm is identical to 5 digits), but in the context I'm using it in (repeated applications to solve a steady state multigrid problem, though here just one level) the differences add up such that I might reach global convergence in 35 iterations or 38. It's not the end of the world, but I was expecting that with -np 1 these would be identical and I'm not sure where the root cause would be. 0 SNES Function norm 2.801842107848e+04 0 KSP Residual norm 4.045639499595e+01 1 KSP Residual norm 1.917999809040e+01 2 KSP Residual norm 1.616048521958e+01 [...] 19 KSP Residual norm 8.788043518111e-01 20 KSP Residual norm 6.570851270214e-01 Linear solve converged due to CONVERGED_ITS iterations 20 1 SNES Function norm 1.801309983345e+03 Nonlinear solve converged due to CONVERGED_ITS iterations 1 Same system, identical initial 0 SNES norm, 0 KSP is slightly different 0 SNES Function norm 2.801842107848e+04 0 KSP Residual norm 4.045639473002e+01 1 KSP Residual norm 1.917999883034e+01 2 KSP Residual norm 1.616048572016e+01 [...] 19 KSP Residual norm 8.788046348957e-01 20 KSP Residual norm 6.570859588610e-01 Linear solve converged due to CONVERGED_ITS iterations 20 1 SNES Function norm 1.801311320322e+03 Nonlinear solve converged due to CONVERGED_ITS iterations 1 On Wed, May 3, 2023 at 11:05?PM Barry Smith wrote: > > Do they start very similarly and then slowly drift further apart? That > is the first couple of KSP iterations they are almost identical but then > for each iteration get a bit further. Similar for the SNES iterations, > starting close and then for more iterations and more solves they start > moving apart. Or do they suddenly jump to be very different? You can run > with -snes_monitor -ksp_monitor > > On May 3, 2023, at 9:07 PM, Mark Lohry wrote: > > This is on a single MPI rank. I haven't checked the coloring, was just > guessing there. But the solutions/residuals are slightly different from run > to run. > > Fair to say that for serial JFNK/asm ilu0/gmres we should expect bitwise > identical results? > > > On Wed, May 3, 2023, 8:50 PM Barry Smith wrote: > >> >> No, the coloring should be identical every time. Do you see differences >> with 1 MPI rank? (Or much smaller ones?). >> >> >> >> > On May 3, 2023, at 8:42 PM, Mark Lohry wrote: >> > >> > I'm running multiple iterations of newtonls with an MFFD/JFNK nonlinear >> solver where I give it the sparsity. PC asm, KSP gmres, with >> SNESSetLagJacobian -2 (compute once and then frozen jacobian). >> > >> > I'm seeing slight (<1%) but nonzero differences in residuals from run >> to run. 
I'm wondering where randomness might enter here -- does the >> jacobian coloring use a random seed? >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu May 4 07:25:06 2023 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 4 May 2023 08:25:06 -0400 Subject: [petsc-users] sources of floating point randomness in JFNK in serial In-Reply-To: References: <5318727D-B9F9-48BC-A7CE-94EBDB08566F@petsc.dev> Message-ID: On Thu, May 4, 2023 at 8:21?AM Mark Lohry wrote: > Do they start very similarly and then slowly drift further apart? > > > Yes, this. I take it this sounds familiar? > > See these two examples with 20 fixed iterations pasted at the end. The > difference for one solve is slight (final SNES norm is identical to 5 > digits), but in the context I'm using it in (repeated applications to solve > a steady state multigrid problem, though here just one level) the > differences add up such that I might reach global convergence in 35 > iterations or 38. It's not the end of the world, but I was expecting that > with -np 1 these would be identical and I'm not sure where the root cause > would be. > The initial KSP residual is different, so its the PC. Please send the output of -snes_view. If your ASM is using direct factorization, then it could be randomness in whatever LU you are using. Thanks, Matt > 0 SNES Function norm 2.801842107848e+04 > 0 KSP Residual norm 4.045639499595e+01 > 1 KSP Residual norm 1.917999809040e+01 > 2 KSP Residual norm 1.616048521958e+01 > [...] > 19 KSP Residual norm 8.788043518111e-01 > 20 KSP Residual norm 6.570851270214e-01 > Linear solve converged due to CONVERGED_ITS iterations 20 > 1 SNES Function norm 1.801309983345e+03 > Nonlinear solve converged due to CONVERGED_ITS iterations 1 > > > Same system, identical initial 0 SNES norm, 0 KSP is slightly different > > 0 SNES Function norm 2.801842107848e+04 > 0 KSP Residual norm 4.045639473002e+01 > 1 KSP Residual norm 1.917999883034e+01 > 2 KSP Residual norm 1.616048572016e+01 > [...] > 19 KSP Residual norm 8.788046348957e-01 > 20 KSP Residual norm 6.570859588610e-01 > Linear solve converged due to CONVERGED_ITS iterations 20 > 1 SNES Function norm 1.801311320322e+03 > Nonlinear solve converged due to CONVERGED_ITS iterations 1 > > On Wed, May 3, 2023 at 11:05?PM Barry Smith wrote: > >> >> Do they start very similarly and then slowly drift further apart? That >> is the first couple of KSP iterations they are almost identical but then >> for each iteration get a bit further. Similar for the SNES iterations, >> starting close and then for more iterations and more solves they start >> moving apart. Or do they suddenly jump to be very different? You can run >> with -snes_monitor -ksp_monitor >> >> On May 3, 2023, at 9:07 PM, Mark Lohry wrote: >> >> This is on a single MPI rank. I haven't checked the coloring, was just >> guessing there. But the solutions/residuals are slightly different from run >> to run. >> >> Fair to say that for serial JFNK/asm ilu0/gmres we should expect bitwise >> identical results? >> >> >> On Wed, May 3, 2023, 8:50 PM Barry Smith wrote: >> >>> >>> No, the coloring should be identical every time. Do you see >>> differences with 1 MPI rank? (Or much smaller ones?). >>> >>> >>> >>> > On May 3, 2023, at 8:42 PM, Mark Lohry wrote: >>> > >>> > I'm running multiple iterations of newtonls with an MFFD/JFNK >>> nonlinear solver where I give it the sparsity. 
PC asm, KSP gmres, with >>> SNESSetLagJacobian -2 (compute once and then frozen jacobian). >>> > >>> > I'm seeing slight (<1%) but nonzero differences in residuals from run >>> to run. I'm wondering where randomness might enter here -- does the >>> jacobian coloring use a random seed? >>> >>> >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Thu May 4 07:29:55 2023 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 4 May 2023 08:29:55 -0400 Subject: [petsc-users] sources of floating point randomness in JFNK in serial In-Reply-To: References: <5318727D-B9F9-48BC-A7CE-94EBDB08566F@petsc.dev> Message-ID: If you are using MG what is the coarse grid solver? -snes_view might give you that. On Thu, May 4, 2023 at 8:25?AM Matthew Knepley wrote: > On Thu, May 4, 2023 at 8:21?AM Mark Lohry wrote: > >> Do they start very similarly and then slowly drift further apart? >> >> >> Yes, this. I take it this sounds familiar? >> >> See these two examples with 20 fixed iterations pasted at the end. The >> difference for one solve is slight (final SNES norm is identical to 5 >> digits), but in the context I'm using it in (repeated applications to solve >> a steady state multigrid problem, though here just one level) the >> differences add up such that I might reach global convergence in 35 >> iterations or 38. It's not the end of the world, but I was expecting that >> with -np 1 these would be identical and I'm not sure where the root cause >> would be. >> > > The initial KSP residual is different, so its the PC. Please send the > output of -snes_view. If your ASM is using direct factorization, then it > could be randomness in whatever LU you are using. > > Thanks, > > Matt > > >> 0 SNES Function norm 2.801842107848e+04 >> 0 KSP Residual norm 4.045639499595e+01 >> 1 KSP Residual norm 1.917999809040e+01 >> 2 KSP Residual norm 1.616048521958e+01 >> [...] >> 19 KSP Residual norm 8.788043518111e-01 >> 20 KSP Residual norm 6.570851270214e-01 >> Linear solve converged due to CONVERGED_ITS iterations 20 >> 1 SNES Function norm 1.801309983345e+03 >> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >> >> >> Same system, identical initial 0 SNES norm, 0 KSP is slightly different >> >> 0 SNES Function norm 2.801842107848e+04 >> 0 KSP Residual norm 4.045639473002e+01 >> 1 KSP Residual norm 1.917999883034e+01 >> 2 KSP Residual norm 1.616048572016e+01 >> [...] >> 19 KSP Residual norm 8.788046348957e-01 >> 20 KSP Residual norm 6.570859588610e-01 >> Linear solve converged due to CONVERGED_ITS iterations 20 >> 1 SNES Function norm 1.801311320322e+03 >> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >> >> On Wed, May 3, 2023 at 11:05?PM Barry Smith wrote: >> >>> >>> Do they start very similarly and then slowly drift further apart? That >>> is the first couple of KSP iterations they are almost identical but then >>> for each iteration get a bit further. Similar for the SNES iterations, >>> starting close and then for more iterations and more solves they start >>> moving apart. Or do they suddenly jump to be very different? You can run >>> with -snes_monitor -ksp_monitor >>> >>> On May 3, 2023, at 9:07 PM, Mark Lohry wrote: >>> >>> This is on a single MPI rank. I haven't checked the coloring, was just >>> guessing there. 
But the solutions/residuals are slightly different from run >>> to run. >>> >>> Fair to say that for serial JFNK/asm ilu0/gmres we should expect bitwise >>> identical results? >>> >>> >>> On Wed, May 3, 2023, 8:50 PM Barry Smith wrote: >>> >>>> >>>> No, the coloring should be identical every time. Do you see >>>> differences with 1 MPI rank? (Or much smaller ones?). >>>> >>>> >>>> >>>> > On May 3, 2023, at 8:42 PM, Mark Lohry wrote: >>>> > >>>> > I'm running multiple iterations of newtonls with an MFFD/JFNK >>>> nonlinear solver where I give it the sparsity. PC asm, KSP gmres, with >>>> SNESSetLagJacobian -2 (compute once and then frozen jacobian). >>>> > >>>> > I'm seeing slight (<1%) but nonzero differences in residuals from run >>>> to run. I'm wondering where randomness might enter here -- does the >>>> jacobian coloring use a random seed? >>>> >>>> >>> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mlohry at gmail.com Thu May 4 07:31:45 2023 From: mlohry at gmail.com (Mark Lohry) Date: Thu, 4 May 2023 08:31:45 -0400 Subject: [petsc-users] sources of floating point randomness in JFNK in serial In-Reply-To: References: <5318727D-B9F9-48BC-A7CE-94EBDB08566F@petsc.dev> Message-ID: > > Please send the output of -snes_view. > pasted below. anything stand out? SNES Object: 1 MPI process type: newtonls maximum iterations=1, maximum function evaluations=-1 tolerances: relative=0.1, absolute=1e-15, solution=1e-15 total number of linear solver iterations=20 total number of function evaluations=22 norm schedule ALWAYS Jacobian is never rebuilt Jacobian is applied matrix-free with differencing Preconditioning Jacobian is built using finite differences with coloring SNESLineSearch Object: 1 MPI process type: basic maxstep=1.000000e+08, minlambda=1.000000e-12 tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 maximum iterations=40 KSP Object: 1 MPI process type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=20, initial guess is zero tolerances: relative=0.1, absolute=1e-15, divergence=10. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI process type: asm total subdomain blocks = 1, amount of overlap = 0 restriction/interpolation type - RESTRICT Local solver information for first block is in the following KSP and PC objects on rank 0: Use -ksp_view ::ascii_info_detail to display information for all blocks KSP Object: (sub_) 1 MPI process type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (sub_) 1 MPI process type: ilu out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 matrix ordering: natural factor fill ratio given 1., needed 1. 
Factored matrix follows: Mat Object: (sub_) 1 MPI process type: seqbaij rows=16384, cols=16384, bs=16 package used to perform factorization: petsc total: nonzeros=1277952, allocated nonzeros=1277952 block size is 16 linear system matrix = precond matrix: Mat Object: (sub_) 1 MPI process type: seqbaij rows=16384, cols=16384, bs=16 total: nonzeros=1277952, allocated nonzeros=1277952 total number of mallocs used during MatSetValues calls=0 block size is 16 linear system matrix followed by preconditioner matrix: Mat Object: 1 MPI process type: mffd rows=16384, cols=16384 Matrix-free approximation: err=1.49012e-08 (relative error in function evaluation) Using wp compute h routine Does not compute normU Mat Object: 1 MPI process type: seqbaij rows=16384, cols=16384, bs=16 total: nonzeros=1277952, allocated nonzeros=1277952 total number of mallocs used during MatSetValues calls=0 block size is 16 On Thu, May 4, 2023 at 8:30?AM Mark Adams wrote: > If you are using MG what is the coarse grid solver? > -snes_view might give you that. > > On Thu, May 4, 2023 at 8:25?AM Matthew Knepley wrote: > >> On Thu, May 4, 2023 at 8:21?AM Mark Lohry wrote: >> >>> Do they start very similarly and then slowly drift further apart? >>> >>> >>> Yes, this. I take it this sounds familiar? >>> >>> See these two examples with 20 fixed iterations pasted at the end. The >>> difference for one solve is slight (final SNES norm is identical to 5 >>> digits), but in the context I'm using it in (repeated applications to solve >>> a steady state multigrid problem, though here just one level) the >>> differences add up such that I might reach global convergence in 35 >>> iterations or 38. It's not the end of the world, but I was expecting that >>> with -np 1 these would be identical and I'm not sure where the root cause >>> would be. >>> >> >> The initial KSP residual is different, so its the PC. Please send the >> output of -snes_view. If your ASM is using direct factorization, then it >> could be randomness in whatever LU you are using. >> >> Thanks, >> >> Matt >> >> >>> 0 SNES Function norm 2.801842107848e+04 >>> 0 KSP Residual norm 4.045639499595e+01 >>> 1 KSP Residual norm 1.917999809040e+01 >>> 2 KSP Residual norm 1.616048521958e+01 >>> [...] >>> 19 KSP Residual norm 8.788043518111e-01 >>> 20 KSP Residual norm 6.570851270214e-01 >>> Linear solve converged due to CONVERGED_ITS iterations 20 >>> 1 SNES Function norm 1.801309983345e+03 >>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>> >>> >>> Same system, identical initial 0 SNES norm, 0 KSP is slightly different >>> >>> 0 SNES Function norm 2.801842107848e+04 >>> 0 KSP Residual norm 4.045639473002e+01 >>> 1 KSP Residual norm 1.917999883034e+01 >>> 2 KSP Residual norm 1.616048572016e+01 >>> [...] >>> 19 KSP Residual norm 8.788046348957e-01 >>> 20 KSP Residual norm 6.570859588610e-01 >>> Linear solve converged due to CONVERGED_ITS iterations 20 >>> 1 SNES Function norm 1.801311320322e+03 >>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>> >>> On Wed, May 3, 2023 at 11:05?PM Barry Smith wrote: >>> >>>> >>>> Do they start very similarly and then slowly drift further apart? >>>> That is the first couple of KSP iterations they are almost identical but >>>> then for each iteration get a bit further. Similar for the SNES iterations, >>>> starting close and then for more iterations and more solves they start >>>> moving apart. Or do they suddenly jump to be very different? 
You can run >>>> with -snes_monitor -ksp_monitor >>>> >>>> On May 3, 2023, at 9:07 PM, Mark Lohry wrote: >>>> >>>> This is on a single MPI rank. I haven't checked the coloring, was just >>>> guessing there. But the solutions/residuals are slightly different from run >>>> to run. >>>> >>>> Fair to say that for serial JFNK/asm ilu0/gmres we should expect >>>> bitwise identical results? >>>> >>>> >>>> On Wed, May 3, 2023, 8:50 PM Barry Smith wrote: >>>> >>>>> >>>>> No, the coloring should be identical every time. Do you see >>>>> differences with 1 MPI rank? (Or much smaller ones?). >>>>> >>>>> >>>>> >>>>> > On May 3, 2023, at 8:42 PM, Mark Lohry wrote: >>>>> > >>>>> > I'm running multiple iterations of newtonls with an MFFD/JFNK >>>>> nonlinear solver where I give it the sparsity. PC asm, KSP gmres, with >>>>> SNESSetLagJacobian -2 (compute once and then frozen jacobian). >>>>> > >>>>> > I'm seeing slight (<1%) but nonzero differences in residuals from >>>>> run to run. I'm wondering where randomness might enter here -- does the >>>>> jacobian coloring use a random seed? >>>>> >>>>> >>>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu May 4 07:34:59 2023 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 4 May 2023 08:34:59 -0400 Subject: [petsc-users] sources of floating point randomness in JFNK in serial In-Reply-To: References: <5318727D-B9F9-48BC-A7CE-94EBDB08566F@petsc.dev> Message-ID: On Thu, May 4, 2023 at 8:31?AM Mark Lohry wrote: > Please send the output of -snes_view. >> > pasted below. anything stand out? > Try -pc_type none. If the first KSP residual is different, then it is something in the MatMFFD, and then I would guess it is something variable in the residual evaluation routine. That should be easy to check by printing norms of the residual each time it is evaluated. Thanks, Matt > SNES Object: 1 MPI process > type: newtonls > maximum iterations=1, maximum function evaluations=-1 > tolerances: relative=0.1, absolute=1e-15, solution=1e-15 > total number of linear solver iterations=20 > total number of function evaluations=22 > norm schedule ALWAYS > Jacobian is never rebuilt > Jacobian is applied matrix-free with differencing > Preconditioning Jacobian is built using finite differences with coloring > SNESLineSearch Object: 1 MPI process > type: basic > maxstep=1.000000e+08, minlambda=1.000000e-12 > tolerances: relative=1.000000e-08, absolute=1.000000e-15, > lambda=1.000000e-08 > maximum iterations=40 > KSP Object: 1 MPI process > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=20, initial guess is zero > tolerances: relative=0.1, absolute=1e-15, divergence=10. 
> left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI process > type: asm > total subdomain blocks = 1, amount of overlap = 0 > restriction/interpolation type - RESTRICT > Local solver information for first block is in the following KSP and > PC objects on rank 0: > Use -ksp_view ::ascii_info_detail to display information for all > blocks > KSP Object: (sub_) 1 MPI process > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (sub_) 1 MPI process > type: ilu > out-of-place factorization > 0 levels of fill > tolerance for zero pivot 2.22045e-14 > matrix ordering: natural > factor fill ratio given 1., needed 1. > Factored matrix follows: > Mat Object: (sub_) 1 MPI process > type: seqbaij > rows=16384, cols=16384, bs=16 > package used to perform factorization: petsc > total: nonzeros=1277952, allocated nonzeros=1277952 > block size is 16 > linear system matrix = precond matrix: > Mat Object: (sub_) 1 MPI process > type: seqbaij > rows=16384, cols=16384, bs=16 > total: nonzeros=1277952, allocated nonzeros=1277952 > total number of mallocs used during MatSetValues calls=0 > block size is 16 > linear system matrix followed by preconditioner matrix: > Mat Object: 1 MPI process > type: mffd > rows=16384, cols=16384 > Matrix-free approximation: > err=1.49012e-08 (relative error in function evaluation) > Using wp compute h routine > Does not compute normU > Mat Object: 1 MPI process > type: seqbaij > rows=16384, cols=16384, bs=16 > total: nonzeros=1277952, allocated nonzeros=1277952 > total number of mallocs used during MatSetValues calls=0 > block size is 16 > > On Thu, May 4, 2023 at 8:30?AM Mark Adams wrote: > >> If you are using MG what is the coarse grid solver? >> -snes_view might give you that. >> >> On Thu, May 4, 2023 at 8:25?AM Matthew Knepley wrote: >> >>> On Thu, May 4, 2023 at 8:21?AM Mark Lohry wrote: >>> >>>> Do they start very similarly and then slowly drift further apart? >>>> >>>> >>>> Yes, this. I take it this sounds familiar? >>>> >>>> See these two examples with 20 fixed iterations pasted at the end. The >>>> difference for one solve is slight (final SNES norm is identical to 5 >>>> digits), but in the context I'm using it in (repeated applications to solve >>>> a steady state multigrid problem, though here just one level) the >>>> differences add up such that I might reach global convergence in 35 >>>> iterations or 38. It's not the end of the world, but I was expecting that >>>> with -np 1 these would be identical and I'm not sure where the root cause >>>> would be. >>>> >>> >>> The initial KSP residual is different, so its the PC. Please send the >>> output of -snes_view. If your ASM is using direct factorization, then it >>> could be randomness in whatever LU you are using. >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> 0 SNES Function norm 2.801842107848e+04 >>>> 0 KSP Residual norm 4.045639499595e+01 >>>> 1 KSP Residual norm 1.917999809040e+01 >>>> 2 KSP Residual norm 1.616048521958e+01 >>>> [...] 
>>>> 19 KSP Residual norm 8.788043518111e-01 >>>> 20 KSP Residual norm 6.570851270214e-01 >>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>> 1 SNES Function norm 1.801309983345e+03 >>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>> >>>> >>>> Same system, identical initial 0 SNES norm, 0 KSP is slightly different >>>> >>>> 0 SNES Function norm 2.801842107848e+04 >>>> 0 KSP Residual norm 4.045639473002e+01 >>>> 1 KSP Residual norm 1.917999883034e+01 >>>> 2 KSP Residual norm 1.616048572016e+01 >>>> [...] >>>> 19 KSP Residual norm 8.788046348957e-01 >>>> 20 KSP Residual norm 6.570859588610e-01 >>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>> 1 SNES Function norm 1.801311320322e+03 >>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>> >>>> On Wed, May 3, 2023 at 11:05?PM Barry Smith wrote: >>>> >>>>> >>>>> Do they start very similarly and then slowly drift further apart? >>>>> That is the first couple of KSP iterations they are almost identical but >>>>> then for each iteration get a bit further. Similar for the SNES iterations, >>>>> starting close and then for more iterations and more solves they start >>>>> moving apart. Or do they suddenly jump to be very different? You can run >>>>> with -snes_monitor -ksp_monitor >>>>> >>>>> On May 3, 2023, at 9:07 PM, Mark Lohry wrote: >>>>> >>>>> This is on a single MPI rank. I haven't checked the coloring, was just >>>>> guessing there. But the solutions/residuals are slightly different from run >>>>> to run. >>>>> >>>>> Fair to say that for serial JFNK/asm ilu0/gmres we should expect >>>>> bitwise identical results? >>>>> >>>>> >>>>> On Wed, May 3, 2023, 8:50 PM Barry Smith wrote: >>>>> >>>>>> >>>>>> No, the coloring should be identical every time. Do you see >>>>>> differences with 1 MPI rank? (Or much smaller ones?). >>>>>> >>>>>> >>>>>> >>>>>> > On May 3, 2023, at 8:42 PM, Mark Lohry wrote: >>>>>> > >>>>>> > I'm running multiple iterations of newtonls with an MFFD/JFNK >>>>>> nonlinear solver where I give it the sparsity. PC asm, KSP gmres, with >>>>>> SNESSetLagJacobian -2 (compute once and then frozen jacobian). >>>>>> > >>>>>> > I'm seeing slight (<1%) but nonzero differences in residuals from >>>>>> run to run. I'm wondering where randomness might enter here -- does the >>>>>> jacobian coloring use a random seed? >>>>>> >>>>>> >>>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Thu May 4 07:41:14 2023 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 4 May 2023 08:41:14 -0400 Subject: [petsc-users] sources of floating point randomness in JFNK in serial In-Reply-To: References: <5318727D-B9F9-48BC-A7CE-94EBDB08566F@petsc.dev> Message-ID: ASM is just the sub PC with one proc but gets weaker with more procs unless you use jacobi. (maybe I am missing something). On Thu, May 4, 2023 at 8:31?AM Mark Lohry wrote: > Please send the output of -snes_view. >> > pasted below. anything stand out? 
> > > SNES Object: 1 MPI process > type: newtonls > maximum iterations=1, maximum function evaluations=-1 > tolerances: relative=0.1, absolute=1e-15, solution=1e-15 > total number of linear solver iterations=20 > total number of function evaluations=22 > norm schedule ALWAYS > Jacobian is never rebuilt > Jacobian is applied matrix-free with differencing > Preconditioning Jacobian is built using finite differences with coloring > SNESLineSearch Object: 1 MPI process > type: basic > maxstep=1.000000e+08, minlambda=1.000000e-12 > tolerances: relative=1.000000e-08, absolute=1.000000e-15, > lambda=1.000000e-08 > maximum iterations=40 > KSP Object: 1 MPI process > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=20, initial guess is zero > tolerances: relative=0.1, absolute=1e-15, divergence=10. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI process > type: asm > total subdomain blocks = 1, amount of overlap = 0 > restriction/interpolation type - RESTRICT > Local solver information for first block is in the following KSP and > PC objects on rank 0: > Use -ksp_view ::ascii_info_detail to display information for all > blocks > KSP Object: (sub_) 1 MPI process > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (sub_) 1 MPI process > type: ilu > out-of-place factorization > 0 levels of fill > tolerance for zero pivot 2.22045e-14 > matrix ordering: natural > factor fill ratio given 1., needed 1. > Factored matrix follows: > Mat Object: (sub_) 1 MPI process > type: seqbaij > rows=16384, cols=16384, bs=16 > package used to perform factorization: petsc > total: nonzeros=1277952, allocated nonzeros=1277952 > block size is 16 > linear system matrix = precond matrix: > Mat Object: (sub_) 1 MPI process > type: seqbaij > rows=16384, cols=16384, bs=16 > total: nonzeros=1277952, allocated nonzeros=1277952 > total number of mallocs used during MatSetValues calls=0 > block size is 16 > linear system matrix followed by preconditioner matrix: > Mat Object: 1 MPI process > type: mffd > rows=16384, cols=16384 > Matrix-free approximation: > err=1.49012e-08 (relative error in function evaluation) > Using wp compute h routine > Does not compute normU > Mat Object: 1 MPI process > type: seqbaij > rows=16384, cols=16384, bs=16 > total: nonzeros=1277952, allocated nonzeros=1277952 > total number of mallocs used during MatSetValues calls=0 > block size is 16 > > On Thu, May 4, 2023 at 8:30?AM Mark Adams wrote: > >> If you are using MG what is the coarse grid solver? >> -snes_view might give you that. >> >> On Thu, May 4, 2023 at 8:25?AM Matthew Knepley wrote: >> >>> On Thu, May 4, 2023 at 8:21?AM Mark Lohry wrote: >>> >>>> Do they start very similarly and then slowly drift further apart? >>>> >>>> >>>> Yes, this. I take it this sounds familiar? >>>> >>>> See these two examples with 20 fixed iterations pasted at the end. The >>>> difference for one solve is slight (final SNES norm is identical to 5 >>>> digits), but in the context I'm using it in (repeated applications to solve >>>> a steady state multigrid problem, though here just one level) the >>>> differences add up such that I might reach global convergence in 35 >>>> iterations or 38. 
It's not the end of the world, but I was expecting that >>>> with -np 1 these would be identical and I'm not sure where the root cause >>>> would be. >>>> >>> >>> The initial KSP residual is different, so its the PC. Please send the >>> output of -snes_view. If your ASM is using direct factorization, then it >>> could be randomness in whatever LU you are using. >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> 0 SNES Function norm 2.801842107848e+04 >>>> 0 KSP Residual norm 4.045639499595e+01 >>>> 1 KSP Residual norm 1.917999809040e+01 >>>> 2 KSP Residual norm 1.616048521958e+01 >>>> [...] >>>> 19 KSP Residual norm 8.788043518111e-01 >>>> 20 KSP Residual norm 6.570851270214e-01 >>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>> 1 SNES Function norm 1.801309983345e+03 >>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>> >>>> >>>> Same system, identical initial 0 SNES norm, 0 KSP is slightly different >>>> >>>> 0 SNES Function norm 2.801842107848e+04 >>>> 0 KSP Residual norm 4.045639473002e+01 >>>> 1 KSP Residual norm 1.917999883034e+01 >>>> 2 KSP Residual norm 1.616048572016e+01 >>>> [...] >>>> 19 KSP Residual norm 8.788046348957e-01 >>>> 20 KSP Residual norm 6.570859588610e-01 >>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>> 1 SNES Function norm 1.801311320322e+03 >>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>> >>>> On Wed, May 3, 2023 at 11:05?PM Barry Smith wrote: >>>> >>>>> >>>>> Do they start very similarly and then slowly drift further apart? >>>>> That is the first couple of KSP iterations they are almost identical but >>>>> then for each iteration get a bit further. Similar for the SNES iterations, >>>>> starting close and then for more iterations and more solves they start >>>>> moving apart. Or do they suddenly jump to be very different? You can run >>>>> with -snes_monitor -ksp_monitor >>>>> >>>>> On May 3, 2023, at 9:07 PM, Mark Lohry wrote: >>>>> >>>>> This is on a single MPI rank. I haven't checked the coloring, was just >>>>> guessing there. But the solutions/residuals are slightly different from run >>>>> to run. >>>>> >>>>> Fair to say that for serial JFNK/asm ilu0/gmres we should expect >>>>> bitwise identical results? >>>>> >>>>> >>>>> On Wed, May 3, 2023, 8:50 PM Barry Smith wrote: >>>>> >>>>>> >>>>>> No, the coloring should be identical every time. Do you see >>>>>> differences with 1 MPI rank? (Or much smaller ones?). >>>>>> >>>>>> >>>>>> >>>>>> > On May 3, 2023, at 8:42 PM, Mark Lohry wrote: >>>>>> > >>>>>> > I'm running multiple iterations of newtonls with an MFFD/JFNK >>>>>> nonlinear solver where I give it the sparsity. PC asm, KSP gmres, with >>>>>> SNESSetLagJacobian -2 (compute once and then frozen jacobian). >>>>>> > >>>>>> > I'm seeing slight (<1%) but nonzero differences in residuals from >>>>>> run to run. I'm wondering where randomness might enter here -- does the >>>>>> jacobian coloring use a random seed? >>>>>> >>>>>> >>>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> -------------- next part -------------- An HTML attachment was scrubbed... 
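On the earlier suggestion of checking whether the residual evaluation itself is reproducible: a minimal sketch of such a check is below. SNESComputeFunction() simply calls whatever residual routine has been registered with the SNES; the function name and the use of the infinity norm are just illustrative choices.

#include <petscsnes.h>

/* Evaluate the registered residual twice at the same state x and report the
   difference; anything nonzero means the residual routine itself is not
   bitwise reproducible. */
PetscErrorCode CheckResidualReproducibility(SNES snes, Vec x)
{
  Vec       f1, f2;
  PetscReal diff;

  PetscFunctionBeginUser;
  PetscCall(VecDuplicate(x, &f1));
  PetscCall(VecDuplicate(x, &f2));
  PetscCall(SNESComputeFunction(snes, x, f1));
  PetscCall(SNESComputeFunction(snes, x, f2));
  PetscCall(VecAXPY(f2, -1.0, f1));   /* f2 <- f2 - f1 */
  PetscCall(VecNorm(f2, NORM_INFINITY, &diff));
  PetscCall(PetscPrintf(PetscObjectComm((PetscObject)snes),
                        "max difference between two evaluations of F(x): %g\n", (double)diff));
  PetscCall(VecDestroy(&f1));
  PetscCall(VecDestroy(&f2));
  PetscFunctionReturn(PETSC_SUCCESS);
}

The same idea applied to the preconditioner (apply it twice to the same vector and diff the results) helps narrow down whether the drift enters through the MFFD differencing, the ILU factors, or the residual itself.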
URL: From mlohry at gmail.com Thu May 4 07:54:32 2023 From: mlohry at gmail.com (Mark Lohry) Date: Thu, 4 May 2023 08:54:32 -0400 Subject: [petsc-users] sources of floating point randomness in JFNK in serial In-Reply-To: References: <5318727D-B9F9-48BC-A7CE-94EBDB08566F@petsc.dev> Message-ID: > > Try -pc_type none. > With -pc_type none the 0 KSP residual looks identical. But *sometimes* it's producing exactly the same history and others it's gradually changing. I'm reasonably confident my residual evaluation has no randomness, see info after the petsc output. solve history 1: 0 SNES Function norm 3.424003312857e+04 0 KSP Residual norm 3.424003312857e+04 1 KSP Residual norm 2.871734444536e+04 2 KSP Residual norm 2.490276931041e+04 ... 20 KSP Residual norm 7.449686034356e+03 Linear solve converged due to CONVERGED_ITS iterations 20 1 SNES Function norm 1.085015821006e+04 solve history 2, identical to 1: 0 SNES Function norm 3.424003312857e+04 0 KSP Residual norm 3.424003312857e+04 1 KSP Residual norm 2.871734444536e+04 2 KSP Residual norm 2.490276931041e+04 ... 20 KSP Residual norm 7.449686034356e+03 Linear solve converged due to CONVERGED_ITS iterations 20 1 SNES Function norm 1.085015821006e+04 solve history 3, identical KSP at 0 and 1, slight change at 2, growing difference to the end: 0 SNES Function norm 3.424003312857e+04 0 KSP Residual norm 3.424003312857e+04 1 KSP Residual norm 2.871734444536e+04 2 KSP Residual norm 2.490276930242e+04 ... 20 KSP Residual norm 7.449686095424e+03 Linear solve converged due to CONVERGED_ITS iterations 20 1 SNES Function norm 1.085015646971e+04 Ths is using a standard explicit 3-stage Runge-Kutta smoother for 10 iterations, so 30 calls of the same residual evaluation, identical residuals every time run 1: # iteration rho rhou rhov rhoE abs_res rel_res umin vmax vmin elapsed_time # 1.00000e+00 1.086860616292e+00 2.782316758416e+02 4.482867643761e+00 2.993435920340e+02 2.04353e+02 1.00000e+00 -8.23945e-15 -6.15326e-15 -1.35563e-14 6.34834e-01 2.00000e+00 2.310547487017e+00 1.079059352425e+02 3.958323921837e+00 5.058927165686e+02 2.58647e+02 1.26568e+00 -1.02539e-14 -9.35368e-15 -1.69925e-14 6.40063e-01 3.00000e+00 2.361005867444e+00 5.706213331683e+01 6.130016323357e+00 4.688968362579e+02 2.36201e+02 1.15585e+00 -1.19370e-14 -1.15216e-14 -1.59733e-14 6.45166e-01 4.00000e+00 2.167518999963e+00 3.757541401594e+01 6.313917437428e+00 4.054310291628e+02 2.03612e+02 9.96372e-01 -1.81831e-14 -1.28312e-14 -1.46238e-14 6.50494e-01 5.00000e+00 1.941443738676e+00 2.884190334049e+01 6.237106158479e+00 3.539201037156e+02 1.77577e+02 8.68970e-01 3.56633e-14 -8.74089e-15 -1.06666e-14 6.55656e-01 6.00000e+00 1.736947124693e+00 2.429485695670e+01 5.996962200407e+00 3.148280178142e+02 1.57913e+02 7.72745e-01 -8.98634e-14 -2.41152e-14 -1.39713e-14 6.60872e-01 7.00000e+00 1.564153212635e+00 2.149609219810e+01 5.786910705204e+00 2.848717011033e+02 1.42872e+02 6.99144e-01 -2.95352e-13 -2.48158e-14 -2.39351e-14 6.66041e-01 8.00000e+00 1.419280815384e+00 1.950619804089e+01 5.627281158306e+00 2.606623371229e+02 1.30728e+02 6.39715e-01 8.98941e-13 1.09674e-13 3.78905e-14 6.71316e-01 9.00000e+00 1.296115915975e+00 1.794843530745e+01 5.514933264437e+00 2.401524522393e+02 1.20444e+02 5.89394e-01 1.70717e-12 1.38762e-14 1.09825e-13 6.76447e-01 1.00000e+01 1.189639693918e+00 1.665381754953e+01 5.433183087037e+00 2.222572900473e+02 1.11475e+02 5.45501e-01 -4.22462e-12 -7.15206e-13 -2.28736e-13 6.81716e-01 run N: # # iteration rho rhou rhov rhoE abs_res rel_res umin vmax vmin elapsed_time 
# 1.00000e+00 1.086860616292e+00 2.782316758416e+02 4.482867643761e+00 2.993435920340e+02 2.04353e+02 1.00000e+00 -8.23945e-15 -6.15326e-15 -1.35563e-14 6.23316e-01 2.00000e+00 2.310547487017e+00 1.079059352425e+02 3.958323921837e+00 5.058927165686e+02 2.58647e+02 1.26568e+00 -1.02539e-14 -9.35368e-15 -1.69925e-14 6.28510e-01 3.00000e+00 2.361005867444e+00 5.706213331683e+01 6.130016323357e+00 4.688968362579e+02 2.36201e+02 1.15585e+00 -1.19370e-14 -1.15216e-14 -1.59733e-14 6.33558e-01 4.00000e+00 2.167518999963e+00 3.757541401594e+01 6.313917437428e+00 4.054310291628e+02 2.03612e+02 9.96372e-01 -1.81831e-14 -1.28312e-14 -1.46238e-14 6.38773e-01 5.00000e+00 1.941443738676e+00 2.884190334049e+01 6.237106158479e+00 3.539201037156e+02 1.77577e+02 8.68970e-01 3.56633e-14 -8.74089e-15 -1.06666e-14 6.43887e-01 6.00000e+00 1.736947124693e+00 2.429485695670e+01 5.996962200407e+00 3.148280178142e+02 1.57913e+02 7.72745e-01 -8.98634e-14 -2.41152e-14 -1.39713e-14 6.49073e-01 7.00000e+00 1.564153212635e+00 2.149609219810e+01 5.786910705204e+00 2.848717011033e+02 1.42872e+02 6.99144e-01 -2.95352e-13 -2.48158e-14 -2.39351e-14 6.54167e-01 8.00000e+00 1.419280815384e+00 1.950619804089e+01 5.627281158306e+00 2.606623371229e+02 1.30728e+02 6.39715e-01 8.98941e-13 1.09674e-13 3.78905e-14 6.59394e-01 9.00000e+00 1.296115915975e+00 1.794843530745e+01 5.514933264437e+00 2.401524522393e+02 1.20444e+02 5.89394e-01 1.70717e-12 1.38762e-14 1.09825e-13 6.64516e-01 1.00000e+01 1.189639693918e+00 1.665381754953e+01 5.433183087037e+00 2.222572900473e+02 1.11475e+02 5.45501e-01 -4.22462e-12 -7.15206e-13 -2.28736e-13 6.69677e-01 On Thu, May 4, 2023 at 8:41?AM Mark Adams wrote: > ASM is just the sub PC with one proc but gets weaker with more procs > unless you use jacobi. (maybe I am missing something). > > On Thu, May 4, 2023 at 8:31?AM Mark Lohry wrote: > >> Please send the output of -snes_view. >>> >> pasted below. anything stand out? >> >> >> SNES Object: 1 MPI process >> type: newtonls >> maximum iterations=1, maximum function evaluations=-1 >> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >> total number of linear solver iterations=20 >> total number of function evaluations=22 >> norm schedule ALWAYS >> Jacobian is never rebuilt >> Jacobian is applied matrix-free with differencing >> Preconditioning Jacobian is built using finite differences with coloring >> SNESLineSearch Object: 1 MPI process >> type: basic >> maxstep=1.000000e+08, minlambda=1.000000e-12 >> tolerances: relative=1.000000e-08, absolute=1.000000e-15, >> lambda=1.000000e-08 >> maximum iterations=40 >> KSP Object: 1 MPI process >> type: gmres >> restart=30, using Classical (unmodified) Gram-Schmidt >> Orthogonalization with no iterative refinement >> happy breakdown tolerance 1e-30 >> maximum iterations=20, initial guess is zero >> tolerances: relative=0.1, absolute=1e-15, divergence=10. >> left preconditioning >> using PRECONDITIONED norm type for convergence test >> PC Object: 1 MPI process >> type: asm >> total subdomain blocks = 1, amount of overlap = 0 >> restriction/interpolation type - RESTRICT >> Local solver information for first block is in the following KSP >> and PC objects on rank 0: >> Use -ksp_view ::ascii_info_detail to display information for all >> blocks >> KSP Object: (sub_) 1 MPI process >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
>> left preconditioning >> using NONE norm type for convergence test >> PC Object: (sub_) 1 MPI process >> type: ilu >> out-of-place factorization >> 0 levels of fill >> tolerance for zero pivot 2.22045e-14 >> matrix ordering: natural >> factor fill ratio given 1., needed 1. >> Factored matrix follows: >> Mat Object: (sub_) 1 MPI process >> type: seqbaij >> rows=16384, cols=16384, bs=16 >> package used to perform factorization: petsc >> total: nonzeros=1277952, allocated nonzeros=1277952 >> block size is 16 >> linear system matrix = precond matrix: >> Mat Object: (sub_) 1 MPI process >> type: seqbaij >> rows=16384, cols=16384, bs=16 >> total: nonzeros=1277952, allocated nonzeros=1277952 >> total number of mallocs used during MatSetValues calls=0 >> block size is 16 >> linear system matrix followed by preconditioner matrix: >> Mat Object: 1 MPI process >> type: mffd >> rows=16384, cols=16384 >> Matrix-free approximation: >> err=1.49012e-08 (relative error in function evaluation) >> Using wp compute h routine >> Does not compute normU >> Mat Object: 1 MPI process >> type: seqbaij >> rows=16384, cols=16384, bs=16 >> total: nonzeros=1277952, allocated nonzeros=1277952 >> total number of mallocs used during MatSetValues calls=0 >> block size is 16 >> >> On Thu, May 4, 2023 at 8:30?AM Mark Adams wrote: >> >>> If you are using MG what is the coarse grid solver? >>> -snes_view might give you that. >>> >>> On Thu, May 4, 2023 at 8:25?AM Matthew Knepley >>> wrote: >>> >>>> On Thu, May 4, 2023 at 8:21?AM Mark Lohry wrote: >>>> >>>>> Do they start very similarly and then slowly drift further apart? >>>>> >>>>> >>>>> Yes, this. I take it this sounds familiar? >>>>> >>>>> See these two examples with 20 fixed iterations pasted at the end. The >>>>> difference for one solve is slight (final SNES norm is identical to 5 >>>>> digits), but in the context I'm using it in (repeated applications to solve >>>>> a steady state multigrid problem, though here just one level) the >>>>> differences add up such that I might reach global convergence in 35 >>>>> iterations or 38. It's not the end of the world, but I was expecting that >>>>> with -np 1 these would be identical and I'm not sure where the root cause >>>>> would be. >>>>> >>>> >>>> The initial KSP residual is different, so its the PC. Please send the >>>> output of -snes_view. If your ASM is using direct factorization, then it >>>> could be randomness in whatever LU you are using. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> 0 SNES Function norm 2.801842107848e+04 >>>>> 0 KSP Residual norm 4.045639499595e+01 >>>>> 1 KSP Residual norm 1.917999809040e+01 >>>>> 2 KSP Residual norm 1.616048521958e+01 >>>>> [...] >>>>> 19 KSP Residual norm 8.788043518111e-01 >>>>> 20 KSP Residual norm 6.570851270214e-01 >>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>> 1 SNES Function norm 1.801309983345e+03 >>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>> >>>>> >>>>> Same system, identical initial 0 SNES norm, 0 KSP is slightly different >>>>> >>>>> 0 SNES Function norm 2.801842107848e+04 >>>>> 0 KSP Residual norm 4.045639473002e+01 >>>>> 1 KSP Residual norm 1.917999883034e+01 >>>>> 2 KSP Residual norm 1.616048572016e+01 >>>>> [...] 
>>>>> 19 KSP Residual norm 8.788046348957e-01 >>>>> 20 KSP Residual norm 6.570859588610e-01 >>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>> 1 SNES Function norm 1.801311320322e+03 >>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>> >>>>> On Wed, May 3, 2023 at 11:05?PM Barry Smith wrote: >>>>> >>>>>> >>>>>> Do they start very similarly and then slowly drift further apart? >>>>>> That is the first couple of KSP iterations they are almost identical but >>>>>> then for each iteration get a bit further. Similar for the SNES iterations, >>>>>> starting close and then for more iterations and more solves they start >>>>>> moving apart. Or do they suddenly jump to be very different? You can run >>>>>> with -snes_monitor -ksp_monitor >>>>>> >>>>>> On May 3, 2023, at 9:07 PM, Mark Lohry wrote: >>>>>> >>>>>> This is on a single MPI rank. I haven't checked the coloring, was >>>>>> just guessing there. But the solutions/residuals are slightly different >>>>>> from run to run. >>>>>> >>>>>> Fair to say that for serial JFNK/asm ilu0/gmres we should expect >>>>>> bitwise identical results? >>>>>> >>>>>> >>>>>> On Wed, May 3, 2023, 8:50 PM Barry Smith wrote: >>>>>> >>>>>>> >>>>>>> No, the coloring should be identical every time. Do you see >>>>>>> differences with 1 MPI rank? (Or much smaller ones?). >>>>>>> >>>>>>> >>>>>>> >>>>>>> > On May 3, 2023, at 8:42 PM, Mark Lohry wrote: >>>>>>> > >>>>>>> > I'm running multiple iterations of newtonls with an MFFD/JFNK >>>>>>> nonlinear solver where I give it the sparsity. PC asm, KSP gmres, with >>>>>>> SNESSetLagJacobian -2 (compute once and then frozen jacobian). >>>>>>> > >>>>>>> > I'm seeing slight (<1%) but nonzero differences in residuals from >>>>>>> run to run. I'm wondering where randomness might enter here -- does the >>>>>>> jacobian coloring use a random seed? >>>>>>> >>>>>>> >>>>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Thu May 4 08:02:25 2023 From: dave.mayhem23 at gmail.com (Dave May) Date: Thu, 4 May 2023 06:02:25 -0700 Subject: [petsc-users] sources of floating point randomness in JFNK in serial In-Reply-To: References: <5318727D-B9F9-48BC-A7CE-94EBDB08566F@petsc.dev> Message-ID: Is your code valgrind clean? On Thu 4. May 2023 at 05:54, Mark Lohry wrote: > Try -pc_type none. >> > > With -pc_type none the 0 KSP residual looks identical. But *sometimes* > it's producing exactly the same history and others it's gradually > changing. I'm reasonably confident my residual evaluation has no > randomness, see info after the petsc output. > > solve history 1: > > 0 SNES Function norm 3.424003312857e+04 > 0 KSP Residual norm 3.424003312857e+04 > 1 KSP Residual norm 2.871734444536e+04 > 2 KSP Residual norm 2.490276931041e+04 > ... > 20 KSP Residual norm 7.449686034356e+03 > Linear solve converged due to CONVERGED_ITS iterations 20 > 1 SNES Function norm 1.085015821006e+04 > > solve history 2, identical to 1: > > 0 SNES Function norm 3.424003312857e+04 > 0 KSP Residual norm 3.424003312857e+04 > 1 KSP Residual norm 2.871734444536e+04 > 2 KSP Residual norm 2.490276931041e+04 > ... 
> 20 KSP Residual norm 7.449686034356e+03 > Linear solve converged due to CONVERGED_ITS iterations 20 > 1 SNES Function norm 1.085015821006e+04 > > solve history 3, identical KSP at 0 and 1, slight change at 2, growing > difference to the end: > 0 SNES Function norm 3.424003312857e+04 > 0 KSP Residual norm 3.424003312857e+04 > 1 KSP Residual norm 2.871734444536e+04 > 2 KSP Residual norm 2.490276930242e+04 > ... > 20 KSP Residual norm 7.449686095424e+03 > Linear solve converged due to CONVERGED_ITS iterations 20 > 1 SNES Function norm 1.085015646971e+04 > > > Ths is using a standard explicit 3-stage Runge-Kutta smoother for 10 > iterations, so 30 calls of the same residual evaluation, identical > residuals every time > > run 1: > > # iteration rho rhou rhov > rhoE abs_res rel_res umin > vmax vmin elapsed_time > # > > > 1.00000e+00 1.086860616292e+00 2.782316758416e+02 > 4.482867643761e+00 2.993435920340e+02 2.04353e+02 > 1.00000e+00 -8.23945e-15 -6.15326e-15 -1.35563e-14 > 6.34834e-01 > 2.00000e+00 2.310547487017e+00 1.079059352425e+02 > 3.958323921837e+00 5.058927165686e+02 2.58647e+02 > 1.26568e+00 -1.02539e-14 -9.35368e-15 -1.69925e-14 > 6.40063e-01 > 3.00000e+00 2.361005867444e+00 5.706213331683e+01 > 6.130016323357e+00 4.688968362579e+02 2.36201e+02 > 1.15585e+00 -1.19370e-14 -1.15216e-14 -1.59733e-14 > 6.45166e-01 > 4.00000e+00 2.167518999963e+00 3.757541401594e+01 > 6.313917437428e+00 4.054310291628e+02 2.03612e+02 > 9.96372e-01 -1.81831e-14 -1.28312e-14 -1.46238e-14 > 6.50494e-01 > 5.00000e+00 1.941443738676e+00 2.884190334049e+01 > 6.237106158479e+00 3.539201037156e+02 1.77577e+02 > 8.68970e-01 3.56633e-14 -8.74089e-15 -1.06666e-14 > 6.55656e-01 > 6.00000e+00 1.736947124693e+00 2.429485695670e+01 > 5.996962200407e+00 3.148280178142e+02 1.57913e+02 > 7.72745e-01 -8.98634e-14 -2.41152e-14 -1.39713e-14 > 6.60872e-01 > 7.00000e+00 1.564153212635e+00 2.149609219810e+01 > 5.786910705204e+00 2.848717011033e+02 1.42872e+02 > 6.99144e-01 -2.95352e-13 -2.48158e-14 -2.39351e-14 > 6.66041e-01 > 8.00000e+00 1.419280815384e+00 1.950619804089e+01 > 5.627281158306e+00 2.606623371229e+02 1.30728e+02 > 6.39715e-01 8.98941e-13 1.09674e-13 3.78905e-14 > 6.71316e-01 > 9.00000e+00 1.296115915975e+00 1.794843530745e+01 > 5.514933264437e+00 2.401524522393e+02 1.20444e+02 > 5.89394e-01 1.70717e-12 1.38762e-14 1.09825e-13 > 6.76447e-01 > 1.00000e+01 1.189639693918e+00 1.665381754953e+01 > 5.433183087037e+00 2.222572900473e+02 1.11475e+02 > 5.45501e-01 -4.22462e-12 -7.15206e-13 -2.28736e-13 > 6.81716e-01 > > run N: > > > # > > > # iteration rho rhou rhov > rhoE abs_res rel_res umin > vmax vmin elapsed_time > # > > > 1.00000e+00 1.086860616292e+00 2.782316758416e+02 > 4.482867643761e+00 2.993435920340e+02 2.04353e+02 > 1.00000e+00 -8.23945e-15 -6.15326e-15 -1.35563e-14 > 6.23316e-01 > 2.00000e+00 2.310547487017e+00 1.079059352425e+02 > 3.958323921837e+00 5.058927165686e+02 2.58647e+02 > 1.26568e+00 -1.02539e-14 -9.35368e-15 -1.69925e-14 > 6.28510e-01 > 3.00000e+00 2.361005867444e+00 5.706213331683e+01 > 6.130016323357e+00 4.688968362579e+02 2.36201e+02 > 1.15585e+00 -1.19370e-14 -1.15216e-14 -1.59733e-14 > 6.33558e-01 > 4.00000e+00 2.167518999963e+00 3.757541401594e+01 > 6.313917437428e+00 4.054310291628e+02 2.03612e+02 > 9.96372e-01 -1.81831e-14 -1.28312e-14 -1.46238e-14 > 6.38773e-01 > 5.00000e+00 1.941443738676e+00 2.884190334049e+01 > 6.237106158479e+00 3.539201037156e+02 1.77577e+02 > 8.68970e-01 3.56633e-14 -8.74089e-15 -1.06666e-14 > 6.43887e-01 > 6.00000e+00 1.736947124693e+00 2.429485695670e+01 > 
5.996962200407e+00 3.148280178142e+02 1.57913e+02 > 7.72745e-01 -8.98634e-14 -2.41152e-14 -1.39713e-14 > 6.49073e-01 > 7.00000e+00 1.564153212635e+00 2.149609219810e+01 > 5.786910705204e+00 2.848717011033e+02 1.42872e+02 > 6.99144e-01 -2.95352e-13 -2.48158e-14 -2.39351e-14 > 6.54167e-01 > 8.00000e+00 1.419280815384e+00 1.950619804089e+01 > 5.627281158306e+00 2.606623371229e+02 1.30728e+02 > 6.39715e-01 8.98941e-13 1.09674e-13 3.78905e-14 > 6.59394e-01 > 9.00000e+00 1.296115915975e+00 1.794843530745e+01 > 5.514933264437e+00 2.401524522393e+02 1.20444e+02 > 5.89394e-01 1.70717e-12 1.38762e-14 1.09825e-13 > 6.64516e-01 > 1.00000e+01 1.189639693918e+00 1.665381754953e+01 > 5.433183087037e+00 2.222572900473e+02 1.11475e+02 > 5.45501e-01 -4.22462e-12 -7.15206e-13 -2.28736e-13 > 6.69677e-01 > > > > > > On Thu, May 4, 2023 at 8:41?AM Mark Adams wrote: > >> ASM is just the sub PC with one proc but gets weaker with more procs >> unless you use jacobi. (maybe I am missing something). >> >> On Thu, May 4, 2023 at 8:31?AM Mark Lohry wrote: >> >>> Please send the output of -snes_view. >>>> >>> pasted below. anything stand out? >>> >>> >>> SNES Object: 1 MPI process >>> type: newtonls >>> maximum iterations=1, maximum function evaluations=-1 >>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>> total number of linear solver iterations=20 >>> total number of function evaluations=22 >>> norm schedule ALWAYS >>> Jacobian is never rebuilt >>> Jacobian is applied matrix-free with differencing >>> Preconditioning Jacobian is built using finite differences with >>> coloring >>> SNESLineSearch Object: 1 MPI process >>> type: basic >>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, >>> lambda=1.000000e-08 >>> maximum iterations=40 >>> KSP Object: 1 MPI process >>> type: gmres >>> restart=30, using Classical (unmodified) Gram-Schmidt >>> Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=20, initial guess is zero >>> tolerances: relative=0.1, absolute=1e-15, divergence=10. >>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI process >>> type: asm >>> total subdomain blocks = 1, amount of overlap = 0 >>> restriction/interpolation type - RESTRICT >>> Local solver information for first block is in the following KSP >>> and PC objects on rank 0: >>> Use -ksp_view ::ascii_info_detail to display information for all >>> blocks >>> KSP Object: (sub_) 1 MPI process >>> type: preonly >>> maximum iterations=10000, initial guess is zero >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>> left preconditioning >>> using NONE norm type for convergence test >>> PC Object: (sub_) 1 MPI process >>> type: ilu >>> out-of-place factorization >>> 0 levels of fill >>> tolerance for zero pivot 2.22045e-14 >>> matrix ordering: natural >>> factor fill ratio given 1., needed 1. 
>>> Factored matrix follows: >>> Mat Object: (sub_) 1 MPI process >>> type: seqbaij >>> rows=16384, cols=16384, bs=16 >>> package used to perform factorization: petsc >>> total: nonzeros=1277952, allocated nonzeros=1277952 >>> block size is 16 >>> linear system matrix = precond matrix: >>> Mat Object: (sub_) 1 MPI process >>> type: seqbaij >>> rows=16384, cols=16384, bs=16 >>> total: nonzeros=1277952, allocated nonzeros=1277952 >>> total number of mallocs used during MatSetValues calls=0 >>> block size is 16 >>> linear system matrix followed by preconditioner matrix: >>> Mat Object: 1 MPI process >>> type: mffd >>> rows=16384, cols=16384 >>> Matrix-free approximation: >>> err=1.49012e-08 (relative error in function evaluation) >>> Using wp compute h routine >>> Does not compute normU >>> Mat Object: 1 MPI process >>> type: seqbaij >>> rows=16384, cols=16384, bs=16 >>> total: nonzeros=1277952, allocated nonzeros=1277952 >>> total number of mallocs used during MatSetValues calls=0 >>> block size is 16 >>> >>> On Thu, May 4, 2023 at 8:30?AM Mark Adams wrote: >>> >>>> If you are using MG what is the coarse grid solver? >>>> -snes_view might give you that. >>>> >>>> On Thu, May 4, 2023 at 8:25?AM Matthew Knepley >>>> wrote: >>>> >>>>> On Thu, May 4, 2023 at 8:21?AM Mark Lohry wrote: >>>>> >>>>>> Do they start very similarly and then slowly drift further apart? >>>>>> >>>>>> >>>>>> Yes, this. I take it this sounds familiar? >>>>>> >>>>>> See these two examples with 20 fixed iterations pasted at the end. >>>>>> The difference for one solve is slight (final SNES norm is identical to 5 >>>>>> digits), but in the context I'm using it in (repeated applications to solve >>>>>> a steady state multigrid problem, though here just one level) the >>>>>> differences add up such that I might reach global convergence in 35 >>>>>> iterations or 38. It's not the end of the world, but I was expecting that >>>>>> with -np 1 these would be identical and I'm not sure where the root cause >>>>>> would be. >>>>>> >>>>> >>>>> The initial KSP residual is different, so its the PC. Please send the >>>>> output of -snes_view. If your ASM is using direct factorization, then it >>>>> could be randomness in whatever LU you are using. >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> 0 SNES Function norm 2.801842107848e+04 >>>>>> 0 KSP Residual norm 4.045639499595e+01 >>>>>> 1 KSP Residual norm 1.917999809040e+01 >>>>>> 2 KSP Residual norm 1.616048521958e+01 >>>>>> [...] >>>>>> 19 KSP Residual norm 8.788043518111e-01 >>>>>> 20 KSP Residual norm 6.570851270214e-01 >>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>> 1 SNES Function norm 1.801309983345e+03 >>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>>> >>>>>> >>>>>> Same system, identical initial 0 SNES norm, 0 KSP is slightly >>>>>> different >>>>>> >>>>>> 0 SNES Function norm 2.801842107848e+04 >>>>>> 0 KSP Residual norm 4.045639473002e+01 >>>>>> 1 KSP Residual norm 1.917999883034e+01 >>>>>> 2 KSP Residual norm 1.616048572016e+01 >>>>>> [...] >>>>>> 19 KSP Residual norm 8.788046348957e-01 >>>>>> 20 KSP Residual norm 6.570859588610e-01 >>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>> 1 SNES Function norm 1.801311320322e+03 >>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>>> >>>>>> On Wed, May 3, 2023 at 11:05?PM Barry Smith wrote: >>>>>> >>>>>>> >>>>>>> Do they start very similarly and then slowly drift further apart? 
>>>>>>> That is the first couple of KSP iterations they are almost identical but >>>>>>> then for each iteration get a bit further. Similar for the SNES iterations, >>>>>>> starting close and then for more iterations and more solves they start >>>>>>> moving apart. Or do they suddenly jump to be very different? You can run >>>>>>> with -snes_monitor -ksp_monitor >>>>>>> >>>>>>> On May 3, 2023, at 9:07 PM, Mark Lohry wrote: >>>>>>> >>>>>>> This is on a single MPI rank. I haven't checked the coloring, was >>>>>>> just guessing there. But the solutions/residuals are slightly different >>>>>>> from run to run. >>>>>>> >>>>>>> Fair to say that for serial JFNK/asm ilu0/gmres we should expect >>>>>>> bitwise identical results? >>>>>>> >>>>>>> >>>>>>> On Wed, May 3, 2023, 8:50 PM Barry Smith wrote: >>>>>>> >>>>>>>> >>>>>>>> No, the coloring should be identical every time. Do you see >>>>>>>> differences with 1 MPI rank? (Or much smaller ones?). >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> > On May 3, 2023, at 8:42 PM, Mark Lohry wrote: >>>>>>>> > >>>>>>>> > I'm running multiple iterations of newtonls with an MFFD/JFNK >>>>>>>> nonlinear solver where I give it the sparsity. PC asm, KSP gmres, with >>>>>>>> SNESSetLagJacobian -2 (compute once and then frozen jacobian). >>>>>>>> > >>>>>>>> > I'm seeing slight (<1%) but nonzero differences in residuals from >>>>>>>> run to run. I'm wondering where randomness might enter here -- does the >>>>>>>> jacobian coloring use a random seed? >>>>>>>> >>>>>>>> >>>>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Thu May 4 08:18:59 2023 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 4 May 2023 09:18:59 -0400 Subject: [petsc-users] parallel computing error In-Reply-To: References: Message-ID: The code in ex125.c contains PetscCall(MatCreate(PETSC_COMM_WORLD, &C)); PetscCall(MatSetOptionsPrefix(C, "rhs_")); PetscCall(MatSetSizes(C, m, PETSC_DECIDE, PETSC_DECIDE, nrhs)); PetscCall(MatSetType(C, MATDENSE)); PetscCall(MatSetFromOptions(C)); PetscCall(MatSetUp(C)); This dense parallel matrix is suitable for passing to MatMatSolve() as the right-hand side matrix. Note it is created with PETSC_COMM_WORLD and its type is set to be MATDENSE. You may need to make a sample code by stripping out all the excess code in ex125.c to just create an MATAIJ and MATDENSE and solves with MatMatSolve() to determine why you code does not work. > On May 4, 2023, at 3:20 AM, ???? / ?? / ??????? wrote: > > Dear Barry Smith > > Thank you for your reply. > > I've already installed MUMPS. > > And I checked the example you said (ex125.c), I don't understand why the RHS matrix becomes the SeqDense matrix. > > Could you explain in more detail? > > Best regards > Seung Lee Kwon > > 2023? 5? 4? (?) ?? 12:08, Barry Smith >?? ??: >> >> You can configure with MUMPS ./configure --download-mumps --download-scalapack --download-ptscotch --download-metis --download-parmetis >> >> And then use MatMatSolve() as in src/mat/tests/ex125.c with parallel MatMatSolve() using MUMPS as the solver. >> >> Barry >> >> >>> On May 3, 2023, at 10:29 PM, ???? / ?? / ??????? > wrote: >>> >>> Dear developers >>> >>> Thank you for your explanation. 
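A stripped-down sample along the lines suggested above -- just a parallel MATAIJ system matrix, a parallel MATDENSE right-hand side, a MUMPS factorization, and one MatMatSolve() -- might look roughly like the sketch below. This is only an illustration, not ex125.c itself: the matrix entries, sizes, and number of right-hand sides are placeholders, and it assumes PETSc was configured with --download-mumps.

#include <petscmat.h>

int main(int argc, char **argv)
{
  Mat           A, F, B, X;
  PetscInt      n = 100, nrhs = 5, Istart, Iend;
  MatFactorInfo info;

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));

  /* Parallel AIJ system matrix: a placeholder 1D Laplacian */
  PetscCall(MatCreateAIJ(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, n, n, 3, NULL, 2, NULL, &A));
  PetscCall(MatGetOwnershipRange(A, &Istart, &Iend));
  for (PetscInt i = Istart; i < Iend; i++) {
    if (i > 0) PetscCall(MatSetValue(A, i, i - 1, -1.0, INSERT_VALUES));
    if (i < n - 1) PetscCall(MatSetValue(A, i, i + 1, -1.0, INSERT_VALUES));
    PetscCall(MatSetValue(A, i, i, 2.0, INSERT_VALUES));
  }
  PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
  PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));

  /* Parallel dense right-hand-side and solution matrices (MATDENSE on
     PETSC_COMM_WORLD), with the same row layout as A */
  PetscCall(MatCreateDense(PETSC_COMM_WORLD, Iend - Istart, PETSC_DECIDE, n, nrhs, NULL, &B));
  PetscCall(MatSetRandom(B, NULL));
  PetscCall(MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, &X));

  /* LU factorization with MUMPS, then a parallel MatMatSolve */
  PetscCall(MatGetFactor(A, MATSOLVERMUMPS, MAT_FACTOR_LU, &F));
  PetscCall(MatFactorInfoInitialize(&info));
  PetscCall(MatLUFactorSymbolic(F, A, NULL, NULL, &info));
  PetscCall(MatLUFactorNumeric(F, A, &info));
  PetscCall(MatMatSolve(F, B, X));

  PetscCall(MatDestroy(&A));
  PetscCall(MatDestroy(&B));
  PetscCall(MatDestroy(&X));
  PetscCall(MatDestroy(&F));
  PetscCall(PetscFinalize());
  return 0;
}

Checking that something like this runs under mpirun with several ranks narrows down whether the problem lies in the MatMatSolve() setup or elsewhere in the larger code.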
>>> >>> But I should use the MatCreateSeqDense because I want to use the MatMatSolve that B matrix must be a SeqDense matrix. >>> >>> Using MatMatSolve is an inevitable part of my code. >>> >>> Could you give me a comment to avoid this error? >>> >>> Best, >>> >>> Seung Lee Kwon >>> >>> 2023? 5? 3? (?) ?? 7:30, Matthew Knepley >?? ??: >>>> On Wed, May 3, 2023 at 6:05?AM ???? / ?? / ??????? > wrote: >>>>> Dear developers >>>>> >>>>> I'm trying to use parallel computing and I ran the command 'mpirun -np 4 ./app' >>>>> >>>>> In this case, there are two problems. >>>>> >>>>> First, I encountered error message >>>>> /// >>>>> [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >>>>> [1]PETSC ERROR: Invalid argument >>>>> [1]PETSC ERROR: Comm must be of size 1 >>>>> /// >>>>> The code on the error position is >>>>> MatCreateSeqDense(PETSC_COMM_SELF, nns, ns, NULL, &Kns)); >>>> >>>> 1) "Seq" means sequential, that is "not parallel". >>>> >>>> 2) This line should still be fine since PETSC_COMM_SELF is a serial communicator >>>> >>>> 3) You should be checking the error code for each call, maybe using the CHKERRQ() macro >>>> >>>> 4) Please always send the entire error message, not a snippet >>>> >>>> THanks >>>> >>>> Matt >>>> >>>>> Could "MatCreateSeqDense" not be used in parallel computing? >>>>> >>>>> Second, the same error message is repeated as many times as the number of cores. >>>>> if I use command -np 4, then the error message is repeated 4 times. >>>>> Could you recommend some advice related to this? >>>>> >>>>> Best, >>>>> Seung Lee Kwon >>>>> >>>>> -- >>>>> Seung Lee Kwon, Ph.D.Candidate >>>>> Aerospace Structures and Materials Laboratory >>>>> Department of Mechanical and Aerospace Engineering >>>>> Seoul National University >>>>> Building 300 Rm 503, Gwanak-ro 1, Gwanak-gu, Seoul, South Korea, 08826 >>>>> E-mail : ksl7912 at snu.ac.kr >>>>> Office : +82-2-880-7389 >>>>> C. P : +82-10-4695-1062 >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >>> -- >>> Seung Lee Kwon, Ph.D.Candidate >>> Aerospace Structures and Materials Laboratory >>> Department of Mechanical and Aerospace Engineering >>> Seoul National University >>> Building 300 Rm 503, Gwanak-ro 1, Gwanak-gu, Seoul, South Korea, 08826 >>> E-mail : ksl7912 at snu.ac.kr >>> Office : +82-2-880-7389 >>> C. P : +82-10-4695-1062 >> > > > -- > Seung Lee Kwon, Ph.D.Candidate > Aerospace Structures and Materials Laboratory > Department of Mechanical and Aerospace Engineering > Seoul National University > Building 300 Rm 503, Gwanak-ro 1, Gwanak-gu, Seoul, South Korea, 08826 > E-mail : ksl7912 at snu.ac.kr > Office : +82-2-880-7389 > C. P : +82-10-4695-1062 -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu May 4 09:10:42 2023 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 4 May 2023 10:10:42 -0400 Subject: [petsc-users] sources of floating point randomness in JFNK in serial In-Reply-To: References: <5318727D-B9F9-48BC-A7CE-94EBDB08566F@petsc.dev> Message-ID: On Thu, May 4, 2023 at 8:54?AM Mark Lohry wrote: > Try -pc_type none. >> > > With -pc_type none the 0 KSP residual looks identical. But *sometimes* > it's producing exactly the same history and others it's gradually > changing. 
I'm reasonably confident my residual evaluation has no > randomness, see info after the petsc output. > We can try and test this. Replace your MatMFFD with an actual matrix and run. Do you see any variability? If not, then it could be your routine, or it could be MatMFFD. So run a few with -snes_view, and we can see if the "w" parameter changes. Thanks, Matt > solve history 1: > > 0 SNES Function norm 3.424003312857e+04 > 0 KSP Residual norm 3.424003312857e+04 > 1 KSP Residual norm 2.871734444536e+04 > 2 KSP Residual norm 2.490276931041e+04 > ... > 20 KSP Residual norm 7.449686034356e+03 > Linear solve converged due to CONVERGED_ITS iterations 20 > 1 SNES Function norm 1.085015821006e+04 > > solve history 2, identical to 1: > > 0 SNES Function norm 3.424003312857e+04 > 0 KSP Residual norm 3.424003312857e+04 > 1 KSP Residual norm 2.871734444536e+04 > 2 KSP Residual norm 2.490276931041e+04 > ... > 20 KSP Residual norm 7.449686034356e+03 > Linear solve converged due to CONVERGED_ITS iterations 20 > 1 SNES Function norm 1.085015821006e+04 > > solve history 3, identical KSP at 0 and 1, slight change at 2, growing > difference to the end: > 0 SNES Function norm 3.424003312857e+04 > 0 KSP Residual norm 3.424003312857e+04 > 1 KSP Residual norm 2.871734444536e+04 > 2 KSP Residual norm 2.490276930242e+04 > ... > 20 KSP Residual norm 7.449686095424e+03 > Linear solve converged due to CONVERGED_ITS iterations 20 > 1 SNES Function norm 1.085015646971e+04 > > > Ths is using a standard explicit 3-stage Runge-Kutta smoother for 10 > iterations, so 30 calls of the same residual evaluation, identical > residuals every time > > run 1: > > # iteration rho rhou rhov > rhoE abs_res rel_res umin > vmax vmin elapsed_time > # > > > 1.00000e+00 1.086860616292e+00 2.782316758416e+02 > 4.482867643761e+00 2.993435920340e+02 2.04353e+02 > 1.00000e+00 -8.23945e-15 -6.15326e-15 -1.35563e-14 > 6.34834e-01 > 2.00000e+00 2.310547487017e+00 1.079059352425e+02 > 3.958323921837e+00 5.058927165686e+02 2.58647e+02 > 1.26568e+00 -1.02539e-14 -9.35368e-15 -1.69925e-14 > 6.40063e-01 > 3.00000e+00 2.361005867444e+00 5.706213331683e+01 > 6.130016323357e+00 4.688968362579e+02 2.36201e+02 > 1.15585e+00 -1.19370e-14 -1.15216e-14 -1.59733e-14 > 6.45166e-01 > 4.00000e+00 2.167518999963e+00 3.757541401594e+01 > 6.313917437428e+00 4.054310291628e+02 2.03612e+02 > 9.96372e-01 -1.81831e-14 -1.28312e-14 -1.46238e-14 > 6.50494e-01 > 5.00000e+00 1.941443738676e+00 2.884190334049e+01 > 6.237106158479e+00 3.539201037156e+02 1.77577e+02 > 8.68970e-01 3.56633e-14 -8.74089e-15 -1.06666e-14 > 6.55656e-01 > 6.00000e+00 1.736947124693e+00 2.429485695670e+01 > 5.996962200407e+00 3.148280178142e+02 1.57913e+02 > 7.72745e-01 -8.98634e-14 -2.41152e-14 -1.39713e-14 > 6.60872e-01 > 7.00000e+00 1.564153212635e+00 2.149609219810e+01 > 5.786910705204e+00 2.848717011033e+02 1.42872e+02 > 6.99144e-01 -2.95352e-13 -2.48158e-14 -2.39351e-14 > 6.66041e-01 > 8.00000e+00 1.419280815384e+00 1.950619804089e+01 > 5.627281158306e+00 2.606623371229e+02 1.30728e+02 > 6.39715e-01 8.98941e-13 1.09674e-13 3.78905e-14 > 6.71316e-01 > 9.00000e+00 1.296115915975e+00 1.794843530745e+01 > 5.514933264437e+00 2.401524522393e+02 1.20444e+02 > 5.89394e-01 1.70717e-12 1.38762e-14 1.09825e-13 > 6.76447e-01 > 1.00000e+01 1.189639693918e+00 1.665381754953e+01 > 5.433183087037e+00 2.222572900473e+02 1.11475e+02 > 5.45501e-01 -4.22462e-12 -7.15206e-13 -2.28736e-13 > 6.81716e-01 > > run N: > > > # > > > # iteration rho rhou rhov > rhoE abs_res rel_res umin > vmax vmin elapsed_time > # > > > 
1.00000e+00 1.086860616292e+00 2.782316758416e+02 > 4.482867643761e+00 2.993435920340e+02 2.04353e+02 > 1.00000e+00 -8.23945e-15 -6.15326e-15 -1.35563e-14 > 6.23316e-01 > 2.00000e+00 2.310547487017e+00 1.079059352425e+02 > 3.958323921837e+00 5.058927165686e+02 2.58647e+02 > 1.26568e+00 -1.02539e-14 -9.35368e-15 -1.69925e-14 > 6.28510e-01 > 3.00000e+00 2.361005867444e+00 5.706213331683e+01 > 6.130016323357e+00 4.688968362579e+02 2.36201e+02 > 1.15585e+00 -1.19370e-14 -1.15216e-14 -1.59733e-14 > 6.33558e-01 > 4.00000e+00 2.167518999963e+00 3.757541401594e+01 > 6.313917437428e+00 4.054310291628e+02 2.03612e+02 > 9.96372e-01 -1.81831e-14 -1.28312e-14 -1.46238e-14 > 6.38773e-01 > 5.00000e+00 1.941443738676e+00 2.884190334049e+01 > 6.237106158479e+00 3.539201037156e+02 1.77577e+02 > 8.68970e-01 3.56633e-14 -8.74089e-15 -1.06666e-14 > 6.43887e-01 > 6.00000e+00 1.736947124693e+00 2.429485695670e+01 > 5.996962200407e+00 3.148280178142e+02 1.57913e+02 > 7.72745e-01 -8.98634e-14 -2.41152e-14 -1.39713e-14 > 6.49073e-01 > 7.00000e+00 1.564153212635e+00 2.149609219810e+01 > 5.786910705204e+00 2.848717011033e+02 1.42872e+02 > 6.99144e-01 -2.95352e-13 -2.48158e-14 -2.39351e-14 > 6.54167e-01 > 8.00000e+00 1.419280815384e+00 1.950619804089e+01 > 5.627281158306e+00 2.606623371229e+02 1.30728e+02 > 6.39715e-01 8.98941e-13 1.09674e-13 3.78905e-14 > 6.59394e-01 > 9.00000e+00 1.296115915975e+00 1.794843530745e+01 > 5.514933264437e+00 2.401524522393e+02 1.20444e+02 > 5.89394e-01 1.70717e-12 1.38762e-14 1.09825e-13 > 6.64516e-01 > 1.00000e+01 1.189639693918e+00 1.665381754953e+01 > 5.433183087037e+00 2.222572900473e+02 1.11475e+02 > 5.45501e-01 -4.22462e-12 -7.15206e-13 -2.28736e-13 > 6.69677e-01 > > > > > > On Thu, May 4, 2023 at 8:41?AM Mark Adams wrote: > >> ASM is just the sub PC with one proc but gets weaker with more procs >> unless you use jacobi. (maybe I am missing something). >> >> On Thu, May 4, 2023 at 8:31?AM Mark Lohry wrote: >> >>> Please send the output of -snes_view. >>>> >>> pasted below. anything stand out? >>> >>> >>> SNES Object: 1 MPI process >>> type: newtonls >>> maximum iterations=1, maximum function evaluations=-1 >>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>> total number of linear solver iterations=20 >>> total number of function evaluations=22 >>> norm schedule ALWAYS >>> Jacobian is never rebuilt >>> Jacobian is applied matrix-free with differencing >>> Preconditioning Jacobian is built using finite differences with >>> coloring >>> SNESLineSearch Object: 1 MPI process >>> type: basic >>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, >>> lambda=1.000000e-08 >>> maximum iterations=40 >>> KSP Object: 1 MPI process >>> type: gmres >>> restart=30, using Classical (unmodified) Gram-Schmidt >>> Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=20, initial guess is zero >>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI process >>> type: asm >>> total subdomain blocks = 1, amount of overlap = 0 >>> restriction/interpolation type - RESTRICT >>> Local solver information for first block is in the following KSP >>> and PC objects on rank 0: >>> Use -ksp_view ::ascii_info_detail to display information for all >>> blocks >>> KSP Object: (sub_) 1 MPI process >>> type: preonly >>> maximum iterations=10000, initial guess is zero >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>> left preconditioning >>> using NONE norm type for convergence test >>> PC Object: (sub_) 1 MPI process >>> type: ilu >>> out-of-place factorization >>> 0 levels of fill >>> tolerance for zero pivot 2.22045e-14 >>> matrix ordering: natural >>> factor fill ratio given 1., needed 1. >>> Factored matrix follows: >>> Mat Object: (sub_) 1 MPI process >>> type: seqbaij >>> rows=16384, cols=16384, bs=16 >>> package used to perform factorization: petsc >>> total: nonzeros=1277952, allocated nonzeros=1277952 >>> block size is 16 >>> linear system matrix = precond matrix: >>> Mat Object: (sub_) 1 MPI process >>> type: seqbaij >>> rows=16384, cols=16384, bs=16 >>> total: nonzeros=1277952, allocated nonzeros=1277952 >>> total number of mallocs used during MatSetValues calls=0 >>> block size is 16 >>> linear system matrix followed by preconditioner matrix: >>> Mat Object: 1 MPI process >>> type: mffd >>> rows=16384, cols=16384 >>> Matrix-free approximation: >>> err=1.49012e-08 (relative error in function evaluation) >>> Using wp compute h routine >>> Does not compute normU >>> Mat Object: 1 MPI process >>> type: seqbaij >>> rows=16384, cols=16384, bs=16 >>> total: nonzeros=1277952, allocated nonzeros=1277952 >>> total number of mallocs used during MatSetValues calls=0 >>> block size is 16 >>> >>> On Thu, May 4, 2023 at 8:30?AM Mark Adams wrote: >>> >>>> If you are using MG what is the coarse grid solver? >>>> -snes_view might give you that. >>>> >>>> On Thu, May 4, 2023 at 8:25?AM Matthew Knepley >>>> wrote: >>>> >>>>> On Thu, May 4, 2023 at 8:21?AM Mark Lohry wrote: >>>>> >>>>>> Do they start very similarly and then slowly drift further apart? >>>>>> >>>>>> >>>>>> Yes, this. I take it this sounds familiar? >>>>>> >>>>>> See these two examples with 20 fixed iterations pasted at the end. >>>>>> The difference for one solve is slight (final SNES norm is identical to 5 >>>>>> digits), but in the context I'm using it in (repeated applications to solve >>>>>> a steady state multigrid problem, though here just one level) the >>>>>> differences add up such that I might reach global convergence in 35 >>>>>> iterations or 38. It's not the end of the world, but I was expecting that >>>>>> with -np 1 these would be identical and I'm not sure where the root cause >>>>>> would be. >>>>>> >>>>> >>>>> The initial KSP residual is different, so its the PC. Please send the >>>>> output of -snes_view. If your ASM is using direct factorization, then it >>>>> could be randomness in whatever LU you are using. >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> 0 SNES Function norm 2.801842107848e+04 >>>>>> 0 KSP Residual norm 4.045639499595e+01 >>>>>> 1 KSP Residual norm 1.917999809040e+01 >>>>>> 2 KSP Residual norm 1.616048521958e+01 >>>>>> [...] 
>>>>>> 19 KSP Residual norm 8.788043518111e-01 >>>>>> 20 KSP Residual norm 6.570851270214e-01 >>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>> 1 SNES Function norm 1.801309983345e+03 >>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>>> >>>>>> >>>>>> Same system, identical initial 0 SNES norm, 0 KSP is slightly >>>>>> different >>>>>> >>>>>> 0 SNES Function norm 2.801842107848e+04 >>>>>> 0 KSP Residual norm 4.045639473002e+01 >>>>>> 1 KSP Residual norm 1.917999883034e+01 >>>>>> 2 KSP Residual norm 1.616048572016e+01 >>>>>> [...] >>>>>> 19 KSP Residual norm 8.788046348957e-01 >>>>>> 20 KSP Residual norm 6.570859588610e-01 >>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>> 1 SNES Function norm 1.801311320322e+03 >>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>>> >>>>>> On Wed, May 3, 2023 at 11:05?PM Barry Smith wrote: >>>>>> >>>>>>> >>>>>>> Do they start very similarly and then slowly drift further apart? >>>>>>> That is the first couple of KSP iterations they are almost identical but >>>>>>> then for each iteration get a bit further. Similar for the SNES iterations, >>>>>>> starting close and then for more iterations and more solves they start >>>>>>> moving apart. Or do they suddenly jump to be very different? You can run >>>>>>> with -snes_monitor -ksp_monitor >>>>>>> >>>>>>> On May 3, 2023, at 9:07 PM, Mark Lohry wrote: >>>>>>> >>>>>>> This is on a single MPI rank. I haven't checked the coloring, was >>>>>>> just guessing there. But the solutions/residuals are slightly different >>>>>>> from run to run. >>>>>>> >>>>>>> Fair to say that for serial JFNK/asm ilu0/gmres we should expect >>>>>>> bitwise identical results? >>>>>>> >>>>>>> >>>>>>> On Wed, May 3, 2023, 8:50 PM Barry Smith wrote: >>>>>>> >>>>>>>> >>>>>>>> No, the coloring should be identical every time. Do you see >>>>>>>> differences with 1 MPI rank? (Or much smaller ones?). >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> > On May 3, 2023, at 8:42 PM, Mark Lohry wrote: >>>>>>>> > >>>>>>>> > I'm running multiple iterations of newtonls with an MFFD/JFNK >>>>>>>> nonlinear solver where I give it the sparsity. PC asm, KSP gmres, with >>>>>>>> SNESSetLagJacobian -2 (compute once and then frozen jacobian). >>>>>>>> > >>>>>>>> > I'm seeing slight (<1%) but nonzero differences in residuals from >>>>>>>> run to run. I'm wondering where randomness might enter here -- does the >>>>>>>> jacobian coloring use a random seed? >>>>>>>> >>>>>>>> >>>>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From leonardo.mutti01 at universitadipavia.it Thu May 4 10:23:39 2023 From: leonardo.mutti01 at universitadipavia.it (LEONARDO MUTTI) Date: Thu, 4 May 2023 17:23:39 +0200 Subject: [petsc-users] Understanding index sets for PCGASM In-Reply-To: References: Message-ID: Thank you for the help. 
Adding to my example: * call PCGASMSetSubdomains(pc,NSub, subdomains_IS, inflated_IS,ierr) call PCGASMDestroySubdomains(NSub,subdomains_IS,inflated_IS,ierr)* results in: * Error LNK2019 unresolved external symbol PCGASMDESTROYSUBDOMAINS referenced in function ... * * Error LNK2019 unresolved external symbol PCGASMSETSUBDOMAINS referenced in function ... * I'm not sure if the interfaces are missing or if I have a compilation problem. Thank you again. Best, Leonardo Il giorno sab 29 apr 2023 alle ore 20:30 Barry Smith ha scritto: > > Thank you for the test code. I have a fix in the branch > barry/2023-04-29/fix-pcasmcreatesubdomains2d > with > merge request https://gitlab.com/petsc/petsc/-/merge_requests/6394 > > The functions did not have proper Fortran stubs and interfaces so I had > to provide them manually in the new branch. > > Use > > git fetch > git checkout barry/2023-04-29/fix-pcasmcreatesubdomains2d > > ./configure etc > > Your now working test code is in src/ksp/ksp/tests/ex71f.F90 I had to > change things slightly and I updated the error handling for the latest > version. > > Please let us know if you have any later questions. > > Barry > > > > > On Apr 28, 2023, at 12:07 PM, LEONARDO MUTTI < > leonardo.mutti01 at universitadipavia.it> wrote: > > Hello. I am having a hard time understanding the index sets to feed > PCGASMSetSubdomains, and I am working in Fortran (as a PETSc novice). To > get more intuition on how the IS objects behave I tried the following > minimal (non) working example, which should tile a 16x16 matrix into 16 > square, non-overlapping submatrices: > > #include > #include > #include > USE petscmat > USE petscksp > USE petscpc > > Mat :: A > PetscInt :: M, NSubx, dof, overlap, NSub > INTEGER :: I,J > PetscErrorCode :: ierr > PetscScalar :: v > KSP :: ksp > PC :: pc > IS :: subdomains_IS, inflated_IS > > call PetscInitialize(PETSC_NULL_CHARACTER , ierr) > > !-----Create a dummy matrix > M = 16 > call MatCreateAIJ(MPI_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, > & M, M, > & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, > & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, > & A, ierr) > > DO I=1,M > DO J=1,M > v = I*J > CALL MatSetValue (A,I-1,J-1,v, > & INSERT_VALUES , ierr) > END DO > END DO > > call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY , ierr) > call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY , ierr) > > !-----Create KSP and PC > call KSPCreate(PETSC_COMM_WORLD,ksp, ierr) > call KSPSetOperators(ksp,A,A, ierr) > call KSPSetType(ksp,"bcgs",ierr) > call KSPGetPC(ksp,pc,ierr) > call KSPSetUp(ksp, ierr) > call PCSetType(pc,PCGASM, ierr) > call PCSetUp(pc , ierr) > > !-----GASM setup > NSubx = 4 > dof = 1 > overlap = 0 > > call PCGASMCreateSubdomains2D(pc, > & M, M, > & NSubx, NSubx, > & dof, overlap, > & NSub, subdomains_IS, inflated_IS, ierr) > > call ISView(subdomains_IS, PETSC_VIEWER_STDOUT_WORLD, ierr) > > call KSPDestroy(ksp, ierr) > call PetscFinalize(ierr) > > Running this on one processor, I get NSub = 4. > If PCASM and PCASMCreateSubdomains2D are used instead, I get NSub = 16 as > expected. > Moreover, I get in the end "forrtl: severe (157): Program Exception - > access violation". So: > 1) why do I get two different results with ASM, and GASM? > 2) why do I get access violation and how can I solve this? > In fact, in C, subdomains_IS, inflated_IS should pointers to IS objects. > As I see on the Fortran interface, the arguments to > PCGASMCreateSubdomains2D are IS objects: > > subroutine PCGASMCreateSubdomains2D(a,b,c,d,e,f,g,h,i,j,z) > import tPC,tIS > PC a ! 
PC > PetscInt b ! PetscInt > PetscInt c ! PetscInt > PetscInt d ! PetscInt > PetscInt e ! PetscInt > PetscInt f ! PetscInt > PetscInt g ! PetscInt > PetscInt h ! PetscInt > IS i ! IS > IS j ! IS > PetscErrorCode z > end subroutine PCGASMCreateSubdomains2D > Thus: > 3) what should be inside e.g., subdomains_IS? I expect it to contain, for > every created subdomain, the list of rows and columns defining the subblock > in the matrix, am I right? > > Context: I have a block-tridiagonal system arising from space-time finite > elements, and I want to solve it with GMRES+PCGASM preconditioner, where > each overlapping submatrix is on the diagonal and of size 3x3 blocks (and > spanning multiple processes). This is PETSc 3.17.1 on Windows. > > Thanks in advance, > Leonardo > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Thu May 4 11:01:25 2023 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 4 May 2023 12:01:25 -0400 Subject: [petsc-users] Understanding index sets for PCGASM In-Reply-To: References: Message-ID: <36D315B5-3871-4A63-B415-3CBE9ACEDCA2@petsc.dev> Looks like we don't have Fortran interfaces for these. We need to add them. Barry > On May 4, 2023, at 11:23 AM, LEONARDO MUTTI wrote: > > Thank you for the help. > Adding to my example: > call PCGASMSetSubdomains(pc,NSub, subdomains_IS, inflated_IS,ierr) > call PCGASMDestroySubdomains(NSub,subdomains_IS,inflated_IS,ierr) > results in: > Error LNK2019 unresolved external symbol PCGASMDESTROYSUBDOMAINS referenced in function ... > Error LNK2019 unresolved external symbol PCGASMSETSUBDOMAINS referenced in function ... > I'm not sure if the interfaces are missing or if I have a compilation problem. > Thank you again. > Best, > Leonardo > > Il giorno sab 29 apr 2023 alle ore 20:30 Barry Smith > ha scritto: >> >> Thank you for the test code. I have a fix in the branch barry/2023-04-29/fix-pcasmcreatesubdomains2d with merge request https://gitlab.com/petsc/petsc/-/merge_requests/6394 >> >> The functions did not have proper Fortran stubs and interfaces so I had to provide them manually in the new branch. >> >> Use >> >> git fetch >> git checkout barry/2023-04-29/fix-pcasmcreatesubdomains2d >> ./configure etc >> >> Your now working test code is in src/ksp/ksp/tests/ex71f.F90 I had to change things slightly and I updated the error handling for the latest version. >> >> Please let us know if you have any later questions. >> >> Barry >> >> >> >> >>> On Apr 28, 2023, at 12:07 PM, LEONARDO MUTTI > wrote: >>> >>> Hello. I am having a hard time understanding the index sets to feed PCGASMSetSubdomains, and I am working in Fortran (as a PETSc novice). 
To get more intuition on how the IS objects behave I tried the following minimal (non) working example, which should tile a 16x16 matrix into 16 square, non-overlapping submatrices: >>> >>> #include >>> #include >>> #include >>> USE petscmat >>> USE petscksp >>> USE petscpc >>> >>> Mat :: A >>> PetscInt :: M, NSubx, dof, overlap, NSub >>> INTEGER :: I,J >>> PetscErrorCode :: ierr >>> PetscScalar :: v >>> KSP :: ksp >>> PC :: pc >>> IS :: subdomains_IS, inflated_IS >>> >>> call PetscInitialize(PETSC_NULL_CHARACTER , ierr) >>> >>> !-----Create a dummy matrix >>> M = 16 >>> call MatCreateAIJ(MPI_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, >>> & M, M, >>> & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, >>> & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, >>> & A, ierr) >>> >>> DO I=1,M >>> DO J=1,M >>> v = I*J >>> CALL MatSetValue (A,I-1,J-1,v, >>> & INSERT_VALUES , ierr) >>> END DO >>> END DO >>> >>> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY , ierr) >>> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY , ierr) >>> >>> !-----Create KSP and PC >>> call KSPCreate(PETSC_COMM_WORLD,ksp, ierr) >>> call KSPSetOperators(ksp,A,A, ierr) >>> call KSPSetType(ksp,"bcgs",ierr) >>> call KSPGetPC(ksp,pc,ierr) >>> call KSPSetUp(ksp, ierr) >>> call PCSetType(pc,PCGASM, ierr) >>> call PCSetUp(pc , ierr) >>> >>> !-----GASM setup >>> NSubx = 4 >>> dof = 1 >>> overlap = 0 >>> >>> call PCGASMCreateSubdomains2D(pc, >>> & M, M, >>> & NSubx, NSubx, >>> & dof, overlap, >>> & NSub, subdomains_IS, inflated_IS, ierr) >>> >>> call ISView(subdomains_IS, PETSC_VIEWER_STDOUT_WORLD, ierr) >>> >>> call KSPDestroy(ksp, ierr) >>> call PetscFinalize(ierr) >>> >>> Running this on one processor, I get NSub = 4. >>> If PCASM and PCASMCreateSubdomains2D are used instead, I get NSub = 16 as expected. >>> Moreover, I get in the end "forrtl: severe (157): Program Exception - access violation". So: >>> 1) why do I get two different results with ASM, and GASM? >>> 2) why do I get access violation and how can I solve this? >>> In fact, in C, subdomains_IS, inflated_IS should pointers to IS objects. As I see on the Fortran interface, the arguments to PCGASMCreateSubdomains2D are IS objects: >>> >>> subroutine PCGASMCreateSubdomains2D(a,b,c,d,e,f,g,h,i,j,z) >>> import tPC,tIS >>> PC a ! PC >>> PetscInt b ! PetscInt >>> PetscInt c ! PetscInt >>> PetscInt d ! PetscInt >>> PetscInt e ! PetscInt >>> PetscInt f ! PetscInt >>> PetscInt g ! PetscInt >>> PetscInt h ! PetscInt >>> IS i ! IS >>> IS j ! IS >>> PetscErrorCode z >>> end subroutine PCGASMCreateSubdomains2D >>> Thus: >>> 3) what should be inside e.g., subdomains_IS? I expect it to contain, for every created subdomain, the list of rows and columns defining the subblock in the matrix, am I right? >>> >>> Context: I have a block-tridiagonal system arising from space-time finite elements, and I want to solve it with GMRES+PCGASM preconditioner, where each overlapping submatrix is on the diagonal and of size 3x3 blocks (and spanning multiple processes). This is PETSc 3.17.1 on Windows. >>> >>> Thanks in advance, >>> Leonardo >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu May 4 11:04:57 2023 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 4 May 2023 12:04:57 -0400 Subject: [petsc-users] Understanding index sets for PCGASM In-Reply-To: References: Message-ID: On Thu, May 4, 2023 at 11:24?AM LEONARDO MUTTI < leonardo.mutti01 at universitadipavia.it> wrote: > Thank you for the help. 
> Adding to my example: > > > * call PCGASMSetSubdomains(pc,NSub, subdomains_IS, inflated_IS,ierr) > call PCGASMDestroySubdomains(NSub,subdomains_IS,inflated_IS,ierr)* > results in: > > * Error LNK2019 unresolved external symbol PCGASMDESTROYSUBDOMAINS > referenced in function ... * > > * Error LNK2019 unresolved external symbol PCGASMSETSUBDOMAINS > referenced in function ... * > I'm not sure if the interfaces are missing or if I have a compilation > problem. > I just want to make sure you really want GASM. It sounded like you might able to do what you want just with ASM. Can you tell me again what you want to do overall? Thanks, Matt > Thank you again. > Best, > Leonardo > > Il giorno sab 29 apr 2023 alle ore 20:30 Barry Smith > ha scritto: > >> >> Thank you for the test code. I have a fix in the branch >> barry/2023-04-29/fix-pcasmcreatesubdomains2d >> with >> merge request https://gitlab.com/petsc/petsc/-/merge_requests/6394 >> >> The functions did not have proper Fortran stubs and interfaces so I >> had to provide them manually in the new branch. >> >> Use >> >> git fetch >> git checkout barry/2023-04-29/fix-pcasmcreatesubdomains2d >> >> ./configure etc >> >> Your now working test code is in src/ksp/ksp/tests/ex71f.F90 I had to >> change things slightly and I updated the error handling for the latest >> version. >> >> Please let us know if you have any later questions. >> >> Barry >> >> >> >> >> On Apr 28, 2023, at 12:07 PM, LEONARDO MUTTI < >> leonardo.mutti01 at universitadipavia.it> wrote: >> >> Hello. I am having a hard time understanding the index sets to feed >> PCGASMSetSubdomains, and I am working in Fortran (as a PETSc novice). To >> get more intuition on how the IS objects behave I tried the following >> minimal (non) working example, which should tile a 16x16 matrix into 16 >> square, non-overlapping submatrices: >> >> #include >> #include >> #include >> USE petscmat >> USE petscksp >> USE petscpc >> >> Mat :: A >> PetscInt :: M, NSubx, dof, overlap, NSub >> INTEGER :: I,J >> PetscErrorCode :: ierr >> PetscScalar :: v >> KSP :: ksp >> PC :: pc >> IS :: subdomains_IS, inflated_IS >> >> call PetscInitialize(PETSC_NULL_CHARACTER , ierr) >> >> !-----Create a dummy matrix >> M = 16 >> call MatCreateAIJ(MPI_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, >> & M, M, >> & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, >> & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, >> & A, ierr) >> >> DO I=1,M >> DO J=1,M >> v = I*J >> CALL MatSetValue (A,I-1,J-1,v, >> & INSERT_VALUES , ierr) >> END DO >> END DO >> >> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY , ierr) >> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY , ierr) >> >> !-----Create KSP and PC >> call KSPCreate(PETSC_COMM_WORLD,ksp, ierr) >> call KSPSetOperators(ksp,A,A, ierr) >> call KSPSetType(ksp,"bcgs",ierr) >> call KSPGetPC(ksp,pc,ierr) >> call KSPSetUp(ksp, ierr) >> call PCSetType(pc,PCGASM, ierr) >> call PCSetUp(pc , ierr) >> >> !-----GASM setup >> NSubx = 4 >> dof = 1 >> overlap = 0 >> >> call PCGASMCreateSubdomains2D(pc, >> & M, M, >> & NSubx, NSubx, >> & dof, overlap, >> & NSub, subdomains_IS, inflated_IS, ierr) >> >> call ISView(subdomains_IS, PETSC_VIEWER_STDOUT_WORLD, ierr) >> >> call KSPDestroy(ksp, ierr) >> call PetscFinalize(ierr) >> >> Running this on one processor, I get NSub = 4. >> If PCASM and PCASMCreateSubdomains2D are used instead, I get NSub = 16 as >> expected. >> Moreover, I get in the end "forrtl: severe (157): Program Exception - >> access violation". So: >> 1) why do I get two different results with ASM, and GASM? 
>> 2) why do I get access violation and how can I solve this? >> In fact, in C, subdomains_IS, inflated_IS should pointers to IS objects. >> As I see on the Fortran interface, the arguments to >> PCGASMCreateSubdomains2D are IS objects: >> >> subroutine PCGASMCreateSubdomains2D(a,b,c,d,e,f,g,h,i,j,z) >> import tPC,tIS >> PC a ! PC >> PetscInt b ! PetscInt >> PetscInt c ! PetscInt >> PetscInt d ! PetscInt >> PetscInt e ! PetscInt >> PetscInt f ! PetscInt >> PetscInt g ! PetscInt >> PetscInt h ! PetscInt >> IS i ! IS >> IS j ! IS >> PetscErrorCode z >> end subroutine PCGASMCreateSubdomains2D >> Thus: >> 3) what should be inside e.g., subdomains_IS? I expect it to contain, for >> every created subdomain, the list of rows and columns defining the subblock >> in the matrix, am I right? >> >> Context: I have a block-tridiagonal system arising from space-time finite >> elements, and I want to solve it with GMRES+PCGASM preconditioner, where >> each overlapping submatrix is on the diagonal and of size 3x3 blocks (and >> spanning multiple processes). This is PETSc 3.17.1 on Windows. >> >> Thanks in advance, >> Leonardo >> >> >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From leonardo.mutti01 at universitadipavia.it Thu May 4 12:43:20 2023 From: leonardo.mutti01 at universitadipavia.it (LEONARDO MUTTI) Date: Thu, 4 May 2023 19:43:20 +0200 Subject: [petsc-users] Understanding index sets for PCGASM In-Reply-To: References: Message-ID: Of course, I'll try to explain. I am solving a parabolic equation with space-time FEM and I want an efficient solver/preconditioner for the resulting system. The corresponding matrix, call it X, has an e.g. block bi-diagonal structure, if the cG(1)-dG(0) method is used (i.e. implicit Euler solved in batch). Every block-row of X corresponds to a time instant. I want to introduce parallelism in time by subdividing X into overlapping submatrices of e.g 2x2 or 3x3 blocks, along the block diagonal. For instance, call X_i the individual blocks. The submatrices would be, for various i, (X_{i-1,i-1},X_{i-1,i};X_{i,i-1},X_{i,i}). I'd like each submatrix to be solved in parallel, to combine the various results together in an ASM like fashion. Every submatrix has thus a predecessor and a successor, and it overlaps with both, so that as far as I could understand, GASM has to be used in place of ASM. Hope this helps. Best, Leonardo Il giorno gio 4 mag 2023 alle ore 18:05 Matthew Knepley ha scritto: > On Thu, May 4, 2023 at 11:24?AM LEONARDO MUTTI < > leonardo.mutti01 at universitadipavia.it> wrote: > >> Thank you for the help. >> Adding to my example: >> >> >> * call PCGASMSetSubdomains(pc,NSub, subdomains_IS, >> inflated_IS,ierr) call >> PCGASMDestroySubdomains(NSub,subdomains_IS,inflated_IS,ierr)* >> results in: >> >> * Error LNK2019 unresolved external symbol PCGASMDESTROYSUBDOMAINS >> referenced in function ... * >> >> * Error LNK2019 unresolved external symbol PCGASMSETSUBDOMAINS >> referenced in function ... * >> I'm not sure if the interfaces are missing or if I have a compilation >> problem. >> > > I just want to make sure you really want GASM. It sounded like you might > able to do what you want just with ASM. > Can you tell me again what you want to do overall? > > Thanks, > > Matt > > >> Thank you again. 
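Until Fortran stubs for PCGASMSetSubdomains()/PCGASMDestroySubdomains() are available, one way to prototype the decomposition described above is from C. The sketch below is only an illustration of how the inner and outer index sets could be built for that layout (Nt diagonal blocks of scalar size nb, each subdomain owning `width` consecutive blocks and overlapping its neighbours by `overlap` blocks); the function name and the serial communicator are assumptions, not part of any existing example.

#include <petscksp.h>

/* Build inner (non-overlapping) and outer (overlapping) index sets for
   subdomains made of consecutive diagonal blocks.  The ISes are created
   on PETSC_COMM_SELF, i.e. the one-process case that was being tested;
   each IS simply lists the global dof indices of the rows belonging to
   that subdomain, which for contiguous blocks is a stride. */
static PetscErrorCode CreateBlockDiagonalSubdomains(PetscInt Nt, PetscInt nb, PetscInt width, PetscInt overlap,
                                                    PetscInt *nsub, IS **iis, IS **ois)
{
  PetscInt n = Nt / width; /* assumes width divides Nt */

  PetscFunctionBeginUser;
  *nsub = n;
  PetscCall(PetscMalloc1(n, iis));
  PetscCall(PetscMalloc1(n, ois));
  for (PetscInt s = 0; s < n; s++) {
    PetscInt first  = s * width;                    /* first block of the inner part */
    PetscInt ofirst = PetscMax(first - overlap, 0); /* outer (inflated) block range  */
    PetscInt olast  = PetscMin(first + width + overlap, Nt);

    PetscCall(ISCreateStride(PETSC_COMM_SELF, width * nb, first * nb, 1, &(*iis)[s]));
    PetscCall(ISCreateStride(PETSC_COMM_SELF, (olast - ofirst) * nb, ofirst * nb, 1, &(*ois)[s]));
  }
  PetscFunctionReturn(0);
}

The two arrays would then be passed with PCGASMSetSubdomains(pc, nsub, iis, ois) before PCSetUp(), and released afterwards by destroying each IS and freeing the arrays.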
>> Best, >> Leonardo >> >> Il giorno sab 29 apr 2023 alle ore 20:30 Barry Smith >> ha scritto: >> >>> >>> Thank you for the test code. I have a fix in the branch >>> barry/2023-04-29/fix-pcasmcreatesubdomains2d >>> with >>> merge request https://gitlab.com/petsc/petsc/-/merge_requests/6394 >>> >>> The functions did not have proper Fortran stubs and interfaces so I >>> had to provide them manually in the new branch. >>> >>> Use >>> >>> git fetch >>> git checkout barry/2023-04-29/fix-pcasmcreatesubdomains2d >>> >>> ./configure etc >>> >>> Your now working test code is in src/ksp/ksp/tests/ex71f.F90 I had >>> to change things slightly and I updated the error handling for the latest >>> version. >>> >>> Please let us know if you have any later questions. >>> >>> Barry >>> >>> >>> >>> >>> On Apr 28, 2023, at 12:07 PM, LEONARDO MUTTI < >>> leonardo.mutti01 at universitadipavia.it> wrote: >>> >>> Hello. I am having a hard time understanding the index sets to feed >>> PCGASMSetSubdomains, and I am working in Fortran (as a PETSc novice). To >>> get more intuition on how the IS objects behave I tried the following >>> minimal (non) working example, which should tile a 16x16 matrix into 16 >>> square, non-overlapping submatrices: >>> >>> #include >>> #include >>> #include >>> USE petscmat >>> USE petscksp >>> USE petscpc >>> >>> Mat :: A >>> PetscInt :: M, NSubx, dof, overlap, NSub >>> INTEGER :: I,J >>> PetscErrorCode :: ierr >>> PetscScalar :: v >>> KSP :: ksp >>> PC :: pc >>> IS :: subdomains_IS, inflated_IS >>> >>> call PetscInitialize(PETSC_NULL_CHARACTER , ierr) >>> >>> !-----Create a dummy matrix >>> M = 16 >>> call MatCreateAIJ(MPI_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, >>> & M, M, >>> & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, >>> & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, >>> & A, ierr) >>> >>> DO I=1,M >>> DO J=1,M >>> v = I*J >>> CALL MatSetValue (A,I-1,J-1,v, >>> & INSERT_VALUES , ierr) >>> END DO >>> END DO >>> >>> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY , ierr) >>> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY , ierr) >>> >>> !-----Create KSP and PC >>> call KSPCreate(PETSC_COMM_WORLD,ksp, ierr) >>> call KSPSetOperators(ksp,A,A, ierr) >>> call KSPSetType(ksp,"bcgs",ierr) >>> call KSPGetPC(ksp,pc,ierr) >>> call KSPSetUp(ksp, ierr) >>> call PCSetType(pc,PCGASM, ierr) >>> call PCSetUp(pc , ierr) >>> >>> !-----GASM setup >>> NSubx = 4 >>> dof = 1 >>> overlap = 0 >>> >>> call PCGASMCreateSubdomains2D(pc, >>> & M, M, >>> & NSubx, NSubx, >>> & dof, overlap, >>> & NSub, subdomains_IS, inflated_IS, ierr) >>> >>> call ISView(subdomains_IS, PETSC_VIEWER_STDOUT_WORLD, ierr) >>> >>> call KSPDestroy(ksp, ierr) >>> call PetscFinalize(ierr) >>> >>> Running this on one processor, I get NSub = 4. >>> If PCASM and PCASMCreateSubdomains2D are used instead, I get NSub = 16 >>> as expected. >>> Moreover, I get in the end "forrtl: severe (157): Program Exception - >>> access violation". So: >>> 1) why do I get two different results with ASM, and GASM? >>> 2) why do I get access violation and how can I solve this? >>> In fact, in C, subdomains_IS, inflated_IS should pointers to IS objects. >>> As I see on the Fortran interface, the arguments to >>> PCGASMCreateSubdomains2D are IS objects: >>> >>> subroutine PCGASMCreateSubdomains2D(a,b,c,d,e,f,g,h,i,j,z) >>> import tPC,tIS >>> PC a ! PC >>> PetscInt b ! PetscInt >>> PetscInt c ! PetscInt >>> PetscInt d ! PetscInt >>> PetscInt e ! PetscInt >>> PetscInt f ! PetscInt >>> PetscInt g ! PetscInt >>> PetscInt h ! PetscInt >>> IS i ! IS >>> IS j ! 
IS >>> PetscErrorCode z >>> end subroutine PCGASMCreateSubdomains2D >>> Thus: >>> 3) what should be inside e.g., subdomains_IS? I expect it to contain, >>> for every created subdomain, the list of rows and columns defining the >>> subblock in the matrix, am I right? >>> >>> Context: I have a block-tridiagonal system arising from space-time >>> finite elements, and I want to solve it with GMRES+PCGASM preconditioner, >>> where each overlapping submatrix is on the diagonal and of size 3x3 blocks >>> (and spanning multiple processes). This is PETSc 3.17.1 on Windows. >>> >>> Thanks in advance, >>> Leonardo >>> >>> >>> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mlohry at gmail.com Thu May 4 15:43:51 2023 From: mlohry at gmail.com (Mark Lohry) Date: Thu, 4 May 2023 16:43:51 -0400 Subject: [petsc-users] sources of floating point randomness in JFNK in serial In-Reply-To: References: <5318727D-B9F9-48BC-A7CE-94EBDB08566F@petsc.dev> Message-ID: > > Is your code valgrind clean? > Yes, I also initialize all allocations with NaNs to be sure I'm not using anything uninitialized. > We can try and test this. Replace your MatMFFD with an actual matrix and > run. Do you see any variability? > I think I did what you're asking. I have -snes_mf_operator set, and then SNESSetJacobian(snes, diag_ones, diag_ones, NULL, NULL) where diag_ones is a matrix with ones on the diagonal. Two runs below, still with differences but sometimes identical. 0 SNES Function norm 3.424003312857e+04 0 KSP Residual norm 3.424003312857e+04 1 KSP Residual norm 2.871734444536e+04 2 KSP Residual norm 2.490276930242e+04 3 KSP Residual norm 2.131675872968e+04 4 KSP Residual norm 1.973129814235e+04 5 KSP Residual norm 1.832377856317e+04 6 KSP Residual norm 1.716783617436e+04 7 KSP Residual norm 1.583963149542e+04 8 KSP Residual norm 1.482272170304e+04 9 KSP Residual norm 1.380312106742e+04 10 KSP Residual norm 1.297793480658e+04 11 KSP Residual norm 1.208599123244e+04 12 KSP Residual norm 1.137345655227e+04 13 KSP Residual norm 1.059676909366e+04 14 KSP Residual norm 1.003823862398e+04 15 KSP Residual norm 9.425879221354e+03 16 KSP Residual norm 8.954805890038e+03 17 KSP Residual norm 8.592372470456e+03 18 KSP Residual norm 8.060707175821e+03 19 KSP Residual norm 7.782057728723e+03 20 KSP Residual norm 7.449686095424e+03 Linear solve converged due to CONVERGED_ITS iterations 20 KSP Object: 1 MPI process type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=20, initial guess is zero tolerances: relative=0.1, absolute=1e-15, divergence=10. 
left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI process type: none linear system matrix followed by preconditioner matrix: Mat Object: 1 MPI process type: mffd rows=16384, cols=16384 Matrix-free approximation: err=1.49012e-08 (relative error in function evaluation) Using wp compute h routine Does not compute normU Mat Object: 1 MPI process type: seqaij rows=16384, cols=16384 total: nonzeros=16384, allocated nonzeros=16384 total number of mallocs used during MatSetValues calls=0 not using I-node routines 1 SNES Function norm 1.085015646971e+04 Nonlinear solve converged due to CONVERGED_ITS iterations 1 SNES Object: 1 MPI process type: newtonls maximum iterations=1, maximum function evaluations=-1 tolerances: relative=0.1, absolute=1e-15, solution=1e-15 total number of linear solver iterations=20 total number of function evaluations=23 norm schedule ALWAYS Jacobian is never rebuilt Jacobian is applied matrix-free with differencing Preconditioning Jacobian is built using finite differences with coloring SNESLineSearch Object: 1 MPI process type: basic maxstep=1.000000e+08, minlambda=1.000000e-12 tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 maximum iterations=40 KSP Object: 1 MPI process type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=20, initial guess is zero tolerances: relative=0.1, absolute=1e-15, divergence=10. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI process type: none linear system matrix followed by preconditioner matrix: Mat Object: 1 MPI process type: mffd rows=16384, cols=16384 Matrix-free approximation: err=1.49012e-08 (relative error in function evaluation) Using wp compute h routine Does not compute normU Mat Object: 1 MPI process type: seqaij rows=16384, cols=16384 total: nonzeros=16384, allocated nonzeros=16384 total number of mallocs used during MatSetValues calls=0 not using I-node routines 0 SNES Function norm 3.424003312857e+04 0 KSP Residual norm 3.424003312857e+04 1 KSP Residual norm 2.871734444536e+04 2 KSP Residual norm 2.490276931041e+04 3 KSP Residual norm 2.131675873776e+04 4 KSP Residual norm 1.973129814908e+04 5 KSP Residual norm 1.832377852186e+04 6 KSP Residual norm 1.716783608174e+04 7 KSP Residual norm 1.583963128956e+04 8 KSP Residual norm 1.482272160069e+04 9 KSP Residual norm 1.380312087005e+04 10 KSP Residual norm 1.297793458796e+04 11 KSP Residual norm 1.208599115602e+04 12 KSP Residual norm 1.137345657533e+04 13 KSP Residual norm 1.059676906197e+04 14 KSP Residual norm 1.003823857515e+04 15 KSP Residual norm 9.425879177747e+03 16 KSP Residual norm 8.954805850825e+03 17 KSP Residual norm 8.592372413320e+03 18 KSP Residual norm 8.060706994110e+03 19 KSP Residual norm 7.782057560782e+03 20 KSP Residual norm 7.449686034356e+03 Linear solve converged due to CONVERGED_ITS iterations 20 KSP Object: 1 MPI process type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=20, initial guess is zero tolerances: relative=0.1, absolute=1e-15, divergence=10. 
left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI process type: none linear system matrix followed by preconditioner matrix: Mat Object: 1 MPI process type: mffd rows=16384, cols=16384 Matrix-free approximation: err=1.49012e-08 (relative error in function evaluation) Using wp compute h routine Does not compute normU Mat Object: 1 MPI process type: seqaij rows=16384, cols=16384 total: nonzeros=16384, allocated nonzeros=16384 total number of mallocs used during MatSetValues calls=0 not using I-node routines 1 SNES Function norm 1.085015821006e+04 Nonlinear solve converged due to CONVERGED_ITS iterations 1 SNES Object: 1 MPI process type: newtonls maximum iterations=1, maximum function evaluations=-1 tolerances: relative=0.1, absolute=1e-15, solution=1e-15 total number of linear solver iterations=20 total number of function evaluations=23 norm schedule ALWAYS Jacobian is never rebuilt Jacobian is applied matrix-free with differencing Preconditioning Jacobian is built using finite differences with coloring SNESLineSearch Object: 1 MPI process type: basic maxstep=1.000000e+08, minlambda=1.000000e-12 tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 maximum iterations=40 KSP Object: 1 MPI process type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=20, initial guess is zero tolerances: relative=0.1, absolute=1e-15, divergence=10. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI process type: none linear system matrix followed by preconditioner matrix: Mat Object: 1 MPI process type: mffd rows=16384, cols=16384 Matrix-free approximation: err=1.49012e-08 (relative error in function evaluation) Using wp compute h routine Does not compute normU Mat Object: 1 MPI process type: seqaij rows=16384, cols=16384 total: nonzeros=16384, allocated nonzeros=16384 total number of mallocs used during MatSetValues calls=0 not using I-node routines On Thu, May 4, 2023 at 10:10?AM Matthew Knepley wrote: > On Thu, May 4, 2023 at 8:54?AM Mark Lohry wrote: > >> Try -pc_type none. >>> >> >> With -pc_type none the 0 KSP residual looks identical. But *sometimes* >> it's producing exactly the same history and others it's gradually >> changing. I'm reasonably confident my residual evaluation has no >> randomness, see info after the petsc output. >> > > We can try and test this. Replace your MatMFFD with an actual matrix and > run. Do you see any variability? > > If not, then it could be your routine, or it could be MatMFFD. So run a > few with -snes_view, and we can see if the > "w" parameter changes. > > Thanks, > > Matt > > >> solve history 1: >> >> 0 SNES Function norm 3.424003312857e+04 >> 0 KSP Residual norm 3.424003312857e+04 >> 1 KSP Residual norm 2.871734444536e+04 >> 2 KSP Residual norm 2.490276931041e+04 >> ... >> 20 KSP Residual norm 7.449686034356e+03 >> Linear solve converged due to CONVERGED_ITS iterations 20 >> 1 SNES Function norm 1.085015821006e+04 >> >> solve history 2, identical to 1: >> >> 0 SNES Function norm 3.424003312857e+04 >> 0 KSP Residual norm 3.424003312857e+04 >> 1 KSP Residual norm 2.871734444536e+04 >> 2 KSP Residual norm 2.490276931041e+04 >> ... 
>> 20 KSP Residual norm 7.449686034356e+03 >> Linear solve converged due to CONVERGED_ITS iterations 20 >> 1 SNES Function norm 1.085015821006e+04 >> >> solve history 3, identical KSP at 0 and 1, slight change at 2, growing >> difference to the end: >> 0 SNES Function norm 3.424003312857e+04 >> 0 KSP Residual norm 3.424003312857e+04 >> 1 KSP Residual norm 2.871734444536e+04 >> 2 KSP Residual norm 2.490276930242e+04 >> ... >> 20 KSP Residual norm 7.449686095424e+03 >> Linear solve converged due to CONVERGED_ITS iterations 20 >> 1 SNES Function norm 1.085015646971e+04 >> >> >> Ths is using a standard explicit 3-stage Runge-Kutta smoother for 10 >> iterations, so 30 calls of the same residual evaluation, identical >> residuals every time >> >> run 1: >> >> # iteration rho rhou rhov >> rhoE abs_res rel_res umin >> vmax vmin elapsed_time >> # >> >> >> 1.00000e+00 1.086860616292e+00 2.782316758416e+02 >> 4.482867643761e+00 2.993435920340e+02 2.04353e+02 >> 1.00000e+00 -8.23945e-15 -6.15326e-15 -1.35563e-14 >> 6.34834e-01 >> 2.00000e+00 2.310547487017e+00 1.079059352425e+02 >> 3.958323921837e+00 5.058927165686e+02 2.58647e+02 >> 1.26568e+00 -1.02539e-14 -9.35368e-15 -1.69925e-14 >> 6.40063e-01 >> 3.00000e+00 2.361005867444e+00 5.706213331683e+01 >> 6.130016323357e+00 4.688968362579e+02 2.36201e+02 >> 1.15585e+00 -1.19370e-14 -1.15216e-14 -1.59733e-14 >> 6.45166e-01 >> 4.00000e+00 2.167518999963e+00 3.757541401594e+01 >> 6.313917437428e+00 4.054310291628e+02 2.03612e+02 >> 9.96372e-01 -1.81831e-14 -1.28312e-14 -1.46238e-14 >> 6.50494e-01 >> 5.00000e+00 1.941443738676e+00 2.884190334049e+01 >> 6.237106158479e+00 3.539201037156e+02 1.77577e+02 >> 8.68970e-01 3.56633e-14 -8.74089e-15 -1.06666e-14 >> 6.55656e-01 >> 6.00000e+00 1.736947124693e+00 2.429485695670e+01 >> 5.996962200407e+00 3.148280178142e+02 1.57913e+02 >> 7.72745e-01 -8.98634e-14 -2.41152e-14 -1.39713e-14 >> 6.60872e-01 >> 7.00000e+00 1.564153212635e+00 2.149609219810e+01 >> 5.786910705204e+00 2.848717011033e+02 1.42872e+02 >> 6.99144e-01 -2.95352e-13 -2.48158e-14 -2.39351e-14 >> 6.66041e-01 >> 8.00000e+00 1.419280815384e+00 1.950619804089e+01 >> 5.627281158306e+00 2.606623371229e+02 1.30728e+02 >> 6.39715e-01 8.98941e-13 1.09674e-13 3.78905e-14 >> 6.71316e-01 >> 9.00000e+00 1.296115915975e+00 1.794843530745e+01 >> 5.514933264437e+00 2.401524522393e+02 1.20444e+02 >> 5.89394e-01 1.70717e-12 1.38762e-14 1.09825e-13 >> 6.76447e-01 >> 1.00000e+01 1.189639693918e+00 1.665381754953e+01 >> 5.433183087037e+00 2.222572900473e+02 1.11475e+02 >> 5.45501e-01 -4.22462e-12 -7.15206e-13 -2.28736e-13 >> 6.81716e-01 >> >> run N: >> >> >> # >> >> >> # iteration rho rhou rhov >> rhoE abs_res rel_res umin >> vmax vmin elapsed_time >> # >> >> >> 1.00000e+00 1.086860616292e+00 2.782316758416e+02 >> 4.482867643761e+00 2.993435920340e+02 2.04353e+02 >> 1.00000e+00 -8.23945e-15 -6.15326e-15 -1.35563e-14 >> 6.23316e-01 >> 2.00000e+00 2.310547487017e+00 1.079059352425e+02 >> 3.958323921837e+00 5.058927165686e+02 2.58647e+02 >> 1.26568e+00 -1.02539e-14 -9.35368e-15 -1.69925e-14 >> 6.28510e-01 >> 3.00000e+00 2.361005867444e+00 5.706213331683e+01 >> 6.130016323357e+00 4.688968362579e+02 2.36201e+02 >> 1.15585e+00 -1.19370e-14 -1.15216e-14 -1.59733e-14 >> 6.33558e-01 >> 4.00000e+00 2.167518999963e+00 3.757541401594e+01 >> 6.313917437428e+00 4.054310291628e+02 2.03612e+02 >> 9.96372e-01 -1.81831e-14 -1.28312e-14 -1.46238e-14 >> 6.38773e-01 >> 5.00000e+00 1.941443738676e+00 2.884190334049e+01 >> 6.237106158479e+00 3.539201037156e+02 1.77577e+02 >> 8.68970e-01 
3.56633e-14 -8.74089e-15 -1.06666e-14 >> 6.43887e-01 >> 6.00000e+00 1.736947124693e+00 2.429485695670e+01 >> 5.996962200407e+00 3.148280178142e+02 1.57913e+02 >> 7.72745e-01 -8.98634e-14 -2.41152e-14 -1.39713e-14 >> 6.49073e-01 >> 7.00000e+00 1.564153212635e+00 2.149609219810e+01 >> 5.786910705204e+00 2.848717011033e+02 1.42872e+02 >> 6.99144e-01 -2.95352e-13 -2.48158e-14 -2.39351e-14 >> 6.54167e-01 >> 8.00000e+00 1.419280815384e+00 1.950619804089e+01 >> 5.627281158306e+00 2.606623371229e+02 1.30728e+02 >> 6.39715e-01 8.98941e-13 1.09674e-13 3.78905e-14 >> 6.59394e-01 >> 9.00000e+00 1.296115915975e+00 1.794843530745e+01 >> 5.514933264437e+00 2.401524522393e+02 1.20444e+02 >> 5.89394e-01 1.70717e-12 1.38762e-14 1.09825e-13 >> 6.64516e-01 >> 1.00000e+01 1.189639693918e+00 1.665381754953e+01 >> 5.433183087037e+00 2.222572900473e+02 1.11475e+02 >> 5.45501e-01 -4.22462e-12 -7.15206e-13 -2.28736e-13 >> 6.69677e-01 >> >> >> >> >> >> On Thu, May 4, 2023 at 8:41?AM Mark Adams wrote: >> >>> ASM is just the sub PC with one proc but gets weaker with more procs >>> unless you use jacobi. (maybe I am missing something). >>> >>> On Thu, May 4, 2023 at 8:31?AM Mark Lohry wrote: >>> >>>> Please send the output of -snes_view. >>>>> >>>> pasted below. anything stand out? >>>> >>>> >>>> SNES Object: 1 MPI process >>>> type: newtonls >>>> maximum iterations=1, maximum function evaluations=-1 >>>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>>> total number of linear solver iterations=20 >>>> total number of function evaluations=22 >>>> norm schedule ALWAYS >>>> Jacobian is never rebuilt >>>> Jacobian is applied matrix-free with differencing >>>> Preconditioning Jacobian is built using finite differences with >>>> coloring >>>> SNESLineSearch Object: 1 MPI process >>>> type: basic >>>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, >>>> lambda=1.000000e-08 >>>> maximum iterations=40 >>>> KSP Object: 1 MPI process >>>> type: gmres >>>> restart=30, using Classical (unmodified) Gram-Schmidt >>>> Orthogonalization with no iterative refinement >>>> happy breakdown tolerance 1e-30 >>>> maximum iterations=20, initial guess is zero >>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. >>>> left preconditioning >>>> using PRECONDITIONED norm type for convergence test >>>> PC Object: 1 MPI process >>>> type: asm >>>> total subdomain blocks = 1, amount of overlap = 0 >>>> restriction/interpolation type - RESTRICT >>>> Local solver information for first block is in the following KSP >>>> and PC objects on rank 0: >>>> Use -ksp_view ::ascii_info_detail to display information for all >>>> blocks >>>> KSP Object: (sub_) 1 MPI process >>>> type: preonly >>>> maximum iterations=10000, initial guess is zero >>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>> left preconditioning >>>> using NONE norm type for convergence test >>>> PC Object: (sub_) 1 MPI process >>>> type: ilu >>>> out-of-place factorization >>>> 0 levels of fill >>>> tolerance for zero pivot 2.22045e-14 >>>> matrix ordering: natural >>>> factor fill ratio given 1., needed 1. 
>>>> Factored matrix follows: >>>> Mat Object: (sub_) 1 MPI process >>>> type: seqbaij >>>> rows=16384, cols=16384, bs=16 >>>> package used to perform factorization: petsc >>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>> block size is 16 >>>> linear system matrix = precond matrix: >>>> Mat Object: (sub_) 1 MPI process >>>> type: seqbaij >>>> rows=16384, cols=16384, bs=16 >>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>> total number of mallocs used during MatSetValues calls=0 >>>> block size is 16 >>>> linear system matrix followed by preconditioner matrix: >>>> Mat Object: 1 MPI process >>>> type: mffd >>>> rows=16384, cols=16384 >>>> Matrix-free approximation: >>>> err=1.49012e-08 (relative error in function evaluation) >>>> Using wp compute h routine >>>> Does not compute normU >>>> Mat Object: 1 MPI process >>>> type: seqbaij >>>> rows=16384, cols=16384, bs=16 >>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>> total number of mallocs used during MatSetValues calls=0 >>>> block size is 16 >>>> >>>> On Thu, May 4, 2023 at 8:30?AM Mark Adams wrote: >>>> >>>>> If you are using MG what is the coarse grid solver? >>>>> -snes_view might give you that. >>>>> >>>>> On Thu, May 4, 2023 at 8:25?AM Matthew Knepley >>>>> wrote: >>>>> >>>>>> On Thu, May 4, 2023 at 8:21?AM Mark Lohry wrote: >>>>>> >>>>>>> Do they start very similarly and then slowly drift further apart? >>>>>>> >>>>>>> >>>>>>> Yes, this. I take it this sounds familiar? >>>>>>> >>>>>>> See these two examples with 20 fixed iterations pasted at the end. >>>>>>> The difference for one solve is slight (final SNES norm is identical to 5 >>>>>>> digits), but in the context I'm using it in (repeated applications to solve >>>>>>> a steady state multigrid problem, though here just one level) the >>>>>>> differences add up such that I might reach global convergence in 35 >>>>>>> iterations or 38. It's not the end of the world, but I was expecting that >>>>>>> with -np 1 these would be identical and I'm not sure where the root cause >>>>>>> would be. >>>>>>> >>>>>> >>>>>> The initial KSP residual is different, so its the PC. Please send the >>>>>> output of -snes_view. If your ASM is using direct factorization, then it >>>>>> could be randomness in whatever LU you are using. >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>>> 0 SNES Function norm 2.801842107848e+04 >>>>>>> 0 KSP Residual norm 4.045639499595e+01 >>>>>>> 1 KSP Residual norm 1.917999809040e+01 >>>>>>> 2 KSP Residual norm 1.616048521958e+01 >>>>>>> [...] >>>>>>> 19 KSP Residual norm 8.788043518111e-01 >>>>>>> 20 KSP Residual norm 6.570851270214e-01 >>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>> 1 SNES Function norm 1.801309983345e+03 >>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>>>> >>>>>>> >>>>>>> Same system, identical initial 0 SNES norm, 0 KSP is slightly >>>>>>> different >>>>>>> >>>>>>> 0 SNES Function norm 2.801842107848e+04 >>>>>>> 0 KSP Residual norm 4.045639473002e+01 >>>>>>> 1 KSP Residual norm 1.917999883034e+01 >>>>>>> 2 KSP Residual norm 1.616048572016e+01 >>>>>>> [...] 
>>>>>>> 19 KSP Residual norm 8.788046348957e-01 >>>>>>> 20 KSP Residual norm 6.570859588610e-01 >>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>> 1 SNES Function norm 1.801311320322e+03 >>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>>>> >>>>>>> On Wed, May 3, 2023 at 11:05?PM Barry Smith >>>>>>> wrote: >>>>>>> >>>>>>>> >>>>>>>> Do they start very similarly and then slowly drift further apart? >>>>>>>> That is the first couple of KSP iterations they are almost identical but >>>>>>>> then for each iteration get a bit further. Similar for the SNES iterations, >>>>>>>> starting close and then for more iterations and more solves they start >>>>>>>> moving apart. Or do they suddenly jump to be very different? You can run >>>>>>>> with -snes_monitor -ksp_monitor >>>>>>>> >>>>>>>> On May 3, 2023, at 9:07 PM, Mark Lohry wrote: >>>>>>>> >>>>>>>> This is on a single MPI rank. I haven't checked the coloring, was >>>>>>>> just guessing there. But the solutions/residuals are slightly different >>>>>>>> from run to run. >>>>>>>> >>>>>>>> Fair to say that for serial JFNK/asm ilu0/gmres we should expect >>>>>>>> bitwise identical results? >>>>>>>> >>>>>>>> >>>>>>>> On Wed, May 3, 2023, 8:50 PM Barry Smith wrote: >>>>>>>> >>>>>>>>> >>>>>>>>> No, the coloring should be identical every time. Do you see >>>>>>>>> differences with 1 MPI rank? (Or much smaller ones?). >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> > On May 3, 2023, at 8:42 PM, Mark Lohry wrote: >>>>>>>>> > >>>>>>>>> > I'm running multiple iterations of newtonls with an MFFD/JFNK >>>>>>>>> nonlinear solver where I give it the sparsity. PC asm, KSP gmres, with >>>>>>>>> SNESSetLagJacobian -2 (compute once and then frozen jacobian). >>>>>>>>> > >>>>>>>>> > I'm seeing slight (<1%) but nonzero differences in residuals >>>>>>>>> from run to run. I'm wondering where randomness might enter here -- does >>>>>>>>> the jacobian coloring use a random seed? >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>> >>>>>> >>>>> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Thu May 4 15:51:58 2023 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 4 May 2023 16:51:58 -0400 Subject: [petsc-users] sources of floating point randomness in JFNK in serial In-Reply-To: References: <5318727D-B9F9-48BC-A7CE-94EBDB08566F@petsc.dev> Message-ID: <2103AC4C-F90E-4C25-B494-B3A31EB3518B@petsc.dev> Do you get different results (in different runs) without -snes_mf_operator? So just using an explicit matrix? (Note: I am not convinced there is even a problem and think it may be simply different order of floating point operations in different runs.) Barry > On May 4, 2023, at 4:43 PM, Mark Lohry wrote: > >> Is your code valgrind clean? > > Yes, I also initialize all allocations with NaNs to be sure I'm not using anything uninitialized. > >> >> We can try and test this. Replace your MatMFFD with an actual matrix and run. Do you see any variability? > > I think I did what you're asking. 
I have -snes_mf_operator set, and then SNESSetJacobian(snes, diag_ones, diag_ones, NULL, NULL) where diag_ones is a matrix with ones on the diagonal. Two runs below, still with differences but sometimes identical. > > 0 SNES Function norm 3.424003312857e+04 > 0 KSP Residual norm 3.424003312857e+04 > 1 KSP Residual norm 2.871734444536e+04 > 2 KSP Residual norm 2.490276930242e+04 > 3 KSP Residual norm 2.131675872968e+04 > 4 KSP Residual norm 1.973129814235e+04 > 5 KSP Residual norm 1.832377856317e+04 > 6 KSP Residual norm 1.716783617436e+04 > 7 KSP Residual norm 1.583963149542e+04 > 8 KSP Residual norm 1.482272170304e+04 > 9 KSP Residual norm 1.380312106742e+04 > 10 KSP Residual norm 1.297793480658e+04 > 11 KSP Residual norm 1.208599123244e+04 > 12 KSP Residual norm 1.137345655227e+04 > 13 KSP Residual norm 1.059676909366e+04 > 14 KSP Residual norm 1.003823862398e+04 > 15 KSP Residual norm 9.425879221354e+03 > 16 KSP Residual norm 8.954805890038e+03 > 17 KSP Residual norm 8.592372470456e+03 > 18 KSP Residual norm 8.060707175821e+03 > 19 KSP Residual norm 7.782057728723e+03 > 20 KSP Residual norm 7.449686095424e+03 > Linear solve converged due to CONVERGED_ITS iterations 20 > KSP Object: 1 MPI process > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=20, initial guess is zero > tolerances: relative=0.1, absolute=1e-15, divergence=10. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI process > type: none > linear system matrix followed by preconditioner matrix: > Mat Object: 1 MPI process > type: mffd > rows=16384, cols=16384 > Matrix-free approximation: > err=1.49012e-08 (relative error in function evaluation) > Using wp compute h routine > Does not compute normU > Mat Object: 1 MPI process > type: seqaij > rows=16384, cols=16384 > total: nonzeros=16384, allocated nonzeros=16384 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > 1 SNES Function norm 1.085015646971e+04 > Nonlinear solve converged due to CONVERGED_ITS iterations 1 > SNES Object: 1 MPI process > type: newtonls > maximum iterations=1, maximum function evaluations=-1 > tolerances: relative=0.1, absolute=1e-15, solution=1e-15 > total number of linear solver iterations=20 > total number of function evaluations=23 > norm schedule ALWAYS > Jacobian is never rebuilt > Jacobian is applied matrix-free with differencing > Preconditioning Jacobian is built using finite differences with coloring > SNESLineSearch Object: 1 MPI process > type: basic > maxstep=1.000000e+08, minlambda=1.000000e-12 > tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 > maximum iterations=40 > KSP Object: 1 MPI process > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=20, initial guess is zero > tolerances: relative=0.1, absolute=1e-15, divergence=10. 
> left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI process > type: none > linear system matrix followed by preconditioner matrix: > Mat Object: 1 MPI process > type: mffd > rows=16384, cols=16384 > Matrix-free approximation: > err=1.49012e-08 (relative error in function evaluation) > Using wp compute h routine > Does not compute normU > Mat Object: 1 MPI process > type: seqaij > rows=16384, cols=16384 > total: nonzeros=16384, allocated nonzeros=16384 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > > 0 SNES Function norm 3.424003312857e+04 > 0 KSP Residual norm 3.424003312857e+04 > 1 KSP Residual norm 2.871734444536e+04 > 2 KSP Residual norm 2.490276931041e+04 > 3 KSP Residual norm 2.131675873776e+04 > 4 KSP Residual norm 1.973129814908e+04 > 5 KSP Residual norm 1.832377852186e+04 > 6 KSP Residual norm 1.716783608174e+04 > 7 KSP Residual norm 1.583963128956e+04 > 8 KSP Residual norm 1.482272160069e+04 > 9 KSP Residual norm 1.380312087005e+04 > 10 KSP Residual norm 1.297793458796e+04 > 11 KSP Residual norm 1.208599115602e+04 > 12 KSP Residual norm 1.137345657533e+04 > 13 KSP Residual norm 1.059676906197e+04 > 14 KSP Residual norm 1.003823857515e+04 > 15 KSP Residual norm 9.425879177747e+03 > 16 KSP Residual norm 8.954805850825e+03 > 17 KSP Residual norm 8.592372413320e+03 > 18 KSP Residual norm 8.060706994110e+03 > 19 KSP Residual norm 7.782057560782e+03 > 20 KSP Residual norm 7.449686034356e+03 > Linear solve converged due to CONVERGED_ITS iterations 20 > KSP Object: 1 MPI process > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=20, initial guess is zero > tolerances: relative=0.1, absolute=1e-15, divergence=10. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI process > type: none > linear system matrix followed by preconditioner matrix: > Mat Object: 1 MPI process > type: mffd > rows=16384, cols=16384 > Matrix-free approximation: > err=1.49012e-08 (relative error in function evaluation) > Using wp compute h routine > Does not compute normU > Mat Object: 1 MPI process > type: seqaij > rows=16384, cols=16384 > total: nonzeros=16384, allocated nonzeros=16384 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > 1 SNES Function norm 1.085015821006e+04 > Nonlinear solve converged due to CONVERGED_ITS iterations 1 > SNES Object: 1 MPI process > type: newtonls > maximum iterations=1, maximum function evaluations=-1 > tolerances: relative=0.1, absolute=1e-15, solution=1e-15 > total number of linear solver iterations=20 > total number of function evaluations=23 > norm schedule ALWAYS > Jacobian is never rebuilt > Jacobian is applied matrix-free with differencing > Preconditioning Jacobian is built using finite differences with coloring > SNESLineSearch Object: 1 MPI process > type: basic > maxstep=1.000000e+08, minlambda=1.000000e-12 > tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 > maximum iterations=40 > KSP Object: 1 MPI process > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=20, initial guess is zero > tolerances: relative=0.1, absolute=1e-15, divergence=10. 
> left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI process > type: none > linear system matrix followed by preconditioner matrix: > Mat Object: 1 MPI process > type: mffd > rows=16384, cols=16384 > Matrix-free approximation: > err=1.49012e-08 (relative error in function evaluation) > Using wp compute h routine > Does not compute normU > Mat Object: 1 MPI process > type: seqaij > rows=16384, cols=16384 > total: nonzeros=16384, allocated nonzeros=16384 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > > On Thu, May 4, 2023 at 10:10?AM Matthew Knepley > wrote: >> On Thu, May 4, 2023 at 8:54?AM Mark Lohry > wrote: >>>> Try -pc_type none. >>> >>> With -pc_type none the 0 KSP residual looks identical. But *sometimes* it's producing exactly the same history and others it's gradually changing. I'm reasonably confident my residual evaluation has no randomness, see info after the petsc output. >> >> We can try and test this. Replace your MatMFFD with an actual matrix and run. Do you see any variability? >> >> If not, then it could be your routine, or it could be MatMFFD. So run a few with -snes_view, and we can see if the >> "w" parameter changes. >> >> Thanks, >> >> Matt >> >>> solve history 1: >>> >>> 0 SNES Function norm 3.424003312857e+04 >>> 0 KSP Residual norm 3.424003312857e+04 >>> 1 KSP Residual norm 2.871734444536e+04 >>> 2 KSP Residual norm 2.490276931041e+04 >>> ... >>> 20 KSP Residual norm 7.449686034356e+03 >>> Linear solve converged due to CONVERGED_ITS iterations 20 >>> 1 SNES Function norm 1.085015821006e+04 >>> >>> solve history 2, identical to 1: >>> >>> 0 SNES Function norm 3.424003312857e+04 >>> 0 KSP Residual norm 3.424003312857e+04 >>> 1 KSP Residual norm 2.871734444536e+04 >>> 2 KSP Residual norm 2.490276931041e+04 >>> ... >>> 20 KSP Residual norm 7.449686034356e+03 >>> Linear solve converged due to CONVERGED_ITS iterations 20 >>> 1 SNES Function norm 1.085015821006e+04 >>> >>> solve history 3, identical KSP at 0 and 1, slight change at 2, growing difference to the end: >>> 0 SNES Function norm 3.424003312857e+04 >>> 0 KSP Residual norm 3.424003312857e+04 >>> 1 KSP Residual norm 2.871734444536e+04 >>> 2 KSP Residual norm 2.490276930242e+04 >>> ... 
>>> 20 KSP Residual norm 7.449686095424e+03 >>> Linear solve converged due to CONVERGED_ITS iterations 20 >>> 1 SNES Function norm 1.085015646971e+04 >>> >>> >>> Ths is using a standard explicit 3-stage Runge-Kutta smoother for 10 iterations, so 30 calls of the same residual evaluation, identical residuals every time >>> >>> run 1: >>> >>> # iteration rho rhou rhov rhoE abs_res rel_res umin vmax vmin elapsed_time >>> # >>> 1.00000e+00 1.086860616292e+00 2.782316758416e+02 4.482867643761e+00 2.993435920340e+02 2.04353e+02 1.00000e+00 -8.23945e-15 -6.15326e-15 -1.35563e-14 6.34834e-01 >>> 2.00000e+00 2.310547487017e+00 1.079059352425e+02 3.958323921837e+00 5.058927165686e+02 2.58647e+02 1.26568e+00 -1.02539e-14 -9.35368e-15 -1.69925e-14 6.40063e-01 >>> 3.00000e+00 2.361005867444e+00 5.706213331683e+01 6.130016323357e+00 4.688968362579e+02 2.36201e+02 1.15585e+00 -1.19370e-14 -1.15216e-14 -1.59733e-14 6.45166e-01 >>> 4.00000e+00 2.167518999963e+00 3.757541401594e+01 6.313917437428e+00 4.054310291628e+02 2.03612e+02 9.96372e-01 -1.81831e-14 -1.28312e-14 -1.46238e-14 6.50494e-01 >>> 5.00000e+00 1.941443738676e+00 2.884190334049e+01 6.237106158479e+00 3.539201037156e+02 1.77577e+02 8.68970e-01 3.56633e-14 -8.74089e-15 -1.06666e-14 6.55656e-01 >>> 6.00000e+00 1.736947124693e+00 2.429485695670e+01 5.996962200407e+00 3.148280178142e+02 1.57913e+02 7.72745e-01 -8.98634e-14 -2.41152e-14 -1.39713e-14 6.60872e-01 >>> 7.00000e+00 1.564153212635e+00 2.149609219810e+01 5.786910705204e+00 2.848717011033e+02 1.42872e+02 6.99144e-01 -2.95352e-13 -2.48158e-14 -2.39351e-14 6.66041e-01 >>> 8.00000e+00 1.419280815384e+00 1.950619804089e+01 5.627281158306e+00 2.606623371229e+02 1.30728e+02 6.39715e-01 8.98941e-13 1.09674e-13 3.78905e-14 6.71316e-01 >>> 9.00000e+00 1.296115915975e+00 1.794843530745e+01 5.514933264437e+00 2.401524522393e+02 1.20444e+02 5.89394e-01 1.70717e-12 1.38762e-14 1.09825e-13 6.76447e-01 >>> 1.00000e+01 1.189639693918e+00 1.665381754953e+01 5.433183087037e+00 2.222572900473e+02 1.11475e+02 5.45501e-01 -4.22462e-12 -7.15206e-13 -2.28736e-13 6.81716e-01 >>> >>> run N: >>> >>> >>> # >>> # iteration rho rhou rhov rhoE abs_res rel_res umin vmax vmin elapsed_time >>> # >>> 1.00000e+00 1.086860616292e+00 2.782316758416e+02 4.482867643761e+00 2.993435920340e+02 2.04353e+02 1.00000e+00 -8.23945e-15 -6.15326e-15 -1.35563e-14 6.23316e-01 >>> 2.00000e+00 2.310547487017e+00 1.079059352425e+02 3.958323921837e+00 5.058927165686e+02 2.58647e+02 1.26568e+00 -1.02539e-14 -9.35368e-15 -1.69925e-14 6.28510e-01 >>> 3.00000e+00 2.361005867444e+00 5.706213331683e+01 6.130016323357e+00 4.688968362579e+02 2.36201e+02 1.15585e+00 -1.19370e-14 -1.15216e-14 -1.59733e-14 6.33558e-01 >>> 4.00000e+00 2.167518999963e+00 3.757541401594e+01 6.313917437428e+00 4.054310291628e+02 2.03612e+02 9.96372e-01 -1.81831e-14 -1.28312e-14 -1.46238e-14 6.38773e-01 >>> 5.00000e+00 1.941443738676e+00 2.884190334049e+01 6.237106158479e+00 3.539201037156e+02 1.77577e+02 8.68970e-01 3.56633e-14 -8.74089e-15 -1.06666e-14 6.43887e-01 >>> 6.00000e+00 1.736947124693e+00 2.429485695670e+01 5.996962200407e+00 3.148280178142e+02 1.57913e+02 7.72745e-01 -8.98634e-14 -2.41152e-14 -1.39713e-14 6.49073e-01 >>> 7.00000e+00 1.564153212635e+00 2.149609219810e+01 5.786910705204e+00 2.848717011033e+02 1.42872e+02 6.99144e-01 -2.95352e-13 -2.48158e-14 -2.39351e-14 6.54167e-01 >>> 8.00000e+00 1.419280815384e+00 1.950619804089e+01 5.627281158306e+00 2.606623371229e+02 1.30728e+02 6.39715e-01 8.98941e-13 1.09674e-13 3.78905e-14 6.59394e-01 >>> 9.00000e+00 
1.296115915975e+00 1.794843530745e+01 5.514933264437e+00 2.401524522393e+02 1.20444e+02 5.89394e-01 1.70717e-12 1.38762e-14 1.09825e-13 6.64516e-01 >>> 1.00000e+01 1.189639693918e+00 1.665381754953e+01 5.433183087037e+00 2.222572900473e+02 1.11475e+02 5.45501e-01 -4.22462e-12 -7.15206e-13 -2.28736e-13 6.69677e-01 >>> >>> >>> >>> >>> >>> On Thu, May 4, 2023 at 8:41?AM Mark Adams > wrote: >>>> ASM is just the sub PC with one proc but gets weaker with more procs unless you use jacobi. (maybe I am missing something). >>>> >>>> On Thu, May 4, 2023 at 8:31?AM Mark Lohry > wrote: >>>>>> Please send the output of -snes_view. >>>>> pasted below. anything stand out? >>>>> >>>>> >>>>> SNES Object: 1 MPI process >>>>> type: newtonls >>>>> maximum iterations=1, maximum function evaluations=-1 >>>>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>>>> total number of linear solver iterations=20 >>>>> total number of function evaluations=22 >>>>> norm schedule ALWAYS >>>>> Jacobian is never rebuilt >>>>> Jacobian is applied matrix-free with differencing >>>>> Preconditioning Jacobian is built using finite differences with coloring >>>>> SNESLineSearch Object: 1 MPI process >>>>> type: basic >>>>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>>>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 >>>>> maximum iterations=40 >>>>> KSP Object: 1 MPI process >>>>> type: gmres >>>>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>>>> happy breakdown tolerance 1e-30 >>>>> maximum iterations=20, initial guess is zero >>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. >>>>> left preconditioning >>>>> using PRECONDITIONED norm type for convergence test >>>>> PC Object: 1 MPI process >>>>> type: asm >>>>> total subdomain blocks = 1, amount of overlap = 0 >>>>> restriction/interpolation type - RESTRICT >>>>> Local solver information for first block is in the following KSP and PC objects on rank 0: >>>>> Use -ksp_view ::ascii_info_detail to display information for all blocks >>>>> KSP Object: (sub_) 1 MPI process >>>>> type: preonly >>>>> maximum iterations=10000, initial guess is zero >>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>> left preconditioning >>>>> using NONE norm type for convergence test >>>>> PC Object: (sub_) 1 MPI process >>>>> type: ilu >>>>> out-of-place factorization >>>>> 0 levels of fill >>>>> tolerance for zero pivot 2.22045e-14 >>>>> matrix ordering: natural >>>>> factor fill ratio given 1., needed 1. 
>>>>> Factored matrix follows: >>>>> Mat Object: (sub_) 1 MPI process >>>>> type: seqbaij >>>>> rows=16384, cols=16384, bs=16 >>>>> package used to perform factorization: petsc >>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>> block size is 16 >>>>> linear system matrix = precond matrix: >>>>> Mat Object: (sub_) 1 MPI process >>>>> type: seqbaij >>>>> rows=16384, cols=16384, bs=16 >>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> block size is 16 >>>>> linear system matrix followed by preconditioner matrix: >>>>> Mat Object: 1 MPI process >>>>> type: mffd >>>>> rows=16384, cols=16384 >>>>> Matrix-free approximation: >>>>> err=1.49012e-08 (relative error in function evaluation) >>>>> Using wp compute h routine >>>>> Does not compute normU >>>>> Mat Object: 1 MPI process >>>>> type: seqbaij >>>>> rows=16384, cols=16384, bs=16 >>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> block size is 16 >>>>> >>>>> On Thu, May 4, 2023 at 8:30?AM Mark Adams > wrote: >>>>>> If you are using MG what is the coarse grid solver? >>>>>> -snes_view might give you that. >>>>>> >>>>>> On Thu, May 4, 2023 at 8:25?AM Matthew Knepley > wrote: >>>>>>> On Thu, May 4, 2023 at 8:21?AM Mark Lohry > wrote: >>>>>>>>> Do they start very similarly and then slowly drift further apart? >>>>>>>> >>>>>>>> Yes, this. I take it this sounds familiar? >>>>>>>> >>>>>>>> See these two examples with 20 fixed iterations pasted at the end. The difference for one solve is slight (final SNES norm is identical to 5 digits), but in the context I'm using it in (repeated applications to solve a steady state multigrid problem, though here just one level) the differences add up such that I might reach global convergence in 35 iterations or 38. It's not the end of the world, but I was expecting that with -np 1 these would be identical and I'm not sure where the root cause would be. >>>>>>> >>>>>>> The initial KSP residual is different, so its the PC. Please send the output of -snes_view. If your ASM is using direct factorization, then it >>>>>>> could be randomness in whatever LU you are using. >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Matt >>>>>>> >>>>>>>> 0 SNES Function norm 2.801842107848e+04 >>>>>>>> 0 KSP Residual norm 4.045639499595e+01 >>>>>>>> 1 KSP Residual norm 1.917999809040e+01 >>>>>>>> 2 KSP Residual norm 1.616048521958e+01 >>>>>>>> [...] >>>>>>>> 19 KSP Residual norm 8.788043518111e-01 >>>>>>>> 20 KSP Residual norm 6.570851270214e-01 >>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>>> 1 SNES Function norm 1.801309983345e+03 >>>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>>>>> >>>>>>>> >>>>>>>> Same system, identical initial 0 SNES norm, 0 KSP is slightly different >>>>>>>> >>>>>>>> 0 SNES Function norm 2.801842107848e+04 >>>>>>>> 0 KSP Residual norm 4.045639473002e+01 >>>>>>>> 1 KSP Residual norm 1.917999883034e+01 >>>>>>>> 2 KSP Residual norm 1.616048572016e+01 >>>>>>>> [...] >>>>>>>> 19 KSP Residual norm 8.788046348957e-01 >>>>>>>> 20 KSP Residual norm 6.570859588610e-01 >>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>>> 1 SNES Function norm 1.801311320322e+03 >>>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>>>>> >>>>>>>> On Wed, May 3, 2023 at 11:05?PM Barry Smith > wrote: >>>>>>>>> >>>>>>>>> Do they start very similarly and then slowly drift further apart? 
That is the first couple of KSP iterations they are almost identical but then for each iteration get a bit further. Similar for the SNES iterations, starting close and then for more iterations and more solves they start moving apart. Or do they suddenly jump to be very different? You can run with -snes_monitor -ksp_monitor >>>>>>>>> >>>>>>>>>> On May 3, 2023, at 9:07 PM, Mark Lohry > wrote: >>>>>>>>>> >>>>>>>>>> This is on a single MPI rank. I haven't checked the coloring, was just guessing there. But the solutions/residuals are slightly different from run to run. >>>>>>>>>> >>>>>>>>>> Fair to say that for serial JFNK/asm ilu0/gmres we should expect bitwise identical results? >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Wed, May 3, 2023, 8:50 PM Barry Smith > wrote: >>>>>>>>>>> >>>>>>>>>>> No, the coloring should be identical every time. Do you see differences with 1 MPI rank? (Or much smaller ones?). >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> > On May 3, 2023, at 8:42 PM, Mark Lohry > wrote: >>>>>>>>>>> > >>>>>>>>>>> > I'm running multiple iterations of newtonls with an MFFD/JFNK nonlinear solver where I give it the sparsity. PC asm, KSP gmres, with SNESSetLagJacobian -2 (compute once and then frozen jacobian). >>>>>>>>>>> > >>>>>>>>>>> > I'm seeing slight (<1%) but nonzero differences in residuals from run to run. I'm wondering where randomness might enter here -- does the jacobian coloring use a random seed? >>>>>>>>>>> >>>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>>> -- Norbert Wiener >>>>>>> >>>>>>> https://www.cse.buffalo.edu/~knepley/ >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu May 4 15:56:53 2023 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 4 May 2023 16:56:53 -0400 Subject: [petsc-users] sources of floating point randomness in JFNK in serial In-Reply-To: References: <5318727D-B9F9-48BC-A7CE-94EBDB08566F@petsc.dev> Message-ID: On Thu, May 4, 2023 at 4:44?PM Mark Lohry wrote: > Is your code valgrind clean? >> > > Yes, I also initialize all allocations with NaNs to be sure I'm not using > anything uninitialized. > > >> We can try and test this. Replace your MatMFFD with an actual matrix and >> run. Do you see any variability? >> > > I think I did what you're asking. I have -snes_mf_operator set, and then > SNESSetJacobian(snes, diag_ones, diag_ones, NULL, NULL) where diag_ones is > a matrix with ones on the diagonal. Two runs below, still with differences > but sometimes identical. > No, I mean without -snes_mf_* (as Barry says), so we are just running that solver with a sparse matrix. This would give me confidence that nothing in the solver is variable. 
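A minimal sketch of that check (assuming a Mat J preallocated with the same block sparsity already used for the preconditioner; the names snes, J, x are illustrative, not from the thread):

/* Run the same solve with an explicitly assembled Jacobian instead of MFFD:
   no -snes_mf / -snes_mf_operator on the command line, and let SNES fill J
   by finite differences with coloring taken from J's nonzero pattern. */
PetscCall(SNESSetJacobian(snes, J, J, SNESComputeJacobianDefaultColor, NULL));
PetscCall(SNESSolve(snes, NULL, x));

or, roughly equivalently from the command line, -snes_fd_color with the matrix-free options removed. If the KSP histories are then bitwise identical from run to run, the remaining variability points at the MatMFFD differencing (the w/h parameter) or the residual evaluation rather than at GMRES itself.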
Thanks, Matt > 0 SNES Function norm 3.424003312857e+04 > 0 KSP Residual norm 3.424003312857e+04 > 1 KSP Residual norm 2.871734444536e+04 > 2 KSP Residual norm 2.490276930242e+04 > 3 KSP Residual norm 2.131675872968e+04 > 4 KSP Residual norm 1.973129814235e+04 > 5 KSP Residual norm 1.832377856317e+04 > 6 KSP Residual norm 1.716783617436e+04 > 7 KSP Residual norm 1.583963149542e+04 > 8 KSP Residual norm 1.482272170304e+04 > 9 KSP Residual norm 1.380312106742e+04 > 10 KSP Residual norm 1.297793480658e+04 > 11 KSP Residual norm 1.208599123244e+04 > 12 KSP Residual norm 1.137345655227e+04 > 13 KSP Residual norm 1.059676909366e+04 > 14 KSP Residual norm 1.003823862398e+04 > 15 KSP Residual norm 9.425879221354e+03 > 16 KSP Residual norm 8.954805890038e+03 > 17 KSP Residual norm 8.592372470456e+03 > 18 KSP Residual norm 8.060707175821e+03 > 19 KSP Residual norm 7.782057728723e+03 > 20 KSP Residual norm 7.449686095424e+03 > Linear solve converged due to CONVERGED_ITS iterations 20 > KSP Object: 1 MPI process > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=20, initial guess is zero > tolerances: relative=0.1, absolute=1e-15, divergence=10. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI process > type: none > linear system matrix followed by preconditioner matrix: > Mat Object: 1 MPI process > type: mffd > rows=16384, cols=16384 > Matrix-free approximation: > err=1.49012e-08 (relative error in function evaluation) > Using wp compute h routine > Does not compute normU > Mat Object: 1 MPI process > type: seqaij > rows=16384, cols=16384 > total: nonzeros=16384, allocated nonzeros=16384 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > 1 SNES Function norm 1.085015646971e+04 > Nonlinear solve converged due to CONVERGED_ITS iterations 1 > SNES Object: 1 MPI process > type: newtonls > maximum iterations=1, maximum function evaluations=-1 > tolerances: relative=0.1, absolute=1e-15, solution=1e-15 > total number of linear solver iterations=20 > total number of function evaluations=23 > norm schedule ALWAYS > Jacobian is never rebuilt > Jacobian is applied matrix-free with differencing > Preconditioning Jacobian is built using finite differences with coloring > SNESLineSearch Object: 1 MPI process > type: basic > maxstep=1.000000e+08, minlambda=1.000000e-12 > tolerances: relative=1.000000e-08, absolute=1.000000e-15, > lambda=1.000000e-08 > maximum iterations=40 > KSP Object: 1 MPI process > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=20, initial guess is zero > tolerances: relative=0.1, absolute=1e-15, divergence=10. 
> left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI process > type: none > linear system matrix followed by preconditioner matrix: > Mat Object: 1 MPI process > type: mffd > rows=16384, cols=16384 > Matrix-free approximation: > err=1.49012e-08 (relative error in function evaluation) > Using wp compute h routine > Does not compute normU > Mat Object: 1 MPI process > type: seqaij > rows=16384, cols=16384 > total: nonzeros=16384, allocated nonzeros=16384 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > > 0 SNES Function norm 3.424003312857e+04 > 0 KSP Residual norm 3.424003312857e+04 > 1 KSP Residual norm 2.871734444536e+04 > 2 KSP Residual norm 2.490276931041e+04 > 3 KSP Residual norm 2.131675873776e+04 > 4 KSP Residual norm 1.973129814908e+04 > 5 KSP Residual norm 1.832377852186e+04 > 6 KSP Residual norm 1.716783608174e+04 > 7 KSP Residual norm 1.583963128956e+04 > 8 KSP Residual norm 1.482272160069e+04 > 9 KSP Residual norm 1.380312087005e+04 > 10 KSP Residual norm 1.297793458796e+04 > 11 KSP Residual norm 1.208599115602e+04 > 12 KSP Residual norm 1.137345657533e+04 > 13 KSP Residual norm 1.059676906197e+04 > 14 KSP Residual norm 1.003823857515e+04 > 15 KSP Residual norm 9.425879177747e+03 > 16 KSP Residual norm 8.954805850825e+03 > 17 KSP Residual norm 8.592372413320e+03 > 18 KSP Residual norm 8.060706994110e+03 > 19 KSP Residual norm 7.782057560782e+03 > 20 KSP Residual norm 7.449686034356e+03 > Linear solve converged due to CONVERGED_ITS iterations 20 > KSP Object: 1 MPI process > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=20, initial guess is zero > tolerances: relative=0.1, absolute=1e-15, divergence=10. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI process > type: none > linear system matrix followed by preconditioner matrix: > Mat Object: 1 MPI process > type: mffd > rows=16384, cols=16384 > Matrix-free approximation: > err=1.49012e-08 (relative error in function evaluation) > Using wp compute h routine > Does not compute normU > Mat Object: 1 MPI process > type: seqaij > rows=16384, cols=16384 > total: nonzeros=16384, allocated nonzeros=16384 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > 1 SNES Function norm 1.085015821006e+04 > Nonlinear solve converged due to CONVERGED_ITS iterations 1 > SNES Object: 1 MPI process > type: newtonls > maximum iterations=1, maximum function evaluations=-1 > tolerances: relative=0.1, absolute=1e-15, solution=1e-15 > total number of linear solver iterations=20 > total number of function evaluations=23 > norm schedule ALWAYS > Jacobian is never rebuilt > Jacobian is applied matrix-free with differencing > Preconditioning Jacobian is built using finite differences with coloring > SNESLineSearch Object: 1 MPI process > type: basic > maxstep=1.000000e+08, minlambda=1.000000e-12 > tolerances: relative=1.000000e-08, absolute=1.000000e-15, > lambda=1.000000e-08 > maximum iterations=40 > KSP Object: 1 MPI process > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=20, initial guess is zero > tolerances: relative=0.1, absolute=1e-15, divergence=10. 
> left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI process > type: none > linear system matrix followed by preconditioner matrix: > Mat Object: 1 MPI process > type: mffd > rows=16384, cols=16384 > Matrix-free approximation: > err=1.49012e-08 (relative error in function evaluation) > Using wp compute h routine > Does not compute normU > Mat Object: 1 MPI process > type: seqaij > rows=16384, cols=16384 > total: nonzeros=16384, allocated nonzeros=16384 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > > On Thu, May 4, 2023 at 10:10?AM Matthew Knepley wrote: > >> On Thu, May 4, 2023 at 8:54?AM Mark Lohry wrote: >> >>> Try -pc_type none. >>>> >>> >>> With -pc_type none the 0 KSP residual looks identical. But *sometimes* >>> it's producing exactly the same history and others it's gradually >>> changing. I'm reasonably confident my residual evaluation has no >>> randomness, see info after the petsc output. >>> >> >> We can try and test this. Replace your MatMFFD with an actual matrix and >> run. Do you see any variability? >> >> If not, then it could be your routine, or it could be MatMFFD. So run a >> few with -snes_view, and we can see if the >> "w" parameter changes. >> >> Thanks, >> >> Matt >> >> >>> solve history 1: >>> >>> 0 SNES Function norm 3.424003312857e+04 >>> 0 KSP Residual norm 3.424003312857e+04 >>> 1 KSP Residual norm 2.871734444536e+04 >>> 2 KSP Residual norm 2.490276931041e+04 >>> ... >>> 20 KSP Residual norm 7.449686034356e+03 >>> Linear solve converged due to CONVERGED_ITS iterations 20 >>> 1 SNES Function norm 1.085015821006e+04 >>> >>> solve history 2, identical to 1: >>> >>> 0 SNES Function norm 3.424003312857e+04 >>> 0 KSP Residual norm 3.424003312857e+04 >>> 1 KSP Residual norm 2.871734444536e+04 >>> 2 KSP Residual norm 2.490276931041e+04 >>> ... >>> 20 KSP Residual norm 7.449686034356e+03 >>> Linear solve converged due to CONVERGED_ITS iterations 20 >>> 1 SNES Function norm 1.085015821006e+04 >>> >>> solve history 3, identical KSP at 0 and 1, slight change at 2, growing >>> difference to the end: >>> 0 SNES Function norm 3.424003312857e+04 >>> 0 KSP Residual norm 3.424003312857e+04 >>> 1 KSP Residual norm 2.871734444536e+04 >>> 2 KSP Residual norm 2.490276930242e+04 >>> ... 
>>> 20 KSP Residual norm 7.449686095424e+03 >>> Linear solve converged due to CONVERGED_ITS iterations 20 >>> 1 SNES Function norm 1.085015646971e+04 >>> >>> >>> Ths is using a standard explicit 3-stage Runge-Kutta smoother for 10 >>> iterations, so 30 calls of the same residual evaluation, identical >>> residuals every time >>> >>> run 1: >>> >>> # iteration rho rhou rhov >>> rhoE abs_res rel_res umin >>> vmax vmin elapsed_time >>> # >>> >>> >>> 1.00000e+00 1.086860616292e+00 2.782316758416e+02 >>> 4.482867643761e+00 2.993435920340e+02 2.04353e+02 >>> 1.00000e+00 -8.23945e-15 -6.15326e-15 -1.35563e-14 >>> 6.34834e-01 >>> 2.00000e+00 2.310547487017e+00 1.079059352425e+02 >>> 3.958323921837e+00 5.058927165686e+02 2.58647e+02 >>> 1.26568e+00 -1.02539e-14 -9.35368e-15 -1.69925e-14 >>> 6.40063e-01 >>> 3.00000e+00 2.361005867444e+00 5.706213331683e+01 >>> 6.130016323357e+00 4.688968362579e+02 2.36201e+02 >>> 1.15585e+00 -1.19370e-14 -1.15216e-14 -1.59733e-14 >>> 6.45166e-01 >>> 4.00000e+00 2.167518999963e+00 3.757541401594e+01 >>> 6.313917437428e+00 4.054310291628e+02 2.03612e+02 >>> 9.96372e-01 -1.81831e-14 -1.28312e-14 -1.46238e-14 >>> 6.50494e-01 >>> 5.00000e+00 1.941443738676e+00 2.884190334049e+01 >>> 6.237106158479e+00 3.539201037156e+02 1.77577e+02 >>> 8.68970e-01 3.56633e-14 -8.74089e-15 -1.06666e-14 >>> 6.55656e-01 >>> 6.00000e+00 1.736947124693e+00 2.429485695670e+01 >>> 5.996962200407e+00 3.148280178142e+02 1.57913e+02 >>> 7.72745e-01 -8.98634e-14 -2.41152e-14 -1.39713e-14 >>> 6.60872e-01 >>> 7.00000e+00 1.564153212635e+00 2.149609219810e+01 >>> 5.786910705204e+00 2.848717011033e+02 1.42872e+02 >>> 6.99144e-01 -2.95352e-13 -2.48158e-14 -2.39351e-14 >>> 6.66041e-01 >>> 8.00000e+00 1.419280815384e+00 1.950619804089e+01 >>> 5.627281158306e+00 2.606623371229e+02 1.30728e+02 >>> 6.39715e-01 8.98941e-13 1.09674e-13 3.78905e-14 >>> 6.71316e-01 >>> 9.00000e+00 1.296115915975e+00 1.794843530745e+01 >>> 5.514933264437e+00 2.401524522393e+02 1.20444e+02 >>> 5.89394e-01 1.70717e-12 1.38762e-14 1.09825e-13 >>> 6.76447e-01 >>> 1.00000e+01 1.189639693918e+00 1.665381754953e+01 >>> 5.433183087037e+00 2.222572900473e+02 1.11475e+02 >>> 5.45501e-01 -4.22462e-12 -7.15206e-13 -2.28736e-13 >>> 6.81716e-01 >>> >>> run N: >>> >>> >>> # >>> >>> >>> # iteration rho rhou rhov >>> rhoE abs_res rel_res umin >>> vmax vmin elapsed_time >>> # >>> >>> >>> 1.00000e+00 1.086860616292e+00 2.782316758416e+02 >>> 4.482867643761e+00 2.993435920340e+02 2.04353e+02 >>> 1.00000e+00 -8.23945e-15 -6.15326e-15 -1.35563e-14 >>> 6.23316e-01 >>> 2.00000e+00 2.310547487017e+00 1.079059352425e+02 >>> 3.958323921837e+00 5.058927165686e+02 2.58647e+02 >>> 1.26568e+00 -1.02539e-14 -9.35368e-15 -1.69925e-14 >>> 6.28510e-01 >>> 3.00000e+00 2.361005867444e+00 5.706213331683e+01 >>> 6.130016323357e+00 4.688968362579e+02 2.36201e+02 >>> 1.15585e+00 -1.19370e-14 -1.15216e-14 -1.59733e-14 >>> 6.33558e-01 >>> 4.00000e+00 2.167518999963e+00 3.757541401594e+01 >>> 6.313917437428e+00 4.054310291628e+02 2.03612e+02 >>> 9.96372e-01 -1.81831e-14 -1.28312e-14 -1.46238e-14 >>> 6.38773e-01 >>> 5.00000e+00 1.941443738676e+00 2.884190334049e+01 >>> 6.237106158479e+00 3.539201037156e+02 1.77577e+02 >>> 8.68970e-01 3.56633e-14 -8.74089e-15 -1.06666e-14 >>> 6.43887e-01 >>> 6.00000e+00 1.736947124693e+00 2.429485695670e+01 >>> 5.996962200407e+00 3.148280178142e+02 1.57913e+02 >>> 7.72745e-01 -8.98634e-14 -2.41152e-14 -1.39713e-14 >>> 6.49073e-01 >>> 7.00000e+00 1.564153212635e+00 2.149609219810e+01 >>> 5.786910705204e+00 2.848717011033e+02 1.42872e+02 >>> 
6.99144e-01 -2.95352e-13 -2.48158e-14 -2.39351e-14 >>> 6.54167e-01 >>> 8.00000e+00 1.419280815384e+00 1.950619804089e+01 >>> 5.627281158306e+00 2.606623371229e+02 1.30728e+02 >>> 6.39715e-01 8.98941e-13 1.09674e-13 3.78905e-14 >>> 6.59394e-01 >>> 9.00000e+00 1.296115915975e+00 1.794843530745e+01 >>> 5.514933264437e+00 2.401524522393e+02 1.20444e+02 >>> 5.89394e-01 1.70717e-12 1.38762e-14 1.09825e-13 >>> 6.64516e-01 >>> 1.00000e+01 1.189639693918e+00 1.665381754953e+01 >>> 5.433183087037e+00 2.222572900473e+02 1.11475e+02 >>> 5.45501e-01 -4.22462e-12 -7.15206e-13 -2.28736e-13 >>> 6.69677e-01 >>> >>> >>> >>> >>> >>> On Thu, May 4, 2023 at 8:41?AM Mark Adams wrote: >>> >>>> ASM is just the sub PC with one proc but gets weaker with more procs >>>> unless you use jacobi. (maybe I am missing something). >>>> >>>> On Thu, May 4, 2023 at 8:31?AM Mark Lohry wrote: >>>> >>>>> Please send the output of -snes_view. >>>>>> >>>>> pasted below. anything stand out? >>>>> >>>>> >>>>> SNES Object: 1 MPI process >>>>> type: newtonls >>>>> maximum iterations=1, maximum function evaluations=-1 >>>>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>>>> total number of linear solver iterations=20 >>>>> total number of function evaluations=22 >>>>> norm schedule ALWAYS >>>>> Jacobian is never rebuilt >>>>> Jacobian is applied matrix-free with differencing >>>>> Preconditioning Jacobian is built using finite differences with >>>>> coloring >>>>> SNESLineSearch Object: 1 MPI process >>>>> type: basic >>>>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>>>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, >>>>> lambda=1.000000e-08 >>>>> maximum iterations=40 >>>>> KSP Object: 1 MPI process >>>>> type: gmres >>>>> restart=30, using Classical (unmodified) Gram-Schmidt >>>>> Orthogonalization with no iterative refinement >>>>> happy breakdown tolerance 1e-30 >>>>> maximum iterations=20, initial guess is zero >>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. >>>>> left preconditioning >>>>> using PRECONDITIONED norm type for convergence test >>>>> PC Object: 1 MPI process >>>>> type: asm >>>>> total subdomain blocks = 1, amount of overlap = 0 >>>>> restriction/interpolation type - RESTRICT >>>>> Local solver information for first block is in the following KSP >>>>> and PC objects on rank 0: >>>>> Use -ksp_view ::ascii_info_detail to display information for all >>>>> blocks >>>>> KSP Object: (sub_) 1 MPI process >>>>> type: preonly >>>>> maximum iterations=10000, initial guess is zero >>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>> left preconditioning >>>>> using NONE norm type for convergence test >>>>> PC Object: (sub_) 1 MPI process >>>>> type: ilu >>>>> out-of-place factorization >>>>> 0 levels of fill >>>>> tolerance for zero pivot 2.22045e-14 >>>>> matrix ordering: natural >>>>> factor fill ratio given 1., needed 1. 
>>>>> Factored matrix follows: >>>>> Mat Object: (sub_) 1 MPI process >>>>> type: seqbaij >>>>> rows=16384, cols=16384, bs=16 >>>>> package used to perform factorization: petsc >>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>> block size is 16 >>>>> linear system matrix = precond matrix: >>>>> Mat Object: (sub_) 1 MPI process >>>>> type: seqbaij >>>>> rows=16384, cols=16384, bs=16 >>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> block size is 16 >>>>> linear system matrix followed by preconditioner matrix: >>>>> Mat Object: 1 MPI process >>>>> type: mffd >>>>> rows=16384, cols=16384 >>>>> Matrix-free approximation: >>>>> err=1.49012e-08 (relative error in function evaluation) >>>>> Using wp compute h routine >>>>> Does not compute normU >>>>> Mat Object: 1 MPI process >>>>> type: seqbaij >>>>> rows=16384, cols=16384, bs=16 >>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> block size is 16 >>>>> >>>>> On Thu, May 4, 2023 at 8:30?AM Mark Adams wrote: >>>>> >>>>>> If you are using MG what is the coarse grid solver? >>>>>> -snes_view might give you that. >>>>>> >>>>>> On Thu, May 4, 2023 at 8:25?AM Matthew Knepley >>>>>> wrote: >>>>>> >>>>>>> On Thu, May 4, 2023 at 8:21?AM Mark Lohry wrote: >>>>>>> >>>>>>>> Do they start very similarly and then slowly drift further apart? >>>>>>>> >>>>>>>> >>>>>>>> Yes, this. I take it this sounds familiar? >>>>>>>> >>>>>>>> See these two examples with 20 fixed iterations pasted at the end. >>>>>>>> The difference for one solve is slight (final SNES norm is identical to 5 >>>>>>>> digits), but in the context I'm using it in (repeated applications to solve >>>>>>>> a steady state multigrid problem, though here just one level) the >>>>>>>> differences add up such that I might reach global convergence in 35 >>>>>>>> iterations or 38. It's not the end of the world, but I was expecting that >>>>>>>> with -np 1 these would be identical and I'm not sure where the root cause >>>>>>>> would be. >>>>>>>> >>>>>>> >>>>>>> The initial KSP residual is different, so its the PC. Please send >>>>>>> the output of -snes_view. If your ASM is using direct factorization, then it >>>>>>> could be randomness in whatever LU you are using. >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Matt >>>>>>> >>>>>>> >>>>>>>> 0 SNES Function norm 2.801842107848e+04 >>>>>>>> 0 KSP Residual norm 4.045639499595e+01 >>>>>>>> 1 KSP Residual norm 1.917999809040e+01 >>>>>>>> 2 KSP Residual norm 1.616048521958e+01 >>>>>>>> [...] >>>>>>>> 19 KSP Residual norm 8.788043518111e-01 >>>>>>>> 20 KSP Residual norm 6.570851270214e-01 >>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>>> 1 SNES Function norm 1.801309983345e+03 >>>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>>>>> >>>>>>>> >>>>>>>> Same system, identical initial 0 SNES norm, 0 KSP is slightly >>>>>>>> different >>>>>>>> >>>>>>>> 0 SNES Function norm 2.801842107848e+04 >>>>>>>> 0 KSP Residual norm 4.045639473002e+01 >>>>>>>> 1 KSP Residual norm 1.917999883034e+01 >>>>>>>> 2 KSP Residual norm 1.616048572016e+01 >>>>>>>> [...] 
>>>>>>>> 19 KSP Residual norm 8.788046348957e-01 >>>>>>>> 20 KSP Residual norm 6.570859588610e-01 >>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>>> 1 SNES Function norm 1.801311320322e+03 >>>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>>>>> >>>>>>>> On Wed, May 3, 2023 at 11:05?PM Barry Smith >>>>>>>> wrote: >>>>>>>> >>>>>>>>> >>>>>>>>> Do they start very similarly and then slowly drift further >>>>>>>>> apart? That is the first couple of KSP iterations they are almost identical >>>>>>>>> but then for each iteration get a bit further. Similar for the SNES >>>>>>>>> iterations, starting close and then for more iterations and more solves >>>>>>>>> they start moving apart. Or do they suddenly jump to be very different? You >>>>>>>>> can run with -snes_monitor -ksp_monitor >>>>>>>>> >>>>>>>>> On May 3, 2023, at 9:07 PM, Mark Lohry wrote: >>>>>>>>> >>>>>>>>> This is on a single MPI rank. I haven't checked the coloring, was >>>>>>>>> just guessing there. But the solutions/residuals are slightly different >>>>>>>>> from run to run. >>>>>>>>> >>>>>>>>> Fair to say that for serial JFNK/asm ilu0/gmres we should expect >>>>>>>>> bitwise identical results? >>>>>>>>> >>>>>>>>> >>>>>>>>> On Wed, May 3, 2023, 8:50 PM Barry Smith wrote: >>>>>>>>> >>>>>>>>>> >>>>>>>>>> No, the coloring should be identical every time. Do you see >>>>>>>>>> differences with 1 MPI rank? (Or much smaller ones?). >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> > On May 3, 2023, at 8:42 PM, Mark Lohry >>>>>>>>>> wrote: >>>>>>>>>> > >>>>>>>>>> > I'm running multiple iterations of newtonls with an MFFD/JFNK >>>>>>>>>> nonlinear solver where I give it the sparsity. PC asm, KSP gmres, with >>>>>>>>>> SNESSetLagJacobian -2 (compute once and then frozen jacobian). >>>>>>>>>> > >>>>>>>>>> > I'm seeing slight (<1%) but nonzero differences in residuals >>>>>>>>>> from run to run. I'm wondering where randomness might enter here -- does >>>>>>>>>> the jacobian coloring use a random seed? >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their >>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>> experiments lead. >>>>>>> -- Norbert Wiener >>>>>>> >>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>> >>>>>>> >>>>>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mlohry at gmail.com Thu May 4 16:03:12 2023 From: mlohry at gmail.com (Mark Lohry) Date: Thu, 4 May 2023 17:03:12 -0400 Subject: [petsc-users] sources of floating point randomness in JFNK in serial In-Reply-To: References: <5318727D-B9F9-48BC-A7CE-94EBDB08566F@petsc.dev> Message-ID: > > Do you get different results (in different runs) without > -snes_mf_operator? So just using an explicit matrix? Unfortunately I don't have an explicit matrix available for this, hence the MFFD/JFNK. > (Note: I am not convinced there is even a problem and think it may be > simply different order of floating point operations in different runs.) 
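One direct way to check that the residual evaluation itself is bitwise reproducible is to evaluate it twice at the same state and compare the results exactly. A minimal sketch (not taken from this exchange; "snes" and "x" are assumed to name the existing solver and state vector, and PetscCall() assumes a recent PETSc):

    Vec       f1, f2;
    PetscBool same;

    PetscCall(VecDuplicate(x, &f1));
    PetscCall(VecDuplicate(x, &f2));
    PetscCall(SNESComputeFunction(snes, x, f1));   /* first evaluation  */
    PetscCall(SNESComputeFunction(snes, x, f2));   /* second evaluation */
    PetscCall(VecEqual(f1, f2, &same));            /* exact comparison  */
    PetscCall(PetscPrintf(PETSC_COMM_SELF, "residual reproducible: %s\n",
                          same ? "yes" : "no"));
    PetscCall(VecDestroy(&f1));
    PetscCall(VecDestroy(&f2));

VecEqual() reports true only when the entries agree bit for bit, so this isolates the user residual from everything the SNES/KSP stack later does with it.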
> I'm not convinced either, but running explicit RK for 10,000 iterations i get exactly the same results every time so i'm fairly confident it's not the residual evaluation. How would there be a different order of floating point ops in different runs in serial? No, I mean without -snes_mf_* (as Barry says), so we are just running that > solver with a sparse matrix. This would give me confidence > that nothing in the solver is variable. > > I could do the sparse finite difference jacobian once, save it to disk, and then use that system each time. On Thu, May 4, 2023 at 4:57?PM Matthew Knepley wrote: > On Thu, May 4, 2023 at 4:44?PM Mark Lohry wrote: > >> Is your code valgrind clean? >>> >> >> Yes, I also initialize all allocations with NaNs to be sure I'm not using >> anything uninitialized. >> >> >>> We can try and test this. Replace your MatMFFD with an actual matrix and >>> run. Do you see any variability? >>> >> >> I think I did what you're asking. I have -snes_mf_operator set, and then >> SNESSetJacobian(snes, diag_ones, diag_ones, NULL, NULL) where diag_ones is >> a matrix with ones on the diagonal. Two runs below, still with differences >> but sometimes identical. >> > > No, I mean without -snes_mf_* (as Barry says), so we are just running that > solver with a sparse matrix. This would give me confidence > that nothing in the solver is variable. > > Thanks, > > Matt > > >> 0 SNES Function norm 3.424003312857e+04 >> 0 KSP Residual norm 3.424003312857e+04 >> 1 KSP Residual norm 2.871734444536e+04 >> 2 KSP Residual norm 2.490276930242e+04 >> 3 KSP Residual norm 2.131675872968e+04 >> 4 KSP Residual norm 1.973129814235e+04 >> 5 KSP Residual norm 1.832377856317e+04 >> 6 KSP Residual norm 1.716783617436e+04 >> 7 KSP Residual norm 1.583963149542e+04 >> 8 KSP Residual norm 1.482272170304e+04 >> 9 KSP Residual norm 1.380312106742e+04 >> 10 KSP Residual norm 1.297793480658e+04 >> 11 KSP Residual norm 1.208599123244e+04 >> 12 KSP Residual norm 1.137345655227e+04 >> 13 KSP Residual norm 1.059676909366e+04 >> 14 KSP Residual norm 1.003823862398e+04 >> 15 KSP Residual norm 9.425879221354e+03 >> 16 KSP Residual norm 8.954805890038e+03 >> 17 KSP Residual norm 8.592372470456e+03 >> 18 KSP Residual norm 8.060707175821e+03 >> 19 KSP Residual norm 7.782057728723e+03 >> 20 KSP Residual norm 7.449686095424e+03 >> Linear solve converged due to CONVERGED_ITS iterations 20 >> KSP Object: 1 MPI process >> type: gmres >> restart=30, using Classical (unmodified) Gram-Schmidt >> Orthogonalization with no iterative refinement >> happy breakdown tolerance 1e-30 >> maximum iterations=20, initial guess is zero >> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>> left preconditioning >> using PRECONDITIONED norm type for convergence test >> PC Object: 1 MPI process >> type: none >> linear system matrix followed by preconditioner matrix: >> Mat Object: 1 MPI process >> type: mffd >> rows=16384, cols=16384 >> Matrix-free approximation: >> err=1.49012e-08 (relative error in function evaluation) >> Using wp compute h routine >> Does not compute normU >> Mat Object: 1 MPI process >> type: seqaij >> rows=16384, cols=16384 >> total: nonzeros=16384, allocated nonzeros=16384 >> total number of mallocs used during MatSetValues calls=0 >> not using I-node routines >> 1 SNES Function norm 1.085015646971e+04 >> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >> SNES Object: 1 MPI process >> type: newtonls >> maximum iterations=1, maximum function evaluations=-1 >> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >> total number of linear solver iterations=20 >> total number of function evaluations=23 >> norm schedule ALWAYS >> Jacobian is never rebuilt >> Jacobian is applied matrix-free with differencing >> Preconditioning Jacobian is built using finite differences with coloring >> SNESLineSearch Object: 1 MPI process >> type: basic >> maxstep=1.000000e+08, minlambda=1.000000e-12 >> tolerances: relative=1.000000e-08, absolute=1.000000e-15, >> lambda=1.000000e-08 >> maximum iterations=40 >> KSP Object: 1 MPI process >> type: gmres >> restart=30, using Classical (unmodified) Gram-Schmidt >> Orthogonalization with no iterative refinement >> happy breakdown tolerance 1e-30 >> maximum iterations=20, initial guess is zero >> tolerances: relative=0.1, absolute=1e-15, divergence=10. >> left preconditioning >> using PRECONDITIONED norm type for convergence test >> PC Object: 1 MPI process >> type: none >> linear system matrix followed by preconditioner matrix: >> Mat Object: 1 MPI process >> type: mffd >> rows=16384, cols=16384 >> Matrix-free approximation: >> err=1.49012e-08 (relative error in function evaluation) >> Using wp compute h routine >> Does not compute normU >> Mat Object: 1 MPI process >> type: seqaij >> rows=16384, cols=16384 >> total: nonzeros=16384, allocated nonzeros=16384 >> total number of mallocs used during MatSetValues calls=0 >> not using I-node routines >> >> 0 SNES Function norm 3.424003312857e+04 >> 0 KSP Residual norm 3.424003312857e+04 >> 1 KSP Residual norm 2.871734444536e+04 >> 2 KSP Residual norm 2.490276931041e+04 >> 3 KSP Residual norm 2.131675873776e+04 >> 4 KSP Residual norm 1.973129814908e+04 >> 5 KSP Residual norm 1.832377852186e+04 >> 6 KSP Residual norm 1.716783608174e+04 >> 7 KSP Residual norm 1.583963128956e+04 >> 8 KSP Residual norm 1.482272160069e+04 >> 9 KSP Residual norm 1.380312087005e+04 >> 10 KSP Residual norm 1.297793458796e+04 >> 11 KSP Residual norm 1.208599115602e+04 >> 12 KSP Residual norm 1.137345657533e+04 >> 13 KSP Residual norm 1.059676906197e+04 >> 14 KSP Residual norm 1.003823857515e+04 >> 15 KSP Residual norm 9.425879177747e+03 >> 16 KSP Residual norm 8.954805850825e+03 >> 17 KSP Residual norm 8.592372413320e+03 >> 18 KSP Residual norm 8.060706994110e+03 >> 19 KSP Residual norm 7.782057560782e+03 >> 20 KSP Residual norm 7.449686034356e+03 >> Linear solve converged due to CONVERGED_ITS iterations 20 >> KSP Object: 1 MPI process >> type: gmres >> restart=30, using Classical (unmodified) Gram-Schmidt >> Orthogonalization with no iterative refinement >> happy breakdown tolerance 1e-30 >> maximum iterations=20, initial guess is zero >> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>> left preconditioning >> using PRECONDITIONED norm type for convergence test >> PC Object: 1 MPI process >> type: none >> linear system matrix followed by preconditioner matrix: >> Mat Object: 1 MPI process >> type: mffd >> rows=16384, cols=16384 >> Matrix-free approximation: >> err=1.49012e-08 (relative error in function evaluation) >> Using wp compute h routine >> Does not compute normU >> Mat Object: 1 MPI process >> type: seqaij >> rows=16384, cols=16384 >> total: nonzeros=16384, allocated nonzeros=16384 >> total number of mallocs used during MatSetValues calls=0 >> not using I-node routines >> 1 SNES Function norm 1.085015821006e+04 >> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >> SNES Object: 1 MPI process >> type: newtonls >> maximum iterations=1, maximum function evaluations=-1 >> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >> total number of linear solver iterations=20 >> total number of function evaluations=23 >> norm schedule ALWAYS >> Jacobian is never rebuilt >> Jacobian is applied matrix-free with differencing >> Preconditioning Jacobian is built using finite differences with coloring >> SNESLineSearch Object: 1 MPI process >> type: basic >> maxstep=1.000000e+08, minlambda=1.000000e-12 >> tolerances: relative=1.000000e-08, absolute=1.000000e-15, >> lambda=1.000000e-08 >> maximum iterations=40 >> KSP Object: 1 MPI process >> type: gmres >> restart=30, using Classical (unmodified) Gram-Schmidt >> Orthogonalization with no iterative refinement >> happy breakdown tolerance 1e-30 >> maximum iterations=20, initial guess is zero >> tolerances: relative=0.1, absolute=1e-15, divergence=10. >> left preconditioning >> using PRECONDITIONED norm type for convergence test >> PC Object: 1 MPI process >> type: none >> linear system matrix followed by preconditioner matrix: >> Mat Object: 1 MPI process >> type: mffd >> rows=16384, cols=16384 >> Matrix-free approximation: >> err=1.49012e-08 (relative error in function evaluation) >> Using wp compute h routine >> Does not compute normU >> Mat Object: 1 MPI process >> type: seqaij >> rows=16384, cols=16384 >> total: nonzeros=16384, allocated nonzeros=16384 >> total number of mallocs used during MatSetValues calls=0 >> not using I-node routines >> >> On Thu, May 4, 2023 at 10:10?AM Matthew Knepley >> wrote: >> >>> On Thu, May 4, 2023 at 8:54?AM Mark Lohry wrote: >>> >>>> Try -pc_type none. >>>>> >>>> >>>> With -pc_type none the 0 KSP residual looks identical. But *sometimes* >>>> it's producing exactly the same history and others it's gradually >>>> changing. I'm reasonably confident my residual evaluation has no >>>> randomness, see info after the petsc output. >>>> >>> >>> We can try and test this. Replace your MatMFFD with an actual matrix and >>> run. Do you see any variability? >>> >>> If not, then it could be your routine, or it could be MatMFFD. So run a >>> few with -snes_view, and we can see if the >>> "w" parameter changes. >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> solve history 1: >>>> >>>> 0 SNES Function norm 3.424003312857e+04 >>>> 0 KSP Residual norm 3.424003312857e+04 >>>> 1 KSP Residual norm 2.871734444536e+04 >>>> 2 KSP Residual norm 2.490276931041e+04 >>>> ... 
>>>> 20 KSP Residual norm 7.449686034356e+03 >>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>> 1 SNES Function norm 1.085015821006e+04 >>>> >>>> solve history 2, identical to 1: >>>> >>>> 0 SNES Function norm 3.424003312857e+04 >>>> 0 KSP Residual norm 3.424003312857e+04 >>>> 1 KSP Residual norm 2.871734444536e+04 >>>> 2 KSP Residual norm 2.490276931041e+04 >>>> ... >>>> 20 KSP Residual norm 7.449686034356e+03 >>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>> 1 SNES Function norm 1.085015821006e+04 >>>> >>>> solve history 3, identical KSP at 0 and 1, slight change at 2, growing >>>> difference to the end: >>>> 0 SNES Function norm 3.424003312857e+04 >>>> 0 KSP Residual norm 3.424003312857e+04 >>>> 1 KSP Residual norm 2.871734444536e+04 >>>> 2 KSP Residual norm 2.490276930242e+04 >>>> ... >>>> 20 KSP Residual norm 7.449686095424e+03 >>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>> 1 SNES Function norm 1.085015646971e+04 >>>> >>>> >>>> Ths is using a standard explicit 3-stage Runge-Kutta smoother for 10 >>>> iterations, so 30 calls of the same residual evaluation, identical >>>> residuals every time >>>> >>>> run 1: >>>> >>>> # iteration rho rhou rhov >>>> rhoE abs_res rel_res umin >>>> vmax vmin elapsed_time >>>> # >>>> >>>> >>>> 1.00000e+00 1.086860616292e+00 2.782316758416e+02 >>>> 4.482867643761e+00 2.993435920340e+02 2.04353e+02 >>>> 1.00000e+00 -8.23945e-15 -6.15326e-15 -1.35563e-14 >>>> 6.34834e-01 >>>> 2.00000e+00 2.310547487017e+00 1.079059352425e+02 >>>> 3.958323921837e+00 5.058927165686e+02 2.58647e+02 >>>> 1.26568e+00 -1.02539e-14 -9.35368e-15 -1.69925e-14 >>>> 6.40063e-01 >>>> 3.00000e+00 2.361005867444e+00 5.706213331683e+01 >>>> 6.130016323357e+00 4.688968362579e+02 2.36201e+02 >>>> 1.15585e+00 -1.19370e-14 -1.15216e-14 -1.59733e-14 >>>> 6.45166e-01 >>>> 4.00000e+00 2.167518999963e+00 3.757541401594e+01 >>>> 6.313917437428e+00 4.054310291628e+02 2.03612e+02 >>>> 9.96372e-01 -1.81831e-14 -1.28312e-14 -1.46238e-14 >>>> 6.50494e-01 >>>> 5.00000e+00 1.941443738676e+00 2.884190334049e+01 >>>> 6.237106158479e+00 3.539201037156e+02 1.77577e+02 >>>> 8.68970e-01 3.56633e-14 -8.74089e-15 -1.06666e-14 >>>> 6.55656e-01 >>>> 6.00000e+00 1.736947124693e+00 2.429485695670e+01 >>>> 5.996962200407e+00 3.148280178142e+02 1.57913e+02 >>>> 7.72745e-01 -8.98634e-14 -2.41152e-14 -1.39713e-14 >>>> 6.60872e-01 >>>> 7.00000e+00 1.564153212635e+00 2.149609219810e+01 >>>> 5.786910705204e+00 2.848717011033e+02 1.42872e+02 >>>> 6.99144e-01 -2.95352e-13 -2.48158e-14 -2.39351e-14 >>>> 6.66041e-01 >>>> 8.00000e+00 1.419280815384e+00 1.950619804089e+01 >>>> 5.627281158306e+00 2.606623371229e+02 1.30728e+02 >>>> 6.39715e-01 8.98941e-13 1.09674e-13 3.78905e-14 >>>> 6.71316e-01 >>>> 9.00000e+00 1.296115915975e+00 1.794843530745e+01 >>>> 5.514933264437e+00 2.401524522393e+02 1.20444e+02 >>>> 5.89394e-01 1.70717e-12 1.38762e-14 1.09825e-13 >>>> 6.76447e-01 >>>> 1.00000e+01 1.189639693918e+00 1.665381754953e+01 >>>> 5.433183087037e+00 2.222572900473e+02 1.11475e+02 >>>> 5.45501e-01 -4.22462e-12 -7.15206e-13 -2.28736e-13 >>>> 6.81716e-01 >>>> >>>> run N: >>>> >>>> >>>> # >>>> >>>> >>>> # iteration rho rhou rhov >>>> rhoE abs_res rel_res umin >>>> vmax vmin elapsed_time >>>> # >>>> >>>> >>>> 1.00000e+00 1.086860616292e+00 2.782316758416e+02 >>>> 4.482867643761e+00 2.993435920340e+02 2.04353e+02 >>>> 1.00000e+00 -8.23945e-15 -6.15326e-15 -1.35563e-14 >>>> 6.23316e-01 >>>> 2.00000e+00 2.310547487017e+00 1.079059352425e+02 >>>> 3.958323921837e+00 5.058927165686e+02 
2.58647e+02 >>>> 1.26568e+00 -1.02539e-14 -9.35368e-15 -1.69925e-14 >>>> 6.28510e-01 >>>> 3.00000e+00 2.361005867444e+00 5.706213331683e+01 >>>> 6.130016323357e+00 4.688968362579e+02 2.36201e+02 >>>> 1.15585e+00 -1.19370e-14 -1.15216e-14 -1.59733e-14 >>>> 6.33558e-01 >>>> 4.00000e+00 2.167518999963e+00 3.757541401594e+01 >>>> 6.313917437428e+00 4.054310291628e+02 2.03612e+02 >>>> 9.96372e-01 -1.81831e-14 -1.28312e-14 -1.46238e-14 >>>> 6.38773e-01 >>>> 5.00000e+00 1.941443738676e+00 2.884190334049e+01 >>>> 6.237106158479e+00 3.539201037156e+02 1.77577e+02 >>>> 8.68970e-01 3.56633e-14 -8.74089e-15 -1.06666e-14 >>>> 6.43887e-01 >>>> 6.00000e+00 1.736947124693e+00 2.429485695670e+01 >>>> 5.996962200407e+00 3.148280178142e+02 1.57913e+02 >>>> 7.72745e-01 -8.98634e-14 -2.41152e-14 -1.39713e-14 >>>> 6.49073e-01 >>>> 7.00000e+00 1.564153212635e+00 2.149609219810e+01 >>>> 5.786910705204e+00 2.848717011033e+02 1.42872e+02 >>>> 6.99144e-01 -2.95352e-13 -2.48158e-14 -2.39351e-14 >>>> 6.54167e-01 >>>> 8.00000e+00 1.419280815384e+00 1.950619804089e+01 >>>> 5.627281158306e+00 2.606623371229e+02 1.30728e+02 >>>> 6.39715e-01 8.98941e-13 1.09674e-13 3.78905e-14 >>>> 6.59394e-01 >>>> 9.00000e+00 1.296115915975e+00 1.794843530745e+01 >>>> 5.514933264437e+00 2.401524522393e+02 1.20444e+02 >>>> 5.89394e-01 1.70717e-12 1.38762e-14 1.09825e-13 >>>> 6.64516e-01 >>>> 1.00000e+01 1.189639693918e+00 1.665381754953e+01 >>>> 5.433183087037e+00 2.222572900473e+02 1.11475e+02 >>>> 5.45501e-01 -4.22462e-12 -7.15206e-13 -2.28736e-13 >>>> 6.69677e-01 >>>> >>>> >>>> >>>> >>>> >>>> On Thu, May 4, 2023 at 8:41?AM Mark Adams wrote: >>>> >>>>> ASM is just the sub PC with one proc but gets weaker with more procs >>>>> unless you use jacobi. (maybe I am missing something). >>>>> >>>>> On Thu, May 4, 2023 at 8:31?AM Mark Lohry wrote: >>>>> >>>>>> Please send the output of -snes_view. >>>>>>> >>>>>> pasted below. anything stand out? >>>>>> >>>>>> >>>>>> SNES Object: 1 MPI process >>>>>> type: newtonls >>>>>> maximum iterations=1, maximum function evaluations=-1 >>>>>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>>>>> total number of linear solver iterations=20 >>>>>> total number of function evaluations=22 >>>>>> norm schedule ALWAYS >>>>>> Jacobian is never rebuilt >>>>>> Jacobian is applied matrix-free with differencing >>>>>> Preconditioning Jacobian is built using finite differences with >>>>>> coloring >>>>>> SNESLineSearch Object: 1 MPI process >>>>>> type: basic >>>>>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>>>>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, >>>>>> lambda=1.000000e-08 >>>>>> maximum iterations=40 >>>>>> KSP Object: 1 MPI process >>>>>> type: gmres >>>>>> restart=30, using Classical (unmodified) Gram-Schmidt >>>>>> Orthogonalization with no iterative refinement >>>>>> happy breakdown tolerance 1e-30 >>>>>> maximum iterations=20, initial guess is zero >>>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>>>>> left preconditioning >>>>>> using PRECONDITIONED norm type for convergence test >>>>>> PC Object: 1 MPI process >>>>>> type: asm >>>>>> total subdomain blocks = 1, amount of overlap = 0 >>>>>> restriction/interpolation type - RESTRICT >>>>>> Local solver information for first block is in the following >>>>>> KSP and PC objects on rank 0: >>>>>> Use -ksp_view ::ascii_info_detail to display information for >>>>>> all blocks >>>>>> KSP Object: (sub_) 1 MPI process >>>>>> type: preonly >>>>>> maximum iterations=10000, initial guess is zero >>>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>>> left preconditioning >>>>>> using NONE norm type for convergence test >>>>>> PC Object: (sub_) 1 MPI process >>>>>> type: ilu >>>>>> out-of-place factorization >>>>>> 0 levels of fill >>>>>> tolerance for zero pivot 2.22045e-14 >>>>>> matrix ordering: natural >>>>>> factor fill ratio given 1., needed 1. >>>>>> Factored matrix follows: >>>>>> Mat Object: (sub_) 1 MPI process >>>>>> type: seqbaij >>>>>> rows=16384, cols=16384, bs=16 >>>>>> package used to perform factorization: petsc >>>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>>> block size is 16 >>>>>> linear system matrix = precond matrix: >>>>>> Mat Object: (sub_) 1 MPI process >>>>>> type: seqbaij >>>>>> rows=16384, cols=16384, bs=16 >>>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>> block size is 16 >>>>>> linear system matrix followed by preconditioner matrix: >>>>>> Mat Object: 1 MPI process >>>>>> type: mffd >>>>>> rows=16384, cols=16384 >>>>>> Matrix-free approximation: >>>>>> err=1.49012e-08 (relative error in function evaluation) >>>>>> Using wp compute h routine >>>>>> Does not compute normU >>>>>> Mat Object: 1 MPI process >>>>>> type: seqbaij >>>>>> rows=16384, cols=16384, bs=16 >>>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>> block size is 16 >>>>>> >>>>>> On Thu, May 4, 2023 at 8:30?AM Mark Adams wrote: >>>>>> >>>>>>> If you are using MG what is the coarse grid solver? >>>>>>> -snes_view might give you that. >>>>>>> >>>>>>> On Thu, May 4, 2023 at 8:25?AM Matthew Knepley >>>>>>> wrote: >>>>>>> >>>>>>>> On Thu, May 4, 2023 at 8:21?AM Mark Lohry wrote: >>>>>>>> >>>>>>>>> Do they start very similarly and then slowly drift further apart? >>>>>>>>> >>>>>>>>> >>>>>>>>> Yes, this. I take it this sounds familiar? >>>>>>>>> >>>>>>>>> See these two examples with 20 fixed iterations pasted at the end. >>>>>>>>> The difference for one solve is slight (final SNES norm is identical to 5 >>>>>>>>> digits), but in the context I'm using it in (repeated applications to solve >>>>>>>>> a steady state multigrid problem, though here just one level) the >>>>>>>>> differences add up such that I might reach global convergence in 35 >>>>>>>>> iterations or 38. It's not the end of the world, but I was expecting that >>>>>>>>> with -np 1 these would be identical and I'm not sure where the root cause >>>>>>>>> would be. >>>>>>>>> >>>>>>>> >>>>>>>> The initial KSP residual is different, so its the PC. Please send >>>>>>>> the output of -snes_view. If your ASM is using direct factorization, then it >>>>>>>> could be randomness in whatever LU you are using. 
>>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Matt >>>>>>>> >>>>>>>> >>>>>>>>> 0 SNES Function norm 2.801842107848e+04 >>>>>>>>> 0 KSP Residual norm 4.045639499595e+01 >>>>>>>>> 1 KSP Residual norm 1.917999809040e+01 >>>>>>>>> 2 KSP Residual norm 1.616048521958e+01 >>>>>>>>> [...] >>>>>>>>> 19 KSP Residual norm 8.788043518111e-01 >>>>>>>>> 20 KSP Residual norm 6.570851270214e-01 >>>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>>>> 1 SNES Function norm 1.801309983345e+03 >>>>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>>>>>> >>>>>>>>> >>>>>>>>> Same system, identical initial 0 SNES norm, 0 KSP is slightly >>>>>>>>> different >>>>>>>>> >>>>>>>>> 0 SNES Function norm 2.801842107848e+04 >>>>>>>>> 0 KSP Residual norm 4.045639473002e+01 >>>>>>>>> 1 KSP Residual norm 1.917999883034e+01 >>>>>>>>> 2 KSP Residual norm 1.616048572016e+01 >>>>>>>>> [...] >>>>>>>>> 19 KSP Residual norm 8.788046348957e-01 >>>>>>>>> 20 KSP Residual norm 6.570859588610e-01 >>>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>>>> 1 SNES Function norm 1.801311320322e+03 >>>>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>>>>>> >>>>>>>>> On Wed, May 3, 2023 at 11:05?PM Barry Smith >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> >>>>>>>>>> Do they start very similarly and then slowly drift further >>>>>>>>>> apart? That is the first couple of KSP iterations they are almost identical >>>>>>>>>> but then for each iteration get a bit further. Similar for the SNES >>>>>>>>>> iterations, starting close and then for more iterations and more solves >>>>>>>>>> they start moving apart. Or do they suddenly jump to be very different? You >>>>>>>>>> can run with -snes_monitor -ksp_monitor >>>>>>>>>> >>>>>>>>>> On May 3, 2023, at 9:07 PM, Mark Lohry wrote: >>>>>>>>>> >>>>>>>>>> This is on a single MPI rank. I haven't checked the coloring, was >>>>>>>>>> just guessing there. But the solutions/residuals are slightly different >>>>>>>>>> from run to run. >>>>>>>>>> >>>>>>>>>> Fair to say that for serial JFNK/asm ilu0/gmres we should expect >>>>>>>>>> bitwise identical results? >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Wed, May 3, 2023, 8:50 PM Barry Smith >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> No, the coloring should be identical every time. Do you see >>>>>>>>>>> differences with 1 MPI rank? (Or much smaller ones?). >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> > On May 3, 2023, at 8:42 PM, Mark Lohry >>>>>>>>>>> wrote: >>>>>>>>>>> > >>>>>>>>>>> > I'm running multiple iterations of newtonls with an MFFD/JFNK >>>>>>>>>>> nonlinear solver where I give it the sparsity. PC asm, KSP gmres, with >>>>>>>>>>> SNESSetLagJacobian -2 (compute once and then frozen jacobian). >>>>>>>>>>> > >>>>>>>>>>> > I'm seeing slight (<1%) but nonzero differences in residuals >>>>>>>>>>> from run to run. I'm wondering where randomness might enter here -- does >>>>>>>>>>> the jacobian coloring use a random seed? >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> What most experimenters take for granted before they begin their >>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>> experiments lead. >>>>>>>> -- Norbert Wiener >>>>>>>> >>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>> >>>>>>>> >>>>>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. 
>>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Thu May 4 16:14:57 2023 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 4 May 2023 17:14:57 -0400 Subject: [petsc-users] sources of floating point randomness in JFNK in serial In-Reply-To: References: <5318727D-B9F9-48BC-A7CE-94EBDB08566F@petsc.dev> Message-ID: > On May 4, 2023, at 5:03 PM, Mark Lohry wrote: > >> Do you get different results (in different runs) without -snes_mf_operator? So just using an explicit matrix? > > Unfortunately I don't have an explicit matrix available for this, hence the MFFD/JFNK. > >> >> (Note: I am not convinced there is even a problem and think it may be simply different order of floating point operations in different runs.) > > I'm not convinced either, but running explicit RK for 10,000 iterations i get exactly the same results every time so i'm fairly confident it's not the residual evaluation. > How would there be a different order of floating point ops in different runs in serial? > >> No, I mean without -snes_mf_* (as Barry says), so we are just running that solver with a sparse matrix. This would give me confidence >> that nothing in the solver is variable. >> > I could do the sparse finite difference jacobian once, save it to disk, and then use that system each time. Sure, but why only once and why save to disk? Why not just use that computed approximate Jacobian at each Newton step to drive the Newton solves along for a bunch of time steps? Barry > > > On Thu, May 4, 2023 at 4:57?PM Matthew Knepley > wrote: >> On Thu, May 4, 2023 at 4:44?PM Mark Lohry > wrote: >>>> Is your code valgrind clean? >>> >>> Yes, I also initialize all allocations with NaNs to be sure I'm not using anything uninitialized. >>> >>>> >>>> We can try and test this. Replace your MatMFFD with an actual matrix and run. Do you see any variability? >>> >>> I think I did what you're asking. I have -snes_mf_operator set, and then SNESSetJacobian(snes, diag_ones, diag_ones, NULL, NULL) where diag_ones is a matrix with ones on the diagonal. Two runs below, still with differences but sometimes identical. >> >> No, I mean without -snes_mf_* (as Barry says), so we are just running that solver with a sparse matrix. This would give me confidence >> that nothing in the solver is variable. 
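A minimal sketch of that test (not taken from this exchange; it assumes "snes" is the existing solver and "J" is a matrix preallocated with the known sparsity pattern): drop -snes_mf_operator and let the coloring-based finite-difference Jacobian serve as both the operator and the preconditioner matrix, frozen after its first assembly, so every run solves with an identical sparse matrix.

    /* sketch: use the assembled FD-coloring Jacobian as the operator itself  */
    PetscCall(SNESSetJacobian(snes, J, J, SNESComputeJacobianDefaultColor, NULL));
    /* a NULL ctx lets PETSc build the MatFDColoring from J's nonzero pattern */

    /* -2: assemble once on the next solve, then keep the matrix frozen,      */
    /* matching the SNESSetLagJacobian -2 setting mentioned earlier in the    */
    /* thread; the persists flag keeps the frozen matrix across later solves  */
    PetscCall(SNESSetLagJacobian(snes, -2));
    PetscCall(SNESSetLagJacobianPersists(snes, PETSC_TRUE));

With no -snes_mf_* options in play, any remaining run-to-run drift would then have to come from the linear solve itself rather than from the matrix-free differencing.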
>> >> Thanks, >> >> Matt >> >>> 0 SNES Function norm 3.424003312857e+04 >>> 0 KSP Residual norm 3.424003312857e+04 >>> 1 KSP Residual norm 2.871734444536e+04 >>> 2 KSP Residual norm 2.490276930242e+04 >>> 3 KSP Residual norm 2.131675872968e+04 >>> 4 KSP Residual norm 1.973129814235e+04 >>> 5 KSP Residual norm 1.832377856317e+04 >>> 6 KSP Residual norm 1.716783617436e+04 >>> 7 KSP Residual norm 1.583963149542e+04 >>> 8 KSP Residual norm 1.482272170304e+04 >>> 9 KSP Residual norm 1.380312106742e+04 >>> 10 KSP Residual norm 1.297793480658e+04 >>> 11 KSP Residual norm 1.208599123244e+04 >>> 12 KSP Residual norm 1.137345655227e+04 >>> 13 KSP Residual norm 1.059676909366e+04 >>> 14 KSP Residual norm 1.003823862398e+04 >>> 15 KSP Residual norm 9.425879221354e+03 >>> 16 KSP Residual norm 8.954805890038e+03 >>> 17 KSP Residual norm 8.592372470456e+03 >>> 18 KSP Residual norm 8.060707175821e+03 >>> 19 KSP Residual norm 7.782057728723e+03 >>> 20 KSP Residual norm 7.449686095424e+03 >>> Linear solve converged due to CONVERGED_ITS iterations 20 >>> KSP Object: 1 MPI process >>> type: gmres >>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=20, initial guess is zero >>> tolerances: relative=0.1, absolute=1e-15, divergence=10. >>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI process >>> type: none >>> linear system matrix followed by preconditioner matrix: >>> Mat Object: 1 MPI process >>> type: mffd >>> rows=16384, cols=16384 >>> Matrix-free approximation: >>> err=1.49012e-08 (relative error in function evaluation) >>> Using wp compute h routine >>> Does not compute normU >>> Mat Object: 1 MPI process >>> type: seqaij >>> rows=16384, cols=16384 >>> total: nonzeros=16384, allocated nonzeros=16384 >>> total number of mallocs used during MatSetValues calls=0 >>> not using I-node routines >>> 1 SNES Function norm 1.085015646971e+04 >>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>> SNES Object: 1 MPI process >>> type: newtonls >>> maximum iterations=1, maximum function evaluations=-1 >>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>> total number of linear solver iterations=20 >>> total number of function evaluations=23 >>> norm schedule ALWAYS >>> Jacobian is never rebuilt >>> Jacobian is applied matrix-free with differencing >>> Preconditioning Jacobian is built using finite differences with coloring >>> SNESLineSearch Object: 1 MPI process >>> type: basic >>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 >>> maximum iterations=40 >>> KSP Object: 1 MPI process >>> type: gmres >>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=20, initial guess is zero >>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI process >>> type: none >>> linear system matrix followed by preconditioner matrix: >>> Mat Object: 1 MPI process >>> type: mffd >>> rows=16384, cols=16384 >>> Matrix-free approximation: >>> err=1.49012e-08 (relative error in function evaluation) >>> Using wp compute h routine >>> Does not compute normU >>> Mat Object: 1 MPI process >>> type: seqaij >>> rows=16384, cols=16384 >>> total: nonzeros=16384, allocated nonzeros=16384 >>> total number of mallocs used during MatSetValues calls=0 >>> not using I-node routines >>> >>> 0 SNES Function norm 3.424003312857e+04 >>> 0 KSP Residual norm 3.424003312857e+04 >>> 1 KSP Residual norm 2.871734444536e+04 >>> 2 KSP Residual norm 2.490276931041e+04 >>> 3 KSP Residual norm 2.131675873776e+04 >>> 4 KSP Residual norm 1.973129814908e+04 >>> 5 KSP Residual norm 1.832377852186e+04 >>> 6 KSP Residual norm 1.716783608174e+04 >>> 7 KSP Residual norm 1.583963128956e+04 >>> 8 KSP Residual norm 1.482272160069e+04 >>> 9 KSP Residual norm 1.380312087005e+04 >>> 10 KSP Residual norm 1.297793458796e+04 >>> 11 KSP Residual norm 1.208599115602e+04 >>> 12 KSP Residual norm 1.137345657533e+04 >>> 13 KSP Residual norm 1.059676906197e+04 >>> 14 KSP Residual norm 1.003823857515e+04 >>> 15 KSP Residual norm 9.425879177747e+03 >>> 16 KSP Residual norm 8.954805850825e+03 >>> 17 KSP Residual norm 8.592372413320e+03 >>> 18 KSP Residual norm 8.060706994110e+03 >>> 19 KSP Residual norm 7.782057560782e+03 >>> 20 KSP Residual norm 7.449686034356e+03 >>> Linear solve converged due to CONVERGED_ITS iterations 20 >>> KSP Object: 1 MPI process >>> type: gmres >>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=20, initial guess is zero >>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI process >>> type: none >>> linear system matrix followed by preconditioner matrix: >>> Mat Object: 1 MPI process >>> type: mffd >>> rows=16384, cols=16384 >>> Matrix-free approximation: >>> err=1.49012e-08 (relative error in function evaluation) >>> Using wp compute h routine >>> Does not compute normU >>> Mat Object: 1 MPI process >>> type: seqaij >>> rows=16384, cols=16384 >>> total: nonzeros=16384, allocated nonzeros=16384 >>> total number of mallocs used during MatSetValues calls=0 >>> not using I-node routines >>> 1 SNES Function norm 1.085015821006e+04 >>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>> SNES Object: 1 MPI process >>> type: newtonls >>> maximum iterations=1, maximum function evaluations=-1 >>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>> total number of linear solver iterations=20 >>> total number of function evaluations=23 >>> norm schedule ALWAYS >>> Jacobian is never rebuilt >>> Jacobian is applied matrix-free with differencing >>> Preconditioning Jacobian is built using finite differences with coloring >>> SNESLineSearch Object: 1 MPI process >>> type: basic >>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 >>> maximum iterations=40 >>> KSP Object: 1 MPI process >>> type: gmres >>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=20, initial guess is zero >>> tolerances: relative=0.1, absolute=1e-15, divergence=10. >>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI process >>> type: none >>> linear system matrix followed by preconditioner matrix: >>> Mat Object: 1 MPI process >>> type: mffd >>> rows=16384, cols=16384 >>> Matrix-free approximation: >>> err=1.49012e-08 (relative error in function evaluation) >>> Using wp compute h routine >>> Does not compute normU >>> Mat Object: 1 MPI process >>> type: seqaij >>> rows=16384, cols=16384 >>> total: nonzeros=16384, allocated nonzeros=16384 >>> total number of mallocs used during MatSetValues calls=0 >>> not using I-node routines >>> >>> On Thu, May 4, 2023 at 10:10?AM Matthew Knepley > wrote: >>>> On Thu, May 4, 2023 at 8:54?AM Mark Lohry > wrote: >>>>>> Try -pc_type none. >>>>> >>>>> With -pc_type none the 0 KSP residual looks identical. But *sometimes* it's producing exactly the same history and others it's gradually changing. I'm reasonably confident my residual evaluation has no randomness, see info after the petsc output. >>>> >>>> We can try and test this. Replace your MatMFFD with an actual matrix and run. Do you see any variability? >>>> >>>> If not, then it could be your routine, or it could be MatMFFD. So run a few with -snes_view, and we can see if the >>>> "w" parameter changes. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>>> solve history 1: >>>>> >>>>> 0 SNES Function norm 3.424003312857e+04 >>>>> 0 KSP Residual norm 3.424003312857e+04 >>>>> 1 KSP Residual norm 2.871734444536e+04 >>>>> 2 KSP Residual norm 2.490276931041e+04 >>>>> ... 
>>>>> 20 KSP Residual norm 7.449686034356e+03 >>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>> 1 SNES Function norm 1.085015821006e+04 >>>>> >>>>> solve history 2, identical to 1: >>>>> >>>>> 0 SNES Function norm 3.424003312857e+04 >>>>> 0 KSP Residual norm 3.424003312857e+04 >>>>> 1 KSP Residual norm 2.871734444536e+04 >>>>> 2 KSP Residual norm 2.490276931041e+04 >>>>> ... >>>>> 20 KSP Residual norm 7.449686034356e+03 >>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>> 1 SNES Function norm 1.085015821006e+04 >>>>> >>>>> solve history 3, identical KSP at 0 and 1, slight change at 2, growing difference to the end: >>>>> 0 SNES Function norm 3.424003312857e+04 >>>>> 0 KSP Residual norm 3.424003312857e+04 >>>>> 1 KSP Residual norm 2.871734444536e+04 >>>>> 2 KSP Residual norm 2.490276930242e+04 >>>>> ... >>>>> 20 KSP Residual norm 7.449686095424e+03 >>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>> 1 SNES Function norm 1.085015646971e+04 >>>>> >>>>> >>>>> Ths is using a standard explicit 3-stage Runge-Kutta smoother for 10 iterations, so 30 calls of the same residual evaluation, identical residuals every time >>>>> >>>>> run 1: >>>>> >>>>> # iteration rho rhou rhov rhoE abs_res rel_res umin vmax vmin elapsed_time >>>>> # >>>>> 1.00000e+00 1.086860616292e+00 2.782316758416e+02 4.482867643761e+00 2.993435920340e+02 2.04353e+02 1.00000e+00 -8.23945e-15 -6.15326e-15 -1.35563e-14 6.34834e-01 >>>>> 2.00000e+00 2.310547487017e+00 1.079059352425e+02 3.958323921837e+00 5.058927165686e+02 2.58647e+02 1.26568e+00 -1.02539e-14 -9.35368e-15 -1.69925e-14 6.40063e-01 >>>>> 3.00000e+00 2.361005867444e+00 5.706213331683e+01 6.130016323357e+00 4.688968362579e+02 2.36201e+02 1.15585e+00 -1.19370e-14 -1.15216e-14 -1.59733e-14 6.45166e-01 >>>>> 4.00000e+00 2.167518999963e+00 3.757541401594e+01 6.313917437428e+00 4.054310291628e+02 2.03612e+02 9.96372e-01 -1.81831e-14 -1.28312e-14 -1.46238e-14 6.50494e-01 >>>>> 5.00000e+00 1.941443738676e+00 2.884190334049e+01 6.237106158479e+00 3.539201037156e+02 1.77577e+02 8.68970e-01 3.56633e-14 -8.74089e-15 -1.06666e-14 6.55656e-01 >>>>> 6.00000e+00 1.736947124693e+00 2.429485695670e+01 5.996962200407e+00 3.148280178142e+02 1.57913e+02 7.72745e-01 -8.98634e-14 -2.41152e-14 -1.39713e-14 6.60872e-01 >>>>> 7.00000e+00 1.564153212635e+00 2.149609219810e+01 5.786910705204e+00 2.848717011033e+02 1.42872e+02 6.99144e-01 -2.95352e-13 -2.48158e-14 -2.39351e-14 6.66041e-01 >>>>> 8.00000e+00 1.419280815384e+00 1.950619804089e+01 5.627281158306e+00 2.606623371229e+02 1.30728e+02 6.39715e-01 8.98941e-13 1.09674e-13 3.78905e-14 6.71316e-01 >>>>> 9.00000e+00 1.296115915975e+00 1.794843530745e+01 5.514933264437e+00 2.401524522393e+02 1.20444e+02 5.89394e-01 1.70717e-12 1.38762e-14 1.09825e-13 6.76447e-01 >>>>> 1.00000e+01 1.189639693918e+00 1.665381754953e+01 5.433183087037e+00 2.222572900473e+02 1.11475e+02 5.45501e-01 -4.22462e-12 -7.15206e-13 -2.28736e-13 6.81716e-01 >>>>> >>>>> run N: >>>>> >>>>> >>>>> # >>>>> # iteration rho rhou rhov rhoE abs_res rel_res umin vmax vmin elapsed_time >>>>> # >>>>> 1.00000e+00 1.086860616292e+00 2.782316758416e+02 4.482867643761e+00 2.993435920340e+02 2.04353e+02 1.00000e+00 -8.23945e-15 -6.15326e-15 -1.35563e-14 6.23316e-01 >>>>> 2.00000e+00 2.310547487017e+00 1.079059352425e+02 3.958323921837e+00 5.058927165686e+02 2.58647e+02 1.26568e+00 -1.02539e-14 -9.35368e-15 -1.69925e-14 6.28510e-01 >>>>> 3.00000e+00 2.361005867444e+00 5.706213331683e+01 6.130016323357e+00 4.688968362579e+02 2.36201e+02 
1.15585e+00 -1.19370e-14 -1.15216e-14 -1.59733e-14 6.33558e-01 >>>>> 4.00000e+00 2.167518999963e+00 3.757541401594e+01 6.313917437428e+00 4.054310291628e+02 2.03612e+02 9.96372e-01 -1.81831e-14 -1.28312e-14 -1.46238e-14 6.38773e-01 >>>>> 5.00000e+00 1.941443738676e+00 2.884190334049e+01 6.237106158479e+00 3.539201037156e+02 1.77577e+02 8.68970e-01 3.56633e-14 -8.74089e-15 -1.06666e-14 6.43887e-01 >>>>> 6.00000e+00 1.736947124693e+00 2.429485695670e+01 5.996962200407e+00 3.148280178142e+02 1.57913e+02 7.72745e-01 -8.98634e-14 -2.41152e-14 -1.39713e-14 6.49073e-01 >>>>> 7.00000e+00 1.564153212635e+00 2.149609219810e+01 5.786910705204e+00 2.848717011033e+02 1.42872e+02 6.99144e-01 -2.95352e-13 -2.48158e-14 -2.39351e-14 6.54167e-01 >>>>> 8.00000e+00 1.419280815384e+00 1.950619804089e+01 5.627281158306e+00 2.606623371229e+02 1.30728e+02 6.39715e-01 8.98941e-13 1.09674e-13 3.78905e-14 6.59394e-01 >>>>> 9.00000e+00 1.296115915975e+00 1.794843530745e+01 5.514933264437e+00 2.401524522393e+02 1.20444e+02 5.89394e-01 1.70717e-12 1.38762e-14 1.09825e-13 6.64516e-01 >>>>> 1.00000e+01 1.189639693918e+00 1.665381754953e+01 5.433183087037e+00 2.222572900473e+02 1.11475e+02 5.45501e-01 -4.22462e-12 -7.15206e-13 -2.28736e-13 6.69677e-01 >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> On Thu, May 4, 2023 at 8:41?AM Mark Adams > wrote: >>>>>> ASM is just the sub PC with one proc but gets weaker with more procs unless you use jacobi. (maybe I am missing something). >>>>>> >>>>>> On Thu, May 4, 2023 at 8:31?AM Mark Lohry > wrote: >>>>>>>> Please send the output of -snes_view. >>>>>>> pasted below. anything stand out? >>>>>>> >>>>>>> >>>>>>> SNES Object: 1 MPI process >>>>>>> type: newtonls >>>>>>> maximum iterations=1, maximum function evaluations=-1 >>>>>>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>>>>>> total number of linear solver iterations=20 >>>>>>> total number of function evaluations=22 >>>>>>> norm schedule ALWAYS >>>>>>> Jacobian is never rebuilt >>>>>>> Jacobian is applied matrix-free with differencing >>>>>>> Preconditioning Jacobian is built using finite differences with coloring >>>>>>> SNESLineSearch Object: 1 MPI process >>>>>>> type: basic >>>>>>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>>>>>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 >>>>>>> maximum iterations=40 >>>>>>> KSP Object: 1 MPI process >>>>>>> type: gmres >>>>>>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>>>>>> happy breakdown tolerance 1e-30 >>>>>>> maximum iterations=20, initial guess is zero >>>>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. >>>>>>> left preconditioning >>>>>>> using PRECONDITIONED norm type for convergence test >>>>>>> PC Object: 1 MPI process >>>>>>> type: asm >>>>>>> total subdomain blocks = 1, amount of overlap = 0 >>>>>>> restriction/interpolation type - RESTRICT >>>>>>> Local solver information for first block is in the following KSP and PC objects on rank 0: >>>>>>> Use -ksp_view ::ascii_info_detail to display information for all blocks >>>>>>> KSP Object: (sub_) 1 MPI process >>>>>>> type: preonly >>>>>>> maximum iterations=10000, initial guess is zero >>>>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
>>>>>>> left preconditioning >>>>>>> using NONE norm type for convergence test >>>>>>> PC Object: (sub_) 1 MPI process >>>>>>> type: ilu >>>>>>> out-of-place factorization >>>>>>> 0 levels of fill >>>>>>> tolerance for zero pivot 2.22045e-14 >>>>>>> matrix ordering: natural >>>>>>> factor fill ratio given 1., needed 1. >>>>>>> Factored matrix follows: >>>>>>> Mat Object: (sub_) 1 MPI process >>>>>>> type: seqbaij >>>>>>> rows=16384, cols=16384, bs=16 >>>>>>> package used to perform factorization: petsc >>>>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>>>> block size is 16 >>>>>>> linear system matrix = precond matrix: >>>>>>> Mat Object: (sub_) 1 MPI process >>>>>>> type: seqbaij >>>>>>> rows=16384, cols=16384, bs=16 >>>>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>> block size is 16 >>>>>>> linear system matrix followed by preconditioner matrix: >>>>>>> Mat Object: 1 MPI process >>>>>>> type: mffd >>>>>>> rows=16384, cols=16384 >>>>>>> Matrix-free approximation: >>>>>>> err=1.49012e-08 (relative error in function evaluation) >>>>>>> Using wp compute h routine >>>>>>> Does not compute normU >>>>>>> Mat Object: 1 MPI process >>>>>>> type: seqbaij >>>>>>> rows=16384, cols=16384, bs=16 >>>>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>> block size is 16 >>>>>>> >>>>>>> On Thu, May 4, 2023 at 8:30?AM Mark Adams > wrote: >>>>>>>> If you are using MG what is the coarse grid solver? >>>>>>>> -snes_view might give you that. >>>>>>>> >>>>>>>> On Thu, May 4, 2023 at 8:25?AM Matthew Knepley > wrote: >>>>>>>>> On Thu, May 4, 2023 at 8:21?AM Mark Lohry > wrote: >>>>>>>>>>> Do they start very similarly and then slowly drift further apart? >>>>>>>>>> >>>>>>>>>> Yes, this. I take it this sounds familiar? >>>>>>>>>> >>>>>>>>>> See these two examples with 20 fixed iterations pasted at the end. The difference for one solve is slight (final SNES norm is identical to 5 digits), but in the context I'm using it in (repeated applications to solve a steady state multigrid problem, though here just one level) the differences add up such that I might reach global convergence in 35 iterations or 38. It's not the end of the world, but I was expecting that with -np 1 these would be identical and I'm not sure where the root cause would be. >>>>>>>>> >>>>>>>>> The initial KSP residual is different, so its the PC. Please send the output of -snes_view. If your ASM is using direct factorization, then it >>>>>>>>> could be randomness in whatever LU you are using. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Matt >>>>>>>>> >>>>>>>>>> 0 SNES Function norm 2.801842107848e+04 >>>>>>>>>> 0 KSP Residual norm 4.045639499595e+01 >>>>>>>>>> 1 KSP Residual norm 1.917999809040e+01 >>>>>>>>>> 2 KSP Residual norm 1.616048521958e+01 >>>>>>>>>> [...] 
>>>>>>>>>> 19 KSP Residual norm 8.788043518111e-01 >>>>>>>>>> 20 KSP Residual norm 6.570851270214e-01 >>>>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>>>>> 1 SNES Function norm 1.801309983345e+03 >>>>>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Same system, identical initial 0 SNES norm, 0 KSP is slightly different >>>>>>>>>> >>>>>>>>>> 0 SNES Function norm 2.801842107848e+04 >>>>>>>>>> 0 KSP Residual norm 4.045639473002e+01 >>>>>>>>>> 1 KSP Residual norm 1.917999883034e+01 >>>>>>>>>> 2 KSP Residual norm 1.616048572016e+01 >>>>>>>>>> [...] >>>>>>>>>> 19 KSP Residual norm 8.788046348957e-01 >>>>>>>>>> 20 KSP Residual norm 6.570859588610e-01 >>>>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>>>>> 1 SNES Function norm 1.801311320322e+03 >>>>>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>>>>>>> >>>>>>>>>> On Wed, May 3, 2023 at 11:05?PM Barry Smith > wrote: >>>>>>>>>>> >>>>>>>>>>> Do they start very similarly and then slowly drift further apart? That is the first couple of KSP iterations they are almost identical but then for each iteration get a bit further. Similar for the SNES iterations, starting close and then for more iterations and more solves they start moving apart. Or do they suddenly jump to be very different? You can run with -snes_monitor -ksp_monitor >>>>>>>>>>> >>>>>>>>>>>> On May 3, 2023, at 9:07 PM, Mark Lohry > wrote: >>>>>>>>>>>> >>>>>>>>>>>> This is on a single MPI rank. I haven't checked the coloring, was just guessing there. But the solutions/residuals are slightly different from run to run. >>>>>>>>>>>> >>>>>>>>>>>> Fair to say that for serial JFNK/asm ilu0/gmres we should expect bitwise identical results? >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Wed, May 3, 2023, 8:50 PM Barry Smith > wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> No, the coloring should be identical every time. Do you see differences with 1 MPI rank? (Or much smaller ones?). >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> > On May 3, 2023, at 8:42 PM, Mark Lohry > wrote: >>>>>>>>>>>>> > >>>>>>>>>>>>> > I'm running multiple iterations of newtonls with an MFFD/JFNK nonlinear solver where I give it the sparsity. PC asm, KSP gmres, with SNESSetLagJacobian -2 (compute once and then frozen jacobian). >>>>>>>>>>>>> > >>>>>>>>>>>>> > I'm seeing slight (<1%) but nonzero differences in residuals from run to run. I'm wondering where randomness might enter here -- does the jacobian coloring use a random seed? >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>>>>> -- Norbert Wiener >>>>>>>>> >>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
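For the variant discussed above, computing the sparse finite-difference Jacobian once, writing it to disk, and reusing the identical matrix in later runs, a minimal sketch with PETSc's binary viewer (the file name and variable names are illustrative, not from the exchange):

    /* write the assembled Jacobian to disk in PETSc binary format */
    PetscViewer viewer;
    PetscCall(PetscViewerBinaryOpen(PETSC_COMM_WORLD, "jacobian.dat", FILE_MODE_WRITE, &viewer));
    PetscCall(MatView(J, viewer));
    PetscCall(PetscViewerDestroy(&viewer));

    /* in a later run, read the same matrix back unchanged */
    Mat Jsaved;
    PetscCall(MatCreate(PETSC_COMM_WORLD, &Jsaved));
    PetscCall(MatSetFromOptions(Jsaved));   /* matrix type can be picked on the command line */
    PetscCall(PetscViewerBinaryOpen(PETSC_COMM_WORLD, "jacobian.dat", FILE_MODE_READ, &viewer));
    PetscCall(MatLoad(Jsaved, viewer));
    PetscCall(PetscViewerDestroy(&viewer));

The loaded matrix could then be handed to SNESSetJacobian() in place of the coloring callback, with the lag settings keeping it frozen, so every run factors and applies exactly the same numbers.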
URL: From knepley at gmail.com Thu May 4 16:22:34 2023 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 4 May 2023 17:22:34 -0400 Subject: [petsc-users] sources of floating point randomness in JFNK in serial In-Reply-To: References: <5318727D-B9F9-48BC-A7CE-94EBDB08566F@petsc.dev> Message-ID: On Thu, May 4, 2023 at 5:03?PM Mark Lohry wrote: > Do you get different results (in different runs) without >> -snes_mf_operator? So just using an explicit matrix? > > > Unfortunately I don't have an explicit matrix available for this, hence > the MFFD/JFNK. > I don't mean the actual matrix, I mean a representative matrix. > >> (Note: I am not convinced there is even a problem and think it may be >> simply different order of floating point operations in different runs.) >> > > I'm not convinced either, but running explicit RK for 10,000 iterations i > get exactly the same results every time so i'm fairly confident it's not > the residual evaluation. > How would there be a different order of floating point ops in different > runs in serial? > > No, I mean without -snes_mf_* (as Barry says), so we are just running that >> solver with a sparse matrix. This would give me confidence >> that nothing in the solver is variable. >> >> I could do the sparse finite difference jacobian once, save it to disk, > and then use that system each time. > Yes. That would work. Thanks, Matt > On Thu, May 4, 2023 at 4:57?PM Matthew Knepley wrote: > >> On Thu, May 4, 2023 at 4:44?PM Mark Lohry wrote: >> >>> Is your code valgrind clean? >>>> >>> >>> Yes, I also initialize all allocations with NaNs to be sure I'm not >>> using anything uninitialized. >>> >>> >>>> We can try and test this. Replace your MatMFFD with an actual matrix >>>> and run. Do you see any variability? >>>> >>> >>> I think I did what you're asking. I have -snes_mf_operator set, and then >>> SNESSetJacobian(snes, diag_ones, diag_ones, NULL, NULL) where diag_ones is >>> a matrix with ones on the diagonal. Two runs below, still with differences >>> but sometimes identical. >>> >> >> No, I mean without -snes_mf_* (as Barry says), so we are just running >> that solver with a sparse matrix. This would give me confidence >> that nothing in the solver is variable. 
>> >> Thanks, >> >> Matt >> >> >>> 0 SNES Function norm 3.424003312857e+04 >>> 0 KSP Residual norm 3.424003312857e+04 >>> 1 KSP Residual norm 2.871734444536e+04 >>> 2 KSP Residual norm 2.490276930242e+04 >>> 3 KSP Residual norm 2.131675872968e+04 >>> 4 KSP Residual norm 1.973129814235e+04 >>> 5 KSP Residual norm 1.832377856317e+04 >>> 6 KSP Residual norm 1.716783617436e+04 >>> 7 KSP Residual norm 1.583963149542e+04 >>> 8 KSP Residual norm 1.482272170304e+04 >>> 9 KSP Residual norm 1.380312106742e+04 >>> 10 KSP Residual norm 1.297793480658e+04 >>> 11 KSP Residual norm 1.208599123244e+04 >>> 12 KSP Residual norm 1.137345655227e+04 >>> 13 KSP Residual norm 1.059676909366e+04 >>> 14 KSP Residual norm 1.003823862398e+04 >>> 15 KSP Residual norm 9.425879221354e+03 >>> 16 KSP Residual norm 8.954805890038e+03 >>> 17 KSP Residual norm 8.592372470456e+03 >>> 18 KSP Residual norm 8.060707175821e+03 >>> 19 KSP Residual norm 7.782057728723e+03 >>> 20 KSP Residual norm 7.449686095424e+03 >>> Linear solve converged due to CONVERGED_ITS iterations 20 >>> KSP Object: 1 MPI process >>> type: gmres >>> restart=30, using Classical (unmodified) Gram-Schmidt >>> Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=20, initial guess is zero >>> tolerances: relative=0.1, absolute=1e-15, divergence=10. >>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI process >>> type: none >>> linear system matrix followed by preconditioner matrix: >>> Mat Object: 1 MPI process >>> type: mffd >>> rows=16384, cols=16384 >>> Matrix-free approximation: >>> err=1.49012e-08 (relative error in function evaluation) >>> Using wp compute h routine >>> Does not compute normU >>> Mat Object: 1 MPI process >>> type: seqaij >>> rows=16384, cols=16384 >>> total: nonzeros=16384, allocated nonzeros=16384 >>> total number of mallocs used during MatSetValues calls=0 >>> not using I-node routines >>> 1 SNES Function norm 1.085015646971e+04 >>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>> SNES Object: 1 MPI process >>> type: newtonls >>> maximum iterations=1, maximum function evaluations=-1 >>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>> total number of linear solver iterations=20 >>> total number of function evaluations=23 >>> norm schedule ALWAYS >>> Jacobian is never rebuilt >>> Jacobian is applied matrix-free with differencing >>> Preconditioning Jacobian is built using finite differences with >>> coloring >>> SNESLineSearch Object: 1 MPI process >>> type: basic >>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, >>> lambda=1.000000e-08 >>> maximum iterations=40 >>> KSP Object: 1 MPI process >>> type: gmres >>> restart=30, using Classical (unmodified) Gram-Schmidt >>> Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=20, initial guess is zero >>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI process >>> type: none >>> linear system matrix followed by preconditioner matrix: >>> Mat Object: 1 MPI process >>> type: mffd >>> rows=16384, cols=16384 >>> Matrix-free approximation: >>> err=1.49012e-08 (relative error in function evaluation) >>> Using wp compute h routine >>> Does not compute normU >>> Mat Object: 1 MPI process >>> type: seqaij >>> rows=16384, cols=16384 >>> total: nonzeros=16384, allocated nonzeros=16384 >>> total number of mallocs used during MatSetValues calls=0 >>> not using I-node routines >>> >>> 0 SNES Function norm 3.424003312857e+04 >>> 0 KSP Residual norm 3.424003312857e+04 >>> 1 KSP Residual norm 2.871734444536e+04 >>> 2 KSP Residual norm 2.490276931041e+04 >>> 3 KSP Residual norm 2.131675873776e+04 >>> 4 KSP Residual norm 1.973129814908e+04 >>> 5 KSP Residual norm 1.832377852186e+04 >>> 6 KSP Residual norm 1.716783608174e+04 >>> 7 KSP Residual norm 1.583963128956e+04 >>> 8 KSP Residual norm 1.482272160069e+04 >>> 9 KSP Residual norm 1.380312087005e+04 >>> 10 KSP Residual norm 1.297793458796e+04 >>> 11 KSP Residual norm 1.208599115602e+04 >>> 12 KSP Residual norm 1.137345657533e+04 >>> 13 KSP Residual norm 1.059676906197e+04 >>> 14 KSP Residual norm 1.003823857515e+04 >>> 15 KSP Residual norm 9.425879177747e+03 >>> 16 KSP Residual norm 8.954805850825e+03 >>> 17 KSP Residual norm 8.592372413320e+03 >>> 18 KSP Residual norm 8.060706994110e+03 >>> 19 KSP Residual norm 7.782057560782e+03 >>> 20 KSP Residual norm 7.449686034356e+03 >>> Linear solve converged due to CONVERGED_ITS iterations 20 >>> KSP Object: 1 MPI process >>> type: gmres >>> restart=30, using Classical (unmodified) Gram-Schmidt >>> Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=20, initial guess is zero >>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI process >>> type: none >>> linear system matrix followed by preconditioner matrix: >>> Mat Object: 1 MPI process >>> type: mffd >>> rows=16384, cols=16384 >>> Matrix-free approximation: >>> err=1.49012e-08 (relative error in function evaluation) >>> Using wp compute h routine >>> Does not compute normU >>> Mat Object: 1 MPI process >>> type: seqaij >>> rows=16384, cols=16384 >>> total: nonzeros=16384, allocated nonzeros=16384 >>> total number of mallocs used during MatSetValues calls=0 >>> not using I-node routines >>> 1 SNES Function norm 1.085015821006e+04 >>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>> SNES Object: 1 MPI process >>> type: newtonls >>> maximum iterations=1, maximum function evaluations=-1 >>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>> total number of linear solver iterations=20 >>> total number of function evaluations=23 >>> norm schedule ALWAYS >>> Jacobian is never rebuilt >>> Jacobian is applied matrix-free with differencing >>> Preconditioning Jacobian is built using finite differences with >>> coloring >>> SNESLineSearch Object: 1 MPI process >>> type: basic >>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, >>> lambda=1.000000e-08 >>> maximum iterations=40 >>> KSP Object: 1 MPI process >>> type: gmres >>> restart=30, using Classical (unmodified) Gram-Schmidt >>> Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=20, initial guess is zero >>> tolerances: relative=0.1, absolute=1e-15, divergence=10. >>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI process >>> type: none >>> linear system matrix followed by preconditioner matrix: >>> Mat Object: 1 MPI process >>> type: mffd >>> rows=16384, cols=16384 >>> Matrix-free approximation: >>> err=1.49012e-08 (relative error in function evaluation) >>> Using wp compute h routine >>> Does not compute normU >>> Mat Object: 1 MPI process >>> type: seqaij >>> rows=16384, cols=16384 >>> total: nonzeros=16384, allocated nonzeros=16384 >>> total number of mallocs used during MatSetValues calls=0 >>> not using I-node routines >>> >>> On Thu, May 4, 2023 at 10:10?AM Matthew Knepley >>> wrote: >>> >>>> On Thu, May 4, 2023 at 8:54?AM Mark Lohry wrote: >>>> >>>>> Try -pc_type none. >>>>>> >>>>> >>>>> With -pc_type none the 0 KSP residual looks identical. But *sometimes* >>>>> it's producing exactly the same history and others it's gradually >>>>> changing. I'm reasonably confident my residual evaluation has no >>>>> randomness, see info after the petsc output. >>>>> >>>> >>>> We can try and test this. Replace your MatMFFD with an actual matrix >>>> and run. Do you see any variability? >>>> >>>> If not, then it could be your routine, or it could be MatMFFD. So run a >>>> few with -snes_view, and we can see if the >>>> "w" parameter changes. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> solve history 1: >>>>> >>>>> 0 SNES Function norm 3.424003312857e+04 >>>>> 0 KSP Residual norm 3.424003312857e+04 >>>>> 1 KSP Residual norm 2.871734444536e+04 >>>>> 2 KSP Residual norm 2.490276931041e+04 >>>>> ... 
>>>>> 20 KSP Residual norm 7.449686034356e+03 >>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>> 1 SNES Function norm 1.085015821006e+04 >>>>> >>>>> solve history 2, identical to 1: >>>>> >>>>> 0 SNES Function norm 3.424003312857e+04 >>>>> 0 KSP Residual norm 3.424003312857e+04 >>>>> 1 KSP Residual norm 2.871734444536e+04 >>>>> 2 KSP Residual norm 2.490276931041e+04 >>>>> ... >>>>> 20 KSP Residual norm 7.449686034356e+03 >>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>> 1 SNES Function norm 1.085015821006e+04 >>>>> >>>>> solve history 3, identical KSP at 0 and 1, slight change at 2, growing >>>>> difference to the end: >>>>> 0 SNES Function norm 3.424003312857e+04 >>>>> 0 KSP Residual norm 3.424003312857e+04 >>>>> 1 KSP Residual norm 2.871734444536e+04 >>>>> 2 KSP Residual norm 2.490276930242e+04 >>>>> ... >>>>> 20 KSP Residual norm 7.449686095424e+03 >>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>> 1 SNES Function norm 1.085015646971e+04 >>>>> >>>>> >>>>> Ths is using a standard explicit 3-stage Runge-Kutta smoother for 10 >>>>> iterations, so 30 calls of the same residual evaluation, identical >>>>> residuals every time >>>>> >>>>> run 1: >>>>> >>>>> # iteration rho rhou rhov >>>>> rhoE abs_res rel_res >>>>> umin vmax vmin elapsed_time >>>>> >>>>> # >>>>> >>>>> >>>>> 1.00000e+00 1.086860616292e+00 2.782316758416e+02 >>>>> 4.482867643761e+00 2.993435920340e+02 2.04353e+02 >>>>> 1.00000e+00 -8.23945e-15 -6.15326e-15 -1.35563e-14 >>>>> 6.34834e-01 >>>>> 2.00000e+00 2.310547487017e+00 1.079059352425e+02 >>>>> 3.958323921837e+00 5.058927165686e+02 2.58647e+02 >>>>> 1.26568e+00 -1.02539e-14 -9.35368e-15 -1.69925e-14 >>>>> 6.40063e-01 >>>>> 3.00000e+00 2.361005867444e+00 5.706213331683e+01 >>>>> 6.130016323357e+00 4.688968362579e+02 2.36201e+02 >>>>> 1.15585e+00 -1.19370e-14 -1.15216e-14 -1.59733e-14 >>>>> 6.45166e-01 >>>>> 4.00000e+00 2.167518999963e+00 3.757541401594e+01 >>>>> 6.313917437428e+00 4.054310291628e+02 2.03612e+02 >>>>> 9.96372e-01 -1.81831e-14 -1.28312e-14 -1.46238e-14 >>>>> 6.50494e-01 >>>>> 5.00000e+00 1.941443738676e+00 2.884190334049e+01 >>>>> 6.237106158479e+00 3.539201037156e+02 1.77577e+02 >>>>> 8.68970e-01 3.56633e-14 -8.74089e-15 -1.06666e-14 >>>>> 6.55656e-01 >>>>> 6.00000e+00 1.736947124693e+00 2.429485695670e+01 >>>>> 5.996962200407e+00 3.148280178142e+02 1.57913e+02 >>>>> 7.72745e-01 -8.98634e-14 -2.41152e-14 -1.39713e-14 >>>>> 6.60872e-01 >>>>> 7.00000e+00 1.564153212635e+00 2.149609219810e+01 >>>>> 5.786910705204e+00 2.848717011033e+02 1.42872e+02 >>>>> 6.99144e-01 -2.95352e-13 -2.48158e-14 -2.39351e-14 >>>>> 6.66041e-01 >>>>> 8.00000e+00 1.419280815384e+00 1.950619804089e+01 >>>>> 5.627281158306e+00 2.606623371229e+02 1.30728e+02 >>>>> 6.39715e-01 8.98941e-13 1.09674e-13 3.78905e-14 >>>>> 6.71316e-01 >>>>> 9.00000e+00 1.296115915975e+00 1.794843530745e+01 >>>>> 5.514933264437e+00 2.401524522393e+02 1.20444e+02 >>>>> 5.89394e-01 1.70717e-12 1.38762e-14 1.09825e-13 >>>>> 6.76447e-01 >>>>> 1.00000e+01 1.189639693918e+00 1.665381754953e+01 >>>>> 5.433183087037e+00 2.222572900473e+02 1.11475e+02 >>>>> 5.45501e-01 -4.22462e-12 -7.15206e-13 -2.28736e-13 >>>>> 6.81716e-01 >>>>> >>>>> run N: >>>>> >>>>> >>>>> # >>>>> >>>>> >>>>> # iteration rho rhou rhov >>>>> rhoE abs_res rel_res >>>>> umin vmax vmin elapsed_time >>>>> >>>>> # >>>>> >>>>> >>>>> 1.00000e+00 1.086860616292e+00 2.782316758416e+02 >>>>> 4.482867643761e+00 2.993435920340e+02 2.04353e+02 >>>>> 1.00000e+00 -8.23945e-15 -6.15326e-15 -1.35563e-14 >>>>> 
6.23316e-01 >>>>> 2.00000e+00 2.310547487017e+00 1.079059352425e+02 >>>>> 3.958323921837e+00 5.058927165686e+02 2.58647e+02 >>>>> 1.26568e+00 -1.02539e-14 -9.35368e-15 -1.69925e-14 >>>>> 6.28510e-01 >>>>> 3.00000e+00 2.361005867444e+00 5.706213331683e+01 >>>>> 6.130016323357e+00 4.688968362579e+02 2.36201e+02 >>>>> 1.15585e+00 -1.19370e-14 -1.15216e-14 -1.59733e-14 >>>>> 6.33558e-01 >>>>> 4.00000e+00 2.167518999963e+00 3.757541401594e+01 >>>>> 6.313917437428e+00 4.054310291628e+02 2.03612e+02 >>>>> 9.96372e-01 -1.81831e-14 -1.28312e-14 -1.46238e-14 >>>>> 6.38773e-01 >>>>> 5.00000e+00 1.941443738676e+00 2.884190334049e+01 >>>>> 6.237106158479e+00 3.539201037156e+02 1.77577e+02 >>>>> 8.68970e-01 3.56633e-14 -8.74089e-15 -1.06666e-14 >>>>> 6.43887e-01 >>>>> 6.00000e+00 1.736947124693e+00 2.429485695670e+01 >>>>> 5.996962200407e+00 3.148280178142e+02 1.57913e+02 >>>>> 7.72745e-01 -8.98634e-14 -2.41152e-14 -1.39713e-14 >>>>> 6.49073e-01 >>>>> 7.00000e+00 1.564153212635e+00 2.149609219810e+01 >>>>> 5.786910705204e+00 2.848717011033e+02 1.42872e+02 >>>>> 6.99144e-01 -2.95352e-13 -2.48158e-14 -2.39351e-14 >>>>> 6.54167e-01 >>>>> 8.00000e+00 1.419280815384e+00 1.950619804089e+01 >>>>> 5.627281158306e+00 2.606623371229e+02 1.30728e+02 >>>>> 6.39715e-01 8.98941e-13 1.09674e-13 3.78905e-14 >>>>> 6.59394e-01 >>>>> 9.00000e+00 1.296115915975e+00 1.794843530745e+01 >>>>> 5.514933264437e+00 2.401524522393e+02 1.20444e+02 >>>>> 5.89394e-01 1.70717e-12 1.38762e-14 1.09825e-13 >>>>> 6.64516e-01 >>>>> 1.00000e+01 1.189639693918e+00 1.665381754953e+01 >>>>> 5.433183087037e+00 2.222572900473e+02 1.11475e+02 >>>>> 5.45501e-01 -4.22462e-12 -7.15206e-13 -2.28736e-13 >>>>> 6.69677e-01 >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> On Thu, May 4, 2023 at 8:41?AM Mark Adams wrote: >>>>> >>>>>> ASM is just the sub PC with one proc but gets weaker with more procs >>>>>> unless you use jacobi. (maybe I am missing something). >>>>>> >>>>>> On Thu, May 4, 2023 at 8:31?AM Mark Lohry wrote: >>>>>> >>>>>>> Please send the output of -snes_view. >>>>>>>> >>>>>>> pasted below. anything stand out? >>>>>>> >>>>>>> >>>>>>> SNES Object: 1 MPI process >>>>>>> type: newtonls >>>>>>> maximum iterations=1, maximum function evaluations=-1 >>>>>>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>>>>>> total number of linear solver iterations=20 >>>>>>> total number of function evaluations=22 >>>>>>> norm schedule ALWAYS >>>>>>> Jacobian is never rebuilt >>>>>>> Jacobian is applied matrix-free with differencing >>>>>>> Preconditioning Jacobian is built using finite differences with >>>>>>> coloring >>>>>>> SNESLineSearch Object: 1 MPI process >>>>>>> type: basic >>>>>>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>>>>>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, >>>>>>> lambda=1.000000e-08 >>>>>>> maximum iterations=40 >>>>>>> KSP Object: 1 MPI process >>>>>>> type: gmres >>>>>>> restart=30, using Classical (unmodified) Gram-Schmidt >>>>>>> Orthogonalization with no iterative refinement >>>>>>> happy breakdown tolerance 1e-30 >>>>>>> maximum iterations=20, initial guess is zero >>>>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>>>>>> left preconditioning >>>>>>> using PRECONDITIONED norm type for convergence test >>>>>>> PC Object: 1 MPI process >>>>>>> type: asm >>>>>>> total subdomain blocks = 1, amount of overlap = 0 >>>>>>> restriction/interpolation type - RESTRICT >>>>>>> Local solver information for first block is in the following >>>>>>> KSP and PC objects on rank 0: >>>>>>> Use -ksp_view ::ascii_info_detail to display information for >>>>>>> all blocks >>>>>>> KSP Object: (sub_) 1 MPI process >>>>>>> type: preonly >>>>>>> maximum iterations=10000, initial guess is zero >>>>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>>>> left preconditioning >>>>>>> using NONE norm type for convergence test >>>>>>> PC Object: (sub_) 1 MPI process >>>>>>> type: ilu >>>>>>> out-of-place factorization >>>>>>> 0 levels of fill >>>>>>> tolerance for zero pivot 2.22045e-14 >>>>>>> matrix ordering: natural >>>>>>> factor fill ratio given 1., needed 1. >>>>>>> Factored matrix follows: >>>>>>> Mat Object: (sub_) 1 MPI process >>>>>>> type: seqbaij >>>>>>> rows=16384, cols=16384, bs=16 >>>>>>> package used to perform factorization: petsc >>>>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>>>> block size is 16 >>>>>>> linear system matrix = precond matrix: >>>>>>> Mat Object: (sub_) 1 MPI process >>>>>>> type: seqbaij >>>>>>> rows=16384, cols=16384, bs=16 >>>>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>> block size is 16 >>>>>>> linear system matrix followed by preconditioner matrix: >>>>>>> Mat Object: 1 MPI process >>>>>>> type: mffd >>>>>>> rows=16384, cols=16384 >>>>>>> Matrix-free approximation: >>>>>>> err=1.49012e-08 (relative error in function evaluation) >>>>>>> Using wp compute h routine >>>>>>> Does not compute normU >>>>>>> Mat Object: 1 MPI process >>>>>>> type: seqbaij >>>>>>> rows=16384, cols=16384, bs=16 >>>>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>> block size is 16 >>>>>>> >>>>>>> On Thu, May 4, 2023 at 8:30?AM Mark Adams wrote: >>>>>>> >>>>>>>> If you are using MG what is the coarse grid solver? >>>>>>>> -snes_view might give you that. >>>>>>>> >>>>>>>> On Thu, May 4, 2023 at 8:25?AM Matthew Knepley >>>>>>>> wrote: >>>>>>>> >>>>>>>>> On Thu, May 4, 2023 at 8:21?AM Mark Lohry >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> Do they start very similarly and then slowly drift further apart? >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Yes, this. I take it this sounds familiar? >>>>>>>>>> >>>>>>>>>> See these two examples with 20 fixed iterations pasted at the >>>>>>>>>> end. The difference for one solve is slight (final SNES norm is identical >>>>>>>>>> to 5 digits), but in the context I'm using it in (repeated applications to >>>>>>>>>> solve a steady state multigrid problem, though here just one level) the >>>>>>>>>> differences add up such that I might reach global convergence in 35 >>>>>>>>>> iterations or 38. It's not the end of the world, but I was expecting that >>>>>>>>>> with -np 1 these would be identical and I'm not sure where the root cause >>>>>>>>>> would be. >>>>>>>>>> >>>>>>>>> >>>>>>>>> The initial KSP residual is different, so its the PC. Please send >>>>>>>>> the output of -snes_view. If your ASM is using direct factorization, then it >>>>>>>>> could be randomness in whatever LU you are using. 
>>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Matt >>>>>>>>> >>>>>>>>> >>>>>>>>>> 0 SNES Function norm 2.801842107848e+04 >>>>>>>>>> 0 KSP Residual norm 4.045639499595e+01 >>>>>>>>>> 1 KSP Residual norm 1.917999809040e+01 >>>>>>>>>> 2 KSP Residual norm 1.616048521958e+01 >>>>>>>>>> [...] >>>>>>>>>> 19 KSP Residual norm 8.788043518111e-01 >>>>>>>>>> 20 KSP Residual norm 6.570851270214e-01 >>>>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>>>>> 1 SNES Function norm 1.801309983345e+03 >>>>>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Same system, identical initial 0 SNES norm, 0 KSP is slightly >>>>>>>>>> different >>>>>>>>>> >>>>>>>>>> 0 SNES Function norm 2.801842107848e+04 >>>>>>>>>> 0 KSP Residual norm 4.045639473002e+01 >>>>>>>>>> 1 KSP Residual norm 1.917999883034e+01 >>>>>>>>>> 2 KSP Residual norm 1.616048572016e+01 >>>>>>>>>> [...] >>>>>>>>>> 19 KSP Residual norm 8.788046348957e-01 >>>>>>>>>> 20 KSP Residual norm 6.570859588610e-01 >>>>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>>>>> 1 SNES Function norm 1.801311320322e+03 >>>>>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>>>>>>> >>>>>>>>>> On Wed, May 3, 2023 at 11:05?PM Barry Smith >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Do they start very similarly and then slowly drift further >>>>>>>>>>> apart? That is the first couple of KSP iterations they are almost identical >>>>>>>>>>> but then for each iteration get a bit further. Similar for the SNES >>>>>>>>>>> iterations, starting close and then for more iterations and more solves >>>>>>>>>>> they start moving apart. Or do they suddenly jump to be very different? You >>>>>>>>>>> can run with -snes_monitor -ksp_monitor >>>>>>>>>>> >>>>>>>>>>> On May 3, 2023, at 9:07 PM, Mark Lohry wrote: >>>>>>>>>>> >>>>>>>>>>> This is on a single MPI rank. I haven't checked the coloring, >>>>>>>>>>> was just guessing there. But the solutions/residuals are slightly different >>>>>>>>>>> from run to run. >>>>>>>>>>> >>>>>>>>>>> Fair to say that for serial JFNK/asm ilu0/gmres we should expect >>>>>>>>>>> bitwise identical results? >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Wed, May 3, 2023, 8:50 PM Barry Smith >>>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> No, the coloring should be identical every time. Do you see >>>>>>>>>>>> differences with 1 MPI rank? (Or much smaller ones?). >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> > On May 3, 2023, at 8:42 PM, Mark Lohry >>>>>>>>>>>> wrote: >>>>>>>>>>>> > >>>>>>>>>>>> > I'm running multiple iterations of newtonls with an MFFD/JFNK >>>>>>>>>>>> nonlinear solver where I give it the sparsity. PC asm, KSP gmres, with >>>>>>>>>>>> SNESSetLagJacobian -2 (compute once and then frozen jacobian). >>>>>>>>>>>> > >>>>>>>>>>>> > I'm seeing slight (<1%) but nonzero differences in residuals >>>>>>>>>>>> from run to run. I'm wondering where randomness might enter here -- does >>>>>>>>>>>> the jacobian coloring use a random seed? >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>> experiments lead. 
>>>>>>>>> -- Norbert Wiener >>>>>>>>> >>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>> >>>>>>>>> >>>>>>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mlohry at gmail.com Thu May 4 16:35:53 2023 From: mlohry at gmail.com (Mark Lohry) Date: Thu, 4 May 2023 17:35:53 -0400 Subject: [petsc-users] sources of floating point randomness in JFNK in serial In-Reply-To: References: <5318727D-B9F9-48BC-A7CE-94EBDB08566F@petsc.dev> Message-ID: > > Sure, but why only once and why save to disk? Why not just use that > computed approximate Jacobian at each Newton step to drive the Newton > solves along for a bunch of time steps? Ah, I get what you mean. Okay, I did three Newton steps with the same LHS, with a few repeated manual tests. 3 out of 4 times I got exactly the same history. Is it in the realm of possibility that a hardware error (a bad memory bit or something) could cause something this subtle? Two runs of 3 Newton solves are below, ever so slightly different. 0 SNES Function norm 3.424003312857e+04 0 KSP Residual norm 3.424003312857e+04 1 KSP Residual norm 2.886124328003e+04 2 KSP Residual norm 2.504664994246e+04 3 KSP Residual norm 2.104615835161e+04 4 KSP Residual norm 1.938102896632e+04 5 KSP Residual norm 1.793774642408e+04 6 KSP Residual norm 1.671392566980e+04 7 KSP Residual norm 1.501504103873e+04 8 KSP Residual norm 1.366362900747e+04 9 KSP Residual norm 1.240398500429e+04 10 KSP Residual norm 1.156293733914e+04 11 KSP Residual norm 1.066296477958e+04 12 KSP Residual norm 9.835601966950e+03 13 KSP Residual norm 9.017480191491e+03 14 KSP Residual norm 8.415336139780e+03 15 KSP Residual norm 7.807497808435e+03 16 KSP Residual norm 7.341703768294e+03 17 KSP Residual norm 6.979298049282e+03 18 KSP Residual norm 6.521277772081e+03 19 KSP Residual norm 6.174842408773e+03 20 KSP Residual norm 5.889819665003e+03 Linear solve converged due to CONVERGED_ITS iterations 20 KSP Object: 1 MPI process type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=20, initial guess is zero tolerances: relative=0.1, absolute=1e-15, divergence=10.
left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI process type: none linear system matrix = precond matrix: Mat Object: 1 MPI process type: seqbaij rows=16384, cols=16384, bs=16 total: nonzeros=1277952, allocated nonzeros=1277952 total number of mallocs used during MatSetValues calls=0 block size is 16 1 SNES Function norm 1.000525348433e+04 Nonlinear solve converged due to CONVERGED_ITS iterations 1 SNES Object: 1 MPI process type: newtonls maximum iterations=1, maximum function evaluations=-1 tolerances: relative=0.1, absolute=1e-15, solution=1e-15 total number of linear solver iterations=20 total number of function evaluations=2 norm schedule ALWAYS Jacobian is never rebuilt Jacobian is built using finite differences with coloring SNESLineSearch Object: 1 MPI process type: basic maxstep=1.000000e+08, minlambda=1.000000e-12 tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 maximum iterations=40 KSP Object: 1 MPI process type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=20, initial guess is zero tolerances: relative=0.1, absolute=1e-15, divergence=10. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI process type: none linear system matrix = precond matrix: Mat Object: 1 MPI process type: seqbaij rows=16384, cols=16384, bs=16 total: nonzeros=1277952, allocated nonzeros=1277952 total number of mallocs used during MatSetValues calls=0 block size is 16 0 SNES Function norm 1.000525348433e+04 0 KSP Residual norm 1.000525348433e+04 1 KSP Residual norm 7.908741564765e+03 2 KSP Residual norm 6.825263536686e+03 3 KSP Residual norm 6.224930664968e+03 4 KSP Residual norm 6.095547180532e+03 5 KSP Residual norm 5.952968230430e+03 6 KSP Residual norm 5.861251998116e+03 7 KSP Residual norm 5.712439327755e+03 8 KSP Residual norm 5.583056913266e+03 9 KSP Residual norm 5.461768804626e+03 10 KSP Residual norm 5.351937611098e+03 11 KSP Residual norm 5.224288337578e+03 12 KSP Residual norm 5.129863847081e+03 13 KSP Residual norm 5.010818237218e+03 14 KSP Residual norm 4.907162936199e+03 15 KSP Residual norm 4.789564773955e+03 16 KSP Residual norm 4.695173370720e+03 17 KSP Residual norm 4.584070962171e+03 18 KSP Residual norm 4.483061424742e+03 19 KSP Residual norm 4.373384070745e+03 20 KSP Residual norm 4.260704657592e+03 Linear solve converged due to CONVERGED_ITS iterations 20 KSP Object: 1 MPI process type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=20, initial guess is zero tolerances: relative=0.1, absolute=1e-15, divergence=10. 
left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI process type: none linear system matrix = precond matrix: Mat Object: 1 MPI process type: seqbaij rows=16384, cols=16384, bs=16 total: nonzeros=1277952, allocated nonzeros=1277952 total number of mallocs used during MatSetValues calls=0 block size is 16 1 SNES Function norm 4.662386014882e+03 Nonlinear solve converged due to CONVERGED_ITS iterations 1 SNES Object: 1 MPI process type: newtonls maximum iterations=1, maximum function evaluations=-1 tolerances: relative=0.1, absolute=1e-15, solution=1e-15 total number of linear solver iterations=20 total number of function evaluations=2 norm schedule ALWAYS Jacobian is never rebuilt Jacobian is built using finite differences with coloring SNESLineSearch Object: 1 MPI process type: basic maxstep=1.000000e+08, minlambda=1.000000e-12 tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 maximum iterations=40 KSP Object: 1 MPI process type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=20, initial guess is zero tolerances: relative=0.1, absolute=1e-15, divergence=10. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI process type: none linear system matrix = precond matrix: Mat Object: 1 MPI process type: seqbaij rows=16384, cols=16384, bs=16 total: nonzeros=1277952, allocated nonzeros=1277952 total number of mallocs used during MatSetValues calls=0 block size is 16 0 SNES Function norm 4.662386014882e+03 0 KSP Residual norm 4.662386014882e+03 1 KSP Residual norm 4.408316259864e+03 2 KSP Residual norm 4.184867769829e+03 3 KSP Residual norm 4.079091244351e+03 4 KSP Residual norm 4.009247390166e+03 5 KSP Residual norm 3.928417371428e+03 6 KSP Residual norm 3.865152075780e+03 7 KSP Residual norm 3.795606446033e+03 8 KSP Residual norm 3.735294554158e+03 9 KSP Residual norm 3.674393726487e+03 10 KSP Residual norm 3.617795166786e+03 11 KSP Residual norm 3.563807982274e+03 12 KSP Residual norm 3.512269444921e+03 13 KSP Residual norm 3.455110223236e+03 14 KSP Residual norm 3.407141247372e+03 15 KSP Residual norm 3.356562415982e+03 16 KSP Residual norm 3.312720047685e+03 17 KSP Residual norm 3.263690150810e+03 18 KSP Residual norm 3.219359862444e+03 19 KSP Residual norm 3.173500955995e+03 20 KSP Residual norm 3.127528790155e+03 Linear solve converged due to CONVERGED_ITS iterations 20 KSP Object: 1 MPI process type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=20, initial guess is zero tolerances: relative=0.1, absolute=1e-15, divergence=10. 
left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI process type: none linear system matrix = precond matrix: Mat Object: 1 MPI process type: seqbaij rows=16384, cols=16384, bs=16 total: nonzeros=1277952, allocated nonzeros=1277952 total number of mallocs used during MatSetValues calls=0 block size is 16 1 SNES Function norm 3.186752172556e+03 Nonlinear solve converged due to CONVERGED_ITS iterations 1 SNES Object: 1 MPI process type: newtonls maximum iterations=1, maximum function evaluations=-1 tolerances: relative=0.1, absolute=1e-15, solution=1e-15 total number of linear solver iterations=20 total number of function evaluations=2 norm schedule ALWAYS Jacobian is never rebuilt Jacobian is built using finite differences with coloring SNESLineSearch Object: 1 MPI process type: basic maxstep=1.000000e+08, minlambda=1.000000e-12 tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 maximum iterations=40 KSP Object: 1 MPI process type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=20, initial guess is zero tolerances: relative=0.1, absolute=1e-15, divergence=10. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI process type: none linear system matrix = precond matrix: Mat Object: 1 MPI process type: seqbaij rows=16384, cols=16384, bs=16 total: nonzeros=1277952, allocated nonzeros=1277952 total number of mallocs used during MatSetValues calls=0 block size is 16 0 SNES Function norm 3.424003312857e+04 0 KSP Residual norm 3.424003312857e+04 1 KSP Residual norm 2.886124328003e+04 2 KSP Residual norm 2.504664994221e+04 3 KSP Residual norm 2.104615835130e+04 4 KSP Residual norm 1.938102896610e+04 5 KSP Residual norm 1.793774642406e+04 6 KSP Residual norm 1.671392566981e+04 7 KSP Residual norm 1.501504103854e+04 8 KSP Residual norm 1.366362900726e+04 9 KSP Residual norm 1.240398500414e+04 10 KSP Residual norm 1.156293733914e+04 11 KSP Residual norm 1.066296477972e+04 12 KSP Residual norm 9.835601967036e+03 13 KSP Residual norm 9.017480191500e+03 14 KSP Residual norm 8.415336139732e+03 15 KSP Residual norm 7.807497808414e+03 16 KSP Residual norm 7.341703768300e+03 17 KSP Residual norm 6.979298049244e+03 18 KSP Residual norm 6.521277772042e+03 19 KSP Residual norm 6.174842408713e+03 20 KSP Residual norm 5.889819664983e+03 Linear solve converged due to CONVERGED_ITS iterations 20 KSP Object: 1 MPI process type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=20, initial guess is zero tolerances: relative=0.1, absolute=1e-15, divergence=10. 
left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI process type: none linear system matrix = precond matrix: Mat Object: 1 MPI process type: seqbaij rows=16384, cols=16384, bs=16 total: nonzeros=1277952, allocated nonzeros=1277952 total number of mallocs used during MatSetValues calls=0 block size is 16 1 SNES Function norm 1.000525348435e+04 Nonlinear solve converged due to CONVERGED_ITS iterations 1 SNES Object: 1 MPI process type: newtonls maximum iterations=1, maximum function evaluations=-1 tolerances: relative=0.1, absolute=1e-15, solution=1e-15 total number of linear solver iterations=20 total number of function evaluations=2 norm schedule ALWAYS Jacobian is never rebuilt Jacobian is built using finite differences with coloring SNESLineSearch Object: 1 MPI process type: basic maxstep=1.000000e+08, minlambda=1.000000e-12 tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 maximum iterations=40 KSP Object: 1 MPI process type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=20, initial guess is zero tolerances: relative=0.1, absolute=1e-15, divergence=10. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI process type: none linear system matrix = precond matrix: Mat Object: 1 MPI process type: seqbaij rows=16384, cols=16384, bs=16 total: nonzeros=1277952, allocated nonzeros=1277952 total number of mallocs used during MatSetValues calls=0 block size is 16 0 SNES Function norm 1.000525348435e+04 0 KSP Residual norm 1.000525348435e+04 1 KSP Residual norm 7.908741565645e+03 2 KSP Residual norm 6.825263536988e+03 3 KSP Residual norm 6.224930664967e+03 4 KSP Residual norm 6.095547180474e+03 5 KSP Residual norm 5.952968230397e+03 6 KSP Residual norm 5.861251998127e+03 7 KSP Residual norm 5.712439327726e+03 8 KSP Residual norm 5.583056913167e+03 9 KSP Residual norm 5.461768804526e+03 10 KSP Residual norm 5.351937611030e+03 11 KSP Residual norm 5.224288337536e+03 12 KSP Residual norm 5.129863847028e+03 13 KSP Residual norm 5.010818237161e+03 14 KSP Residual norm 4.907162936143e+03 15 KSP Residual norm 4.789564773923e+03 16 KSP Residual norm 4.695173370709e+03 17 KSP Residual norm 4.584070962145e+03 18 KSP Residual norm 4.483061424714e+03 19 KSP Residual norm 4.373384070713e+03 20 KSP Residual norm 4.260704657576e+03 Linear solve converged due to CONVERGED_ITS iterations 20 KSP Object: 1 MPI process type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=20, initial guess is zero tolerances: relative=0.1, absolute=1e-15, divergence=10. 
left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI process type: none linear system matrix = precond matrix: Mat Object: 1 MPI process type: seqbaij rows=16384, cols=16384, bs=16 total: nonzeros=1277952, allocated nonzeros=1277952 total number of mallocs used during MatSetValues calls=0 block size is 16 1 SNES Function norm 4.662386014874e+03 Nonlinear solve converged due to CONVERGED_ITS iterations 1 SNES Object: 1 MPI process type: newtonls maximum iterations=1, maximum function evaluations=-1 tolerances: relative=0.1, absolute=1e-15, solution=1e-15 total number of linear solver iterations=20 total number of function evaluations=2 norm schedule ALWAYS Jacobian is never rebuilt Jacobian is built using finite differences with coloring SNESLineSearch Object: 1 MPI process type: basic maxstep=1.000000e+08, minlambda=1.000000e-12 tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 maximum iterations=40 KSP Object: 1 MPI process type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=20, initial guess is zero tolerances: relative=0.1, absolute=1e-15, divergence=10. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI process type: none linear system matrix = precond matrix: Mat Object: 1 MPI process type: seqbaij rows=16384, cols=16384, bs=16 total: nonzeros=1277952, allocated nonzeros=1277952 total number of mallocs used during MatSetValues calls=0 block size is 16 0 SNES Function norm 4.662386014874e+03 0 KSP Residual norm 4.662386014874e+03 1 KSP Residual norm 4.408316259834e+03 2 KSP Residual norm 4.184867769891e+03 3 KSP Residual norm 4.079091244367e+03 4 KSP Residual norm 4.009247390184e+03 5 KSP Residual norm 3.928417371457e+03 6 KSP Residual norm 3.865152075802e+03 7 KSP Residual norm 3.795606446041e+03 8 KSP Residual norm 3.735294554160e+03 9 KSP Residual norm 3.674393726485e+03 10 KSP Residual norm 3.617795166775e+03 11 KSP Residual norm 3.563807982249e+03 12 KSP Residual norm 3.512269444873e+03 13 KSP Residual norm 3.455110223193e+03 14 KSP Residual norm 3.407141247334e+03 15 KSP Residual norm 3.356562415949e+03 16 KSP Residual norm 3.312720047652e+03 17 KSP Residual norm 3.263690150782e+03 18 KSP Residual norm 3.219359862425e+03 19 KSP Residual norm 3.173500955997e+03 20 KSP Residual norm 3.127528790156e+03 Linear solve converged due to CONVERGED_ITS iterations 20 KSP Object: 1 MPI process type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=20, initial guess is zero tolerances: relative=0.1, absolute=1e-15, divergence=10. 
left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI process type: none linear system matrix = precond matrix: Mat Object: 1 MPI process type: seqbaij rows=16384, cols=16384, bs=16 total: nonzeros=1277952, allocated nonzeros=1277952 total number of mallocs used during MatSetValues calls=0 block size is 16 1 SNES Function norm 3.186752172503e+03 Nonlinear solve converged due to CONVERGED_ITS iterations 1 SNES Object: 1 MPI process type: newtonls maximum iterations=1, maximum function evaluations=-1 tolerances: relative=0.1, absolute=1e-15, solution=1e-15 total number of linear solver iterations=20 total number of function evaluations=2 norm schedule ALWAYS Jacobian is never rebuilt Jacobian is built using finite differences with coloring SNESLineSearch Object: 1 MPI process type: basic maxstep=1.000000e+08, minlambda=1.000000e-12 tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 maximum iterations=40 KSP Object: 1 MPI process type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=20, initial guess is zero tolerances: relative=0.1, absolute=1e-15, divergence=10. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI process type: none linear system matrix = precond matrix: Mat Object: 1 MPI process type: seqbaij rows=16384, cols=16384, bs=16 total: nonzeros=1277952, allocated nonzeros=1277952 total number of mallocs used during MatSetValues calls=0 block size is 16 On Thu, May 4, 2023 at 5:22?PM Matthew Knepley wrote: > On Thu, May 4, 2023 at 5:03?PM Mark Lohry wrote: > >> Do you get different results (in different runs) without >>> -snes_mf_operator? So just using an explicit matrix? >> >> >> Unfortunately I don't have an explicit matrix available for this, hence >> the MFFD/JFNK. >> > > I don't mean the actual matrix, I mean a representative matrix. > > >> >>> (Note: I am not convinced there is even a problem and think it may be >>> simply different order of floating point operations in different runs.) >>> >> >> I'm not convinced either, but running explicit RK for 10,000 iterations i >> get exactly the same results every time so i'm fairly confident it's not >> the residual evaluation. >> How would there be a different order of floating point ops in different >> runs in serial? >> >> No, I mean without -snes_mf_* (as Barry says), so we are just running >>> that solver with a sparse matrix. This would give me confidence >>> that nothing in the solver is variable. >>> >>> I could do the sparse finite difference jacobian once, save it to disk, >> and then use that system each time. >> > > Yes. That would work. > > Thanks, > > Matt > > >> On Thu, May 4, 2023 at 4:57?PM Matthew Knepley wrote: >> >>> On Thu, May 4, 2023 at 4:44?PM Mark Lohry wrote: >>> >>>> Is your code valgrind clean? >>>>> >>>> >>>> Yes, I also initialize all allocations with NaNs to be sure I'm not >>>> using anything uninitialized. >>>> >>>> >>>>> We can try and test this. Replace your MatMFFD with an actual matrix >>>>> and run. Do you see any variability? >>>>> >>>> >>>> I think I did what you're asking. I have -snes_mf_operator set, and >>>> then SNESSetJacobian(snes, diag_ones, diag_ones, NULL, NULL) where >>>> diag_ones is a matrix with ones on the diagonal. Two runs below, still with >>>> differences but sometimes identical. 
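(For reference, the "same LHS" test above boils down to roughly the following sketch; "snes", "x", and "nsolves" are placeholders, error checking is omitted, and only the SNESSetLagJacobian(-2) usage comes from the actual runs; the "persists" call is an assumption for keeping the matrix frozen across solves.)

SNESSetLagJacobian(snes, -2);                  /* build the colored FD Jacobian once, then freeze */
SNESSetLagJacobianPersists(snes, PETSC_TRUE);  /* keep the frozen Jacobian across SNESSolve() calls */
for (PetscInt i = 0; i < nsolves; ++i) {
  SNESSolve(snes, NULL, x);                    /* three calls correspond to the three solves above */
}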
>>>> >>> >>> No, I mean without -snes_mf_* (as Barry says), so we are just running >>> that solver with a sparse matrix. This would give me confidence >>> that nothing in the solver is variable. >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> 0 SNES Function norm 3.424003312857e+04 >>>> 0 KSP Residual norm 3.424003312857e+04 >>>> 1 KSP Residual norm 2.871734444536e+04 >>>> 2 KSP Residual norm 2.490276930242e+04 >>>> 3 KSP Residual norm 2.131675872968e+04 >>>> 4 KSP Residual norm 1.973129814235e+04 >>>> 5 KSP Residual norm 1.832377856317e+04 >>>> 6 KSP Residual norm 1.716783617436e+04 >>>> 7 KSP Residual norm 1.583963149542e+04 >>>> 8 KSP Residual norm 1.482272170304e+04 >>>> 9 KSP Residual norm 1.380312106742e+04 >>>> 10 KSP Residual norm 1.297793480658e+04 >>>> 11 KSP Residual norm 1.208599123244e+04 >>>> 12 KSP Residual norm 1.137345655227e+04 >>>> 13 KSP Residual norm 1.059676909366e+04 >>>> 14 KSP Residual norm 1.003823862398e+04 >>>> 15 KSP Residual norm 9.425879221354e+03 >>>> 16 KSP Residual norm 8.954805890038e+03 >>>> 17 KSP Residual norm 8.592372470456e+03 >>>> 18 KSP Residual norm 8.060707175821e+03 >>>> 19 KSP Residual norm 7.782057728723e+03 >>>> 20 KSP Residual norm 7.449686095424e+03 >>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>> KSP Object: 1 MPI process >>>> type: gmres >>>> restart=30, using Classical (unmodified) Gram-Schmidt >>>> Orthogonalization with no iterative refinement >>>> happy breakdown tolerance 1e-30 >>>> maximum iterations=20, initial guess is zero >>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. >>>> left preconditioning >>>> using PRECONDITIONED norm type for convergence test >>>> PC Object: 1 MPI process >>>> type: none >>>> linear system matrix followed by preconditioner matrix: >>>> Mat Object: 1 MPI process >>>> type: mffd >>>> rows=16384, cols=16384 >>>> Matrix-free approximation: >>>> err=1.49012e-08 (relative error in function evaluation) >>>> Using wp compute h routine >>>> Does not compute normU >>>> Mat Object: 1 MPI process >>>> type: seqaij >>>> rows=16384, cols=16384 >>>> total: nonzeros=16384, allocated nonzeros=16384 >>>> total number of mallocs used during MatSetValues calls=0 >>>> not using I-node routines >>>> 1 SNES Function norm 1.085015646971e+04 >>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>> SNES Object: 1 MPI process >>>> type: newtonls >>>> maximum iterations=1, maximum function evaluations=-1 >>>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>>> total number of linear solver iterations=20 >>>> total number of function evaluations=23 >>>> norm schedule ALWAYS >>>> Jacobian is never rebuilt >>>> Jacobian is applied matrix-free with differencing >>>> Preconditioning Jacobian is built using finite differences with >>>> coloring >>>> SNESLineSearch Object: 1 MPI process >>>> type: basic >>>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, >>>> lambda=1.000000e-08 >>>> maximum iterations=40 >>>> KSP Object: 1 MPI process >>>> type: gmres >>>> restart=30, using Classical (unmodified) Gram-Schmidt >>>> Orthogonalization with no iterative refinement >>>> happy breakdown tolerance 1e-30 >>>> maximum iterations=20, initial guess is zero >>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>>> left preconditioning >>>> using PRECONDITIONED norm type for convergence test >>>> PC Object: 1 MPI process >>>> type: none >>>> linear system matrix followed by preconditioner matrix: >>>> Mat Object: 1 MPI process >>>> type: mffd >>>> rows=16384, cols=16384 >>>> Matrix-free approximation: >>>> err=1.49012e-08 (relative error in function evaluation) >>>> Using wp compute h routine >>>> Does not compute normU >>>> Mat Object: 1 MPI process >>>> type: seqaij >>>> rows=16384, cols=16384 >>>> total: nonzeros=16384, allocated nonzeros=16384 >>>> total number of mallocs used during MatSetValues calls=0 >>>> not using I-node routines >>>> >>>> 0 SNES Function norm 3.424003312857e+04 >>>> 0 KSP Residual norm 3.424003312857e+04 >>>> 1 KSP Residual norm 2.871734444536e+04 >>>> 2 KSP Residual norm 2.490276931041e+04 >>>> 3 KSP Residual norm 2.131675873776e+04 >>>> 4 KSP Residual norm 1.973129814908e+04 >>>> 5 KSP Residual norm 1.832377852186e+04 >>>> 6 KSP Residual norm 1.716783608174e+04 >>>> 7 KSP Residual norm 1.583963128956e+04 >>>> 8 KSP Residual norm 1.482272160069e+04 >>>> 9 KSP Residual norm 1.380312087005e+04 >>>> 10 KSP Residual norm 1.297793458796e+04 >>>> 11 KSP Residual norm 1.208599115602e+04 >>>> 12 KSP Residual norm 1.137345657533e+04 >>>> 13 KSP Residual norm 1.059676906197e+04 >>>> 14 KSP Residual norm 1.003823857515e+04 >>>> 15 KSP Residual norm 9.425879177747e+03 >>>> 16 KSP Residual norm 8.954805850825e+03 >>>> 17 KSP Residual norm 8.592372413320e+03 >>>> 18 KSP Residual norm 8.060706994110e+03 >>>> 19 KSP Residual norm 7.782057560782e+03 >>>> 20 KSP Residual norm 7.449686034356e+03 >>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>> KSP Object: 1 MPI process >>>> type: gmres >>>> restart=30, using Classical (unmodified) Gram-Schmidt >>>> Orthogonalization with no iterative refinement >>>> happy breakdown tolerance 1e-30 >>>> maximum iterations=20, initial guess is zero >>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>>> left preconditioning >>>> using PRECONDITIONED norm type for convergence test >>>> PC Object: 1 MPI process >>>> type: none >>>> linear system matrix followed by preconditioner matrix: >>>> Mat Object: 1 MPI process >>>> type: mffd >>>> rows=16384, cols=16384 >>>> Matrix-free approximation: >>>> err=1.49012e-08 (relative error in function evaluation) >>>> Using wp compute h routine >>>> Does not compute normU >>>> Mat Object: 1 MPI process >>>> type: seqaij >>>> rows=16384, cols=16384 >>>> total: nonzeros=16384, allocated nonzeros=16384 >>>> total number of mallocs used during MatSetValues calls=0 >>>> not using I-node routines >>>> 1 SNES Function norm 1.085015821006e+04 >>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>> SNES Object: 1 MPI process >>>> type: newtonls >>>> maximum iterations=1, maximum function evaluations=-1 >>>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>>> total number of linear solver iterations=20 >>>> total number of function evaluations=23 >>>> norm schedule ALWAYS >>>> Jacobian is never rebuilt >>>> Jacobian is applied matrix-free with differencing >>>> Preconditioning Jacobian is built using finite differences with >>>> coloring >>>> SNESLineSearch Object: 1 MPI process >>>> type: basic >>>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, >>>> lambda=1.000000e-08 >>>> maximum iterations=40 >>>> KSP Object: 1 MPI process >>>> type: gmres >>>> restart=30, using Classical (unmodified) Gram-Schmidt >>>> Orthogonalization with no iterative refinement >>>> happy breakdown tolerance 1e-30 >>>> maximum iterations=20, initial guess is zero >>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. >>>> left preconditioning >>>> using PRECONDITIONED norm type for convergence test >>>> PC Object: 1 MPI process >>>> type: none >>>> linear system matrix followed by preconditioner matrix: >>>> Mat Object: 1 MPI process >>>> type: mffd >>>> rows=16384, cols=16384 >>>> Matrix-free approximation: >>>> err=1.49012e-08 (relative error in function evaluation) >>>> Using wp compute h routine >>>> Does not compute normU >>>> Mat Object: 1 MPI process >>>> type: seqaij >>>> rows=16384, cols=16384 >>>> total: nonzeros=16384, allocated nonzeros=16384 >>>> total number of mallocs used during MatSetValues calls=0 >>>> not using I-node routines >>>> >>>> On Thu, May 4, 2023 at 10:10?AM Matthew Knepley >>>> wrote: >>>> >>>>> On Thu, May 4, 2023 at 8:54?AM Mark Lohry wrote: >>>>> >>>>>> Try -pc_type none. >>>>>>> >>>>>> >>>>>> With -pc_type none the 0 KSP residual looks identical. But >>>>>> *sometimes* it's producing exactly the same history and others it's >>>>>> gradually changing. I'm reasonably confident my residual evaluation has no >>>>>> randomness, see info after the petsc output. >>>>>> >>>>> >>>>> We can try and test this. Replace your MatMFFD with an actual matrix >>>>> and run. Do you see any variability? >>>>> >>>>> If not, then it could be your routine, or it could be MatMFFD. So run >>>>> a few with -snes_view, and we can see if the >>>>> "w" parameter changes. >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> solve history 1: >>>>>> >>>>>> 0 SNES Function norm 3.424003312857e+04 >>>>>> 0 KSP Residual norm 3.424003312857e+04 >>>>>> 1 KSP Residual norm 2.871734444536e+04 >>>>>> 2 KSP Residual norm 2.490276931041e+04 >>>>>> ... 
>>>>>> 20 KSP Residual norm 7.449686034356e+03 >>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>> 1 SNES Function norm 1.085015821006e+04 >>>>>> >>>>>> solve history 2, identical to 1: >>>>>> >>>>>> 0 SNES Function norm 3.424003312857e+04 >>>>>> 0 KSP Residual norm 3.424003312857e+04 >>>>>> 1 KSP Residual norm 2.871734444536e+04 >>>>>> 2 KSP Residual norm 2.490276931041e+04 >>>>>> ... >>>>>> 20 KSP Residual norm 7.449686034356e+03 >>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>> 1 SNES Function norm 1.085015821006e+04 >>>>>> >>>>>> solve history 3, identical KSP at 0 and 1, slight change at 2, >>>>>> growing difference to the end: >>>>>> 0 SNES Function norm 3.424003312857e+04 >>>>>> 0 KSP Residual norm 3.424003312857e+04 >>>>>> 1 KSP Residual norm 2.871734444536e+04 >>>>>> 2 KSP Residual norm 2.490276930242e+04 >>>>>> ... >>>>>> 20 KSP Residual norm 7.449686095424e+03 >>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>> 1 SNES Function norm 1.085015646971e+04 >>>>>> >>>>>> >>>>>> Ths is using a standard explicit 3-stage Runge-Kutta smoother for 10 >>>>>> iterations, so 30 calls of the same residual evaluation, identical >>>>>> residuals every time >>>>>> >>>>>> run 1: >>>>>> >>>>>> # iteration rho rhou rhov >>>>>> rhoE abs_res rel_res >>>>>> umin vmax vmin elapsed_time >>>>>> >>>>>> # >>>>>> >>>>>> >>>>>> 1.00000e+00 1.086860616292e+00 2.782316758416e+02 >>>>>> 4.482867643761e+00 2.993435920340e+02 2.04353e+02 >>>>>> 1.00000e+00 -8.23945e-15 -6.15326e-15 -1.35563e-14 >>>>>> 6.34834e-01 >>>>>> 2.00000e+00 2.310547487017e+00 1.079059352425e+02 >>>>>> 3.958323921837e+00 5.058927165686e+02 2.58647e+02 >>>>>> 1.26568e+00 -1.02539e-14 -9.35368e-15 -1.69925e-14 >>>>>> 6.40063e-01 >>>>>> 3.00000e+00 2.361005867444e+00 5.706213331683e+01 >>>>>> 6.130016323357e+00 4.688968362579e+02 2.36201e+02 >>>>>> 1.15585e+00 -1.19370e-14 -1.15216e-14 -1.59733e-14 >>>>>> 6.45166e-01 >>>>>> 4.00000e+00 2.167518999963e+00 3.757541401594e+01 >>>>>> 6.313917437428e+00 4.054310291628e+02 2.03612e+02 >>>>>> 9.96372e-01 -1.81831e-14 -1.28312e-14 -1.46238e-14 >>>>>> 6.50494e-01 >>>>>> 5.00000e+00 1.941443738676e+00 2.884190334049e+01 >>>>>> 6.237106158479e+00 3.539201037156e+02 1.77577e+02 >>>>>> 8.68970e-01 3.56633e-14 -8.74089e-15 -1.06666e-14 >>>>>> 6.55656e-01 >>>>>> 6.00000e+00 1.736947124693e+00 2.429485695670e+01 >>>>>> 5.996962200407e+00 3.148280178142e+02 1.57913e+02 >>>>>> 7.72745e-01 -8.98634e-14 -2.41152e-14 -1.39713e-14 >>>>>> 6.60872e-01 >>>>>> 7.00000e+00 1.564153212635e+00 2.149609219810e+01 >>>>>> 5.786910705204e+00 2.848717011033e+02 1.42872e+02 >>>>>> 6.99144e-01 -2.95352e-13 -2.48158e-14 -2.39351e-14 >>>>>> 6.66041e-01 >>>>>> 8.00000e+00 1.419280815384e+00 1.950619804089e+01 >>>>>> 5.627281158306e+00 2.606623371229e+02 1.30728e+02 >>>>>> 6.39715e-01 8.98941e-13 1.09674e-13 3.78905e-14 >>>>>> 6.71316e-01 >>>>>> 9.00000e+00 1.296115915975e+00 1.794843530745e+01 >>>>>> 5.514933264437e+00 2.401524522393e+02 1.20444e+02 >>>>>> 5.89394e-01 1.70717e-12 1.38762e-14 1.09825e-13 >>>>>> 6.76447e-01 >>>>>> 1.00000e+01 1.189639693918e+00 1.665381754953e+01 >>>>>> 5.433183087037e+00 2.222572900473e+02 1.11475e+02 >>>>>> 5.45501e-01 -4.22462e-12 -7.15206e-13 -2.28736e-13 >>>>>> 6.81716e-01 >>>>>> >>>>>> run N: >>>>>> >>>>>> >>>>>> # >>>>>> >>>>>> >>>>>> # iteration rho rhou rhov >>>>>> rhoE abs_res rel_res >>>>>> umin vmax vmin elapsed_time >>>>>> >>>>>> # >>>>>> >>>>>> >>>>>> 1.00000e+00 1.086860616292e+00 2.782316758416e+02 >>>>>> 4.482867643761e+00 
2.993435920340e+02 2.04353e+02 >>>>>> 1.00000e+00 -8.23945e-15 -6.15326e-15 -1.35563e-14 >>>>>> 6.23316e-01 >>>>>> 2.00000e+00 2.310547487017e+00 1.079059352425e+02 >>>>>> 3.958323921837e+00 5.058927165686e+02 2.58647e+02 >>>>>> 1.26568e+00 -1.02539e-14 -9.35368e-15 -1.69925e-14 >>>>>> 6.28510e-01 >>>>>> 3.00000e+00 2.361005867444e+00 5.706213331683e+01 >>>>>> 6.130016323357e+00 4.688968362579e+02 2.36201e+02 >>>>>> 1.15585e+00 -1.19370e-14 -1.15216e-14 -1.59733e-14 >>>>>> 6.33558e-01 >>>>>> 4.00000e+00 2.167518999963e+00 3.757541401594e+01 >>>>>> 6.313917437428e+00 4.054310291628e+02 2.03612e+02 >>>>>> 9.96372e-01 -1.81831e-14 -1.28312e-14 -1.46238e-14 >>>>>> 6.38773e-01 >>>>>> 5.00000e+00 1.941443738676e+00 2.884190334049e+01 >>>>>> 6.237106158479e+00 3.539201037156e+02 1.77577e+02 >>>>>> 8.68970e-01 3.56633e-14 -8.74089e-15 -1.06666e-14 >>>>>> 6.43887e-01 >>>>>> 6.00000e+00 1.736947124693e+00 2.429485695670e+01 >>>>>> 5.996962200407e+00 3.148280178142e+02 1.57913e+02 >>>>>> 7.72745e-01 -8.98634e-14 -2.41152e-14 -1.39713e-14 >>>>>> 6.49073e-01 >>>>>> 7.00000e+00 1.564153212635e+00 2.149609219810e+01 >>>>>> 5.786910705204e+00 2.848717011033e+02 1.42872e+02 >>>>>> 6.99144e-01 -2.95352e-13 -2.48158e-14 -2.39351e-14 >>>>>> 6.54167e-01 >>>>>> 8.00000e+00 1.419280815384e+00 1.950619804089e+01 >>>>>> 5.627281158306e+00 2.606623371229e+02 1.30728e+02 >>>>>> 6.39715e-01 8.98941e-13 1.09674e-13 3.78905e-14 >>>>>> 6.59394e-01 >>>>>> 9.00000e+00 1.296115915975e+00 1.794843530745e+01 >>>>>> 5.514933264437e+00 2.401524522393e+02 1.20444e+02 >>>>>> 5.89394e-01 1.70717e-12 1.38762e-14 1.09825e-13 >>>>>> 6.64516e-01 >>>>>> 1.00000e+01 1.189639693918e+00 1.665381754953e+01 >>>>>> 5.433183087037e+00 2.222572900473e+02 1.11475e+02 >>>>>> 5.45501e-01 -4.22462e-12 -7.15206e-13 -2.28736e-13 >>>>>> 6.69677e-01 >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> On Thu, May 4, 2023 at 8:41?AM Mark Adams wrote: >>>>>> >>>>>>> ASM is just the sub PC with one proc but gets weaker with more procs >>>>>>> unless you use jacobi. (maybe I am missing something). >>>>>>> >>>>>>> On Thu, May 4, 2023 at 8:31?AM Mark Lohry wrote: >>>>>>> >>>>>>>> Please send the output of -snes_view. >>>>>>>>> >>>>>>>> pasted below. anything stand out? >>>>>>>> >>>>>>>> >>>>>>>> SNES Object: 1 MPI process >>>>>>>> type: newtonls >>>>>>>> maximum iterations=1, maximum function evaluations=-1 >>>>>>>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>>>>>>> total number of linear solver iterations=20 >>>>>>>> total number of function evaluations=22 >>>>>>>> norm schedule ALWAYS >>>>>>>> Jacobian is never rebuilt >>>>>>>> Jacobian is applied matrix-free with differencing >>>>>>>> Preconditioning Jacobian is built using finite differences with >>>>>>>> coloring >>>>>>>> SNESLineSearch Object: 1 MPI process >>>>>>>> type: basic >>>>>>>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>>>>>>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, >>>>>>>> lambda=1.000000e-08 >>>>>>>> maximum iterations=40 >>>>>>>> KSP Object: 1 MPI process >>>>>>>> type: gmres >>>>>>>> restart=30, using Classical (unmodified) Gram-Schmidt >>>>>>>> Orthogonalization with no iterative refinement >>>>>>>> happy breakdown tolerance 1e-30 >>>>>>>> maximum iterations=20, initial guess is zero >>>>>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>>>>>>> left preconditioning >>>>>>>> using PRECONDITIONED norm type for convergence test >>>>>>>> PC Object: 1 MPI process >>>>>>>> type: asm >>>>>>>> total subdomain blocks = 1, amount of overlap = 0 >>>>>>>> restriction/interpolation type - RESTRICT >>>>>>>> Local solver information for first block is in the following >>>>>>>> KSP and PC objects on rank 0: >>>>>>>> Use -ksp_view ::ascii_info_detail to display information for >>>>>>>> all blocks >>>>>>>> KSP Object: (sub_) 1 MPI process >>>>>>>> type: preonly >>>>>>>> maximum iterations=10000, initial guess is zero >>>>>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>>>>> left preconditioning >>>>>>>> using NONE norm type for convergence test >>>>>>>> PC Object: (sub_) 1 MPI process >>>>>>>> type: ilu >>>>>>>> out-of-place factorization >>>>>>>> 0 levels of fill >>>>>>>> tolerance for zero pivot 2.22045e-14 >>>>>>>> matrix ordering: natural >>>>>>>> factor fill ratio given 1., needed 1. >>>>>>>> Factored matrix follows: >>>>>>>> Mat Object: (sub_) 1 MPI process >>>>>>>> type: seqbaij >>>>>>>> rows=16384, cols=16384, bs=16 >>>>>>>> package used to perform factorization: petsc >>>>>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>>>>> block size is 16 >>>>>>>> linear system matrix = precond matrix: >>>>>>>> Mat Object: (sub_) 1 MPI process >>>>>>>> type: seqbaij >>>>>>>> rows=16384, cols=16384, bs=16 >>>>>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>>> block size is 16 >>>>>>>> linear system matrix followed by preconditioner matrix: >>>>>>>> Mat Object: 1 MPI process >>>>>>>> type: mffd >>>>>>>> rows=16384, cols=16384 >>>>>>>> Matrix-free approximation: >>>>>>>> err=1.49012e-08 (relative error in function evaluation) >>>>>>>> Using wp compute h routine >>>>>>>> Does not compute normU >>>>>>>> Mat Object: 1 MPI process >>>>>>>> type: seqbaij >>>>>>>> rows=16384, cols=16384, bs=16 >>>>>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>>> block size is 16 >>>>>>>> >>>>>>>> On Thu, May 4, 2023 at 8:30?AM Mark Adams wrote: >>>>>>>> >>>>>>>>> If you are using MG what is the coarse grid solver? >>>>>>>>> -snes_view might give you that. >>>>>>>>> >>>>>>>>> On Thu, May 4, 2023 at 8:25?AM Matthew Knepley >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> On Thu, May 4, 2023 at 8:21?AM Mark Lohry >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> Do they start very similarly and then slowly drift further >>>>>>>>>>>> apart? >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Yes, this. I take it this sounds familiar? >>>>>>>>>>> >>>>>>>>>>> See these two examples with 20 fixed iterations pasted at the >>>>>>>>>>> end. The difference for one solve is slight (final SNES norm is identical >>>>>>>>>>> to 5 digits), but in the context I'm using it in (repeated applications to >>>>>>>>>>> solve a steady state multigrid problem, though here just one level) the >>>>>>>>>>> differences add up such that I might reach global convergence in 35 >>>>>>>>>>> iterations or 38. It's not the end of the world, but I was expecting that >>>>>>>>>>> with -np 1 these would be identical and I'm not sure where the root cause >>>>>>>>>>> would be. >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> The initial KSP residual is different, so its the PC. Please send >>>>>>>>>> the output of -snes_view. If your ASM is using direct factorization, then it >>>>>>>>>> could be randomness in whatever LU you are using. 
>>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> Matt >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> 0 SNES Function norm 2.801842107848e+04 >>>>>>>>>>> 0 KSP Residual norm 4.045639499595e+01 >>>>>>>>>>> 1 KSP Residual norm 1.917999809040e+01 >>>>>>>>>>> 2 KSP Residual norm 1.616048521958e+01 >>>>>>>>>>> [...] >>>>>>>>>>> 19 KSP Residual norm 8.788043518111e-01 >>>>>>>>>>> 20 KSP Residual norm 6.570851270214e-01 >>>>>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>>>>>> 1 SNES Function norm 1.801309983345e+03 >>>>>>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Same system, identical initial 0 SNES norm, 0 KSP is slightly >>>>>>>>>>> different >>>>>>>>>>> >>>>>>>>>>> 0 SNES Function norm 2.801842107848e+04 >>>>>>>>>>> 0 KSP Residual norm 4.045639473002e+01 >>>>>>>>>>> 1 KSP Residual norm 1.917999883034e+01 >>>>>>>>>>> 2 KSP Residual norm 1.616048572016e+01 >>>>>>>>>>> [...] >>>>>>>>>>> 19 KSP Residual norm 8.788046348957e-01 >>>>>>>>>>> 20 KSP Residual norm 6.570859588610e-01 >>>>>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>>>>>> 1 SNES Function norm 1.801311320322e+03 >>>>>>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>>>>>>>> >>>>>>>>>>> On Wed, May 3, 2023 at 11:05?PM Barry Smith >>>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Do they start very similarly and then slowly drift further >>>>>>>>>>>> apart? That is the first couple of KSP iterations they are almost identical >>>>>>>>>>>> but then for each iteration get a bit further. Similar for the SNES >>>>>>>>>>>> iterations, starting close and then for more iterations and more solves >>>>>>>>>>>> they start moving apart. Or do they suddenly jump to be very different? You >>>>>>>>>>>> can run with -snes_monitor -ksp_monitor >>>>>>>>>>>> >>>>>>>>>>>> On May 3, 2023, at 9:07 PM, Mark Lohry >>>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>> This is on a single MPI rank. I haven't checked the coloring, >>>>>>>>>>>> was just guessing there. But the solutions/residuals are slightly different >>>>>>>>>>>> from run to run. >>>>>>>>>>>> >>>>>>>>>>>> Fair to say that for serial JFNK/asm ilu0/gmres we should >>>>>>>>>>>> expect bitwise identical results? >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Wed, May 3, 2023, 8:50 PM Barry Smith >>>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> No, the coloring should be identical every time. Do you see >>>>>>>>>>>>> differences with 1 MPI rank? (Or much smaller ones?). >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> > On May 3, 2023, at 8:42 PM, Mark Lohry >>>>>>>>>>>>> wrote: >>>>>>>>>>>>> > >>>>>>>>>>>>> > I'm running multiple iterations of newtonls with an >>>>>>>>>>>>> MFFD/JFNK nonlinear solver where I give it the sparsity. PC asm, KSP gmres, >>>>>>>>>>>>> with SNESSetLagJacobian -2 (compute once and then frozen jacobian). >>>>>>>>>>>>> > >>>>>>>>>>>>> > I'm seeing slight (<1%) but nonzero differences in residuals >>>>>>>>>>>>> from run to run. I'm wondering where randomness might enter here -- does >>>>>>>>>>>>> the jacobian coloring use a random seed? >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>>> experiments lead. 
>>>>>>>>>> -- Norbert Wiener >>>>>>>>>> >>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu May 4 17:01:04 2023 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 4 May 2023 18:01:04 -0400 Subject: [petsc-users] Understanding index sets for PCGASM In-Reply-To: References: Message-ID: On Thu, May 4, 2023 at 1:43?PM LEONARDO MUTTI < leonardo.mutti01 at universitadipavia.it> wrote: > Of course, I'll try to explain. > > I am solving a parabolic equation with space-time FEM and I want an > efficient solver/preconditioner for the resulting system. > The corresponding matrix, call it X, has an e.g. block bi-diagonal > structure, if the cG(1)-dG(0) method is used (i.e. implicit Euler solved in > batch). > Every block-row of X corresponds to a time instant. > > I want to introduce parallelism in time by subdividing X into overlapping > submatrices of e.g 2x2 or 3x3 blocks, along the block diagonal. > For instance, call X_i the individual blocks. The submatrices would be, > for various i, (X_{i-1,i-1},X_{i-1,i};X_{i,i-1},X_{i,i}). > I'd like each submatrix to be solved in parallel, to combine the various > results together in an ASM like fashion. > Every submatrix has thus a predecessor and a successor, and it overlaps > with both, so that as far as I could understand, GASM has to be used in > place of ASM. > Yes, ordered that way you need GASM. I wonder if inverting the ordering would be useful, namely putting the time index on the inside. Then the blocks would be over all time, but limited space, which is more the spirit of ASM I think. Have you considered waveform relaxation for this problem? Thanks, Matt > Hope this helps. > Best, > Leonardo > > Il giorno gio 4 mag 2023 alle ore 18:05 Matthew Knepley > ha scritto: > >> On Thu, May 4, 2023 at 11:24?AM LEONARDO MUTTI < >> leonardo.mutti01 at universitadipavia.it> wrote: >> >>> Thank you for the help. >>> Adding to my example: >>> >>> >>> * call PCGASMSetSubdomains(pc,NSub, subdomains_IS, >>> inflated_IS,ierr) call >>> PCGASMDestroySubdomains(NSub,subdomains_IS,inflated_IS,ierr)* >>> results in: >>> >>> * Error LNK2019 unresolved external symbol PCGASMDESTROYSUBDOMAINS >>> referenced in function ... * >>> >>> * Error LNK2019 unresolved external symbol PCGASMSETSUBDOMAINS >>> referenced in function ... * >>> I'm not sure if the interfaces are missing or if I have a compilation >>> problem. >>> >> >> I just want to make sure you really want GASM. It sounded like you might >> able to do what you want just with ASM. >> Can you tell me again what you want to do overall? >> >> Thanks, >> >> Matt >> >> >>> Thank you again. 
>>> Best, >>> Leonardo >>> >>> Il giorno sab 29 apr 2023 alle ore 20:30 Barry Smith >>> ha scritto: >>> >>>> >>>> Thank you for the test code. I have a fix in the branch >>>> barry/2023-04-29/fix-pcasmcreatesubdomains2d >>>> with >>>> merge request https://gitlab.com/petsc/petsc/-/merge_requests/6394 >>>> >>>> The functions did not have proper Fortran stubs and interfaces so I >>>> had to provide them manually in the new branch. >>>> >>>> Use >>>> >>>> git fetch >>>> git checkout barry/2023-04-29/fix-pcasmcreatesubdomains2d >>>> >>>> ./configure etc >>>> >>>> Your now working test code is in src/ksp/ksp/tests/ex71f.F90 I had >>>> to change things slightly and I updated the error handling for the latest >>>> version. >>>> >>>> Please let us know if you have any later questions. >>>> >>>> Barry >>>> >>>> >>>> >>>> >>>> On Apr 28, 2023, at 12:07 PM, LEONARDO MUTTI < >>>> leonardo.mutti01 at universitadipavia.it> wrote: >>>> >>>> Hello. I am having a hard time understanding the index sets to feed >>>> PCGASMSetSubdomains, and I am working in Fortran (as a PETSc novice). To >>>> get more intuition on how the IS objects behave I tried the following >>>> minimal (non) working example, which should tile a 16x16 matrix into 16 >>>> square, non-overlapping submatrices: >>>> >>>> #include >>>> #include >>>> #include >>>> USE petscmat >>>> USE petscksp >>>> USE petscpc >>>> >>>> Mat :: A >>>> PetscInt :: M, NSubx, dof, overlap, NSub >>>> INTEGER :: I,J >>>> PetscErrorCode :: ierr >>>> PetscScalar :: v >>>> KSP :: ksp >>>> PC :: pc >>>> IS :: subdomains_IS, inflated_IS >>>> >>>> call PetscInitialize(PETSC_NULL_CHARACTER , ierr) >>>> >>>> !-----Create a dummy matrix >>>> M = 16 >>>> call MatCreateAIJ(MPI_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, >>>> & M, M, >>>> & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, >>>> & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, >>>> & A, ierr) >>>> >>>> DO I=1,M >>>> DO J=1,M >>>> v = I*J >>>> CALL MatSetValue (A,I-1,J-1,v, >>>> & INSERT_VALUES , ierr) >>>> END DO >>>> END DO >>>> >>>> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY , ierr) >>>> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY , ierr) >>>> >>>> !-----Create KSP and PC >>>> call KSPCreate(PETSC_COMM_WORLD,ksp, ierr) >>>> call KSPSetOperators(ksp,A,A, ierr) >>>> call KSPSetType(ksp,"bcgs",ierr) >>>> call KSPGetPC(ksp,pc,ierr) >>>> call KSPSetUp(ksp, ierr) >>>> call PCSetType(pc,PCGASM, ierr) >>>> call PCSetUp(pc , ierr) >>>> >>>> !-----GASM setup >>>> NSubx = 4 >>>> dof = 1 >>>> overlap = 0 >>>> >>>> call PCGASMCreateSubdomains2D(pc, >>>> & M, M, >>>> & NSubx, NSubx, >>>> & dof, overlap, >>>> & NSub, subdomains_IS, inflated_IS, ierr) >>>> >>>> call ISView(subdomains_IS, PETSC_VIEWER_STDOUT_WORLD, ierr) >>>> >>>> call KSPDestroy(ksp, ierr) >>>> call PetscFinalize(ierr) >>>> >>>> Running this on one processor, I get NSub = 4. >>>> If PCASM and PCASMCreateSubdomains2D are used instead, I get NSub = 16 >>>> as expected. >>>> Moreover, I get in the end "forrtl: severe (157): Program Exception - >>>> access violation". So: >>>> 1) why do I get two different results with ASM, and GASM? >>>> 2) why do I get access violation and how can I solve this? >>>> In fact, in C, subdomains_IS, inflated_IS should pointers to IS >>>> objects. As I see on the Fortran interface, the arguments to >>>> PCGASMCreateSubdomains2D are IS objects: >>>> >>>> subroutine PCGASMCreateSubdomains2D(a,b,c,d,e,f,g,h,i,j,z) >>>> import tPC,tIS >>>> PC a ! PC >>>> PetscInt b ! PetscInt >>>> PetscInt c ! PetscInt >>>> PetscInt d ! PetscInt >>>> PetscInt e ! 
PetscInt >>>> PetscInt f ! PetscInt >>>> PetscInt g ! PetscInt >>>> PetscInt h ! PetscInt >>>> IS i ! IS >>>> IS j ! IS >>>> PetscErrorCode z >>>> end subroutine PCGASMCreateSubdomains2D >>>> Thus: >>>> 3) what should be inside e.g., subdomains_IS? I expect it to contain, >>>> for every created subdomain, the list of rows and columns defining the >>>> subblock in the matrix, am I right? >>>> >>>> Context: I have a block-tridiagonal system arising from space-time >>>> finite elements, and I want to solve it with GMRES+PCGASM preconditioner, >>>> where each overlapping submatrix is on the diagonal and of size 3x3 blocks >>>> (and spanning multiple processes). This is PETSc 3.17.1 on Windows. >>>> >>>> Thanks in advance, >>>> Leonardo >>>> >>>> >>>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Thu May 4 20:51:29 2023 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 4 May 2023 21:51:29 -0400 Subject: [petsc-users] sources of floating point randomness in JFNK in serial In-Reply-To: References: <5318727D-B9F9-48BC-A7CE-94EBDB08566F@petsc.dev> Message-ID: Send configure.log > On May 4, 2023, at 5:35 PM, Mark Lohry wrote: > >> Sure, but why only once and why save to disk? Why not just use that computed approximate Jacobian at each Newton step to drive the Newton solves along for a bunch of time steps? > > Ah I get what you mean. Okay I did three newton steps with the same LHS, with a few repeated manual tests. 3 out of 4 times i got the same exact history. is it in the realm of possibility that a hardware error could cause something this subtle, bad memory bit or something? > > 2 runs of 3 newton solves below, ever-so-slightly different. > > > 0 SNES Function norm 3.424003312857e+04 > 0 KSP Residual norm 3.424003312857e+04 > 1 KSP Residual norm 2.886124328003e+04 > 2 KSP Residual norm 2.504664994246e+04 > 3 KSP Residual norm 2.104615835161e+04 > 4 KSP Residual norm 1.938102896632e+04 > 5 KSP Residual norm 1.793774642408e+04 > 6 KSP Residual norm 1.671392566980e+04 > 7 KSP Residual norm 1.501504103873e+04 > 8 KSP Residual norm 1.366362900747e+04 > 9 KSP Residual norm 1.240398500429e+04 > 10 KSP Residual norm 1.156293733914e+04 > 11 KSP Residual norm 1.066296477958e+04 > 12 KSP Residual norm 9.835601966950e+03 > 13 KSP Residual norm 9.017480191491e+03 > 14 KSP Residual norm 8.415336139780e+03 > 15 KSP Residual norm 7.807497808435e+03 > 16 KSP Residual norm 7.341703768294e+03 > 17 KSP Residual norm 6.979298049282e+03 > 18 KSP Residual norm 6.521277772081e+03 > 19 KSP Residual norm 6.174842408773e+03 > 20 KSP Residual norm 5.889819665003e+03 > Linear solve converged due to CONVERGED_ITS iterations 20 > KSP Object: 1 MPI process > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=20, initial guess is zero > tolerances: relative=0.1, absolute=1e-15, divergence=10. 
> left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI process > type: none > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqbaij > rows=16384, cols=16384, bs=16 > total: nonzeros=1277952, allocated nonzeros=1277952 > total number of mallocs used during MatSetValues calls=0 > block size is 16 > 1 SNES Function norm 1.000525348433e+04 > Nonlinear solve converged due to CONVERGED_ITS iterations 1 > SNES Object: 1 MPI process > type: newtonls > maximum iterations=1, maximum function evaluations=-1 > tolerances: relative=0.1, absolute=1e-15, solution=1e-15 > total number of linear solver iterations=20 > total number of function evaluations=2 > norm schedule ALWAYS > Jacobian is never rebuilt > Jacobian is built using finite differences with coloring > SNESLineSearch Object: 1 MPI process > type: basic > maxstep=1.000000e+08, minlambda=1.000000e-12 > tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 > maximum iterations=40 > KSP Object: 1 MPI process > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=20, initial guess is zero > tolerances: relative=0.1, absolute=1e-15, divergence=10. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI process > type: none > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqbaij > rows=16384, cols=16384, bs=16 > total: nonzeros=1277952, allocated nonzeros=1277952 > total number of mallocs used during MatSetValues calls=0 > block size is 16 > 0 SNES Function norm 1.000525348433e+04 > 0 KSP Residual norm 1.000525348433e+04 > 1 KSP Residual norm 7.908741564765e+03 > 2 KSP Residual norm 6.825263536686e+03 > 3 KSP Residual norm 6.224930664968e+03 > 4 KSP Residual norm 6.095547180532e+03 > 5 KSP Residual norm 5.952968230430e+03 > 6 KSP Residual norm 5.861251998116e+03 > 7 KSP Residual norm 5.712439327755e+03 > 8 KSP Residual norm 5.583056913266e+03 > 9 KSP Residual norm 5.461768804626e+03 > 10 KSP Residual norm 5.351937611098e+03 > 11 KSP Residual norm 5.224288337578e+03 > 12 KSP Residual norm 5.129863847081e+03 > 13 KSP Residual norm 5.010818237218e+03 > 14 KSP Residual norm 4.907162936199e+03 > 15 KSP Residual norm 4.789564773955e+03 > 16 KSP Residual norm 4.695173370720e+03 > 17 KSP Residual norm 4.584070962171e+03 > 18 KSP Residual norm 4.483061424742e+03 > 19 KSP Residual norm 4.373384070745e+03 > 20 KSP Residual norm 4.260704657592e+03 > Linear solve converged due to CONVERGED_ITS iterations 20 > KSP Object: 1 MPI process > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=20, initial guess is zero > tolerances: relative=0.1, absolute=1e-15, divergence=10. 
> left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI process > type: none > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqbaij > rows=16384, cols=16384, bs=16 > total: nonzeros=1277952, allocated nonzeros=1277952 > total number of mallocs used during MatSetValues calls=0 > block size is 16 > 1 SNES Function norm 4.662386014882e+03 > Nonlinear solve converged due to CONVERGED_ITS iterations 1 > SNES Object: 1 MPI process > type: newtonls > maximum iterations=1, maximum function evaluations=-1 > tolerances: relative=0.1, absolute=1e-15, solution=1e-15 > total number of linear solver iterations=20 > total number of function evaluations=2 > norm schedule ALWAYS > Jacobian is never rebuilt > Jacobian is built using finite differences with coloring > SNESLineSearch Object: 1 MPI process > type: basic > maxstep=1.000000e+08, minlambda=1.000000e-12 > tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 > maximum iterations=40 > KSP Object: 1 MPI process > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=20, initial guess is zero > tolerances: relative=0.1, absolute=1e-15, divergence=10. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI process > type: none > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqbaij > rows=16384, cols=16384, bs=16 > total: nonzeros=1277952, allocated nonzeros=1277952 > total number of mallocs used during MatSetValues calls=0 > block size is 16 > 0 SNES Function norm 4.662386014882e+03 > 0 KSP Residual norm 4.662386014882e+03 > 1 KSP Residual norm 4.408316259864e+03 > 2 KSP Residual norm 4.184867769829e+03 > 3 KSP Residual norm 4.079091244351e+03 > 4 KSP Residual norm 4.009247390166e+03 > 5 KSP Residual norm 3.928417371428e+03 > 6 KSP Residual norm 3.865152075780e+03 > 7 KSP Residual norm 3.795606446033e+03 > 8 KSP Residual norm 3.735294554158e+03 > 9 KSP Residual norm 3.674393726487e+03 > 10 KSP Residual norm 3.617795166786e+03 > 11 KSP Residual norm 3.563807982274e+03 > 12 KSP Residual norm 3.512269444921e+03 > 13 KSP Residual norm 3.455110223236e+03 > 14 KSP Residual norm 3.407141247372e+03 > 15 KSP Residual norm 3.356562415982e+03 > 16 KSP Residual norm 3.312720047685e+03 > 17 KSP Residual norm 3.263690150810e+03 > 18 KSP Residual norm 3.219359862444e+03 > 19 KSP Residual norm 3.173500955995e+03 > 20 KSP Residual norm 3.127528790155e+03 > Linear solve converged due to CONVERGED_ITS iterations 20 > KSP Object: 1 MPI process > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=20, initial guess is zero > tolerances: relative=0.1, absolute=1e-15, divergence=10. 
> left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI process > type: none > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqbaij > rows=16384, cols=16384, bs=16 > total: nonzeros=1277952, allocated nonzeros=1277952 > total number of mallocs used during MatSetValues calls=0 > block size is 16 > 1 SNES Function norm 3.186752172556e+03 > Nonlinear solve converged due to CONVERGED_ITS iterations 1 > SNES Object: 1 MPI process > type: newtonls > maximum iterations=1, maximum function evaluations=-1 > tolerances: relative=0.1, absolute=1e-15, solution=1e-15 > total number of linear solver iterations=20 > total number of function evaluations=2 > norm schedule ALWAYS > Jacobian is never rebuilt > Jacobian is built using finite differences with coloring > SNESLineSearch Object: 1 MPI process > type: basic > maxstep=1.000000e+08, minlambda=1.000000e-12 > tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 > maximum iterations=40 > KSP Object: 1 MPI process > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=20, initial guess is zero > tolerances: relative=0.1, absolute=1e-15, divergence=10. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI process > type: none > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqbaij > rows=16384, cols=16384, bs=16 > total: nonzeros=1277952, allocated nonzeros=1277952 > total number of mallocs used during MatSetValues calls=0 > block size is 16 > > > > 0 SNES Function norm 3.424003312857e+04 > 0 KSP Residual norm 3.424003312857e+04 > 1 KSP Residual norm 2.886124328003e+04 > 2 KSP Residual norm 2.504664994221e+04 > 3 KSP Residual norm 2.104615835130e+04 > 4 KSP Residual norm 1.938102896610e+04 > 5 KSP Residual norm 1.793774642406e+04 > 6 KSP Residual norm 1.671392566981e+04 > 7 KSP Residual norm 1.501504103854e+04 > 8 KSP Residual norm 1.366362900726e+04 > 9 KSP Residual norm 1.240398500414e+04 > 10 KSP Residual norm 1.156293733914e+04 > 11 KSP Residual norm 1.066296477972e+04 > 12 KSP Residual norm 9.835601967036e+03 > 13 KSP Residual norm 9.017480191500e+03 > 14 KSP Residual norm 8.415336139732e+03 > 15 KSP Residual norm 7.807497808414e+03 > 16 KSP Residual norm 7.341703768300e+03 > 17 KSP Residual norm 6.979298049244e+03 > 18 KSP Residual norm 6.521277772042e+03 > 19 KSP Residual norm 6.174842408713e+03 > 20 KSP Residual norm 5.889819664983e+03 > Linear solve converged due to CONVERGED_ITS iterations 20 > KSP Object: 1 MPI process > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=20, initial guess is zero > tolerances: relative=0.1, absolute=1e-15, divergence=10. 
> left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI process > type: none > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqbaij > rows=16384, cols=16384, bs=16 > total: nonzeros=1277952, allocated nonzeros=1277952 > total number of mallocs used during MatSetValues calls=0 > block size is 16 > 1 SNES Function norm 1.000525348435e+04 > Nonlinear solve converged due to CONVERGED_ITS iterations 1 > SNES Object: 1 MPI process > type: newtonls > maximum iterations=1, maximum function evaluations=-1 > tolerances: relative=0.1, absolute=1e-15, solution=1e-15 > total number of linear solver iterations=20 > total number of function evaluations=2 > norm schedule ALWAYS > Jacobian is never rebuilt > Jacobian is built using finite differences with coloring > SNESLineSearch Object: 1 MPI process > type: basic > maxstep=1.000000e+08, minlambda=1.000000e-12 > tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 > maximum iterations=40 > KSP Object: 1 MPI process > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=20, initial guess is zero > tolerances: relative=0.1, absolute=1e-15, divergence=10. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI process > type: none > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqbaij > rows=16384, cols=16384, bs=16 > total: nonzeros=1277952, allocated nonzeros=1277952 > total number of mallocs used during MatSetValues calls=0 > block size is 16 > 0 SNES Function norm 1.000525348435e+04 > 0 KSP Residual norm 1.000525348435e+04 > 1 KSP Residual norm 7.908741565645e+03 > 2 KSP Residual norm 6.825263536988e+03 > 3 KSP Residual norm 6.224930664967e+03 > 4 KSP Residual norm 6.095547180474e+03 > 5 KSP Residual norm 5.952968230397e+03 > 6 KSP Residual norm 5.861251998127e+03 > 7 KSP Residual norm 5.712439327726e+03 > 8 KSP Residual norm 5.583056913167e+03 > 9 KSP Residual norm 5.461768804526e+03 > 10 KSP Residual norm 5.351937611030e+03 > 11 KSP Residual norm 5.224288337536e+03 > 12 KSP Residual norm 5.129863847028e+03 > 13 KSP Residual norm 5.010818237161e+03 > 14 KSP Residual norm 4.907162936143e+03 > 15 KSP Residual norm 4.789564773923e+03 > 16 KSP Residual norm 4.695173370709e+03 > 17 KSP Residual norm 4.584070962145e+03 > 18 KSP Residual norm 4.483061424714e+03 > 19 KSP Residual norm 4.373384070713e+03 > 20 KSP Residual norm 4.260704657576e+03 > Linear solve converged due to CONVERGED_ITS iterations 20 > KSP Object: 1 MPI process > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=20, initial guess is zero > tolerances: relative=0.1, absolute=1e-15, divergence=10. 
> left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI process > type: none > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqbaij > rows=16384, cols=16384, bs=16 > total: nonzeros=1277952, allocated nonzeros=1277952 > total number of mallocs used during MatSetValues calls=0 > block size is 16 > 1 SNES Function norm 4.662386014874e+03 > Nonlinear solve converged due to CONVERGED_ITS iterations 1 > SNES Object: 1 MPI process > type: newtonls > maximum iterations=1, maximum function evaluations=-1 > tolerances: relative=0.1, absolute=1e-15, solution=1e-15 > total number of linear solver iterations=20 > total number of function evaluations=2 > norm schedule ALWAYS > Jacobian is never rebuilt > Jacobian is built using finite differences with coloring > SNESLineSearch Object: 1 MPI process > type: basic > maxstep=1.000000e+08, minlambda=1.000000e-12 > tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 > maximum iterations=40 > KSP Object: 1 MPI process > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=20, initial guess is zero > tolerances: relative=0.1, absolute=1e-15, divergence=10. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI process > type: none > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqbaij > rows=16384, cols=16384, bs=16 > total: nonzeros=1277952, allocated nonzeros=1277952 > total number of mallocs used during MatSetValues calls=0 > block size is 16 > 0 SNES Function norm 4.662386014874e+03 > 0 KSP Residual norm 4.662386014874e+03 > 1 KSP Residual norm 4.408316259834e+03 > 2 KSP Residual norm 4.184867769891e+03 > 3 KSP Residual norm 4.079091244367e+03 > 4 KSP Residual norm 4.009247390184e+03 > 5 KSP Residual norm 3.928417371457e+03 > 6 KSP Residual norm 3.865152075802e+03 > 7 KSP Residual norm 3.795606446041e+03 > 8 KSP Residual norm 3.735294554160e+03 > 9 KSP Residual norm 3.674393726485e+03 > 10 KSP Residual norm 3.617795166775e+03 > 11 KSP Residual norm 3.563807982249e+03 > 12 KSP Residual norm 3.512269444873e+03 > 13 KSP Residual norm 3.455110223193e+03 > 14 KSP Residual norm 3.407141247334e+03 > 15 KSP Residual norm 3.356562415949e+03 > 16 KSP Residual norm 3.312720047652e+03 > 17 KSP Residual norm 3.263690150782e+03 > 18 KSP Residual norm 3.219359862425e+03 > 19 KSP Residual norm 3.173500955997e+03 > 20 KSP Residual norm 3.127528790156e+03 > Linear solve converged due to CONVERGED_ITS iterations 20 > KSP Object: 1 MPI process > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=20, initial guess is zero > tolerances: relative=0.1, absolute=1e-15, divergence=10. 
> left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI process > type: none > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqbaij > rows=16384, cols=16384, bs=16 > total: nonzeros=1277952, allocated nonzeros=1277952 > total number of mallocs used during MatSetValues calls=0 > block size is 16 > 1 SNES Function norm 3.186752172503e+03 > Nonlinear solve converged due to CONVERGED_ITS iterations 1 > SNES Object: 1 MPI process > type: newtonls > maximum iterations=1, maximum function evaluations=-1 > tolerances: relative=0.1, absolute=1e-15, solution=1e-15 > total number of linear solver iterations=20 > total number of function evaluations=2 > norm schedule ALWAYS > Jacobian is never rebuilt > Jacobian is built using finite differences with coloring > SNESLineSearch Object: 1 MPI process > type: basic > maxstep=1.000000e+08, minlambda=1.000000e-12 > tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 > maximum iterations=40 > KSP Object: 1 MPI process > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=20, initial guess is zero > tolerances: relative=0.1, absolute=1e-15, divergence=10. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI process > type: none > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqbaij > rows=16384, cols=16384, bs=16 > total: nonzeros=1277952, allocated nonzeros=1277952 > total number of mallocs used during MatSetValues calls=0 > block size is 16 > > On Thu, May 4, 2023 at 5:22?PM Matthew Knepley > wrote: >> On Thu, May 4, 2023 at 5:03?PM Mark Lohry > wrote: >>>> Do you get different results (in different runs) without -snes_mf_operator? So just using an explicit matrix? >>> >>> Unfortunately I don't have an explicit matrix available for this, hence the MFFD/JFNK. >> >> I don't mean the actual matrix, I mean a representative matrix. >> >>>> >>>> (Note: I am not convinced there is even a problem and think it may be simply different order of floating point operations in different runs.) >>> >>> I'm not convinced either, but running explicit RK for 10,000 iterations i get exactly the same results every time so i'm fairly confident it's not the residual evaluation. >>> How would there be a different order of floating point ops in different runs in serial? >>> >>>> No, I mean without -snes_mf_* (as Barry says), so we are just running that solver with a sparse matrix. This would give me confidence >>>> that nothing in the solver is variable. >>>> >>> I could do the sparse finite difference jacobian once, save it to disk, and then use that system each time. >> >> Yes. That would work. >> >> Thanks, >> >> Matt >> >>> On Thu, May 4, 2023 at 4:57?PM Matthew Knepley > wrote: >>>> On Thu, May 4, 2023 at 4:44?PM Mark Lohry > wrote: >>>>>> Is your code valgrind clean? >>>>> >>>>> Yes, I also initialize all allocations with NaNs to be sure I'm not using anything uninitialized. >>>>> >>>>>> >>>>>> We can try and test this. Replace your MatMFFD with an actual matrix and run. Do you see any variability? >>>>> >>>>> I think I did what you're asking. I have -snes_mf_operator set, and then SNESSetJacobian(snes, diag_ones, diag_ones, NULL, NULL) where diag_ones is a matrix with ones on the diagonal. Two runs below, still with differences but sometimes identical. 
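For reference, a minimal sketch in C of the plain sparse-matrix test being suggested in this exchange: no -snes_mf_* options at all, with SNES assembling a colored finite-difference Jacobian once and then freezing it, so every repeated solve applies exactly the same assembled operator. This is not code from the thread; the routine name is a placeholder, "snes" is assumed to already have its residual set via SNESSetFunction, "J" is assumed to be preallocated with the problem's nonzero pattern (e.g. the bs=16 SEQBAIJ layout shown in the -snes_view output above), and PetscCall() can be replaced by CHKERRQ() on older PETSc versions.

#include <petscsnes.h>

/* Sketch: make the linear solve use a fixed, assembled finite-difference
   Jacobian (no MFFD operator). J must be preallocated with the sparsity. */
static PetscErrorCode UseFrozenColoredJacobian(SNES snes, Mat J)
{
  PetscFunctionBeginUser;
  /* Fill J by finite differences with a coloring derived from its nonzero pattern. */
  PetscCall(SNESSetJacobian(snes, J, J, SNESComputeJacobianDefaultColor, NULL));
  /* -2: compute the Jacobian at the next Newton step and never again, so
     subsequent Newton steps and solves reuse identical operator data. */
  PetscCall(SNESSetLagJacobian(snes, -2));
  PetscFunctionReturn(0);
}

Running this configuration with -snes_monitor -ksp_monitor (and without -snes_mf_operator) should help isolate whether run-to-run differences come from the residual evaluation and Jacobian assembly themselves or from the matrix-free differencing.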
>>>> >>>> No, I mean without -snes_mf_* (as Barry says), so we are just running that solver with a sparse matrix. This would give me confidence >>>> that nothing in the solver is variable. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>>> 0 SNES Function norm 3.424003312857e+04 >>>>> 0 KSP Residual norm 3.424003312857e+04 >>>>> 1 KSP Residual norm 2.871734444536e+04 >>>>> 2 KSP Residual norm 2.490276930242e+04 >>>>> 3 KSP Residual norm 2.131675872968e+04 >>>>> 4 KSP Residual norm 1.973129814235e+04 >>>>> 5 KSP Residual norm 1.832377856317e+04 >>>>> 6 KSP Residual norm 1.716783617436e+04 >>>>> 7 KSP Residual norm 1.583963149542e+04 >>>>> 8 KSP Residual norm 1.482272170304e+04 >>>>> 9 KSP Residual norm 1.380312106742e+04 >>>>> 10 KSP Residual norm 1.297793480658e+04 >>>>> 11 KSP Residual norm 1.208599123244e+04 >>>>> 12 KSP Residual norm 1.137345655227e+04 >>>>> 13 KSP Residual norm 1.059676909366e+04 >>>>> 14 KSP Residual norm 1.003823862398e+04 >>>>> 15 KSP Residual norm 9.425879221354e+03 >>>>> 16 KSP Residual norm 8.954805890038e+03 >>>>> 17 KSP Residual norm 8.592372470456e+03 >>>>> 18 KSP Residual norm 8.060707175821e+03 >>>>> 19 KSP Residual norm 7.782057728723e+03 >>>>> 20 KSP Residual norm 7.449686095424e+03 >>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>> KSP Object: 1 MPI process >>>>> type: gmres >>>>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>>>> happy breakdown tolerance 1e-30 >>>>> maximum iterations=20, initial guess is zero >>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. >>>>> left preconditioning >>>>> using PRECONDITIONED norm type for convergence test >>>>> PC Object: 1 MPI process >>>>> type: none >>>>> linear system matrix followed by preconditioner matrix: >>>>> Mat Object: 1 MPI process >>>>> type: mffd >>>>> rows=16384, cols=16384 >>>>> Matrix-free approximation: >>>>> err=1.49012e-08 (relative error in function evaluation) >>>>> Using wp compute h routine >>>>> Does not compute normU >>>>> Mat Object: 1 MPI process >>>>> type: seqaij >>>>> rows=16384, cols=16384 >>>>> total: nonzeros=16384, allocated nonzeros=16384 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> not using I-node routines >>>>> 1 SNES Function norm 1.085015646971e+04 >>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>> SNES Object: 1 MPI process >>>>> type: newtonls >>>>> maximum iterations=1, maximum function evaluations=-1 >>>>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>>>> total number of linear solver iterations=20 >>>>> total number of function evaluations=23 >>>>> norm schedule ALWAYS >>>>> Jacobian is never rebuilt >>>>> Jacobian is applied matrix-free with differencing >>>>> Preconditioning Jacobian is built using finite differences with coloring >>>>> SNESLineSearch Object: 1 MPI process >>>>> type: basic >>>>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>>>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 >>>>> maximum iterations=40 >>>>> KSP Object: 1 MPI process >>>>> type: gmres >>>>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>>>> happy breakdown tolerance 1e-30 >>>>> maximum iterations=20, initial guess is zero >>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>>>> left preconditioning >>>>> using PRECONDITIONED norm type for convergence test >>>>> PC Object: 1 MPI process >>>>> type: none >>>>> linear system matrix followed by preconditioner matrix: >>>>> Mat Object: 1 MPI process >>>>> type: mffd >>>>> rows=16384, cols=16384 >>>>> Matrix-free approximation: >>>>> err=1.49012e-08 (relative error in function evaluation) >>>>> Using wp compute h routine >>>>> Does not compute normU >>>>> Mat Object: 1 MPI process >>>>> type: seqaij >>>>> rows=16384, cols=16384 >>>>> total: nonzeros=16384, allocated nonzeros=16384 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> not using I-node routines >>>>> >>>>> 0 SNES Function norm 3.424003312857e+04 >>>>> 0 KSP Residual norm 3.424003312857e+04 >>>>> 1 KSP Residual norm 2.871734444536e+04 >>>>> 2 KSP Residual norm 2.490276931041e+04 >>>>> 3 KSP Residual norm 2.131675873776e+04 >>>>> 4 KSP Residual norm 1.973129814908e+04 >>>>> 5 KSP Residual norm 1.832377852186e+04 >>>>> 6 KSP Residual norm 1.716783608174e+04 >>>>> 7 KSP Residual norm 1.583963128956e+04 >>>>> 8 KSP Residual norm 1.482272160069e+04 >>>>> 9 KSP Residual norm 1.380312087005e+04 >>>>> 10 KSP Residual norm 1.297793458796e+04 >>>>> 11 KSP Residual norm 1.208599115602e+04 >>>>> 12 KSP Residual norm 1.137345657533e+04 >>>>> 13 KSP Residual norm 1.059676906197e+04 >>>>> 14 KSP Residual norm 1.003823857515e+04 >>>>> 15 KSP Residual norm 9.425879177747e+03 >>>>> 16 KSP Residual norm 8.954805850825e+03 >>>>> 17 KSP Residual norm 8.592372413320e+03 >>>>> 18 KSP Residual norm 8.060706994110e+03 >>>>> 19 KSP Residual norm 7.782057560782e+03 >>>>> 20 KSP Residual norm 7.449686034356e+03 >>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>> KSP Object: 1 MPI process >>>>> type: gmres >>>>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>>>> happy breakdown tolerance 1e-30 >>>>> maximum iterations=20, initial guess is zero >>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>>>> left preconditioning >>>>> using PRECONDITIONED norm type for convergence test >>>>> PC Object: 1 MPI process >>>>> type: none >>>>> linear system matrix followed by preconditioner matrix: >>>>> Mat Object: 1 MPI process >>>>> type: mffd >>>>> rows=16384, cols=16384 >>>>> Matrix-free approximation: >>>>> err=1.49012e-08 (relative error in function evaluation) >>>>> Using wp compute h routine >>>>> Does not compute normU >>>>> Mat Object: 1 MPI process >>>>> type: seqaij >>>>> rows=16384, cols=16384 >>>>> total: nonzeros=16384, allocated nonzeros=16384 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> not using I-node routines >>>>> 1 SNES Function norm 1.085015821006e+04 >>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>> SNES Object: 1 MPI process >>>>> type: newtonls >>>>> maximum iterations=1, maximum function evaluations=-1 >>>>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>>>> total number of linear solver iterations=20 >>>>> total number of function evaluations=23 >>>>> norm schedule ALWAYS >>>>> Jacobian is never rebuilt >>>>> Jacobian is applied matrix-free with differencing >>>>> Preconditioning Jacobian is built using finite differences with coloring >>>>> SNESLineSearch Object: 1 MPI process >>>>> type: basic >>>>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>>>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 >>>>> maximum iterations=40 >>>>> KSP Object: 1 MPI process >>>>> type: gmres >>>>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>>>> happy breakdown tolerance 1e-30 >>>>> maximum iterations=20, initial guess is zero >>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. >>>>> left preconditioning >>>>> using PRECONDITIONED norm type for convergence test >>>>> PC Object: 1 MPI process >>>>> type: none >>>>> linear system matrix followed by preconditioner matrix: >>>>> Mat Object: 1 MPI process >>>>> type: mffd >>>>> rows=16384, cols=16384 >>>>> Matrix-free approximation: >>>>> err=1.49012e-08 (relative error in function evaluation) >>>>> Using wp compute h routine >>>>> Does not compute normU >>>>> Mat Object: 1 MPI process >>>>> type: seqaij >>>>> rows=16384, cols=16384 >>>>> total: nonzeros=16384, allocated nonzeros=16384 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> not using I-node routines >>>>> >>>>> On Thu, May 4, 2023 at 10:10?AM Matthew Knepley > wrote: >>>>>> On Thu, May 4, 2023 at 8:54?AM Mark Lohry > wrote: >>>>>>>> Try -pc_type none. >>>>>>> >>>>>>> With -pc_type none the 0 KSP residual looks identical. But *sometimes* it's producing exactly the same history and others it's gradually changing. I'm reasonably confident my residual evaluation has no randomness, see info after the petsc output. >>>>>> >>>>>> We can try and test this. Replace your MatMFFD with an actual matrix and run. Do you see any variability? >>>>>> >>>>>> If not, then it could be your routine, or it could be MatMFFD. So run a few with -snes_view, and we can see if the >>>>>> "w" parameter changes. >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>>> solve history 1: >>>>>>> >>>>>>> 0 SNES Function norm 3.424003312857e+04 >>>>>>> 0 KSP Residual norm 3.424003312857e+04 >>>>>>> 1 KSP Residual norm 2.871734444536e+04 >>>>>>> 2 KSP Residual norm 2.490276931041e+04 >>>>>>> ... 
>>>>>>> 20 KSP Residual norm 7.449686034356e+03 >>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>> 1 SNES Function norm 1.085015821006e+04 >>>>>>> >>>>>>> solve history 2, identical to 1: >>>>>>> >>>>>>> 0 SNES Function norm 3.424003312857e+04 >>>>>>> 0 KSP Residual norm 3.424003312857e+04 >>>>>>> 1 KSP Residual norm 2.871734444536e+04 >>>>>>> 2 KSP Residual norm 2.490276931041e+04 >>>>>>> ... >>>>>>> 20 KSP Residual norm 7.449686034356e+03 >>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>> 1 SNES Function norm 1.085015821006e+04 >>>>>>> >>>>>>> solve history 3, identical KSP at 0 and 1, slight change at 2, growing difference to the end: >>>>>>> 0 SNES Function norm 3.424003312857e+04 >>>>>>> 0 KSP Residual norm 3.424003312857e+04 >>>>>>> 1 KSP Residual norm 2.871734444536e+04 >>>>>>> 2 KSP Residual norm 2.490276930242e+04 >>>>>>> ... >>>>>>> 20 KSP Residual norm 7.449686095424e+03 >>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>> 1 SNES Function norm 1.085015646971e+04 >>>>>>> >>>>>>> >>>>>>> Ths is using a standard explicit 3-stage Runge-Kutta smoother for 10 iterations, so 30 calls of the same residual evaluation, identical residuals every time >>>>>>> >>>>>>> run 1: >>>>>>> >>>>>>> # iteration rho rhou rhov rhoE abs_res rel_res umin vmax vmin elapsed_time >>>>>>> # >>>>>>> 1.00000e+00 1.086860616292e+00 2.782316758416e+02 4.482867643761e+00 2.993435920340e+02 2.04353e+02 1.00000e+00 -8.23945e-15 -6.15326e-15 -1.35563e-14 6.34834e-01 >>>>>>> 2.00000e+00 2.310547487017e+00 1.079059352425e+02 3.958323921837e+00 5.058927165686e+02 2.58647e+02 1.26568e+00 -1.02539e-14 -9.35368e-15 -1.69925e-14 6.40063e-01 >>>>>>> 3.00000e+00 2.361005867444e+00 5.706213331683e+01 6.130016323357e+00 4.688968362579e+02 2.36201e+02 1.15585e+00 -1.19370e-14 -1.15216e-14 -1.59733e-14 6.45166e-01 >>>>>>> 4.00000e+00 2.167518999963e+00 3.757541401594e+01 6.313917437428e+00 4.054310291628e+02 2.03612e+02 9.96372e-01 -1.81831e-14 -1.28312e-14 -1.46238e-14 6.50494e-01 >>>>>>> 5.00000e+00 1.941443738676e+00 2.884190334049e+01 6.237106158479e+00 3.539201037156e+02 1.77577e+02 8.68970e-01 3.56633e-14 -8.74089e-15 -1.06666e-14 6.55656e-01 >>>>>>> 6.00000e+00 1.736947124693e+00 2.429485695670e+01 5.996962200407e+00 3.148280178142e+02 1.57913e+02 7.72745e-01 -8.98634e-14 -2.41152e-14 -1.39713e-14 6.60872e-01 >>>>>>> 7.00000e+00 1.564153212635e+00 2.149609219810e+01 5.786910705204e+00 2.848717011033e+02 1.42872e+02 6.99144e-01 -2.95352e-13 -2.48158e-14 -2.39351e-14 6.66041e-01 >>>>>>> 8.00000e+00 1.419280815384e+00 1.950619804089e+01 5.627281158306e+00 2.606623371229e+02 1.30728e+02 6.39715e-01 8.98941e-13 1.09674e-13 3.78905e-14 6.71316e-01 >>>>>>> 9.00000e+00 1.296115915975e+00 1.794843530745e+01 5.514933264437e+00 2.401524522393e+02 1.20444e+02 5.89394e-01 1.70717e-12 1.38762e-14 1.09825e-13 6.76447e-01 >>>>>>> 1.00000e+01 1.189639693918e+00 1.665381754953e+01 5.433183087037e+00 2.222572900473e+02 1.11475e+02 5.45501e-01 -4.22462e-12 -7.15206e-13 -2.28736e-13 6.81716e-01 >>>>>>> >>>>>>> run N: >>>>>>> >>>>>>> >>>>>>> # >>>>>>> # iteration rho rhou rhov rhoE abs_res rel_res umin vmax vmin elapsed_time >>>>>>> # >>>>>>> 1.00000e+00 1.086860616292e+00 2.782316758416e+02 4.482867643761e+00 2.993435920340e+02 2.04353e+02 1.00000e+00 -8.23945e-15 -6.15326e-15 -1.35563e-14 6.23316e-01 >>>>>>> 2.00000e+00 2.310547487017e+00 1.079059352425e+02 3.958323921837e+00 5.058927165686e+02 2.58647e+02 1.26568e+00 -1.02539e-14 -9.35368e-15 -1.69925e-14 6.28510e-01 >>>>>>> 
3.00000e+00 2.361005867444e+00 5.706213331683e+01 6.130016323357e+00 4.688968362579e+02 2.36201e+02 1.15585e+00 -1.19370e-14 -1.15216e-14 -1.59733e-14 6.33558e-01 >>>>>>> 4.00000e+00 2.167518999963e+00 3.757541401594e+01 6.313917437428e+00 4.054310291628e+02 2.03612e+02 9.96372e-01 -1.81831e-14 -1.28312e-14 -1.46238e-14 6.38773e-01 >>>>>>> 5.00000e+00 1.941443738676e+00 2.884190334049e+01 6.237106158479e+00 3.539201037156e+02 1.77577e+02 8.68970e-01 3.56633e-14 -8.74089e-15 -1.06666e-14 6.43887e-01 >>>>>>> 6.00000e+00 1.736947124693e+00 2.429485695670e+01 5.996962200407e+00 3.148280178142e+02 1.57913e+02 7.72745e-01 -8.98634e-14 -2.41152e-14 -1.39713e-14 6.49073e-01 >>>>>>> 7.00000e+00 1.564153212635e+00 2.149609219810e+01 5.786910705204e+00 2.848717011033e+02 1.42872e+02 6.99144e-01 -2.95352e-13 -2.48158e-14 -2.39351e-14 6.54167e-01 >>>>>>> 8.00000e+00 1.419280815384e+00 1.950619804089e+01 5.627281158306e+00 2.606623371229e+02 1.30728e+02 6.39715e-01 8.98941e-13 1.09674e-13 3.78905e-14 6.59394e-01 >>>>>>> 9.00000e+00 1.296115915975e+00 1.794843530745e+01 5.514933264437e+00 2.401524522393e+02 1.20444e+02 5.89394e-01 1.70717e-12 1.38762e-14 1.09825e-13 6.64516e-01 >>>>>>> 1.00000e+01 1.189639693918e+00 1.665381754953e+01 5.433183087037e+00 2.222572900473e+02 1.11475e+02 5.45501e-01 -4.22462e-12 -7.15206e-13 -2.28736e-13 6.69677e-01 >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> On Thu, May 4, 2023 at 8:41?AM Mark Adams > wrote: >>>>>>>> ASM is just the sub PC with one proc but gets weaker with more procs unless you use jacobi. (maybe I am missing something). >>>>>>>> >>>>>>>> On Thu, May 4, 2023 at 8:31?AM Mark Lohry > wrote: >>>>>>>>>> Please send the output of -snes_view. >>>>>>>>> pasted below. anything stand out? >>>>>>>>> >>>>>>>>> >>>>>>>>> SNES Object: 1 MPI process >>>>>>>>> type: newtonls >>>>>>>>> maximum iterations=1, maximum function evaluations=-1 >>>>>>>>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>>>>>>>> total number of linear solver iterations=20 >>>>>>>>> total number of function evaluations=22 >>>>>>>>> norm schedule ALWAYS >>>>>>>>> Jacobian is never rebuilt >>>>>>>>> Jacobian is applied matrix-free with differencing >>>>>>>>> Preconditioning Jacobian is built using finite differences with coloring >>>>>>>>> SNESLineSearch Object: 1 MPI process >>>>>>>>> type: basic >>>>>>>>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>>>>>>>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 >>>>>>>>> maximum iterations=40 >>>>>>>>> KSP Object: 1 MPI process >>>>>>>>> type: gmres >>>>>>>>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>>>>>>>> happy breakdown tolerance 1e-30 >>>>>>>>> maximum iterations=20, initial guess is zero >>>>>>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. >>>>>>>>> left preconditioning >>>>>>>>> using PRECONDITIONED norm type for convergence test >>>>>>>>> PC Object: 1 MPI process >>>>>>>>> type: asm >>>>>>>>> total subdomain blocks = 1, amount of overlap = 0 >>>>>>>>> restriction/interpolation type - RESTRICT >>>>>>>>> Local solver information for first block is in the following KSP and PC objects on rank 0: >>>>>>>>> Use -ksp_view ::ascii_info_detail to display information for all blocks >>>>>>>>> KSP Object: (sub_) 1 MPI process >>>>>>>>> type: preonly >>>>>>>>> maximum iterations=10000, initial guess is zero >>>>>>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
>>>>>>>>> left preconditioning >>>>>>>>> using NONE norm type for convergence test >>>>>>>>> PC Object: (sub_) 1 MPI process >>>>>>>>> type: ilu >>>>>>>>> out-of-place factorization >>>>>>>>> 0 levels of fill >>>>>>>>> tolerance for zero pivot 2.22045e-14 >>>>>>>>> matrix ordering: natural >>>>>>>>> factor fill ratio given 1., needed 1. >>>>>>>>> Factored matrix follows: >>>>>>>>> Mat Object: (sub_) 1 MPI process >>>>>>>>> type: seqbaij >>>>>>>>> rows=16384, cols=16384, bs=16 >>>>>>>>> package used to perform factorization: petsc >>>>>>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>>>>>> block size is 16 >>>>>>>>> linear system matrix = precond matrix: >>>>>>>>> Mat Object: (sub_) 1 MPI process >>>>>>>>> type: seqbaij >>>>>>>>> rows=16384, cols=16384, bs=16 >>>>>>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>>>> block size is 16 >>>>>>>>> linear system matrix followed by preconditioner matrix: >>>>>>>>> Mat Object: 1 MPI process >>>>>>>>> type: mffd >>>>>>>>> rows=16384, cols=16384 >>>>>>>>> Matrix-free approximation: >>>>>>>>> err=1.49012e-08 (relative error in function evaluation) >>>>>>>>> Using wp compute h routine >>>>>>>>> Does not compute normU >>>>>>>>> Mat Object: 1 MPI process >>>>>>>>> type: seqbaij >>>>>>>>> rows=16384, cols=16384, bs=16 >>>>>>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>>>> block size is 16 >>>>>>>>> >>>>>>>>> On Thu, May 4, 2023 at 8:30?AM Mark Adams > wrote: >>>>>>>>>> If you are using MG what is the coarse grid solver? >>>>>>>>>> -snes_view might give you that. >>>>>>>>>> >>>>>>>>>> On Thu, May 4, 2023 at 8:25?AM Matthew Knepley > wrote: >>>>>>>>>>> On Thu, May 4, 2023 at 8:21?AM Mark Lohry > wrote: >>>>>>>>>>>>> Do they start very similarly and then slowly drift further apart? >>>>>>>>>>>> >>>>>>>>>>>> Yes, this. I take it this sounds familiar? >>>>>>>>>>>> >>>>>>>>>>>> See these two examples with 20 fixed iterations pasted at the end. The difference for one solve is slight (final SNES norm is identical to 5 digits), but in the context I'm using it in (repeated applications to solve a steady state multigrid problem, though here just one level) the differences add up such that I might reach global convergence in 35 iterations or 38. It's not the end of the world, but I was expecting that with -np 1 these would be identical and I'm not sure where the root cause would be. >>>>>>>>>>> >>>>>>>>>>> The initial KSP residual is different, so its the PC. Please send the output of -snes_view. If your ASM is using direct factorization, then it >>>>>>>>>>> could be randomness in whatever LU you are using. >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> >>>>>>>>>>> Matt >>>>>>>>>>> >>>>>>>>>>>> 0 SNES Function norm 2.801842107848e+04 >>>>>>>>>>>> 0 KSP Residual norm 4.045639499595e+01 >>>>>>>>>>>> 1 KSP Residual norm 1.917999809040e+01 >>>>>>>>>>>> 2 KSP Residual norm 1.616048521958e+01 >>>>>>>>>>>> [...] 
>>>>>>>>>>>> 19 KSP Residual norm 8.788043518111e-01 >>>>>>>>>>>> 20 KSP Residual norm 6.570851270214e-01 >>>>>>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>>>>>>> 1 SNES Function norm 1.801309983345e+03 >>>>>>>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Same system, identical initial 0 SNES norm, 0 KSP is slightly different >>>>>>>>>>>> >>>>>>>>>>>> 0 SNES Function norm 2.801842107848e+04 >>>>>>>>>>>> 0 KSP Residual norm 4.045639473002e+01 >>>>>>>>>>>> 1 KSP Residual norm 1.917999883034e+01 >>>>>>>>>>>> 2 KSP Residual norm 1.616048572016e+01 >>>>>>>>>>>> [...] >>>>>>>>>>>> 19 KSP Residual norm 8.788046348957e-01 >>>>>>>>>>>> 20 KSP Residual norm 6.570859588610e-01 >>>>>>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>>>>>>> 1 SNES Function norm 1.801311320322e+03 >>>>>>>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>>>>>>>>> >>>>>>>>>>>> On Wed, May 3, 2023 at 11:05?PM Barry Smith > wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> Do they start very similarly and then slowly drift further apart? That is the first couple of KSP iterations they are almost identical but then for each iteration get a bit further. Similar for the SNES iterations, starting close and then for more iterations and more solves they start moving apart. Or do they suddenly jump to be very different? You can run with -snes_monitor -ksp_monitor >>>>>>>>>>>>> >>>>>>>>>>>>>> On May 3, 2023, at 9:07 PM, Mark Lohry > wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> This is on a single MPI rank. I haven't checked the coloring, was just guessing there. But the solutions/residuals are slightly different from run to run. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Fair to say that for serial JFNK/asm ilu0/gmres we should expect bitwise identical results? >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Wed, May 3, 2023, 8:50 PM Barry Smith > wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> No, the coloring should be identical every time. Do you see differences with 1 MPI rank? (Or much smaller ones?). >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> > On May 3, 2023, at 8:42 PM, Mark Lohry > wrote: >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > I'm running multiple iterations of newtonls with an MFFD/JFNK nonlinear solver where I give it the sparsity. PC asm, KSP gmres, with SNESSetLagJacobian -2 (compute once and then frozen jacobian). >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > I'm seeing slight (<1%) but nonzero differences in residuals from run to run. I'm wondering where randomness might enter here -- does the jacobian coloring use a random seed? >>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>>>>>>> -- Norbert Wiener >>>>>>>>>>> >>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
>>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From leonardo.mutti01 at universitadipavia.it Fri May 5 01:45:11 2023 From: leonardo.mutti01 at universitadipavia.it (LEONARDO MUTTI) Date: Fri, 5 May 2023 08:45:11 +0200 Subject: [petsc-users] Understanding index sets for PCGASM In-Reply-To: References: Message-ID: Interesting, a priori I'm not sure this will work better, mainly because I'd lose the compact band structure. As for waveform relaxation: I excluded it at first since it appears to be requiring too many CPUs than I have to beat sequential solvers, plus it is more complicated and I have very limited time at this project. For both suggestions, because of the way the space-time matrix is generated, it is much more convenient for me to mess with the time dimension than with space. Overall GASM seems a simpler way to go before trying other things. Please let me know if you decide to add the GASM interfaces. Thanks again. Best, Leonardo Il ven 5 mag 2023, 00:01 Matthew Knepley ha scritto: > On Thu, May 4, 2023 at 1:43?PM LEONARDO MUTTI < > leonardo.mutti01 at universitadipavia.it> wrote: > >> Of course, I'll try to explain. >> >> I am solving a parabolic equation with space-time FEM and I want an >> efficient solver/preconditioner for the resulting system. >> The corresponding matrix, call it X, has an e.g. block bi-diagonal >> structure, if the cG(1)-dG(0) method is used (i.e. implicit Euler solved in >> batch). >> Every block-row of X corresponds to a time instant. >> >> I want to introduce parallelism in time by subdividing X into overlapping >> submatrices of e.g 2x2 or 3x3 blocks, along the block diagonal. >> For instance, call X_i the individual blocks. The submatrices would be, >> for various i, (X_{i-1,i-1},X_{i-1,i};X_{i,i-1},X_{i,i}). >> I'd like each submatrix to be solved in parallel, to combine the various >> results together in an ASM like fashion. >> Every submatrix has thus a predecessor and a successor, and it overlaps >> with both, so that as far as I could understand, GASM has to be used in >> place of ASM. >> > > Yes, ordered that way you need GASM. I wonder if inverting the ordering > would be useful, namely putting the time index on the inside. > Then the blocks would be over all time, but limited space, which is more > the spirit of ASM I think. > > Have you considered waveform relaxation for this problem? > > Thanks, > > Matt > > >> Hope this helps. >> Best, >> Leonardo >> >> Il giorno gio 4 mag 2023 alle ore 18:05 Matthew Knepley < >> knepley at gmail.com> ha scritto: >> >>> On Thu, May 4, 2023 at 11:24?AM LEONARDO MUTTI < >>> leonardo.mutti01 at universitadipavia.it> wrote: >>> >>>> Thank you for the help. >>>> Adding to my example: >>>> >>>> >>>> * call PCGASMSetSubdomains(pc,NSub, subdomains_IS, >>>> inflated_IS,ierr) call >>>> PCGASMDestroySubdomains(NSub,subdomains_IS,inflated_IS,ierr)* >>>> results in: >>>> >>>> * Error LNK2019 unresolved external symbol PCGASMDESTROYSUBDOMAINS >>>> referenced in function ... * >>>> >>>> * Error LNK2019 unresolved external symbol PCGASMSETSUBDOMAINS >>>> referenced in function ... * >>>> I'm not sure if the interfaces are missing or if I have a compilation >>>> problem. 
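For what it's worth, the two calls behind those LNK2019 errors are available from C even where the Fortran stubs are missing, so the intent can be checked against the C API. Below is a rough, untested sketch of how the overlapping time-slab subdomains described above could be handed to PCGASMSetSubdomains(), using the pc from the example code quoted further down in this thread; the sizes and names (nb spatial dofs per time step, nt time steps, two-step slabs with one step of overlap on each side, every slab owned by a single rank) are illustrative assumptions, not something taken from the discussion.

  /* Hypothetical sketch, not from the thread: time step k owns global rows [k*nb, (k+1)*nb). */
  PetscInt nb = 100, nt = 16, nslab = nt / 2, i;
  IS      *iis, *ois; /* inner and inflated (overlapping) subdomains */

  PetscCall(PetscMalloc2(nslab, &iis, nslab, &ois));
  for (i = 0; i < nslab; i++) {
    PetscInt first  = 2 * i * nb;                  /* inner slab: steps 2i and 2i+1 */
    PetscInt ofirst = PetscMax(0, 2 * i - 1) * nb; /* inflated slab: one extra step on each side */
    PetscInt olast  = PetscMin(nt, 2 * i + 3) * nb;
    PetscCall(ISCreateStride(PETSC_COMM_SELF, 2 * nb, first, 1, &iis[i]));
    PetscCall(ISCreateStride(PETSC_COMM_SELF, olast - ofirst, ofirst, 1, &ois[i]));
  }
  PetscCall(PCGASMSetSubdomains(pc, nslab, iis, ois));
  /* ISDestroy() each entry and PetscFree2(iis, ois) once they are no longer needed;
     compare PCGASMDestroySubdomains() for the arrays returned by PCGASMCreateSubdomains2D() */
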
>>>> >>> >>> I just want to make sure you really want GASM. It sounded like you might >>> able to do what you want just with ASM. >>> Can you tell me again what you want to do overall? >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Thank you again. >>>> Best, >>>> Leonardo >>>> >>>> Il giorno sab 29 apr 2023 alle ore 20:30 Barry Smith >>>> ha scritto: >>>> >>>>> >>>>> Thank you for the test code. I have a fix in the branch >>>>> barry/2023-04-29/fix-pcasmcreatesubdomains2d >>>>> with >>>>> merge request https://gitlab.com/petsc/petsc/-/merge_requests/6394 >>>>> >>>>> The functions did not have proper Fortran stubs and interfaces so I >>>>> had to provide them manually in the new branch. >>>>> >>>>> Use >>>>> >>>>> git fetch >>>>> git checkout barry/2023-04-29/fix-pcasmcreatesubdomains2d >>>>> >>>>> ./configure etc >>>>> >>>>> Your now working test code is in src/ksp/ksp/tests/ex71f.F90 I had >>>>> to change things slightly and I updated the error handling for the latest >>>>> version. >>>>> >>>>> Please let us know if you have any later questions. >>>>> >>>>> Barry >>>>> >>>>> >>>>> >>>>> >>>>> On Apr 28, 2023, at 12:07 PM, LEONARDO MUTTI < >>>>> leonardo.mutti01 at universitadipavia.it> wrote: >>>>> >>>>> Hello. I am having a hard time understanding the index sets to feed >>>>> PCGASMSetSubdomains, and I am working in Fortran (as a PETSc novice). To >>>>> get more intuition on how the IS objects behave I tried the following >>>>> minimal (non) working example, which should tile a 16x16 matrix into 16 >>>>> square, non-overlapping submatrices: >>>>> >>>>> #include >>>>> #include >>>>> #include >>>>> USE petscmat >>>>> USE petscksp >>>>> USE petscpc >>>>> >>>>> Mat :: A >>>>> PetscInt :: M, NSubx, dof, overlap, NSub >>>>> INTEGER :: I,J >>>>> PetscErrorCode :: ierr >>>>> PetscScalar :: v >>>>> KSP :: ksp >>>>> PC :: pc >>>>> IS :: subdomains_IS, inflated_IS >>>>> >>>>> call PetscInitialize(PETSC_NULL_CHARACTER , ierr) >>>>> >>>>> !-----Create a dummy matrix >>>>> M = 16 >>>>> call MatCreateAIJ(MPI_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, >>>>> & M, M, >>>>> & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, >>>>> & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, >>>>> & A, ierr) >>>>> >>>>> DO I=1,M >>>>> DO J=1,M >>>>> v = I*J >>>>> CALL MatSetValue (A,I-1,J-1,v, >>>>> & INSERT_VALUES , ierr) >>>>> END DO >>>>> END DO >>>>> >>>>> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY , ierr) >>>>> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY , ierr) >>>>> >>>>> !-----Create KSP and PC >>>>> call KSPCreate(PETSC_COMM_WORLD,ksp, ierr) >>>>> call KSPSetOperators(ksp,A,A, ierr) >>>>> call KSPSetType(ksp,"bcgs",ierr) >>>>> call KSPGetPC(ksp,pc,ierr) >>>>> call KSPSetUp(ksp, ierr) >>>>> call PCSetType(pc,PCGASM, ierr) >>>>> call PCSetUp(pc , ierr) >>>>> >>>>> !-----GASM setup >>>>> NSubx = 4 >>>>> dof = 1 >>>>> overlap = 0 >>>>> >>>>> call PCGASMCreateSubdomains2D(pc, >>>>> & M, M, >>>>> & NSubx, NSubx, >>>>> & dof, overlap, >>>>> & NSub, subdomains_IS, inflated_IS, ierr) >>>>> >>>>> call ISView(subdomains_IS, PETSC_VIEWER_STDOUT_WORLD, ierr) >>>>> >>>>> call KSPDestroy(ksp, ierr) >>>>> call PetscFinalize(ierr) >>>>> >>>>> Running this on one processor, I get NSub = 4. >>>>> If PCASM and PCASMCreateSubdomains2D are used instead, I get NSub = 16 >>>>> as expected. >>>>> Moreover, I get in the end "forrtl: severe (157): Program Exception - >>>>> access violation". So: >>>>> 1) why do I get two different results with ASM, and GASM? >>>>> 2) why do I get access violation and how can I solve this? 
>>>>> In fact, in C, subdomains_IS, inflated_IS should pointers to IS >>>>> objects. As I see on the Fortran interface, the arguments to >>>>> PCGASMCreateSubdomains2D are IS objects: >>>>> >>>>> subroutine PCGASMCreateSubdomains2D(a,b,c,d,e,f,g,h,i,j,z) >>>>> import tPC,tIS >>>>> PC a ! PC >>>>> PetscInt b ! PetscInt >>>>> PetscInt c ! PetscInt >>>>> PetscInt d ! PetscInt >>>>> PetscInt e ! PetscInt >>>>> PetscInt f ! PetscInt >>>>> PetscInt g ! PetscInt >>>>> PetscInt h ! PetscInt >>>>> IS i ! IS >>>>> IS j ! IS >>>>> PetscErrorCode z >>>>> end subroutine PCGASMCreateSubdomains2D >>>>> Thus: >>>>> 3) what should be inside e.g., subdomains_IS? I expect it to contain, >>>>> for every created subdomain, the list of rows and columns defining the >>>>> subblock in the matrix, am I right? >>>>> >>>>> Context: I have a block-tridiagonal system arising from space-time >>>>> finite elements, and I want to solve it with GMRES+PCGASM preconditioner, >>>>> where each overlapping submatrix is on the diagonal and of size 3x3 blocks >>>>> (and spanning multiple processes). This is PETSc 3.17.1 on Windows. >>>>> >>>>> Thanks in advance, >>>>> Leonardo >>>>> >>>>> >>>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ksl7912 at snu.ac.kr Fri May 5 02:49:28 2023 From: ksl7912 at snu.ac.kr (=?UTF-8?B?wq3qtozsirnrpqwgLyDtlZnsg50gLyDtla3qs7XsmrDso7zqs7XtlZnqs7w=?=) Date: Fri, 5 May 2023 16:49:28 +0900 Subject: [petsc-users] parallel computing error In-Reply-To: References: Message-ID: Dear Barry Smith Thanks to you, I knew the difference between MATAIJ and MATDENSE. However, I still have some problems. There is no problem when I run with a single core. But, MatGetFactor error occurs when using multi-core. Could you give me some advice? The error message is [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: See https://petsc.org/release/overview/linear_solve_table/ for possible LU and Cholesky solvers [0]PETSC ERROR: MatSolverType petsc does not support matrix type mpidense [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.18.5, unknown [0]PETSC ERROR: ./app on a arch-linux-c-opt named ubuntu by ksl Fri May 5 00:35:23 2023 [0]PETSC ERROR: Configure options --download-mpich --with-debugging=0 COPTFLAGS="-O3 -march=native -mtune=native" CXXOPTFLAGS="-O3 -march=native -mtune=native" FOPTFLAGS="-O3 -march=native -mtune=native" --download-mumps --download-scalapack --download-parmetis --download-metis --download-parmetis --download-hpddm --download-slepc [0]PETSC ERROR: #1 MatGetFactor() at /home/ksl/petsc/src/mat/interface/matrix.c:4757 [0]PETSC ERROR: #2 main() at /home/ksl/Downloads/coding_test/coding/a1.c:66 [0]PETSC ERROR: No PETSc Option Table entries [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- application called MPI_Abort(MPI_COMM_SELF, 92) - process 0 My code is below: int main(int argc, char** args) { Mat A, E, A_temp, A_fac; int n = 15; PetscInitialize(&argc, &args, NULL, NULL); PetscCallMPI(MPI_Comm_size(PETSC_COMM_WORLD, &size)); PetscCall(MatCreate(PETSC_COMM_WORLD, &A)); PetscCall(MatSetType(A,MATDENSE)); PetscCall(MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n)); PetscCall(MatSetFromOptions(A)); PetscCall(MatSetUp(A)); // Insert values double val; for (int i = 0; i < n; i++) { for (int j = 0; j < n; j++) { if (i == j){ val = 2.0; } else{ val = 1.0; } PetscCall(MatSetValue(A, i, j, val, INSERT_VALUES)); } } PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY)); PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY)); // Make Identity matrix PetscCall(MatCreate(PETSC_COMM_WORLD, &E)); PetscCall(MatSetType(E,MATDENSE)); PetscCall(MatSetSizes(E, PETSC_DECIDE, PETSC_DECIDE, n, n)); PetscCall(MatSetFromOptions(E)); PetscCall(MatSetUp(E)); PetscCall(MatShift(E,1.0)); PetscCall(MatAssemblyBegin(E, MAT_FINAL_ASSEMBLY)); PetscCall(MatAssemblyEnd(E, MAT_FINAL_ASSEMBLY)); PetscCall(MatDuplicate(A, MAT_DO_NOT_COPY_VALUES, &A_temp)); PetscCall(MatGetFactor(A, MATSOLVERPETSC, MAT_FACTOR_LU, &A_fac)); IS isr, isc; MatFactorInfo info; MatGetOrdering(A, MATORDERINGNATURAL, &isr, &isc); PetscCall(MatLUFactorSymbolic(A_fac, A, isr, isc, &info)); PetscCall(MatLUFactorNumeric(A_fac, A, &info)); MatMatSolve(A_fac, E, A_temp); PetscCall(MatView(A_temp, PETSC_VIEWER_STDOUT_WORLD)); MatDestroy(&A); MatDestroy(&A_temp); MatDestroy(&A_fac); MatDestroy(&E); PetscCall(PetscFinalize()); } Best regards Seung Lee Kwon 2023? 5? 4? (?) ?? 10:19, Barry Smith ?? ??: > > The code in ex125.c contains > > PetscCall(MatCreate(PETSC_COMM_WORLD, &C)); > PetscCall(MatSetOptionsPrefix(C, "rhs_")); > PetscCall(MatSetSizes(C, m, PETSC_DECIDE, PETSC_DECIDE, nrhs)); > PetscCall(MatSetType(C, MATDENSE)); > PetscCall(MatSetFromOptions(C)); > PetscCall(MatSetUp(C)); > > This dense parallel matrix is suitable for passing to MatMatSolve() as the > right-hand side matrix. Note it is created with PETSC_COMM_WORLD and its > type is set to be MATDENSE. > > You may need to make a sample code by stripping out all the excess code > in ex125.c to just create an MATAIJ and MATDENSE and solves with > MatMatSolve() to determine why you code does not work. > > > > On May 4, 2023, at 3:20 AM, ???? / ?? / ??????? wrote: > > Dear Barry Smith > > Thank you for your reply. > > I've already installed MUMPS. > > And I checked the example you said (ex125.c), I don't understand why the > RHS matrix becomes the SeqDense matrix. > > Could you explain in more detail? > > Best regards > Seung Lee Kwon > > 2023? 5? 4? (?) ?? 12:08, Barry Smith ?? 
??: > >> >> You can configure with MUMPS ./configure --download-mumps >> --download-scalapack --download-ptscotch --download-metis >> --download-parmetis >> >> And then use MatMatSolve() as in src/mat/tests/ex125.c with parallel >> MatMatSolve() using MUMPS as the solver. >> >> Barry >> >> >> On May 3, 2023, at 10:29 PM, ???? / ?? / ??????? >> wrote: >> >> Dear developers >> >> Thank you for your explanation. >> >> But I should use the MatCreateSeqDense because I want to use the >> MatMatSolve that B matrix must be a SeqDense matrix. >> >> Using MatMatSolve is an inevitable part of my code. >> >> Could you give me a comment to avoid this error? >> >> Best, >> >> Seung Lee Kwon >> >> 2023? 5? 3? (?) ?? 7:30, Matthew Knepley ?? ??: >> >>> On Wed, May 3, 2023 at 6:05?AM ???? / ?? / ??????? >>> wrote: >>> >>>> Dear developers >>>> >>>> I'm trying to use parallel computing and I ran the command 'mpirun -np >>>> 4 ./app' >>>> >>>> In this case, there are two problems. >>>> >>>> *First,* I encountered error message >>>> /// >>>> [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message >>>> -------------------------------------------------------------- >>>> [1]PETSC ERROR: Invalid argument >>>> [1]PETSC ERROR: Comm must be of size 1 >>>> /// >>>> The code on the error position is >>>> MatCreateSeqDense(PETSC_COMM_SELF, nns, ns, NULL, &Kns)); >>>> >>> >>> 1) "Seq" means sequential, that is "not parallel". >>> >>> 2) This line should still be fine since PETSC_COMM_SELF is a serial >>> communicator >>> >>> 3) You should be checking the error code for each call, maybe using the >>> CHKERRQ() macro >>> >>> 4) Please always send the entire error message, not a snippet >>> >>> THanks >>> >>> Matt >>> >>> >>>> Could "MatCreateSeqDense" not be used in parallel computing? >>>> >>>> *Second*, the same error message is repeated as many times as the >>>> number of cores. >>>> if I use command -np 4, then the error message is repeated 4 times. >>>> Could you recommend some advice related to this? >>>> >>>> Best, >>>> Seung Lee Kwon >>>> >>>> -- >>>> Seung Lee Kwon, Ph.D.Candidate >>>> Aerospace Structures and Materials Laboratory >>>> Department of Mechanical and Aerospace Engineering >>>> Seoul National University >>>> Building 300 Rm 503, Gwanak-ro 1, Gwanak-gu, Seoul, South Korea, 08826 >>>> E-mail : ksl7912 at snu.ac.kr >>>> Office : +82-2-880-7389 >>>> C. P : +82-10-4695-1062 >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> >> >> -- >> Seung Lee Kwon, Ph.D.Candidate >> Aerospace Structures and Materials Laboratory >> Department of Mechanical and Aerospace Engineering >> Seoul National University >> Building 300 Rm 503, Gwanak-ro 1, Gwanak-gu, Seoul, South Korea, 08826 >> E-mail : ksl7912 at snu.ac.kr >> Office : +82-2-880-7389 >> C. P : +82-10-4695-1062 >> >> >> > > -- > Seung Lee Kwon, Ph.D.Candidate > Aerospace Structures and Materials Laboratory > Department of Mechanical and Aerospace Engineering > Seoul National University > Building 300 Rm 503, Gwanak-ro 1, Gwanak-gu, Seoul, South Korea, 08826 > E-mail : ksl7912 at snu.ac.kr > Office : +82-2-880-7389 > C. 
P : +82-10-4695-1062 > > > -- Seung Lee Kwon, Ph.D.Candidate Aerospace Structures and Materials Laboratory Department of Mechanical and Aerospace Engineering Seoul National University Building 300 Rm 503, Gwanak-ro 1, Gwanak-gu, Seoul, South Korea, 08826 E-mail : ksl7912 at snu.ac.kr Office : +82-2-880-7389 C. P : +82-10-4695-1062 -------------- next part -------------- An HTML attachment was scrubbed... URL: From edoardo.alinovi at gmail.com Fri May 5 04:13:14 2023 From: edoardo.alinovi at gmail.com (Edoardo alinovi) Date: Fri, 5 May 2023 11:13:14 +0200 Subject: [petsc-users] issues with VecSetValues in petsc 3.19 In-Reply-To: References: Message-ID: Hi Matt, I have some more questions on the fieldsplit saga :) I am running a 1M cell ahmed body case using the following options: "solver": "fgmres", "preconditioner": "fieldsplit", "absTol": 1e-6, "relTol": 0.0, "options":{ "pc_fieldsplit_type": "multiplicative", "fieldsplit_u_pc_type": "ml", "fieldsplit_p_pc_type": "ml", "fieldsplit_u_ksp_type": "preonly", "fieldsplit_p_ksp_type": "preonly", "fieldsplit_u_ksp_rtol": 1e-2, "fieldsplit_p_ksp_rtol": 1e-2, } I have run the case using -ksp_monitor_true_residual and this is the results I am getting: Residual norms for UPeqn_ solve. 0 KSP unpreconditioned resid norm 5.003190920461e+00 true resid norm 5.003190920461e+00 ||r(i)||/||b|| 4.993739374163e-03 1 KSP unpreconditioned resid norm 4.959389845148e+00 true resid norm 4.959389845148e+00 ||r(i)||/||b|| 4.950021043622e-03 2 KSP unpreconditioned resid norm 4.793097370730e+00 true resid norm 4.793097370730e+00 ||r(i)||/||b|| 4.784042712927e-03 3 KSP unpreconditioned resid norm 4.187770916162e+00 true resid norm 4.187770916162e+00 ||r(i)||/||b|| 4.179859782782e-03 4 KSP unpreconditioned resid norm 3.099045576565e+00 true resid norm 3.099045576565e+00 ||r(i)||/||b|| 3.093191158212e-03 5 KSP unpreconditioned resid norm 2.072551338956e+00 true resid norm 2.072551338956e+00 ||r(i)||/||b|| 2.068636074628e-03 6 KSP unpreconditioned resid norm 1.414678932482e+00 true resid norm 1.414678932482e+00 ||r(i)||/||b|| 1.412006457328e-03 7 KSP unpreconditioned resid norm 1.006854855789e+00 true resid norm 1.006854855789e+00 ||r(i)||/||b|| 1.004952802592e-03 8 KSP unpreconditioned resid norm 7.332800358083e-01 true resid norm 7.332800358084e-01 ||r(i)||/||b|| 7.318947938062e-04 9 KSP unpreconditioned resid norm 5.406076142092e-01 true resid norm 5.406076142093e-01 ||r(i)||/||b|| 5.395863503846e-04 10 KSP unpreconditioned resid norm 4.037336888099e-01 true resid norm 4.037336888100e-01 ||r(i)||/||b|| 4.029709940193e-04 11 KSP unpreconditioned resid norm 3.041388930530e-01 true resid norm 3.041388930530e-01 ||r(i)||/||b|| 3.035643431559e-04 12 KSP unpreconditioned resid norm 2.299815364065e-01 true resid norm 2.299815364066e-01 ||r(i)||/||b|| 2.295470774436e-04 13 KSP unpreconditioned resid norm 1.739866268817e-01 true resid norm 1.739866268817e-01 ||r(i)||/||b|| 1.736579481074e-04 14 KSP unpreconditioned resid norm 1.317133652074e-01 true resid norm 1.317133652074e-01 ||r(i)||/||b|| 1.314645450067e-04 15 KSP unpreconditioned resid norm 9.966247212017e-02 true resid norm 9.966247212019e-02 ||r(i)||/||b|| 9.947419937897e-05 16 KSP unpreconditioned resid norm 7.531138284402e-02 true resid norm 7.531138284404e-02 ||r(i)||/||b|| 7.516911183479e-05 17 KSP unpreconditioned resid norm 5.646770286889e-02 true resid norm 5.646770286889e-02 ||r(i)||/||b|| 5.636102952452e-05 18 KSP unpreconditioned resid norm 4.225114444838e-02 true resid norm 4.225114444838e-02 ||r(i)||/||b|| 
4.217132765661e-05 19 KSP unpreconditioned resid norm 3.160393382046e-02 true resid norm 3.160393382046e-02 ||r(i)||/||b|| 3.154423071329e-05 20 KSP unpreconditioned resid norm 2.366499890335e-02 true resid norm 2.366499890334e-02 ||r(i)||/||b|| 2.362029326721e-05 21 KSP unpreconditioned resid norm 1.759504138840e-02 true resid norm 1.759504138839e-02 ||r(i)||/||b|| 1.756180253124e-05 22 KSP unpreconditioned resid norm 1.309326628086e-02 true resid norm 1.309326628085e-02 ||r(i)||/||b|| 1.306853174355e-05 23 KSP unpreconditioned resid norm 9.710513089816e-03 true resid norm 9.710513089808e-03 ||r(i)||/||b|| 9.692168923954e-06 24 KSP unpreconditioned resid norm 7.171039760236e-03 true resid norm 7.171039760227e-03 ||r(i)||/||b|| 7.157492922743e-06 25 KSP unpreconditioned resid norm 5.277221846963e-03 true resid norm 5.277221846949e-03 ||r(i)||/||b|| 5.267252627824e-06 26 KSP unpreconditioned resid norm 3.906960734588e-03 true resid norm 3.906960734575e-03 ||r(i)||/||b|| 3.899580080737e-06 27 KSP unpreconditioned resid norm 2.896843259283e-03 true resid norm 2.896843259273e-03 ||r(i)||/||b|| 2.891370822059e-06 28 KSP unpreconditioned resid norm 2.140269358580e-03 true resid norm 2.140269358568e-03 ||r(i)||/||b|| 2.136226167881e-06 29 KSP unpreconditioned resid norm 1.585513255966e-03 true resid norm 1.585513255956e-03 ||r(i)||/||b|| 1.582518057055e-06 30 KSP unpreconditioned resid norm 1.173839272299e-03 true resid norm 1.173839272299e-03 ||r(i)||/||b|| 1.171621768229e-06 31 KSP unpreconditioned resid norm 8.777233482545e-04 true resid norm 8.777233482544e-04 ||r(i)||/||b|| 8.760652378615e-07 32 KSP unpreconditioned resid norm 6.546689191353e-04 true resid norm 6.546689191370e-04 ||r(i)||/||b|| 6.534321816833e-07 33 KSP unpreconditioned resid norm 4.973281362004e-04 true resid norm 4.973281362015e-04 ||r(i)||/||b|| 4.963886317973e-07 34 KSP unpreconditioned resid norm 3.775325682448e-04 true resid norm 3.775325682480e-04 ||r(i)||/||b|| 3.768193700902e-07 35 KSP unpreconditioned resid norm 2.908052735383e-04 true resid norm 2.908052735387e-04 ||r(i)||/||b|| 2.902559122310e-07 36 KSP unpreconditioned resid norm 2.239556213185e-04 true resid norm 2.239556213218e-04 ||r(i)||/||b|| 2.235325459370e-07 37 KSP unpreconditioned resid norm 1.698373081988e-04 true resid norm 1.698373081996e-04 ||r(i)||/||b|| 1.695164679184e-07 38 KSP unpreconditioned resid norm 1.277467123301e-04 true resid norm 1.277467123333e-04 ||r(i)||/||b|| 1.275053855510e-07 39 KSP unpreconditioned resid norm 9.506326626848e-05 true resid norm 9.506326626983e-05 ||r(i)||/||b|| 9.488368190523e-08 40 KSP unpreconditioned resid norm 7.223958235163e-05 true resid norm 7.223958235336e-05 ||r(i)||/||b|| 7.210311429367e-08 41 KSP unpreconditioned resid norm 5.509671615415e-05 true resid norm 5.509671615512e-05 ||r(i)||/||b|| 5.499263274677e-08 42 KSP unpreconditioned resid norm 4.189229263778e-05 true resid norm 4.189229263744e-05 ||r(i)||/||b|| 4.181315375394e-08 43 KSP unpreconditioned resid norm 3.067856645608e-05 true resid norm 3.067856645894e-05 ||r(i)||/||b|| 3.062061146665e-08 44 KSP unpreconditioned resid norm 2.340298386078e-05 true resid norm 2.340298386081e-05 ||r(i)||/||b|| 2.335877319825e-08 45 KSP unpreconditioned resid norm 1.791143784234e-05 true resid norm 1.791143784276e-05 ||r(i)||/||b|| 1.787760127991e-08 46 KSP unpreconditioned resid norm 1.355654057227e-05 true resid norm 1.355654057087e-05 ||r(i)||/||b|| 1.353093086041e-08 47 KSP unpreconditioned resid norm 1.020861247518e-05 true resid norm 1.020861247732e-05 
||r(i)||/||b|| 1.018932735009e-08 48 KSP unpreconditioned resid norm 7.642335784085e-06 true resid norm 7.642335784452e-06 ||r(i)||/||b|| 7.627898619923e-09 49 KSP unpreconditioned resid norm 5.874756954976e-06 true resid norm 5.874756956850e-06 ||r(i)||/||b|| 5.863658931960e-09 50 KSP unpreconditioned resid norm 4.512356844825e-06 true resid norm 4.512356846552e-06 ||r(i)||/||b|| 4.503832536701e-09 51 KSP unpreconditioned resid norm 3.438985239280e-06 true resid norm 3.438985240743e-06 ||r(i)||/||b|| 3.432488641125e-09 52 KSP unpreconditioned resid norm 2.655998139390e-06 true resid norm 2.655998140374e-06 ||r(i)||/||b|| 2.650980684556e-09 53 KSP unpreconditioned resid norm 2.051081181832e-06 true resid norm 2.051081181929e-06 ||r(i)||/||b|| 2.047206476954e-09 54 KSP unpreconditioned resid norm 1.581756364000e-06 true resid norm 1.581756364725e-06 ||r(i)||/||b|| 1.578768262982e-09 55 KSP unpreconditioned resid norm 1.207420527415e-06 true resid norm 1.207420527996e-06 ||r(i)||/||b|| 1.205139585453e-09 56 KSP unpreconditioned resid norm 9.377914349033e-07 true resid norm 9.377914347157e-07 ||r(i)||/||b|| 9.360198494806e-10 I have the feeling I am self-inflicting some overhead not using relative tolerance (if I set 1e-3 I converge in 8 iters vs 56...), however I would like to ask the following question: >From your point of view, is this a reasonably convergence history? Thank you! -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri May 5 04:21:37 2023 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 5 May 2023 05:21:37 -0400 Subject: [petsc-users] parallel computing error In-Reply-To: References: Message-ID: On Fri, May 5, 2023 at 3:49?AM ???? / ?? / ??????? wrote: > Dear Barry Smith > > Thanks to you, I knew the difference between MATAIJ and MATDENSE. > > However, I still have some problems. > > There is no problem when I run with a single core. But, MatGetFactor error > occurs when using multi-core. > > Could you give me some advice? > > The error message is > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: See https://petsc.org/release/overview/linear_solve_table/ > for possible LU and Cholesky solvers > [0]PETSC ERROR: MatSolverType petsc does not support matrix type mpidense > PETSc uses 3rd party packages for parallel dense factorization. You would need to reconfigure with either ScaLAPACK or Elemental. Thanks, Matt > [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
> [0]PETSC ERROR: Petsc Release Version 3.18.5, unknown > [0]PETSC ERROR: ./app on a arch-linux-c-opt named ubuntu by ksl Fri May 5 > 00:35:23 2023 > [0]PETSC ERROR: Configure options --download-mpich --with-debugging=0 > COPTFLAGS="-O3 -march=native -mtune=native" CXXOPTFLAGS="-O3 -march=native > -mtune=native" FOPTFLAGS="-O3 -march=native -mtune=native" --download-mumps > --download-scalapack --download-parmetis --download-metis > --download-parmetis --download-hpddm --download-slepc > [0]PETSC ERROR: #1 MatGetFactor() at > /home/ksl/petsc/src/mat/interface/matrix.c:4757 > [0]PETSC ERROR: #2 main() at /home/ksl/Downloads/coding_test/coding/a1.c:66 > [0]PETSC ERROR: No PETSc Option Table entries > [0]PETSC ERROR: ----------------End of Error Message -------send entire > error message to petsc-maint at mcs.anl.gov---------- > application called MPI_Abort(MPI_COMM_SELF, 92) - process 0 > > My code is below: > > int main(int argc, char** args) > { > Mat A, E, A_temp, A_fac; > int n = 15; > PetscInitialize(&argc, &args, NULL, NULL); > PetscCallMPI(MPI_Comm_size(PETSC_COMM_WORLD, &size)); > > PetscCall(MatCreate(PETSC_COMM_WORLD, &A)); > PetscCall(MatSetType(A,MATDENSE)); > PetscCall(MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n)); > PetscCall(MatSetFromOptions(A)); > PetscCall(MatSetUp(A)); > // Insert values > double val; > for (int i = 0; i < n; i++) { > for (int j = 0; j < n; j++) { > if (i == j){ > val = 2.0; > } > else{ > val = 1.0; > } > PetscCall(MatSetValue(A, i, j, val, INSERT_VALUES)); > } > } > PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY)); > PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY)); > > // Make Identity matrix > PetscCall(MatCreate(PETSC_COMM_WORLD, &E)); > PetscCall(MatSetType(E,MATDENSE)); > PetscCall(MatSetSizes(E, PETSC_DECIDE, PETSC_DECIDE, n, n)); > PetscCall(MatSetFromOptions(E)); > PetscCall(MatSetUp(E)); > PetscCall(MatShift(E,1.0)); > PetscCall(MatAssemblyBegin(E, MAT_FINAL_ASSEMBLY)); > PetscCall(MatAssemblyEnd(E, MAT_FINAL_ASSEMBLY)); > > PetscCall(MatDuplicate(A, MAT_DO_NOT_COPY_VALUES, &A_temp)); > PetscCall(MatGetFactor(A, MATSOLVERPETSC, MAT_FACTOR_LU, &A_fac)); > > IS isr, isc; MatFactorInfo info; > MatGetOrdering(A, MATORDERINGNATURAL, &isr, &isc); > PetscCall(MatLUFactorSymbolic(A_fac, A, isr, isc, &info)); > PetscCall(MatLUFactorNumeric(A_fac, A, &info)); > MatMatSolve(A_fac, E, A_temp); > > PetscCall(MatView(A_temp, PETSC_VIEWER_STDOUT_WORLD)); > MatDestroy(&A); > MatDestroy(&A_temp); > MatDestroy(&A_fac); > MatDestroy(&E); > PetscCall(PetscFinalize()); > } > > Best regards > Seung Lee Kwon > > 2023? 5? 4? (?) ?? 10:19, Barry Smith ?? ??: > >> >> The code in ex125.c contains >> >> PetscCall(MatCreate(PETSC_COMM_WORLD, &C)); >> PetscCall(MatSetOptionsPrefix(C, "rhs_")); >> PetscCall(MatSetSizes(C, m, PETSC_DECIDE, PETSC_DECIDE, nrhs)); >> PetscCall(MatSetType(C, MATDENSE)); >> PetscCall(MatSetFromOptions(C)); >> PetscCall(MatSetUp(C)); >> >> This dense parallel matrix is suitable for passing to MatMatSolve() as >> the right-hand side matrix. Note it is created with PETSC_COMM_WORLD and >> its type is set to be MATDENSE. >> >> You may need to make a sample code by stripping out all the excess code >> in ex125.c to just create an MATAIJ and MATDENSE and solves with >> MatMatSolve() to determine why you code does not work. >> >> >> >> On May 4, 2023, at 3:20 AM, ???? / ?? / ??????? >> wrote: >> >> Dear Barry Smith >> >> Thank you for your reply. >> >> I've already installed MUMPS. 
>> >> And I checked the example you said (ex125.c), I don't understand why the >> RHS matrix becomes the SeqDense matrix. >> >> Could you explain in more detail? >> >> Best regards >> Seung Lee Kwon >> >> 2023? 5? 4? (?) ?? 12:08, Barry Smith ?? ??: >> >>> >>> You can configure with MUMPS ./configure --download-mumps >>> --download-scalapack --download-ptscotch --download-metis >>> --download-parmetis >>> >>> And then use MatMatSolve() as in src/mat/tests/ex125.c with parallel >>> MatMatSolve() using MUMPS as the solver. >>> >>> Barry >>> >>> >>> On May 3, 2023, at 10:29 PM, ???? / ?? / ??????? >>> wrote: >>> >>> Dear developers >>> >>> Thank you for your explanation. >>> >>> But I should use the MatCreateSeqDense because I want to use the >>> MatMatSolve that B matrix must be a SeqDense matrix. >>> >>> Using MatMatSolve is an inevitable part of my code. >>> >>> Could you give me a comment to avoid this error? >>> >>> Best, >>> >>> Seung Lee Kwon >>> >>> 2023? 5? 3? (?) ?? 7:30, Matthew Knepley ?? ??: >>> >>>> On Wed, May 3, 2023 at 6:05?AM ???? / ?? / ??????? >>>> wrote: >>>> >>>>> Dear developers >>>>> >>>>> I'm trying to use parallel computing and I ran the command 'mpirun -np >>>>> 4 ./app' >>>>> >>>>> In this case, there are two problems. >>>>> >>>>> *First,* I encountered error message >>>>> /// >>>>> [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message >>>>> -------------------------------------------------------------- >>>>> [1]PETSC ERROR: Invalid argument >>>>> [1]PETSC ERROR: Comm must be of size 1 >>>>> /// >>>>> The code on the error position is >>>>> MatCreateSeqDense(PETSC_COMM_SELF, nns, ns, NULL, &Kns)); >>>>> >>>> >>>> 1) "Seq" means sequential, that is "not parallel". >>>> >>>> 2) This line should still be fine since PETSC_COMM_SELF is a serial >>>> communicator >>>> >>>> 3) You should be checking the error code for each call, maybe using the >>>> CHKERRQ() macro >>>> >>>> 4) Please always send the entire error message, not a snippet >>>> >>>> THanks >>>> >>>> Matt >>>> >>>> >>>>> Could "MatCreateSeqDense" not be used in parallel computing? >>>>> >>>>> *Second*, the same error message is repeated as many times as the >>>>> number of cores. >>>>> if I use command -np 4, then the error message is repeated 4 times. >>>>> Could you recommend some advice related to this? >>>>> >>>>> Best, >>>>> Seung Lee Kwon >>>>> >>>>> -- >>>>> Seung Lee Kwon, Ph.D.Candidate >>>>> Aerospace Structures and Materials Laboratory >>>>> Department of Mechanical and Aerospace Engineering >>>>> Seoul National University >>>>> Building 300 Rm 503, Gwanak-ro 1, Gwanak-gu, Seoul, South Korea, 08826 >>>>> E-mail : ksl7912 at snu.ac.kr >>>>> Office : +82-2-880-7389 >>>>> C. P : +82-10-4695-1062 >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> >>> >>> -- >>> Seung Lee Kwon, Ph.D.Candidate >>> Aerospace Structures and Materials Laboratory >>> Department of Mechanical and Aerospace Engineering >>> Seoul National University >>> Building 300 Rm 503, Gwanak-ro 1, Gwanak-gu, Seoul, South Korea, 08826 >>> E-mail : ksl7912 at snu.ac.kr >>> Office : +82-2-880-7389 >>> C. 
P : +82-10-4695-1062 >>> >>> >>> >> >> -- >> Seung Lee Kwon, Ph.D.Candidate >> Aerospace Structures and Materials Laboratory >> Department of Mechanical and Aerospace Engineering >> Seoul National University >> Building 300 Rm 503, Gwanak-ro 1, Gwanak-gu, Seoul, South Korea, 08826 >> E-mail : ksl7912 at snu.ac.kr >> Office : +82-2-880-7389 >> C. P : +82-10-4695-1062 >> >> >> > > -- > Seung Lee Kwon, Ph.D.Candidate > Aerospace Structures and Materials Laboratory > Department of Mechanical and Aerospace Engineering > Seoul National University > Building 300 Rm 503, Gwanak-ro 1, Gwanak-gu, Seoul, South Korea, 08826 > E-mail : ksl7912 at snu.ac.kr > Office : +82-2-880-7389 > C. P : +82-10-4695-1062 > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri May 5 04:26:23 2023 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 5 May 2023 05:26:23 -0400 Subject: [petsc-users] issues with VecSetValues in petsc 3.19 In-Reply-To: References: Message-ID: On Fri, May 5, 2023 at 5:13?AM Edoardo alinovi wrote: > Hi Matt, > > I have some more questions on the fieldsplit saga :) > > I am running a 1M cell ahmed body case using the following options: > > "solver": "fgmres", > "preconditioner": "fieldsplit", > "absTol": 1e-6, > "relTol": 0.0, > > "options":{ > "pc_fieldsplit_type": "multiplicative", > "fieldsplit_u_pc_type": "ml", > "fieldsplit_p_pc_type": "ml", > "fieldsplit_u_ksp_type": "preonly", > "fieldsplit_p_ksp_type": "preonly", > "fieldsplit_u_ksp_rtol": 1e-2, > "fieldsplit_p_ksp_rtol": 1e-2, > } > > > I have run the case using -ksp_monitor_true_residual and this is the > results I am getting: > > > Residual norms for UPeqn_ solve. 
> 0 KSP unpreconditioned resid norm 5.003190920461e+00 true resid norm > 5.003190920461e+00 ||r(i)||/||b|| 4.993739374163e-03 > 1 KSP unpreconditioned resid norm 4.959389845148e+00 true resid norm > 4.959389845148e+00 ||r(i)||/||b|| 4.950021043622e-03 > 2 KSP unpreconditioned resid norm 4.793097370730e+00 true resid norm > 4.793097370730e+00 ||r(i)||/||b|| 4.784042712927e-03 > 3 KSP unpreconditioned resid norm 4.187770916162e+00 true resid norm > 4.187770916162e+00 ||r(i)||/||b|| 4.179859782782e-03 > 4 KSP unpreconditioned resid norm 3.099045576565e+00 true resid norm > 3.099045576565e+00 ||r(i)||/||b|| 3.093191158212e-03 > 5 KSP unpreconditioned resid norm 2.072551338956e+00 true resid norm > 2.072551338956e+00 ||r(i)||/||b|| 2.068636074628e-03 > 6 KSP unpreconditioned resid norm 1.414678932482e+00 true resid norm > 1.414678932482e+00 ||r(i)||/||b|| 1.412006457328e-03 > 7 KSP unpreconditioned resid norm 1.006854855789e+00 true resid norm > 1.006854855789e+00 ||r(i)||/||b|| 1.004952802592e-03 > 8 KSP unpreconditioned resid norm 7.332800358083e-01 true resid norm > 7.332800358084e-01 ||r(i)||/||b|| 7.318947938062e-04 > 9 KSP unpreconditioned resid norm 5.406076142092e-01 true resid norm > 5.406076142093e-01 ||r(i)||/||b|| 5.395863503846e-04 > 10 KSP unpreconditioned resid norm 4.037336888099e-01 true resid norm > 4.037336888100e-01 ||r(i)||/||b|| 4.029709940193e-04 > 11 KSP unpreconditioned resid norm 3.041388930530e-01 true resid norm > 3.041388930530e-01 ||r(i)||/||b|| 3.035643431559e-04 > 12 KSP unpreconditioned resid norm 2.299815364065e-01 true resid norm > 2.299815364066e-01 ||r(i)||/||b|| 2.295470774436e-04 > 13 KSP unpreconditioned resid norm 1.739866268817e-01 true resid norm > 1.739866268817e-01 ||r(i)||/||b|| 1.736579481074e-04 > 14 KSP unpreconditioned resid norm 1.317133652074e-01 true resid norm > 1.317133652074e-01 ||r(i)||/||b|| 1.314645450067e-04 > 15 KSP unpreconditioned resid norm 9.966247212017e-02 true resid norm > 9.966247212019e-02 ||r(i)||/||b|| 9.947419937897e-05 > 16 KSP unpreconditioned resid norm 7.531138284402e-02 true resid norm > 7.531138284404e-02 ||r(i)||/||b|| 7.516911183479e-05 > 17 KSP unpreconditioned resid norm 5.646770286889e-02 true resid norm > 5.646770286889e-02 ||r(i)||/||b|| 5.636102952452e-05 > 18 KSP unpreconditioned resid norm 4.225114444838e-02 true resid norm > 4.225114444838e-02 ||r(i)||/||b|| 4.217132765661e-05 > 19 KSP unpreconditioned resid norm 3.160393382046e-02 true resid norm > 3.160393382046e-02 ||r(i)||/||b|| 3.154423071329e-05 > 20 KSP unpreconditioned resid norm 2.366499890335e-02 true resid norm > 2.366499890334e-02 ||r(i)||/||b|| 2.362029326721e-05 > 21 KSP unpreconditioned resid norm 1.759504138840e-02 true resid norm > 1.759504138839e-02 ||r(i)||/||b|| 1.756180253124e-05 > 22 KSP unpreconditioned resid norm 1.309326628086e-02 true resid norm > 1.309326628085e-02 ||r(i)||/||b|| 1.306853174355e-05 > 23 KSP unpreconditioned resid norm 9.710513089816e-03 true resid norm > 9.710513089808e-03 ||r(i)||/||b|| 9.692168923954e-06 > 24 KSP unpreconditioned resid norm 7.171039760236e-03 true resid norm > 7.171039760227e-03 ||r(i)||/||b|| 7.157492922743e-06 > 25 KSP unpreconditioned resid norm 5.277221846963e-03 true resid norm > 5.277221846949e-03 ||r(i)||/||b|| 5.267252627824e-06 > 26 KSP unpreconditioned resid norm 3.906960734588e-03 true resid norm > 3.906960734575e-03 ||r(i)||/||b|| 3.899580080737e-06 > 27 KSP unpreconditioned resid norm 2.896843259283e-03 true resid norm > 2.896843259273e-03 ||r(i)||/||b|| 2.891370822059e-06 > 28 KSP 
unpreconditioned resid norm 2.140269358580e-03 true resid norm > 2.140269358568e-03 ||r(i)||/||b|| 2.136226167881e-06 > 29 KSP unpreconditioned resid norm 1.585513255966e-03 true resid norm > 1.585513255956e-03 ||r(i)||/||b|| 1.582518057055e-06 > 30 KSP unpreconditioned resid norm 1.173839272299e-03 true resid norm > 1.173839272299e-03 ||r(i)||/||b|| 1.171621768229e-06 > 31 KSP unpreconditioned resid norm 8.777233482545e-04 true resid norm > 8.777233482544e-04 ||r(i)||/||b|| 8.760652378615e-07 > 32 KSP unpreconditioned resid norm 6.546689191353e-04 true resid norm > 6.546689191370e-04 ||r(i)||/||b|| 6.534321816833e-07 > 33 KSP unpreconditioned resid norm 4.973281362004e-04 true resid norm > 4.973281362015e-04 ||r(i)||/||b|| 4.963886317973e-07 > 34 KSP unpreconditioned resid norm 3.775325682448e-04 true resid norm > 3.775325682480e-04 ||r(i)||/||b|| 3.768193700902e-07 > 35 KSP unpreconditioned resid norm 2.908052735383e-04 true resid norm > 2.908052735387e-04 ||r(i)||/||b|| 2.902559122310e-07 > 36 KSP unpreconditioned resid norm 2.239556213185e-04 true resid norm > 2.239556213218e-04 ||r(i)||/||b|| 2.235325459370e-07 > 37 KSP unpreconditioned resid norm 1.698373081988e-04 true resid norm > 1.698373081996e-04 ||r(i)||/||b|| 1.695164679184e-07 > 38 KSP unpreconditioned resid norm 1.277467123301e-04 true resid norm > 1.277467123333e-04 ||r(i)||/||b|| 1.275053855510e-07 > 39 KSP unpreconditioned resid norm 9.506326626848e-05 true resid norm > 9.506326626983e-05 ||r(i)||/||b|| 9.488368190523e-08 > 40 KSP unpreconditioned resid norm 7.223958235163e-05 true resid norm > 7.223958235336e-05 ||r(i)||/||b|| 7.210311429367e-08 > 41 KSP unpreconditioned resid norm 5.509671615415e-05 true resid norm > 5.509671615512e-05 ||r(i)||/||b|| 5.499263274677e-08 > 42 KSP unpreconditioned resid norm 4.189229263778e-05 true resid norm > 4.189229263744e-05 ||r(i)||/||b|| 4.181315375394e-08 > 43 KSP unpreconditioned resid norm 3.067856645608e-05 true resid norm > 3.067856645894e-05 ||r(i)||/||b|| 3.062061146665e-08 > 44 KSP unpreconditioned resid norm 2.340298386078e-05 true resid norm > 2.340298386081e-05 ||r(i)||/||b|| 2.335877319825e-08 > 45 KSP unpreconditioned resid norm 1.791143784234e-05 true resid norm > 1.791143784276e-05 ||r(i)||/||b|| 1.787760127991e-08 > 46 KSP unpreconditioned resid norm 1.355654057227e-05 true resid norm > 1.355654057087e-05 ||r(i)||/||b|| 1.353093086041e-08 > 47 KSP unpreconditioned resid norm 1.020861247518e-05 true resid norm > 1.020861247732e-05 ||r(i)||/||b|| 1.018932735009e-08 > 48 KSP unpreconditioned resid norm 7.642335784085e-06 true resid norm > 7.642335784452e-06 ||r(i)||/||b|| 7.627898619923e-09 > 49 KSP unpreconditioned resid norm 5.874756954976e-06 true resid norm > 5.874756956850e-06 ||r(i)||/||b|| 5.863658931960e-09 > 50 KSP unpreconditioned resid norm 4.512356844825e-06 true resid norm > 4.512356846552e-06 ||r(i)||/||b|| 4.503832536701e-09 > 51 KSP unpreconditioned resid norm 3.438985239280e-06 true resid norm > 3.438985240743e-06 ||r(i)||/||b|| 3.432488641125e-09 > 52 KSP unpreconditioned resid norm 2.655998139390e-06 true resid norm > 2.655998140374e-06 ||r(i)||/||b|| 2.650980684556e-09 > 53 KSP unpreconditioned resid norm 2.051081181832e-06 true resid norm > 2.051081181929e-06 ||r(i)||/||b|| 2.047206476954e-09 > 54 KSP unpreconditioned resid norm 1.581756364000e-06 true resid norm > 1.581756364725e-06 ||r(i)||/||b|| 1.578768262982e-09 > 55 KSP unpreconditioned resid norm 1.207420527415e-06 true resid norm > 1.207420527996e-06 ||r(i)||/||b|| 1.205139585453e-09 > 56 
KSP unpreconditioned resid norm 9.377914349033e-07 true resid norm > 9.377914347157e-07 ||r(i)||/||b|| 9.360198494806e-10 > > I have the feeling I am self-inflicting some overhead not using relative > tolerance (if I set 1e-3 I converge in 8 iters vs 56...), however I would > like to ask the following question: > > From your point of view, is this a reasonably convergence history? > It is steadily converging. It is hard to say what is achievable without really understanding the problem. You could imagine trying some variants: - Using "richardson" or "gmres" instead of "preonly" on the inner solvers - Using Schur complement instead of multiplicative if you had a decent approximation matrix for the Schur complement Thanks, Matt > Thank you! > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri May 5 04:28:13 2023 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 5 May 2023 05:28:13 -0400 Subject: [petsc-users] Understanding index sets for PCGASM In-Reply-To: References: Message-ID: On Fri, May 5, 2023 at 2:45?AM LEONARDO MUTTI < leonardo.mutti01 at universitadipavia.it> wrote: > Interesting, a priori I'm not sure this will work better, mainly because > I'd lose the compact band structure. > > As for waveform relaxation: I excluded it at first since it appears to be > requiring too many CPUs than I have to beat sequential solvers, plus it is > more complicated and I have very limited time at this project. > > For both suggestions, because of the way the space-time matrix is > generated, it is much more convenient for me to mess with the time > dimension than with space. > > Overall GASM seems a simpler way to go before trying other things. > Please let me know if you decide to add the GASM interfaces. > We will add them. It might take until the end of classes this term. Thanks, Matt > Thanks again. > Best, > Leonardo > > Il ven 5 mag 2023, 00:01 Matthew Knepley ha scritto: > >> On Thu, May 4, 2023 at 1:43?PM LEONARDO MUTTI < >> leonardo.mutti01 at universitadipavia.it> wrote: >> >>> Of course, I'll try to explain. >>> >>> I am solving a parabolic equation with space-time FEM and I want an >>> efficient solver/preconditioner for the resulting system. >>> The corresponding matrix, call it X, has an e.g. block bi-diagonal >>> structure, if the cG(1)-dG(0) method is used (i.e. implicit Euler solved in >>> batch). >>> Every block-row of X corresponds to a time instant. >>> >>> I want to introduce parallelism in time by subdividing X into >>> overlapping submatrices of e.g 2x2 or 3x3 blocks, along the block diagonal. >>> For instance, call X_i the individual blocks. The submatrices would be, >>> for various i, (X_{i-1,i-1},X_{i-1,i};X_{i,i-1},X_{i,i}). >>> I'd like each submatrix to be solved in parallel, to combine the various >>> results together in an ASM like fashion. >>> Every submatrix has thus a predecessor and a successor, and it overlaps >>> with both, so that as far as I could understand, GASM has to be used in >>> place of ASM. >>> >> >> Yes, ordered that way you need GASM. I wonder if inverting the ordering >> would be useful, namely putting the time index on the inside. >> Then the blocks would be over all time, but limited space, which is more >> the spirit of ASM I think. 
>> >> Have you considered waveform relaxation for this problem? >> >> Thanks, >> >> Matt >> >> >>> Hope this helps. >>> Best, >>> Leonardo >>> >>> Il giorno gio 4 mag 2023 alle ore 18:05 Matthew Knepley < >>> knepley at gmail.com> ha scritto: >>> >>>> On Thu, May 4, 2023 at 11:24?AM LEONARDO MUTTI < >>>> leonardo.mutti01 at universitadipavia.it> wrote: >>>> >>>>> Thank you for the help. >>>>> Adding to my example: >>>>> >>>>> >>>>> * call PCGASMSetSubdomains(pc,NSub, subdomains_IS, >>>>> inflated_IS,ierr) call >>>>> PCGASMDestroySubdomains(NSub,subdomains_IS,inflated_IS,ierr)* >>>>> results in: >>>>> >>>>> * Error LNK2019 unresolved external symbol >>>>> PCGASMDESTROYSUBDOMAINS referenced in function ... * >>>>> >>>>> * Error LNK2019 unresolved external symbol PCGASMSETSUBDOMAINS >>>>> referenced in function ... * >>>>> I'm not sure if the interfaces are missing or if I have a compilation >>>>> problem. >>>>> >>>> >>>> I just want to make sure you really want GASM. It sounded like you >>>> might able to do what you want just with ASM. >>>> Can you tell me again what you want to do overall? >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Thank you again. >>>>> Best, >>>>> Leonardo >>>>> >>>>> Il giorno sab 29 apr 2023 alle ore 20:30 Barry Smith >>>>> ha scritto: >>>>> >>>>>> >>>>>> Thank you for the test code. I have a fix in the branch >>>>>> barry/2023-04-29/fix-pcasmcreatesubdomains2d >>>>>> with >>>>>> merge request https://gitlab.com/petsc/petsc/-/merge_requests/6394 >>>>>> >>>>>> The functions did not have proper Fortran stubs and interfaces so >>>>>> I had to provide them manually in the new branch. >>>>>> >>>>>> Use >>>>>> >>>>>> git fetch >>>>>> git checkout barry/2023-04-29/fix-pcasmcreatesubdomains2d >>>>>> >>>>>> ./configure etc >>>>>> >>>>>> Your now working test code is in src/ksp/ksp/tests/ex71f.F90 I >>>>>> had to change things slightly and I updated the error handling for the >>>>>> latest version. >>>>>> >>>>>> Please let us know if you have any later questions. >>>>>> >>>>>> Barry >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> On Apr 28, 2023, at 12:07 PM, LEONARDO MUTTI < >>>>>> leonardo.mutti01 at universitadipavia.it> wrote: >>>>>> >>>>>> Hello. I am having a hard time understanding the index sets to feed >>>>>> PCGASMSetSubdomains, and I am working in Fortran (as a PETSc novice). 
To >>>>>> get more intuition on how the IS objects behave I tried the following >>>>>> minimal (non) working example, which should tile a 16x16 matrix into 16 >>>>>> square, non-overlapping submatrices: >>>>>> >>>>>> #include >>>>>> #include >>>>>> #include >>>>>> USE petscmat >>>>>> USE petscksp >>>>>> USE petscpc >>>>>> >>>>>> Mat :: A >>>>>> PetscInt :: M, NSubx, dof, overlap, NSub >>>>>> INTEGER :: I,J >>>>>> PetscErrorCode :: ierr >>>>>> PetscScalar :: v >>>>>> KSP :: ksp >>>>>> PC :: pc >>>>>> IS :: subdomains_IS, inflated_IS >>>>>> >>>>>> call PetscInitialize(PETSC_NULL_CHARACTER , ierr) >>>>>> >>>>>> !-----Create a dummy matrix >>>>>> M = 16 >>>>>> call MatCreateAIJ(MPI_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, >>>>>> & M, M, >>>>>> & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, >>>>>> & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, >>>>>> & A, ierr) >>>>>> >>>>>> DO I=1,M >>>>>> DO J=1,M >>>>>> v = I*J >>>>>> CALL MatSetValue (A,I-1,J-1,v, >>>>>> & INSERT_VALUES , ierr) >>>>>> END DO >>>>>> END DO >>>>>> >>>>>> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY , ierr) >>>>>> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY , ierr) >>>>>> >>>>>> !-----Create KSP and PC >>>>>> call KSPCreate(PETSC_COMM_WORLD,ksp, ierr) >>>>>> call KSPSetOperators(ksp,A,A, ierr) >>>>>> call KSPSetType(ksp,"bcgs",ierr) >>>>>> call KSPGetPC(ksp,pc,ierr) >>>>>> call KSPSetUp(ksp, ierr) >>>>>> call PCSetType(pc,PCGASM, ierr) >>>>>> call PCSetUp(pc , ierr) >>>>>> >>>>>> !-----GASM setup >>>>>> NSubx = 4 >>>>>> dof = 1 >>>>>> overlap = 0 >>>>>> >>>>>> call PCGASMCreateSubdomains2D(pc, >>>>>> & M, M, >>>>>> & NSubx, NSubx, >>>>>> & dof, overlap, >>>>>> & NSub, subdomains_IS, inflated_IS, ierr) >>>>>> >>>>>> call ISView(subdomains_IS, PETSC_VIEWER_STDOUT_WORLD, ierr) >>>>>> >>>>>> call KSPDestroy(ksp, ierr) >>>>>> call PetscFinalize(ierr) >>>>>> >>>>>> Running this on one processor, I get NSub = 4. >>>>>> If PCASM and PCASMCreateSubdomains2D are used instead, I get NSub = >>>>>> 16 as expected. >>>>>> Moreover, I get in the end "forrtl: severe (157): Program Exception - >>>>>> access violation". So: >>>>>> 1) why do I get two different results with ASM, and GASM? >>>>>> 2) why do I get access violation and how can I solve this? >>>>>> In fact, in C, subdomains_IS, inflated_IS should pointers to IS >>>>>> objects. As I see on the Fortran interface, the arguments to >>>>>> PCGASMCreateSubdomains2D are IS objects: >>>>>> >>>>>> subroutine PCGASMCreateSubdomains2D(a,b,c,d,e,f,g,h,i,j,z) >>>>>> import tPC,tIS >>>>>> PC a ! PC >>>>>> PetscInt b ! PetscInt >>>>>> PetscInt c ! PetscInt >>>>>> PetscInt d ! PetscInt >>>>>> PetscInt e ! PetscInt >>>>>> PetscInt f ! PetscInt >>>>>> PetscInt g ! PetscInt >>>>>> PetscInt h ! PetscInt >>>>>> IS i ! IS >>>>>> IS j ! IS >>>>>> PetscErrorCode z >>>>>> end subroutine PCGASMCreateSubdomains2D >>>>>> Thus: >>>>>> 3) what should be inside e.g., subdomains_IS? I expect it to contain, >>>>>> for every created subdomain, the list of rows and columns defining the >>>>>> subblock in the matrix, am I right? >>>>>> >>>>>> Context: I have a block-tridiagonal system arising from space-time >>>>>> finite elements, and I want to solve it with GMRES+PCGASM preconditioner, >>>>>> where each overlapping submatrix is on the diagonal and of size 3x3 blocks >>>>>> (and spanning multiple processes). This is PETSc 3.17.1 on Windows. 
>>>>>> >>>>>> Thanks in advance, >>>>>> Leonardo >>>>>> >>>>>> >>>>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From ksl7912 at snu.ac.kr Fri May 5 06:25:12 2023 From: ksl7912 at snu.ac.kr (=?UTF-8?B?wq3qtozsirnrpqwgLyDtlZnsg50gLyDtla3qs7XsmrDso7zqs7XtlZnqs7w=?=) Date: Fri, 5 May 2023 20:25:12 +0900 Subject: [petsc-users] parallel computing error In-Reply-To: References: Message-ID: Dear Matthew Knepley However, I've already installed ScaLAPACK. cd $PETSC_DIR ./configure --download-mpich --with-debugging=0 COPTFLAGS='-O3 -march=native -mtune=native' CXXOPTFLAGS='-O3 -march=native -mtune=native' FOPTFLAGS='-O3 -march=native -mtune=native' --download-mumps -- *download-scalapack* --download-parmetis --download-metis --download-parmetis --download-hpddm --download-slepc Is there some way to use ScaLAPCK? Or, Can I run the part related to MatMatSolve with a single core? 2023? 5? 5? (?) ?? 6:21, Matthew Knepley ?? ??: > On Fri, May 5, 2023 at 3:49?AM ???? / ?? / ??????? > wrote: > >> Dear Barry Smith >> >> Thanks to you, I knew the difference between MATAIJ and MATDENSE. >> >> However, I still have some problems. >> >> There is no problem when I run with a single core. But, MatGetFactor >> error occurs when using multi-core. >> >> Could you give me some advice? >> >> The error message is >> >> [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> [0]PETSC ERROR: See >> https://petsc.org/release/overview/linear_solve_table/ for possible LU >> and Cholesky solvers >> [0]PETSC ERROR: MatSolverType petsc does not support matrix type mpidense >> > > PETSc uses 3rd party packages for parallel dense factorization. You would > need to reconfigure with either ScaLAPACK > or Elemental. > > Thanks, > > Matt > > >> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
>> [0]PETSC ERROR: Petsc Release Version 3.18.5, unknown >> [0]PETSC ERROR: ./app on a arch-linux-c-opt named ubuntu by ksl Fri May >> 5 00:35:23 2023 >> [0]PETSC ERROR: Configure options --download-mpich --with-debugging=0 >> COPTFLAGS="-O3 -march=native -mtune=native" CXXOPTFLAGS="-O3 -march=native >> -mtune=native" FOPTFLAGS="-O3 -march=native -mtune=native" --download-mumps >> --download-scalapack --download-parmetis --download-metis >> --download-parmetis --download-hpddm --download-slepc >> [0]PETSC ERROR: #1 MatGetFactor() at >> /home/ksl/petsc/src/mat/interface/matrix.c:4757 >> [0]PETSC ERROR: #2 main() at >> /home/ksl/Downloads/coding_test/coding/a1.c:66 >> [0]PETSC ERROR: No PETSc Option Table entries >> [0]PETSC ERROR: ----------------End of Error Message -------send entire >> error message to petsc-maint at mcs.anl.gov---------- >> application called MPI_Abort(MPI_COMM_SELF, 92) - process 0 >> >> My code is below: >> >> int main(int argc, char** args) >> { >> Mat A, E, A_temp, A_fac; >> int n = 15; >> PetscInitialize(&argc, &args, NULL, NULL); >> PetscCallMPI(MPI_Comm_size(PETSC_COMM_WORLD, &size)); >> >> PetscCall(MatCreate(PETSC_COMM_WORLD, &A)); >> PetscCall(MatSetType(A,MATDENSE)); >> PetscCall(MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n)); >> PetscCall(MatSetFromOptions(A)); >> PetscCall(MatSetUp(A)); >> // Insert values >> double val; >> for (int i = 0; i < n; i++) { >> for (int j = 0; j < n; j++) { >> if (i == j){ >> val = 2.0; >> } >> else{ >> val = 1.0; >> } >> PetscCall(MatSetValue(A, i, j, val, INSERT_VALUES)); >> } >> } >> PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY)); >> PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY)); >> >> // Make Identity matrix >> PetscCall(MatCreate(PETSC_COMM_WORLD, &E)); >> PetscCall(MatSetType(E,MATDENSE)); >> PetscCall(MatSetSizes(E, PETSC_DECIDE, PETSC_DECIDE, n, n)); >> PetscCall(MatSetFromOptions(E)); >> PetscCall(MatSetUp(E)); >> PetscCall(MatShift(E,1.0)); >> PetscCall(MatAssemblyBegin(E, MAT_FINAL_ASSEMBLY)); >> PetscCall(MatAssemblyEnd(E, MAT_FINAL_ASSEMBLY)); >> >> PetscCall(MatDuplicate(A, MAT_DO_NOT_COPY_VALUES, &A_temp)); >> PetscCall(MatGetFactor(A, MATSOLVERPETSC, MAT_FACTOR_LU, &A_fac)); >> >> IS isr, isc; MatFactorInfo info; >> MatGetOrdering(A, MATORDERINGNATURAL, &isr, &isc); >> PetscCall(MatLUFactorSymbolic(A_fac, A, isr, isc, &info)); >> PetscCall(MatLUFactorNumeric(A_fac, A, &info)); >> MatMatSolve(A_fac, E, A_temp); >> >> PetscCall(MatView(A_temp, PETSC_VIEWER_STDOUT_WORLD)); >> MatDestroy(&A); >> MatDestroy(&A_temp); >> MatDestroy(&A_fac); >> MatDestroy(&E); >> PetscCall(PetscFinalize()); >> } >> >> Best regards >> Seung Lee Kwon >> >> 2023? 5? 4? (?) ?? 10:19, Barry Smith ?? ??: >> >>> >>> The code in ex125.c contains >>> >>> PetscCall(MatCreate(PETSC_COMM_WORLD, &C)); >>> PetscCall(MatSetOptionsPrefix(C, "rhs_")); >>> PetscCall(MatSetSizes(C, m, PETSC_DECIDE, PETSC_DECIDE, nrhs)); >>> PetscCall(MatSetType(C, MATDENSE)); >>> PetscCall(MatSetFromOptions(C)); >>> PetscCall(MatSetUp(C)); >>> >>> This dense parallel matrix is suitable for passing to MatMatSolve() as >>> the right-hand side matrix. Note it is created with PETSC_COMM_WORLD and >>> its type is set to be MATDENSE. >>> >>> You may need to make a sample code by stripping out all the excess >>> code in ex125.c to just create an MATAIJ and MATDENSE and solves with >>> MatMatSolve() to determine why you code does not work. >>> >>> >>> >>> On May 4, 2023, at 3:20 AM, ???? / ?? / ??????? 
>>> wrote: >>> >>> Dear Barry Smith >>> >>> Thank you for your reply. >>> >>> I've already installed MUMPS. >>> >>> And I checked the example you said (ex125.c), I don't understand why the >>> RHS matrix becomes the SeqDense matrix. >>> >>> Could you explain in more detail? >>> >>> Best regards >>> Seung Lee Kwon >>> >>> 2023? 5? 4? (?) ?? 12:08, Barry Smith ?? ??: >>> >>>> >>>> You can configure with MUMPS ./configure --download-mumps >>>> --download-scalapack --download-ptscotch --download-metis >>>> --download-parmetis >>>> >>>> And then use MatMatSolve() as in src/mat/tests/ex125.c with parallel >>>> MatMatSolve() using MUMPS as the solver. >>>> >>>> Barry >>>> >>>> >>>> On May 3, 2023, at 10:29 PM, ???? / ?? / ??????? >>>> wrote: >>>> >>>> Dear developers >>>> >>>> Thank you for your explanation. >>>> >>>> But I should use the MatCreateSeqDense because I want to use the >>>> MatMatSolve that B matrix must be a SeqDense matrix. >>>> >>>> Using MatMatSolve is an inevitable part of my code. >>>> >>>> Could you give me a comment to avoid this error? >>>> >>>> Best, >>>> >>>> Seung Lee Kwon >>>> >>>> 2023? 5? 3? (?) ?? 7:30, Matthew Knepley ?? ??: >>>> >>>>> On Wed, May 3, 2023 at 6:05?AM ???? / ?? / ??????? >>>>> wrote: >>>>> >>>>>> Dear developers >>>>>> >>>>>> I'm trying to use parallel computing and I ran the command 'mpirun >>>>>> -np 4 ./app' >>>>>> >>>>>> In this case, there are two problems. >>>>>> >>>>>> *First,* I encountered error message >>>>>> /// >>>>>> [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message >>>>>> -------------------------------------------------------------- >>>>>> [1]PETSC ERROR: Invalid argument >>>>>> [1]PETSC ERROR: Comm must be of size 1 >>>>>> /// >>>>>> The code on the error position is >>>>>> MatCreateSeqDense(PETSC_COMM_SELF, nns, ns, NULL, &Kns)); >>>>>> >>>>> >>>>> 1) "Seq" means sequential, that is "not parallel". >>>>> >>>>> 2) This line should still be fine since PETSC_COMM_SELF is a serial >>>>> communicator >>>>> >>>>> 3) You should be checking the error code for each call, maybe using >>>>> the CHKERRQ() macro >>>>> >>>>> 4) Please always send the entire error message, not a snippet >>>>> >>>>> THanks >>>>> >>>>> Matt >>>>> >>>>> >>>>>> Could "MatCreateSeqDense" not be used in parallel computing? >>>>>> >>>>>> *Second*, the same error message is repeated as many times as the >>>>>> number of cores. >>>>>> if I use command -np 4, then the error message is repeated 4 times. >>>>>> Could you recommend some advice related to this? >>>>>> >>>>>> Best, >>>>>> Seung Lee Kwon >>>>>> >>>>>> -- >>>>>> Seung Lee Kwon, Ph.D.Candidate >>>>>> Aerospace Structures and Materials Laboratory >>>>>> Department of Mechanical and Aerospace Engineering >>>>>> Seoul National University >>>>>> Building 300 Rm 503, Gwanak-ro 1, Gwanak-gu, Seoul, South Korea, 08826 >>>>>> E-mail : ksl7912 at snu.ac.kr >>>>>> Office : +82-2-880-7389 >>>>>> C. P : +82-10-4695-1062 >>>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. 
>>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>> >>>> >>>> -- >>>> Seung Lee Kwon, Ph.D.Candidate >>>> Aerospace Structures and Materials Laboratory >>>> Department of Mechanical and Aerospace Engineering >>>> Seoul National University >>>> Building 300 Rm 503, Gwanak-ro 1, Gwanak-gu, Seoul, South Korea, 08826 >>>> E-mail : ksl7912 at snu.ac.kr >>>> Office : +82-2-880-7389 >>>> C. P : +82-10-4695-1062 >>>> >>>> >>>> >>> >>> -- >>> Seung Lee Kwon, Ph.D.Candidate >>> Aerospace Structures and Materials Laboratory >>> Department of Mechanical and Aerospace Engineering >>> Seoul National University >>> Building 300 Rm 503, Gwanak-ro 1, Gwanak-gu, Seoul, South Korea, 08826 >>> E-mail : ksl7912 at snu.ac.kr >>> Office : +82-2-880-7389 >>> C. P : +82-10-4695-1062 >>> >>> >>> >> >> -- >> Seung Lee Kwon, Ph.D.Candidate >> Aerospace Structures and Materials Laboratory >> Department of Mechanical and Aerospace Engineering >> Seoul National University >> Building 300 Rm 503, Gwanak-ro 1, Gwanak-gu, Seoul, South Korea, 08826 >> E-mail : ksl7912 at snu.ac.kr >> Office : +82-2-880-7389 >> C. P : +82-10-4695-1062 >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- Seung Lee Kwon, Ph.D.Candidate Aerospace Structures and Materials Laboratory Department of Mechanical and Aerospace Engineering Seoul National University Building 300 Rm 503, Gwanak-ro 1, Gwanak-gu, Seoul, South Korea, 08826 E-mail : ksl7912 at snu.ac.kr Office : +82-2-880-7389 C. P : +82-10-4695-1062 -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre.jolivet at lip6.fr Fri May 5 06:35:10 2023 From: pierre.jolivet at lip6.fr (Pierre Jolivet) Date: Fri, 5 May 2023 13:35:10 +0200 Subject: [petsc-users] parallel computing error In-Reply-To: References: Message-ID: > On 5 May 2023, at 1:25 PM, ???? / ?? / ??????? wrote: > > Dear Matthew Knepley > > However, I've already installed ScaLAPACK. > cd $PETSC_DIR > ./configure --download-mpich --with-debugging=0 COPTFLAGS='-O3 -march=native -mtune=native' CXXOPTFLAGS='-O3 -march=native -mtune=native' FOPTFLAGS='-O3 -march=native -mtune=native' --download-mumps --download-scalapack --download-parmetis --download-metis --download-parmetis --download-hpddm --download-slepc > > Is there some way to use ScaLAPCK? You need to convert your MatDense to MatSCALAPACK before the call to MatMatSolve(). This library (ScaLAPACK, but also Elemental) has severe limitations with respect to the matrix distribution. Depending on what you are doing, you may be better of using KSPMatSolve() and computing only an approximation of the solution with a cheap preconditioner (I don?t recall you telling us why you need to do such an operation even though we told you it was not practical ? or maybe I?m being confused by another thread). Thanks, Pierre > Or, Can I run the part related to MatMatSolve with a single core? > > 2023? 5? 5? (?) ?? 6:21, Matthew Knepley >?? ??: >> On Fri, May 5, 2023 at 3:49?AM ???? / ?? / ??????? > wrote: >>> Dear Barry Smith >>> >>> Thanks to you, I knew the difference between MATAIJ and MATDENSE. >>> >>> However, I still have some problems. >>> >>> There is no problem when I run with a single core. But, MatGetFactor error occurs when using multi-core. >>> >>> Could you give me some advice? 
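For what it is worth, a minimal sketch of the conversion Pierre describes above, written against the variables A, E, A_temp, A_fac, isr, isc, and info from the code quoted below. MATSCALAPACK and MATSOLVERSCALAPACK are existing PETSc type names, but whether MatConvert() handles the MATDENSE-to-MATSCALAPACK conversion in this particular build, and whether the dense right-hand side E is accepted by MatMatSolve() without also being converted, are assumptions to verify:

Mat A_sc; /* ScaLAPACK copy of the parallel dense matrix */
PetscCall(MatConvert(A, MATSCALAPACK, MAT_INITIAL_MATRIX, &A_sc));
PetscCall(MatGetFactor(A_sc, MATSOLVERSCALAPACK, MAT_FACTOR_LU, &A_fac));
PetscCall(MatFactorInfoInitialize(&info));
PetscCall(MatGetOrdering(A, MATORDERINGNATURAL, &isr, &isc));
PetscCall(MatLUFactorSymbolic(A_fac, A_sc, isr, isc, &info));
PetscCall(MatLUFactorNumeric(A_fac, A_sc, &info));
PetscCall(MatMatSolve(A_fac, E, A_temp)); /* convert E and A_temp the same way if the dense RHS type is rejected */
PetscCall(MatDestroy(&A_sc));

With E set to the identity, A_temp then holds the same explicit inverse as in the serial run, just distributed; the factorization itself remains as expensive as before.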
>>> >>> The error message is >>> >>> [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >>> [0]PETSC ERROR: See https://petsc.org/release/overview/linear_solve_table/ for possible LU and Cholesky solvers >>> [0]PETSC ERROR: MatSolverType petsc does not support matrix type mpidense >> >> PETSc uses 3rd party packages for parallel dense factorization. You would need to reconfigure with either ScaLAPACK >> or Elemental. >> >> Thanks, >> >> Matt >> >>> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. >>> [0]PETSC ERROR: Petsc Release Version 3.18.5, unknown >>> [0]PETSC ERROR: ./app on a arch-linux-c-opt named ubuntu by ksl Fri May 5 00:35:23 2023 >>> [0]PETSC ERROR: Configure options --download-mpich --with-debugging=0 COPTFLAGS="-O3 -march=native -mtune=native" CXXOPTFLAGS="-O3 -march=native -mtune=native" FOPTFLAGS="-O3 -march=native -mtune=native" --download-mumps --download-scalapack --download-parmetis --download-metis --download-parmetis --download-hpddm --download-slepc >>> [0]PETSC ERROR: #1 MatGetFactor() at /home/ksl/petsc/src/mat/interface/matrix.c:4757 >>> [0]PETSC ERROR: #2 main() at /home/ksl/Downloads/coding_test/coding/a1.c:66 >>> [0]PETSC ERROR: No PETSc Option Table entries >>> [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- >>> application called MPI_Abort(MPI_COMM_SELF, 92) - process 0 >>> >>> My code is below: >>> >>> int main(int argc, char** args) >>> { >>> Mat A, E, A_temp, A_fac; >>> int n = 15; >>> PetscInitialize(&argc, &args, NULL, NULL); >>> PetscCallMPI(MPI_Comm_size(PETSC_COMM_WORLD, &size)); >>> >>> PetscCall(MatCreate(PETSC_COMM_WORLD, &A)); >>> PetscCall(MatSetType(A,MATDENSE)); >>> PetscCall(MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n)); >>> PetscCall(MatSetFromOptions(A)); >>> PetscCall(MatSetUp(A)); >>> // Insert values >>> double val; >>> for (int i = 0; i < n; i++) { >>> for (int j = 0; j < n; j++) { >>> if (i == j){ >>> val = 2.0; >>> } >>> else{ >>> val = 1.0; >>> } >>> PetscCall(MatSetValue(A, i, j, val, INSERT_VALUES)); >>> } >>> } >>> PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY)); >>> PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY)); >>> >>> // Make Identity matrix >>> PetscCall(MatCreate(PETSC_COMM_WORLD, &E)); >>> PetscCall(MatSetType(E,MATDENSE)); >>> PetscCall(MatSetSizes(E, PETSC_DECIDE, PETSC_DECIDE, n, n)); >>> PetscCall(MatSetFromOptions(E)); >>> PetscCall(MatSetUp(E)); >>> PetscCall(MatShift(E,1.0)); >>> PetscCall(MatAssemblyBegin(E, MAT_FINAL_ASSEMBLY)); >>> PetscCall(MatAssemblyEnd(E, MAT_FINAL_ASSEMBLY)); >>> >>> PetscCall(MatDuplicate(A, MAT_DO_NOT_COPY_VALUES, &A_temp)); >>> PetscCall(MatGetFactor(A, MATSOLVERPETSC, MAT_FACTOR_LU, &A_fac)); >>> >>> IS isr, isc; MatFactorInfo info; >>> MatGetOrdering(A, MATORDERINGNATURAL, &isr, &isc); >>> PetscCall(MatLUFactorSymbolic(A_fac, A, isr, isc, &info)); >>> PetscCall(MatLUFactorNumeric(A_fac, A, &info)); >>> MatMatSolve(A_fac, E, A_temp); >>> >>> PetscCall(MatView(A_temp, PETSC_VIEWER_STDOUT_WORLD)); >>> MatDestroy(&A); >>> MatDestroy(&A_temp); >>> MatDestroy(&A_fac); >>> MatDestroy(&E); >>> PetscCall(PetscFinalize()); >>> } >>> >>> Best regards >>> Seung Lee Kwon >>> >>> 2023? 5? 4? (?) ?? 10:19, Barry Smith >?? 
??: >>>> >>>> The code in ex125.c contains >>>> >>>> PetscCall(MatCreate(PETSC_COMM_WORLD, &C)); >>>> PetscCall(MatSetOptionsPrefix(C, "rhs_")); >>>> PetscCall(MatSetSizes(C, m, PETSC_DECIDE, PETSC_DECIDE, nrhs)); >>>> PetscCall(MatSetType(C, MATDENSE)); >>>> PetscCall(MatSetFromOptions(C)); >>>> PetscCall(MatSetUp(C)); >>>> >>>> This dense parallel matrix is suitable for passing to MatMatSolve() as the right-hand side matrix. Note it is created with PETSC_COMM_WORLD and its type is set to be MATDENSE. >>>> >>>> You may need to make a sample code by stripping out all the excess code in ex125.c to just create an MATAIJ and MATDENSE and solves with MatMatSolve() to determine why you code does not work. >>>> >>>> >>>> >>>>> On May 4, 2023, at 3:20 AM, ???? / ?? / ??????? > wrote: >>>>> >>>>> Dear Barry Smith >>>>> >>>>> Thank you for your reply. >>>>> >>>>> I've already installed MUMPS. >>>>> >>>>> And I checked the example you said (ex125.c), I don't understand why the RHS matrix becomes the SeqDense matrix. >>>>> >>>>> Could you explain in more detail? >>>>> >>>>> Best regards >>>>> Seung Lee Kwon >>>>> >>>>> 2023? 5? 4? (?) ?? 12:08, Barry Smith >?? ??: >>>>>> >>>>>> You can configure with MUMPS ./configure --download-mumps --download-scalapack --download-ptscotch --download-metis --download-parmetis >>>>>> >>>>>> And then use MatMatSolve() as in src/mat/tests/ex125.c with parallel MatMatSolve() using MUMPS as the solver. >>>>>> >>>>>> Barry >>>>>> >>>>>> >>>>>>> On May 3, 2023, at 10:29 PM, ???? / ?? / ??????? > wrote: >>>>>>> >>>>>>> Dear developers >>>>>>> >>>>>>> Thank you for your explanation. >>>>>>> >>>>>>> But I should use the MatCreateSeqDense because I want to use the MatMatSolve that B matrix must be a SeqDense matrix. >>>>>>> >>>>>>> Using MatMatSolve is an inevitable part of my code. >>>>>>> >>>>>>> Could you give me a comment to avoid this error? >>>>>>> >>>>>>> Best, >>>>>>> >>>>>>> Seung Lee Kwon >>>>>>> >>>>>>> 2023? 5? 3? (?) ?? 7:30, Matthew Knepley >?? ??: >>>>>>>> On Wed, May 3, 2023 at 6:05?AM ???? / ?? / ??????? > wrote: >>>>>>>>> Dear developers >>>>>>>>> >>>>>>>>> I'm trying to use parallel computing and I ran the command 'mpirun -np 4 ./app' >>>>>>>>> >>>>>>>>> In this case, there are two problems. >>>>>>>>> >>>>>>>>> First, I encountered error message >>>>>>>>> /// >>>>>>>>> [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >>>>>>>>> [1]PETSC ERROR: Invalid argument >>>>>>>>> [1]PETSC ERROR: Comm must be of size 1 >>>>>>>>> /// >>>>>>>>> The code on the error position is >>>>>>>>> MatCreateSeqDense(PETSC_COMM_SELF, nns, ns, NULL, &Kns)); >>>>>>>> >>>>>>>> 1) "Seq" means sequential, that is "not parallel". >>>>>>>> >>>>>>>> 2) This line should still be fine since PETSC_COMM_SELF is a serial communicator >>>>>>>> >>>>>>>> 3) You should be checking the error code for each call, maybe using the CHKERRQ() macro >>>>>>>> >>>>>>>> 4) Please always send the entire error message, not a snippet >>>>>>>> >>>>>>>> THanks >>>>>>>> >>>>>>>> Matt >>>>>>>> >>>>>>>>> Could "MatCreateSeqDense" not be used in parallel computing? >>>>>>>>> >>>>>>>>> Second, the same error message is repeated as many times as the number of cores. >>>>>>>>> if I use command -np 4, then the error message is repeated 4 times. >>>>>>>>> Could you recommend some advice related to this? 
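Pierre's alternative above, solving against the whole block of right-hand sides iteratively with KSPMatSolve() instead of factoring the dense matrix, could look roughly like the following for the A, E, and A_temp of this thread. KSPMatSolve() exists in 3.18; the GMRES plus Jacobi combination here is only a placeholder, and the result is an approximation governed by the KSP tolerances, not the exact inverse:

KSP ksp;
PC  pc;
PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
PetscCall(KSPSetOperators(ksp, A, A));
PetscCall(KSPSetType(ksp, KSPGMRES)); /* placeholder solver */
PetscCall(KSPGetPC(ksp, &pc));
PetscCall(PCSetType(pc, PCJACOBI));   /* placeholder, cheap preconditioner */
PetscCall(KSPSetFromOptions(ksp));
PetscCall(KSPMatSolve(ksp, E, A_temp)); /* approximates A_temp = inv(A)*E, one column of E at a time */
PetscCall(KSPDestroy(&ksp));

Because KSPSetFromOptions() is called, the usual command-line options such as -ksp_rtol and -ksp_monitor apply, so the accuracy of the approximate inverse can be tuned without recompiling.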
>>>>>>>>> >>>>>>>>> Best, >>>>>>>>> Seung Lee Kwon >>>>>>>>> >>>>>>>>> -- >>>>>>>>> Seung Lee Kwon, Ph.D.Candidate >>>>>>>>> Aerospace Structures and Materials Laboratory >>>>>>>>> Department of Mechanical and Aerospace Engineering >>>>>>>>> Seoul National University >>>>>>>>> Building 300 Rm 503, Gwanak-ro 1, Gwanak-gu, Seoul, South Korea, 08826 >>>>>>>>> E-mail : ksl7912 at snu.ac.kr >>>>>>>>> Office : +82-2-880-7389 >>>>>>>>> C. P : +82-10-4695-1062 >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>>>> -- Norbert Wiener >>>>>>>> >>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Seung Lee Kwon, Ph.D.Candidate >>>>>>> Aerospace Structures and Materials Laboratory >>>>>>> Department of Mechanical and Aerospace Engineering >>>>>>> Seoul National University >>>>>>> Building 300 Rm 503, Gwanak-ro 1, Gwanak-gu, Seoul, South Korea, 08826 >>>>>>> E-mail : ksl7912 at snu.ac.kr >>>>>>> Office : +82-2-880-7389 >>>>>>> C. P : +82-10-4695-1062 >>>>>> >>>>> >>>>> >>>>> -- >>>>> Seung Lee Kwon, Ph.D.Candidate >>>>> Aerospace Structures and Materials Laboratory >>>>> Department of Mechanical and Aerospace Engineering >>>>> Seoul National University >>>>> Building 300 Rm 503, Gwanak-ro 1, Gwanak-gu, Seoul, South Korea, 08826 >>>>> E-mail : ksl7912 at snu.ac.kr >>>>> Office : +82-2-880-7389 >>>>> C. P : +82-10-4695-1062 >>>> >>> >>> >>> -- >>> Seung Lee Kwon, Ph.D.Candidate >>> Aerospace Structures and Materials Laboratory >>> Department of Mechanical and Aerospace Engineering >>> Seoul National University >>> Building 300 Rm 503, Gwanak-ro 1, Gwanak-gu, Seoul, South Korea, 08826 >>> E-mail : ksl7912 at snu.ac.kr >>> Office : +82-2-880-7389 >>> C. P : +82-10-4695-1062 >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ > > > -- > Seung Lee Kwon, Ph.D.Candidate > Aerospace Structures and Materials Laboratory > Department of Mechanical and Aerospace Engineering > Seoul National University > Building 300 Rm 503, Gwanak-ro 1, Gwanak-gu, Seoul, South Korea, 08826 > E-mail : ksl7912 at snu.ac.kr > Office : +82-2-880-7389 > C. P : +82-10-4695-1062 -------------- next part -------------- An HTML attachment was scrubbed... URL: From ksl7912 at snu.ac.kr Fri May 5 07:00:19 2023 From: ksl7912 at snu.ac.kr (=?UTF-8?B?wq3qtozsirnrpqwgLyDtlZnsg50gLyDtla3qs7XsmrDso7zqs7XtlZnqs7w=?=) Date: Fri, 5 May 2023 21:00:19 +0900 Subject: [petsc-users] parallel computing error In-Reply-To: References: Message-ID: Dear Pierre Jolivet Thank you for your explanation. I will try to use a converting matrix. I know it's really inefficient, but I need an inverse matrix (inv(A)) itself for my research. If parallel computing is difficult to get inv(A), can I run the part related to MatMatSolve with a single core? Best, Seung Lee Kwon 2023? 5? 5? (?) ?? 8:35, Pierre Jolivet ?? ??: > > > On 5 May 2023, at 1:25 PM, ???? / ?? / ??????? wrote: > > Dear Matthew Knepley > > However, I've already installed ScaLAPACK. 
> cd $PETSC_DIR > ./configure --download-mpich --with-debugging=0 COPTFLAGS='-O3 > -march=native -mtune=native' CXXOPTFLAGS='-O3 -march=native -mtune=native' > FOPTFLAGS='-O3 -march=native -mtune=native' --download-mumps -- > *download-scalapack* --download-parmetis --download-metis > --download-parmetis --download-hpddm --download-slepc > > Is there some way to use ScaLAPCK? > > > You need to convert your MatDense to MatSCALAPACK before the call to > MatMatSolve(). > This library (ScaLAPACK, but also Elemental) has severe limitations with > respect to the matrix distribution. > Depending on what you are doing, you may be better of using KSPMatSolve() > and computing only an approximation of the solution with a cheap > preconditioner (I don?t recall you telling us why you need to do such an > operation even though we told you it was not practical ? or maybe I?m being > confused by another thread). > > Thanks, > Pierre > > Or, Can I run the part related to MatMatSolve with a single core? > > 2023? 5? 5? (?) ?? 6:21, Matthew Knepley ?? ??: > >> On Fri, May 5, 2023 at 3:49?AM ???? / ?? / ??????? >> wrote: >> >>> Dear Barry Smith >>> >>> Thanks to you, I knew the difference between MATAIJ and MATDENSE. >>> >>> However, I still have some problems. >>> >>> There is no problem when I run with a single core. But, MatGetFactor >>> error occurs when using multi-core. >>> >>> Could you give me some advice? >>> >>> The error message is >>> >>> [0]PETSC ERROR: --------------------- Error Message >>> -------------------------------------------------------------- >>> [0]PETSC ERROR: See >>> https://petsc.org/release/overview/linear_solve_table/ for possible LU >>> and Cholesky solvers >>> [0]PETSC ERROR: MatSolverType petsc does not support matrix type mpidense >>> >> >> PETSc uses 3rd party packages for parallel dense factorization. You would >> need to reconfigure with either ScaLAPACK >> or Elemental. >> >> Thanks, >> >> Matt >> >> >>> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
>>> [0]PETSC ERROR: Petsc Release Version 3.18.5, unknown >>> [0]PETSC ERROR: ./app on a arch-linux-c-opt named ubuntu by ksl Fri May >>> 5 00:35:23 2023 >>> [0]PETSC ERROR: Configure options --download-mpich --with-debugging=0 >>> COPTFLAGS="-O3 -march=native -mtune=native" CXXOPTFLAGS="-O3 -march=native >>> -mtune=native" FOPTFLAGS="-O3 -march=native -mtune=native" --download-mumps >>> --download-scalapack --download-parmetis --download-metis >>> --download-parmetis --download-hpddm --download-slepc >>> [0]PETSC ERROR: #1 MatGetFactor() at >>> /home/ksl/petsc/src/mat/interface/matrix.c:4757 >>> [0]PETSC ERROR: #2 main() at >>> /home/ksl/Downloads/coding_test/coding/a1.c:66 >>> [0]PETSC ERROR: No PETSc Option Table entries >>> [0]PETSC ERROR: ----------------End of Error Message -------send entire >>> error message to petsc-maint at mcs.anl.gov---------- >>> application called MPI_Abort(MPI_COMM_SELF, 92) - process 0 >>> >>> My code is below: >>> >>> int main(int argc, char** args) >>> { >>> Mat A, E, A_temp, A_fac; >>> int n = 15; >>> PetscInitialize(&argc, &args, NULL, NULL); >>> PetscCallMPI(MPI_Comm_size(PETSC_COMM_WORLD, &size)); >>> >>> PetscCall(MatCreate(PETSC_COMM_WORLD, &A)); >>> PetscCall(MatSetType(A,MATDENSE)); >>> PetscCall(MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n)); >>> PetscCall(MatSetFromOptions(A)); >>> PetscCall(MatSetUp(A)); >>> // Insert values >>> double val; >>> for (int i = 0; i < n; i++) { >>> for (int j = 0; j < n; j++) { >>> if (i == j){ >>> val = 2.0; >>> } >>> else{ >>> val = 1.0; >>> } >>> PetscCall(MatSetValue(A, i, j, val, INSERT_VALUES)); >>> } >>> } >>> PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY)); >>> PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY)); >>> >>> // Make Identity matrix >>> PetscCall(MatCreate(PETSC_COMM_WORLD, &E)); >>> PetscCall(MatSetType(E,MATDENSE)); >>> PetscCall(MatSetSizes(E, PETSC_DECIDE, PETSC_DECIDE, n, n)); >>> PetscCall(MatSetFromOptions(E)); >>> PetscCall(MatSetUp(E)); >>> PetscCall(MatShift(E,1.0)); >>> PetscCall(MatAssemblyBegin(E, MAT_FINAL_ASSEMBLY)); >>> PetscCall(MatAssemblyEnd(E, MAT_FINAL_ASSEMBLY)); >>> >>> PetscCall(MatDuplicate(A, MAT_DO_NOT_COPY_VALUES, &A_temp)); >>> PetscCall(MatGetFactor(A, MATSOLVERPETSC, MAT_FACTOR_LU, &A_fac)); >>> >>> IS isr, isc; MatFactorInfo info; >>> MatGetOrdering(A, MATORDERINGNATURAL, &isr, &isc); >>> PetscCall(MatLUFactorSymbolic(A_fac, A, isr, isc, &info)); >>> PetscCall(MatLUFactorNumeric(A_fac, A, &info)); >>> MatMatSolve(A_fac, E, A_temp); >>> >>> PetscCall(MatView(A_temp, PETSC_VIEWER_STDOUT_WORLD)); >>> MatDestroy(&A); >>> MatDestroy(&A_temp); >>> MatDestroy(&A_fac); >>> MatDestroy(&E); >>> PetscCall(PetscFinalize()); >>> } >>> >>> Best regards >>> Seung Lee Kwon >>> >>> 2023? 5? 4? (?) ?? 10:19, Barry Smith ?? ??: >>> >>>> >>>> The code in ex125.c contains >>>> >>>> PetscCall(MatCreate(PETSC_COMM_WORLD, &C)); >>>> PetscCall(MatSetOptionsPrefix(C, "rhs_")); >>>> PetscCall(MatSetSizes(C, m, PETSC_DECIDE, PETSC_DECIDE, nrhs)); >>>> PetscCall(MatSetType(C, MATDENSE)); >>>> PetscCall(MatSetFromOptions(C)); >>>> PetscCall(MatSetUp(C)); >>>> >>>> This dense parallel matrix is suitable for passing to MatMatSolve() as >>>> the right-hand side matrix. Note it is created with PETSC_COMM_WORLD and >>>> its type is set to be MATDENSE. >>>> >>>> You may need to make a sample code by stripping out all the excess >>>> code in ex125.c to just create an MATAIJ and MATDENSE and solves with >>>> MatMatSolve() to determine why you code does not work. 
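Along the lines of that suggestion, here is a rough, untested sketch of such a stripped-down test: a parallel MATAIJ matrix, a parallel MATDENSE right-hand side, and MatMatSolve() through MUMPS. The tridiagonal matrix and the 15 x 15 sizes are made up purely for illustration:

#include <petscmat.h>

int main(int argc, char **args)
{
  Mat           A, B, X, F;
  MatFactorInfo info;
  IS            isr, isc;
  PetscInt      i, rstart, rend, n = 15;

  PetscCall(PetscInitialize(&argc, &args, NULL, NULL));

  /* Distributed sparse (AIJ) test matrix: tridiagonal with 2 on the diagonal */
  PetscCall(MatCreateAIJ(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, n, n, 3, NULL, 3, NULL, &A));
  PetscCall(MatGetOwnershipRange(A, &rstart, &rend));
  for (i = rstart; i < rend; i++) {
    PetscCall(MatSetValue(A, i, i, 2.0, INSERT_VALUES));
    if (i > 0) PetscCall(MatSetValue(A, i, i - 1, -1.0, INSERT_VALUES));
    if (i < n - 1) PetscCall(MatSetValue(A, i, i + 1, -1.0, INSERT_VALUES));
  }
  PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
  PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));

  /* Distributed dense right-hand side: the identity, so X becomes inv(A) */
  PetscCall(MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, n, n, NULL, &B));
  PetscCall(MatGetOwnershipRange(B, &rstart, &rend));
  for (i = rstart; i < rend; i++) PetscCall(MatSetValue(B, i, i, 1.0, INSERT_VALUES));
  PetscCall(MatAssemblyBegin(B, MAT_FINAL_ASSEMBLY));
  PetscCall(MatAssemblyEnd(B, MAT_FINAL_ASSEMBLY));
  PetscCall(MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, &X));

  /* Parallel LU through MUMPS, then a multi-right-hand-side solve */
  PetscCall(MatGetFactor(A, MATSOLVERMUMPS, MAT_FACTOR_LU, &F));
  PetscCall(MatGetOrdering(A, MATORDERINGNATURAL, &isr, &isc));
  PetscCall(MatFactorInfoInitialize(&info));
  PetscCall(MatLUFactorSymbolic(F, A, isr, isc, &info));
  PetscCall(MatLUFactorNumeric(F, A, &info));
  PetscCall(MatMatSolve(F, B, X));
  PetscCall(MatView(X, PETSC_VIEWER_STDOUT_WORLD));

  PetscCall(ISDestroy(&isr));
  PetscCall(ISDestroy(&isc));
  PetscCall(MatDestroy(&A));
  PetscCall(MatDestroy(&B));
  PetscCall(MatDestroy(&X));
  PetscCall(MatDestroy(&F));
  PetscCall(PetscFinalize());
  return 0;
}

If something like this runs under mpirun -np 4 while the full application fails, the difference is in how the application sets up its matrices rather than in MatMatSolve() itself.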
>>>> >>>> >>>> >>>> On May 4, 2023, at 3:20 AM, ???? / ?? / ??????? >>>> wrote: >>>> >>>> Dear Barry Smith >>>> >>>> Thank you for your reply. >>>> >>>> I've already installed MUMPS. >>>> >>>> And I checked the example you said (ex125.c), I don't understand why >>>> the RHS matrix becomes the SeqDense matrix. >>>> >>>> Could you explain in more detail? >>>> >>>> Best regards >>>> Seung Lee Kwon >>>> >>>> 2023? 5? 4? (?) ?? 12:08, Barry Smith ?? ??: >>>> >>>>> >>>>> You can configure with MUMPS ./configure --download-mumps >>>>> --download-scalapack --download-ptscotch --download-metis >>>>> --download-parmetis >>>>> >>>>> And then use MatMatSolve() as in src/mat/tests/ex125.c with >>>>> parallel MatMatSolve() using MUMPS as the solver. >>>>> >>>>> Barry >>>>> >>>>> >>>>> On May 3, 2023, at 10:29 PM, ???? / ?? / ??????? >>>>> wrote: >>>>> >>>>> Dear developers >>>>> >>>>> Thank you for your explanation. >>>>> >>>>> But I should use the MatCreateSeqDense because I want to use the >>>>> MatMatSolve that B matrix must be a SeqDense matrix. >>>>> >>>>> Using MatMatSolve is an inevitable part of my code. >>>>> >>>>> Could you give me a comment to avoid this error? >>>>> >>>>> Best, >>>>> >>>>> Seung Lee Kwon >>>>> >>>>> 2023? 5? 3? (?) ?? 7:30, Matthew Knepley ?? ??: >>>>> >>>>>> On Wed, May 3, 2023 at 6:05?AM ???? / ?? / ??????? >>>>>> wrote: >>>>>> >>>>>>> Dear developers >>>>>>> >>>>>>> I'm trying to use parallel computing and I ran the command 'mpirun >>>>>>> -np 4 ./app' >>>>>>> >>>>>>> In this case, there are two problems. >>>>>>> >>>>>>> *First,* I encountered error message >>>>>>> /// >>>>>>> [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message >>>>>>> -------------------------------------------------------------- >>>>>>> [1]PETSC ERROR: Invalid argument >>>>>>> [1]PETSC ERROR: Comm must be of size 1 >>>>>>> /// >>>>>>> The code on the error position is >>>>>>> MatCreateSeqDense(PETSC_COMM_SELF, nns, ns, NULL, &Kns)); >>>>>>> >>>>>> >>>>>> 1) "Seq" means sequential, that is "not parallel". >>>>>> >>>>>> 2) This line should still be fine since PETSC_COMM_SELF is a serial >>>>>> communicator >>>>>> >>>>>> 3) You should be checking the error code for each call, maybe using >>>>>> the CHKERRQ() macro >>>>>> >>>>>> 4) Please always send the entire error message, not a snippet >>>>>> >>>>>> THanks >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>>> Could "MatCreateSeqDense" not be used in parallel computing? >>>>>>> >>>>>>> *Second*, the same error message is repeated as many times as the >>>>>>> number of cores. >>>>>>> if I use command -np 4, then the error message is repeated 4 times. >>>>>>> Could you recommend some advice related to this? >>>>>>> >>>>>>> Best, >>>>>>> Seung Lee Kwon >>>>>>> >>>>>>> -- >>>>>>> Seung Lee Kwon, Ph.D.Candidate >>>>>>> Aerospace Structures and Materials Laboratory >>>>>>> Department of Mechanical and Aerospace Engineering >>>>>>> Seoul National University >>>>>>> Building 300 Rm 503, Gwanak-ro 1, Gwanak-gu, Seoul, South Korea, >>>>>>> 08826 >>>>>>> E-mail : ksl7912 at snu.ac.kr >>>>>>> Office : +82-2-880-7389 >>>>>>> C. P : +82-10-4695-1062 >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. 
>>>>>> -- Norbert Wiener >>>>>> >>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>> >>>>>> >>>>> >>>>> >>>>> -- >>>>> Seung Lee Kwon, Ph.D.Candidate >>>>> Aerospace Structures and Materials Laboratory >>>>> Department of Mechanical and Aerospace Engineering >>>>> Seoul National University >>>>> Building 300 Rm 503, Gwanak-ro 1, Gwanak-gu, Seoul, South Korea, 08826 >>>>> E-mail : ksl7912 at snu.ac.kr >>>>> Office : +82-2-880-7389 >>>>> C. P : +82-10-4695-1062 >>>>> >>>>> >>>>> >>>> >>>> -- >>>> Seung Lee Kwon, Ph.D.Candidate >>>> Aerospace Structures and Materials Laboratory >>>> Department of Mechanical and Aerospace Engineering >>>> Seoul National University >>>> Building 300 Rm 503, Gwanak-ro 1, Gwanak-gu, Seoul, South Korea, 08826 >>>> E-mail : ksl7912 at snu.ac.kr >>>> Office : +82-2-880-7389 >>>> C. P : +82-10-4695-1062 >>>> >>>> >>>> >>> >>> -- >>> Seung Lee Kwon, Ph.D.Candidate >>> Aerospace Structures and Materials Laboratory >>> Department of Mechanical and Aerospace Engineering >>> Seoul National University >>> Building 300 Rm 503, Gwanak-ro 1, Gwanak-gu, Seoul, South Korea, 08826 >>> E-mail : ksl7912 at snu.ac.kr >>> Office : +82-2-880-7389 >>> C. P : +82-10-4695-1062 >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > > > -- > Seung Lee Kwon, Ph.D.Candidate > Aerospace Structures and Materials Laboratory > Department of Mechanical and Aerospace Engineering > Seoul National University > Building 300 Rm 503, Gwanak-ro 1, Gwanak-gu, Seoul, South Korea, 08826 > E-mail : ksl7912 at snu.ac.kr > Office : +82-2-880-7389 > C. P : +82-10-4695-1062 > > > -- Seung Lee Kwon, Ph.D.Candidate Aerospace Structures and Materials Laboratory Department of Mechanical and Aerospace Engineering Seoul National University Building 300 Rm 503, Gwanak-ro 1, Gwanak-gu, Seoul, South Korea, 08826 E-mail : ksl7912 at snu.ac.kr Office : +82-2-880-7389 C. P : +82-10-4695-1062 -------------- next part -------------- An HTML attachment was scrubbed... URL: From mlohry at gmail.com Fri May 5 07:11:37 2023 From: mlohry at gmail.com (Mark Lohry) Date: Fri, 5 May 2023 08:11:37 -0400 Subject: [petsc-users] sources of floating point randomness in JFNK in serial In-Reply-To: References: <5318727D-B9F9-48BC-A7CE-94EBDB08566F@petsc.dev> Message-ID: On Thu, May 4, 2023 at 9:51?PM Barry Smith wrote: > > Send configure.log > > > On May 4, 2023, at 5:35 PM, Mark Lohry wrote: > > Sure, but why only once and why save to disk? Why not just use that >> computed approximate Jacobian at each Newton step to drive the Newton >> solves along for a bunch of time steps? > > > Ah I get what you mean. Okay I did three newton steps with the same LHS, > with a few repeated manual tests. 3 out of 4 times i got the same exact > history. is it in the realm of possibility that a hardware error could > cause something this subtle, bad memory bit or something? > > 2 runs of 3 newton solves below, ever-so-slightly different. 
> > > 0 SNES Function norm 3.424003312857e+04 > 0 KSP Residual norm 3.424003312857e+04 > 1 KSP Residual norm 2.886124328003e+04 > 2 KSP Residual norm 2.504664994246e+04 > 3 KSP Residual norm 2.104615835161e+04 > 4 KSP Residual norm 1.938102896632e+04 > 5 KSP Residual norm 1.793774642408e+04 > 6 KSP Residual norm 1.671392566980e+04 > 7 KSP Residual norm 1.501504103873e+04 > 8 KSP Residual norm 1.366362900747e+04 > 9 KSP Residual norm 1.240398500429e+04 > 10 KSP Residual norm 1.156293733914e+04 > 11 KSP Residual norm 1.066296477958e+04 > 12 KSP Residual norm 9.835601966950e+03 > 13 KSP Residual norm 9.017480191491e+03 > 14 KSP Residual norm 8.415336139780e+03 > 15 KSP Residual norm 7.807497808435e+03 > 16 KSP Residual norm 7.341703768294e+03 > 17 KSP Residual norm 6.979298049282e+03 > 18 KSP Residual norm 6.521277772081e+03 > 19 KSP Residual norm 6.174842408773e+03 > 20 KSP Residual norm 5.889819665003e+03 > Linear solve converged due to CONVERGED_ITS iterations 20 > KSP Object: 1 MPI process > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=20, initial guess is zero > tolerances: relative=0.1, absolute=1e-15, divergence=10. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI process > type: none > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqbaij > rows=16384, cols=16384, bs=16 > total: nonzeros=1277952, allocated nonzeros=1277952 > total number of mallocs used during MatSetValues calls=0 > block size is 16 > 1 SNES Function norm 1.000525348433e+04 > Nonlinear solve converged due to CONVERGED_ITS iterations 1 > SNES Object: 1 MPI process > type: newtonls > maximum iterations=1, maximum function evaluations=-1 > tolerances: relative=0.1, absolute=1e-15, solution=1e-15 > total number of linear solver iterations=20 > total number of function evaluations=2 > norm schedule ALWAYS > Jacobian is never rebuilt > Jacobian is built using finite differences with coloring > SNESLineSearch Object: 1 MPI process > type: basic > maxstep=1.000000e+08, minlambda=1.000000e-12 > tolerances: relative=1.000000e-08, absolute=1.000000e-15, > lambda=1.000000e-08 > maximum iterations=40 > KSP Object: 1 MPI process > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=20, initial guess is zero > tolerances: relative=0.1, absolute=1e-15, divergence=10. 
> left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI process > type: none > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqbaij > rows=16384, cols=16384, bs=16 > total: nonzeros=1277952, allocated nonzeros=1277952 > total number of mallocs used during MatSetValues calls=0 > block size is 16 > 0 SNES Function norm 1.000525348433e+04 > 0 KSP Residual norm 1.000525348433e+04 > 1 KSP Residual norm 7.908741564765e+03 > 2 KSP Residual norm 6.825263536686e+03 > 3 KSP Residual norm 6.224930664968e+03 > 4 KSP Residual norm 6.095547180532e+03 > 5 KSP Residual norm 5.952968230430e+03 > 6 KSP Residual norm 5.861251998116e+03 > 7 KSP Residual norm 5.712439327755e+03 > 8 KSP Residual norm 5.583056913266e+03 > 9 KSP Residual norm 5.461768804626e+03 > 10 KSP Residual norm 5.351937611098e+03 > 11 KSP Residual norm 5.224288337578e+03 > 12 KSP Residual norm 5.129863847081e+03 > 13 KSP Residual norm 5.010818237218e+03 > 14 KSP Residual norm 4.907162936199e+03 > 15 KSP Residual norm 4.789564773955e+03 > 16 KSP Residual norm 4.695173370720e+03 > 17 KSP Residual norm 4.584070962171e+03 > 18 KSP Residual norm 4.483061424742e+03 > 19 KSP Residual norm 4.373384070745e+03 > 20 KSP Residual norm 4.260704657592e+03 > Linear solve converged due to CONVERGED_ITS iterations 20 > KSP Object: 1 MPI process > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=20, initial guess is zero > tolerances: relative=0.1, absolute=1e-15, divergence=10. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI process > type: none > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqbaij > rows=16384, cols=16384, bs=16 > total: nonzeros=1277952, allocated nonzeros=1277952 > total number of mallocs used during MatSetValues calls=0 > block size is 16 > 1 SNES Function norm 4.662386014882e+03 > Nonlinear solve converged due to CONVERGED_ITS iterations 1 > SNES Object: 1 MPI process > type: newtonls > maximum iterations=1, maximum function evaluations=-1 > tolerances: relative=0.1, absolute=1e-15, solution=1e-15 > total number of linear solver iterations=20 > total number of function evaluations=2 > norm schedule ALWAYS > Jacobian is never rebuilt > Jacobian is built using finite differences with coloring > SNESLineSearch Object: 1 MPI process > type: basic > maxstep=1.000000e+08, minlambda=1.000000e-12 > tolerances: relative=1.000000e-08, absolute=1.000000e-15, > lambda=1.000000e-08 > maximum iterations=40 > KSP Object: 1 MPI process > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=20, initial guess is zero > tolerances: relative=0.1, absolute=1e-15, divergence=10. 
> left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI process > type: none > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqbaij > rows=16384, cols=16384, bs=16 > total: nonzeros=1277952, allocated nonzeros=1277952 > total number of mallocs used during MatSetValues calls=0 > block size is 16 > 0 SNES Function norm 4.662386014882e+03 > 0 KSP Residual norm 4.662386014882e+03 > 1 KSP Residual norm 4.408316259864e+03 > 2 KSP Residual norm 4.184867769829e+03 > 3 KSP Residual norm 4.079091244351e+03 > 4 KSP Residual norm 4.009247390166e+03 > 5 KSP Residual norm 3.928417371428e+03 > 6 KSP Residual norm 3.865152075780e+03 > 7 KSP Residual norm 3.795606446033e+03 > 8 KSP Residual norm 3.735294554158e+03 > 9 KSP Residual norm 3.674393726487e+03 > 10 KSP Residual norm 3.617795166786e+03 > 11 KSP Residual norm 3.563807982274e+03 > 12 KSP Residual norm 3.512269444921e+03 > 13 KSP Residual norm 3.455110223236e+03 > 14 KSP Residual norm 3.407141247372e+03 > 15 KSP Residual norm 3.356562415982e+03 > 16 KSP Residual norm 3.312720047685e+03 > 17 KSP Residual norm 3.263690150810e+03 > 18 KSP Residual norm 3.219359862444e+03 > 19 KSP Residual norm 3.173500955995e+03 > 20 KSP Residual norm 3.127528790155e+03 > Linear solve converged due to CONVERGED_ITS iterations 20 > KSP Object: 1 MPI process > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=20, initial guess is zero > tolerances: relative=0.1, absolute=1e-15, divergence=10. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI process > type: none > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqbaij > rows=16384, cols=16384, bs=16 > total: nonzeros=1277952, allocated nonzeros=1277952 > total number of mallocs used during MatSetValues calls=0 > block size is 16 > 1 SNES Function norm 3.186752172556e+03 > Nonlinear solve converged due to CONVERGED_ITS iterations 1 > SNES Object: 1 MPI process > type: newtonls > maximum iterations=1, maximum function evaluations=-1 > tolerances: relative=0.1, absolute=1e-15, solution=1e-15 > total number of linear solver iterations=20 > total number of function evaluations=2 > norm schedule ALWAYS > Jacobian is never rebuilt > Jacobian is built using finite differences with coloring > SNESLineSearch Object: 1 MPI process > type: basic > maxstep=1.000000e+08, minlambda=1.000000e-12 > tolerances: relative=1.000000e-08, absolute=1.000000e-15, > lambda=1.000000e-08 > maximum iterations=40 > KSP Object: 1 MPI process > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=20, initial guess is zero > tolerances: relative=0.1, absolute=1e-15, divergence=10. 
> left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI process > type: none > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqbaij > rows=16384, cols=16384, bs=16 > total: nonzeros=1277952, allocated nonzeros=1277952 > total number of mallocs used during MatSetValues calls=0 > block size is 16 > > > > 0 SNES Function norm 3.424003312857e+04 > 0 KSP Residual norm 3.424003312857e+04 > 1 KSP Residual norm 2.886124328003e+04 > 2 KSP Residual norm 2.504664994221e+04 > 3 KSP Residual norm 2.104615835130e+04 > 4 KSP Residual norm 1.938102896610e+04 > 5 KSP Residual norm 1.793774642406e+04 > 6 KSP Residual norm 1.671392566981e+04 > 7 KSP Residual norm 1.501504103854e+04 > 8 KSP Residual norm 1.366362900726e+04 > 9 KSP Residual norm 1.240398500414e+04 > 10 KSP Residual norm 1.156293733914e+04 > 11 KSP Residual norm 1.066296477972e+04 > 12 KSP Residual norm 9.835601967036e+03 > 13 KSP Residual norm 9.017480191500e+03 > 14 KSP Residual norm 8.415336139732e+03 > 15 KSP Residual norm 7.807497808414e+03 > 16 KSP Residual norm 7.341703768300e+03 > 17 KSP Residual norm 6.979298049244e+03 > 18 KSP Residual norm 6.521277772042e+03 > 19 KSP Residual norm 6.174842408713e+03 > 20 KSP Residual norm 5.889819664983e+03 > Linear solve converged due to CONVERGED_ITS iterations 20 > KSP Object: 1 MPI process > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=20, initial guess is zero > tolerances: relative=0.1, absolute=1e-15, divergence=10. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI process > type: none > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqbaij > rows=16384, cols=16384, bs=16 > total: nonzeros=1277952, allocated nonzeros=1277952 > total number of mallocs used during MatSetValues calls=0 > block size is 16 > 1 SNES Function norm 1.000525348435e+04 > Nonlinear solve converged due to CONVERGED_ITS iterations 1 > SNES Object: 1 MPI process > type: newtonls > maximum iterations=1, maximum function evaluations=-1 > tolerances: relative=0.1, absolute=1e-15, solution=1e-15 > total number of linear solver iterations=20 > total number of function evaluations=2 > norm schedule ALWAYS > Jacobian is never rebuilt > Jacobian is built using finite differences with coloring > SNESLineSearch Object: 1 MPI process > type: basic > maxstep=1.000000e+08, minlambda=1.000000e-12 > tolerances: relative=1.000000e-08, absolute=1.000000e-15, > lambda=1.000000e-08 > maximum iterations=40 > KSP Object: 1 MPI process > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=20, initial guess is zero > tolerances: relative=0.1, absolute=1e-15, divergence=10. 
> left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI process > type: none > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqbaij > rows=16384, cols=16384, bs=16 > total: nonzeros=1277952, allocated nonzeros=1277952 > total number of mallocs used during MatSetValues calls=0 > block size is 16 > 0 SNES Function norm 1.000525348435e+04 > 0 KSP Residual norm 1.000525348435e+04 > 1 KSP Residual norm 7.908741565645e+03 > 2 KSP Residual norm 6.825263536988e+03 > 3 KSP Residual norm 6.224930664967e+03 > 4 KSP Residual norm 6.095547180474e+03 > 5 KSP Residual norm 5.952968230397e+03 > 6 KSP Residual norm 5.861251998127e+03 > 7 KSP Residual norm 5.712439327726e+03 > 8 KSP Residual norm 5.583056913167e+03 > 9 KSP Residual norm 5.461768804526e+03 > 10 KSP Residual norm 5.351937611030e+03 > 11 KSP Residual norm 5.224288337536e+03 > 12 KSP Residual norm 5.129863847028e+03 > 13 KSP Residual norm 5.010818237161e+03 > 14 KSP Residual norm 4.907162936143e+03 > 15 KSP Residual norm 4.789564773923e+03 > 16 KSP Residual norm 4.695173370709e+03 > 17 KSP Residual norm 4.584070962145e+03 > 18 KSP Residual norm 4.483061424714e+03 > 19 KSP Residual norm 4.373384070713e+03 > 20 KSP Residual norm 4.260704657576e+03 > Linear solve converged due to CONVERGED_ITS iterations 20 > KSP Object: 1 MPI process > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=20, initial guess is zero > tolerances: relative=0.1, absolute=1e-15, divergence=10. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI process > type: none > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqbaij > rows=16384, cols=16384, bs=16 > total: nonzeros=1277952, allocated nonzeros=1277952 > total number of mallocs used during MatSetValues calls=0 > block size is 16 > 1 SNES Function norm 4.662386014874e+03 > Nonlinear solve converged due to CONVERGED_ITS iterations 1 > SNES Object: 1 MPI process > type: newtonls > maximum iterations=1, maximum function evaluations=-1 > tolerances: relative=0.1, absolute=1e-15, solution=1e-15 > total number of linear solver iterations=20 > total number of function evaluations=2 > norm schedule ALWAYS > Jacobian is never rebuilt > Jacobian is built using finite differences with coloring > SNESLineSearch Object: 1 MPI process > type: basic > maxstep=1.000000e+08, minlambda=1.000000e-12 > tolerances: relative=1.000000e-08, absolute=1.000000e-15, > lambda=1.000000e-08 > maximum iterations=40 > KSP Object: 1 MPI process > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=20, initial guess is zero > tolerances: relative=0.1, absolute=1e-15, divergence=10. 
> left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI process > type: none > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqbaij > rows=16384, cols=16384, bs=16 > total: nonzeros=1277952, allocated nonzeros=1277952 > total number of mallocs used during MatSetValues calls=0 > block size is 16 > 0 SNES Function norm 4.662386014874e+03 > 0 KSP Residual norm 4.662386014874e+03 > 1 KSP Residual norm 4.408316259834e+03 > 2 KSP Residual norm 4.184867769891e+03 > 3 KSP Residual norm 4.079091244367e+03 > 4 KSP Residual norm 4.009247390184e+03 > 5 KSP Residual norm 3.928417371457e+03 > 6 KSP Residual norm 3.865152075802e+03 > 7 KSP Residual norm 3.795606446041e+03 > 8 KSP Residual norm 3.735294554160e+03 > 9 KSP Residual norm 3.674393726485e+03 > 10 KSP Residual norm 3.617795166775e+03 > 11 KSP Residual norm 3.563807982249e+03 > 12 KSP Residual norm 3.512269444873e+03 > 13 KSP Residual norm 3.455110223193e+03 > 14 KSP Residual norm 3.407141247334e+03 > 15 KSP Residual norm 3.356562415949e+03 > 16 KSP Residual norm 3.312720047652e+03 > 17 KSP Residual norm 3.263690150782e+03 > 18 KSP Residual norm 3.219359862425e+03 > 19 KSP Residual norm 3.173500955997e+03 > 20 KSP Residual norm 3.127528790156e+03 > Linear solve converged due to CONVERGED_ITS iterations 20 > KSP Object: 1 MPI process > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=20, initial guess is zero > tolerances: relative=0.1, absolute=1e-15, divergence=10. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI process > type: none > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqbaij > rows=16384, cols=16384, bs=16 > total: nonzeros=1277952, allocated nonzeros=1277952 > total number of mallocs used during MatSetValues calls=0 > block size is 16 > 1 SNES Function norm 3.186752172503e+03 > Nonlinear solve converged due to CONVERGED_ITS iterations 1 > SNES Object: 1 MPI process > type: newtonls > maximum iterations=1, maximum function evaluations=-1 > tolerances: relative=0.1, absolute=1e-15, solution=1e-15 > total number of linear solver iterations=20 > total number of function evaluations=2 > norm schedule ALWAYS > Jacobian is never rebuilt > Jacobian is built using finite differences with coloring > SNESLineSearch Object: 1 MPI process > type: basic > maxstep=1.000000e+08, minlambda=1.000000e-12 > tolerances: relative=1.000000e-08, absolute=1.000000e-15, > lambda=1.000000e-08 > maximum iterations=40 > KSP Object: 1 MPI process > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=20, initial guess is zero > tolerances: relative=0.1, absolute=1e-15, divergence=10. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI process > type: none > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqbaij > rows=16384, cols=16384, bs=16 > total: nonzeros=1277952, allocated nonzeros=1277952 > total number of mallocs used during MatSetValues calls=0 > block size is 16 > > On Thu, May 4, 2023 at 5:22?PM Matthew Knepley wrote: > >> On Thu, May 4, 2023 at 5:03?PM Mark Lohry wrote: >> >>> Do you get different results (in different runs) without >>>> -snes_mf_operator? So just using an explicit matrix? 
>>> >>> >>> Unfortunately I don't have an explicit matrix available for this, hence >>> the MFFD/JFNK. >>> >> >> I don't mean the actual matrix, I mean a representative matrix. >> >> >>> >>>> (Note: I am not convinced there is even a problem and think it may be >>>> simply different order of floating point operations in different runs.) >>>> >>> >>> I'm not convinced either, but running explicit RK for 10,000 iterations >>> i get exactly the same results every time so i'm fairly confident it's not >>> the residual evaluation. >>> How would there be a different order of floating point ops in different >>> runs in serial? >>> >>> No, I mean without -snes_mf_* (as Barry says), so we are just running >>>> that solver with a sparse matrix. This would give me confidence >>>> that nothing in the solver is variable. >>>> >>>> I could do the sparse finite difference jacobian once, save it to disk, >>> and then use that system each time. >>> >> >> Yes. That would work. >> >> Thanks, >> >> Matt >> >> >>> On Thu, May 4, 2023 at 4:57?PM Matthew Knepley >>> wrote: >>> >>>> On Thu, May 4, 2023 at 4:44?PM Mark Lohry wrote: >>>> >>>>> Is your code valgrind clean? >>>>>> >>>>> >>>>> Yes, I also initialize all allocations with NaNs to be sure I'm not >>>>> using anything uninitialized. >>>>> >>>>> >>>>>> We can try and test this. Replace your MatMFFD with an actual matrix >>>>>> and run. Do you see any variability? >>>>>> >>>>> >>>>> I think I did what you're asking. I have -snes_mf_operator set, and >>>>> then SNESSetJacobian(snes, diag_ones, diag_ones, NULL, NULL) where >>>>> diag_ones is a matrix with ones on the diagonal. Two runs below, still with >>>>> differences but sometimes identical. >>>>> >>>> >>>> No, I mean without -snes_mf_* (as Barry says), so we are just running >>>> that solver with a sparse matrix. This would give me confidence >>>> that nothing in the solver is variable. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> 0 SNES Function norm 3.424003312857e+04 >>>>> 0 KSP Residual norm 3.424003312857e+04 >>>>> 1 KSP Residual norm 2.871734444536e+04 >>>>> 2 KSP Residual norm 2.490276930242e+04 >>>>> 3 KSP Residual norm 2.131675872968e+04 >>>>> 4 KSP Residual norm 1.973129814235e+04 >>>>> 5 KSP Residual norm 1.832377856317e+04 >>>>> 6 KSP Residual norm 1.716783617436e+04 >>>>> 7 KSP Residual norm 1.583963149542e+04 >>>>> 8 KSP Residual norm 1.482272170304e+04 >>>>> 9 KSP Residual norm 1.380312106742e+04 >>>>> 10 KSP Residual norm 1.297793480658e+04 >>>>> 11 KSP Residual norm 1.208599123244e+04 >>>>> 12 KSP Residual norm 1.137345655227e+04 >>>>> 13 KSP Residual norm 1.059676909366e+04 >>>>> 14 KSP Residual norm 1.003823862398e+04 >>>>> 15 KSP Residual norm 9.425879221354e+03 >>>>> 16 KSP Residual norm 8.954805890038e+03 >>>>> 17 KSP Residual norm 8.592372470456e+03 >>>>> 18 KSP Residual norm 8.060707175821e+03 >>>>> 19 KSP Residual norm 7.782057728723e+03 >>>>> 20 KSP Residual norm 7.449686095424e+03 >>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>> KSP Object: 1 MPI process >>>>> type: gmres >>>>> restart=30, using Classical (unmodified) Gram-Schmidt >>>>> Orthogonalization with no iterative refinement >>>>> happy breakdown tolerance 1e-30 >>>>> maximum iterations=20, initial guess is zero >>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>>>> left preconditioning >>>>> using PRECONDITIONED norm type for convergence test >>>>> PC Object: 1 MPI process >>>>> type: none >>>>> linear system matrix followed by preconditioner matrix: >>>>> Mat Object: 1 MPI process >>>>> type: mffd >>>>> rows=16384, cols=16384 >>>>> Matrix-free approximation: >>>>> err=1.49012e-08 (relative error in function evaluation) >>>>> Using wp compute h routine >>>>> Does not compute normU >>>>> Mat Object: 1 MPI process >>>>> type: seqaij >>>>> rows=16384, cols=16384 >>>>> total: nonzeros=16384, allocated nonzeros=16384 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> not using I-node routines >>>>> 1 SNES Function norm 1.085015646971e+04 >>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>> SNES Object: 1 MPI process >>>>> type: newtonls >>>>> maximum iterations=1, maximum function evaluations=-1 >>>>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>>>> total number of linear solver iterations=20 >>>>> total number of function evaluations=23 >>>>> norm schedule ALWAYS >>>>> Jacobian is never rebuilt >>>>> Jacobian is applied matrix-free with differencing >>>>> Preconditioning Jacobian is built using finite differences with >>>>> coloring >>>>> SNESLineSearch Object: 1 MPI process >>>>> type: basic >>>>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>>>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, >>>>> lambda=1.000000e-08 >>>>> maximum iterations=40 >>>>> KSP Object: 1 MPI process >>>>> type: gmres >>>>> restart=30, using Classical (unmodified) Gram-Schmidt >>>>> Orthogonalization with no iterative refinement >>>>> happy breakdown tolerance 1e-30 >>>>> maximum iterations=20, initial guess is zero >>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>>>> left preconditioning >>>>> using PRECONDITIONED norm type for convergence test >>>>> PC Object: 1 MPI process >>>>> type: none >>>>> linear system matrix followed by preconditioner matrix: >>>>> Mat Object: 1 MPI process >>>>> type: mffd >>>>> rows=16384, cols=16384 >>>>> Matrix-free approximation: >>>>> err=1.49012e-08 (relative error in function evaluation) >>>>> Using wp compute h routine >>>>> Does not compute normU >>>>> Mat Object: 1 MPI process >>>>> type: seqaij >>>>> rows=16384, cols=16384 >>>>> total: nonzeros=16384, allocated nonzeros=16384 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> not using I-node routines >>>>> >>>>> 0 SNES Function norm 3.424003312857e+04 >>>>> 0 KSP Residual norm 3.424003312857e+04 >>>>> 1 KSP Residual norm 2.871734444536e+04 >>>>> 2 KSP Residual norm 2.490276931041e+04 >>>>> 3 KSP Residual norm 2.131675873776e+04 >>>>> 4 KSP Residual norm 1.973129814908e+04 >>>>> 5 KSP Residual norm 1.832377852186e+04 >>>>> 6 KSP Residual norm 1.716783608174e+04 >>>>> 7 KSP Residual norm 1.583963128956e+04 >>>>> 8 KSP Residual norm 1.482272160069e+04 >>>>> 9 KSP Residual norm 1.380312087005e+04 >>>>> 10 KSP Residual norm 1.297793458796e+04 >>>>> 11 KSP Residual norm 1.208599115602e+04 >>>>> 12 KSP Residual norm 1.137345657533e+04 >>>>> 13 KSP Residual norm 1.059676906197e+04 >>>>> 14 KSP Residual norm 1.003823857515e+04 >>>>> 15 KSP Residual norm 9.425879177747e+03 >>>>> 16 KSP Residual norm 8.954805850825e+03 >>>>> 17 KSP Residual norm 8.592372413320e+03 >>>>> 18 KSP Residual norm 8.060706994110e+03 >>>>> 19 KSP Residual norm 7.782057560782e+03 >>>>> 20 KSP Residual norm 7.449686034356e+03 >>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>> KSP Object: 1 MPI process >>>>> type: gmres >>>>> restart=30, using Classical (unmodified) Gram-Schmidt >>>>> Orthogonalization with no iterative refinement >>>>> happy breakdown tolerance 1e-30 >>>>> maximum iterations=20, initial guess is zero >>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>>>> left preconditioning >>>>> using PRECONDITIONED norm type for convergence test >>>>> PC Object: 1 MPI process >>>>> type: none >>>>> linear system matrix followed by preconditioner matrix: >>>>> Mat Object: 1 MPI process >>>>> type: mffd >>>>> rows=16384, cols=16384 >>>>> Matrix-free approximation: >>>>> err=1.49012e-08 (relative error in function evaluation) >>>>> Using wp compute h routine >>>>> Does not compute normU >>>>> Mat Object: 1 MPI process >>>>> type: seqaij >>>>> rows=16384, cols=16384 >>>>> total: nonzeros=16384, allocated nonzeros=16384 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> not using I-node routines >>>>> 1 SNES Function norm 1.085015821006e+04 >>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>> SNES Object: 1 MPI process >>>>> type: newtonls >>>>> maximum iterations=1, maximum function evaluations=-1 >>>>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>>>> total number of linear solver iterations=20 >>>>> total number of function evaluations=23 >>>>> norm schedule ALWAYS >>>>> Jacobian is never rebuilt >>>>> Jacobian is applied matrix-free with differencing >>>>> Preconditioning Jacobian is built using finite differences with >>>>> coloring >>>>> SNESLineSearch Object: 1 MPI process >>>>> type: basic >>>>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>>>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, >>>>> lambda=1.000000e-08 >>>>> maximum iterations=40 >>>>> KSP Object: 1 MPI process >>>>> type: gmres >>>>> restart=30, using Classical (unmodified) Gram-Schmidt >>>>> Orthogonalization with no iterative refinement >>>>> happy breakdown tolerance 1e-30 >>>>> maximum iterations=20, initial guess is zero >>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. >>>>> left preconditioning >>>>> using PRECONDITIONED norm type for convergence test >>>>> PC Object: 1 MPI process >>>>> type: none >>>>> linear system matrix followed by preconditioner matrix: >>>>> Mat Object: 1 MPI process >>>>> type: mffd >>>>> rows=16384, cols=16384 >>>>> Matrix-free approximation: >>>>> err=1.49012e-08 (relative error in function evaluation) >>>>> Using wp compute h routine >>>>> Does not compute normU >>>>> Mat Object: 1 MPI process >>>>> type: seqaij >>>>> rows=16384, cols=16384 >>>>> total: nonzeros=16384, allocated nonzeros=16384 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> not using I-node routines >>>>> >>>>> On Thu, May 4, 2023 at 10:10?AM Matthew Knepley >>>>> wrote: >>>>> >>>>>> On Thu, May 4, 2023 at 8:54?AM Mark Lohry wrote: >>>>>> >>>>>>> Try -pc_type none. >>>>>>>> >>>>>>> >>>>>>> With -pc_type none the 0 KSP residual looks identical. But >>>>>>> *sometimes* it's producing exactly the same history and others it's >>>>>>> gradually changing. I'm reasonably confident my residual evaluation has no >>>>>>> randomness, see info after the petsc output. >>>>>>> >>>>>> >>>>>> We can try and test this. Replace your MatMFFD with an actual matrix >>>>>> and run. Do you see any variability? >>>>>> >>>>>> If not, then it could be your routine, or it could be MatMFFD. So run >>>>>> a few with -snes_view, and we can see if the >>>>>> "w" parameter changes. >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>>> solve history 1: >>>>>>> >>>>>>> 0 SNES Function norm 3.424003312857e+04 >>>>>>> 0 KSP Residual norm 3.424003312857e+04 >>>>>>> 1 KSP Residual norm 2.871734444536e+04 >>>>>>> 2 KSP Residual norm 2.490276931041e+04 >>>>>>> ... 
>>>>>>> 20 KSP Residual norm 7.449686034356e+03 >>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>> 1 SNES Function norm 1.085015821006e+04 >>>>>>> >>>>>>> solve history 2, identical to 1: >>>>>>> >>>>>>> 0 SNES Function norm 3.424003312857e+04 >>>>>>> 0 KSP Residual norm 3.424003312857e+04 >>>>>>> 1 KSP Residual norm 2.871734444536e+04 >>>>>>> 2 KSP Residual norm 2.490276931041e+04 >>>>>>> ... >>>>>>> 20 KSP Residual norm 7.449686034356e+03 >>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>> 1 SNES Function norm 1.085015821006e+04 >>>>>>> >>>>>>> solve history 3, identical KSP at 0 and 1, slight change at 2, >>>>>>> growing difference to the end: >>>>>>> 0 SNES Function norm 3.424003312857e+04 >>>>>>> 0 KSP Residual norm 3.424003312857e+04 >>>>>>> 1 KSP Residual norm 2.871734444536e+04 >>>>>>> 2 KSP Residual norm 2.490276930242e+04 >>>>>>> ... >>>>>>> 20 KSP Residual norm 7.449686095424e+03 >>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>> 1 SNES Function norm 1.085015646971e+04 >>>>>>> >>>>>>> >>>>>>> Ths is using a standard explicit 3-stage Runge-Kutta smoother for 10 >>>>>>> iterations, so 30 calls of the same residual evaluation, identical >>>>>>> residuals every time >>>>>>> >>>>>>> run 1: >>>>>>> >>>>>>> # iteration rho rhou rhov >>>>>>> rhoE abs_res rel_res >>>>>>> umin vmax vmin elapsed_time >>>>>>> >>>>>>> # >>>>>>> >>>>>>> >>>>>>> 1.00000e+00 1.086860616292e+00 2.782316758416e+02 >>>>>>> 4.482867643761e+00 2.993435920340e+02 2.04353e+02 >>>>>>> 1.00000e+00 -8.23945e-15 -6.15326e-15 -1.35563e-14 >>>>>>> 6.34834e-01 >>>>>>> 2.00000e+00 2.310547487017e+00 1.079059352425e+02 >>>>>>> 3.958323921837e+00 5.058927165686e+02 2.58647e+02 >>>>>>> 1.26568e+00 -1.02539e-14 -9.35368e-15 -1.69925e-14 >>>>>>> 6.40063e-01 >>>>>>> 3.00000e+00 2.361005867444e+00 5.706213331683e+01 >>>>>>> 6.130016323357e+00 4.688968362579e+02 2.36201e+02 >>>>>>> 1.15585e+00 -1.19370e-14 -1.15216e-14 -1.59733e-14 >>>>>>> 6.45166e-01 >>>>>>> 4.00000e+00 2.167518999963e+00 3.757541401594e+01 >>>>>>> 6.313917437428e+00 4.054310291628e+02 2.03612e+02 >>>>>>> 9.96372e-01 -1.81831e-14 -1.28312e-14 -1.46238e-14 >>>>>>> 6.50494e-01 >>>>>>> 5.00000e+00 1.941443738676e+00 2.884190334049e+01 >>>>>>> 6.237106158479e+00 3.539201037156e+02 1.77577e+02 >>>>>>> 8.68970e-01 3.56633e-14 -8.74089e-15 -1.06666e-14 >>>>>>> 6.55656e-01 >>>>>>> 6.00000e+00 1.736947124693e+00 2.429485695670e+01 >>>>>>> 5.996962200407e+00 3.148280178142e+02 1.57913e+02 >>>>>>> 7.72745e-01 -8.98634e-14 -2.41152e-14 -1.39713e-14 >>>>>>> 6.60872e-01 >>>>>>> 7.00000e+00 1.564153212635e+00 2.149609219810e+01 >>>>>>> 5.786910705204e+00 2.848717011033e+02 1.42872e+02 >>>>>>> 6.99144e-01 -2.95352e-13 -2.48158e-14 -2.39351e-14 >>>>>>> 6.66041e-01 >>>>>>> 8.00000e+00 1.419280815384e+00 1.950619804089e+01 >>>>>>> 5.627281158306e+00 2.606623371229e+02 1.30728e+02 >>>>>>> 6.39715e-01 8.98941e-13 1.09674e-13 3.78905e-14 >>>>>>> 6.71316e-01 >>>>>>> 9.00000e+00 1.296115915975e+00 1.794843530745e+01 >>>>>>> 5.514933264437e+00 2.401524522393e+02 1.20444e+02 >>>>>>> 5.89394e-01 1.70717e-12 1.38762e-14 1.09825e-13 >>>>>>> 6.76447e-01 >>>>>>> 1.00000e+01 1.189639693918e+00 1.665381754953e+01 >>>>>>> 5.433183087037e+00 2.222572900473e+02 1.11475e+02 >>>>>>> 5.45501e-01 -4.22462e-12 -7.15206e-13 -2.28736e-13 >>>>>>> 6.81716e-01 >>>>>>> >>>>>>> run N: >>>>>>> >>>>>>> >>>>>>> # >>>>>>> >>>>>>> >>>>>>> # iteration rho rhou rhov >>>>>>> rhoE abs_res rel_res >>>>>>> umin vmax vmin elapsed_time >>>>>>> >>>>>>> # 
>>>>>>> >>>>>>> >>>>>>> 1.00000e+00 1.086860616292e+00 2.782316758416e+02 >>>>>>> 4.482867643761e+00 2.993435920340e+02 2.04353e+02 >>>>>>> 1.00000e+00 -8.23945e-15 -6.15326e-15 -1.35563e-14 >>>>>>> 6.23316e-01 >>>>>>> 2.00000e+00 2.310547487017e+00 1.079059352425e+02 >>>>>>> 3.958323921837e+00 5.058927165686e+02 2.58647e+02 >>>>>>> 1.26568e+00 -1.02539e-14 -9.35368e-15 -1.69925e-14 >>>>>>> 6.28510e-01 >>>>>>> 3.00000e+00 2.361005867444e+00 5.706213331683e+01 >>>>>>> 6.130016323357e+00 4.688968362579e+02 2.36201e+02 >>>>>>> 1.15585e+00 -1.19370e-14 -1.15216e-14 -1.59733e-14 >>>>>>> 6.33558e-01 >>>>>>> 4.00000e+00 2.167518999963e+00 3.757541401594e+01 >>>>>>> 6.313917437428e+00 4.054310291628e+02 2.03612e+02 >>>>>>> 9.96372e-01 -1.81831e-14 -1.28312e-14 -1.46238e-14 >>>>>>> 6.38773e-01 >>>>>>> 5.00000e+00 1.941443738676e+00 2.884190334049e+01 >>>>>>> 6.237106158479e+00 3.539201037156e+02 1.77577e+02 >>>>>>> 8.68970e-01 3.56633e-14 -8.74089e-15 -1.06666e-14 >>>>>>> 6.43887e-01 >>>>>>> 6.00000e+00 1.736947124693e+00 2.429485695670e+01 >>>>>>> 5.996962200407e+00 3.148280178142e+02 1.57913e+02 >>>>>>> 7.72745e-01 -8.98634e-14 -2.41152e-14 -1.39713e-14 >>>>>>> 6.49073e-01 >>>>>>> 7.00000e+00 1.564153212635e+00 2.149609219810e+01 >>>>>>> 5.786910705204e+00 2.848717011033e+02 1.42872e+02 >>>>>>> 6.99144e-01 -2.95352e-13 -2.48158e-14 -2.39351e-14 >>>>>>> 6.54167e-01 >>>>>>> 8.00000e+00 1.419280815384e+00 1.950619804089e+01 >>>>>>> 5.627281158306e+00 2.606623371229e+02 1.30728e+02 >>>>>>> 6.39715e-01 8.98941e-13 1.09674e-13 3.78905e-14 >>>>>>> 6.59394e-01 >>>>>>> 9.00000e+00 1.296115915975e+00 1.794843530745e+01 >>>>>>> 5.514933264437e+00 2.401524522393e+02 1.20444e+02 >>>>>>> 5.89394e-01 1.70717e-12 1.38762e-14 1.09825e-13 >>>>>>> 6.64516e-01 >>>>>>> 1.00000e+01 1.189639693918e+00 1.665381754953e+01 >>>>>>> 5.433183087037e+00 2.222572900473e+02 1.11475e+02 >>>>>>> 5.45501e-01 -4.22462e-12 -7.15206e-13 -2.28736e-13 >>>>>>> 6.69677e-01 >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> On Thu, May 4, 2023 at 8:41?AM Mark Adams wrote: >>>>>>> >>>>>>>> ASM is just the sub PC with one proc but gets weaker with more >>>>>>>> procs unless you use jacobi. (maybe I am missing something). >>>>>>>> >>>>>>>> On Thu, May 4, 2023 at 8:31?AM Mark Lohry wrote: >>>>>>>> >>>>>>>>> Please send the output of -snes_view. >>>>>>>>>> >>>>>>>>> pasted below. anything stand out? >>>>>>>>> >>>>>>>>> >>>>>>>>> SNES Object: 1 MPI process >>>>>>>>> type: newtonls >>>>>>>>> maximum iterations=1, maximum function evaluations=-1 >>>>>>>>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>>>>>>>> total number of linear solver iterations=20 >>>>>>>>> total number of function evaluations=22 >>>>>>>>> norm schedule ALWAYS >>>>>>>>> Jacobian is never rebuilt >>>>>>>>> Jacobian is applied matrix-free with differencing >>>>>>>>> Preconditioning Jacobian is built using finite differences with >>>>>>>>> coloring >>>>>>>>> SNESLineSearch Object: 1 MPI process >>>>>>>>> type: basic >>>>>>>>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>>>>>>>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, >>>>>>>>> lambda=1.000000e-08 >>>>>>>>> maximum iterations=40 >>>>>>>>> KSP Object: 1 MPI process >>>>>>>>> type: gmres >>>>>>>>> restart=30, using Classical (unmodified) Gram-Schmidt >>>>>>>>> Orthogonalization with no iterative refinement >>>>>>>>> happy breakdown tolerance 1e-30 >>>>>>>>> maximum iterations=20, initial guess is zero >>>>>>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>>>>>>>> left preconditioning >>>>>>>>> using PRECONDITIONED norm type for convergence test >>>>>>>>> PC Object: 1 MPI process >>>>>>>>> type: asm >>>>>>>>> total subdomain blocks = 1, amount of overlap = 0 >>>>>>>>> restriction/interpolation type - RESTRICT >>>>>>>>> Local solver information for first block is in the following >>>>>>>>> KSP and PC objects on rank 0: >>>>>>>>> Use -ksp_view ::ascii_info_detail to display information for >>>>>>>>> all blocks >>>>>>>>> KSP Object: (sub_) 1 MPI process >>>>>>>>> type: preonly >>>>>>>>> maximum iterations=10000, initial guess is zero >>>>>>>>> tolerances: relative=1e-05, absolute=1e-50, >>>>>>>>> divergence=10000. >>>>>>>>> left preconditioning >>>>>>>>> using NONE norm type for convergence test >>>>>>>>> PC Object: (sub_) 1 MPI process >>>>>>>>> type: ilu >>>>>>>>> out-of-place factorization >>>>>>>>> 0 levels of fill >>>>>>>>> tolerance for zero pivot 2.22045e-14 >>>>>>>>> matrix ordering: natural >>>>>>>>> factor fill ratio given 1., needed 1. >>>>>>>>> Factored matrix follows: >>>>>>>>> Mat Object: (sub_) 1 MPI process >>>>>>>>> type: seqbaij >>>>>>>>> rows=16384, cols=16384, bs=16 >>>>>>>>> package used to perform factorization: petsc >>>>>>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>>>>>> block size is 16 >>>>>>>>> linear system matrix = precond matrix: >>>>>>>>> Mat Object: (sub_) 1 MPI process >>>>>>>>> type: seqbaij >>>>>>>>> rows=16384, cols=16384, bs=16 >>>>>>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>>>> block size is 16 >>>>>>>>> linear system matrix followed by preconditioner matrix: >>>>>>>>> Mat Object: 1 MPI process >>>>>>>>> type: mffd >>>>>>>>> rows=16384, cols=16384 >>>>>>>>> Matrix-free approximation: >>>>>>>>> err=1.49012e-08 (relative error in function evaluation) >>>>>>>>> Using wp compute h routine >>>>>>>>> Does not compute normU >>>>>>>>> Mat Object: 1 MPI process >>>>>>>>> type: seqbaij >>>>>>>>> rows=16384, cols=16384, bs=16 >>>>>>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>>>> block size is 16 >>>>>>>>> >>>>>>>>> On Thu, May 4, 2023 at 8:30?AM Mark Adams wrote: >>>>>>>>> >>>>>>>>>> If you are using MG what is the coarse grid solver? >>>>>>>>>> -snes_view might give you that. >>>>>>>>>> >>>>>>>>>> On Thu, May 4, 2023 at 8:25?AM Matthew Knepley >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> On Thu, May 4, 2023 at 8:21?AM Mark Lohry >>>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>>> Do they start very similarly and then slowly drift further >>>>>>>>>>>>> apart? >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Yes, this. I take it this sounds familiar? >>>>>>>>>>>> >>>>>>>>>>>> See these two examples with 20 fixed iterations pasted at the >>>>>>>>>>>> end. The difference for one solve is slight (final SNES norm is identical >>>>>>>>>>>> to 5 digits), but in the context I'm using it in (repeated applications to >>>>>>>>>>>> solve a steady state multigrid problem, though here just one level) the >>>>>>>>>>>> differences add up such that I might reach global convergence in 35 >>>>>>>>>>>> iterations or 38. It's not the end of the world, but I was expecting that >>>>>>>>>>>> with -np 1 these would be identical and I'm not sure where the root cause >>>>>>>>>>>> would be. >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> The initial KSP residual is different, so its the PC. >>>>>>>>>>> Please send the output of -snes_view. 
If your ASM is using direct >>>>>>>>>>> factorization, then it >>>>>>>>>>> could be randomness in whatever LU you are using. >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> >>>>>>>>>>> Matt >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> 0 SNES Function norm 2.801842107848e+04 >>>>>>>>>>>> 0 KSP Residual norm 4.045639499595e+01 >>>>>>>>>>>> 1 KSP Residual norm 1.917999809040e+01 >>>>>>>>>>>> 2 KSP Residual norm 1.616048521958e+01 >>>>>>>>>>>> [...] >>>>>>>>>>>> 19 KSP Residual norm 8.788043518111e-01 >>>>>>>>>>>> 20 KSP Residual norm 6.570851270214e-01 >>>>>>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>>>>>>> 1 SNES Function norm 1.801309983345e+03 >>>>>>>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Same system, identical initial 0 SNES norm, 0 KSP is slightly >>>>>>>>>>>> different >>>>>>>>>>>> >>>>>>>>>>>> 0 SNES Function norm 2.801842107848e+04 >>>>>>>>>>>> 0 KSP Residual norm 4.045639473002e+01 >>>>>>>>>>>> 1 KSP Residual norm 1.917999883034e+01 >>>>>>>>>>>> 2 KSP Residual norm 1.616048572016e+01 >>>>>>>>>>>> [...] >>>>>>>>>>>> 19 KSP Residual norm 8.788046348957e-01 >>>>>>>>>>>> 20 KSP Residual norm 6.570859588610e-01 >>>>>>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>>>>>>> 1 SNES Function norm 1.801311320322e+03 >>>>>>>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>>>>>>>>> >>>>>>>>>>>> On Wed, May 3, 2023 at 11:05?PM Barry Smith >>>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Do they start very similarly and then slowly drift further >>>>>>>>>>>>> apart? That is the first couple of KSP iterations they are almost identical >>>>>>>>>>>>> but then for each iteration get a bit further. Similar for the SNES >>>>>>>>>>>>> iterations, starting close and then for more iterations and more solves >>>>>>>>>>>>> they start moving apart. Or do they suddenly jump to be very different? You >>>>>>>>>>>>> can run with -snes_monitor -ksp_monitor >>>>>>>>>>>>> >>>>>>>>>>>>> On May 3, 2023, at 9:07 PM, Mark Lohry >>>>>>>>>>>>> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> This is on a single MPI rank. I haven't checked the coloring, >>>>>>>>>>>>> was just guessing there. But the solutions/residuals are slightly different >>>>>>>>>>>>> from run to run. >>>>>>>>>>>>> >>>>>>>>>>>>> Fair to say that for serial JFNK/asm ilu0/gmres we should >>>>>>>>>>>>> expect bitwise identical results? >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On Wed, May 3, 2023, 8:50 PM Barry Smith >>>>>>>>>>>>> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> No, the coloring should be identical every time. Do you see >>>>>>>>>>>>>> differences with 1 MPI rank? (Or much smaller ones?). >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> > On May 3, 2023, at 8:42 PM, Mark Lohry >>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > I'm running multiple iterations of newtonls with an >>>>>>>>>>>>>> MFFD/JFNK nonlinear solver where I give it the sparsity. PC asm, KSP gmres, >>>>>>>>>>>>>> with SNESSetLagJacobian -2 (compute once and then frozen jacobian). >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > I'm seeing slight (<1%) but nonzero differences in >>>>>>>>>>>>>> residuals from run to run. I'm wondering where randomness might enter here >>>>>>>>>>>>>> -- does the jacobian coloring use a random seed? 
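For reference, the setup described in that question corresponds roughly to the following options (a sketch only; in the actual code most of this is set through the API, so exact names and prefixes may differ):

  -snes_type newtonls
  -snes_mf_operator      # JFNK: matrix-free Jacobian action, assembled matrix used only for preconditioning
  -snes_lag_jacobian -2  # build the preconditioning Jacobian once, then freeze it
  -snes_max_it 1
  -ksp_type gmres
  -ksp_max_it 20
  -ksp_rtol 0.1
  -pc_type asm
  -sub_pc_type ilu
  -snes_monitor -ksp_monitor -snes_view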
>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>>>> experiments lead. >>>>>>>>>>> -- Norbert Wiener >>>>>>>>>>> >>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>> >>>>>> >>>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: text/x-log Size: 1582158 bytes Desc: not available URL: From pierre.jolivet at lip6.fr Fri May 5 07:56:10 2023 From: pierre.jolivet at lip6.fr (Pierre Jolivet) Date: Fri, 5 May 2023 14:56:10 +0200 Subject: [petsc-users] parallel computing error In-Reply-To: References: Message-ID: > On 5 May 2023, at 2:00 PM, ???? / ?? / ??????? wrote: > > Dear Pierre Jolivet > > Thank you for your explanation. > > I will try to use a converting matrix. > > I know it's really inefficient, but I need an inverse matrix (inv(A)) itself for my research. > > If parallel computing is difficult to get inv(A), can I run the part related to MatMatSolve with a single core? Yes. Thanks, Pierre > Best, > Seung Lee Kwon > > 2023? 5? 5? (?) ?? 8:35, Pierre Jolivet >?? ??: >> >> >>> On 5 May 2023, at 1:25 PM, ???? / ?? / ??????? > wrote: >>> >>> Dear Matthew Knepley >>> >>> However, I've already installed ScaLAPACK. >>> cd $PETSC_DIR >>> ./configure --download-mpich --with-debugging=0 COPTFLAGS='-O3 -march=native -mtune=native' CXXOPTFLAGS='-O3 -march=native -mtune=native' FOPTFLAGS='-O3 -march=native -mtune=native' --download-mumps --download-scalapack --download-parmetis --download-metis --download-parmetis --download-hpddm --download-slepc >>> >>> Is there some way to use ScaLAPCK? >> >> You need to convert your MatDense to MatSCALAPACK before the call to MatMatSolve(). >> This library (ScaLAPACK, but also Elemental) has severe limitations with respect to the matrix distribution. >> Depending on what you are doing, you may be better of using KSPMatSolve() and computing only an approximation of the solution with a cheap preconditioner (I don?t recall you telling us why you need to do such an operation even though we told you it was not practical ? or maybe I?m being confused by another thread). >> >> Thanks, >> Pierre >> >>> Or, Can I run the part related to MatMatSolve with a single core? >>> >>> 2023? 5? 5? (?) ?? 6:21, Matthew Knepley >?? ??: >>>> On Fri, May 5, 2023 at 3:49?AM ???? / ?? / ??????? > wrote: >>>>> Dear Barry Smith >>>>> >>>>> Thanks to you, I knew the difference between MATAIJ and MATDENSE. 
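A minimal sketch of the MATSCALAPACK conversion described above (untested; A and E are the dense matrices assembled in the code further down, and it assumes the right-hand side and solution must be converted to MATSCALAPACK as well, which may depend on the PETSc version):

/* Sketch only: convert the dense matrices to MATSCALAPACK so a parallel LU
   is available, then solve against the identity. Builds on the code below. */
Mat A_sc, E_sc, X_sc, F_sc;
IS isr_sc, isc_sc;
MatFactorInfo finfo;
PetscCall(MatConvert(A, MATSCALAPACK, MAT_INITIAL_MATRIX, &A_sc));  /* system matrix */
PetscCall(MatConvert(E, MATSCALAPACK, MAT_INITIAL_MATRIX, &E_sc));  /* identity as right-hand side */
PetscCall(MatDuplicate(E_sc, MAT_DO_NOT_COPY_VALUES, &X_sc));       /* will hold inv(A) */
PetscCall(MatGetFactor(A_sc, MATSOLVERSCALAPACK, MAT_FACTOR_LU, &F_sc));
PetscCall(MatGetOrdering(A_sc, MATORDERINGNATURAL, &isr_sc, &isc_sc));
PetscCall(MatFactorInfoInitialize(&finfo));
PetscCall(MatLUFactorSymbolic(F_sc, A_sc, isr_sc, isc_sc, &finfo));
PetscCall(MatLUFactorNumeric(F_sc, A_sc, &finfo));
PetscCall(MatMatSolve(F_sc, E_sc, X_sc));
/* X_sc now holds the inverse; convert back to MATDENSE with MatConvert if a dense copy is needed */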
>>>>> >>>>> However, I still have some problems. >>>>> >>>>> There is no problem when I run with a single core. But, MatGetFactor error occurs when using multi-core. >>>>> >>>>> Could you give me some advice? >>>>> >>>>> The error message is >>>>> >>>>> [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >>>>> [0]PETSC ERROR: See https://petsc.org/release/overview/linear_solve_table/ for possible LU and Cholesky solvers >>>>> [0]PETSC ERROR: MatSolverType petsc does not support matrix type mpidense >>>> >>>> PETSc uses 3rd party packages for parallel dense factorization. You would need to reconfigure with either ScaLAPACK >>>> or Elemental. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>>> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. >>>>> [0]PETSC ERROR: Petsc Release Version 3.18.5, unknown >>>>> [0]PETSC ERROR: ./app on a arch-linux-c-opt named ubuntu by ksl Fri May 5 00:35:23 2023 >>>>> [0]PETSC ERROR: Configure options --download-mpich --with-debugging=0 COPTFLAGS="-O3 -march=native -mtune=native" CXXOPTFLAGS="-O3 -march=native -mtune=native" FOPTFLAGS="-O3 -march=native -mtune=native" --download-mumps --download-scalapack --download-parmetis --download-metis --download-parmetis --download-hpddm --download-slepc >>>>> [0]PETSC ERROR: #1 MatGetFactor() at /home/ksl/petsc/src/mat/interface/matrix.c:4757 >>>>> [0]PETSC ERROR: #2 main() at /home/ksl/Downloads/coding_test/coding/a1.c:66 >>>>> [0]PETSC ERROR: No PETSc Option Table entries >>>>> [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- >>>>> application called MPI_Abort(MPI_COMM_SELF, 92) - process 0 >>>>> >>>>> My code is below: >>>>> >>>>> int main(int argc, char** args) >>>>> { >>>>> Mat A, E, A_temp, A_fac; >>>>> int n = 15; >>>>> PetscInitialize(&argc, &args, NULL, NULL); >>>>> PetscCallMPI(MPI_Comm_size(PETSC_COMM_WORLD, &size)); >>>>> >>>>> PetscCall(MatCreate(PETSC_COMM_WORLD, &A)); >>>>> PetscCall(MatSetType(A,MATDENSE)); >>>>> PetscCall(MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n)); >>>>> PetscCall(MatSetFromOptions(A)); >>>>> PetscCall(MatSetUp(A)); >>>>> // Insert values >>>>> double val; >>>>> for (int i = 0; i < n; i++) { >>>>> for (int j = 0; j < n; j++) { >>>>> if (i == j){ >>>>> val = 2.0; >>>>> } >>>>> else{ >>>>> val = 1.0; >>>>> } >>>>> PetscCall(MatSetValue(A, i, j, val, INSERT_VALUES)); >>>>> } >>>>> } >>>>> PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY)); >>>>> PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY)); >>>>> >>>>> // Make Identity matrix >>>>> PetscCall(MatCreate(PETSC_COMM_WORLD, &E)); >>>>> PetscCall(MatSetType(E,MATDENSE)); >>>>> PetscCall(MatSetSizes(E, PETSC_DECIDE, PETSC_DECIDE, n, n)); >>>>> PetscCall(MatSetFromOptions(E)); >>>>> PetscCall(MatSetUp(E)); >>>>> PetscCall(MatShift(E,1.0)); >>>>> PetscCall(MatAssemblyBegin(E, MAT_FINAL_ASSEMBLY)); >>>>> PetscCall(MatAssemblyEnd(E, MAT_FINAL_ASSEMBLY)); >>>>> >>>>> PetscCall(MatDuplicate(A, MAT_DO_NOT_COPY_VALUES, &A_temp)); >>>>> PetscCall(MatGetFactor(A, MATSOLVERPETSC, MAT_FACTOR_LU, &A_fac)); >>>>> >>>>> IS isr, isc; MatFactorInfo info; >>>>> MatGetOrdering(A, MATORDERINGNATURAL, &isr, &isc); >>>>> PetscCall(MatLUFactorSymbolic(A_fac, A, isr, isc, &info)); >>>>> PetscCall(MatLUFactorNumeric(A_fac, A, &info)); >>>>> MatMatSolve(A_fac, E, A_temp); >>>>> >>>>> PetscCall(MatView(A_temp, PETSC_VIEWER_STDOUT_WORLD)); >>>>> MatDestroy(&A); >>>>> MatDestroy(&A_temp); >>>>> 
MatDestroy(&A_fac); >>>>> MatDestroy(&E); >>>>> PetscCall(PetscFinalize()); >>>>> } >>>>> >>>>> Best regards >>>>> Seung Lee Kwon >>>>> >>>>> 2023? 5? 4? (?) ?? 10:19, Barry Smith >?? ??: >>>>>> >>>>>> The code in ex125.c contains >>>>>> >>>>>> PetscCall(MatCreate(PETSC_COMM_WORLD, &C)); >>>>>> PetscCall(MatSetOptionsPrefix(C, "rhs_")); >>>>>> PetscCall(MatSetSizes(C, m, PETSC_DECIDE, PETSC_DECIDE, nrhs)); >>>>>> PetscCall(MatSetType(C, MATDENSE)); >>>>>> PetscCall(MatSetFromOptions(C)); >>>>>> PetscCall(MatSetUp(C)); >>>>>> >>>>>> This dense parallel matrix is suitable for passing to MatMatSolve() as the right-hand side matrix. Note it is created with PETSC_COMM_WORLD and its type is set to be MATDENSE. >>>>>> >>>>>> You may need to make a sample code by stripping out all the excess code in ex125.c to just create an MATAIJ and MATDENSE and solves with MatMatSolve() to determine why you code does not work. >>>>>> >>>>>> >>>>>> >>>>>>> On May 4, 2023, at 3:20 AM, ???? / ?? / ??????? > wrote: >>>>>>> >>>>>>> Dear Barry Smith >>>>>>> >>>>>>> Thank you for your reply. >>>>>>> >>>>>>> I've already installed MUMPS. >>>>>>> >>>>>>> And I checked the example you said (ex125.c), I don't understand why the RHS matrix becomes the SeqDense matrix. >>>>>>> >>>>>>> Could you explain in more detail? >>>>>>> >>>>>>> Best regards >>>>>>> Seung Lee Kwon >>>>>>> >>>>>>> 2023? 5? 4? (?) ?? 12:08, Barry Smith >?? ??: >>>>>>>> >>>>>>>> You can configure with MUMPS ./configure --download-mumps --download-scalapack --download-ptscotch --download-metis --download-parmetis >>>>>>>> >>>>>>>> And then use MatMatSolve() as in src/mat/tests/ex125.c with parallel MatMatSolve() using MUMPS as the solver. >>>>>>>> >>>>>>>> Barry >>>>>>>> >>>>>>>> >>>>>>>>> On May 3, 2023, at 10:29 PM, ???? / ?? / ??????? > wrote: >>>>>>>>> >>>>>>>>> Dear developers >>>>>>>>> >>>>>>>>> Thank you for your explanation. >>>>>>>>> >>>>>>>>> But I should use the MatCreateSeqDense because I want to use the MatMatSolve that B matrix must be a SeqDense matrix. >>>>>>>>> >>>>>>>>> Using MatMatSolve is an inevitable part of my code. >>>>>>>>> >>>>>>>>> Could you give me a comment to avoid this error? >>>>>>>>> >>>>>>>>> Best, >>>>>>>>> >>>>>>>>> Seung Lee Kwon >>>>>>>>> >>>>>>>>> 2023? 5? 3? (?) ?? 7:30, Matthew Knepley >?? ??: >>>>>>>>>> On Wed, May 3, 2023 at 6:05?AM ???? / ?? / ??????? > wrote: >>>>>>>>>>> Dear developers >>>>>>>>>>> >>>>>>>>>>> I'm trying to use parallel computing and I ran the command 'mpirun -np 4 ./app' >>>>>>>>>>> >>>>>>>>>>> In this case, there are two problems. >>>>>>>>>>> >>>>>>>>>>> First, I encountered error message >>>>>>>>>>> /// >>>>>>>>>>> [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >>>>>>>>>>> [1]PETSC ERROR: Invalid argument >>>>>>>>>>> [1]PETSC ERROR: Comm must be of size 1 >>>>>>>>>>> /// >>>>>>>>>>> The code on the error position is >>>>>>>>>>> MatCreateSeqDense(PETSC_COMM_SELF, nns, ns, NULL, &Kns)); >>>>>>>>>> >>>>>>>>>> 1) "Seq" means sequential, that is "not parallel". >>>>>>>>>> >>>>>>>>>> 2) This line should still be fine since PETSC_COMM_SELF is a serial communicator >>>>>>>>>> >>>>>>>>>> 3) You should be checking the error code for each call, maybe using the CHKERRQ() macro >>>>>>>>>> >>>>>>>>>> 4) Please always send the entire error message, not a snippet >>>>>>>>>> >>>>>>>>>> THanks >>>>>>>>>> >>>>>>>>>> Matt >>>>>>>>>> >>>>>>>>>>> Could "MatCreateSeqDense" not be used in parallel computing? 
>>>>>>>>>>> >>>>>>>>>>> Second, the same error message is repeated as many times as the number of cores. >>>>>>>>>>> if I use command -np 4, then the error message is repeated 4 times. >>>>>>>>>>> Could you recommend some advice related to this? >>>>>>>>>>> >>>>>>>>>>> Best, >>>>>>>>>>> Seung Lee Kwon >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> Seung Lee Kwon, Ph.D.Candidate >>>>>>>>>>> Aerospace Structures and Materials Laboratory >>>>>>>>>>> Department of Mechanical and Aerospace Engineering >>>>>>>>>>> Seoul National University >>>>>>>>>>> Building 300 Rm 503, Gwanak-ro 1, Gwanak-gu, Seoul, South Korea, 08826 >>>>>>>>>>> E-mail : ksl7912 at snu.ac.kr >>>>>>>>>>> Office : +82-2-880-7389 >>>>>>>>>>> C. P : +82-10-4695-1062 >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>>>>>> -- Norbert Wiener >>>>>>>>>> >>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> Seung Lee Kwon, Ph.D.Candidate >>>>>>>>> Aerospace Structures and Materials Laboratory >>>>>>>>> Department of Mechanical and Aerospace Engineering >>>>>>>>> Seoul National University >>>>>>>>> Building 300 Rm 503, Gwanak-ro 1, Gwanak-gu, Seoul, South Korea, 08826 >>>>>>>>> E-mail : ksl7912 at snu.ac.kr >>>>>>>>> Office : +82-2-880-7389 >>>>>>>>> C. P : +82-10-4695-1062 >>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Seung Lee Kwon, Ph.D.Candidate >>>>>>> Aerospace Structures and Materials Laboratory >>>>>>> Department of Mechanical and Aerospace Engineering >>>>>>> Seoul National University >>>>>>> Building 300 Rm 503, Gwanak-ro 1, Gwanak-gu, Seoul, South Korea, 08826 >>>>>>> E-mail : ksl7912 at snu.ac.kr >>>>>>> Office : +82-2-880-7389 >>>>>>> C. P : +82-10-4695-1062 >>>>>> >>>>> >>>>> >>>>> -- >>>>> Seung Lee Kwon, Ph.D.Candidate >>>>> Aerospace Structures and Materials Laboratory >>>>> Department of Mechanical and Aerospace Engineering >>>>> Seoul National University >>>>> Building 300 Rm 503, Gwanak-ro 1, Gwanak-gu, Seoul, South Korea, 08826 >>>>> E-mail : ksl7912 at snu.ac.kr >>>>> Office : +82-2-880-7389 >>>>> C. P : +82-10-4695-1062 >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >>> -- >>> Seung Lee Kwon, Ph.D.Candidate >>> Aerospace Structures and Materials Laboratory >>> Department of Mechanical and Aerospace Engineering >>> Seoul National University >>> Building 300 Rm 503, Gwanak-ro 1, Gwanak-gu, Seoul, South Korea, 08826 >>> E-mail : ksl7912 at snu.ac.kr >>> Office : +82-2-880-7389 >>> C. P : +82-10-4695-1062 >> > > > -- > Seung Lee Kwon, Ph.D.Candidate > Aerospace Structures and Materials Laboratory > Department of Mechanical and Aerospace Engineering > Seoul National University > Building 300 Rm 503, Gwanak-ro 1, Gwanak-gu, Seoul, South Korea, 08826 > E-mail : ksl7912 at snu.ac.kr > Office : +82-2-880-7389 > C. P : +82-10-4695-1062 -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Fri May 5 09:51:42 2023 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 5 May 2023 10:51:42 -0400 Subject: [petsc-users] issues with VecSetValues in petsc 3.19 In-Reply-To: References: Message-ID: To expand on what Matt said slightly. 
When you have a preconditioner based on (possibly nested) sub solves one generally "tunes" the solves to minimize time to solution. We recommend doing this by first using very accurate subsolves (when possible using direct solves inside); this tells us the "best reasonably expected" convergence rate of the outermost solve. One then "backs off" on the accuracy of the inner solves (making them less accurate), monitoring how this "backing off" affects the convergence rate of the outermost solve (one will often allow the convergence of the outer-most solve to get "a bit worse" in the hope that the since the cost of the inner solves will be much cheaper (with the less accuracy) the cost of the extra outer iterations will not make the entire solve time longer. Deciding how much to "back-off" on each inner solve is mostly determined by monitoring the time for convergence of the entire problem (but monitoring number of iterations also provides important information, for example if the number of outer iterations suddenly shots up you know you are probably undersolving a subproblem. While doing this process, it is important to work with "production" size problems since for PDEs the convergence rates are often very affected by the problem size. One should start with smaller problems to get a feeling for the tolerances needed but usually final tuning needs to be done for large problems. And, sadly, if one changes some parameters in the system a very different tuning may be optimal. Barry > On May 5, 2023, at 5:13 AM, Edoardo alinovi wrote: > > Hi Matt, > > I have some more questions on the fieldsplit saga :) > > I am running a 1M cell ahmed body case using the following options: > > "solver": "fgmres", > "preconditioner": "fieldsplit", > "absTol": 1e-6, > "relTol": 0.0, > > "options":{ > "pc_fieldsplit_type": "multiplicative", > "fieldsplit_u_pc_type": "ml", > "fieldsplit_p_pc_type": "ml", > "fieldsplit_u_ksp_type": "preonly", > "fieldsplit_p_ksp_type": "preonly", > "fieldsplit_u_ksp_rtol": 1e-2, > "fieldsplit_p_ksp_rtol": 1e-2, > } > > > I have run the case using -ksp_monitor_true_residual and this is the results I am getting: > > Residual norms for UPeqn_ solve. 
> 0 KSP unpreconditioned resid norm 5.003190920461e+00 true resid norm 5.003190920461e+00 ||r(i)||/||b|| 4.993739374163e-03 > 1 KSP unpreconditioned resid norm 4.959389845148e+00 true resid norm 4.959389845148e+00 ||r(i)||/||b|| 4.950021043622e-03 > 2 KSP unpreconditioned resid norm 4.793097370730e+00 true resid norm 4.793097370730e+00 ||r(i)||/||b|| 4.784042712927e-03 > 3 KSP unpreconditioned resid norm 4.187770916162e+00 true resid norm 4.187770916162e+00 ||r(i)||/||b|| 4.179859782782e-03 > 4 KSP unpreconditioned resid norm 3.099045576565e+00 true resid norm 3.099045576565e+00 ||r(i)||/||b|| 3.093191158212e-03 > 5 KSP unpreconditioned resid norm 2.072551338956e+00 true resid norm 2.072551338956e+00 ||r(i)||/||b|| 2.068636074628e-03 > 6 KSP unpreconditioned resid norm 1.414678932482e+00 true resid norm 1.414678932482e+00 ||r(i)||/||b|| 1.412006457328e-03 > 7 KSP unpreconditioned resid norm 1.006854855789e+00 true resid norm 1.006854855789e+00 ||r(i)||/||b|| 1.004952802592e-03 > 8 KSP unpreconditioned resid norm 7.332800358083e-01 true resid norm 7.332800358084e-01 ||r(i)||/||b|| 7.318947938062e-04 > 9 KSP unpreconditioned resid norm 5.406076142092e-01 true resid norm 5.406076142093e-01 ||r(i)||/||b|| 5.395863503846e-04 > 10 KSP unpreconditioned resid norm 4.037336888099e-01 true resid norm 4.037336888100e-01 ||r(i)||/||b|| 4.029709940193e-04 > 11 KSP unpreconditioned resid norm 3.041388930530e-01 true resid norm 3.041388930530e-01 ||r(i)||/||b|| 3.035643431559e-04 > 12 KSP unpreconditioned resid norm 2.299815364065e-01 true resid norm 2.299815364066e-01 ||r(i)||/||b|| 2.295470774436e-04 > 13 KSP unpreconditioned resid norm 1.739866268817e-01 true resid norm 1.739866268817e-01 ||r(i)||/||b|| 1.736579481074e-04 > 14 KSP unpreconditioned resid norm 1.317133652074e-01 true resid norm 1.317133652074e-01 ||r(i)||/||b|| 1.314645450067e-04 > 15 KSP unpreconditioned resid norm 9.966247212017e-02 true resid norm 9.966247212019e-02 ||r(i)||/||b|| 9.947419937897e-05 > 16 KSP unpreconditioned resid norm 7.531138284402e-02 true resid norm 7.531138284404e-02 ||r(i)||/||b|| 7.516911183479e-05 > 17 KSP unpreconditioned resid norm 5.646770286889e-02 true resid norm 5.646770286889e-02 ||r(i)||/||b|| 5.636102952452e-05 > 18 KSP unpreconditioned resid norm 4.225114444838e-02 true resid norm 4.225114444838e-02 ||r(i)||/||b|| 4.217132765661e-05 > 19 KSP unpreconditioned resid norm 3.160393382046e-02 true resid norm 3.160393382046e-02 ||r(i)||/||b|| 3.154423071329e-05 > 20 KSP unpreconditioned resid norm 2.366499890335e-02 true resid norm 2.366499890334e-02 ||r(i)||/||b|| 2.362029326721e-05 > 21 KSP unpreconditioned resid norm 1.759504138840e-02 true resid norm 1.759504138839e-02 ||r(i)||/||b|| 1.756180253124e-05 > 22 KSP unpreconditioned resid norm 1.309326628086e-02 true resid norm 1.309326628085e-02 ||r(i)||/||b|| 1.306853174355e-05 > 23 KSP unpreconditioned resid norm 9.710513089816e-03 true resid norm 9.710513089808e-03 ||r(i)||/||b|| 9.692168923954e-06 > 24 KSP unpreconditioned resid norm 7.171039760236e-03 true resid norm 7.171039760227e-03 ||r(i)||/||b|| 7.157492922743e-06 > 25 KSP unpreconditioned resid norm 5.277221846963e-03 true resid norm 5.277221846949e-03 ||r(i)||/||b|| 5.267252627824e-06 > 26 KSP unpreconditioned resid norm 3.906960734588e-03 true resid norm 3.906960734575e-03 ||r(i)||/||b|| 3.899580080737e-06 > 27 KSP unpreconditioned resid norm 2.896843259283e-03 true resid norm 2.896843259273e-03 ||r(i)||/||b|| 2.891370822059e-06 > 28 KSP unpreconditioned resid norm 2.140269358580e-03 true 
resid norm 2.140269358568e-03 ||r(i)||/||b|| 2.136226167881e-06 > 29 KSP unpreconditioned resid norm 1.585513255966e-03 true resid norm 1.585513255956e-03 ||r(i)||/||b|| 1.582518057055e-06 > 30 KSP unpreconditioned resid norm 1.173839272299e-03 true resid norm 1.173839272299e-03 ||r(i)||/||b|| 1.171621768229e-06 > 31 KSP unpreconditioned resid norm 8.777233482545e-04 true resid norm 8.777233482544e-04 ||r(i)||/||b|| 8.760652378615e-07 > 32 KSP unpreconditioned resid norm 6.546689191353e-04 true resid norm 6.546689191370e-04 ||r(i)||/||b|| 6.534321816833e-07 > 33 KSP unpreconditioned resid norm 4.973281362004e-04 true resid norm 4.973281362015e-04 ||r(i)||/||b|| 4.963886317973e-07 > 34 KSP unpreconditioned resid norm 3.775325682448e-04 true resid norm 3.775325682480e-04 ||r(i)||/||b|| 3.768193700902e-07 > 35 KSP unpreconditioned resid norm 2.908052735383e-04 true resid norm 2.908052735387e-04 ||r(i)||/||b|| 2.902559122310e-07 > 36 KSP unpreconditioned resid norm 2.239556213185e-04 true resid norm 2.239556213218e-04 ||r(i)||/||b|| 2.235325459370e-07 > 37 KSP unpreconditioned resid norm 1.698373081988e-04 true resid norm 1.698373081996e-04 ||r(i)||/||b|| 1.695164679184e-07 > 38 KSP unpreconditioned resid norm 1.277467123301e-04 true resid norm 1.277467123333e-04 ||r(i)||/||b|| 1.275053855510e-07 > 39 KSP unpreconditioned resid norm 9.506326626848e-05 true resid norm 9.506326626983e-05 ||r(i)||/||b|| 9.488368190523e-08 > 40 KSP unpreconditioned resid norm 7.223958235163e-05 true resid norm 7.223958235336e-05 ||r(i)||/||b|| 7.210311429367e-08 > 41 KSP unpreconditioned resid norm 5.509671615415e-05 true resid norm 5.509671615512e-05 ||r(i)||/||b|| 5.499263274677e-08 > 42 KSP unpreconditioned resid norm 4.189229263778e-05 true resid norm 4.189229263744e-05 ||r(i)||/||b|| 4.181315375394e-08 > 43 KSP unpreconditioned resid norm 3.067856645608e-05 true resid norm 3.067856645894e-05 ||r(i)||/||b|| 3.062061146665e-08 > 44 KSP unpreconditioned resid norm 2.340298386078e-05 true resid norm 2.340298386081e-05 ||r(i)||/||b|| 2.335877319825e-08 > 45 KSP unpreconditioned resid norm 1.791143784234e-05 true resid norm 1.791143784276e-05 ||r(i)||/||b|| 1.787760127991e-08 > 46 KSP unpreconditioned resid norm 1.355654057227e-05 true resid norm 1.355654057087e-05 ||r(i)||/||b|| 1.353093086041e-08 > 47 KSP unpreconditioned resid norm 1.020861247518e-05 true resid norm 1.020861247732e-05 ||r(i)||/||b|| 1.018932735009e-08 > 48 KSP unpreconditioned resid norm 7.642335784085e-06 true resid norm 7.642335784452e-06 ||r(i)||/||b|| 7.627898619923e-09 > 49 KSP unpreconditioned resid norm 5.874756954976e-06 true resid norm 5.874756956850e-06 ||r(i)||/||b|| 5.863658931960e-09 > 50 KSP unpreconditioned resid norm 4.512356844825e-06 true resid norm 4.512356846552e-06 ||r(i)||/||b|| 4.503832536701e-09 > 51 KSP unpreconditioned resid norm 3.438985239280e-06 true resid norm 3.438985240743e-06 ||r(i)||/||b|| 3.432488641125e-09 > 52 KSP unpreconditioned resid norm 2.655998139390e-06 true resid norm 2.655998140374e-06 ||r(i)||/||b|| 2.650980684556e-09 > 53 KSP unpreconditioned resid norm 2.051081181832e-06 true resid norm 2.051081181929e-06 ||r(i)||/||b|| 2.047206476954e-09 > 54 KSP unpreconditioned resid norm 1.581756364000e-06 true resid norm 1.581756364725e-06 ||r(i)||/||b|| 1.578768262982e-09 > 55 KSP unpreconditioned resid norm 1.207420527415e-06 true resid norm 1.207420527996e-06 ||r(i)||/||b|| 1.205139585453e-09 > 56 KSP unpreconditioned resid norm 9.377914349033e-07 true resid norm 9.377914347157e-07 ||r(i)||/||b|| 
9.360198494806e-10 > > I have the feeling I am self-inflicting some overhead not using relative tolerance (if I set 1e-3 I converge in 8 iters vs 56...), however I would like to ask the following question: > > From your point of view, is this a reasonably convergence history? > > Thank you! -------------- next part -------------- An HTML attachment was scrubbed... URL: From vilmer.dahlberg at solid.lth.se Fri May 5 09:54:51 2023 From: vilmer.dahlberg at solid.lth.se (Vilmer Dahlberg) Date: Fri, 5 May 2023 14:54:51 +0000 Subject: [petsc-users] Issues creating DMPlex from higher order mesh generated by gmsh Message-ID: <3d6762c6d99c4d319e8e985c91bc739e@solid.lth.se> Hi. I'm trying to read a mesh of higher element order, in this example a mesh consisting of 10-node tetrahedral elements, from gmsh, into PETSC. But It looks like the mesh is not properly being loaded and converted into a DMPlex. gmsh tells me it has generated a mesh with 7087 nodes, but when I view my dm object it tells me it has 1081 0-cells. This is the printout I get ... Info : Done meshing order 2 (Wall 0.0169823s, CPU 0.016662s) Info : 7087 nodes 5838 elements ... DM Object: DM_0x84000000_0 1 MPI process type: plex DM_0x84000000_0 in 3 dimensions: Number of 0-cells per rank: 1081 Number of 1-cells per rank: 6006 Number of 2-cells per rank: 9104 Number of 3-cells per rank: 4178 Labels: celltype: 4 strata with value/size (0 (1081), 6 (4178), 3 (9104), 1 (6006)) depth: 4 strata with value/size (0 (1081), 1 (6006), 2 (9104), 3 (4178)) Cell Sets: 1 strata with value/size (2 (4178)) Face Sets: 6 strata with value/size (12 (190), 21 (242), 20 (242), 11 (192), 22 (242), 10 (188)) Field P2: adjacency FEM ... To replicate the error try generating a mesh according to https://gmsh.info/doc/texinfo/gmsh.html#t5 setting the element order to 2, and then loading the mesh using DMPlexCreateGmshFromFile I don't have any issues when i set the element order to 1. Thanks in advance, Vilmer -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Fri May 5 09:55:54 2023 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 5 May 2023 10:55:54 -0400 Subject: [petsc-users] sources of floating point randomness in JFNK in serial In-Reply-To: References: <5318727D-B9F9-48BC-A7CE-94EBDB08566F@petsc.dev> Message-ID: Mark, Thank you. You do have aggressive optimizations: -O3 -march=native, which means out-of-order instructions may be performed thus, two runs may have different order of operations and possibly different round-off values. You could try turning off all of this with -O0 for an experiment and see what happens. My guess is that you will see much smaller differences in the residuals. Barry > On May 5, 2023, at 8:11 AM, Mark Lohry wrote: > > > > On Thu, May 4, 2023 at 9:51?PM Barry Smith > wrote: >> >> Send configure.log >> >> >>> On May 4, 2023, at 5:35 PM, Mark Lohry > wrote: >>> >>>> Sure, but why only once and why save to disk? Why not just use that computed approximate Jacobian at each Newton step to drive the Newton solves along for a bunch of time steps? >>> >>> Ah I get what you mean. Okay I did three newton steps with the same LHS, with a few repeated manual tests. 3 out of 4 times i got the same exact history. is it in the realm of possibility that a hardware error could cause something this subtle, bad memory bit or something? >>> >>> 2 runs of 3 newton solves below, ever-so-slightly different. 
>>> >>> >>> 0 SNES Function norm 3.424003312857e+04 >>> 0 KSP Residual norm 3.424003312857e+04 >>> 1 KSP Residual norm 2.886124328003e+04 >>> 2 KSP Residual norm 2.504664994246e+04 >>> 3 KSP Residual norm 2.104615835161e+04 >>> 4 KSP Residual norm 1.938102896632e+04 >>> 5 KSP Residual norm 1.793774642408e+04 >>> 6 KSP Residual norm 1.671392566980e+04 >>> 7 KSP Residual norm 1.501504103873e+04 >>> 8 KSP Residual norm 1.366362900747e+04 >>> 9 KSP Residual norm 1.240398500429e+04 >>> 10 KSP Residual norm 1.156293733914e+04 >>> 11 KSP Residual norm 1.066296477958e+04 >>> 12 KSP Residual norm 9.835601966950e+03 >>> 13 KSP Residual norm 9.017480191491e+03 >>> 14 KSP Residual norm 8.415336139780e+03 >>> 15 KSP Residual norm 7.807497808435e+03 >>> 16 KSP Residual norm 7.341703768294e+03 >>> 17 KSP Residual norm 6.979298049282e+03 >>> 18 KSP Residual norm 6.521277772081e+03 >>> 19 KSP Residual norm 6.174842408773e+03 >>> 20 KSP Residual norm 5.889819665003e+03 >>> Linear solve converged due to CONVERGED_ITS iterations 20 >>> KSP Object: 1 MPI process >>> type: gmres >>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=20, initial guess is zero >>> tolerances: relative=0.1, absolute=1e-15, divergence=10. >>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI process >>> type: none >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI process >>> type: seqbaij >>> rows=16384, cols=16384, bs=16 >>> total: nonzeros=1277952, allocated nonzeros=1277952 >>> total number of mallocs used during MatSetValues calls=0 >>> block size is 16 >>> 1 SNES Function norm 1.000525348433e+04 >>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>> SNES Object: 1 MPI process >>> type: newtonls >>> maximum iterations=1, maximum function evaluations=-1 >>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>> total number of linear solver iterations=20 >>> total number of function evaluations=2 >>> norm schedule ALWAYS >>> Jacobian is never rebuilt >>> Jacobian is built using finite differences with coloring >>> SNESLineSearch Object: 1 MPI process >>> type: basic >>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 >>> maximum iterations=40 >>> KSP Object: 1 MPI process >>> type: gmres >>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=20, initial guess is zero >>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI process >>> type: none >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI process >>> type: seqbaij >>> rows=16384, cols=16384, bs=16 >>> total: nonzeros=1277952, allocated nonzeros=1277952 >>> total number of mallocs used during MatSetValues calls=0 >>> block size is 16 >>> 0 SNES Function norm 1.000525348433e+04 >>> 0 KSP Residual norm 1.000525348433e+04 >>> 1 KSP Residual norm 7.908741564765e+03 >>> 2 KSP Residual norm 6.825263536686e+03 >>> 3 KSP Residual norm 6.224930664968e+03 >>> 4 KSP Residual norm 6.095547180532e+03 >>> 5 KSP Residual norm 5.952968230430e+03 >>> 6 KSP Residual norm 5.861251998116e+03 >>> 7 KSP Residual norm 5.712439327755e+03 >>> 8 KSP Residual norm 5.583056913266e+03 >>> 9 KSP Residual norm 5.461768804626e+03 >>> 10 KSP Residual norm 5.351937611098e+03 >>> 11 KSP Residual norm 5.224288337578e+03 >>> 12 KSP Residual norm 5.129863847081e+03 >>> 13 KSP Residual norm 5.010818237218e+03 >>> 14 KSP Residual norm 4.907162936199e+03 >>> 15 KSP Residual norm 4.789564773955e+03 >>> 16 KSP Residual norm 4.695173370720e+03 >>> 17 KSP Residual norm 4.584070962171e+03 >>> 18 KSP Residual norm 4.483061424742e+03 >>> 19 KSP Residual norm 4.373384070745e+03 >>> 20 KSP Residual norm 4.260704657592e+03 >>> Linear solve converged due to CONVERGED_ITS iterations 20 >>> KSP Object: 1 MPI process >>> type: gmres >>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=20, initial guess is zero >>> tolerances: relative=0.1, absolute=1e-15, divergence=10. >>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI process >>> type: none >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI process >>> type: seqbaij >>> rows=16384, cols=16384, bs=16 >>> total: nonzeros=1277952, allocated nonzeros=1277952 >>> total number of mallocs used during MatSetValues calls=0 >>> block size is 16 >>> 1 SNES Function norm 4.662386014882e+03 >>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>> SNES Object: 1 MPI process >>> type: newtonls >>> maximum iterations=1, maximum function evaluations=-1 >>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>> total number of linear solver iterations=20 >>> total number of function evaluations=2 >>> norm schedule ALWAYS >>> Jacobian is never rebuilt >>> Jacobian is built using finite differences with coloring >>> SNESLineSearch Object: 1 MPI process >>> type: basic >>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 >>> maximum iterations=40 >>> KSP Object: 1 MPI process >>> type: gmres >>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=20, initial guess is zero >>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI process >>> type: none >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI process >>> type: seqbaij >>> rows=16384, cols=16384, bs=16 >>> total: nonzeros=1277952, allocated nonzeros=1277952 >>> total number of mallocs used during MatSetValues calls=0 >>> block size is 16 >>> 0 SNES Function norm 4.662386014882e+03 >>> 0 KSP Residual norm 4.662386014882e+03 >>> 1 KSP Residual norm 4.408316259864e+03 >>> 2 KSP Residual norm 4.184867769829e+03 >>> 3 KSP Residual norm 4.079091244351e+03 >>> 4 KSP Residual norm 4.009247390166e+03 >>> 5 KSP Residual norm 3.928417371428e+03 >>> 6 KSP Residual norm 3.865152075780e+03 >>> 7 KSP Residual norm 3.795606446033e+03 >>> 8 KSP Residual norm 3.735294554158e+03 >>> 9 KSP Residual norm 3.674393726487e+03 >>> 10 KSP Residual norm 3.617795166786e+03 >>> 11 KSP Residual norm 3.563807982274e+03 >>> 12 KSP Residual norm 3.512269444921e+03 >>> 13 KSP Residual norm 3.455110223236e+03 >>> 14 KSP Residual norm 3.407141247372e+03 >>> 15 KSP Residual norm 3.356562415982e+03 >>> 16 KSP Residual norm 3.312720047685e+03 >>> 17 KSP Residual norm 3.263690150810e+03 >>> 18 KSP Residual norm 3.219359862444e+03 >>> 19 KSP Residual norm 3.173500955995e+03 >>> 20 KSP Residual norm 3.127528790155e+03 >>> Linear solve converged due to CONVERGED_ITS iterations 20 >>> KSP Object: 1 MPI process >>> type: gmres >>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=20, initial guess is zero >>> tolerances: relative=0.1, absolute=1e-15, divergence=10. >>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI process >>> type: none >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI process >>> type: seqbaij >>> rows=16384, cols=16384, bs=16 >>> total: nonzeros=1277952, allocated nonzeros=1277952 >>> total number of mallocs used during MatSetValues calls=0 >>> block size is 16 >>> 1 SNES Function norm 3.186752172556e+03 >>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>> SNES Object: 1 MPI process >>> type: newtonls >>> maximum iterations=1, maximum function evaluations=-1 >>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>> total number of linear solver iterations=20 >>> total number of function evaluations=2 >>> norm schedule ALWAYS >>> Jacobian is never rebuilt >>> Jacobian is built using finite differences with coloring >>> SNESLineSearch Object: 1 MPI process >>> type: basic >>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 >>> maximum iterations=40 >>> KSP Object: 1 MPI process >>> type: gmres >>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=20, initial guess is zero >>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI process >>> type: none >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI process >>> type: seqbaij >>> rows=16384, cols=16384, bs=16 >>> total: nonzeros=1277952, allocated nonzeros=1277952 >>> total number of mallocs used during MatSetValues calls=0 >>> block size is 16 >>> >>> >>> >>> 0 SNES Function norm 3.424003312857e+04 >>> 0 KSP Residual norm 3.424003312857e+04 >>> 1 KSP Residual norm 2.886124328003e+04 >>> 2 KSP Residual norm 2.504664994221e+04 >>> 3 KSP Residual norm 2.104615835130e+04 >>> 4 KSP Residual norm 1.938102896610e+04 >>> 5 KSP Residual norm 1.793774642406e+04 >>> 6 KSP Residual norm 1.671392566981e+04 >>> 7 KSP Residual norm 1.501504103854e+04 >>> 8 KSP Residual norm 1.366362900726e+04 >>> 9 KSP Residual norm 1.240398500414e+04 >>> 10 KSP Residual norm 1.156293733914e+04 >>> 11 KSP Residual norm 1.066296477972e+04 >>> 12 KSP Residual norm 9.835601967036e+03 >>> 13 KSP Residual norm 9.017480191500e+03 >>> 14 KSP Residual norm 8.415336139732e+03 >>> 15 KSP Residual norm 7.807497808414e+03 >>> 16 KSP Residual norm 7.341703768300e+03 >>> 17 KSP Residual norm 6.979298049244e+03 >>> 18 KSP Residual norm 6.521277772042e+03 >>> 19 KSP Residual norm 6.174842408713e+03 >>> 20 KSP Residual norm 5.889819664983e+03 >>> Linear solve converged due to CONVERGED_ITS iterations 20 >>> KSP Object: 1 MPI process >>> type: gmres >>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=20, initial guess is zero >>> tolerances: relative=0.1, absolute=1e-15, divergence=10. >>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI process >>> type: none >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI process >>> type: seqbaij >>> rows=16384, cols=16384, bs=16 >>> total: nonzeros=1277952, allocated nonzeros=1277952 >>> total number of mallocs used during MatSetValues calls=0 >>> block size is 16 >>> 1 SNES Function norm 1.000525348435e+04 >>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>> SNES Object: 1 MPI process >>> type: newtonls >>> maximum iterations=1, maximum function evaluations=-1 >>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>> total number of linear solver iterations=20 >>> total number of function evaluations=2 >>> norm schedule ALWAYS >>> Jacobian is never rebuilt >>> Jacobian is built using finite differences with coloring >>> SNESLineSearch Object: 1 MPI process >>> type: basic >>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 >>> maximum iterations=40 >>> KSP Object: 1 MPI process >>> type: gmres >>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=20, initial guess is zero >>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI process >>> type: none >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI process >>> type: seqbaij >>> rows=16384, cols=16384, bs=16 >>> total: nonzeros=1277952, allocated nonzeros=1277952 >>> total number of mallocs used during MatSetValues calls=0 >>> block size is 16 >>> 0 SNES Function norm 1.000525348435e+04 >>> 0 KSP Residual norm 1.000525348435e+04 >>> 1 KSP Residual norm 7.908741565645e+03 >>> 2 KSP Residual norm 6.825263536988e+03 >>> 3 KSP Residual norm 6.224930664967e+03 >>> 4 KSP Residual norm 6.095547180474e+03 >>> 5 KSP Residual norm 5.952968230397e+03 >>> 6 KSP Residual norm 5.861251998127e+03 >>> 7 KSP Residual norm 5.712439327726e+03 >>> 8 KSP Residual norm 5.583056913167e+03 >>> 9 KSP Residual norm 5.461768804526e+03 >>> 10 KSP Residual norm 5.351937611030e+03 >>> 11 KSP Residual norm 5.224288337536e+03 >>> 12 KSP Residual norm 5.129863847028e+03 >>> 13 KSP Residual norm 5.010818237161e+03 >>> 14 KSP Residual norm 4.907162936143e+03 >>> 15 KSP Residual norm 4.789564773923e+03 >>> 16 KSP Residual norm 4.695173370709e+03 >>> 17 KSP Residual norm 4.584070962145e+03 >>> 18 KSP Residual norm 4.483061424714e+03 >>> 19 KSP Residual norm 4.373384070713e+03 >>> 20 KSP Residual norm 4.260704657576e+03 >>> Linear solve converged due to CONVERGED_ITS iterations 20 >>> KSP Object: 1 MPI process >>> type: gmres >>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=20, initial guess is zero >>> tolerances: relative=0.1, absolute=1e-15, divergence=10. >>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI process >>> type: none >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI process >>> type: seqbaij >>> rows=16384, cols=16384, bs=16 >>> total: nonzeros=1277952, allocated nonzeros=1277952 >>> total number of mallocs used during MatSetValues calls=0 >>> block size is 16 >>> 1 SNES Function norm 4.662386014874e+03 >>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>> SNES Object: 1 MPI process >>> type: newtonls >>> maximum iterations=1, maximum function evaluations=-1 >>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>> total number of linear solver iterations=20 >>> total number of function evaluations=2 >>> norm schedule ALWAYS >>> Jacobian is never rebuilt >>> Jacobian is built using finite differences with coloring >>> SNESLineSearch Object: 1 MPI process >>> type: basic >>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 >>> maximum iterations=40 >>> KSP Object: 1 MPI process >>> type: gmres >>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=20, initial guess is zero >>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI process >>> type: none >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI process >>> type: seqbaij >>> rows=16384, cols=16384, bs=16 >>> total: nonzeros=1277952, allocated nonzeros=1277952 >>> total number of mallocs used during MatSetValues calls=0 >>> block size is 16 >>> 0 SNES Function norm 4.662386014874e+03 >>> 0 KSP Residual norm 4.662386014874e+03 >>> 1 KSP Residual norm 4.408316259834e+03 >>> 2 KSP Residual norm 4.184867769891e+03 >>> 3 KSP Residual norm 4.079091244367e+03 >>> 4 KSP Residual norm 4.009247390184e+03 >>> 5 KSP Residual norm 3.928417371457e+03 >>> 6 KSP Residual norm 3.865152075802e+03 >>> 7 KSP Residual norm 3.795606446041e+03 >>> 8 KSP Residual norm 3.735294554160e+03 >>> 9 KSP Residual norm 3.674393726485e+03 >>> 10 KSP Residual norm 3.617795166775e+03 >>> 11 KSP Residual norm 3.563807982249e+03 >>> 12 KSP Residual norm 3.512269444873e+03 >>> 13 KSP Residual norm 3.455110223193e+03 >>> 14 KSP Residual norm 3.407141247334e+03 >>> 15 KSP Residual norm 3.356562415949e+03 >>> 16 KSP Residual norm 3.312720047652e+03 >>> 17 KSP Residual norm 3.263690150782e+03 >>> 18 KSP Residual norm 3.219359862425e+03 >>> 19 KSP Residual norm 3.173500955997e+03 >>> 20 KSP Residual norm 3.127528790156e+03 >>> Linear solve converged due to CONVERGED_ITS iterations 20 >>> KSP Object: 1 MPI process >>> type: gmres >>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=20, initial guess is zero >>> tolerances: relative=0.1, absolute=1e-15, divergence=10. >>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI process >>> type: none >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI process >>> type: seqbaij >>> rows=16384, cols=16384, bs=16 >>> total: nonzeros=1277952, allocated nonzeros=1277952 >>> total number of mallocs used during MatSetValues calls=0 >>> block size is 16 >>> 1 SNES Function norm 3.186752172503e+03 >>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>> SNES Object: 1 MPI process >>> type: newtonls >>> maximum iterations=1, maximum function evaluations=-1 >>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>> total number of linear solver iterations=20 >>> total number of function evaluations=2 >>> norm schedule ALWAYS >>> Jacobian is never rebuilt >>> Jacobian is built using finite differences with coloring >>> SNESLineSearch Object: 1 MPI process >>> type: basic >>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 >>> maximum iterations=40 >>> KSP Object: 1 MPI process >>> type: gmres >>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=20, initial guess is zero >>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI process >>> type: none >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI process >>> type: seqbaij >>> rows=16384, cols=16384, bs=16 >>> total: nonzeros=1277952, allocated nonzeros=1277952 >>> total number of mallocs used during MatSetValues calls=0 >>> block size is 16 >>> >>> On Thu, May 4, 2023 at 5:22?PM Matthew Knepley > wrote: >>>> On Thu, May 4, 2023 at 5:03?PM Mark Lohry > wrote: >>>>>> Do you get different results (in different runs) without -snes_mf_operator? So just using an explicit matrix? >>>>> >>>>> Unfortunately I don't have an explicit matrix available for this, hence the MFFD/JFNK. >>>> >>>> I don't mean the actual matrix, I mean a representative matrix. >>>> >>>>>> >>>>>> (Note: I am not convinced there is even a problem and think it may be simply different order of floating point operations in different runs.) >>>>> >>>>> I'm not convinced either, but running explicit RK for 10,000 iterations i get exactly the same results every time so i'm fairly confident it's not the residual evaluation. >>>>> How would there be a different order of floating point ops in different runs in serial? >>>>> >>>>>> No, I mean without -snes_mf_* (as Barry says), so we are just running that solver with a sparse matrix. This would give me confidence >>>>>> that nothing in the solver is variable. >>>>>> >>>>> I could do the sparse finite difference jacobian once, save it to disk, and then use that system each time. >>>> >>>> Yes. That would work. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>>> On Thu, May 4, 2023 at 4:57?PM Matthew Knepley > wrote: >>>>>> On Thu, May 4, 2023 at 4:44?PM Mark Lohry > wrote: >>>>>>>> Is your code valgrind clean? >>>>>>> >>>>>>> Yes, I also initialize all allocations with NaNs to be sure I'm not using anything uninitialized. >>>>>>> >>>>>>>> >>>>>>>> We can try and test this. Replace your MatMFFD with an actual matrix and run. Do you see any variability? >>>>>>> >>>>>>> I think I did what you're asking. I have -snes_mf_operator set, and then SNESSetJacobian(snes, diag_ones, diag_ones, NULL, NULL) where diag_ones is a matrix with ones on the diagonal. Two runs below, still with differences but sometimes identical. >>>>>> >>>>>> No, I mean without -snes_mf_* (as Barry says), so we are just running that solver with a sparse matrix. This would give me confidence >>>>>> that nothing in the solver is variable. 
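A minimal sketch, assuming standard PETSc binary viewers and placeholder names ("jacobian.bin", J, Jfixed, snes), of the "compute the sparse finite difference Jacobian once, save it to disk, and use that system each time" experiment discussed above; reloading the saved matrix keeps the linear operator bitwise identical from run to run:

    /* write the assembled FD-colored Jacobian J once */
    PetscViewer viewer;
    PetscCall(PetscViewerBinaryOpen(PETSC_COMM_WORLD, "jacobian.bin", FILE_MODE_WRITE, &viewer));
    PetscCall(MatView(J, viewer));
    PetscCall(PetscViewerDestroy(&viewer));

    /* in later runs, load it instead of recomputing */
    Mat Jfixed;
    PetscCall(MatCreate(PETSC_COMM_WORLD, &Jfixed));
    PetscCall(MatSetFromOptions(Jfixed));
    PetscCall(PetscViewerBinaryOpen(PETSC_COMM_WORLD, "jacobian.bin", FILE_MODE_READ, &viewer));
    PetscCall(MatLoad(Jfixed, viewer));
    PetscCall(PetscViewerDestroy(&viewer));
    PetscCall(SNESSetJacobian(snes, Jfixed, Jfixed, NULL, NULL));  /* frozen operator and preconditioner matrix */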
>>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>>> 0 SNES Function norm 3.424003312857e+04 >>>>>>> 0 KSP Residual norm 3.424003312857e+04 >>>>>>> 1 KSP Residual norm 2.871734444536e+04 >>>>>>> 2 KSP Residual norm 2.490276930242e+04 >>>>>>> 3 KSP Residual norm 2.131675872968e+04 >>>>>>> 4 KSP Residual norm 1.973129814235e+04 >>>>>>> 5 KSP Residual norm 1.832377856317e+04 >>>>>>> 6 KSP Residual norm 1.716783617436e+04 >>>>>>> 7 KSP Residual norm 1.583963149542e+04 >>>>>>> 8 KSP Residual norm 1.482272170304e+04 >>>>>>> 9 KSP Residual norm 1.380312106742e+04 >>>>>>> 10 KSP Residual norm 1.297793480658e+04 >>>>>>> 11 KSP Residual norm 1.208599123244e+04 >>>>>>> 12 KSP Residual norm 1.137345655227e+04 >>>>>>> 13 KSP Residual norm 1.059676909366e+04 >>>>>>> 14 KSP Residual norm 1.003823862398e+04 >>>>>>> 15 KSP Residual norm 9.425879221354e+03 >>>>>>> 16 KSP Residual norm 8.954805890038e+03 >>>>>>> 17 KSP Residual norm 8.592372470456e+03 >>>>>>> 18 KSP Residual norm 8.060707175821e+03 >>>>>>> 19 KSP Residual norm 7.782057728723e+03 >>>>>>> 20 KSP Residual norm 7.449686095424e+03 >>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>> KSP Object: 1 MPI process >>>>>>> type: gmres >>>>>>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>>>>>> happy breakdown tolerance 1e-30 >>>>>>> maximum iterations=20, initial guess is zero >>>>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. >>>>>>> left preconditioning >>>>>>> using PRECONDITIONED norm type for convergence test >>>>>>> PC Object: 1 MPI process >>>>>>> type: none >>>>>>> linear system matrix followed by preconditioner matrix: >>>>>>> Mat Object: 1 MPI process >>>>>>> type: mffd >>>>>>> rows=16384, cols=16384 >>>>>>> Matrix-free approximation: >>>>>>> err=1.49012e-08 (relative error in function evaluation) >>>>>>> Using wp compute h routine >>>>>>> Does not compute normU >>>>>>> Mat Object: 1 MPI process >>>>>>> type: seqaij >>>>>>> rows=16384, cols=16384 >>>>>>> total: nonzeros=16384, allocated nonzeros=16384 >>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>> not using I-node routines >>>>>>> 1 SNES Function norm 1.085015646971e+04 >>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>>>> SNES Object: 1 MPI process >>>>>>> type: newtonls >>>>>>> maximum iterations=1, maximum function evaluations=-1 >>>>>>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>>>>>> total number of linear solver iterations=20 >>>>>>> total number of function evaluations=23 >>>>>>> norm schedule ALWAYS >>>>>>> Jacobian is never rebuilt >>>>>>> Jacobian is applied matrix-free with differencing >>>>>>> Preconditioning Jacobian is built using finite differences with coloring >>>>>>> SNESLineSearch Object: 1 MPI process >>>>>>> type: basic >>>>>>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>>>>>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 >>>>>>> maximum iterations=40 >>>>>>> KSP Object: 1 MPI process >>>>>>> type: gmres >>>>>>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>>>>>> happy breakdown tolerance 1e-30 >>>>>>> maximum iterations=20, initial guess is zero >>>>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>>>>>> left preconditioning >>>>>>> using PRECONDITIONED norm type for convergence test >>>>>>> PC Object: 1 MPI process >>>>>>> type: none >>>>>>> linear system matrix followed by preconditioner matrix: >>>>>>> Mat Object: 1 MPI process >>>>>>> type: mffd >>>>>>> rows=16384, cols=16384 >>>>>>> Matrix-free approximation: >>>>>>> err=1.49012e-08 (relative error in function evaluation) >>>>>>> Using wp compute h routine >>>>>>> Does not compute normU >>>>>>> Mat Object: 1 MPI process >>>>>>> type: seqaij >>>>>>> rows=16384, cols=16384 >>>>>>> total: nonzeros=16384, allocated nonzeros=16384 >>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>> not using I-node routines >>>>>>> >>>>>>> 0 SNES Function norm 3.424003312857e+04 >>>>>>> 0 KSP Residual norm 3.424003312857e+04 >>>>>>> 1 KSP Residual norm 2.871734444536e+04 >>>>>>> 2 KSP Residual norm 2.490276931041e+04 >>>>>>> 3 KSP Residual norm 2.131675873776e+04 >>>>>>> 4 KSP Residual norm 1.973129814908e+04 >>>>>>> 5 KSP Residual norm 1.832377852186e+04 >>>>>>> 6 KSP Residual norm 1.716783608174e+04 >>>>>>> 7 KSP Residual norm 1.583963128956e+04 >>>>>>> 8 KSP Residual norm 1.482272160069e+04 >>>>>>> 9 KSP Residual norm 1.380312087005e+04 >>>>>>> 10 KSP Residual norm 1.297793458796e+04 >>>>>>> 11 KSP Residual norm 1.208599115602e+04 >>>>>>> 12 KSP Residual norm 1.137345657533e+04 >>>>>>> 13 KSP Residual norm 1.059676906197e+04 >>>>>>> 14 KSP Residual norm 1.003823857515e+04 >>>>>>> 15 KSP Residual norm 9.425879177747e+03 >>>>>>> 16 KSP Residual norm 8.954805850825e+03 >>>>>>> 17 KSP Residual norm 8.592372413320e+03 >>>>>>> 18 KSP Residual norm 8.060706994110e+03 >>>>>>> 19 KSP Residual norm 7.782057560782e+03 >>>>>>> 20 KSP Residual norm 7.449686034356e+03 >>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>> KSP Object: 1 MPI process >>>>>>> type: gmres >>>>>>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>>>>>> happy breakdown tolerance 1e-30 >>>>>>> maximum iterations=20, initial guess is zero >>>>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>>>>>> left preconditioning >>>>>>> using PRECONDITIONED norm type for convergence test >>>>>>> PC Object: 1 MPI process >>>>>>> type: none >>>>>>> linear system matrix followed by preconditioner matrix: >>>>>>> Mat Object: 1 MPI process >>>>>>> type: mffd >>>>>>> rows=16384, cols=16384 >>>>>>> Matrix-free approximation: >>>>>>> err=1.49012e-08 (relative error in function evaluation) >>>>>>> Using wp compute h routine >>>>>>> Does not compute normU >>>>>>> Mat Object: 1 MPI process >>>>>>> type: seqaij >>>>>>> rows=16384, cols=16384 >>>>>>> total: nonzeros=16384, allocated nonzeros=16384 >>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>> not using I-node routines >>>>>>> 1 SNES Function norm 1.085015821006e+04 >>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>>>> SNES Object: 1 MPI process >>>>>>> type: newtonls >>>>>>> maximum iterations=1, maximum function evaluations=-1 >>>>>>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>>>>>> total number of linear solver iterations=20 >>>>>>> total number of function evaluations=23 >>>>>>> norm schedule ALWAYS >>>>>>> Jacobian is never rebuilt >>>>>>> Jacobian is applied matrix-free with differencing >>>>>>> Preconditioning Jacobian is built using finite differences with coloring >>>>>>> SNESLineSearch Object: 1 MPI process >>>>>>> type: basic >>>>>>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>>>>>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 >>>>>>> maximum iterations=40 >>>>>>> KSP Object: 1 MPI process >>>>>>> type: gmres >>>>>>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>>>>>> happy breakdown tolerance 1e-30 >>>>>>> maximum iterations=20, initial guess is zero >>>>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. >>>>>>> left preconditioning >>>>>>> using PRECONDITIONED norm type for convergence test >>>>>>> PC Object: 1 MPI process >>>>>>> type: none >>>>>>> linear system matrix followed by preconditioner matrix: >>>>>>> Mat Object: 1 MPI process >>>>>>> type: mffd >>>>>>> rows=16384, cols=16384 >>>>>>> Matrix-free approximation: >>>>>>> err=1.49012e-08 (relative error in function evaluation) >>>>>>> Using wp compute h routine >>>>>>> Does not compute normU >>>>>>> Mat Object: 1 MPI process >>>>>>> type: seqaij >>>>>>> rows=16384, cols=16384 >>>>>>> total: nonzeros=16384, allocated nonzeros=16384 >>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>> not using I-node routines >>>>>>> >>>>>>> On Thu, May 4, 2023 at 10:10?AM Matthew Knepley > wrote: >>>>>>>> On Thu, May 4, 2023 at 8:54?AM Mark Lohry > wrote: >>>>>>>>>> Try -pc_type none. >>>>>>>>> >>>>>>>>> With -pc_type none the 0 KSP residual looks identical. But *sometimes* it's producing exactly the same history and others it's gradually changing. I'm reasonably confident my residual evaluation has no randomness, see info after the petsc output. >>>>>>>> >>>>>>>> We can try and test this. Replace your MatMFFD with an actual matrix and run. Do you see any variability? >>>>>>>> >>>>>>>> If not, then it could be your routine, or it could be MatMFFD. So run a few with -snes_view, and we can see if the >>>>>>>> "w" parameter changes. 
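A short sketch of the "replace your MatMFFD with an actual matrix" test, assuming J is the sparse matrix already assembled by finite differences with coloring (J and snes are placeholder names): hand that matrix to SNES as both the operator and the preconditioner matrix and do not pass -snes_mf_operator, so no mffd Mat enters the linear solve at all.

    /* sketch only: assembled FD-colored matrix used as the actual KSP operator */
    PetscCall(SNESSetJacobian(snes, J, J, SNESComputeJacobianDefaultColor, NULL));
    /* with a NULL context PETSc will try to build the coloring from the DM or the
       matrix nonzero pattern; run with e.g.
         -snes_monitor -ksp_monitor -pc_type none -snes_view
       and compare two runs: any remaining run-to-run drift cannot come from MatMFFD. */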
>>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Matt >>>>>>>> >>>>>>>>> solve history 1: >>>>>>>>> >>>>>>>>> 0 SNES Function norm 3.424003312857e+04 >>>>>>>>> 0 KSP Residual norm 3.424003312857e+04 >>>>>>>>> 1 KSP Residual norm 2.871734444536e+04 >>>>>>>>> 2 KSP Residual norm 2.490276931041e+04 >>>>>>>>> ... >>>>>>>>> 20 KSP Residual norm 7.449686034356e+03 >>>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>>>> 1 SNES Function norm 1.085015821006e+04 >>>>>>>>> >>>>>>>>> solve history 2, identical to 1: >>>>>>>>> >>>>>>>>> 0 SNES Function norm 3.424003312857e+04 >>>>>>>>> 0 KSP Residual norm 3.424003312857e+04 >>>>>>>>> 1 KSP Residual norm 2.871734444536e+04 >>>>>>>>> 2 KSP Residual norm 2.490276931041e+04 >>>>>>>>> ... >>>>>>>>> 20 KSP Residual norm 7.449686034356e+03 >>>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>>>> 1 SNES Function norm 1.085015821006e+04 >>>>>>>>> >>>>>>>>> solve history 3, identical KSP at 0 and 1, slight change at 2, growing difference to the end: >>>>>>>>> 0 SNES Function norm 3.424003312857e+04 >>>>>>>>> 0 KSP Residual norm 3.424003312857e+04 >>>>>>>>> 1 KSP Residual norm 2.871734444536e+04 >>>>>>>>> 2 KSP Residual norm 2.490276930242e+04 >>>>>>>>> ... >>>>>>>>> 20 KSP Residual norm 7.449686095424e+03 >>>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>>>> 1 SNES Function norm 1.085015646971e+04 >>>>>>>>> >>>>>>>>> >>>>>>>>> Ths is using a standard explicit 3-stage Runge-Kutta smoother for 10 iterations, so 30 calls of the same residual evaluation, identical residuals every time >>>>>>>>> >>>>>>>>> run 1: >>>>>>>>> >>>>>>>>> # iteration rho rhou rhov rhoE abs_res rel_res umin vmax vmin elapsed_time >>>>>>>>> # >>>>>>>>> 1.00000e+00 1.086860616292e+00 2.782316758416e+02 4.482867643761e+00 2.993435920340e+02 2.04353e+02 1.00000e+00 -8.23945e-15 -6.15326e-15 -1.35563e-14 6.34834e-01 >>>>>>>>> 2.00000e+00 2.310547487017e+00 1.079059352425e+02 3.958323921837e+00 5.058927165686e+02 2.58647e+02 1.26568e+00 -1.02539e-14 -9.35368e-15 -1.69925e-14 6.40063e-01 >>>>>>>>> 3.00000e+00 2.361005867444e+00 5.706213331683e+01 6.130016323357e+00 4.688968362579e+02 2.36201e+02 1.15585e+00 -1.19370e-14 -1.15216e-14 -1.59733e-14 6.45166e-01 >>>>>>>>> 4.00000e+00 2.167518999963e+00 3.757541401594e+01 6.313917437428e+00 4.054310291628e+02 2.03612e+02 9.96372e-01 -1.81831e-14 -1.28312e-14 -1.46238e-14 6.50494e-01 >>>>>>>>> 5.00000e+00 1.941443738676e+00 2.884190334049e+01 6.237106158479e+00 3.539201037156e+02 1.77577e+02 8.68970e-01 3.56633e-14 -8.74089e-15 -1.06666e-14 6.55656e-01 >>>>>>>>> 6.00000e+00 1.736947124693e+00 2.429485695670e+01 5.996962200407e+00 3.148280178142e+02 1.57913e+02 7.72745e-01 -8.98634e-14 -2.41152e-14 -1.39713e-14 6.60872e-01 >>>>>>>>> 7.00000e+00 1.564153212635e+00 2.149609219810e+01 5.786910705204e+00 2.848717011033e+02 1.42872e+02 6.99144e-01 -2.95352e-13 -2.48158e-14 -2.39351e-14 6.66041e-01 >>>>>>>>> 8.00000e+00 1.419280815384e+00 1.950619804089e+01 5.627281158306e+00 2.606623371229e+02 1.30728e+02 6.39715e-01 8.98941e-13 1.09674e-13 3.78905e-14 6.71316e-01 >>>>>>>>> 9.00000e+00 1.296115915975e+00 1.794843530745e+01 5.514933264437e+00 2.401524522393e+02 1.20444e+02 5.89394e-01 1.70717e-12 1.38762e-14 1.09825e-13 6.76447e-01 >>>>>>>>> 1.00000e+01 1.189639693918e+00 1.665381754953e+01 5.433183087037e+00 2.222572900473e+02 1.11475e+02 5.45501e-01 -4.22462e-12 -7.15206e-13 -2.28736e-13 6.81716e-01 >>>>>>>>> >>>>>>>>> run N: >>>>>>>>> >>>>>>>>> >>>>>>>>> # >>>>>>>>> # iteration rho rhou rhov rhoE 
abs_res rel_res umin vmax vmin elapsed_time >>>>>>>>> # >>>>>>>>> 1.00000e+00 1.086860616292e+00 2.782316758416e+02 4.482867643761e+00 2.993435920340e+02 2.04353e+02 1.00000e+00 -8.23945e-15 -6.15326e-15 -1.35563e-14 6.23316e-01 >>>>>>>>> 2.00000e+00 2.310547487017e+00 1.079059352425e+02 3.958323921837e+00 5.058927165686e+02 2.58647e+02 1.26568e+00 -1.02539e-14 -9.35368e-15 -1.69925e-14 6.28510e-01 >>>>>>>>> 3.00000e+00 2.361005867444e+00 5.706213331683e+01 6.130016323357e+00 4.688968362579e+02 2.36201e+02 1.15585e+00 -1.19370e-14 -1.15216e-14 -1.59733e-14 6.33558e-01 >>>>>>>>> 4.00000e+00 2.167518999963e+00 3.757541401594e+01 6.313917437428e+00 4.054310291628e+02 2.03612e+02 9.96372e-01 -1.81831e-14 -1.28312e-14 -1.46238e-14 6.38773e-01 >>>>>>>>> 5.00000e+00 1.941443738676e+00 2.884190334049e+01 6.237106158479e+00 3.539201037156e+02 1.77577e+02 8.68970e-01 3.56633e-14 -8.74089e-15 -1.06666e-14 6.43887e-01 >>>>>>>>> 6.00000e+00 1.736947124693e+00 2.429485695670e+01 5.996962200407e+00 3.148280178142e+02 1.57913e+02 7.72745e-01 -8.98634e-14 -2.41152e-14 -1.39713e-14 6.49073e-01 >>>>>>>>> 7.00000e+00 1.564153212635e+00 2.149609219810e+01 5.786910705204e+00 2.848717011033e+02 1.42872e+02 6.99144e-01 -2.95352e-13 -2.48158e-14 -2.39351e-14 6.54167e-01 >>>>>>>>> 8.00000e+00 1.419280815384e+00 1.950619804089e+01 5.627281158306e+00 2.606623371229e+02 1.30728e+02 6.39715e-01 8.98941e-13 1.09674e-13 3.78905e-14 6.59394e-01 >>>>>>>>> 9.00000e+00 1.296115915975e+00 1.794843530745e+01 5.514933264437e+00 2.401524522393e+02 1.20444e+02 5.89394e-01 1.70717e-12 1.38762e-14 1.09825e-13 6.64516e-01 >>>>>>>>> 1.00000e+01 1.189639693918e+00 1.665381754953e+01 5.433183087037e+00 2.222572900473e+02 1.11475e+02 5.45501e-01 -4.22462e-12 -7.15206e-13 -2.28736e-13 6.69677e-01 >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Thu, May 4, 2023 at 8:41?AM Mark Adams > wrote: >>>>>>>>>> ASM is just the sub PC with one proc but gets weaker with more procs unless you use jacobi. (maybe I am missing something). >>>>>>>>>> >>>>>>>>>> On Thu, May 4, 2023 at 8:31?AM Mark Lohry > wrote: >>>>>>>>>>>> Please send the output of -snes_view. >>>>>>>>>>> pasted below. anything stand out? >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> SNES Object: 1 MPI process >>>>>>>>>>> type: newtonls >>>>>>>>>>> maximum iterations=1, maximum function evaluations=-1 >>>>>>>>>>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>>>>>>>>>> total number of linear solver iterations=20 >>>>>>>>>>> total number of function evaluations=22 >>>>>>>>>>> norm schedule ALWAYS >>>>>>>>>>> Jacobian is never rebuilt >>>>>>>>>>> Jacobian is applied matrix-free with differencing >>>>>>>>>>> Preconditioning Jacobian is built using finite differences with coloring >>>>>>>>>>> SNESLineSearch Object: 1 MPI process >>>>>>>>>>> type: basic >>>>>>>>>>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>>>>>>>>>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 >>>>>>>>>>> maximum iterations=40 >>>>>>>>>>> KSP Object: 1 MPI process >>>>>>>>>>> type: gmres >>>>>>>>>>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>>>>>>>>>> happy breakdown tolerance 1e-30 >>>>>>>>>>> maximum iterations=20, initial guess is zero >>>>>>>>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>>>>>>>>>> left preconditioning >>>>>>>>>>> using PRECONDITIONED norm type for convergence test >>>>>>>>>>> PC Object: 1 MPI process >>>>>>>>>>> type: asm >>>>>>>>>>> total subdomain blocks = 1, amount of overlap = 0 >>>>>>>>>>> restriction/interpolation type - RESTRICT >>>>>>>>>>> Local solver information for first block is in the following KSP and PC objects on rank 0: >>>>>>>>>>> Use -ksp_view ::ascii_info_detail to display information for all blocks >>>>>>>>>>> KSP Object: (sub_) 1 MPI process >>>>>>>>>>> type: preonly >>>>>>>>>>> maximum iterations=10000, initial guess is zero >>>>>>>>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>>>>>>>> left preconditioning >>>>>>>>>>> using NONE norm type for convergence test >>>>>>>>>>> PC Object: (sub_) 1 MPI process >>>>>>>>>>> type: ilu >>>>>>>>>>> out-of-place factorization >>>>>>>>>>> 0 levels of fill >>>>>>>>>>> tolerance for zero pivot 2.22045e-14 >>>>>>>>>>> matrix ordering: natural >>>>>>>>>>> factor fill ratio given 1., needed 1. >>>>>>>>>>> Factored matrix follows: >>>>>>>>>>> Mat Object: (sub_) 1 MPI process >>>>>>>>>>> type: seqbaij >>>>>>>>>>> rows=16384, cols=16384, bs=16 >>>>>>>>>>> package used to perform factorization: petsc >>>>>>>>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>>>>>>>> block size is 16 >>>>>>>>>>> linear system matrix = precond matrix: >>>>>>>>>>> Mat Object: (sub_) 1 MPI process >>>>>>>>>>> type: seqbaij >>>>>>>>>>> rows=16384, cols=16384, bs=16 >>>>>>>>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>>>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>>>>>> block size is 16 >>>>>>>>>>> linear system matrix followed by preconditioner matrix: >>>>>>>>>>> Mat Object: 1 MPI process >>>>>>>>>>> type: mffd >>>>>>>>>>> rows=16384, cols=16384 >>>>>>>>>>> Matrix-free approximation: >>>>>>>>>>> err=1.49012e-08 (relative error in function evaluation) >>>>>>>>>>> Using wp compute h routine >>>>>>>>>>> Does not compute normU >>>>>>>>>>> Mat Object: 1 MPI process >>>>>>>>>>> type: seqbaij >>>>>>>>>>> rows=16384, cols=16384, bs=16 >>>>>>>>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>>>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>>>>>> block size is 16 >>>>>>>>>>> >>>>>>>>>>> On Thu, May 4, 2023 at 8:30?AM Mark Adams > wrote: >>>>>>>>>>>> If you are using MG what is the coarse grid solver? >>>>>>>>>>>> -snes_view might give you that. >>>>>>>>>>>> >>>>>>>>>>>> On Thu, May 4, 2023 at 8:25?AM Matthew Knepley > wrote: >>>>>>>>>>>>> On Thu, May 4, 2023 at 8:21?AM Mark Lohry > wrote: >>>>>>>>>>>>>>> Do they start very similarly and then slowly drift further apart? >>>>>>>>>>>>>> >>>>>>>>>>>>>> Yes, this. I take it this sounds familiar? >>>>>>>>>>>>>> >>>>>>>>>>>>>> See these two examples with 20 fixed iterations pasted at the end. The difference for one solve is slight (final SNES norm is identical to 5 digits), but in the context I'm using it in (repeated applications to solve a steady state multigrid problem, though here just one level) the differences add up such that I might reach global convergence in 35 iterations or 38. It's not the end of the world, but I was expecting that with -np 1 these would be identical and I'm not sure where the root cause would be. >>>>>>>>>>>>> >>>>>>>>>>>>> The initial KSP residual is different, so its the PC. Please send the output of -snes_view. If your ASM is using direct factorization, then it >>>>>>>>>>>>> could be randomness in whatever LU you are using. 
>>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> >>>>>>>>>>>>> Matt >>>>>>>>>>>>> >>>>>>>>>>>>>> 0 SNES Function norm 2.801842107848e+04 >>>>>>>>>>>>>> 0 KSP Residual norm 4.045639499595e+01 >>>>>>>>>>>>>> 1 KSP Residual norm 1.917999809040e+01 >>>>>>>>>>>>>> 2 KSP Residual norm 1.616048521958e+01 >>>>>>>>>>>>>> [...] >>>>>>>>>>>>>> 19 KSP Residual norm 8.788043518111e-01 >>>>>>>>>>>>>> 20 KSP Residual norm 6.570851270214e-01 >>>>>>>>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>>>>>>>>> 1 SNES Function norm 1.801309983345e+03 >>>>>>>>>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Same system, identical initial 0 SNES norm, 0 KSP is slightly different >>>>>>>>>>>>>> >>>>>>>>>>>>>> 0 SNES Function norm 2.801842107848e+04 >>>>>>>>>>>>>> 0 KSP Residual norm 4.045639473002e+01 >>>>>>>>>>>>>> 1 KSP Residual norm 1.917999883034e+01 >>>>>>>>>>>>>> 2 KSP Residual norm 1.616048572016e+01 >>>>>>>>>>>>>> [...] >>>>>>>>>>>>>> 19 KSP Residual norm 8.788046348957e-01 >>>>>>>>>>>>>> 20 KSP Residual norm 6.570859588610e-01 >>>>>>>>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>>>>>>>>> 1 SNES Function norm 1.801311320322e+03 >>>>>>>>>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Wed, May 3, 2023 at 11:05?PM Barry Smith > wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Do they start very similarly and then slowly drift further apart? That is the first couple of KSP iterations they are almost identical but then for each iteration get a bit further. Similar for the SNES iterations, starting close and then for more iterations and more solves they start moving apart. Or do they suddenly jump to be very different? You can run with -snes_monitor -ksp_monitor >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On May 3, 2023, at 9:07 PM, Mark Lohry > wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> This is on a single MPI rank. I haven't checked the coloring, was just guessing there. But the solutions/residuals are slightly different from run to run. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Fair to say that for serial JFNK/asm ilu0/gmres we should expect bitwise identical results? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Wed, May 3, 2023, 8:50 PM Barry Smith > wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> No, the coloring should be identical every time. Do you see differences with 1 MPI rank? (Or much smaller ones?). >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> > On May 3, 2023, at 8:42 PM, Mark Lohry > wrote: >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > I'm running multiple iterations of newtonls with an MFFD/JFNK nonlinear solver where I give it the sparsity. PC asm, KSP gmres, with SNESSetLagJacobian -2 (compute once and then frozen jacobian). >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > I'm seeing slight (<1%) but nonzero differences in residuals from run to run. I'm wondering where randomness might enter here -- does the jacobian coloring use a random seed? >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> -- >>>>>>>>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
>>>>>>>>>>>>> -- Norbert Wiener >>>>>>>>>>>>> >>>>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>>>> -- Norbert Wiener >>>>>>>> >>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Fri May 5 10:00:53 2023 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 5 May 2023 11:00:53 -0400 Subject: [petsc-users] Understanding index sets for PCGASM In-Reply-To: References: Message-ID: <0BF7FAC1-C8AF-46B8-9E36-20AC208DE567@petsc.dev> I will add the two interfaces you requested today. (Likely you may need more also). Barry > On May 4, 2023, at 6:01 PM, Matthew Knepley wrote: > > On Thu, May 4, 2023 at 1:43?PM LEONARDO MUTTI > wrote: >> Of course, I'll try to explain. >> >> I am solving a parabolic equation with space-time FEM and I want an efficient solver/preconditioner for the resulting system. >> The corresponding matrix, call it X, has an e.g. block bi-diagonal structure, if the cG(1)-dG(0) method is used (i.e. implicit Euler solved in batch). >> Every block-row of X corresponds to a time instant. >> >> I want to introduce parallelism in time by subdividing X into overlapping submatrices of e.g 2x2 or 3x3 blocks, along the block diagonal. >> For instance, call X_i the individual blocks. The submatrices would be, for various i, (X_{i-1,i-1},X_{i-1,i};X_{i,i-1},X_{i,i}). >> I'd like each submatrix to be solved in parallel, to combine the various results together in an ASM like fashion. >> Every submatrix has thus a predecessor and a successor, and it overlaps with both, so that as far as I could understand, GASM has to be used in place of ASM. > > Yes, ordered that way you need GASM. I wonder if inverting the ordering would be useful, namely putting the time index on the inside. > Then the blocks would be over all time, but limited space, which is more the spirit of ASM I think. > > Have you considered waveform relaxation for this problem? > > Thanks, > > Matt > >> Hope this helps. >> Best, >> Leonardo >> >> Il giorno gio 4 mag 2023 alle ore 18:05 Matthew Knepley > ha scritto: >>> On Thu, May 4, 2023 at 11:24?AM LEONARDO MUTTI > wrote: >>>> Thank you for the help. >>>> Adding to my example: >>>> call PCGASMSetSubdomains(pc,NSub, subdomains_IS, inflated_IS,ierr) >>>> call PCGASMDestroySubdomains(NSub,subdomains_IS,inflated_IS,ierr) >>>> results in: >>>> Error LNK2019 unresolved external symbol PCGASMDESTROYSUBDOMAINS referenced in function ... >>>> Error LNK2019 unresolved external symbol PCGASMSETSUBDOMAINS referenced in function ... >>>> I'm not sure if the interfaces are missing or if I have a compilation problem. >>> >>> I just want to make sure you really want GASM. It sounded like you might able to do what you want just with ASM. >>> Can you tell me again what you want to do overall? 
>>> >>> Thanks, >>> >>> Matt >>> >>>> Thank you again. >>>> Best, >>>> Leonardo >>>> >>>> Il giorno sab 29 apr 2023 alle ore 20:30 Barry Smith > ha scritto: >>>>> >>>>> Thank you for the test code. I have a fix in the branch barry/2023-04-29/fix-pcasmcreatesubdomains2d with merge request https://gitlab.com/petsc/petsc/-/merge_requests/6394 >>>>> >>>>> The functions did not have proper Fortran stubs and interfaces so I had to provide them manually in the new branch. >>>>> >>>>> Use >>>>> >>>>> git fetch >>>>> git checkout barry/2023-04-29/fix-pcasmcreatesubdomains2d >>>>> ./configure etc >>>>> >>>>> Your now working test code is in src/ksp/ksp/tests/ex71f.F90 I had to change things slightly and I updated the error handling for the latest version. >>>>> >>>>> Please let us know if you have any later questions. >>>>> >>>>> Barry >>>>> >>>>> >>>>> >>>>> >>>>>> On Apr 28, 2023, at 12:07 PM, LEONARDO MUTTI > wrote: >>>>>> >>>>>> Hello. I am having a hard time understanding the index sets to feed PCGASMSetSubdomains, and I am working in Fortran (as a PETSc novice). To get more intuition on how the IS objects behave I tried the following minimal (non) working example, which should tile a 16x16 matrix into 16 square, non-overlapping submatrices: >>>>>> >>>>>> #include >>>>>> #include >>>>>> #include >>>>>> USE petscmat >>>>>> USE petscksp >>>>>> USE petscpc >>>>>> >>>>>> Mat :: A >>>>>> PetscInt :: M, NSubx, dof, overlap, NSub >>>>>> INTEGER :: I,J >>>>>> PetscErrorCode :: ierr >>>>>> PetscScalar :: v >>>>>> KSP :: ksp >>>>>> PC :: pc >>>>>> IS :: subdomains_IS, inflated_IS >>>>>> >>>>>> call PetscInitialize(PETSC_NULL_CHARACTER , ierr) >>>>>> >>>>>> !-----Create a dummy matrix >>>>>> M = 16 >>>>>> call MatCreateAIJ(MPI_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, >>>>>> & M, M, >>>>>> & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, >>>>>> & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, >>>>>> & A, ierr) >>>>>> >>>>>> DO I=1,M >>>>>> DO J=1,M >>>>>> v = I*J >>>>>> CALL MatSetValue (A,I-1,J-1,v, >>>>>> & INSERT_VALUES , ierr) >>>>>> END DO >>>>>> END DO >>>>>> >>>>>> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY , ierr) >>>>>> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY , ierr) >>>>>> >>>>>> !-----Create KSP and PC >>>>>> call KSPCreate(PETSC_COMM_WORLD,ksp, ierr) >>>>>> call KSPSetOperators(ksp,A,A, ierr) >>>>>> call KSPSetType(ksp,"bcgs",ierr) >>>>>> call KSPGetPC(ksp,pc,ierr) >>>>>> call KSPSetUp(ksp, ierr) >>>>>> call PCSetType(pc,PCGASM, ierr) >>>>>> call PCSetUp(pc , ierr) >>>>>> >>>>>> !-----GASM setup >>>>>> NSubx = 4 >>>>>> dof = 1 >>>>>> overlap = 0 >>>>>> >>>>>> call PCGASMCreateSubdomains2D(pc, >>>>>> & M, M, >>>>>> & NSubx, NSubx, >>>>>> & dof, overlap, >>>>>> & NSub, subdomains_IS, inflated_IS, ierr) >>>>>> >>>>>> call ISView(subdomains_IS, PETSC_VIEWER_STDOUT_WORLD, ierr) >>>>>> >>>>>> call KSPDestroy(ksp, ierr) >>>>>> call PetscFinalize(ierr) >>>>>> >>>>>> Running this on one processor, I get NSub = 4. >>>>>> If PCASM and PCASMCreateSubdomains2D are used instead, I get NSub = 16 as expected. >>>>>> Moreover, I get in the end "forrtl: severe (157): Program Exception - access violation". So: >>>>>> 1) why do I get two different results with ASM, and GASM? >>>>>> 2) why do I get access violation and how can I solve this? >>>>>> In fact, in C, subdomains_IS, inflated_IS should pointers to IS objects. As I see on the Fortran interface, the arguments to PCGASMCreateSubdomains2D are IS objects: >>>>>> >>>>>> subroutine PCGASMCreateSubdomains2D(a,b,c,d,e,f,g,h,i,j,z) >>>>>> import tPC,tIS >>>>>> PC a ! 
PC >>>>>> PetscInt b ! PetscInt >>>>>> PetscInt c ! PetscInt >>>>>> PetscInt d ! PetscInt >>>>>> PetscInt e ! PetscInt >>>>>> PetscInt f ! PetscInt >>>>>> PetscInt g ! PetscInt >>>>>> PetscInt h ! PetscInt >>>>>> IS i ! IS >>>>>> IS j ! IS >>>>>> PetscErrorCode z >>>>>> end subroutine PCGASMCreateSubdomains2D >>>>>> Thus: >>>>>> 3) what should be inside e.g., subdomains_IS? I expect it to contain, for every created subdomain, the list of rows and columns defining the subblock in the matrix, am I right? >>>>>> >>>>>> Context: I have a block-tridiagonal system arising from space-time finite elements, and I want to solve it with GMRES+PCGASM preconditioner, where each overlapping submatrix is on the diagonal and of size 3x3 blocks (and spanning multiple processes). This is PETSc 3.17.1 on Windows. >>>>>> >>>>>> Thanks in advance, >>>>>> Leonardo >>>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From edoardo.alinovi at gmail.com Fri May 5 10:17:03 2023 From: edoardo.alinovi at gmail.com (Edoardo alinovi) Date: Fri, 5 May 2023 17:17:03 +0200 Subject: [petsc-users] issues with VecSetValues in petsc 3.19 In-Reply-To: References: Message-ID: Hello Barry, Welcome to the party! Thank you guys for your precious suggestions, they are really helpful! It's been a while since I am messing around and I have tested many combinations. Schur + selfp is the best preconditioner, it converges within 5 iters using gmres for inner solvers but it is not very fast and sometimes multiplicative it's a better option as the inner iterations looks lighter. If I tune the relative tolerance I get a huge speed up, but I am a bit less confident about the results. The funny thing is that the default relative tolerance if commercial CFD solver is huge, very often 0.1 -.- If I may borrow your brain guys for a while, I would like to ask your opinion about the multigrid they use in this paper: https://www.aub.edu.lb/msfea/research/Documents/CFD-P18.pdf. At some point they say: " The algorithm used in this work is a combination of the ILU(0) [28] algorithm with an additive corrective multigrid method [29] ", 29: https://www.tandfonline.com/doi/abs/10.1080/10407788608913491 Is that something similar to fieldsplit additive? Many thanks -------------- next part -------------- An HTML attachment was scrubbed... URL: From leonardo.mutti01 at universitadipavia.it Fri May 5 10:17:17 2023 From: leonardo.mutti01 at universitadipavia.it (LEONARDO MUTTI) Date: Fri, 5 May 2023 17:17:17 +0200 Subject: [petsc-users] Understanding index sets for PCGASM In-Reply-To: <0BF7FAC1-C8AF-46B8-9E36-20AC208DE567@petsc.dev> References: <0BF7FAC1-C8AF-46B8-9E36-20AC208DE567@petsc.dev> Message-ID: Thanks a lot. If this can help, we should need (not much more than) the functionalities from https://petsc.org/release/src/ksp/ksp/tutorials/ex62.c.html: - PCGASMSetSubdomains - PCGASMDestroySubdomains - PCGASMGetSubKSP Best, Leonardo Il giorno ven 5 mag 2023 alle ore 17:00 Barry Smith ha scritto: > > I will add the two interfaces you requested today. 
(Likely you may need > more also). > > Barry > > > On May 4, 2023, at 6:01 PM, Matthew Knepley wrote: > > On Thu, May 4, 2023 at 1:43?PM LEONARDO MUTTI < > leonardo.mutti01 at universitadipavia.it> wrote: > >> Of course, I'll try to explain. >> >> I am solving a parabolic equation with space-time FEM and I want an >> efficient solver/preconditioner for the resulting system. >> The corresponding matrix, call it X, has an e.g. block bi-diagonal >> structure, if the cG(1)-dG(0) method is used (i.e. implicit Euler solved in >> batch). >> Every block-row of X corresponds to a time instant. >> >> I want to introduce parallelism in time by subdividing X into overlapping >> submatrices of e.g 2x2 or 3x3 blocks, along the block diagonal. >> For instance, call X_i the individual blocks. The submatrices would be, >> for various i, (X_{i-1,i-1},X_{i-1,i};X_{i,i-1},X_{i,i}). >> I'd like each submatrix to be solved in parallel, to combine the various >> results together in an ASM like fashion. >> Every submatrix has thus a predecessor and a successor, and it overlaps >> with both, so that as far as I could understand, GASM has to be used in >> place of ASM. >> > > Yes, ordered that way you need GASM. I wonder if inverting the ordering > would be useful, namely putting the time index on the inside. > Then the blocks would be over all time, but limited space, which is more > the spirit of ASM I think. > > Have you considered waveform relaxation for this problem? > > Thanks, > > Matt > > >> Hope this helps. >> Best, >> Leonardo >> >> Il giorno gio 4 mag 2023 alle ore 18:05 Matthew Knepley < >> knepley at gmail.com> ha scritto: >> >>> On Thu, May 4, 2023 at 11:24?AM LEONARDO MUTTI < >>> leonardo.mutti01 at universitadipavia.it> wrote: >>> >>>> Thank you for the help. >>>> Adding to my example: >>>> >>>> >>>> * call PCGASMSetSubdomains(pc,NSub, subdomains_IS, >>>> inflated_IS,ierr) call >>>> PCGASMDestroySubdomains(NSub,subdomains_IS,inflated_IS,ierr)* >>>> results in: >>>> >>>> * Error LNK2019 unresolved external symbol PCGASMDESTROYSUBDOMAINS >>>> referenced in function ... * >>>> >>>> * Error LNK2019 unresolved external symbol PCGASMSETSUBDOMAINS >>>> referenced in function ... * >>>> I'm not sure if the interfaces are missing or if I have a compilation >>>> problem. >>>> >>> >>> I just want to make sure you really want GASM. It sounded like you might >>> able to do what you want just with ASM. >>> Can you tell me again what you want to do overall? >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Thank you again. >>>> Best, >>>> Leonardo >>>> >>>> Il giorno sab 29 apr 2023 alle ore 20:30 Barry Smith >>>> ha scritto: >>>> >>>>> >>>>> Thank you for the test code. I have a fix in the branch >>>>> barry/2023-04-29/fix-pcasmcreatesubdomains2d >>>>> with >>>>> merge request https://gitlab.com/petsc/petsc/-/merge_requests/6394 >>>>> >>>>> The functions did not have proper Fortran stubs and interfaces so I >>>>> had to provide them manually in the new branch. >>>>> >>>>> Use >>>>> >>>>> git fetch >>>>> git checkout barry/2023-04-29/fix-pcasmcreatesubdomains2d >>>>> >>>>> ./configure etc >>>>> >>>>> Your now working test code is in src/ksp/ksp/tests/ex71f.F90 I had >>>>> to change things slightly and I updated the error handling for the latest >>>>> version. >>>>> >>>>> Please let us know if you have any later questions. >>>>> >>>>> Barry >>>>> >>>>> >>>>> >>>>> >>>>> On Apr 28, 2023, at 12:07 PM, LEONARDO MUTTI < >>>>> leonardo.mutti01 at universitadipavia.it> wrote: >>>>> >>>>> Hello. 
I am having a hard time understanding the index sets to feed >>>>> PCGASMSetSubdomains, and I am working in Fortran (as a PETSc novice). To >>>>> get more intuition on how the IS objects behave I tried the following >>>>> minimal (non) working example, which should tile a 16x16 matrix into 16 >>>>> square, non-overlapping submatrices: >>>>> >>>>> #include >>>>> #include >>>>> #include >>>>> USE petscmat >>>>> USE petscksp >>>>> USE petscpc >>>>> >>>>> Mat :: A >>>>> PetscInt :: M, NSubx, dof, overlap, NSub >>>>> INTEGER :: I,J >>>>> PetscErrorCode :: ierr >>>>> PetscScalar :: v >>>>> KSP :: ksp >>>>> PC :: pc >>>>> IS :: subdomains_IS, inflated_IS >>>>> >>>>> call PetscInitialize(PETSC_NULL_CHARACTER , ierr) >>>>> >>>>> !-----Create a dummy matrix >>>>> M = 16 >>>>> call MatCreateAIJ(MPI_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, >>>>> & M, M, >>>>> & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, >>>>> & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, >>>>> & A, ierr) >>>>> >>>>> DO I=1,M >>>>> DO J=1,M >>>>> v = I*J >>>>> CALL MatSetValue (A,I-1,J-1,v, >>>>> & INSERT_VALUES , ierr) >>>>> END DO >>>>> END DO >>>>> >>>>> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY , ierr) >>>>> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY , ierr) >>>>> >>>>> !-----Create KSP and PC >>>>> call KSPCreate(PETSC_COMM_WORLD,ksp, ierr) >>>>> call KSPSetOperators(ksp,A,A, ierr) >>>>> call KSPSetType(ksp,"bcgs",ierr) >>>>> call KSPGetPC(ksp,pc,ierr) >>>>> call KSPSetUp(ksp, ierr) >>>>> call PCSetType(pc,PCGASM, ierr) >>>>> call PCSetUp(pc , ierr) >>>>> >>>>> !-----GASM setup >>>>> NSubx = 4 >>>>> dof = 1 >>>>> overlap = 0 >>>>> >>>>> call PCGASMCreateSubdomains2D(pc, >>>>> & M, M, >>>>> & NSubx, NSubx, >>>>> & dof, overlap, >>>>> & NSub, subdomains_IS, inflated_IS, ierr) >>>>> >>>>> call ISView(subdomains_IS, PETSC_VIEWER_STDOUT_WORLD, ierr) >>>>> >>>>> call KSPDestroy(ksp, ierr) >>>>> call PetscFinalize(ierr) >>>>> >>>>> Running this on one processor, I get NSub = 4. >>>>> If PCASM and PCASMCreateSubdomains2D are used instead, I get NSub = 16 >>>>> as expected. >>>>> Moreover, I get in the end "forrtl: severe (157): Program Exception - >>>>> access violation". So: >>>>> 1) why do I get two different results with ASM, and GASM? >>>>> 2) why do I get access violation and how can I solve this? >>>>> In fact, in C, subdomains_IS, inflated_IS should pointers to IS >>>>> objects. As I see on the Fortran interface, the arguments to >>>>> PCGASMCreateSubdomains2D are IS objects: >>>>> >>>>> subroutine PCGASMCreateSubdomains2D(a,b,c,d,e,f,g,h,i,j,z) >>>>> import tPC,tIS >>>>> PC a ! PC >>>>> PetscInt b ! PetscInt >>>>> PetscInt c ! PetscInt >>>>> PetscInt d ! PetscInt >>>>> PetscInt e ! PetscInt >>>>> PetscInt f ! PetscInt >>>>> PetscInt g ! PetscInt >>>>> PetscInt h ! PetscInt >>>>> IS i ! IS >>>>> IS j ! IS >>>>> PetscErrorCode z >>>>> end subroutine PCGASMCreateSubdomains2D >>>>> Thus: >>>>> 3) what should be inside e.g., subdomains_IS? I expect it to contain, >>>>> for every created subdomain, the list of rows and columns defining the >>>>> subblock in the matrix, am I right? >>>>> >>>>> Context: I have a block-tridiagonal system arising from space-time >>>>> finite elements, and I want to solve it with GMRES+PCGASM preconditioner, >>>>> where each overlapping submatrix is on the diagonal and of size 3x3 blocks >>>>> (and spanning multiple processes). This is PETSc 3.17.1 on Windows. 
>>>>> >>>>> Thanks in advance, >>>>> Leonardo >>>>> >>>>> >>>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Fri May 5 11:43:33 2023 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 5 May 2023 12:43:33 -0400 Subject: [petsc-users] Understanding index sets for PCGASM In-Reply-To: References: Message-ID: <989A8495-06FF-4D8A-8B45-F3D991D0A486@petsc.dev> Added in barry/2023-05-04/add-pcgasm-set-subdomains see also https://gitlab.com/petsc/petsc/-/merge_requests/6419 Barry > On May 4, 2023, at 11:23 AM, LEONARDO MUTTI wrote: > > Thank you for the help. > Adding to my example: > call PCGASMSetSubdomains(pc,NSub, subdomains_IS, inflated_IS,ierr) > call PCGASMDestroySubdomains(NSub,subdomains_IS,inflated_IS,ierr) > results in: > Error LNK2019 unresolved external symbol PCGASMDESTROYSUBDOMAINS referenced in function ... > Error LNK2019 unresolved external symbol PCGASMSETSUBDOMAINS referenced in function ... > I'm not sure if the interfaces are missing or if I have a compilation problem. > Thank you again. > Best, > Leonardo > > Il giorno sab 29 apr 2023 alle ore 20:30 Barry Smith > ha scritto: >> >> Thank you for the test code. I have a fix in the branch barry/2023-04-29/fix-pcasmcreatesubdomains2d with merge request https://gitlab.com/petsc/petsc/-/merge_requests/6394 >> >> The functions did not have proper Fortran stubs and interfaces so I had to provide them manually in the new branch. >> >> Use >> >> git fetch >> git checkout barry/2023-04-29/fix-pcasmcreatesubdomains2d >> ./configure etc >> >> Your now working test code is in src/ksp/ksp/tests/ex71f.F90 I had to change things slightly and I updated the error handling for the latest version. >> >> Please let us know if you have any later questions. >> >> Barry >> >> >> >> >>> On Apr 28, 2023, at 12:07 PM, LEONARDO MUTTI > wrote: >>> >>> Hello. I am having a hard time understanding the index sets to feed PCGASMSetSubdomains, and I am working in Fortran (as a PETSc novice). 
To get more intuition on how the IS objects behave I tried the following minimal (non) working example, which should tile a 16x16 matrix into 16 square, non-overlapping submatrices: >>> >>> #include >>> #include >>> #include >>> USE petscmat >>> USE petscksp >>> USE petscpc >>> >>> Mat :: A >>> PetscInt :: M, NSubx, dof, overlap, NSub >>> INTEGER :: I,J >>> PetscErrorCode :: ierr >>> PetscScalar :: v >>> KSP :: ksp >>> PC :: pc >>> IS :: subdomains_IS, inflated_IS >>> >>> call PetscInitialize(PETSC_NULL_CHARACTER , ierr) >>> >>> !-----Create a dummy matrix >>> M = 16 >>> call MatCreateAIJ(MPI_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, >>> & M, M, >>> & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, >>> & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, >>> & A, ierr) >>> >>> DO I=1,M >>> DO J=1,M >>> v = I*J >>> CALL MatSetValue (A,I-1,J-1,v, >>> & INSERT_VALUES , ierr) >>> END DO >>> END DO >>> >>> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY , ierr) >>> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY , ierr) >>> >>> !-----Create KSP and PC >>> call KSPCreate(PETSC_COMM_WORLD,ksp, ierr) >>> call KSPSetOperators(ksp,A,A, ierr) >>> call KSPSetType(ksp,"bcgs",ierr) >>> call KSPGetPC(ksp,pc,ierr) >>> call KSPSetUp(ksp, ierr) >>> call PCSetType(pc,PCGASM, ierr) >>> call PCSetUp(pc , ierr) >>> >>> !-----GASM setup >>> NSubx = 4 >>> dof = 1 >>> overlap = 0 >>> >>> call PCGASMCreateSubdomains2D(pc, >>> & M, M, >>> & NSubx, NSubx, >>> & dof, overlap, >>> & NSub, subdomains_IS, inflated_IS, ierr) >>> >>> call ISView(subdomains_IS, PETSC_VIEWER_STDOUT_WORLD, ierr) >>> >>> call KSPDestroy(ksp, ierr) >>> call PetscFinalize(ierr) >>> >>> Running this on one processor, I get NSub = 4. >>> If PCASM and PCASMCreateSubdomains2D are used instead, I get NSub = 16 as expected. >>> Moreover, I get in the end "forrtl: severe (157): Program Exception - access violation". So: >>> 1) why do I get two different results with ASM, and GASM? >>> 2) why do I get access violation and how can I solve this? >>> In fact, in C, subdomains_IS, inflated_IS should pointers to IS objects. As I see on the Fortran interface, the arguments to PCGASMCreateSubdomains2D are IS objects: >>> >>> subroutine PCGASMCreateSubdomains2D(a,b,c,d,e,f,g,h,i,j,z) >>> import tPC,tIS >>> PC a ! PC >>> PetscInt b ! PetscInt >>> PetscInt c ! PetscInt >>> PetscInt d ! PetscInt >>> PetscInt e ! PetscInt >>> PetscInt f ! PetscInt >>> PetscInt g ! PetscInt >>> PetscInt h ! PetscInt >>> IS i ! IS >>> IS j ! IS >>> PetscErrorCode z >>> end subroutine PCGASMCreateSubdomains2D >>> Thus: >>> 3) what should be inside e.g., subdomains_IS? I expect it to contain, for every created subdomain, the list of rows and columns defining the subblock in the matrix, am I right? >>> >>> Context: I have a block-tridiagonal system arising from space-time finite elements, and I want to solve it with GMRES+PCGASM preconditioner, where each overlapping submatrix is on the diagonal and of size 3x3 blocks (and spanning multiple processes). This is PETSc 3.17.1 on Windows. >>> >>> Thanks in advance, >>> Leonardo >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From mlohry at gmail.com Fri May 5 15:45:28 2023 From: mlohry at gmail.com (Mark Lohry) Date: Fri, 5 May 2023 16:45:28 -0400 Subject: [petsc-users] sources of floating point randomness in JFNK in serial In-Reply-To: References: <5318727D-B9F9-48BC-A7CE-94EBDB08566F@petsc.dev> Message-ID: wow. leaving -O3 and turning off -march=native seems to have made it repeatable. this is on an avx2 cpu if it matters. 
out-of-order instructions may be performed thus, two runs may have > different order of operations > > this is terrifying if true. the source code path is exactly the same every time but the cpu does different things? On Fri, May 5, 2023 at 10:55?AM Barry Smith wrote: > > Mark, > > Thank you. You do have aggressive optimizations: -O3 -march=native, > which means out-of-order instructions may be performed thus, two runs may > have different order of operations and possibly different round-off values. > > You could try turning off all of this with -O0 for an experiment and see > what happens. My guess is that you will see much smaller differences in the > residuals. > > Barry > > > On May 5, 2023, at 8:11 AM, Mark Lohry wrote: > > > > On Thu, May 4, 2023 at 9:51?PM Barry Smith wrote: > >> >> Send configure.log >> >> >> On May 4, 2023, at 5:35 PM, Mark Lohry wrote: >> >> Sure, but why only once and why save to disk? Why not just use that >>> computed approximate Jacobian at each Newton step to drive the Newton >>> solves along for a bunch of time steps? >> >> >> Ah I get what you mean. Okay I did three newton steps with the same LHS, >> with a few repeated manual tests. 3 out of 4 times i got the same exact >> history. is it in the realm of possibility that a hardware error could >> cause something this subtle, bad memory bit or something? >> >> 2 runs of 3 newton solves below, ever-so-slightly different. >> >> >> 0 SNES Function norm 3.424003312857e+04 >> 0 KSP Residual norm 3.424003312857e+04 >> 1 KSP Residual norm 2.886124328003e+04 >> 2 KSP Residual norm 2.504664994246e+04 >> 3 KSP Residual norm 2.104615835161e+04 >> 4 KSP Residual norm 1.938102896632e+04 >> 5 KSP Residual norm 1.793774642408e+04 >> 6 KSP Residual norm 1.671392566980e+04 >> 7 KSP Residual norm 1.501504103873e+04 >> 8 KSP Residual norm 1.366362900747e+04 >> 9 KSP Residual norm 1.240398500429e+04 >> 10 KSP Residual norm 1.156293733914e+04 >> 11 KSP Residual norm 1.066296477958e+04 >> 12 KSP Residual norm 9.835601966950e+03 >> 13 KSP Residual norm 9.017480191491e+03 >> 14 KSP Residual norm 8.415336139780e+03 >> 15 KSP Residual norm 7.807497808435e+03 >> 16 KSP Residual norm 7.341703768294e+03 >> 17 KSP Residual norm 6.979298049282e+03 >> 18 KSP Residual norm 6.521277772081e+03 >> 19 KSP Residual norm 6.174842408773e+03 >> 20 KSP Residual norm 5.889819665003e+03 >> Linear solve converged due to CONVERGED_ITS iterations 20 >> KSP Object: 1 MPI process >> type: gmres >> restart=30, using Classical (unmodified) Gram-Schmidt >> Orthogonalization with no iterative refinement >> happy breakdown tolerance 1e-30 >> maximum iterations=20, initial guess is zero >> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>> left preconditioning >> using PRECONDITIONED norm type for convergence test >> PC Object: 1 MPI process >> type: none >> linear system matrix = precond matrix: >> Mat Object: 1 MPI process >> type: seqbaij >> rows=16384, cols=16384, bs=16 >> total: nonzeros=1277952, allocated nonzeros=1277952 >> total number of mallocs used during MatSetValues calls=0 >> block size is 16 >> 1 SNES Function norm 1.000525348433e+04 >> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >> SNES Object: 1 MPI process >> type: newtonls >> maximum iterations=1, maximum function evaluations=-1 >> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >> total number of linear solver iterations=20 >> total number of function evaluations=2 >> norm schedule ALWAYS >> Jacobian is never rebuilt >> Jacobian is built using finite differences with coloring >> SNESLineSearch Object: 1 MPI process >> type: basic >> maxstep=1.000000e+08, minlambda=1.000000e-12 >> tolerances: relative=1.000000e-08, absolute=1.000000e-15, >> lambda=1.000000e-08 >> maximum iterations=40 >> KSP Object: 1 MPI process >> type: gmres >> restart=30, using Classical (unmodified) Gram-Schmidt >> Orthogonalization with no iterative refinement >> happy breakdown tolerance 1e-30 >> maximum iterations=20, initial guess is zero >> tolerances: relative=0.1, absolute=1e-15, divergence=10. >> left preconditioning >> using PRECONDITIONED norm type for convergence test >> PC Object: 1 MPI process >> type: none >> linear system matrix = precond matrix: >> Mat Object: 1 MPI process >> type: seqbaij >> rows=16384, cols=16384, bs=16 >> total: nonzeros=1277952, allocated nonzeros=1277952 >> total number of mallocs used during MatSetValues calls=0 >> block size is 16 >> 0 SNES Function norm 1.000525348433e+04 >> 0 KSP Residual norm 1.000525348433e+04 >> 1 KSP Residual norm 7.908741564765e+03 >> 2 KSP Residual norm 6.825263536686e+03 >> 3 KSP Residual norm 6.224930664968e+03 >> 4 KSP Residual norm 6.095547180532e+03 >> 5 KSP Residual norm 5.952968230430e+03 >> 6 KSP Residual norm 5.861251998116e+03 >> 7 KSP Residual norm 5.712439327755e+03 >> 8 KSP Residual norm 5.583056913266e+03 >> 9 KSP Residual norm 5.461768804626e+03 >> 10 KSP Residual norm 5.351937611098e+03 >> 11 KSP Residual norm 5.224288337578e+03 >> 12 KSP Residual norm 5.129863847081e+03 >> 13 KSP Residual norm 5.010818237218e+03 >> 14 KSP Residual norm 4.907162936199e+03 >> 15 KSP Residual norm 4.789564773955e+03 >> 16 KSP Residual norm 4.695173370720e+03 >> 17 KSP Residual norm 4.584070962171e+03 >> 18 KSP Residual norm 4.483061424742e+03 >> 19 KSP Residual norm 4.373384070745e+03 >> 20 KSP Residual norm 4.260704657592e+03 >> Linear solve converged due to CONVERGED_ITS iterations 20 >> KSP Object: 1 MPI process >> type: gmres >> restart=30, using Classical (unmodified) Gram-Schmidt >> Orthogonalization with no iterative refinement >> happy breakdown tolerance 1e-30 >> maximum iterations=20, initial guess is zero >> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>> left preconditioning >> using PRECONDITIONED norm type for convergence test >> PC Object: 1 MPI process >> type: none >> linear system matrix = precond matrix: >> Mat Object: 1 MPI process >> type: seqbaij >> rows=16384, cols=16384, bs=16 >> total: nonzeros=1277952, allocated nonzeros=1277952 >> total number of mallocs used during MatSetValues calls=0 >> block size is 16 >> 1 SNES Function norm 4.662386014882e+03 >> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >> SNES Object: 1 MPI process >> type: newtonls >> maximum iterations=1, maximum function evaluations=-1 >> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >> total number of linear solver iterations=20 >> total number of function evaluations=2 >> norm schedule ALWAYS >> Jacobian is never rebuilt >> Jacobian is built using finite differences with coloring >> SNESLineSearch Object: 1 MPI process >> type: basic >> maxstep=1.000000e+08, minlambda=1.000000e-12 >> tolerances: relative=1.000000e-08, absolute=1.000000e-15, >> lambda=1.000000e-08 >> maximum iterations=40 >> KSP Object: 1 MPI process >> type: gmres >> restart=30, using Classical (unmodified) Gram-Schmidt >> Orthogonalization with no iterative refinement >> happy breakdown tolerance 1e-30 >> maximum iterations=20, initial guess is zero >> tolerances: relative=0.1, absolute=1e-15, divergence=10. >> left preconditioning >> using PRECONDITIONED norm type for convergence test >> PC Object: 1 MPI process >> type: none >> linear system matrix = precond matrix: >> Mat Object: 1 MPI process >> type: seqbaij >> rows=16384, cols=16384, bs=16 >> total: nonzeros=1277952, allocated nonzeros=1277952 >> total number of mallocs used during MatSetValues calls=0 >> block size is 16 >> 0 SNES Function norm 4.662386014882e+03 >> 0 KSP Residual norm 4.662386014882e+03 >> 1 KSP Residual norm 4.408316259864e+03 >> 2 KSP Residual norm 4.184867769829e+03 >> 3 KSP Residual norm 4.079091244351e+03 >> 4 KSP Residual norm 4.009247390166e+03 >> 5 KSP Residual norm 3.928417371428e+03 >> 6 KSP Residual norm 3.865152075780e+03 >> 7 KSP Residual norm 3.795606446033e+03 >> 8 KSP Residual norm 3.735294554158e+03 >> 9 KSP Residual norm 3.674393726487e+03 >> 10 KSP Residual norm 3.617795166786e+03 >> 11 KSP Residual norm 3.563807982274e+03 >> 12 KSP Residual norm 3.512269444921e+03 >> 13 KSP Residual norm 3.455110223236e+03 >> 14 KSP Residual norm 3.407141247372e+03 >> 15 KSP Residual norm 3.356562415982e+03 >> 16 KSP Residual norm 3.312720047685e+03 >> 17 KSP Residual norm 3.263690150810e+03 >> 18 KSP Residual norm 3.219359862444e+03 >> 19 KSP Residual norm 3.173500955995e+03 >> 20 KSP Residual norm 3.127528790155e+03 >> Linear solve converged due to CONVERGED_ITS iterations 20 >> KSP Object: 1 MPI process >> type: gmres >> restart=30, using Classical (unmodified) Gram-Schmidt >> Orthogonalization with no iterative refinement >> happy breakdown tolerance 1e-30 >> maximum iterations=20, initial guess is zero >> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>> left preconditioning >> using PRECONDITIONED norm type for convergence test >> PC Object: 1 MPI process >> type: none >> linear system matrix = precond matrix: >> Mat Object: 1 MPI process >> type: seqbaij >> rows=16384, cols=16384, bs=16 >> total: nonzeros=1277952, allocated nonzeros=1277952 >> total number of mallocs used during MatSetValues calls=0 >> block size is 16 >> 1 SNES Function norm 3.186752172556e+03 >> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >> SNES Object: 1 MPI process >> type: newtonls >> maximum iterations=1, maximum function evaluations=-1 >> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >> total number of linear solver iterations=20 >> total number of function evaluations=2 >> norm schedule ALWAYS >> Jacobian is never rebuilt >> Jacobian is built using finite differences with coloring >> SNESLineSearch Object: 1 MPI process >> type: basic >> maxstep=1.000000e+08, minlambda=1.000000e-12 >> tolerances: relative=1.000000e-08, absolute=1.000000e-15, >> lambda=1.000000e-08 >> maximum iterations=40 >> KSP Object: 1 MPI process >> type: gmres >> restart=30, using Classical (unmodified) Gram-Schmidt >> Orthogonalization with no iterative refinement >> happy breakdown tolerance 1e-30 >> maximum iterations=20, initial guess is zero >> tolerances: relative=0.1, absolute=1e-15, divergence=10. >> left preconditioning >> using PRECONDITIONED norm type for convergence test >> PC Object: 1 MPI process >> type: none >> linear system matrix = precond matrix: >> Mat Object: 1 MPI process >> type: seqbaij >> rows=16384, cols=16384, bs=16 >> total: nonzeros=1277952, allocated nonzeros=1277952 >> total number of mallocs used during MatSetValues calls=0 >> block size is 16 >> >> >> >> 0 SNES Function norm 3.424003312857e+04 >> 0 KSP Residual norm 3.424003312857e+04 >> 1 KSP Residual norm 2.886124328003e+04 >> 2 KSP Residual norm 2.504664994221e+04 >> 3 KSP Residual norm 2.104615835130e+04 >> 4 KSP Residual norm 1.938102896610e+04 >> 5 KSP Residual norm 1.793774642406e+04 >> 6 KSP Residual norm 1.671392566981e+04 >> 7 KSP Residual norm 1.501504103854e+04 >> 8 KSP Residual norm 1.366362900726e+04 >> 9 KSP Residual norm 1.240398500414e+04 >> 10 KSP Residual norm 1.156293733914e+04 >> 11 KSP Residual norm 1.066296477972e+04 >> 12 KSP Residual norm 9.835601967036e+03 >> 13 KSP Residual norm 9.017480191500e+03 >> 14 KSP Residual norm 8.415336139732e+03 >> 15 KSP Residual norm 7.807497808414e+03 >> 16 KSP Residual norm 7.341703768300e+03 >> 17 KSP Residual norm 6.979298049244e+03 >> 18 KSP Residual norm 6.521277772042e+03 >> 19 KSP Residual norm 6.174842408713e+03 >> 20 KSP Residual norm 5.889819664983e+03 >> Linear solve converged due to CONVERGED_ITS iterations 20 >> KSP Object: 1 MPI process >> type: gmres >> restart=30, using Classical (unmodified) Gram-Schmidt >> Orthogonalization with no iterative refinement >> happy breakdown tolerance 1e-30 >> maximum iterations=20, initial guess is zero >> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>> left preconditioning >> using PRECONDITIONED norm type for convergence test >> PC Object: 1 MPI process >> type: none >> linear system matrix = precond matrix: >> Mat Object: 1 MPI process >> type: seqbaij >> rows=16384, cols=16384, bs=16 >> total: nonzeros=1277952, allocated nonzeros=1277952 >> total number of mallocs used during MatSetValues calls=0 >> block size is 16 >> 1 SNES Function norm 1.000525348435e+04 >> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >> SNES Object: 1 MPI process >> type: newtonls >> maximum iterations=1, maximum function evaluations=-1 >> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >> total number of linear solver iterations=20 >> total number of function evaluations=2 >> norm schedule ALWAYS >> Jacobian is never rebuilt >> Jacobian is built using finite differences with coloring >> SNESLineSearch Object: 1 MPI process >> type: basic >> maxstep=1.000000e+08, minlambda=1.000000e-12 >> tolerances: relative=1.000000e-08, absolute=1.000000e-15, >> lambda=1.000000e-08 >> maximum iterations=40 >> KSP Object: 1 MPI process >> type: gmres >> restart=30, using Classical (unmodified) Gram-Schmidt >> Orthogonalization with no iterative refinement >> happy breakdown tolerance 1e-30 >> maximum iterations=20, initial guess is zero >> tolerances: relative=0.1, absolute=1e-15, divergence=10. >> left preconditioning >> using PRECONDITIONED norm type for convergence test >> PC Object: 1 MPI process >> type: none >> linear system matrix = precond matrix: >> Mat Object: 1 MPI process >> type: seqbaij >> rows=16384, cols=16384, bs=16 >> total: nonzeros=1277952, allocated nonzeros=1277952 >> total number of mallocs used during MatSetValues calls=0 >> block size is 16 >> 0 SNES Function norm 1.000525348435e+04 >> 0 KSP Residual norm 1.000525348435e+04 >> 1 KSP Residual norm 7.908741565645e+03 >> 2 KSP Residual norm 6.825263536988e+03 >> 3 KSP Residual norm 6.224930664967e+03 >> 4 KSP Residual norm 6.095547180474e+03 >> 5 KSP Residual norm 5.952968230397e+03 >> 6 KSP Residual norm 5.861251998127e+03 >> 7 KSP Residual norm 5.712439327726e+03 >> 8 KSP Residual norm 5.583056913167e+03 >> 9 KSP Residual norm 5.461768804526e+03 >> 10 KSP Residual norm 5.351937611030e+03 >> 11 KSP Residual norm 5.224288337536e+03 >> 12 KSP Residual norm 5.129863847028e+03 >> 13 KSP Residual norm 5.010818237161e+03 >> 14 KSP Residual norm 4.907162936143e+03 >> 15 KSP Residual norm 4.789564773923e+03 >> 16 KSP Residual norm 4.695173370709e+03 >> 17 KSP Residual norm 4.584070962145e+03 >> 18 KSP Residual norm 4.483061424714e+03 >> 19 KSP Residual norm 4.373384070713e+03 >> 20 KSP Residual norm 4.260704657576e+03 >> Linear solve converged due to CONVERGED_ITS iterations 20 >> KSP Object: 1 MPI process >> type: gmres >> restart=30, using Classical (unmodified) Gram-Schmidt >> Orthogonalization with no iterative refinement >> happy breakdown tolerance 1e-30 >> maximum iterations=20, initial guess is zero >> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>> left preconditioning >> using PRECONDITIONED norm type for convergence test >> PC Object: 1 MPI process >> type: none >> linear system matrix = precond matrix: >> Mat Object: 1 MPI process >> type: seqbaij >> rows=16384, cols=16384, bs=16 >> total: nonzeros=1277952, allocated nonzeros=1277952 >> total number of mallocs used during MatSetValues calls=0 >> block size is 16 >> 1 SNES Function norm 4.662386014874e+03 >> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >> SNES Object: 1 MPI process >> type: newtonls >> maximum iterations=1, maximum function evaluations=-1 >> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >> total number of linear solver iterations=20 >> total number of function evaluations=2 >> norm schedule ALWAYS >> Jacobian is never rebuilt >> Jacobian is built using finite differences with coloring >> SNESLineSearch Object: 1 MPI process >> type: basic >> maxstep=1.000000e+08, minlambda=1.000000e-12 >> tolerances: relative=1.000000e-08, absolute=1.000000e-15, >> lambda=1.000000e-08 >> maximum iterations=40 >> KSP Object: 1 MPI process >> type: gmres >> restart=30, using Classical (unmodified) Gram-Schmidt >> Orthogonalization with no iterative refinement >> happy breakdown tolerance 1e-30 >> maximum iterations=20, initial guess is zero >> tolerances: relative=0.1, absolute=1e-15, divergence=10. >> left preconditioning >> using PRECONDITIONED norm type for convergence test >> PC Object: 1 MPI process >> type: none >> linear system matrix = precond matrix: >> Mat Object: 1 MPI process >> type: seqbaij >> rows=16384, cols=16384, bs=16 >> total: nonzeros=1277952, allocated nonzeros=1277952 >> total number of mallocs used during MatSetValues calls=0 >> block size is 16 >> 0 SNES Function norm 4.662386014874e+03 >> 0 KSP Residual norm 4.662386014874e+03 >> 1 KSP Residual norm 4.408316259834e+03 >> 2 KSP Residual norm 4.184867769891e+03 >> 3 KSP Residual norm 4.079091244367e+03 >> 4 KSP Residual norm 4.009247390184e+03 >> 5 KSP Residual norm 3.928417371457e+03 >> 6 KSP Residual norm 3.865152075802e+03 >> 7 KSP Residual norm 3.795606446041e+03 >> 8 KSP Residual norm 3.735294554160e+03 >> 9 KSP Residual norm 3.674393726485e+03 >> 10 KSP Residual norm 3.617795166775e+03 >> 11 KSP Residual norm 3.563807982249e+03 >> 12 KSP Residual norm 3.512269444873e+03 >> 13 KSP Residual norm 3.455110223193e+03 >> 14 KSP Residual norm 3.407141247334e+03 >> 15 KSP Residual norm 3.356562415949e+03 >> 16 KSP Residual norm 3.312720047652e+03 >> 17 KSP Residual norm 3.263690150782e+03 >> 18 KSP Residual norm 3.219359862425e+03 >> 19 KSP Residual norm 3.173500955997e+03 >> 20 KSP Residual norm 3.127528790156e+03 >> Linear solve converged due to CONVERGED_ITS iterations 20 >> KSP Object: 1 MPI process >> type: gmres >> restart=30, using Classical (unmodified) Gram-Schmidt >> Orthogonalization with no iterative refinement >> happy breakdown tolerance 1e-30 >> maximum iterations=20, initial guess is zero >> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>> left preconditioning >> using PRECONDITIONED norm type for convergence test >> PC Object: 1 MPI process >> type: none >> linear system matrix = precond matrix: >> Mat Object: 1 MPI process >> type: seqbaij >> rows=16384, cols=16384, bs=16 >> total: nonzeros=1277952, allocated nonzeros=1277952 >> total number of mallocs used during MatSetValues calls=0 >> block size is 16 >> 1 SNES Function norm 3.186752172503e+03 >> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >> SNES Object: 1 MPI process >> type: newtonls >> maximum iterations=1, maximum function evaluations=-1 >> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >> total number of linear solver iterations=20 >> total number of function evaluations=2 >> norm schedule ALWAYS >> Jacobian is never rebuilt >> Jacobian is built using finite differences with coloring >> SNESLineSearch Object: 1 MPI process >> type: basic >> maxstep=1.000000e+08, minlambda=1.000000e-12 >> tolerances: relative=1.000000e-08, absolute=1.000000e-15, >> lambda=1.000000e-08 >> maximum iterations=40 >> KSP Object: 1 MPI process >> type: gmres >> restart=30, using Classical (unmodified) Gram-Schmidt >> Orthogonalization with no iterative refinement >> happy breakdown tolerance 1e-30 >> maximum iterations=20, initial guess is zero >> tolerances: relative=0.1, absolute=1e-15, divergence=10. >> left preconditioning >> using PRECONDITIONED norm type for convergence test >> PC Object: 1 MPI process >> type: none >> linear system matrix = precond matrix: >> Mat Object: 1 MPI process >> type: seqbaij >> rows=16384, cols=16384, bs=16 >> total: nonzeros=1277952, allocated nonzeros=1277952 >> total number of mallocs used during MatSetValues calls=0 >> block size is 16 >> >> On Thu, May 4, 2023 at 5:22?PM Matthew Knepley wrote: >> >>> On Thu, May 4, 2023 at 5:03?PM Mark Lohry wrote: >>> >>>> Do you get different results (in different runs) without >>>>> -snes_mf_operator? So just using an explicit matrix? >>>> >>>> >>>> Unfortunately I don't have an explicit matrix available for this, hence >>>> the MFFD/JFNK. >>>> >>> >>> I don't mean the actual matrix, I mean a representative matrix. >>> >>> >>>> >>>>> (Note: I am not convinced there is even a problem and think it may >>>>> be simply different order of floating point operations in different runs.) >>>>> >>>> >>>> I'm not convinced either, but running explicit RK for 10,000 iterations >>>> i get exactly the same results every time so i'm fairly confident it's not >>>> the residual evaluation. >>>> How would there be a different order of floating point ops in different >>>> runs in serial? >>>> >>>> No, I mean without -snes_mf_* (as Barry says), so we are just running >>>>> that solver with a sparse matrix. This would give me confidence >>>>> that nothing in the solver is variable. >>>>> >>>>> I could do the sparse finite difference jacobian once, save it to >>>> disk, and then use that system each time. >>>> >>> >>> Yes. That would work. >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> On Thu, May 4, 2023 at 4:57?PM Matthew Knepley >>>> wrote: >>>> >>>>> On Thu, May 4, 2023 at 4:44?PM Mark Lohry wrote: >>>>> >>>>>> Is your code valgrind clean? >>>>>>> >>>>>> >>>>>> Yes, I also initialize all allocations with NaNs to be sure I'm not >>>>>> using anything uninitialized. >>>>>> >>>>>> >>>>>>> We can try and test this. Replace your MatMFFD with an actual matrix >>>>>>> and run. Do you see any variability? >>>>>>> >>>>>> >>>>>> I think I did what you're asking. 
I have -snes_mf_operator set, and >>>>>> then SNESSetJacobian(snes, diag_ones, diag_ones, NULL, NULL) where >>>>>> diag_ones is a matrix with ones on the diagonal. Two runs below, still with >>>>>> differences but sometimes identical. >>>>>> >>>>> >>>>> No, I mean without -snes_mf_* (as Barry says), so we are just running >>>>> that solver with a sparse matrix. This would give me confidence >>>>> that nothing in the solver is variable. >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> 0 SNES Function norm 3.424003312857e+04 >>>>>> 0 KSP Residual norm 3.424003312857e+04 >>>>>> 1 KSP Residual norm 2.871734444536e+04 >>>>>> 2 KSP Residual norm 2.490276930242e+04 >>>>>> 3 KSP Residual norm 2.131675872968e+04 >>>>>> 4 KSP Residual norm 1.973129814235e+04 >>>>>> 5 KSP Residual norm 1.832377856317e+04 >>>>>> 6 KSP Residual norm 1.716783617436e+04 >>>>>> 7 KSP Residual norm 1.583963149542e+04 >>>>>> 8 KSP Residual norm 1.482272170304e+04 >>>>>> 9 KSP Residual norm 1.380312106742e+04 >>>>>> 10 KSP Residual norm 1.297793480658e+04 >>>>>> 11 KSP Residual norm 1.208599123244e+04 >>>>>> 12 KSP Residual norm 1.137345655227e+04 >>>>>> 13 KSP Residual norm 1.059676909366e+04 >>>>>> 14 KSP Residual norm 1.003823862398e+04 >>>>>> 15 KSP Residual norm 9.425879221354e+03 >>>>>> 16 KSP Residual norm 8.954805890038e+03 >>>>>> 17 KSP Residual norm 8.592372470456e+03 >>>>>> 18 KSP Residual norm 8.060707175821e+03 >>>>>> 19 KSP Residual norm 7.782057728723e+03 >>>>>> 20 KSP Residual norm 7.449686095424e+03 >>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>> KSP Object: 1 MPI process >>>>>> type: gmres >>>>>> restart=30, using Classical (unmodified) Gram-Schmidt >>>>>> Orthogonalization with no iterative refinement >>>>>> happy breakdown tolerance 1e-30 >>>>>> maximum iterations=20, initial guess is zero >>>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>>>>> left preconditioning >>>>>> using PRECONDITIONED norm type for convergence test >>>>>> PC Object: 1 MPI process >>>>>> type: none >>>>>> linear system matrix followed by preconditioner matrix: >>>>>> Mat Object: 1 MPI process >>>>>> type: mffd >>>>>> rows=16384, cols=16384 >>>>>> Matrix-free approximation: >>>>>> err=1.49012e-08 (relative error in function evaluation) >>>>>> Using wp compute h routine >>>>>> Does not compute normU >>>>>> Mat Object: 1 MPI process >>>>>> type: seqaij >>>>>> rows=16384, cols=16384 >>>>>> total: nonzeros=16384, allocated nonzeros=16384 >>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>> not using I-node routines >>>>>> 1 SNES Function norm 1.085015646971e+04 >>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>>> SNES Object: 1 MPI process >>>>>> type: newtonls >>>>>> maximum iterations=1, maximum function evaluations=-1 >>>>>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>>>>> total number of linear solver iterations=20 >>>>>> total number of function evaluations=23 >>>>>> norm schedule ALWAYS >>>>>> Jacobian is never rebuilt >>>>>> Jacobian is applied matrix-free with differencing >>>>>> Preconditioning Jacobian is built using finite differences with >>>>>> coloring >>>>>> SNESLineSearch Object: 1 MPI process >>>>>> type: basic >>>>>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>>>>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, >>>>>> lambda=1.000000e-08 >>>>>> maximum iterations=40 >>>>>> KSP Object: 1 MPI process >>>>>> type: gmres >>>>>> restart=30, using Classical (unmodified) Gram-Schmidt >>>>>> Orthogonalization with no iterative refinement >>>>>> happy breakdown tolerance 1e-30 >>>>>> maximum iterations=20, initial guess is zero >>>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>>>>> left preconditioning >>>>>> using PRECONDITIONED norm type for convergence test >>>>>> PC Object: 1 MPI process >>>>>> type: none >>>>>> linear system matrix followed by preconditioner matrix: >>>>>> Mat Object: 1 MPI process >>>>>> type: mffd >>>>>> rows=16384, cols=16384 >>>>>> Matrix-free approximation: >>>>>> err=1.49012e-08 (relative error in function evaluation) >>>>>> Using wp compute h routine >>>>>> Does not compute normU >>>>>> Mat Object: 1 MPI process >>>>>> type: seqaij >>>>>> rows=16384, cols=16384 >>>>>> total: nonzeros=16384, allocated nonzeros=16384 >>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>> not using I-node routines >>>>>> >>>>>> 0 SNES Function norm 3.424003312857e+04 >>>>>> 0 KSP Residual norm 3.424003312857e+04 >>>>>> 1 KSP Residual norm 2.871734444536e+04 >>>>>> 2 KSP Residual norm 2.490276931041e+04 >>>>>> 3 KSP Residual norm 2.131675873776e+04 >>>>>> 4 KSP Residual norm 1.973129814908e+04 >>>>>> 5 KSP Residual norm 1.832377852186e+04 >>>>>> 6 KSP Residual norm 1.716783608174e+04 >>>>>> 7 KSP Residual norm 1.583963128956e+04 >>>>>> 8 KSP Residual norm 1.482272160069e+04 >>>>>> 9 KSP Residual norm 1.380312087005e+04 >>>>>> 10 KSP Residual norm 1.297793458796e+04 >>>>>> 11 KSP Residual norm 1.208599115602e+04 >>>>>> 12 KSP Residual norm 1.137345657533e+04 >>>>>> 13 KSP Residual norm 1.059676906197e+04 >>>>>> 14 KSP Residual norm 1.003823857515e+04 >>>>>> 15 KSP Residual norm 9.425879177747e+03 >>>>>> 16 KSP Residual norm 8.954805850825e+03 >>>>>> 17 KSP Residual norm 8.592372413320e+03 >>>>>> 18 KSP Residual norm 8.060706994110e+03 >>>>>> 19 KSP Residual norm 7.782057560782e+03 >>>>>> 20 KSP Residual norm 7.449686034356e+03 >>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>> KSP Object: 1 MPI process >>>>>> type: gmres >>>>>> restart=30, using Classical (unmodified) Gram-Schmidt >>>>>> Orthogonalization with no iterative refinement >>>>>> happy breakdown tolerance 1e-30 >>>>>> maximum iterations=20, initial guess is zero >>>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>>>>> left preconditioning >>>>>> using PRECONDITIONED norm type for convergence test >>>>>> PC Object: 1 MPI process >>>>>> type: none >>>>>> linear system matrix followed by preconditioner matrix: >>>>>> Mat Object: 1 MPI process >>>>>> type: mffd >>>>>> rows=16384, cols=16384 >>>>>> Matrix-free approximation: >>>>>> err=1.49012e-08 (relative error in function evaluation) >>>>>> Using wp compute h routine >>>>>> Does not compute normU >>>>>> Mat Object: 1 MPI process >>>>>> type: seqaij >>>>>> rows=16384, cols=16384 >>>>>> total: nonzeros=16384, allocated nonzeros=16384 >>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>> not using I-node routines >>>>>> 1 SNES Function norm 1.085015821006e+04 >>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>>> SNES Object: 1 MPI process >>>>>> type: newtonls >>>>>> maximum iterations=1, maximum function evaluations=-1 >>>>>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>>>>> total number of linear solver iterations=20 >>>>>> total number of function evaluations=23 >>>>>> norm schedule ALWAYS >>>>>> Jacobian is never rebuilt >>>>>> Jacobian is applied matrix-free with differencing >>>>>> Preconditioning Jacobian is built using finite differences with >>>>>> coloring >>>>>> SNESLineSearch Object: 1 MPI process >>>>>> type: basic >>>>>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>>>>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, >>>>>> lambda=1.000000e-08 >>>>>> maximum iterations=40 >>>>>> KSP Object: 1 MPI process >>>>>> type: gmres >>>>>> restart=30, using Classical (unmodified) Gram-Schmidt >>>>>> Orthogonalization with no iterative refinement >>>>>> happy breakdown tolerance 1e-30 >>>>>> maximum iterations=20, initial guess is zero >>>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. >>>>>> left preconditioning >>>>>> using PRECONDITIONED norm type for convergence test >>>>>> PC Object: 1 MPI process >>>>>> type: none >>>>>> linear system matrix followed by preconditioner matrix: >>>>>> Mat Object: 1 MPI process >>>>>> type: mffd >>>>>> rows=16384, cols=16384 >>>>>> Matrix-free approximation: >>>>>> err=1.49012e-08 (relative error in function evaluation) >>>>>> Using wp compute h routine >>>>>> Does not compute normU >>>>>> Mat Object: 1 MPI process >>>>>> type: seqaij >>>>>> rows=16384, cols=16384 >>>>>> total: nonzeros=16384, allocated nonzeros=16384 >>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>> not using I-node routines >>>>>> >>>>>> On Thu, May 4, 2023 at 10:10?AM Matthew Knepley >>>>>> wrote: >>>>>> >>>>>>> On Thu, May 4, 2023 at 8:54?AM Mark Lohry wrote: >>>>>>> >>>>>>>> Try -pc_type none. >>>>>>>>> >>>>>>>> >>>>>>>> With -pc_type none the 0 KSP residual looks identical. But >>>>>>>> *sometimes* it's producing exactly the same history and others it's >>>>>>>> gradually changing. I'm reasonably confident my residual evaluation has no >>>>>>>> randomness, see info after the petsc output. >>>>>>>> >>>>>>> >>>>>>> We can try and test this. Replace your MatMFFD with an actual matrix >>>>>>> and run. Do you see any variability? >>>>>>> >>>>>>> If not, then it could be your routine, or it could be MatMFFD. So >>>>>>> run a few with -snes_view, and we can see if the >>>>>>> "w" parameter changes. 
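A minimal sketch, not part of the original exchange, of one way to watch the matrix-free differencing parameter between runs: under -snes_mf_operator the first Mat returned by SNESGetJacobian is the MFFD matrix, and MatMFFDGetH returns the last differencing value h that the "wp" routine computed. The function and variable names here are illustrative, and a recent PETSc (PetscCall/PETSC_SUCCESS) is assumed.

  #include <petscsnes.h>

  /* Query the last matrix-free differencing parameter h after a solve. */
  static PetscErrorCode ReportMFFDH(SNES snes)
  {
    Mat         Jmf;
    PetscScalar h;

    PetscFunctionBeginUser;
    PetscCall(SNESGetJacobian(snes, &Jmf, NULL, NULL, NULL)); /* Amat is the MFFD matrix under -snes_mf_operator */
    PetscCall(MatMFFDGetH(Jmf, &h));
    PetscCall(PetscPrintf(PETSC_COMM_SELF, "MFFD h = %g\n", (double)PetscRealPart(h)));
    PetscFunctionReturn(PETSC_SUCCESS);
  }

If h differs between otherwise identical runs, the variability is entering through the differencing itself rather than through GMRES or the preconditioner.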
>>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Matt >>>>>>> >>>>>>> >>>>>>>> solve history 1: >>>>>>>> >>>>>>>> 0 SNES Function norm 3.424003312857e+04 >>>>>>>> 0 KSP Residual norm 3.424003312857e+04 >>>>>>>> 1 KSP Residual norm 2.871734444536e+04 >>>>>>>> 2 KSP Residual norm 2.490276931041e+04 >>>>>>>> ... >>>>>>>> 20 KSP Residual norm 7.449686034356e+03 >>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>>> 1 SNES Function norm 1.085015821006e+04 >>>>>>>> >>>>>>>> solve history 2, identical to 1: >>>>>>>> >>>>>>>> 0 SNES Function norm 3.424003312857e+04 >>>>>>>> 0 KSP Residual norm 3.424003312857e+04 >>>>>>>> 1 KSP Residual norm 2.871734444536e+04 >>>>>>>> 2 KSP Residual norm 2.490276931041e+04 >>>>>>>> ... >>>>>>>> 20 KSP Residual norm 7.449686034356e+03 >>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>>> 1 SNES Function norm 1.085015821006e+04 >>>>>>>> >>>>>>>> solve history 3, identical KSP at 0 and 1, slight change at 2, >>>>>>>> growing difference to the end: >>>>>>>> 0 SNES Function norm 3.424003312857e+04 >>>>>>>> 0 KSP Residual norm 3.424003312857e+04 >>>>>>>> 1 KSP Residual norm 2.871734444536e+04 >>>>>>>> 2 KSP Residual norm 2.490276930242e+04 >>>>>>>> ... >>>>>>>> 20 KSP Residual norm 7.449686095424e+03 >>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>>> 1 SNES Function norm 1.085015646971e+04 >>>>>>>> >>>>>>>> >>>>>>>> Ths is using a standard explicit 3-stage Runge-Kutta smoother for >>>>>>>> 10 iterations, so 30 calls of the same residual evaluation, identical >>>>>>>> residuals every time >>>>>>>> >>>>>>>> run 1: >>>>>>>> >>>>>>>> # iteration rho rhou rhov >>>>>>>> rhoE abs_res rel_res >>>>>>>> umin vmax vmin elapsed_time >>>>>>>> >>>>>>>> # >>>>>>>> >>>>>>>> >>>>>>>> 1.00000e+00 1.086860616292e+00 2.782316758416e+02 >>>>>>>> 4.482867643761e+00 2.993435920340e+02 2.04353e+02 >>>>>>>> 1.00000e+00 -8.23945e-15 -6.15326e-15 -1.35563e-14 >>>>>>>> 6.34834e-01 >>>>>>>> 2.00000e+00 2.310547487017e+00 1.079059352425e+02 >>>>>>>> 3.958323921837e+00 5.058927165686e+02 2.58647e+02 >>>>>>>> 1.26568e+00 -1.02539e-14 -9.35368e-15 -1.69925e-14 >>>>>>>> 6.40063e-01 >>>>>>>> 3.00000e+00 2.361005867444e+00 5.706213331683e+01 >>>>>>>> 6.130016323357e+00 4.688968362579e+02 2.36201e+02 >>>>>>>> 1.15585e+00 -1.19370e-14 -1.15216e-14 -1.59733e-14 >>>>>>>> 6.45166e-01 >>>>>>>> 4.00000e+00 2.167518999963e+00 3.757541401594e+01 >>>>>>>> 6.313917437428e+00 4.054310291628e+02 2.03612e+02 >>>>>>>> 9.96372e-01 -1.81831e-14 -1.28312e-14 -1.46238e-14 >>>>>>>> 6.50494e-01 >>>>>>>> 5.00000e+00 1.941443738676e+00 2.884190334049e+01 >>>>>>>> 6.237106158479e+00 3.539201037156e+02 1.77577e+02 >>>>>>>> 8.68970e-01 3.56633e-14 -8.74089e-15 -1.06666e-14 >>>>>>>> 6.55656e-01 >>>>>>>> 6.00000e+00 1.736947124693e+00 2.429485695670e+01 >>>>>>>> 5.996962200407e+00 3.148280178142e+02 1.57913e+02 >>>>>>>> 7.72745e-01 -8.98634e-14 -2.41152e-14 -1.39713e-14 >>>>>>>> 6.60872e-01 >>>>>>>> 7.00000e+00 1.564153212635e+00 2.149609219810e+01 >>>>>>>> 5.786910705204e+00 2.848717011033e+02 1.42872e+02 >>>>>>>> 6.99144e-01 -2.95352e-13 -2.48158e-14 -2.39351e-14 >>>>>>>> 6.66041e-01 >>>>>>>> 8.00000e+00 1.419280815384e+00 1.950619804089e+01 >>>>>>>> 5.627281158306e+00 2.606623371229e+02 1.30728e+02 >>>>>>>> 6.39715e-01 8.98941e-13 1.09674e-13 3.78905e-14 >>>>>>>> 6.71316e-01 >>>>>>>> 9.00000e+00 1.296115915975e+00 1.794843530745e+01 >>>>>>>> 5.514933264437e+00 2.401524522393e+02 1.20444e+02 >>>>>>>> 5.89394e-01 1.70717e-12 1.38762e-14 1.09825e-13 >>>>>>>> 6.76447e-01 
>>>>>>>> 1.00000e+01 1.189639693918e+00 1.665381754953e+01 >>>>>>>> 5.433183087037e+00 2.222572900473e+02 1.11475e+02 >>>>>>>> 5.45501e-01 -4.22462e-12 -7.15206e-13 -2.28736e-13 >>>>>>>> 6.81716e-01 >>>>>>>> >>>>>>>> run N: >>>>>>>> >>>>>>>> >>>>>>>> # >>>>>>>> >>>>>>>> >>>>>>>> # iteration rho rhou rhov >>>>>>>> rhoE abs_res rel_res >>>>>>>> umin vmax vmin elapsed_time >>>>>>>> >>>>>>>> # >>>>>>>> >>>>>>>> >>>>>>>> 1.00000e+00 1.086860616292e+00 2.782316758416e+02 >>>>>>>> 4.482867643761e+00 2.993435920340e+02 2.04353e+02 >>>>>>>> 1.00000e+00 -8.23945e-15 -6.15326e-15 -1.35563e-14 >>>>>>>> 6.23316e-01 >>>>>>>> 2.00000e+00 2.310547487017e+00 1.079059352425e+02 >>>>>>>> 3.958323921837e+00 5.058927165686e+02 2.58647e+02 >>>>>>>> 1.26568e+00 -1.02539e-14 -9.35368e-15 -1.69925e-14 >>>>>>>> 6.28510e-01 >>>>>>>> 3.00000e+00 2.361005867444e+00 5.706213331683e+01 >>>>>>>> 6.130016323357e+00 4.688968362579e+02 2.36201e+02 >>>>>>>> 1.15585e+00 -1.19370e-14 -1.15216e-14 -1.59733e-14 >>>>>>>> 6.33558e-01 >>>>>>>> 4.00000e+00 2.167518999963e+00 3.757541401594e+01 >>>>>>>> 6.313917437428e+00 4.054310291628e+02 2.03612e+02 >>>>>>>> 9.96372e-01 -1.81831e-14 -1.28312e-14 -1.46238e-14 >>>>>>>> 6.38773e-01 >>>>>>>> 5.00000e+00 1.941443738676e+00 2.884190334049e+01 >>>>>>>> 6.237106158479e+00 3.539201037156e+02 1.77577e+02 >>>>>>>> 8.68970e-01 3.56633e-14 -8.74089e-15 -1.06666e-14 >>>>>>>> 6.43887e-01 >>>>>>>> 6.00000e+00 1.736947124693e+00 2.429485695670e+01 >>>>>>>> 5.996962200407e+00 3.148280178142e+02 1.57913e+02 >>>>>>>> 7.72745e-01 -8.98634e-14 -2.41152e-14 -1.39713e-14 >>>>>>>> 6.49073e-01 >>>>>>>> 7.00000e+00 1.564153212635e+00 2.149609219810e+01 >>>>>>>> 5.786910705204e+00 2.848717011033e+02 1.42872e+02 >>>>>>>> 6.99144e-01 -2.95352e-13 -2.48158e-14 -2.39351e-14 >>>>>>>> 6.54167e-01 >>>>>>>> 8.00000e+00 1.419280815384e+00 1.950619804089e+01 >>>>>>>> 5.627281158306e+00 2.606623371229e+02 1.30728e+02 >>>>>>>> 6.39715e-01 8.98941e-13 1.09674e-13 3.78905e-14 >>>>>>>> 6.59394e-01 >>>>>>>> 9.00000e+00 1.296115915975e+00 1.794843530745e+01 >>>>>>>> 5.514933264437e+00 2.401524522393e+02 1.20444e+02 >>>>>>>> 5.89394e-01 1.70717e-12 1.38762e-14 1.09825e-13 >>>>>>>> 6.64516e-01 >>>>>>>> 1.00000e+01 1.189639693918e+00 1.665381754953e+01 >>>>>>>> 5.433183087037e+00 2.222572900473e+02 1.11475e+02 >>>>>>>> 5.45501e-01 -4.22462e-12 -7.15206e-13 -2.28736e-13 >>>>>>>> 6.69677e-01 >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Thu, May 4, 2023 at 8:41?AM Mark Adams wrote: >>>>>>>> >>>>>>>>> ASM is just the sub PC with one proc but gets weaker with more >>>>>>>>> procs unless you use jacobi. (maybe I am missing something). >>>>>>>>> >>>>>>>>> On Thu, May 4, 2023 at 8:31?AM Mark Lohry >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> Please send the output of -snes_view. >>>>>>>>>>> >>>>>>>>>> pasted below. anything stand out? 
>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> SNES Object: 1 MPI process >>>>>>>>>> type: newtonls >>>>>>>>>> maximum iterations=1, maximum function evaluations=-1 >>>>>>>>>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>>>>>>>>> total number of linear solver iterations=20 >>>>>>>>>> total number of function evaluations=22 >>>>>>>>>> norm schedule ALWAYS >>>>>>>>>> Jacobian is never rebuilt >>>>>>>>>> Jacobian is applied matrix-free with differencing >>>>>>>>>> Preconditioning Jacobian is built using finite differences with >>>>>>>>>> coloring >>>>>>>>>> SNESLineSearch Object: 1 MPI process >>>>>>>>>> type: basic >>>>>>>>>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>>>>>>>>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, >>>>>>>>>> lambda=1.000000e-08 >>>>>>>>>> maximum iterations=40 >>>>>>>>>> KSP Object: 1 MPI process >>>>>>>>>> type: gmres >>>>>>>>>> restart=30, using Classical (unmodified) Gram-Schmidt >>>>>>>>>> Orthogonalization with no iterative refinement >>>>>>>>>> happy breakdown tolerance 1e-30 >>>>>>>>>> maximum iterations=20, initial guess is zero >>>>>>>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. >>>>>>>>>> left preconditioning >>>>>>>>>> using PRECONDITIONED norm type for convergence test >>>>>>>>>> PC Object: 1 MPI process >>>>>>>>>> type: asm >>>>>>>>>> total subdomain blocks = 1, amount of overlap = 0 >>>>>>>>>> restriction/interpolation type - RESTRICT >>>>>>>>>> Local solver information for first block is in the >>>>>>>>>> following KSP and PC objects on rank 0: >>>>>>>>>> Use -ksp_view ::ascii_info_detail to display information >>>>>>>>>> for all blocks >>>>>>>>>> KSP Object: (sub_) 1 MPI process >>>>>>>>>> type: preonly >>>>>>>>>> maximum iterations=10000, initial guess is zero >>>>>>>>>> tolerances: relative=1e-05, absolute=1e-50, >>>>>>>>>> divergence=10000. >>>>>>>>>> left preconditioning >>>>>>>>>> using NONE norm type for convergence test >>>>>>>>>> PC Object: (sub_) 1 MPI process >>>>>>>>>> type: ilu >>>>>>>>>> out-of-place factorization >>>>>>>>>> 0 levels of fill >>>>>>>>>> tolerance for zero pivot 2.22045e-14 >>>>>>>>>> matrix ordering: natural >>>>>>>>>> factor fill ratio given 1., needed 1. 
>>>>>>>>>> Factored matrix follows: >>>>>>>>>> Mat Object: (sub_) 1 MPI process >>>>>>>>>> type: seqbaij >>>>>>>>>> rows=16384, cols=16384, bs=16 >>>>>>>>>> package used to perform factorization: petsc >>>>>>>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>>>>>>> block size is 16 >>>>>>>>>> linear system matrix = precond matrix: >>>>>>>>>> Mat Object: (sub_) 1 MPI process >>>>>>>>>> type: seqbaij >>>>>>>>>> rows=16384, cols=16384, bs=16 >>>>>>>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>>>>> block size is 16 >>>>>>>>>> linear system matrix followed by preconditioner matrix: >>>>>>>>>> Mat Object: 1 MPI process >>>>>>>>>> type: mffd >>>>>>>>>> rows=16384, cols=16384 >>>>>>>>>> Matrix-free approximation: >>>>>>>>>> err=1.49012e-08 (relative error in function evaluation) >>>>>>>>>> Using wp compute h routine >>>>>>>>>> Does not compute normU >>>>>>>>>> Mat Object: 1 MPI process >>>>>>>>>> type: seqbaij >>>>>>>>>> rows=16384, cols=16384, bs=16 >>>>>>>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>>>>> block size is 16 >>>>>>>>>> >>>>>>>>>> On Thu, May 4, 2023 at 8:30?AM Mark Adams >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> If you are using MG what is the coarse grid solver? >>>>>>>>>>> -snes_view might give you that. >>>>>>>>>>> >>>>>>>>>>> On Thu, May 4, 2023 at 8:25?AM Matthew Knepley < >>>>>>>>>>> knepley at gmail.com> wrote: >>>>>>>>>>> >>>>>>>>>>>> On Thu, May 4, 2023 at 8:21?AM Mark Lohry >>>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Do they start very similarly and then slowly drift further >>>>>>>>>>>>>> apart? >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Yes, this. I take it this sounds familiar? >>>>>>>>>>>>> >>>>>>>>>>>>> See these two examples with 20 fixed iterations pasted at the >>>>>>>>>>>>> end. The difference for one solve is slight (final SNES norm is identical >>>>>>>>>>>>> to 5 digits), but in the context I'm using it in (repeated applications to >>>>>>>>>>>>> solve a steady state multigrid problem, though here just one level) the >>>>>>>>>>>>> differences add up such that I might reach global convergence in 35 >>>>>>>>>>>>> iterations or 38. It's not the end of the world, but I was expecting that >>>>>>>>>>>>> with -np 1 these would be identical and I'm not sure where the root cause >>>>>>>>>>>>> would be. >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> The initial KSP residual is different, so its the PC. >>>>>>>>>>>> Please send the output of -snes_view. If your ASM is using direct >>>>>>>>>>>> factorization, then it >>>>>>>>>>>> could be randomness in whatever LU you are using. >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> >>>>>>>>>>>> Matt >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> 0 SNES Function norm 2.801842107848e+04 >>>>>>>>>>>>> 0 KSP Residual norm 4.045639499595e+01 >>>>>>>>>>>>> 1 KSP Residual norm 1.917999809040e+01 >>>>>>>>>>>>> 2 KSP Residual norm 1.616048521958e+01 >>>>>>>>>>>>> [...] 
>>>>>>>>>>>>> 19 KSP Residual norm 8.788043518111e-01 >>>>>>>>>>>>> 20 KSP Residual norm 6.570851270214e-01 >>>>>>>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>>>>>>>> 1 SNES Function norm 1.801309983345e+03 >>>>>>>>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Same system, identical initial 0 SNES norm, 0 KSP is slightly >>>>>>>>>>>>> different >>>>>>>>>>>>> >>>>>>>>>>>>> 0 SNES Function norm 2.801842107848e+04 >>>>>>>>>>>>> 0 KSP Residual norm 4.045639473002e+01 >>>>>>>>>>>>> 1 KSP Residual norm 1.917999883034e+01 >>>>>>>>>>>>> 2 KSP Residual norm 1.616048572016e+01 >>>>>>>>>>>>> [...] >>>>>>>>>>>>> 19 KSP Residual norm 8.788046348957e-01 >>>>>>>>>>>>> 20 KSP Residual norm 6.570859588610e-01 >>>>>>>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>>>>>>>> 1 SNES Function norm 1.801311320322e+03 >>>>>>>>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>>>>>>>>>> >>>>>>>>>>>>> On Wed, May 3, 2023 at 11:05?PM Barry Smith >>>>>>>>>>>>> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Do they start very similarly and then slowly drift further >>>>>>>>>>>>>> apart? That is the first couple of KSP iterations they are almost identical >>>>>>>>>>>>>> but then for each iteration get a bit further. Similar for the SNES >>>>>>>>>>>>>> iterations, starting close and then for more iterations and more solves >>>>>>>>>>>>>> they start moving apart. Or do they suddenly jump to be very different? You >>>>>>>>>>>>>> can run with -snes_monitor -ksp_monitor >>>>>>>>>>>>>> >>>>>>>>>>>>>> On May 3, 2023, at 9:07 PM, Mark Lohry >>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> This is on a single MPI rank. I haven't checked the coloring, >>>>>>>>>>>>>> was just guessing there. But the solutions/residuals are slightly different >>>>>>>>>>>>>> from run to run. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Fair to say that for serial JFNK/asm ilu0/gmres we should >>>>>>>>>>>>>> expect bitwise identical results? >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Wed, May 3, 2023, 8:50 PM Barry Smith >>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> No, the coloring should be identical every time. Do you >>>>>>>>>>>>>>> see differences with 1 MPI rank? (Or much smaller ones?). >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> > On May 3, 2023, at 8:42 PM, Mark Lohry >>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > I'm running multiple iterations of newtonls with an >>>>>>>>>>>>>>> MFFD/JFNK nonlinear solver where I give it the sparsity. PC asm, KSP gmres, >>>>>>>>>>>>>>> with SNESSetLagJacobian -2 (compute once and then frozen jacobian). >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > I'm seeing slight (<1%) but nonzero differences in >>>>>>>>>>>>>>> residuals from run to run. I'm wondering where randomness might enter here >>>>>>>>>>>>>>> -- does the jacobian coloring use a random seed? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> -- >>>>>>>>>>>> What most experimenters take for granted before they begin >>>>>>>>>>>> their experiments is infinitely more interesting than any results to which >>>>>>>>>>>> their experiments lead. 
>>>>>>>>>>>> -- Norbert Wiener >>>>>>>>>>>> >>>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their >>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>> experiments lead. >>>>>>> -- Norbert Wiener >>>>>>> >>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>> >>>>>>> >>>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> >> > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Fri May 5 15:58:35 2023 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 5 May 2023 16:58:35 -0400 Subject: [petsc-users] sources of floating point randomness in JFNK in serial In-Reply-To: References: <5318727D-B9F9-48BC-A7CE-94EBDB08566F@petsc.dev> Message-ID: <316AF677-3AE7-4797-9457-71705BAFF9DC@petsc.dev> > On May 5, 2023, at 4:45 PM, Mark Lohry wrote: > > wow. leaving -O3 and turning off -march=native seems to have made it repeatable. this is on an avx2 cpu if it matters. > >> out-of-order instructions may be performed thus, two runs may have different order of operations >> > > this is terrifying if true. the source code path is exactly the same every time but the cpu does different things? Sure. And you will see more of it in the future, not less. It is not so much the CPU does different things each time but that the same things happen in a different order (and different order for floating point arithmetic means different results). > > On Fri, May 5, 2023 at 10:55?AM Barry Smith > wrote: >> >> Mark, >> >> Thank you. You do have aggressive optimizations: -O3 -march=native, which means out-of-order instructions may be performed thus, two runs may have different order of operations and possibly different round-off values. >> >> You could try turning off all of this with -O0 for an experiment and see what happens. My guess is that you will see much smaller differences in the residuals. >> >> Barry >> >> >>> On May 5, 2023, at 8:11 AM, Mark Lohry > wrote: >>> >>> >>> >>> On Thu, May 4, 2023 at 9:51?PM Barry Smith > wrote: >>>> >>>> Send configure.log >>>> >>>> >>>>> On May 4, 2023, at 5:35 PM, Mark Lohry > wrote: >>>>> >>>>>> Sure, but why only once and why save to disk? Why not just use that computed approximate Jacobian at each Newton step to drive the Newton solves along for a bunch of time steps? >>>>> >>>>> Ah I get what you mean. Okay I did three newton steps with the same LHS, with a few repeated manual tests. 3 out of 4 times i got the same exact history. is it in the realm of possibility that a hardware error could cause something this subtle, bad memory bit or something? >>>>> >>>>> 2 runs of 3 newton solves below, ever-so-slightly different. 
>>>>> >>>>> >>>>> 0 SNES Function norm 3.424003312857e+04 >>>>> 0 KSP Residual norm 3.424003312857e+04 >>>>> 1 KSP Residual norm 2.886124328003e+04 >>>>> 2 KSP Residual norm 2.504664994246e+04 >>>>> 3 KSP Residual norm 2.104615835161e+04 >>>>> 4 KSP Residual norm 1.938102896632e+04 >>>>> 5 KSP Residual norm 1.793774642408e+04 >>>>> 6 KSP Residual norm 1.671392566980e+04 >>>>> 7 KSP Residual norm 1.501504103873e+04 >>>>> 8 KSP Residual norm 1.366362900747e+04 >>>>> 9 KSP Residual norm 1.240398500429e+04 >>>>> 10 KSP Residual norm 1.156293733914e+04 >>>>> 11 KSP Residual norm 1.066296477958e+04 >>>>> 12 KSP Residual norm 9.835601966950e+03 >>>>> 13 KSP Residual norm 9.017480191491e+03 >>>>> 14 KSP Residual norm 8.415336139780e+03 >>>>> 15 KSP Residual norm 7.807497808435e+03 >>>>> 16 KSP Residual norm 7.341703768294e+03 >>>>> 17 KSP Residual norm 6.979298049282e+03 >>>>> 18 KSP Residual norm 6.521277772081e+03 >>>>> 19 KSP Residual norm 6.174842408773e+03 >>>>> 20 KSP Residual norm 5.889819665003e+03 >>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>> KSP Object: 1 MPI process >>>>> type: gmres >>>>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>>>> happy breakdown tolerance 1e-30 >>>>> maximum iterations=20, initial guess is zero >>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. >>>>> left preconditioning >>>>> using PRECONDITIONED norm type for convergence test >>>>> PC Object: 1 MPI process >>>>> type: none >>>>> linear system matrix = precond matrix: >>>>> Mat Object: 1 MPI process >>>>> type: seqbaij >>>>> rows=16384, cols=16384, bs=16 >>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> block size is 16 >>>>> 1 SNES Function norm 1.000525348433e+04 >>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>> SNES Object: 1 MPI process >>>>> type: newtonls >>>>> maximum iterations=1, maximum function evaluations=-1 >>>>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>>>> total number of linear solver iterations=20 >>>>> total number of function evaluations=2 >>>>> norm schedule ALWAYS >>>>> Jacobian is never rebuilt >>>>> Jacobian is built using finite differences with coloring >>>>> SNESLineSearch Object: 1 MPI process >>>>> type: basic >>>>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>>>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 >>>>> maximum iterations=40 >>>>> KSP Object: 1 MPI process >>>>> type: gmres >>>>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>>>> happy breakdown tolerance 1e-30 >>>>> maximum iterations=20, initial guess is zero >>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>>>> left preconditioning >>>>> using PRECONDITIONED norm type for convergence test >>>>> PC Object: 1 MPI process >>>>> type: none >>>>> linear system matrix = precond matrix: >>>>> Mat Object: 1 MPI process >>>>> type: seqbaij >>>>> rows=16384, cols=16384, bs=16 >>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> block size is 16 >>>>> 0 SNES Function norm 1.000525348433e+04 >>>>> 0 KSP Residual norm 1.000525348433e+04 >>>>> 1 KSP Residual norm 7.908741564765e+03 >>>>> 2 KSP Residual norm 6.825263536686e+03 >>>>> 3 KSP Residual norm 6.224930664968e+03 >>>>> 4 KSP Residual norm 6.095547180532e+03 >>>>> 5 KSP Residual norm 5.952968230430e+03 >>>>> 6 KSP Residual norm 5.861251998116e+03 >>>>> 7 KSP Residual norm 5.712439327755e+03 >>>>> 8 KSP Residual norm 5.583056913266e+03 >>>>> 9 KSP Residual norm 5.461768804626e+03 >>>>> 10 KSP Residual norm 5.351937611098e+03 >>>>> 11 KSP Residual norm 5.224288337578e+03 >>>>> 12 KSP Residual norm 5.129863847081e+03 >>>>> 13 KSP Residual norm 5.010818237218e+03 >>>>> 14 KSP Residual norm 4.907162936199e+03 >>>>> 15 KSP Residual norm 4.789564773955e+03 >>>>> 16 KSP Residual norm 4.695173370720e+03 >>>>> 17 KSP Residual norm 4.584070962171e+03 >>>>> 18 KSP Residual norm 4.483061424742e+03 >>>>> 19 KSP Residual norm 4.373384070745e+03 >>>>> 20 KSP Residual norm 4.260704657592e+03 >>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>> KSP Object: 1 MPI process >>>>> type: gmres >>>>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>>>> happy breakdown tolerance 1e-30 >>>>> maximum iterations=20, initial guess is zero >>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. >>>>> left preconditioning >>>>> using PRECONDITIONED norm type for convergence test >>>>> PC Object: 1 MPI process >>>>> type: none >>>>> linear system matrix = precond matrix: >>>>> Mat Object: 1 MPI process >>>>> type: seqbaij >>>>> rows=16384, cols=16384, bs=16 >>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> block size is 16 >>>>> 1 SNES Function norm 4.662386014882e+03 >>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>> SNES Object: 1 MPI process >>>>> type: newtonls >>>>> maximum iterations=1, maximum function evaluations=-1 >>>>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>>>> total number of linear solver iterations=20 >>>>> total number of function evaluations=2 >>>>> norm schedule ALWAYS >>>>> Jacobian is never rebuilt >>>>> Jacobian is built using finite differences with coloring >>>>> SNESLineSearch Object: 1 MPI process >>>>> type: basic >>>>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>>>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 >>>>> maximum iterations=40 >>>>> KSP Object: 1 MPI process >>>>> type: gmres >>>>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>>>> happy breakdown tolerance 1e-30 >>>>> maximum iterations=20, initial guess is zero >>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>>>> left preconditioning >>>>> using PRECONDITIONED norm type for convergence test >>>>> PC Object: 1 MPI process >>>>> type: none >>>>> linear system matrix = precond matrix: >>>>> Mat Object: 1 MPI process >>>>> type: seqbaij >>>>> rows=16384, cols=16384, bs=16 >>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> block size is 16 >>>>> 0 SNES Function norm 4.662386014882e+03 >>>>> 0 KSP Residual norm 4.662386014882e+03 >>>>> 1 KSP Residual norm 4.408316259864e+03 >>>>> 2 KSP Residual norm 4.184867769829e+03 >>>>> 3 KSP Residual norm 4.079091244351e+03 >>>>> 4 KSP Residual norm 4.009247390166e+03 >>>>> 5 KSP Residual norm 3.928417371428e+03 >>>>> 6 KSP Residual norm 3.865152075780e+03 >>>>> 7 KSP Residual norm 3.795606446033e+03 >>>>> 8 KSP Residual norm 3.735294554158e+03 >>>>> 9 KSP Residual norm 3.674393726487e+03 >>>>> 10 KSP Residual norm 3.617795166786e+03 >>>>> 11 KSP Residual norm 3.563807982274e+03 >>>>> 12 KSP Residual norm 3.512269444921e+03 >>>>> 13 KSP Residual norm 3.455110223236e+03 >>>>> 14 KSP Residual norm 3.407141247372e+03 >>>>> 15 KSP Residual norm 3.356562415982e+03 >>>>> 16 KSP Residual norm 3.312720047685e+03 >>>>> 17 KSP Residual norm 3.263690150810e+03 >>>>> 18 KSP Residual norm 3.219359862444e+03 >>>>> 19 KSP Residual norm 3.173500955995e+03 >>>>> 20 KSP Residual norm 3.127528790155e+03 >>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>> KSP Object: 1 MPI process >>>>> type: gmres >>>>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>>>> happy breakdown tolerance 1e-30 >>>>> maximum iterations=20, initial guess is zero >>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. >>>>> left preconditioning >>>>> using PRECONDITIONED norm type for convergence test >>>>> PC Object: 1 MPI process >>>>> type: none >>>>> linear system matrix = precond matrix: >>>>> Mat Object: 1 MPI process >>>>> type: seqbaij >>>>> rows=16384, cols=16384, bs=16 >>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> block size is 16 >>>>> 1 SNES Function norm 3.186752172556e+03 >>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>> SNES Object: 1 MPI process >>>>> type: newtonls >>>>> maximum iterations=1, maximum function evaluations=-1 >>>>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>>>> total number of linear solver iterations=20 >>>>> total number of function evaluations=2 >>>>> norm schedule ALWAYS >>>>> Jacobian is never rebuilt >>>>> Jacobian is built using finite differences with coloring >>>>> SNESLineSearch Object: 1 MPI process >>>>> type: basic >>>>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>>>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 >>>>> maximum iterations=40 >>>>> KSP Object: 1 MPI process >>>>> type: gmres >>>>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>>>> happy breakdown tolerance 1e-30 >>>>> maximum iterations=20, initial guess is zero >>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>>>> left preconditioning >>>>> using PRECONDITIONED norm type for convergence test >>>>> PC Object: 1 MPI process >>>>> type: none >>>>> linear system matrix = precond matrix: >>>>> Mat Object: 1 MPI process >>>>> type: seqbaij >>>>> rows=16384, cols=16384, bs=16 >>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> block size is 16 >>>>> >>>>> >>>>> >>>>> 0 SNES Function norm 3.424003312857e+04 >>>>> 0 KSP Residual norm 3.424003312857e+04 >>>>> 1 KSP Residual norm 2.886124328003e+04 >>>>> 2 KSP Residual norm 2.504664994221e+04 >>>>> 3 KSP Residual norm 2.104615835130e+04 >>>>> 4 KSP Residual norm 1.938102896610e+04 >>>>> 5 KSP Residual norm 1.793774642406e+04 >>>>> 6 KSP Residual norm 1.671392566981e+04 >>>>> 7 KSP Residual norm 1.501504103854e+04 >>>>> 8 KSP Residual norm 1.366362900726e+04 >>>>> 9 KSP Residual norm 1.240398500414e+04 >>>>> 10 KSP Residual norm 1.156293733914e+04 >>>>> 11 KSP Residual norm 1.066296477972e+04 >>>>> 12 KSP Residual norm 9.835601967036e+03 >>>>> 13 KSP Residual norm 9.017480191500e+03 >>>>> 14 KSP Residual norm 8.415336139732e+03 >>>>> 15 KSP Residual norm 7.807497808414e+03 >>>>> 16 KSP Residual norm 7.341703768300e+03 >>>>> 17 KSP Residual norm 6.979298049244e+03 >>>>> 18 KSP Residual norm 6.521277772042e+03 >>>>> 19 KSP Residual norm 6.174842408713e+03 >>>>> 20 KSP Residual norm 5.889819664983e+03 >>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>> KSP Object: 1 MPI process >>>>> type: gmres >>>>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>>>> happy breakdown tolerance 1e-30 >>>>> maximum iterations=20, initial guess is zero >>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. >>>>> left preconditioning >>>>> using PRECONDITIONED norm type for convergence test >>>>> PC Object: 1 MPI process >>>>> type: none >>>>> linear system matrix = precond matrix: >>>>> Mat Object: 1 MPI process >>>>> type: seqbaij >>>>> rows=16384, cols=16384, bs=16 >>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> block size is 16 >>>>> 1 SNES Function norm 1.000525348435e+04 >>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>> SNES Object: 1 MPI process >>>>> type: newtonls >>>>> maximum iterations=1, maximum function evaluations=-1 >>>>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>>>> total number of linear solver iterations=20 >>>>> total number of function evaluations=2 >>>>> norm schedule ALWAYS >>>>> Jacobian is never rebuilt >>>>> Jacobian is built using finite differences with coloring >>>>> SNESLineSearch Object: 1 MPI process >>>>> type: basic >>>>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>>>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 >>>>> maximum iterations=40 >>>>> KSP Object: 1 MPI process >>>>> type: gmres >>>>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>>>> happy breakdown tolerance 1e-30 >>>>> maximum iterations=20, initial guess is zero >>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>>>> left preconditioning >>>>> using PRECONDITIONED norm type for convergence test >>>>> PC Object: 1 MPI process >>>>> type: none >>>>> linear system matrix = precond matrix: >>>>> Mat Object: 1 MPI process >>>>> type: seqbaij >>>>> rows=16384, cols=16384, bs=16 >>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> block size is 16 >>>>> 0 SNES Function norm 1.000525348435e+04 >>>>> 0 KSP Residual norm 1.000525348435e+04 >>>>> 1 KSP Residual norm 7.908741565645e+03 >>>>> 2 KSP Residual norm 6.825263536988e+03 >>>>> 3 KSP Residual norm 6.224930664967e+03 >>>>> 4 KSP Residual norm 6.095547180474e+03 >>>>> 5 KSP Residual norm 5.952968230397e+03 >>>>> 6 KSP Residual norm 5.861251998127e+03 >>>>> 7 KSP Residual norm 5.712439327726e+03 >>>>> 8 KSP Residual norm 5.583056913167e+03 >>>>> 9 KSP Residual norm 5.461768804526e+03 >>>>> 10 KSP Residual norm 5.351937611030e+03 >>>>> 11 KSP Residual norm 5.224288337536e+03 >>>>> 12 KSP Residual norm 5.129863847028e+03 >>>>> 13 KSP Residual norm 5.010818237161e+03 >>>>> 14 KSP Residual norm 4.907162936143e+03 >>>>> 15 KSP Residual norm 4.789564773923e+03 >>>>> 16 KSP Residual norm 4.695173370709e+03 >>>>> 17 KSP Residual norm 4.584070962145e+03 >>>>> 18 KSP Residual norm 4.483061424714e+03 >>>>> 19 KSP Residual norm 4.373384070713e+03 >>>>> 20 KSP Residual norm 4.260704657576e+03 >>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>> KSP Object: 1 MPI process >>>>> type: gmres >>>>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>>>> happy breakdown tolerance 1e-30 >>>>> maximum iterations=20, initial guess is zero >>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. >>>>> left preconditioning >>>>> using PRECONDITIONED norm type for convergence test >>>>> PC Object: 1 MPI process >>>>> type: none >>>>> linear system matrix = precond matrix: >>>>> Mat Object: 1 MPI process >>>>> type: seqbaij >>>>> rows=16384, cols=16384, bs=16 >>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> block size is 16 >>>>> 1 SNES Function norm 4.662386014874e+03 >>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>> SNES Object: 1 MPI process >>>>> type: newtonls >>>>> maximum iterations=1, maximum function evaluations=-1 >>>>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>>>> total number of linear solver iterations=20 >>>>> total number of function evaluations=2 >>>>> norm schedule ALWAYS >>>>> Jacobian is never rebuilt >>>>> Jacobian is built using finite differences with coloring >>>>> SNESLineSearch Object: 1 MPI process >>>>> type: basic >>>>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>>>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 >>>>> maximum iterations=40 >>>>> KSP Object: 1 MPI process >>>>> type: gmres >>>>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>>>> happy breakdown tolerance 1e-30 >>>>> maximum iterations=20, initial guess is zero >>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>>>> left preconditioning >>>>> using PRECONDITIONED norm type for convergence test >>>>> PC Object: 1 MPI process >>>>> type: none >>>>> linear system matrix = precond matrix: >>>>> Mat Object: 1 MPI process >>>>> type: seqbaij >>>>> rows=16384, cols=16384, bs=16 >>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> block size is 16 >>>>> 0 SNES Function norm 4.662386014874e+03 >>>>> 0 KSP Residual norm 4.662386014874e+03 >>>>> 1 KSP Residual norm 4.408316259834e+03 >>>>> 2 KSP Residual norm 4.184867769891e+03 >>>>> 3 KSP Residual norm 4.079091244367e+03 >>>>> 4 KSP Residual norm 4.009247390184e+03 >>>>> 5 KSP Residual norm 3.928417371457e+03 >>>>> 6 KSP Residual norm 3.865152075802e+03 >>>>> 7 KSP Residual norm 3.795606446041e+03 >>>>> 8 KSP Residual norm 3.735294554160e+03 >>>>> 9 KSP Residual norm 3.674393726485e+03 >>>>> 10 KSP Residual norm 3.617795166775e+03 >>>>> 11 KSP Residual norm 3.563807982249e+03 >>>>> 12 KSP Residual norm 3.512269444873e+03 >>>>> 13 KSP Residual norm 3.455110223193e+03 >>>>> 14 KSP Residual norm 3.407141247334e+03 >>>>> 15 KSP Residual norm 3.356562415949e+03 >>>>> 16 KSP Residual norm 3.312720047652e+03 >>>>> 17 KSP Residual norm 3.263690150782e+03 >>>>> 18 KSP Residual norm 3.219359862425e+03 >>>>> 19 KSP Residual norm 3.173500955997e+03 >>>>> 20 KSP Residual norm 3.127528790156e+03 >>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>> KSP Object: 1 MPI process >>>>> type: gmres >>>>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>>>> happy breakdown tolerance 1e-30 >>>>> maximum iterations=20, initial guess is zero >>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. >>>>> left preconditioning >>>>> using PRECONDITIONED norm type for convergence test >>>>> PC Object: 1 MPI process >>>>> type: none >>>>> linear system matrix = precond matrix: >>>>> Mat Object: 1 MPI process >>>>> type: seqbaij >>>>> rows=16384, cols=16384, bs=16 >>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> block size is 16 >>>>> 1 SNES Function norm 3.186752172503e+03 >>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>> SNES Object: 1 MPI process >>>>> type: newtonls >>>>> maximum iterations=1, maximum function evaluations=-1 >>>>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>>>> total number of linear solver iterations=20 >>>>> total number of function evaluations=2 >>>>> norm schedule ALWAYS >>>>> Jacobian is never rebuilt >>>>> Jacobian is built using finite differences with coloring >>>>> SNESLineSearch Object: 1 MPI process >>>>> type: basic >>>>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>>>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 >>>>> maximum iterations=40 >>>>> KSP Object: 1 MPI process >>>>> type: gmres >>>>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>>>> happy breakdown tolerance 1e-30 >>>>> maximum iterations=20, initial guess is zero >>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>>>> left preconditioning >>>>> using PRECONDITIONED norm type for convergence test >>>>> PC Object: 1 MPI process >>>>> type: none >>>>> linear system matrix = precond matrix: >>>>> Mat Object: 1 MPI process >>>>> type: seqbaij >>>>> rows=16384, cols=16384, bs=16 >>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>> total number of mallocs used during MatSetValues calls=0 >>>>> block size is 16 >>>>> >>>>> On Thu, May 4, 2023 at 5:22?PM Matthew Knepley > wrote: >>>>>> On Thu, May 4, 2023 at 5:03?PM Mark Lohry > wrote: >>>>>>>> Do you get different results (in different runs) without -snes_mf_operator? So just using an explicit matrix? >>>>>>> >>>>>>> Unfortunately I don't have an explicit matrix available for this, hence the MFFD/JFNK. >>>>>> >>>>>> I don't mean the actual matrix, I mean a representative matrix. >>>>>> >>>>>>>> >>>>>>>> (Note: I am not convinced there is even a problem and think it may be simply different order of floating point operations in different runs.) >>>>>>> >>>>>>> I'm not convinced either, but running explicit RK for 10,000 iterations i get exactly the same results every time so i'm fairly confident it's not the residual evaluation. >>>>>>> How would there be a different order of floating point ops in different runs in serial? >>>>>>> >>>>>>>> No, I mean without -snes_mf_* (as Barry says), so we are just running that solver with a sparse matrix. This would give me confidence >>>>>>>> that nothing in the solver is variable. >>>>>>>> >>>>>>> I could do the sparse finite difference jacobian once, save it to disk, and then use that system each time. >>>>>> >>>>>> Yes. That would work. >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>>> On Thu, May 4, 2023 at 4:57?PM Matthew Knepley > wrote: >>>>>>>> On Thu, May 4, 2023 at 4:44?PM Mark Lohry > wrote: >>>>>>>>>> Is your code valgrind clean? >>>>>>>>> >>>>>>>>> Yes, I also initialize all allocations with NaNs to be sure I'm not using anything uninitialized. >>>>>>>>> >>>>>>>>>> >>>>>>>>>> We can try and test this. Replace your MatMFFD with an actual matrix and run. Do you see any variability? >>>>>>>>> >>>>>>>>> I think I did what you're asking. I have -snes_mf_operator set, and then SNESSetJacobian(snes, diag_ones, diag_ones, NULL, NULL) where diag_ones is a matrix with ones on the diagonal. Two runs below, still with differences but sometimes identical. >>>>>>>> >>>>>>>> No, I mean without -snes_mf_* (as Barry says), so we are just running that solver with a sparse matrix. This would give me confidence >>>>>>>> that nothing in the solver is variable. 
>>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Matt >>>>>>>> >>>>>>>>> 0 SNES Function norm 3.424003312857e+04 >>>>>>>>> 0 KSP Residual norm 3.424003312857e+04 >>>>>>>>> 1 KSP Residual norm 2.871734444536e+04 >>>>>>>>> 2 KSP Residual norm 2.490276930242e+04 >>>>>>>>> 3 KSP Residual norm 2.131675872968e+04 >>>>>>>>> 4 KSP Residual norm 1.973129814235e+04 >>>>>>>>> 5 KSP Residual norm 1.832377856317e+04 >>>>>>>>> 6 KSP Residual norm 1.716783617436e+04 >>>>>>>>> 7 KSP Residual norm 1.583963149542e+04 >>>>>>>>> 8 KSP Residual norm 1.482272170304e+04 >>>>>>>>> 9 KSP Residual norm 1.380312106742e+04 >>>>>>>>> 10 KSP Residual norm 1.297793480658e+04 >>>>>>>>> 11 KSP Residual norm 1.208599123244e+04 >>>>>>>>> 12 KSP Residual norm 1.137345655227e+04 >>>>>>>>> 13 KSP Residual norm 1.059676909366e+04 >>>>>>>>> 14 KSP Residual norm 1.003823862398e+04 >>>>>>>>> 15 KSP Residual norm 9.425879221354e+03 >>>>>>>>> 16 KSP Residual norm 8.954805890038e+03 >>>>>>>>> 17 KSP Residual norm 8.592372470456e+03 >>>>>>>>> 18 KSP Residual norm 8.060707175821e+03 >>>>>>>>> 19 KSP Residual norm 7.782057728723e+03 >>>>>>>>> 20 KSP Residual norm 7.449686095424e+03 >>>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>>>> KSP Object: 1 MPI process >>>>>>>>> type: gmres >>>>>>>>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>>>>>>>> happy breakdown tolerance 1e-30 >>>>>>>>> maximum iterations=20, initial guess is zero >>>>>>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. >>>>>>>>> left preconditioning >>>>>>>>> using PRECONDITIONED norm type for convergence test >>>>>>>>> PC Object: 1 MPI process >>>>>>>>> type: none >>>>>>>>> linear system matrix followed by preconditioner matrix: >>>>>>>>> Mat Object: 1 MPI process >>>>>>>>> type: mffd >>>>>>>>> rows=16384, cols=16384 >>>>>>>>> Matrix-free approximation: >>>>>>>>> err=1.49012e-08 (relative error in function evaluation) >>>>>>>>> Using wp compute h routine >>>>>>>>> Does not compute normU >>>>>>>>> Mat Object: 1 MPI process >>>>>>>>> type: seqaij >>>>>>>>> rows=16384, cols=16384 >>>>>>>>> total: nonzeros=16384, allocated nonzeros=16384 >>>>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>>>> not using I-node routines >>>>>>>>> 1 SNES Function norm 1.085015646971e+04 >>>>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>>>>>> SNES Object: 1 MPI process >>>>>>>>> type: newtonls >>>>>>>>> maximum iterations=1, maximum function evaluations=-1 >>>>>>>>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>>>>>>>> total number of linear solver iterations=20 >>>>>>>>> total number of function evaluations=23 >>>>>>>>> norm schedule ALWAYS >>>>>>>>> Jacobian is never rebuilt >>>>>>>>> Jacobian is applied matrix-free with differencing >>>>>>>>> Preconditioning Jacobian is built using finite differences with coloring >>>>>>>>> SNESLineSearch Object: 1 MPI process >>>>>>>>> type: basic >>>>>>>>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>>>>>>>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 >>>>>>>>> maximum iterations=40 >>>>>>>>> KSP Object: 1 MPI process >>>>>>>>> type: gmres >>>>>>>>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>>>>>>>> happy breakdown tolerance 1e-30 >>>>>>>>> maximum iterations=20, initial guess is zero >>>>>>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>>>>>>>> left preconditioning >>>>>>>>> using PRECONDITIONED norm type for convergence test >>>>>>>>> PC Object: 1 MPI process >>>>>>>>> type: none >>>>>>>>> linear system matrix followed by preconditioner matrix: >>>>>>>>> Mat Object: 1 MPI process >>>>>>>>> type: mffd >>>>>>>>> rows=16384, cols=16384 >>>>>>>>> Matrix-free approximation: >>>>>>>>> err=1.49012e-08 (relative error in function evaluation) >>>>>>>>> Using wp compute h routine >>>>>>>>> Does not compute normU >>>>>>>>> Mat Object: 1 MPI process >>>>>>>>> type: seqaij >>>>>>>>> rows=16384, cols=16384 >>>>>>>>> total: nonzeros=16384, allocated nonzeros=16384 >>>>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>>>> not using I-node routines >>>>>>>>> >>>>>>>>> 0 SNES Function norm 3.424003312857e+04 >>>>>>>>> 0 KSP Residual norm 3.424003312857e+04 >>>>>>>>> 1 KSP Residual norm 2.871734444536e+04 >>>>>>>>> 2 KSP Residual norm 2.490276931041e+04 >>>>>>>>> 3 KSP Residual norm 2.131675873776e+04 >>>>>>>>> 4 KSP Residual norm 1.973129814908e+04 >>>>>>>>> 5 KSP Residual norm 1.832377852186e+04 >>>>>>>>> 6 KSP Residual norm 1.716783608174e+04 >>>>>>>>> 7 KSP Residual norm 1.583963128956e+04 >>>>>>>>> 8 KSP Residual norm 1.482272160069e+04 >>>>>>>>> 9 KSP Residual norm 1.380312087005e+04 >>>>>>>>> 10 KSP Residual norm 1.297793458796e+04 >>>>>>>>> 11 KSP Residual norm 1.208599115602e+04 >>>>>>>>> 12 KSP Residual norm 1.137345657533e+04 >>>>>>>>> 13 KSP Residual norm 1.059676906197e+04 >>>>>>>>> 14 KSP Residual norm 1.003823857515e+04 >>>>>>>>> 15 KSP Residual norm 9.425879177747e+03 >>>>>>>>> 16 KSP Residual norm 8.954805850825e+03 >>>>>>>>> 17 KSP Residual norm 8.592372413320e+03 >>>>>>>>> 18 KSP Residual norm 8.060706994110e+03 >>>>>>>>> 19 KSP Residual norm 7.782057560782e+03 >>>>>>>>> 20 KSP Residual norm 7.449686034356e+03 >>>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>>>> KSP Object: 1 MPI process >>>>>>>>> type: gmres >>>>>>>>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>>>>>>>> happy breakdown tolerance 1e-30 >>>>>>>>> maximum iterations=20, initial guess is zero >>>>>>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>>>>>>>> left preconditioning >>>>>>>>> using PRECONDITIONED norm type for convergence test >>>>>>>>> PC Object: 1 MPI process >>>>>>>>> type: none >>>>>>>>> linear system matrix followed by preconditioner matrix: >>>>>>>>> Mat Object: 1 MPI process >>>>>>>>> type: mffd >>>>>>>>> rows=16384, cols=16384 >>>>>>>>> Matrix-free approximation: >>>>>>>>> err=1.49012e-08 (relative error in function evaluation) >>>>>>>>> Using wp compute h routine >>>>>>>>> Does not compute normU >>>>>>>>> Mat Object: 1 MPI process >>>>>>>>> type: seqaij >>>>>>>>> rows=16384, cols=16384 >>>>>>>>> total: nonzeros=16384, allocated nonzeros=16384 >>>>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>>>> not using I-node routines >>>>>>>>> 1 SNES Function norm 1.085015821006e+04 >>>>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>>>>>> SNES Object: 1 MPI process >>>>>>>>> type: newtonls >>>>>>>>> maximum iterations=1, maximum function evaluations=-1 >>>>>>>>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>>>>>>>> total number of linear solver iterations=20 >>>>>>>>> total number of function evaluations=23 >>>>>>>>> norm schedule ALWAYS >>>>>>>>> Jacobian is never rebuilt >>>>>>>>> Jacobian is applied matrix-free with differencing >>>>>>>>> Preconditioning Jacobian is built using finite differences with coloring >>>>>>>>> SNESLineSearch Object: 1 MPI process >>>>>>>>> type: basic >>>>>>>>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>>>>>>>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 >>>>>>>>> maximum iterations=40 >>>>>>>>> KSP Object: 1 MPI process >>>>>>>>> type: gmres >>>>>>>>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>>>>>>>> happy breakdown tolerance 1e-30 >>>>>>>>> maximum iterations=20, initial guess is zero >>>>>>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. >>>>>>>>> left preconditioning >>>>>>>>> using PRECONDITIONED norm type for convergence test >>>>>>>>> PC Object: 1 MPI process >>>>>>>>> type: none >>>>>>>>> linear system matrix followed by preconditioner matrix: >>>>>>>>> Mat Object: 1 MPI process >>>>>>>>> type: mffd >>>>>>>>> rows=16384, cols=16384 >>>>>>>>> Matrix-free approximation: >>>>>>>>> err=1.49012e-08 (relative error in function evaluation) >>>>>>>>> Using wp compute h routine >>>>>>>>> Does not compute normU >>>>>>>>> Mat Object: 1 MPI process >>>>>>>>> type: seqaij >>>>>>>>> rows=16384, cols=16384 >>>>>>>>> total: nonzeros=16384, allocated nonzeros=16384 >>>>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>>>> not using I-node routines >>>>>>>>> >>>>>>>>> On Thu, May 4, 2023 at 10:10?AM Matthew Knepley > wrote: >>>>>>>>>> On Thu, May 4, 2023 at 8:54?AM Mark Lohry > wrote: >>>>>>>>>>>> Try -pc_type none. >>>>>>>>>>> >>>>>>>>>>> With -pc_type none the 0 KSP residual looks identical. But *sometimes* it's producing exactly the same history and others it's gradually changing. I'm reasonably confident my residual evaluation has no randomness, see info after the petsc output. >>>>>>>>>> >>>>>>>>>> We can try and test this. Replace your MatMFFD with an actual matrix and run. Do you see any variability? >>>>>>>>>> >>>>>>>>>> If not, then it could be your routine, or it could be MatMFFD. So run a few with -snes_view, and we can see if the >>>>>>>>>> "w" parameter changes. 
>>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> Matt >>>>>>>>>> >>>>>>>>>>> solve history 1: >>>>>>>>>>> >>>>>>>>>>> 0 SNES Function norm 3.424003312857e+04 >>>>>>>>>>> 0 KSP Residual norm 3.424003312857e+04 >>>>>>>>>>> 1 KSP Residual norm 2.871734444536e+04 >>>>>>>>>>> 2 KSP Residual norm 2.490276931041e+04 >>>>>>>>>>> ... >>>>>>>>>>> 20 KSP Residual norm 7.449686034356e+03 >>>>>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>>>>>> 1 SNES Function norm 1.085015821006e+04 >>>>>>>>>>> >>>>>>>>>>> solve history 2, identical to 1: >>>>>>>>>>> >>>>>>>>>>> 0 SNES Function norm 3.424003312857e+04 >>>>>>>>>>> 0 KSP Residual norm 3.424003312857e+04 >>>>>>>>>>> 1 KSP Residual norm 2.871734444536e+04 >>>>>>>>>>> 2 KSP Residual norm 2.490276931041e+04 >>>>>>>>>>> ... >>>>>>>>>>> 20 KSP Residual norm 7.449686034356e+03 >>>>>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>>>>>> 1 SNES Function norm 1.085015821006e+04 >>>>>>>>>>> >>>>>>>>>>> solve history 3, identical KSP at 0 and 1, slight change at 2, growing difference to the end: >>>>>>>>>>> 0 SNES Function norm 3.424003312857e+04 >>>>>>>>>>> 0 KSP Residual norm 3.424003312857e+04 >>>>>>>>>>> 1 KSP Residual norm 2.871734444536e+04 >>>>>>>>>>> 2 KSP Residual norm 2.490276930242e+04 >>>>>>>>>>> ... >>>>>>>>>>> 20 KSP Residual norm 7.449686095424e+03 >>>>>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>>>>>> 1 SNES Function norm 1.085015646971e+04 >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Ths is using a standard explicit 3-stage Runge-Kutta smoother for 10 iterations, so 30 calls of the same residual evaluation, identical residuals every time >>>>>>>>>>> >>>>>>>>>>> run 1: >>>>>>>>>>> >>>>>>>>>>> # iteration rho rhou rhov rhoE abs_res rel_res umin vmax vmin elapsed_time >>>>>>>>>>> # >>>>>>>>>>> 1.00000e+00 1.086860616292e+00 2.782316758416e+02 4.482867643761e+00 2.993435920340e+02 2.04353e+02 1.00000e+00 -8.23945e-15 -6.15326e-15 -1.35563e-14 6.34834e-01 >>>>>>>>>>> 2.00000e+00 2.310547487017e+00 1.079059352425e+02 3.958323921837e+00 5.058927165686e+02 2.58647e+02 1.26568e+00 -1.02539e-14 -9.35368e-15 -1.69925e-14 6.40063e-01 >>>>>>>>>>> 3.00000e+00 2.361005867444e+00 5.706213331683e+01 6.130016323357e+00 4.688968362579e+02 2.36201e+02 1.15585e+00 -1.19370e-14 -1.15216e-14 -1.59733e-14 6.45166e-01 >>>>>>>>>>> 4.00000e+00 2.167518999963e+00 3.757541401594e+01 6.313917437428e+00 4.054310291628e+02 2.03612e+02 9.96372e-01 -1.81831e-14 -1.28312e-14 -1.46238e-14 6.50494e-01 >>>>>>>>>>> 5.00000e+00 1.941443738676e+00 2.884190334049e+01 6.237106158479e+00 3.539201037156e+02 1.77577e+02 8.68970e-01 3.56633e-14 -8.74089e-15 -1.06666e-14 6.55656e-01 >>>>>>>>>>> 6.00000e+00 1.736947124693e+00 2.429485695670e+01 5.996962200407e+00 3.148280178142e+02 1.57913e+02 7.72745e-01 -8.98634e-14 -2.41152e-14 -1.39713e-14 6.60872e-01 >>>>>>>>>>> 7.00000e+00 1.564153212635e+00 2.149609219810e+01 5.786910705204e+00 2.848717011033e+02 1.42872e+02 6.99144e-01 -2.95352e-13 -2.48158e-14 -2.39351e-14 6.66041e-01 >>>>>>>>>>> 8.00000e+00 1.419280815384e+00 1.950619804089e+01 5.627281158306e+00 2.606623371229e+02 1.30728e+02 6.39715e-01 8.98941e-13 1.09674e-13 3.78905e-14 6.71316e-01 >>>>>>>>>>> 9.00000e+00 1.296115915975e+00 1.794843530745e+01 5.514933264437e+00 2.401524522393e+02 1.20444e+02 5.89394e-01 1.70717e-12 1.38762e-14 1.09825e-13 6.76447e-01 >>>>>>>>>>> 1.00000e+01 1.189639693918e+00 1.665381754953e+01 5.433183087037e+00 2.222572900473e+02 1.11475e+02 5.45501e-01 -4.22462e-12 -7.15206e-13 -2.28736e-13 
6.81716e-01 >>>>>>>>>>> >>>>>>>>>>> run N: >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> # >>>>>>>>>>> # iteration rho rhou rhov rhoE abs_res rel_res umin vmax vmin elapsed_time >>>>>>>>>>> # >>>>>>>>>>> 1.00000e+00 1.086860616292e+00 2.782316758416e+02 4.482867643761e+00 2.993435920340e+02 2.04353e+02 1.00000e+00 -8.23945e-15 -6.15326e-15 -1.35563e-14 6.23316e-01 >>>>>>>>>>> 2.00000e+00 2.310547487017e+00 1.079059352425e+02 3.958323921837e+00 5.058927165686e+02 2.58647e+02 1.26568e+00 -1.02539e-14 -9.35368e-15 -1.69925e-14 6.28510e-01 >>>>>>>>>>> 3.00000e+00 2.361005867444e+00 5.706213331683e+01 6.130016323357e+00 4.688968362579e+02 2.36201e+02 1.15585e+00 -1.19370e-14 -1.15216e-14 -1.59733e-14 6.33558e-01 >>>>>>>>>>> 4.00000e+00 2.167518999963e+00 3.757541401594e+01 6.313917437428e+00 4.054310291628e+02 2.03612e+02 9.96372e-01 -1.81831e-14 -1.28312e-14 -1.46238e-14 6.38773e-01 >>>>>>>>>>> 5.00000e+00 1.941443738676e+00 2.884190334049e+01 6.237106158479e+00 3.539201037156e+02 1.77577e+02 8.68970e-01 3.56633e-14 -8.74089e-15 -1.06666e-14 6.43887e-01 >>>>>>>>>>> 6.00000e+00 1.736947124693e+00 2.429485695670e+01 5.996962200407e+00 3.148280178142e+02 1.57913e+02 7.72745e-01 -8.98634e-14 -2.41152e-14 -1.39713e-14 6.49073e-01 >>>>>>>>>>> 7.00000e+00 1.564153212635e+00 2.149609219810e+01 5.786910705204e+00 2.848717011033e+02 1.42872e+02 6.99144e-01 -2.95352e-13 -2.48158e-14 -2.39351e-14 6.54167e-01 >>>>>>>>>>> 8.00000e+00 1.419280815384e+00 1.950619804089e+01 5.627281158306e+00 2.606623371229e+02 1.30728e+02 6.39715e-01 8.98941e-13 1.09674e-13 3.78905e-14 6.59394e-01 >>>>>>>>>>> 9.00000e+00 1.296115915975e+00 1.794843530745e+01 5.514933264437e+00 2.401524522393e+02 1.20444e+02 5.89394e-01 1.70717e-12 1.38762e-14 1.09825e-13 6.64516e-01 >>>>>>>>>>> 1.00000e+01 1.189639693918e+00 1.665381754953e+01 5.433183087037e+00 2.222572900473e+02 1.11475e+02 5.45501e-01 -4.22462e-12 -7.15206e-13 -2.28736e-13 6.69677e-01 >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Thu, May 4, 2023 at 8:41?AM Mark Adams > wrote: >>>>>>>>>>>> ASM is just the sub PC with one proc but gets weaker with more procs unless you use jacobi. (maybe I am missing something). >>>>>>>>>>>> >>>>>>>>>>>> On Thu, May 4, 2023 at 8:31?AM Mark Lohry > wrote: >>>>>>>>>>>>>> Please send the output of -snes_view. >>>>>>>>>>>>> pasted below. anything stand out? 
>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> SNES Object: 1 MPI process >>>>>>>>>>>>> type: newtonls >>>>>>>>>>>>> maximum iterations=1, maximum function evaluations=-1 >>>>>>>>>>>>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>>>>>>>>>>>> total number of linear solver iterations=20 >>>>>>>>>>>>> total number of function evaluations=22 >>>>>>>>>>>>> norm schedule ALWAYS >>>>>>>>>>>>> Jacobian is never rebuilt >>>>>>>>>>>>> Jacobian is applied matrix-free with differencing >>>>>>>>>>>>> Preconditioning Jacobian is built using finite differences with coloring >>>>>>>>>>>>> SNESLineSearch Object: 1 MPI process >>>>>>>>>>>>> type: basic >>>>>>>>>>>>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>>>>>>>>>>>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 >>>>>>>>>>>>> maximum iterations=40 >>>>>>>>>>>>> KSP Object: 1 MPI process >>>>>>>>>>>>> type: gmres >>>>>>>>>>>>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>>>>>>>>>>>> happy breakdown tolerance 1e-30 >>>>>>>>>>>>> maximum iterations=20, initial guess is zero >>>>>>>>>>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. >>>>>>>>>>>>> left preconditioning >>>>>>>>>>>>> using PRECONDITIONED norm type for convergence test >>>>>>>>>>>>> PC Object: 1 MPI process >>>>>>>>>>>>> type: asm >>>>>>>>>>>>> total subdomain blocks = 1, amount of overlap = 0 >>>>>>>>>>>>> restriction/interpolation type - RESTRICT >>>>>>>>>>>>> Local solver information for first block is in the following KSP and PC objects on rank 0: >>>>>>>>>>>>> Use -ksp_view ::ascii_info_detail to display information for all blocks >>>>>>>>>>>>> KSP Object: (sub_) 1 MPI process >>>>>>>>>>>>> type: preonly >>>>>>>>>>>>> maximum iterations=10000, initial guess is zero >>>>>>>>>>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>>>>>>>>>> left preconditioning >>>>>>>>>>>>> using NONE norm type for convergence test >>>>>>>>>>>>> PC Object: (sub_) 1 MPI process >>>>>>>>>>>>> type: ilu >>>>>>>>>>>>> out-of-place factorization >>>>>>>>>>>>> 0 levels of fill >>>>>>>>>>>>> tolerance for zero pivot 2.22045e-14 >>>>>>>>>>>>> matrix ordering: natural >>>>>>>>>>>>> factor fill ratio given 1., needed 1. 
>>>>>>>>>>>>> Factored matrix follows: >>>>>>>>>>>>> Mat Object: (sub_) 1 MPI process >>>>>>>>>>>>> type: seqbaij >>>>>>>>>>>>> rows=16384, cols=16384, bs=16 >>>>>>>>>>>>> package used to perform factorization: petsc >>>>>>>>>>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>>>>>>>>>> block size is 16 >>>>>>>>>>>>> linear system matrix = precond matrix: >>>>>>>>>>>>> Mat Object: (sub_) 1 MPI process >>>>>>>>>>>>> type: seqbaij >>>>>>>>>>>>> rows=16384, cols=16384, bs=16 >>>>>>>>>>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>>>>>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>>>>>>>> block size is 16 >>>>>>>>>>>>> linear system matrix followed by preconditioner matrix: >>>>>>>>>>>>> Mat Object: 1 MPI process >>>>>>>>>>>>> type: mffd >>>>>>>>>>>>> rows=16384, cols=16384 >>>>>>>>>>>>> Matrix-free approximation: >>>>>>>>>>>>> err=1.49012e-08 (relative error in function evaluation) >>>>>>>>>>>>> Using wp compute h routine >>>>>>>>>>>>> Does not compute normU >>>>>>>>>>>>> Mat Object: 1 MPI process >>>>>>>>>>>>> type: seqbaij >>>>>>>>>>>>> rows=16384, cols=16384, bs=16 >>>>>>>>>>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>>>>>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>>>>>>>> block size is 16 >>>>>>>>>>>>> >>>>>>>>>>>>> On Thu, May 4, 2023 at 8:30?AM Mark Adams > wrote: >>>>>>>>>>>>>> If you are using MG what is the coarse grid solver? >>>>>>>>>>>>>> -snes_view might give you that. >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Thu, May 4, 2023 at 8:25?AM Matthew Knepley > wrote: >>>>>>>>>>>>>>> On Thu, May 4, 2023 at 8:21?AM Mark Lohry > wrote: >>>>>>>>>>>>>>>>> Do they start very similarly and then slowly drift further apart? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Yes, this. I take it this sounds familiar? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> See these two examples with 20 fixed iterations pasted at the end. The difference for one solve is slight (final SNES norm is identical to 5 digits), but in the context I'm using it in (repeated applications to solve a steady state multigrid problem, though here just one level) the differences add up such that I might reach global convergence in 35 iterations or 38. It's not the end of the world, but I was expecting that with -np 1 these would be identical and I'm not sure where the root cause would be. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> The initial KSP residual is different, so its the PC. Please send the output of -snes_view. If your ASM is using direct factorization, then it >>>>>>>>>>>>>>> could be randomness in whatever LU you are using. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Matt >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 0 SNES Function norm 2.801842107848e+04 >>>>>>>>>>>>>>>> 0 KSP Residual norm 4.045639499595e+01 >>>>>>>>>>>>>>>> 1 KSP Residual norm 1.917999809040e+01 >>>>>>>>>>>>>>>> 2 KSP Residual norm 1.616048521958e+01 >>>>>>>>>>>>>>>> [...] 
>>>>>>>>>>>>>>>> 19 KSP Residual norm 8.788043518111e-01 >>>>>>>>>>>>>>>> 20 KSP Residual norm 6.570851270214e-01 >>>>>>>>>>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>>>>>>>>>>> 1 SNES Function norm 1.801309983345e+03 >>>>>>>>>>>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Same system, identical initial 0 SNES norm, 0 KSP is slightly different >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 0 SNES Function norm 2.801842107848e+04 >>>>>>>>>>>>>>>> 0 KSP Residual norm 4.045639473002e+01 >>>>>>>>>>>>>>>> 1 KSP Residual norm 1.917999883034e+01 >>>>>>>>>>>>>>>> 2 KSP Residual norm 1.616048572016e+01 >>>>>>>>>>>>>>>> [...] >>>>>>>>>>>>>>>> 19 KSP Residual norm 8.788046348957e-01 >>>>>>>>>>>>>>>> 20 KSP Residual norm 6.570859588610e-01 >>>>>>>>>>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>>>>>>>>>>> 1 SNES Function norm 1.801311320322e+03 >>>>>>>>>>>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Wed, May 3, 2023 at 11:05?PM Barry Smith > wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Do they start very similarly and then slowly drift further apart? That is the first couple of KSP iterations they are almost identical but then for each iteration get a bit further. Similar for the SNES iterations, starting close and then for more iterations and more solves they start moving apart. Or do they suddenly jump to be very different? You can run with -snes_monitor -ksp_monitor >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On May 3, 2023, at 9:07 PM, Mark Lohry > wrote: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> This is on a single MPI rank. I haven't checked the coloring, was just guessing there. But the solutions/residuals are slightly different from run to run. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Fair to say that for serial JFNK/asm ilu0/gmres we should expect bitwise identical results? >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On Wed, May 3, 2023, 8:50 PM Barry Smith > wrote: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> No, the coloring should be identical every time. Do you see differences with 1 MPI rank? (Or much smaller ones?). >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> > On May 3, 2023, at 8:42 PM, Mark Lohry > wrote: >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> > I'm running multiple iterations of newtonls with an MFFD/JFNK nonlinear solver where I give it the sparsity. PC asm, KSP gmres, with SNESSetLagJacobian -2 (compute once and then frozen jacobian). >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> > I'm seeing slight (<1%) but nonzero differences in residuals from run to run. I'm wondering where randomness might enter here -- does the jacobian coloring use a random seed? >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>>>>>>>>>>> -- Norbert Wiener >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
>>>>>>>>>> -- Norbert Wiener >>>>>>>>>> >>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>>>> -- Norbert Wiener >>>>>>>> >>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From mlohry at gmail.com Fri May 5 16:02:21 2023 From: mlohry at gmail.com (Mark Lohry) Date: Fri, 5 May 2023 17:02:21 -0400 Subject: [petsc-users] sources of floating point randomness in JFNK in serial In-Reply-To: <316AF677-3AE7-4797-9457-71705BAFF9DC@petsc.dev> References: <5318727D-B9F9-48BC-A7CE-94EBDB08566F@petsc.dev> <316AF677-3AE7-4797-9457-71705BAFF9DC@petsc.dev> Message-ID: are there any safe subsets of -march=whatever? i had it on to take advantage of simd ops on avx512 chips but never looked so close at the exact results. On Fri, May 5, 2023 at 4:58?PM Barry Smith wrote: > > > On May 5, 2023, at 4:45 PM, Mark Lohry wrote: > > wow. leaving -O3 and turning off -march=native seems to have made it > repeatable. this is on an avx2 cpu if it matters. > > out-of-order instructions may be performed thus, two runs may have >> different order of operations >> >> > this is terrifying if true. the source code path is exactly the same every > time but the cpu does different things? > > > Sure. And you will see more of it in the future, not less. It is not so > much the CPU does different things each time but that the same things > happen in a different order (and different order for floating point > arithmetic means different results). > > > On Fri, May 5, 2023 at 10:55?AM Barry Smith wrote: > >> >> Mark, >> >> Thank you. You do have aggressive optimizations: -O3 -march=native, >> which means out-of-order instructions may be performed thus, two runs may >> have different order of operations and possibly different round-off values. >> >> You could try turning off all of this with -O0 for an experiment and >> see what happens. My guess is that you will see much smaller differences in >> the residuals. >> >> Barry >> >> >> On May 5, 2023, at 8:11 AM, Mark Lohry wrote: >> >> >> >> On Thu, May 4, 2023 at 9:51?PM Barry Smith wrote: >> >>> >>> Send configure.log >>> >>> >>> On May 4, 2023, at 5:35 PM, Mark Lohry wrote: >>> >>> Sure, but why only once and why save to disk? Why not just use that >>>> computed approximate Jacobian at each Newton step to drive the Newton >>>> solves along for a bunch of time steps? >>> >>> >>> Ah I get what you mean. Okay I did three newton steps with the same LHS, >>> with a few repeated manual tests. 3 out of 4 times i got the same exact >>> history. is it in the realm of possibility that a hardware error could >>> cause something this subtle, bad memory bit or something? >>> >>> 2 runs of 3 newton solves below, ever-so-slightly different. 
>>> >>> >>> 0 SNES Function norm 3.424003312857e+04 >>> 0 KSP Residual norm 3.424003312857e+04 >>> 1 KSP Residual norm 2.886124328003e+04 >>> 2 KSP Residual norm 2.504664994246e+04 >>> 3 KSP Residual norm 2.104615835161e+04 >>> 4 KSP Residual norm 1.938102896632e+04 >>> 5 KSP Residual norm 1.793774642408e+04 >>> 6 KSP Residual norm 1.671392566980e+04 >>> 7 KSP Residual norm 1.501504103873e+04 >>> 8 KSP Residual norm 1.366362900747e+04 >>> 9 KSP Residual norm 1.240398500429e+04 >>> 10 KSP Residual norm 1.156293733914e+04 >>> 11 KSP Residual norm 1.066296477958e+04 >>> 12 KSP Residual norm 9.835601966950e+03 >>> 13 KSP Residual norm 9.017480191491e+03 >>> 14 KSP Residual norm 8.415336139780e+03 >>> 15 KSP Residual norm 7.807497808435e+03 >>> 16 KSP Residual norm 7.341703768294e+03 >>> 17 KSP Residual norm 6.979298049282e+03 >>> 18 KSP Residual norm 6.521277772081e+03 >>> 19 KSP Residual norm 6.174842408773e+03 >>> 20 KSP Residual norm 5.889819665003e+03 >>> Linear solve converged due to CONVERGED_ITS iterations 20 >>> KSP Object: 1 MPI process >>> type: gmres >>> restart=30, using Classical (unmodified) Gram-Schmidt >>> Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=20, initial guess is zero >>> tolerances: relative=0.1, absolute=1e-15, divergence=10. >>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI process >>> type: none >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI process >>> type: seqbaij >>> rows=16384, cols=16384, bs=16 >>> total: nonzeros=1277952, allocated nonzeros=1277952 >>> total number of mallocs used during MatSetValues calls=0 >>> block size is 16 >>> 1 SNES Function norm 1.000525348433e+04 >>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>> SNES Object: 1 MPI process >>> type: newtonls >>> maximum iterations=1, maximum function evaluations=-1 >>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>> total number of linear solver iterations=20 >>> total number of function evaluations=2 >>> norm schedule ALWAYS >>> Jacobian is never rebuilt >>> Jacobian is built using finite differences with coloring >>> SNESLineSearch Object: 1 MPI process >>> type: basic >>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, >>> lambda=1.000000e-08 >>> maximum iterations=40 >>> KSP Object: 1 MPI process >>> type: gmres >>> restart=30, using Classical (unmodified) Gram-Schmidt >>> Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=20, initial guess is zero >>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI process >>> type: none >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI process >>> type: seqbaij >>> rows=16384, cols=16384, bs=16 >>> total: nonzeros=1277952, allocated nonzeros=1277952 >>> total number of mallocs used during MatSetValues calls=0 >>> block size is 16 >>> 0 SNES Function norm 1.000525348433e+04 >>> 0 KSP Residual norm 1.000525348433e+04 >>> 1 KSP Residual norm 7.908741564765e+03 >>> 2 KSP Residual norm 6.825263536686e+03 >>> 3 KSP Residual norm 6.224930664968e+03 >>> 4 KSP Residual norm 6.095547180532e+03 >>> 5 KSP Residual norm 5.952968230430e+03 >>> 6 KSP Residual norm 5.861251998116e+03 >>> 7 KSP Residual norm 5.712439327755e+03 >>> 8 KSP Residual norm 5.583056913266e+03 >>> 9 KSP Residual norm 5.461768804626e+03 >>> 10 KSP Residual norm 5.351937611098e+03 >>> 11 KSP Residual norm 5.224288337578e+03 >>> 12 KSP Residual norm 5.129863847081e+03 >>> 13 KSP Residual norm 5.010818237218e+03 >>> 14 KSP Residual norm 4.907162936199e+03 >>> 15 KSP Residual norm 4.789564773955e+03 >>> 16 KSP Residual norm 4.695173370720e+03 >>> 17 KSP Residual norm 4.584070962171e+03 >>> 18 KSP Residual norm 4.483061424742e+03 >>> 19 KSP Residual norm 4.373384070745e+03 >>> 20 KSP Residual norm 4.260704657592e+03 >>> Linear solve converged due to CONVERGED_ITS iterations 20 >>> KSP Object: 1 MPI process >>> type: gmres >>> restart=30, using Classical (unmodified) Gram-Schmidt >>> Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=20, initial guess is zero >>> tolerances: relative=0.1, absolute=1e-15, divergence=10. >>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI process >>> type: none >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI process >>> type: seqbaij >>> rows=16384, cols=16384, bs=16 >>> total: nonzeros=1277952, allocated nonzeros=1277952 >>> total number of mallocs used during MatSetValues calls=0 >>> block size is 16 >>> 1 SNES Function norm 4.662386014882e+03 >>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>> SNES Object: 1 MPI process >>> type: newtonls >>> maximum iterations=1, maximum function evaluations=-1 >>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>> total number of linear solver iterations=20 >>> total number of function evaluations=2 >>> norm schedule ALWAYS >>> Jacobian is never rebuilt >>> Jacobian is built using finite differences with coloring >>> SNESLineSearch Object: 1 MPI process >>> type: basic >>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, >>> lambda=1.000000e-08 >>> maximum iterations=40 >>> KSP Object: 1 MPI process >>> type: gmres >>> restart=30, using Classical (unmodified) Gram-Schmidt >>> Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=20, initial guess is zero >>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI process >>> type: none >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI process >>> type: seqbaij >>> rows=16384, cols=16384, bs=16 >>> total: nonzeros=1277952, allocated nonzeros=1277952 >>> total number of mallocs used during MatSetValues calls=0 >>> block size is 16 >>> 0 SNES Function norm 4.662386014882e+03 >>> 0 KSP Residual norm 4.662386014882e+03 >>> 1 KSP Residual norm 4.408316259864e+03 >>> 2 KSP Residual norm 4.184867769829e+03 >>> 3 KSP Residual norm 4.079091244351e+03 >>> 4 KSP Residual norm 4.009247390166e+03 >>> 5 KSP Residual norm 3.928417371428e+03 >>> 6 KSP Residual norm 3.865152075780e+03 >>> 7 KSP Residual norm 3.795606446033e+03 >>> 8 KSP Residual norm 3.735294554158e+03 >>> 9 KSP Residual norm 3.674393726487e+03 >>> 10 KSP Residual norm 3.617795166786e+03 >>> 11 KSP Residual norm 3.563807982274e+03 >>> 12 KSP Residual norm 3.512269444921e+03 >>> 13 KSP Residual norm 3.455110223236e+03 >>> 14 KSP Residual norm 3.407141247372e+03 >>> 15 KSP Residual norm 3.356562415982e+03 >>> 16 KSP Residual norm 3.312720047685e+03 >>> 17 KSP Residual norm 3.263690150810e+03 >>> 18 KSP Residual norm 3.219359862444e+03 >>> 19 KSP Residual norm 3.173500955995e+03 >>> 20 KSP Residual norm 3.127528790155e+03 >>> Linear solve converged due to CONVERGED_ITS iterations 20 >>> KSP Object: 1 MPI process >>> type: gmres >>> restart=30, using Classical (unmodified) Gram-Schmidt >>> Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=20, initial guess is zero >>> tolerances: relative=0.1, absolute=1e-15, divergence=10. >>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI process >>> type: none >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI process >>> type: seqbaij >>> rows=16384, cols=16384, bs=16 >>> total: nonzeros=1277952, allocated nonzeros=1277952 >>> total number of mallocs used during MatSetValues calls=0 >>> block size is 16 >>> 1 SNES Function norm 3.186752172556e+03 >>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>> SNES Object: 1 MPI process >>> type: newtonls >>> maximum iterations=1, maximum function evaluations=-1 >>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>> total number of linear solver iterations=20 >>> total number of function evaluations=2 >>> norm schedule ALWAYS >>> Jacobian is never rebuilt >>> Jacobian is built using finite differences with coloring >>> SNESLineSearch Object: 1 MPI process >>> type: basic >>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, >>> lambda=1.000000e-08 >>> maximum iterations=40 >>> KSP Object: 1 MPI process >>> type: gmres >>> restart=30, using Classical (unmodified) Gram-Schmidt >>> Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=20, initial guess is zero >>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI process >>> type: none >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI process >>> type: seqbaij >>> rows=16384, cols=16384, bs=16 >>> total: nonzeros=1277952, allocated nonzeros=1277952 >>> total number of mallocs used during MatSetValues calls=0 >>> block size is 16 >>> >>> >>> >>> 0 SNES Function norm 3.424003312857e+04 >>> 0 KSP Residual norm 3.424003312857e+04 >>> 1 KSP Residual norm 2.886124328003e+04 >>> 2 KSP Residual norm 2.504664994221e+04 >>> 3 KSP Residual norm 2.104615835130e+04 >>> 4 KSP Residual norm 1.938102896610e+04 >>> 5 KSP Residual norm 1.793774642406e+04 >>> 6 KSP Residual norm 1.671392566981e+04 >>> 7 KSP Residual norm 1.501504103854e+04 >>> 8 KSP Residual norm 1.366362900726e+04 >>> 9 KSP Residual norm 1.240398500414e+04 >>> 10 KSP Residual norm 1.156293733914e+04 >>> 11 KSP Residual norm 1.066296477972e+04 >>> 12 KSP Residual norm 9.835601967036e+03 >>> 13 KSP Residual norm 9.017480191500e+03 >>> 14 KSP Residual norm 8.415336139732e+03 >>> 15 KSP Residual norm 7.807497808414e+03 >>> 16 KSP Residual norm 7.341703768300e+03 >>> 17 KSP Residual norm 6.979298049244e+03 >>> 18 KSP Residual norm 6.521277772042e+03 >>> 19 KSP Residual norm 6.174842408713e+03 >>> 20 KSP Residual norm 5.889819664983e+03 >>> Linear solve converged due to CONVERGED_ITS iterations 20 >>> KSP Object: 1 MPI process >>> type: gmres >>> restart=30, using Classical (unmodified) Gram-Schmidt >>> Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=20, initial guess is zero >>> tolerances: relative=0.1, absolute=1e-15, divergence=10. >>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI process >>> type: none >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI process >>> type: seqbaij >>> rows=16384, cols=16384, bs=16 >>> total: nonzeros=1277952, allocated nonzeros=1277952 >>> total number of mallocs used during MatSetValues calls=0 >>> block size is 16 >>> 1 SNES Function norm 1.000525348435e+04 >>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>> SNES Object: 1 MPI process >>> type: newtonls >>> maximum iterations=1, maximum function evaluations=-1 >>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>> total number of linear solver iterations=20 >>> total number of function evaluations=2 >>> norm schedule ALWAYS >>> Jacobian is never rebuilt >>> Jacobian is built using finite differences with coloring >>> SNESLineSearch Object: 1 MPI process >>> type: basic >>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, >>> lambda=1.000000e-08 >>> maximum iterations=40 >>> KSP Object: 1 MPI process >>> type: gmres >>> restart=30, using Classical (unmodified) Gram-Schmidt >>> Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=20, initial guess is zero >>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI process >>> type: none >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI process >>> type: seqbaij >>> rows=16384, cols=16384, bs=16 >>> total: nonzeros=1277952, allocated nonzeros=1277952 >>> total number of mallocs used during MatSetValues calls=0 >>> block size is 16 >>> 0 SNES Function norm 1.000525348435e+04 >>> 0 KSP Residual norm 1.000525348435e+04 >>> 1 KSP Residual norm 7.908741565645e+03 >>> 2 KSP Residual norm 6.825263536988e+03 >>> 3 KSP Residual norm 6.224930664967e+03 >>> 4 KSP Residual norm 6.095547180474e+03 >>> 5 KSP Residual norm 5.952968230397e+03 >>> 6 KSP Residual norm 5.861251998127e+03 >>> 7 KSP Residual norm 5.712439327726e+03 >>> 8 KSP Residual norm 5.583056913167e+03 >>> 9 KSP Residual norm 5.461768804526e+03 >>> 10 KSP Residual norm 5.351937611030e+03 >>> 11 KSP Residual norm 5.224288337536e+03 >>> 12 KSP Residual norm 5.129863847028e+03 >>> 13 KSP Residual norm 5.010818237161e+03 >>> 14 KSP Residual norm 4.907162936143e+03 >>> 15 KSP Residual norm 4.789564773923e+03 >>> 16 KSP Residual norm 4.695173370709e+03 >>> 17 KSP Residual norm 4.584070962145e+03 >>> 18 KSP Residual norm 4.483061424714e+03 >>> 19 KSP Residual norm 4.373384070713e+03 >>> 20 KSP Residual norm 4.260704657576e+03 >>> Linear solve converged due to CONVERGED_ITS iterations 20 >>> KSP Object: 1 MPI process >>> type: gmres >>> restart=30, using Classical (unmodified) Gram-Schmidt >>> Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=20, initial guess is zero >>> tolerances: relative=0.1, absolute=1e-15, divergence=10. >>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI process >>> type: none >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI process >>> type: seqbaij >>> rows=16384, cols=16384, bs=16 >>> total: nonzeros=1277952, allocated nonzeros=1277952 >>> total number of mallocs used during MatSetValues calls=0 >>> block size is 16 >>> 1 SNES Function norm 4.662386014874e+03 >>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>> SNES Object: 1 MPI process >>> type: newtonls >>> maximum iterations=1, maximum function evaluations=-1 >>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>> total number of linear solver iterations=20 >>> total number of function evaluations=2 >>> norm schedule ALWAYS >>> Jacobian is never rebuilt >>> Jacobian is built using finite differences with coloring >>> SNESLineSearch Object: 1 MPI process >>> type: basic >>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, >>> lambda=1.000000e-08 >>> maximum iterations=40 >>> KSP Object: 1 MPI process >>> type: gmres >>> restart=30, using Classical (unmodified) Gram-Schmidt >>> Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=20, initial guess is zero >>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI process >>> type: none >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI process >>> type: seqbaij >>> rows=16384, cols=16384, bs=16 >>> total: nonzeros=1277952, allocated nonzeros=1277952 >>> total number of mallocs used during MatSetValues calls=0 >>> block size is 16 >>> 0 SNES Function norm 4.662386014874e+03 >>> 0 KSP Residual norm 4.662386014874e+03 >>> 1 KSP Residual norm 4.408316259834e+03 >>> 2 KSP Residual norm 4.184867769891e+03 >>> 3 KSP Residual norm 4.079091244367e+03 >>> 4 KSP Residual norm 4.009247390184e+03 >>> 5 KSP Residual norm 3.928417371457e+03 >>> 6 KSP Residual norm 3.865152075802e+03 >>> 7 KSP Residual norm 3.795606446041e+03 >>> 8 KSP Residual norm 3.735294554160e+03 >>> 9 KSP Residual norm 3.674393726485e+03 >>> 10 KSP Residual norm 3.617795166775e+03 >>> 11 KSP Residual norm 3.563807982249e+03 >>> 12 KSP Residual norm 3.512269444873e+03 >>> 13 KSP Residual norm 3.455110223193e+03 >>> 14 KSP Residual norm 3.407141247334e+03 >>> 15 KSP Residual norm 3.356562415949e+03 >>> 16 KSP Residual norm 3.312720047652e+03 >>> 17 KSP Residual norm 3.263690150782e+03 >>> 18 KSP Residual norm 3.219359862425e+03 >>> 19 KSP Residual norm 3.173500955997e+03 >>> 20 KSP Residual norm 3.127528790156e+03 >>> Linear solve converged due to CONVERGED_ITS iterations 20 >>> KSP Object: 1 MPI process >>> type: gmres >>> restart=30, using Classical (unmodified) Gram-Schmidt >>> Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=20, initial guess is zero >>> tolerances: relative=0.1, absolute=1e-15, divergence=10. >>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI process >>> type: none >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI process >>> type: seqbaij >>> rows=16384, cols=16384, bs=16 >>> total: nonzeros=1277952, allocated nonzeros=1277952 >>> total number of mallocs used during MatSetValues calls=0 >>> block size is 16 >>> 1 SNES Function norm 3.186752172503e+03 >>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>> SNES Object: 1 MPI process >>> type: newtonls >>> maximum iterations=1, maximum function evaluations=-1 >>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>> total number of linear solver iterations=20 >>> total number of function evaluations=2 >>> norm schedule ALWAYS >>> Jacobian is never rebuilt >>> Jacobian is built using finite differences with coloring >>> SNESLineSearch Object: 1 MPI process >>> type: basic >>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, >>> lambda=1.000000e-08 >>> maximum iterations=40 >>> KSP Object: 1 MPI process >>> type: gmres >>> restart=30, using Classical (unmodified) Gram-Schmidt >>> Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=20, initial guess is zero >>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI process >>> type: none >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI process >>> type: seqbaij >>> rows=16384, cols=16384, bs=16 >>> total: nonzeros=1277952, allocated nonzeros=1277952 >>> total number of mallocs used during MatSetValues calls=0 >>> block size is 16 >>> >>> On Thu, May 4, 2023 at 5:22?PM Matthew Knepley >>> wrote: >>> >>>> On Thu, May 4, 2023 at 5:03?PM Mark Lohry wrote: >>>> >>>>> Do you get different results (in different runs) without >>>>>> -snes_mf_operator? So just using an explicit matrix? >>>>> >>>>> >>>>> Unfortunately I don't have an explicit matrix available for this, >>>>> hence the MFFD/JFNK. >>>>> >>>> >>>> I don't mean the actual matrix, I mean a representative matrix. >>>> >>>> >>>>> >>>>>> (Note: I am not convinced there is even a problem and think it may >>>>>> be simply different order of floating point operations in different runs.) >>>>>> >>>>> >>>>> I'm not convinced either, but running explicit RK for 10,000 >>>>> iterations i get exactly the same results every time so i'm fairly >>>>> confident it's not the residual evaluation. >>>>> How would there be a different order of floating point ops in >>>>> different runs in serial? >>>>> >>>>> No, I mean without -snes_mf_* (as Barry says), so we are just running >>>>>> that solver with a sparse matrix. This would give me confidence >>>>>> that nothing in the solver is variable. >>>>>> >>>>>> I could do the sparse finite difference jacobian once, save it to >>>>> disk, and then use that system each time. >>>>> >>>> >>>> Yes. That would work. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> On Thu, May 4, 2023 at 4:57?PM Matthew Knepley >>>>> wrote: >>>>> >>>>>> On Thu, May 4, 2023 at 4:44?PM Mark Lohry wrote: >>>>>> >>>>>>> Is your code valgrind clean? >>>>>>>> >>>>>>> >>>>>>> Yes, I also initialize all allocations with NaNs to be sure I'm not >>>>>>> using anything uninitialized. >>>>>>> >>>>>>> >>>>>>>> We can try and test this. Replace your MatMFFD with an actual >>>>>>>> matrix and run. Do you see any variability? >>>>>>>> >>>>>>> >>>>>>> I think I did what you're asking. I have -snes_mf_operator set, and >>>>>>> then SNESSetJacobian(snes, diag_ones, diag_ones, NULL, NULL) where >>>>>>> diag_ones is a matrix with ones on the diagonal. Two runs below, still with >>>>>>> differences but sometimes identical. >>>>>>> >>>>>> >>>>>> No, I mean without -snes_mf_* (as Barry says), so we are just running >>>>>> that solver with a sparse matrix. This would give me confidence >>>>>> that nothing in the solver is variable. 
>>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>>> 0 SNES Function norm 3.424003312857e+04 >>>>>>> 0 KSP Residual norm 3.424003312857e+04 >>>>>>> 1 KSP Residual norm 2.871734444536e+04 >>>>>>> 2 KSP Residual norm 2.490276930242e+04 >>>>>>> 3 KSP Residual norm 2.131675872968e+04 >>>>>>> 4 KSP Residual norm 1.973129814235e+04 >>>>>>> 5 KSP Residual norm 1.832377856317e+04 >>>>>>> 6 KSP Residual norm 1.716783617436e+04 >>>>>>> 7 KSP Residual norm 1.583963149542e+04 >>>>>>> 8 KSP Residual norm 1.482272170304e+04 >>>>>>> 9 KSP Residual norm 1.380312106742e+04 >>>>>>> 10 KSP Residual norm 1.297793480658e+04 >>>>>>> 11 KSP Residual norm 1.208599123244e+04 >>>>>>> 12 KSP Residual norm 1.137345655227e+04 >>>>>>> 13 KSP Residual norm 1.059676909366e+04 >>>>>>> 14 KSP Residual norm 1.003823862398e+04 >>>>>>> 15 KSP Residual norm 9.425879221354e+03 >>>>>>> 16 KSP Residual norm 8.954805890038e+03 >>>>>>> 17 KSP Residual norm 8.592372470456e+03 >>>>>>> 18 KSP Residual norm 8.060707175821e+03 >>>>>>> 19 KSP Residual norm 7.782057728723e+03 >>>>>>> 20 KSP Residual norm 7.449686095424e+03 >>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>> KSP Object: 1 MPI process >>>>>>> type: gmres >>>>>>> restart=30, using Classical (unmodified) Gram-Schmidt >>>>>>> Orthogonalization with no iterative refinement >>>>>>> happy breakdown tolerance 1e-30 >>>>>>> maximum iterations=20, initial guess is zero >>>>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. >>>>>>> left preconditioning >>>>>>> using PRECONDITIONED norm type for convergence test >>>>>>> PC Object: 1 MPI process >>>>>>> type: none >>>>>>> linear system matrix followed by preconditioner matrix: >>>>>>> Mat Object: 1 MPI process >>>>>>> type: mffd >>>>>>> rows=16384, cols=16384 >>>>>>> Matrix-free approximation: >>>>>>> err=1.49012e-08 (relative error in function evaluation) >>>>>>> Using wp compute h routine >>>>>>> Does not compute normU >>>>>>> Mat Object: 1 MPI process >>>>>>> type: seqaij >>>>>>> rows=16384, cols=16384 >>>>>>> total: nonzeros=16384, allocated nonzeros=16384 >>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>> not using I-node routines >>>>>>> 1 SNES Function norm 1.085015646971e+04 >>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>>>> SNES Object: 1 MPI process >>>>>>> type: newtonls >>>>>>> maximum iterations=1, maximum function evaluations=-1 >>>>>>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>>>>>> total number of linear solver iterations=20 >>>>>>> total number of function evaluations=23 >>>>>>> norm schedule ALWAYS >>>>>>> Jacobian is never rebuilt >>>>>>> Jacobian is applied matrix-free with differencing >>>>>>> Preconditioning Jacobian is built using finite differences with >>>>>>> coloring >>>>>>> SNESLineSearch Object: 1 MPI process >>>>>>> type: basic >>>>>>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>>>>>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, >>>>>>> lambda=1.000000e-08 >>>>>>> maximum iterations=40 >>>>>>> KSP Object: 1 MPI process >>>>>>> type: gmres >>>>>>> restart=30, using Classical (unmodified) Gram-Schmidt >>>>>>> Orthogonalization with no iterative refinement >>>>>>> happy breakdown tolerance 1e-30 >>>>>>> maximum iterations=20, initial guess is zero >>>>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>>>>>> left preconditioning >>>>>>> using PRECONDITIONED norm type for convergence test >>>>>>> PC Object: 1 MPI process >>>>>>> type: none >>>>>>> linear system matrix followed by preconditioner matrix: >>>>>>> Mat Object: 1 MPI process >>>>>>> type: mffd >>>>>>> rows=16384, cols=16384 >>>>>>> Matrix-free approximation: >>>>>>> err=1.49012e-08 (relative error in function evaluation) >>>>>>> Using wp compute h routine >>>>>>> Does not compute normU >>>>>>> Mat Object: 1 MPI process >>>>>>> type: seqaij >>>>>>> rows=16384, cols=16384 >>>>>>> total: nonzeros=16384, allocated nonzeros=16384 >>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>> not using I-node routines >>>>>>> >>>>>>> 0 SNES Function norm 3.424003312857e+04 >>>>>>> 0 KSP Residual norm 3.424003312857e+04 >>>>>>> 1 KSP Residual norm 2.871734444536e+04 >>>>>>> 2 KSP Residual norm 2.490276931041e+04 >>>>>>> 3 KSP Residual norm 2.131675873776e+04 >>>>>>> 4 KSP Residual norm 1.973129814908e+04 >>>>>>> 5 KSP Residual norm 1.832377852186e+04 >>>>>>> 6 KSP Residual norm 1.716783608174e+04 >>>>>>> 7 KSP Residual norm 1.583963128956e+04 >>>>>>> 8 KSP Residual norm 1.482272160069e+04 >>>>>>> 9 KSP Residual norm 1.380312087005e+04 >>>>>>> 10 KSP Residual norm 1.297793458796e+04 >>>>>>> 11 KSP Residual norm 1.208599115602e+04 >>>>>>> 12 KSP Residual norm 1.137345657533e+04 >>>>>>> 13 KSP Residual norm 1.059676906197e+04 >>>>>>> 14 KSP Residual norm 1.003823857515e+04 >>>>>>> 15 KSP Residual norm 9.425879177747e+03 >>>>>>> 16 KSP Residual norm 8.954805850825e+03 >>>>>>> 17 KSP Residual norm 8.592372413320e+03 >>>>>>> 18 KSP Residual norm 8.060706994110e+03 >>>>>>> 19 KSP Residual norm 7.782057560782e+03 >>>>>>> 20 KSP Residual norm 7.449686034356e+03 >>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>> KSP Object: 1 MPI process >>>>>>> type: gmres >>>>>>> restart=30, using Classical (unmodified) Gram-Schmidt >>>>>>> Orthogonalization with no iterative refinement >>>>>>> happy breakdown tolerance 1e-30 >>>>>>> maximum iterations=20, initial guess is zero >>>>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>>>>>> left preconditioning >>>>>>> using PRECONDITIONED norm type for convergence test >>>>>>> PC Object: 1 MPI process >>>>>>> type: none >>>>>>> linear system matrix followed by preconditioner matrix: >>>>>>> Mat Object: 1 MPI process >>>>>>> type: mffd >>>>>>> rows=16384, cols=16384 >>>>>>> Matrix-free approximation: >>>>>>> err=1.49012e-08 (relative error in function evaluation) >>>>>>> Using wp compute h routine >>>>>>> Does not compute normU >>>>>>> Mat Object: 1 MPI process >>>>>>> type: seqaij >>>>>>> rows=16384, cols=16384 >>>>>>> total: nonzeros=16384, allocated nonzeros=16384 >>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>> not using I-node routines >>>>>>> 1 SNES Function norm 1.085015821006e+04 >>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>>>> SNES Object: 1 MPI process >>>>>>> type: newtonls >>>>>>> maximum iterations=1, maximum function evaluations=-1 >>>>>>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>>>>>> total number of linear solver iterations=20 >>>>>>> total number of function evaluations=23 >>>>>>> norm schedule ALWAYS >>>>>>> Jacobian is never rebuilt >>>>>>> Jacobian is applied matrix-free with differencing >>>>>>> Preconditioning Jacobian is built using finite differences with >>>>>>> coloring >>>>>>> SNESLineSearch Object: 1 MPI process >>>>>>> type: basic >>>>>>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>>>>>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, >>>>>>> lambda=1.000000e-08 >>>>>>> maximum iterations=40 >>>>>>> KSP Object: 1 MPI process >>>>>>> type: gmres >>>>>>> restart=30, using Classical (unmodified) Gram-Schmidt >>>>>>> Orthogonalization with no iterative refinement >>>>>>> happy breakdown tolerance 1e-30 >>>>>>> maximum iterations=20, initial guess is zero >>>>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. >>>>>>> left preconditioning >>>>>>> using PRECONDITIONED norm type for convergence test >>>>>>> PC Object: 1 MPI process >>>>>>> type: none >>>>>>> linear system matrix followed by preconditioner matrix: >>>>>>> Mat Object: 1 MPI process >>>>>>> type: mffd >>>>>>> rows=16384, cols=16384 >>>>>>> Matrix-free approximation: >>>>>>> err=1.49012e-08 (relative error in function evaluation) >>>>>>> Using wp compute h routine >>>>>>> Does not compute normU >>>>>>> Mat Object: 1 MPI process >>>>>>> type: seqaij >>>>>>> rows=16384, cols=16384 >>>>>>> total: nonzeros=16384, allocated nonzeros=16384 >>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>> not using I-node routines >>>>>>> >>>>>>> On Thu, May 4, 2023 at 10:10?AM Matthew Knepley >>>>>>> wrote: >>>>>>> >>>>>>>> On Thu, May 4, 2023 at 8:54?AM Mark Lohry wrote: >>>>>>>> >>>>>>>>> Try -pc_type none. >>>>>>>>>> >>>>>>>>> >>>>>>>>> With -pc_type none the 0 KSP residual looks identical. But >>>>>>>>> *sometimes* it's producing exactly the same history and others it's >>>>>>>>> gradually changing. I'm reasonably confident my residual evaluation has no >>>>>>>>> randomness, see info after the petsc output. >>>>>>>>> >>>>>>>> >>>>>>>> We can try and test this. Replace your MatMFFD with an actual >>>>>>>> matrix and run. Do you see any variability? >>>>>>>> >>>>>>>> If not, then it could be your routine, or it could be MatMFFD. So >>>>>>>> run a few with -snes_view, and we can see if the >>>>>>>> "w" parameter changes. 
>>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Matt >>>>>>>> >>>>>>>> >>>>>>>>> solve history 1: >>>>>>>>> >>>>>>>>> 0 SNES Function norm 3.424003312857e+04 >>>>>>>>> 0 KSP Residual norm 3.424003312857e+04 >>>>>>>>> 1 KSP Residual norm 2.871734444536e+04 >>>>>>>>> 2 KSP Residual norm 2.490276931041e+04 >>>>>>>>> ... >>>>>>>>> 20 KSP Residual norm 7.449686034356e+03 >>>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>>>> 1 SNES Function norm 1.085015821006e+04 >>>>>>>>> >>>>>>>>> solve history 2, identical to 1: >>>>>>>>> >>>>>>>>> 0 SNES Function norm 3.424003312857e+04 >>>>>>>>> 0 KSP Residual norm 3.424003312857e+04 >>>>>>>>> 1 KSP Residual norm 2.871734444536e+04 >>>>>>>>> 2 KSP Residual norm 2.490276931041e+04 >>>>>>>>> ... >>>>>>>>> 20 KSP Residual norm 7.449686034356e+03 >>>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>>>> 1 SNES Function norm 1.085015821006e+04 >>>>>>>>> >>>>>>>>> solve history 3, identical KSP at 0 and 1, slight change at 2, >>>>>>>>> growing difference to the end: >>>>>>>>> 0 SNES Function norm 3.424003312857e+04 >>>>>>>>> 0 KSP Residual norm 3.424003312857e+04 >>>>>>>>> 1 KSP Residual norm 2.871734444536e+04 >>>>>>>>> 2 KSP Residual norm 2.490276930242e+04 >>>>>>>>> ... >>>>>>>>> 20 KSP Residual norm 7.449686095424e+03 >>>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>>>> 1 SNES Function norm 1.085015646971e+04 >>>>>>>>> >>>>>>>>> >>>>>>>>> Ths is using a standard explicit 3-stage Runge-Kutta smoother for >>>>>>>>> 10 iterations, so 30 calls of the same residual evaluation, identical >>>>>>>>> residuals every time >>>>>>>>> >>>>>>>>> run 1: >>>>>>>>> >>>>>>>>> # iteration rho rhou >>>>>>>>> rhov rhoE abs_res rel_res >>>>>>>>> umin vmax vmin >>>>>>>>> elapsed_time >>>>>>>>> # >>>>>>>>> >>>>>>>>> >>>>>>>>> 1.00000e+00 1.086860616292e+00 2.782316758416e+02 >>>>>>>>> 4.482867643761e+00 2.993435920340e+02 2.04353e+02 >>>>>>>>> 1.00000e+00 -8.23945e-15 -6.15326e-15 -1.35563e-14 >>>>>>>>> 6.34834e-01 >>>>>>>>> 2.00000e+00 2.310547487017e+00 1.079059352425e+02 >>>>>>>>> 3.958323921837e+00 5.058927165686e+02 2.58647e+02 >>>>>>>>> 1.26568e+00 -1.02539e-14 -9.35368e-15 -1.69925e-14 >>>>>>>>> 6.40063e-01 >>>>>>>>> 3.00000e+00 2.361005867444e+00 5.706213331683e+01 >>>>>>>>> 6.130016323357e+00 4.688968362579e+02 2.36201e+02 >>>>>>>>> 1.15585e+00 -1.19370e-14 -1.15216e-14 -1.59733e-14 >>>>>>>>> 6.45166e-01 >>>>>>>>> 4.00000e+00 2.167518999963e+00 3.757541401594e+01 >>>>>>>>> 6.313917437428e+00 4.054310291628e+02 2.03612e+02 >>>>>>>>> 9.96372e-01 -1.81831e-14 -1.28312e-14 -1.46238e-14 >>>>>>>>> 6.50494e-01 >>>>>>>>> 5.00000e+00 1.941443738676e+00 2.884190334049e+01 >>>>>>>>> 6.237106158479e+00 3.539201037156e+02 1.77577e+02 >>>>>>>>> 8.68970e-01 3.56633e-14 -8.74089e-15 -1.06666e-14 >>>>>>>>> 6.55656e-01 >>>>>>>>> 6.00000e+00 1.736947124693e+00 2.429485695670e+01 >>>>>>>>> 5.996962200407e+00 3.148280178142e+02 1.57913e+02 >>>>>>>>> 7.72745e-01 -8.98634e-14 -2.41152e-14 -1.39713e-14 >>>>>>>>> 6.60872e-01 >>>>>>>>> 7.00000e+00 1.564153212635e+00 2.149609219810e+01 >>>>>>>>> 5.786910705204e+00 2.848717011033e+02 1.42872e+02 >>>>>>>>> 6.99144e-01 -2.95352e-13 -2.48158e-14 -2.39351e-14 >>>>>>>>> 6.66041e-01 >>>>>>>>> 8.00000e+00 1.419280815384e+00 1.950619804089e+01 >>>>>>>>> 5.627281158306e+00 2.606623371229e+02 1.30728e+02 >>>>>>>>> 6.39715e-01 8.98941e-13 1.09674e-13 3.78905e-14 >>>>>>>>> 6.71316e-01 >>>>>>>>> 9.00000e+00 1.296115915975e+00 1.794843530745e+01 >>>>>>>>> 5.514933264437e+00 2.401524522393e+02 
1.20444e+02 >>>>>>>>> 5.89394e-01 1.70717e-12 1.38762e-14 1.09825e-13 >>>>>>>>> 6.76447e-01 >>>>>>>>> 1.00000e+01 1.189639693918e+00 1.665381754953e+01 >>>>>>>>> 5.433183087037e+00 2.222572900473e+02 1.11475e+02 >>>>>>>>> 5.45501e-01 -4.22462e-12 -7.15206e-13 -2.28736e-13 >>>>>>>>> 6.81716e-01 >>>>>>>>> >>>>>>>>> run N: >>>>>>>>> >>>>>>>>> >>>>>>>>> # >>>>>>>>> >>>>>>>>> >>>>>>>>> # iteration rho rhou >>>>>>>>> rhov rhoE abs_res rel_res >>>>>>>>> umin vmax vmin >>>>>>>>> elapsed_time >>>>>>>>> # >>>>>>>>> >>>>>>>>> >>>>>>>>> 1.00000e+00 1.086860616292e+00 2.782316758416e+02 >>>>>>>>> 4.482867643761e+00 2.993435920340e+02 2.04353e+02 >>>>>>>>> 1.00000e+00 -8.23945e-15 -6.15326e-15 -1.35563e-14 >>>>>>>>> 6.23316e-01 >>>>>>>>> 2.00000e+00 2.310547487017e+00 1.079059352425e+02 >>>>>>>>> 3.958323921837e+00 5.058927165686e+02 2.58647e+02 >>>>>>>>> 1.26568e+00 -1.02539e-14 -9.35368e-15 -1.69925e-14 >>>>>>>>> 6.28510e-01 >>>>>>>>> 3.00000e+00 2.361005867444e+00 5.706213331683e+01 >>>>>>>>> 6.130016323357e+00 4.688968362579e+02 2.36201e+02 >>>>>>>>> 1.15585e+00 -1.19370e-14 -1.15216e-14 -1.59733e-14 >>>>>>>>> 6.33558e-01 >>>>>>>>> 4.00000e+00 2.167518999963e+00 3.757541401594e+01 >>>>>>>>> 6.313917437428e+00 4.054310291628e+02 2.03612e+02 >>>>>>>>> 9.96372e-01 -1.81831e-14 -1.28312e-14 -1.46238e-14 >>>>>>>>> 6.38773e-01 >>>>>>>>> 5.00000e+00 1.941443738676e+00 2.884190334049e+01 >>>>>>>>> 6.237106158479e+00 3.539201037156e+02 1.77577e+02 >>>>>>>>> 8.68970e-01 3.56633e-14 -8.74089e-15 -1.06666e-14 >>>>>>>>> 6.43887e-01 >>>>>>>>> 6.00000e+00 1.736947124693e+00 2.429485695670e+01 >>>>>>>>> 5.996962200407e+00 3.148280178142e+02 1.57913e+02 >>>>>>>>> 7.72745e-01 -8.98634e-14 -2.41152e-14 -1.39713e-14 >>>>>>>>> 6.49073e-01 >>>>>>>>> 7.00000e+00 1.564153212635e+00 2.149609219810e+01 >>>>>>>>> 5.786910705204e+00 2.848717011033e+02 1.42872e+02 >>>>>>>>> 6.99144e-01 -2.95352e-13 -2.48158e-14 -2.39351e-14 >>>>>>>>> 6.54167e-01 >>>>>>>>> 8.00000e+00 1.419280815384e+00 1.950619804089e+01 >>>>>>>>> 5.627281158306e+00 2.606623371229e+02 1.30728e+02 >>>>>>>>> 6.39715e-01 8.98941e-13 1.09674e-13 3.78905e-14 >>>>>>>>> 6.59394e-01 >>>>>>>>> 9.00000e+00 1.296115915975e+00 1.794843530745e+01 >>>>>>>>> 5.514933264437e+00 2.401524522393e+02 1.20444e+02 >>>>>>>>> 5.89394e-01 1.70717e-12 1.38762e-14 1.09825e-13 >>>>>>>>> 6.64516e-01 >>>>>>>>> 1.00000e+01 1.189639693918e+00 1.665381754953e+01 >>>>>>>>> 5.433183087037e+00 2.222572900473e+02 1.11475e+02 >>>>>>>>> 5.45501e-01 -4.22462e-12 -7.15206e-13 -2.28736e-13 >>>>>>>>> 6.69677e-01 >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Thu, May 4, 2023 at 8:41?AM Mark Adams wrote: >>>>>>>>> >>>>>>>>>> ASM is just the sub PC with one proc but gets weaker with more >>>>>>>>>> procs unless you use jacobi. (maybe I am missing something). >>>>>>>>>> >>>>>>>>>> On Thu, May 4, 2023 at 8:31?AM Mark Lohry >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> Please send the output of -snes_view. >>>>>>>>>>>> >>>>>>>>>>> pasted below. anything stand out? 
>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> SNES Object: 1 MPI process >>>>>>>>>>> type: newtonls >>>>>>>>>>> maximum iterations=1, maximum function evaluations=-1 >>>>>>>>>>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>>>>>>>>>> total number of linear solver iterations=20 >>>>>>>>>>> total number of function evaluations=22 >>>>>>>>>>> norm schedule ALWAYS >>>>>>>>>>> Jacobian is never rebuilt >>>>>>>>>>> Jacobian is applied matrix-free with differencing >>>>>>>>>>> Preconditioning Jacobian is built using finite differences >>>>>>>>>>> with coloring >>>>>>>>>>> SNESLineSearch Object: 1 MPI process >>>>>>>>>>> type: basic >>>>>>>>>>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>>>>>>>>>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, >>>>>>>>>>> lambda=1.000000e-08 >>>>>>>>>>> maximum iterations=40 >>>>>>>>>>> KSP Object: 1 MPI process >>>>>>>>>>> type: gmres >>>>>>>>>>> restart=30, using Classical (unmodified) Gram-Schmidt >>>>>>>>>>> Orthogonalization with no iterative refinement >>>>>>>>>>> happy breakdown tolerance 1e-30 >>>>>>>>>>> maximum iterations=20, initial guess is zero >>>>>>>>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. >>>>>>>>>>> left preconditioning >>>>>>>>>>> using PRECONDITIONED norm type for convergence test >>>>>>>>>>> PC Object: 1 MPI process >>>>>>>>>>> type: asm >>>>>>>>>>> total subdomain blocks = 1, amount of overlap = 0 >>>>>>>>>>> restriction/interpolation type - RESTRICT >>>>>>>>>>> Local solver information for first block is in the >>>>>>>>>>> following KSP and PC objects on rank 0: >>>>>>>>>>> Use -ksp_view ::ascii_info_detail to display information >>>>>>>>>>> for all blocks >>>>>>>>>>> KSP Object: (sub_) 1 MPI process >>>>>>>>>>> type: preonly >>>>>>>>>>> maximum iterations=10000, initial guess is zero >>>>>>>>>>> tolerances: relative=1e-05, absolute=1e-50, >>>>>>>>>>> divergence=10000. >>>>>>>>>>> left preconditioning >>>>>>>>>>> using NONE norm type for convergence test >>>>>>>>>>> PC Object: (sub_) 1 MPI process >>>>>>>>>>> type: ilu >>>>>>>>>>> out-of-place factorization >>>>>>>>>>> 0 levels of fill >>>>>>>>>>> tolerance for zero pivot 2.22045e-14 >>>>>>>>>>> matrix ordering: natural >>>>>>>>>>> factor fill ratio given 1., needed 1. 
>>>>>>>>>>> Factored matrix follows: >>>>>>>>>>> Mat Object: (sub_) 1 MPI process >>>>>>>>>>> type: seqbaij >>>>>>>>>>> rows=16384, cols=16384, bs=16 >>>>>>>>>>> package used to perform factorization: petsc >>>>>>>>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>>>>>>>> block size is 16 >>>>>>>>>>> linear system matrix = precond matrix: >>>>>>>>>>> Mat Object: (sub_) 1 MPI process >>>>>>>>>>> type: seqbaij >>>>>>>>>>> rows=16384, cols=16384, bs=16 >>>>>>>>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>>>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>>>>>> block size is 16 >>>>>>>>>>> linear system matrix followed by preconditioner matrix: >>>>>>>>>>> Mat Object: 1 MPI process >>>>>>>>>>> type: mffd >>>>>>>>>>> rows=16384, cols=16384 >>>>>>>>>>> Matrix-free approximation: >>>>>>>>>>> err=1.49012e-08 (relative error in function evaluation) >>>>>>>>>>> Using wp compute h routine >>>>>>>>>>> Does not compute normU >>>>>>>>>>> Mat Object: 1 MPI process >>>>>>>>>>> type: seqbaij >>>>>>>>>>> rows=16384, cols=16384, bs=16 >>>>>>>>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>>>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>>>>>> block size is 16 >>>>>>>>>>> >>>>>>>>>>> On Thu, May 4, 2023 at 8:30?AM Mark Adams >>>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>>> If you are using MG what is the coarse grid solver? >>>>>>>>>>>> -snes_view might give you that. >>>>>>>>>>>> >>>>>>>>>>>> On Thu, May 4, 2023 at 8:25?AM Matthew Knepley < >>>>>>>>>>>> knepley at gmail.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> On Thu, May 4, 2023 at 8:21?AM Mark Lohry >>>>>>>>>>>>> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Do they start very similarly and then slowly drift further >>>>>>>>>>>>>>> apart? >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Yes, this. I take it this sounds familiar? >>>>>>>>>>>>>> >>>>>>>>>>>>>> See these two examples with 20 fixed iterations pasted at the >>>>>>>>>>>>>> end. The difference for one solve is slight (final SNES norm is identical >>>>>>>>>>>>>> to 5 digits), but in the context I'm using it in (repeated applications to >>>>>>>>>>>>>> solve a steady state multigrid problem, though here just one level) the >>>>>>>>>>>>>> differences add up such that I might reach global convergence in 35 >>>>>>>>>>>>>> iterations or 38. It's not the end of the world, but I was expecting that >>>>>>>>>>>>>> with -np 1 these would be identical and I'm not sure where the root cause >>>>>>>>>>>>>> would be. >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> The initial KSP residual is different, so its the PC. >>>>>>>>>>>>> Please send the output of -snes_view. If your ASM is using direct >>>>>>>>>>>>> factorization, then it >>>>>>>>>>>>> could be randomness in whatever LU you are using. >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> >>>>>>>>>>>>> Matt >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> 0 SNES Function norm 2.801842107848e+04 >>>>>>>>>>>>>> 0 KSP Residual norm 4.045639499595e+01 >>>>>>>>>>>>>> 1 KSP Residual norm 1.917999809040e+01 >>>>>>>>>>>>>> 2 KSP Residual norm 1.616048521958e+01 >>>>>>>>>>>>>> [...] 
>>>>>>>>>>>>>> 19 KSP Residual norm 8.788043518111e-01 >>>>>>>>>>>>>> 20 KSP Residual norm 6.570851270214e-01 >>>>>>>>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>>>>>>>>> 1 SNES Function norm 1.801309983345e+03 >>>>>>>>>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Same system, identical initial 0 SNES norm, 0 KSP is slightly >>>>>>>>>>>>>> different >>>>>>>>>>>>>> >>>>>>>>>>>>>> 0 SNES Function norm 2.801842107848e+04 >>>>>>>>>>>>>> 0 KSP Residual norm 4.045639473002e+01 >>>>>>>>>>>>>> 1 KSP Residual norm 1.917999883034e+01 >>>>>>>>>>>>>> 2 KSP Residual norm 1.616048572016e+01 >>>>>>>>>>>>>> [...] >>>>>>>>>>>>>> 19 KSP Residual norm 8.788046348957e-01 >>>>>>>>>>>>>> 20 KSP Residual norm 6.570859588610e-01 >>>>>>>>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>>>>>>>>> 1 SNES Function norm 1.801311320322e+03 >>>>>>>>>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Wed, May 3, 2023 at 11:05?PM Barry Smith >>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Do they start very similarly and then slowly drift further >>>>>>>>>>>>>>> apart? That is the first couple of KSP iterations they are almost identical >>>>>>>>>>>>>>> but then for each iteration get a bit further. Similar for the SNES >>>>>>>>>>>>>>> iterations, starting close and then for more iterations and more solves >>>>>>>>>>>>>>> they start moving apart. Or do they suddenly jump to be very different? You >>>>>>>>>>>>>>> can run with -snes_monitor -ksp_monitor >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On May 3, 2023, at 9:07 PM, Mark Lohry >>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> This is on a single MPI rank. I haven't checked the >>>>>>>>>>>>>>> coloring, was just guessing there. But the solutions/residuals are slightly >>>>>>>>>>>>>>> different from run to run. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Fair to say that for serial JFNK/asm ilu0/gmres we should >>>>>>>>>>>>>>> expect bitwise identical results? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Wed, May 3, 2023, 8:50 PM Barry Smith >>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> No, the coloring should be identical every time. Do you >>>>>>>>>>>>>>>> see differences with 1 MPI rank? (Or much smaller ones?). >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> > On May 3, 2023, at 8:42 PM, Mark Lohry >>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > I'm running multiple iterations of newtonls with an >>>>>>>>>>>>>>>> MFFD/JFNK nonlinear solver where I give it the sparsity. PC asm, KSP gmres, >>>>>>>>>>>>>>>> with SNESSetLagJacobian -2 (compute once and then frozen jacobian). >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > I'm seeing slight (<1%) but nonzero differences in >>>>>>>>>>>>>>>> residuals from run to run. I'm wondering where randomness might enter here >>>>>>>>>>>>>>>> -- does the jacobian coloring use a random seed? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> -- >>>>>>>>>>>>> What most experimenters take for granted before they begin >>>>>>>>>>>>> their experiments is infinitely more interesting than any results to which >>>>>>>>>>>>> their experiments lead. 
>>>>>>>>>>>>> -- Norbert Wiener >>>>>>>>>>>>> >>>>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> What most experimenters take for granted before they begin their >>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>> experiments lead. >>>>>>>> -- Norbert Wiener >>>>>>>> >>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>> >>>>>> >>>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> >>> >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gaochenyi14 at 163.com Sat May 6 02:16:23 2023 From: gaochenyi14 at 163.com (gaochenyi14) Date: Sat, 6 May 2023 15:16:23 +0800 (CST) Subject: [petsc-users] Is it necessary to call MatAssembly*() with MatSetValue() Message-ID: <5b4f75c7.5c48.187efeb3981.Coremail.gaochenyi14@163.com> Hi, By `find` and `grep`, I find that in many PETSc examples `MatAssembly*()` are called after `MatSetValue()`. But in the C/Fortran API manual, the man page of `MatAssembly*()` does not say it is necessary for `MatSetValue()`. And the man page of `MatSetValue()` does not say it is a must to call `MatAssembly*()` afterwards. Is the man page of `MatSetValue()` up-to-date, or it lacks the reminder for the necessity of calling `MatAssembly*()`? Best regards, C.-Y. GAO -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Sat May 6 04:35:50 2023 From: jroman at dsic.upv.es (Jose E. Roman) Date: Sat, 6 May 2023 11:35:50 +0200 Subject: [petsc-users] Is it necessary to call MatAssembly*() with MatSetValue() In-Reply-To: <5b4f75c7.5c48.187efeb3981.Coremail.gaochenyi14@163.com> References: <5b4f75c7.5c48.187efeb3981.Coremail.gaochenyi14@163.com> Message-ID: Fixed in https://gitlab.com/petsc/petsc/-/merge_requests/6423 Jose > El 6 may 2023, a las 9:16, gaochenyi14 escribi?: > > Hi, > > By `find` and `grep`, I find that in many PETSc examples `MatAssembly*()` are called after `MatSetValue()`. But in the C/Fortran API manual, the man page of `MatAssembly*()` does not say it is necessary for `MatSetValue()`. And the man page of `MatSetValue()` does not say it is a must to call `MatAssembly*()` afterwards. > > Is the man page of `MatSetValue()` up-to-date, or it lacks the reminder for the necessity of calling `MatAssembly*()`? > > Best regards, > C.-Y. GAO From huidong.yang at ricam.oeaw.ac.at Sat May 6 10:47:21 2023 From: huidong.yang at ricam.oeaw.ac.at (Huidong Yang) Date: Sat, 06 May 2023 17:47:21 +0200 Subject: [petsc-users] question about leap-frog for wave equation in petsc Message-ID: <6cb7-64567680-3-1806c220@37997608> Hi Petsc developer. may I ask if there is any available implementations in petsc using leap-frog scheme? Thanks. 
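For the MatSetValue()/MatAssembly*() question above, a minimal sketch of the usual pattern is given below; the matrix size and entries are made up purely for illustration. The key point is that MatSetValue()/MatSetValues() only stash values, and the matrix must be closed with MatAssemblyBegin()/MatAssemblyEnd() before it is used in MatMult(), MatView(), KSPSetOperators(), and so on.

#include <petscmat.h>

int main(int argc, char **argv)
{
  Mat      A;
  PetscInt i, n = 8; /* illustrative size only */

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  PetscCall(MatCreate(PETSC_COMM_WORLD, &A));
  PetscCall(MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n));
  PetscCall(MatSetFromOptions(A));
  PetscCall(MatSetUp(A));

  /* MatSetValue() only caches the entries; the matrix is not yet usable */
  for (i = 0; i < n; i++) PetscCall(MatSetValue(A, i, i, 2.0, INSERT_VALUES));

  /* Assembly is required before the matrix can be used */
  PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
  PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));

  PetscCall(MatView(A, PETSC_VIEWER_STDOUT_WORLD));
  PetscCall(MatDestroy(&A));
  PetscCall(PetscFinalize());
  return 0;
}

An intermediate MatAssemblyBegin()/MatAssemblyEnd() with MAT_FLUSH_ASSEMBLY is only needed when switching between INSERT_VALUES and ADD_VALUES before the final assembly.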
From knepley at gmail.com Sat May 6 11:26:48 2023 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 6 May 2023 12:26:48 -0400 Subject: [petsc-users] question about leap-frog for wave equation in petsc In-Reply-To: <6cb7-64567680-3-1806c220@37997608> References: <6cb7-64567680-3-1806c220@37997608> Message-ID: On Sat, May 6, 2023 at 11:47?AM Huidong Yang wrote: > Hi Petsc developer. > > may I ask if there is any available implementations in petsc > using leap-frog scheme? > I don't think we have leapfrog, but we do have Stormer-Verlet, which is also a 2nd order symplectic method. Thanks, Matt > Thanks. > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From zjorti at lanl.gov Sat May 6 18:24:59 2023 From: zjorti at lanl.gov (Jorti, Zakariae) Date: Sat, 6 May 2023 23:24:59 +0000 Subject: [petsc-users] Step size setting in TS Message-ID: <1c78d9b4bba544d5bf26683dc039c5bb@lanl.gov> Hello, I have a time-dependent model that I solve using TSSolve. And I am trying to adaptively change the step size (dt). I found that there are some TSAdapt schemes already available. I have tried TSADAPTBASIC and TSADAPTCFL. The former runs without any problems, whereas the latter yields the following error: " [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: No support for this operation for this object type [0]PETSC ERROR: Step rejection not implemented. The CFL implementation is incomplete/unusable " How/where the step rejection should be implemented? Are there any examples available? Besides, I was also attempting to change the step size through the monitor: " SNES snes; TSGetSNES(ts, & snes); SNESConvergedReason reason = SNES_CONVERGED_ITERATING; PetscCall(SNESGetConvergedReason(snes, &reason)); if(reason < 0){ TSSetTimeStep(ts,28618.7); } else{ TSSetTimeStep(ts,57237.4); } " But when I try this solution, the TSStep seems to diverge even though the SNES solver converges (see log file below). Am I doing something wrong here by changing the value of the step size inside the monitor? Thank you. 
Best, Zakariae ----------------------------------------------------------------- Timestep 0: step size = 57237.4, time = 0., 0 SNES Function norm 2.854104379157e-02 0 KSP Residual norm 2.854104379157e-02 1 KSP Residual norm 8.393431982129e-04 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.985106864162e-01 1 KSP Residual norm 2.509248490095e-04 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 1.985106864162e-01 0 KSP Residual norm 1.985106864162e-01 1 KSP Residual norm 2.739697200542e-04 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.885623182909e-01 1 KSP Residual norm 8.784869181255e-04 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.664114929279e-01 1 KSP Residual norm 2.277701872167e-04 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 1.664114929279e-01 0 KSP Residual norm 1.664114929279e-01 1 KSP Residual norm 2.062369048201e-04 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 6.155996487462e-02 1 KSP Residual norm 8.830688649637e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 3 SNES Function norm 6.155996487462e-02 0 KSP Residual norm 6.155996487462e-02 1 KSP Residual norm 6.231091742747e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.828806540636e-02 1 KSP Residual norm 3.623270770725e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 4 SNES Function norm 1.828806540636e-02 0 KSP Residual norm 1.828806540636e-02 1 KSP Residual norm 7.390325185746e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.659571588896e-02 1 KSP Residual norm 5.280284947190e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 5 SNES Function norm 1.659571588896e-02 0 KSP Residual norm 1.659571588896e-02 1 KSP Residual norm 8.311611006756e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.475636107890e-02 1 KSP Residual norm 1.890387034462e-04 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.635129206894e-02 1 KSP Residual norm 8.192897103059e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 6 SNES Function norm 1.635129206894e-02 0 KSP Residual norm 1.635129206894e-02 1 KSP Residual norm 9.049749505225e-04 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.634946381608e-02 1 KSP Residual norm 9.038879304284e-04 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.635085731721e-02 1 KSP Residual norm 9.050535714933e-04 Linear solve converged due to CONVERGED_RTOL iterations 1 7 SNES Function norm 1.635085731721e-02 0 KSP Residual norm 1.635085731721e-02 1 KSP Residual norm 1.248783858778e-03 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.635003937816e-02 1 KSP Residual norm 1.248746884526e-03 Linear solve converged due to CONVERGED_RTOL iterations 1 8 SNES Function norm 1.635003937816e-02 0 KSP Residual norm 1.635003937816e-02 1 KSP Residual norm 1.631242803481e-02 2 KSP Residual norm 1.551854905139e-02 3 KSP Residual norm 7.586491369138e-03 4 KSP Residual norm 2.304721992110e-04 Linear solve converged due to CONVERGED_RTOL iterations 4 0 KSP Residual norm 1.635003936717e-02 1 KSP Residual norm 1.631242801091e-02 2 KSP Residual norm 1.551784322549e-02 3 KSP Residual norm 7.706018759197e-03 4 KSP Residual norm 2.652435205967e-04 Linear solve converged due to 
CONVERGED_RTOL iterations 4 9 SNES Function norm 1.282692400720e+06 Nonlinear solve did not converge due to DIVERGED_DTOL iterations 9 0 SNES Function norm 2.854104379157e-02 0 KSP Residual norm 2.854104379157e-02 1 KSP Residual norm 5.698410682663e-04 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 8.618385788411e-02 1 KSP Residual norm 9.179214337527e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 8.618385788411e-02 0 KSP Residual norm 8.618385788411e-02 1 KSP Residual norm 9.659611612697e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 2.763975792565e-02 1 KSP Residual norm 2.674728432525e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 2.763975792565e-02 0 KSP Residual norm 2.763975792565e-02 1 KSP Residual norm 1.836844517206e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 2.773540750989e-03 1 KSP Residual norm 1.340353004310e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 3 SNES Function norm 2.773540750989e-03 0 KSP Residual norm 2.773540750989e-03 1 KSP Residual norm 5.382019472069e-06 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 3.441017533038e-04 1 KSP Residual norm 5.218147408275e-07 Linear solve converged due to CONVERGED_RTOL iterations 1 4 SNES Function norm 3.441017533038e-04 0 KSP Residual norm 3.441017533038e-04 1 KSP Residual norm 4.811955757758e-07 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 3.991606932979e-06 1 KSP Residual norm 8.670005394864e-09 Linear solve converged due to CONVERGED_RTOL iterations 1 5 SNES Function norm 3.991606932979e-06 0 KSP Residual norm 3.991606932979e-06 1 KSP Residual norm 8.920137762569e-09 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 7.982782045945e-10 1 KSP Residual norm 3.185856555648e-12 Linear solve converged due to CONVERGED_RTOL iterations 1 6 SNES Function norm 7.982782045945e-10 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 6 0 SNES Function norm 3.475518085003e-04 0 KSP Residual norm 3.475518085003e-04 1 KSP Residual norm 2.328087365783e-04 2 KSP Residual norm 5.062641920970e-09 Linear solve converged due to CONVERGED_RTOL iterations 2 0 KSP Residual norm 3.256441726122e-02 1 KSP Residual norm 8.957860841275e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 3.256441726122e-02 0 KSP Residual norm 3.256441726122e-02 1 KSP Residual norm 5.886144296665e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 4.876417342600e-03 1 KSP Residual norm 5.064113356579e-06 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 4.876417342600e-03 0 KSP Residual norm 4.876417342600e-03 1 KSP Residual norm 7.010281488349e-06 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 6.144574635801e-04 1 KSP Residual norm 9.768499579865e-07 Linear solve converged due to CONVERGED_RTOL iterations 1 3 SNES Function norm 6.144574635801e-04 0 KSP Residual norm 6.144574635801e-04 1 KSP Residual norm 8.324370478600e-07 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 8.721913471286e-06 1 KSP Residual norm 1.370545428753e-08 Linear solve converged due to CONVERGED_RTOL iterations 1 4 SNES Function norm 8.721913471286e-06 0 KSP Residual norm 8.721913471286e-06 1 KSP Residual norm 1.353680905072e-08 Linear solve converged due to 
CONVERGED_RTOL iterations 1 0 KSP Residual norm 3.399977638261e-09 1 KSP Residual norm 5.989016835446e-12 Linear solve converged due to CONVERGED_RTOL iterations 1 5 SNES Function norm 3.399977638261e-09 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 5 0 SNES Function norm 2.854104379157e-02 0 KSP Residual norm 2.854104379157e-02 1 KSP Residual norm 1.733028652796e-04 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 7.490800616133e-03 1 KSP Residual norm 8.231034393757e-06 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 7.490800616133e-03 0 KSP Residual norm 7.490800616133e-03 1 KSP Residual norm 8.349300836114e-06 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.394207859364e-04 1 KSP Residual norm 5.991798196295e-08 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 1.394207859364e-04 0 KSP Residual norm 1.394207859364e-04 1 KSP Residual norm 5.941984242417e-08 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 2.985895331367e-08 1 KSP Residual norm 1.522240176483e-11 Linear solve converged due to CONVERGED_RTOL iterations 1 3 SNES Function norm 2.985895331367e-08 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 3 0 SNES Function norm 1.026110861069e-03 0 KSP Residual norm 1.026110861069e-03 1 KSP Residual norm 1.768297003974e-04 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 7.742520550463e-03 1 KSP Residual norm 4.379214132544e-06 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 7.742520550463e-03 0 KSP Residual norm 7.742520550463e-03 1 KSP Residual norm 4.526371657285e-06 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 2.574467856482e-05 1 KSP Residual norm 2.285626422848e-08 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 2.574467856482e-05 0 KSP Residual norm 2.574467856482e-05 1 KSP Residual norm 2.274220735999e-08 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 4.797870687759e-09 1 KSP Residual norm 2.841037171618e-12 Linear solve converged due to CONVERGED_RTOL iterations 1 3 SNES Function norm 4.797870687759e-09 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 3 0 SNES Function norm 2.854104379157e-02 0 KSP Residual norm 2.854104379157e-02 1 KSP Residual norm 5.979305049211e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 8.195748548281e-04 1 KSP Residual norm 1.017772307359e-06 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 8.195748548281e-04 0 KSP Residual norm 8.195748548281e-04 1 KSP Residual norm 1.026961867317e-06 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.717269750886e-06 1 KSP Residual norm 4.675717893044e-10 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 1.717269750886e-06 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 0 SNES Function norm 1.325295461903e-03 0 KSP Residual norm 1.325295461903e-03 1 KSP Residual norm 7.421950279125e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.330639207844e-03 1 KSP Residual norm 3.764830059967e-07 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 1.330639207844e-03 0 KSP Residual norm 1.330639207844e-03 1 KSP Residual norm 3.764587772248e-07 Linear solve 
converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 2.146996236879e-07 1 KSP Residual norm 1.459926588396e-10 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 2.146996236879e-07 0 KSP Residual norm 2.146996236879e-07 1 KSP Residual norm 1.470407475325e-10 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 2.171052440924e-11 1 KSP Residual norm 1.397958787200e-15 Linear solve converged due to CONVERGED_RTOL iterations 1 3 SNES Function norm 2.171052440924e-11 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 3 0 SNES Function norm 2.854104379157e-02 0 KSP Residual norm 2.854104379157e-02 1 KSP Residual norm 3.067250018894e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.608120509564e-04 1 KSP Residual norm 2.171575389779e-07 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 1.608120509564e-04 0 KSP Residual norm 1.608120509564e-04 1 KSP Residual norm 2.240227170220e-07 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 9.106906778906e-08 1 KSP Residual norm 1.799636201836e-11 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 9.106906778906e-08 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 0 SNES Function norm 1.418151440967e-03 0 KSP Residual norm 1.418151440967e-03 1 KSP Residual norm 3.515151918636e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 2.938727941997e-04 1 KSP Residual norm 6.040565904176e-08 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 2.938727941997e-04 0 KSP Residual norm 2.938727941997e-04 1 KSP Residual norm 6.089297895562e-08 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.093551284505e-08 1 KSP Residual norm 2.891462873664e-12 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 1.093551284505e-08 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 0 SNES Function norm 2.854104379157e-02 0 KSP Residual norm 2.854104379157e-02 1 KSP Residual norm 2.050867218313e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 3.054570961709e-05 1 KSP Residual norm 4.596885059357e-08 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 3.054570961709e-05 0 KSP Residual norm 3.054570961709e-05 1 KSP Residual norm 4.828909801995e-08 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.248104282690e-08 1 KSP Residual norm 8.831951540120e-13 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 1.248104282690e-08 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 0 SNES Function norm 1.460190420374e-03 0 KSP Residual norm 1.460190420374e-03 1 KSP Residual norm 1.580519450452e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 5.884986293527e-05 1 KSP Residual norm 9.020937669734e-09 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 5.884986293527e-05 0 KSP Residual norm 5.884986293527e-05 1 KSP Residual norm 9.054367061995e-09 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.698647259662e-09 1 KSP Residual norm 1.767366286599e-13 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 1.698647259662e-09 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 0 SNES 
Function norm 2.854104379157e-02 0 KSP Residual norm 2.854104379157e-02 1 KSP Residual norm 1.834492501580e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 5.663374849972e-06 1 KSP Residual norm 9.752865227095e-09 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 5.663374849972e-06 0 KSP Residual norm 5.663374849972e-06 1 KSP Residual norm 1.014567990183e-08 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 2.600155584863e-09 1 KSP Residual norm 8.580477366667e-14 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 2.600155584863e-09 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 0 SNES Function norm 1.479145244919e-03 0 KSP Residual norm 1.479145244919e-03 1 KSP Residual norm 6.804553002336e-06 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.118291636442e-05 1 KSP Residual norm 1.397149309582e-09 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 1.118291636442e-05 0 KSP Residual norm 1.118291636442e-05 1 KSP Residual norm 1.386385106529e-09 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 2.800461695540e-10 1 KSP Residual norm 3.217319587127e-14 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 2.800461695540e-10 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 0 SNES Function norm 2.854104379157e-02 0 KSP Residual norm 2.854104379157e-02 1 KSP Residual norm 1.776151497065e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.076105708636e-06 1 KSP Residual norm 2.155085223926e-09 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 1.076105708636e-06 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 0 SNES Function norm 1.489507915803e-03 0 KSP Residual norm 1.489507915803e-03 1 KSP Residual norm 2.921538148329e-06 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 2.172405782412e-06 1 KSP Residual norm 2.719699969205e-10 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 2.172405782412e-06 0 KSP Residual norm 2.172405782412e-06 1 KSP Residual norm 2.731813669182e-10 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 5.323734019212e-11 1 KSP Residual norm 6.095298326433e-15 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 5.323734019212e-11 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 0 SNES Function norm 2.854104379157e-02 0 KSP Residual norm 2.854104379157e-02 1 KSP Residual norm 1.746846293829e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 2.376406135686e-07 1 KSP Residual norm 5.310287317591e-10 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 2.376406135686e-07 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 0 SNES Function norm 1.497239199277e-03 0 KSP Residual norm 1.497239199277e-03 1 KSP Residual norm 1.251830706962e-06 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 5.023668872625e-07 1 KSP Residual norm 7.656373075632e-11 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 5.023668872625e-07 0 KSP Residual norm 5.023668872625e-07 1 KSP Residual norm 7.473091471413e-11 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 
2.003331377435e-11 1 KSP Residual norm 1.347408932811e-15 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 2.003331377435e-11 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 0 SNES Function norm 2.854104379157e-02 0 KSP Residual norm 2.854104379157e-02 1 KSP Residual norm 1.757419385096e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 7.310279661922e-08 1 KSP Residual norm 1.417918576189e-10 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 7.310279661922e-08 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 0 SNES Function norm 1.503392083665e-03 0 KSP Residual norm 1.503392083665e-03 1 KSP Residual norm 5.327548050375e-07 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.611770751552e-07 1 KSP Residual norm 2.611036263545e-11 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 1.611770751552e-07 0 KSP Residual norm 1.611770751552e-07 1 KSP Residual norm 2.607692240234e-11 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.627978707015e-11 1 KSP Residual norm 4.254078683385e-16 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 1.627978707015e-11 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 0 SNES Function norm 2.854104379157e-02 0 KSP Residual norm 2.854104379157e-02 1 KSP Residual norm 1.747957348554e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 2.999731614900e-08 1 KSP Residual norm 3.643836796648e-11 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 2.999731614900e-08 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 0 SNES Function norm 1.507740473908e-03 0 KSP Residual norm 1.507740473908e-03 1 KSP Residual norm 2.263876724496e-07 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 6.390049329095e-08 1 KSP Residual norm 9.745723464126e-12 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 6.390049329095e-08 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: TSStep has failed due to DIVERGED_STEP_REJECTED -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sat May 6 18:34:37 2023 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 6 May 2023 19:34:37 -0400 Subject: [petsc-users] Step size setting in TS In-Reply-To: <1c78d9b4bba544d5bf26683dc039c5bb@lanl.gov> References: <1c78d9b4bba544d5bf26683dc039c5bb@lanl.gov> Message-ID: On Sat, May 6, 2023 at 7:25?PM Jorti, Zakariae via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hello, > > > I have a time-dependent model that I solve using TSSolve. > > And I am trying to adaptively change the step size (dt). > > I found that there are some TSAdapt schemes already available. > > I have tried TSADAPTBASIC and TSADAPTCFL. > > The former runs without any problems, whereas the latter yields the > following error: > " > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: No support for this operation for this object type > [0]PETSC ERROR: Step rejection not implemented. 
The CFL implementation is > incomplete/unusable > " > > > How/where the step rejection should be implemented? Are there any examples > available? > I would not recommend CFL, which is partially implemented at best. The DSP adaptivity is great. There is a recent paper by David Ketcheson, Lisandro Dalcin, Matteo Parsani, and collaborators which shows that you can set this up to completely handle complicated flows, and it is much better than a simple CFL condition. Thanks, Matt > Besides, I was also attempting to change the step size through the > monitor: > > > " > SNES snes; > TSGetSNES(ts, & snes); > SNESConvergedReason reason = SNES_CONVERGED_ITERATING; > PetscCall(SNESGetConvergedReason(snes, &reason)); > > if(reason < 0){ > TSSetTimeStep(ts,28618.7); > } > else{ > TSSetTimeStep(ts,57237.4); > } > " > > But when I try this solution, the TSStep seems to diverge even though the > SNES solver converges (see log file below). > Am I doing something wrong here by changing the value of the step size > inside the monitor? > > > Thank you. > > Best, > > > Zakariae > > > > > > > > ----------------------------------------------------------------- > > Timestep 0: step size = 57237.4, time = 0., > 0 SNES Function norm 2.854104379157e-02 > 0 KSP Residual norm 2.854104379157e-02 > 1 KSP Residual norm 8.393431982129e-04 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 1.985106864162e-01 > 1 KSP Residual norm 2.509248490095e-04 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 1 SNES Function norm 1.985106864162e-01 > 0 KSP Residual norm 1.985106864162e-01 > 1 KSP Residual norm 2.739697200542e-04 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 1.885623182909e-01 > 1 KSP Residual norm 8.784869181255e-04 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 1.664114929279e-01 > 1 KSP Residual norm 2.277701872167e-04 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 2 SNES Function norm 1.664114929279e-01 > 0 KSP Residual norm 1.664114929279e-01 > 1 KSP Residual norm 2.062369048201e-04 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 6.155996487462e-02 > 1 KSP Residual norm 8.830688649637e-05 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 3 SNES Function norm 6.155996487462e-02 > 0 KSP Residual norm 6.155996487462e-02 > 1 KSP Residual norm 6.231091742747e-05 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 1.828806540636e-02 > 1 KSP Residual norm 3.623270770725e-05 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 4 SNES Function norm 1.828806540636e-02 > 0 KSP Residual norm 1.828806540636e-02 > 1 KSP Residual norm 7.390325185746e-05 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 1.659571588896e-02 > 1 KSP Residual norm 5.280284947190e-05 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 5 SNES Function norm 1.659571588896e-02 > 0 KSP Residual norm 1.659571588896e-02 > 1 KSP Residual norm 8.311611006756e-05 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 1.475636107890e-02 > 1 KSP Residual norm 1.890387034462e-04 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 1.635129206894e-02 > 1 KSP Residual norm 8.192897103059e-05 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 6 SNES Function norm 1.635129206894e-02 > 0 KSP Residual norm 1.635129206894e-02 > 1 KSP Residual norm 
9.049749505225e-04 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 1.634946381608e-02 > 1 KSP Residual norm 9.038879304284e-04 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 1.635085731721e-02 > 1 KSP Residual norm 9.050535714933e-04 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 7 SNES Function norm 1.635085731721e-02 > 0 KSP Residual norm 1.635085731721e-02 > 1 KSP Residual norm 1.248783858778e-03 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 1.635003937816e-02 > 1 KSP Residual norm 1.248746884526e-03 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 8 SNES Function norm 1.635003937816e-02 > 0 KSP Residual norm 1.635003937816e-02 > 1 KSP Residual norm 1.631242803481e-02 > 2 KSP Residual norm 1.551854905139e-02 > 3 KSP Residual norm 7.586491369138e-03 > 4 KSP Residual norm 2.304721992110e-04 > Linear solve converged due to CONVERGED_RTOL iterations 4 > 0 KSP Residual norm 1.635003936717e-02 > 1 KSP Residual norm 1.631242801091e-02 > 2 KSP Residual norm 1.551784322549e-02 > 3 KSP Residual norm 7.706018759197e-03 > 4 KSP Residual norm 2.652435205967e-04 > Linear solve converged due to CONVERGED_RTOL iterations 4 > 9 SNES Function norm 1.282692400720e+06 > Nonlinear solve did not converge due to DIVERGED_DTOL iterations 9 > 0 SNES Function norm 2.854104379157e-02 > 0 KSP Residual norm 2.854104379157e-02 > 1 KSP Residual norm 5.698410682663e-04 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 8.618385788411e-02 > 1 KSP Residual norm 9.179214337527e-05 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 1 SNES Function norm 8.618385788411e-02 > 0 KSP Residual norm 8.618385788411e-02 > 1 KSP Residual norm 9.659611612697e-05 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 2.763975792565e-02 > 1 KSP Residual norm 2.674728432525e-05 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 2 SNES Function norm 2.763975792565e-02 > 0 KSP Residual norm 2.763975792565e-02 > 1 KSP Residual norm 1.836844517206e-05 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 2.773540750989e-03 > 1 KSP Residual norm 1.340353004310e-05 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 3 SNES Function norm 2.773540750989e-03 > 0 KSP Residual norm 2.773540750989e-03 > 1 KSP Residual norm 5.382019472069e-06 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 3.441017533038e-04 > 1 KSP Residual norm 5.218147408275e-07 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 4 SNES Function norm 3.441017533038e-04 > 0 KSP Residual norm 3.441017533038e-04 > 1 KSP Residual norm 4.811955757758e-07 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 3.991606932979e-06 > 1 KSP Residual norm 8.670005394864e-09 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 5 SNES Function norm 3.991606932979e-06 > 0 KSP Residual norm 3.991606932979e-06 > 1 KSP Residual norm 8.920137762569e-09 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 7.982782045945e-10 > 1 KSP Residual norm 3.185856555648e-12 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 6 SNES Function norm 7.982782045945e-10 > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 6 > 0 SNES Function norm 3.475518085003e-04 > 0 KSP Residual norm 3.475518085003e-04 > 1 KSP Residual norm 
2.328087365783e-04 > 2 KSP Residual norm 5.062641920970e-09 > Linear solve converged due to CONVERGED_RTOL iterations 2 > 0 KSP Residual norm 3.256441726122e-02 > 1 KSP Residual norm 8.957860841275e-05 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 1 SNES Function norm 3.256441726122e-02 > 0 KSP Residual norm 3.256441726122e-02 > 1 KSP Residual norm 5.886144296665e-05 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 4.876417342600e-03 > 1 KSP Residual norm 5.064113356579e-06 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 2 SNES Function norm 4.876417342600e-03 > 0 KSP Residual norm 4.876417342600e-03 > 1 KSP Residual norm 7.010281488349e-06 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 6.144574635801e-04 > 1 KSP Residual norm 9.768499579865e-07 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 3 SNES Function norm 6.144574635801e-04 > 0 KSP Residual norm 6.144574635801e-04 > 1 KSP Residual norm 8.324370478600e-07 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 8.721913471286e-06 > 1 KSP Residual norm 1.370545428753e-08 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 4 SNES Function norm 8.721913471286e-06 > 0 KSP Residual norm 8.721913471286e-06 > 1 KSP Residual norm 1.353680905072e-08 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 3.399977638261e-09 > 1 KSP Residual norm 5.989016835446e-12 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 5 SNES Function norm 3.399977638261e-09 > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 5 > 0 SNES Function norm 2.854104379157e-02 > 0 KSP Residual norm 2.854104379157e-02 > 1 KSP Residual norm 1.733028652796e-04 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 7.490800616133e-03 > 1 KSP Residual norm 8.231034393757e-06 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 1 SNES Function norm 7.490800616133e-03 > 0 KSP Residual norm 7.490800616133e-03 > 1 KSP Residual norm 8.349300836114e-06 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 1.394207859364e-04 > 1 KSP Residual norm 5.991798196295e-08 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 2 SNES Function norm 1.394207859364e-04 > 0 KSP Residual norm 1.394207859364e-04 > 1 KSP Residual norm 5.941984242417e-08 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 2.985895331367e-08 > 1 KSP Residual norm 1.522240176483e-11 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 3 SNES Function norm 2.985895331367e-08 > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 3 > 0 SNES Function norm 1.026110861069e-03 > 0 KSP Residual norm 1.026110861069e-03 > 1 KSP Residual norm 1.768297003974e-04 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 7.742520550463e-03 > 1 KSP Residual norm 4.379214132544e-06 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 1 SNES Function norm 7.742520550463e-03 > 0 KSP Residual norm 7.742520550463e-03 > 1 KSP Residual norm 4.526371657285e-06 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 2.574467856482e-05 > 1 KSP Residual norm 2.285626422848e-08 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 2 SNES Function norm 2.574467856482e-05 > 0 KSP Residual norm 2.574467856482e-05 > 1 KSP Residual norm 2.274220735999e-08 > 
Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 4.797870687759e-09 > 1 KSP Residual norm 2.841037171618e-12 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 3 SNES Function norm 4.797870687759e-09 > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 3 > 0 SNES Function norm 2.854104379157e-02 > 0 KSP Residual norm 2.854104379157e-02 > 1 KSP Residual norm 5.979305049211e-05 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 8.195748548281e-04 > 1 KSP Residual norm 1.017772307359e-06 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 1 SNES Function norm 8.195748548281e-04 > 0 KSP Residual norm 8.195748548281e-04 > 1 KSP Residual norm 1.026961867317e-06 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 1.717269750886e-06 > 1 KSP Residual norm 4.675717893044e-10 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 2 SNES Function norm 1.717269750886e-06 > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 > 0 SNES Function norm 1.325295461903e-03 > 0 KSP Residual norm 1.325295461903e-03 > 1 KSP Residual norm 7.421950279125e-05 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 1.330639207844e-03 > 1 KSP Residual norm 3.764830059967e-07 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 1 SNES Function norm 1.330639207844e-03 > 0 KSP Residual norm 1.330639207844e-03 > 1 KSP Residual norm 3.764587772248e-07 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 2.146996236879e-07 > 1 KSP Residual norm 1.459926588396e-10 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 2 SNES Function norm 2.146996236879e-07 > 0 KSP Residual norm 2.146996236879e-07 > 1 KSP Residual norm 1.470407475325e-10 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 2.171052440924e-11 > 1 KSP Residual norm 1.397958787200e-15 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 3 SNES Function norm 2.171052440924e-11 > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 3 > 0 SNES Function norm 2.854104379157e-02 > 0 KSP Residual norm 2.854104379157e-02 > 1 KSP Residual norm 3.067250018894e-05 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 1.608120509564e-04 > 1 KSP Residual norm 2.171575389779e-07 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 1 SNES Function norm 1.608120509564e-04 > 0 KSP Residual norm 1.608120509564e-04 > 1 KSP Residual norm 2.240227170220e-07 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 9.106906778906e-08 > 1 KSP Residual norm 1.799636201836e-11 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 2 SNES Function norm 9.106906778906e-08 > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 > 0 SNES Function norm 1.418151440967e-03 > 0 KSP Residual norm 1.418151440967e-03 > 1 KSP Residual norm 3.515151918636e-05 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 2.938727941997e-04 > 1 KSP Residual norm 6.040565904176e-08 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 1 SNES Function norm 2.938727941997e-04 > 0 KSP Residual norm 2.938727941997e-04 > 1 KSP Residual norm 6.089297895562e-08 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 1.093551284505e-08 > 1 KSP Residual norm 2.891462873664e-12 > Linear solve 
converged due to CONVERGED_RTOL iterations 1 > 2 SNES Function norm 1.093551284505e-08 > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 > 0 SNES Function norm 2.854104379157e-02 > 0 KSP Residual norm 2.854104379157e-02 > 1 KSP Residual norm 2.050867218313e-05 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 3.054570961709e-05 > 1 KSP Residual norm 4.596885059357e-08 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 1 SNES Function norm 3.054570961709e-05 > 0 KSP Residual norm 3.054570961709e-05 > 1 KSP Residual norm 4.828909801995e-08 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 1.248104282690e-08 > 1 KSP Residual norm 8.831951540120e-13 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 2 SNES Function norm 1.248104282690e-08 > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 > 0 SNES Function norm 1.460190420374e-03 > 0 KSP Residual norm 1.460190420374e-03 > 1 KSP Residual norm 1.580519450452e-05 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 5.884986293527e-05 > 1 KSP Residual norm 9.020937669734e-09 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 1 SNES Function norm 5.884986293527e-05 > 0 KSP Residual norm 5.884986293527e-05 > 1 KSP Residual norm 9.054367061995e-09 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 1.698647259662e-09 > 1 KSP Residual norm 1.767366286599e-13 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 2 SNES Function norm 1.698647259662e-09 > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 > 0 SNES Function norm 2.854104379157e-02 > 0 KSP Residual norm 2.854104379157e-02 > 1 KSP Residual norm 1.834492501580e-05 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 5.663374849972e-06 > 1 KSP Residual norm 9.752865227095e-09 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 1 SNES Function norm 5.663374849972e-06 > 0 KSP Residual norm 5.663374849972e-06 > 1 KSP Residual norm 1.014567990183e-08 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 2.600155584863e-09 > 1 KSP Residual norm 8.580477366667e-14 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 2 SNES Function norm 2.600155584863e-09 > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 > 0 SNES Function norm 1.479145244919e-03 > 0 KSP Residual norm 1.479145244919e-03 > 1 KSP Residual norm 6.804553002336e-06 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 1.118291636442e-05 > 1 KSP Residual norm 1.397149309582e-09 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 1 SNES Function norm 1.118291636442e-05 > 0 KSP Residual norm 1.118291636442e-05 > 1 KSP Residual norm 1.386385106529e-09 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 2.800461695540e-10 > 1 KSP Residual norm 3.217319587127e-14 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 2 SNES Function norm 2.800461695540e-10 > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 > 0 SNES Function norm 2.854104379157e-02 > 0 KSP Residual norm 2.854104379157e-02 > 1 KSP Residual norm 1.776151497065e-05 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 1.076105708636e-06 > 1 KSP Residual norm 2.155085223926e-09 > Linear solve converged due to CONVERGED_RTOL iterations 1 
> 1 SNES Function norm 1.076105708636e-06 > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 > 0 SNES Function norm 1.489507915803e-03 > 0 KSP Residual norm 1.489507915803e-03 > 1 KSP Residual norm 2.921538148329e-06 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 2.172405782412e-06 > 1 KSP Residual norm 2.719699969205e-10 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 1 SNES Function norm 2.172405782412e-06 > 0 KSP Residual norm 2.172405782412e-06 > 1 KSP Residual norm 2.731813669182e-10 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 5.323734019212e-11 > 1 KSP Residual norm 6.095298326433e-15 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 2 SNES Function norm 5.323734019212e-11 > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 > 0 SNES Function norm 2.854104379157e-02 > 0 KSP Residual norm 2.854104379157e-02 > 1 KSP Residual norm 1.746846293829e-05 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 2.376406135686e-07 > 1 KSP Residual norm 5.310287317591e-10 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 1 SNES Function norm 2.376406135686e-07 > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 > 0 SNES Function norm 1.497239199277e-03 > 0 KSP Residual norm 1.497239199277e-03 > 1 KSP Residual norm 1.251830706962e-06 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 5.023668872625e-07 > 1 KSP Residual norm 7.656373075632e-11 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 1 SNES Function norm 5.023668872625e-07 > 0 KSP Residual norm 5.023668872625e-07 > 1 KSP Residual norm 7.473091471413e-11 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 2.003331377435e-11 > 1 KSP Residual norm 1.347408932811e-15 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 2 SNES Function norm 2.003331377435e-11 > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 > 0 SNES Function norm 2.854104379157e-02 > 0 KSP Residual norm 2.854104379157e-02 > 1 KSP Residual norm 1.757419385096e-05 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 7.310279661922e-08 > 1 KSP Residual norm 1.417918576189e-10 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 1 SNES Function norm 7.310279661922e-08 > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 > 0 SNES Function norm 1.503392083665e-03 > 0 KSP Residual norm 1.503392083665e-03 > 1 KSP Residual norm 5.327548050375e-07 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 1.611770751552e-07 > 1 KSP Residual norm 2.611036263545e-11 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 1 SNES Function norm 1.611770751552e-07 > 0 KSP Residual norm 1.611770751552e-07 > 1 KSP Residual norm 2.607692240234e-11 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 1.627978707015e-11 > 1 KSP Residual norm 4.254078683385e-16 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 2 SNES Function norm 1.627978707015e-11 > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 > 0 SNES Function norm 2.854104379157e-02 > 0 KSP Residual norm 2.854104379157e-02 > 1 KSP Residual norm 1.747957348554e-05 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 2.999731614900e-08 > 1 KSP Residual norm 
3.643836796648e-11 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 1 SNES Function norm 2.999731614900e-08 > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 > 0 SNES Function norm 1.507740473908e-03 > 0 KSP Residual norm 1.507740473908e-03 > 1 KSP Residual norm 2.263876724496e-07 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 6.390049329095e-08 > 1 KSP Residual norm 9.745723464126e-12 > Linear solve converged due to CONVERGED_RTOL iterations 1 > 1 SNES Function norm 6.390049329095e-08 > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: TSStep has failed due to DIVERGED_STEP_REJECTED > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From danyang.su at gmail.com Sat May 6 19:12:27 2023 From: danyang.su at gmail.com (Danyang Su) Date: Sat, 06 May 2023 17:12:27 -0700 Subject: [petsc-users] Fortran preprocessor not work in pets-dev Message-ID: <4FD9CD30-8E71-4A71-85FA-0A14C3BBA46D@gmail.com> Hi All, My code has some FPP. It works fine in PETSc 3.18 and earlier version, but stops working in the latest PETSc-Dev. For example the following FPP STANDARD_FORTRAN is not recognized. #ifdef STANDARD_FORTRAN ??? 1 format(15x,1000a15) ??? 2 format(1pe15.6e3,1000(1pe15.6e3)) #else ??? 1 format(15x,a15)??? ????2 format(1pe15.6e3,(1pe15.6e3)) #endif In the makefile, I define the preprocessor as PPFLAGS. PPFLAGS := -DLINUX -DRELEASE -DRELEASE_X64 -DSTANDARD_FORTRAN ? exe: $(OBJS) chkopts ??????????????? -${FLINKER} $(FFLAGS) $(FPPFLAGS) $(CPPFLAGS) -o $(EXENAME) $(OBJS) ${PETSC_LIB} ${LIS_LIB} ${DLIB} ${SLIB} Any idea on this problem? All the best, -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure-dev.log Type: application/octet-stream Size: 7341624 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure-3.18.log Type: application/octet-stream Size: 11768439 bytes Desc: not available URL: From balay at mcs.anl.gov Sun May 7 00:22:34 2023 From: balay at mcs.anl.gov (Satish Balay) Date: Sun, 7 May 2023 00:22:34 -0500 (CDT) Subject: [petsc-users] Fortran preprocessor not work in pets-dev In-Reply-To: <4FD9CD30-8E71-4A71-85FA-0A14C3BBA46D@gmail.com> References: <4FD9CD30-8E71-4A71-85FA-0A14C3BBA46D@gmail.com> Message-ID: <8983bfe9-f87e-a3ec-278f-2bfd4ba56838@mcs.anl.gov> On Sat, 6 May 2023, Danyang Su wrote: > Hi All, > > > > My code has some FPP. It works fine in PETSc 3.18 and earlier version, but stops working in the latest PETSc-Dev. For example the following FPP STANDARD_FORTRAN is not recognized. > > > > #ifdef STANDARD_FORTRAN > > ??? 1 format(15x,1000a15) > > ??? 2 format(1pe15.6e3,1000(1pe15.6e3)) > > #else > > ??? 1 format(15x,a15)??? > > ????2 format(1pe15.6e3,(1pe15.6e3)) > > #endif > > > > In the makefile, I define the preprocessor as PPFLAGS. > > > > PPFLAGS := -DLINUX -DRELEASE -DRELEASE_X64 -DSTANDARD_FORTRAN Shouldn't this be FPPFLAGS? 
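For illustration, a minimal sketch of the arrangement Satish is pointing at: the -D macros go in FPPFLAGS, the variable PETSc's makefile rules hand to the Fortran preprocessor when compiling the .F90 sources (assuming the default PETSc compile rules are used for them). The macro values are the ones quoted above.

FPPFLAGS = -DLINUX -DRELEASE -DRELEASE_X64 -DSTANDARD_FORTRAN

include ${PETSC_DIR}/lib/petsc/conf/variables
include ${PETSC_DIR}/lib/petsc/conf/rules

(The placement of the FPPFLAGS line relative to the two include lines mattered for early petsc-3.19; see Satish's follow-up further down the thread.)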
Can you send us a simple test case [with the makefile] that we can try to demonstrate this problem? Satish > > ? > > exe: $(OBJS) chkopts > > ??????????????? -${FLINKER} $(FFLAGS) $(FPPFLAGS) $(CPPFLAGS) -o $(EXENAME) $(OBJS) ${PETSC_LIB} ${LIS_LIB} ${DLIB} ${SLIB} > > > > Any idea on this problem? > > > > All the best, > > > > > > From danyang.su at gmail.com Sun May 7 02:06:17 2023 From: danyang.su at gmail.com (Danyang Su) Date: Sun, 07 May 2023 00:06:17 -0700 Subject: [petsc-users] Fortran preprocessor not work in pets-dev In-Reply-To: <8983bfe9-f87e-a3ec-278f-2bfd4ba56838@mcs.anl.gov> References: <4FD9CD30-8E71-4A71-85FA-0A14C3BBA46D@gmail.com> <8983bfe9-f87e-a3ec-278f-2bfd4ba56838@mcs.anl.gov> Message-ID: Hi Satish, Sorry, this is a typo when copy to the email. I use FPPFLAGS in the makefile. Not sure why this occurs. Actually not only the preprocessor fails, the petsc initialize does not work either. Attached is a very simple fortran code and below is the test results. Looks like the petsc is not properly installed. I am working on macOS Monterey version 12.5 (Intel Xeon W processor). Compiled using petsc-3.18 (base) ? petsc-dev-fppflags mpiexec -n 4 ./petsc_fppflags compiled by STANDARD_FORTRAN compiler called by rank 0 called by rank 1 called by rank 2 called by rank 3 compiled using petsc-dev (base) ? petsc-dev-fppflags mpiexec -n 4 ./petsc_fppflags called by rank 2 called by rank 2 called by rank 2 called by rank 2 Thanks, Danyang ?On 2023-05-06, 10:22 PM, "Satish Balay" > wrote: On Sat, 6 May 2023, Danyang Su wrote: > Hi All, > > > > My code has some FPP. It works fine in PETSc 3.18 and earlier version, but stops working in the latest PETSc-Dev. For example the following FPP STANDARD_FORTRAN is not recognized. > > > > #ifdef STANDARD_FORTRAN > > 1 format(15x,1000a15) > > 2 format(1pe15.6e3,1000(1pe15.6e3)) > > #else > > 1 format(15x,a15) > > 2 format(1pe15.6e3,(1pe15.6e3)) > > #endif > > > > In the makefile, I define the preprocessor as PPFLAGS. > > > > PPFLAGS := -DLINUX -DRELEASE -DRELEASE_X64 -DSTANDARD_FORTRAN Shouldn't this be FPPFLAGS? Can you send us a simple test case [with the makefile] that we can try to demonstrate this problem? Satish > > ? > > exe: $(OBJS) chkopts > > -${FLINKER} $(FFLAGS) $(FPPFLAGS) $(CPPFLAGS) -o $(EXENAME) $(OBJS) ${PETSC_LIB} ${LIS_LIB} ${DLIB} ${SLIB} > > > > Any idea on this problem? > > > > All the best, > > > > > > -------------- next part -------------- A non-text attachment was scrubbed... Name: driver_pc.F90 Type: application/octet-stream Size: 1167 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: makefile Type: application/octet-stream Size: 5344 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: petsc_mpi_common.F90 Type: application/octet-stream Size: 2386 bytes Desc: not available URL: From edoardo.alinovi at gmail.com Sun May 7 08:21:35 2023 From: edoardo.alinovi at gmail.com (Edoardo alinovi) Date: Sun, 7 May 2023 15:21:35 +0200 Subject: [petsc-users] Help with KSPSetConvergenceTest Message-ID: Hello guys, Today I am about to write a custom convergence test for KSP doing the following job: - if the number of ksp iterations is less than a given threshold, iterate until that threshold is met - if the number of ksp iterations is bigger than the threshold, use the standard convergence checks. 
As far as I understood from the examples, this is the function I need to use: KSPSetConvergenceTest (ksp,MyKSPConverged,0,PETSC_NULL_FUNCTION,ierr) With MyKSPConverge a subroutine (I am in fortran) where my custom checks are implemented. Here is how I have coded MyKSPConverged: * subroutine MyKSPConverged(ksp,n,rnorm,flag,dummy,ierr) KSP ksp PetscErrorCode ierr PetscInt n,dummy KSPConvergedReason flag PetscReal rnorm if (n>1) then call KSPConvergedDefault(ksp, n, rnorm, flag, ierr) else flag = 0 endif ierr = 0 end subroutine MyKSPConverged* However I get this error: [0]PETSC ERROR: KSPSolve has not converged, reason DIVERGED_DTOL which I believe to be triggered by *KSPConvergedDefault. *This is odd, as this simulation converge perfectly if I comment ou my dodgy convergence test. *Do you have any advice for me?* *thank you! * -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sun May 7 08:32:18 2023 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 7 May 2023 09:32:18 -0400 Subject: [petsc-users] Help with KSPSetConvergenceTest In-Reply-To: References: Message-ID: On Sun, May 7, 2023 at 9:21?AM Edoardo alinovi wrote: > Hello guys, > > Today I am about to write a custom convergence test for KSP doing the > following job: > > - if the number of ksp iterations is less than a given threshold, iterate > until that threshold is met > - if the number of ksp iterations is bigger than the threshold, use the > standard convergence checks. > > As far as I understood from the examples, this is the function I need to > use: > > KSPSetConvergenceTest (ksp,MyKSPConverged,0,PETSC_NULL_FUNCTION,ierr) > > With MyKSPConverge a subroutine (I am in fortran) where my custom checks are implemented. > > Here is how I have coded MyKSPConverged: > > > > > > > > > > > > * subroutine MyKSPConverged(ksp,n,rnorm,flag,dummy,ierr) KSP ksp PetscErrorCode ierr PetscInt n,dummy KSPConvergedReason flag PetscReal rnorm if (n>1) then call KSPConvergedDefault(ksp, n, rnorm, flag, ierr)* > > Isn't this call missing an argument? THanks, Matt > > > > > > * else flag = 0 endif ierr = 0 end subroutine MyKSPConverged* > > However I get this error: > [0]PETSC ERROR: KSPSolve has not converged, reason DIVERGED_DTOL > which I believe to be triggered by *KSPConvergedDefault. *This is odd, as this simulation converge perfectly if I comment ou my dodgy convergence test. > > *Do you have any advice for me?* > > *thank you! * > > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From edoardo.alinovi at gmail.com Sun May 7 08:42:03 2023 From: edoardo.alinovi at gmail.com (Edoardo alinovi) Date: Sun, 7 May 2023 15:42:03 +0200 Subject: [petsc-users] Help with KSPSetConvergenceTest In-Reply-To: References: Message-ID: Hi Matt, mmmmh, what if I do: KSPConvergedDefault(ksp, n, rnorm, flag, PETSC_NULL_FUNCTION, ierr) That looks to behave OK, but I am not sure about what I am doing -.- Cheers -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Sun May 7 08:51:05 2023 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 7 May 2023 09:51:05 -0400 Subject: [petsc-users] Help with KSPSetConvergenceTest In-Reply-To: References: Message-ID: On Sun, May 7, 2023 at 9:42?AM Edoardo alinovi wrote: > Hi Matt, > > mmmmh, what if I do: > > KSPConvergedDefault(ksp, n, rnorm, flag, PETSC_NULL_FUNCTION, ierr) > > That looks to behave OK, but I am not sure about what I am doing -.- > You are saying that no convergence context was passed in THanks Matt > Cheers > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From edoardo.alinovi at gmail.com Sun May 7 09:02:09 2023 From: edoardo.alinovi at gmail.com (Edoardo alinovi) Date: Sun, 7 May 2023 16:02:09 +0200 Subject: [petsc-users] Help with KSPSetConvergenceTest In-Reply-To: References: Message-ID: Thanks, Is this a reasonable thing to do if I want to replicate what KSP is doing by default? -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sun May 7 09:09:36 2023 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 7 May 2023 10:09:36 -0400 Subject: [petsc-users] Help with KSPSetConvergenceTest In-Reply-To: References: Message-ID: On Sun, May 7, 2023 at 10:02?AM Edoardo alinovi wrote: > Thanks, > > Is this a reasonable thing to do if I want to replicate what KSP is doing > by default? > Yes. The other option is to pass along 'dummy' Thanks, Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Sun May 7 09:14:42 2023 From: balay at mcs.anl.gov (Satish Balay) Date: Sun, 7 May 2023 09:14:42 -0500 (CDT) Subject: [petsc-users] Fortran preprocessor not work in pets-dev In-Reply-To: References: <4FD9CD30-8E71-4A71-85FA-0A14C3BBA46D@gmail.com> <8983bfe9-f87e-a3ec-278f-2bfd4ba56838@mcs.anl.gov> Message-ID: <6fb10d7b-9b19-b4c7-64a3-188a9eb63b97@mcs.anl.gov> Perhaps you are not using the latest 'main' (or release) branch? I get (with current main): $ mpiexec -n 4 ./petsc_fppflags compiled by STANDARD_FORTRAN compiler called by rank 0 called by rank 1 called by rank 2 called by rank 3 There was a issue with early petsc-3.19 release - here one had to reorder the lines from: FPPFLAGS = include ${PETSC_DIR}/lib/petsc/conf/variables include ${PETSC_DIR}/lib/petsc/conf/rules to include ${PETSC_DIR}/lib/petsc/conf/variables include ${PETSC_DIR}/lib/petsc/conf/rules FPPFLAGS = But this is fixed in latest release and main branches. Satish On Sun, 7 May 2023, Danyang Su wrote: > Hi Satish, > > Sorry, this is a typo when copy to the email. I use FPPFLAGS in the makefile. Not sure why this occurs. > > Actually not only the preprocessor fails, the petsc initialize does not work either. Attached is a very simple fortran code and below is the test results. Looks like the petsc is not properly installed. I am working on macOS Monterey version 12.5 (Intel Xeon W processor). > > Compiled using petsc-3.18 > (base) ? 
petsc-dev-fppflags mpiexec -n 4 ./petsc_fppflags > compiled by STANDARD_FORTRAN compiler > called by rank 0 > called by rank 1 > called by rank 2 > called by rank 3 > > compiled using petsc-dev > (base) ? petsc-dev-fppflags mpiexec -n 4 ./petsc_fppflags > called by rank 2 > called by rank 2 > called by rank 2 > called by rank 2 > > Thanks, > > Danyang > > ?On 2023-05-06, 10:22 PM, "Satish Balay" > wrote: > > > On Sat, 6 May 2023, Danyang Su wrote: > > > > Hi All, > > > > > > > > My code has some FPP. It works fine in PETSc 3.18 and earlier version, but stops working in the latest PETSc-Dev. For example the following FPP STANDARD_FORTRAN is not recognized. > > > > > > > > #ifdef STANDARD_FORTRAN > > > > 1 format(15x,1000a15) > > > > 2 format(1pe15.6e3,1000(1pe15.6e3)) > > > > #else > > > > 1 format(15x,a15) > > > > 2 format(1pe15.6e3,(1pe15.6e3)) > > > > #endif > > > > > > > > In the makefile, I define the preprocessor as PPFLAGS. > > > > > > > > PPFLAGS := -DLINUX -DRELEASE -DRELEASE_X64 -DSTANDARD_FORTRAN > > > Shouldn't this be FPPFLAGS? > > > > > Can you send us a simple test case [with the makefile] that we can try to demonstrate this problem? > > > Satish > > > > > > ? > > > > exe: $(OBJS) chkopts > > > > -${FLINKER} $(FFLAGS) $(FPPFLAGS) $(CPPFLAGS) -o $(EXENAME) $(OBJS) ${PETSC_LIB} ${LIS_LIB} ${DLIB} ${SLIB} > > > > > > > > Any idea on this problem? > > > > > > > > All the best, > > > > > > > > > > > > > > > > From tangqi at msu.edu Sun May 7 09:43:34 2023 From: tangqi at msu.edu (Tang, Qi) Date: Sun, 7 May 2023 14:43:34 +0000 Subject: [petsc-users] Step size setting in TS In-Reply-To: References: <1c78d9b4bba544d5bf26683dc039c5bb@lanl.gov> Message-ID: <49FF8939-EEB3-4001-ACC1-DAE61AA674E9@msu.edu> Agree. Not sure adapt_type cfl is what we want. For fully implicit, we normally increase the time step by certain percent when it successfully solves the problem for n time steps. While if it fails, we reduce time step by half and re-solve. Can we achieve something similar using TS adapt? Qi On May 6, 2023, at 7:35 PM, Matthew Knepley wrote: ? On Sat, May 6, 2023 at 7:25?PM Jorti, Zakariae via petsc-users > wrote: Hello, I have a time-dependent model that I solve using TSSolve. And I am trying to adaptively change the step size (dt). I found that there are some TSAdapt schemes already available. I have tried TSADAPTBASIC and TSADAPTCFL. The former runs without any problems, whereas the latter yields the following error: " [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: No support for this operation for this object type [0]PETSC ERROR: Step rejection not implemented. The CFL implementation is incomplete/unusable " How/where the step rejection should be implemented? Are there any examples available? I would not recommend CFL, which is partially implemented at best. The DSP adaptivity is great. There is a recent paper by David Ketcheson, Lisandro Dalcin, Matteo Parsani, and collaborators which shows that you can set this up to completely handle complicated flows, and it is much better than a simple CFL condition. 
Thanks, Matt Besides, I was also attempting to change the step size through the monitor: " SNES snes; TSGetSNES(ts, & snes); SNESConvergedReason reason = SNES_CONVERGED_ITERATING; PetscCall(SNESGetConvergedReason(snes, &reason)); if(reason < 0){ TSSetTimeStep(ts,28618.7); } else{ TSSetTimeStep(ts,57237.4); } " But when I try this solution, the TSStep seems to diverge even though the SNES solver converges (see log file below). Am I doing something wrong here by changing the value of the step size inside the monitor? Thank you. Best, Zakariae ----------------------------------------------------------------- Timestep 0: step size = 57237.4, time = 0., 0 SNES Function norm 2.854104379157e-02 0 KSP Residual norm 2.854104379157e-02 1 KSP Residual norm 8.393431982129e-04 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.985106864162e-01 1 KSP Residual norm 2.509248490095e-04 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 1.985106864162e-01 0 KSP Residual norm 1.985106864162e-01 1 KSP Residual norm 2.739697200542e-04 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.885623182909e-01 1 KSP Residual norm 8.784869181255e-04 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.664114929279e-01 1 KSP Residual norm 2.277701872167e-04 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 1.664114929279e-01 0 KSP Residual norm 1.664114929279e-01 1 KSP Residual norm 2.062369048201e-04 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 6.155996487462e-02 1 KSP Residual norm 8.830688649637e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 3 SNES Function norm 6.155996487462e-02 0 KSP Residual norm 6.155996487462e-02 1 KSP Residual norm 6.231091742747e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.828806540636e-02 1 KSP Residual norm 3.623270770725e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 4 SNES Function norm 1.828806540636e-02 0 KSP Residual norm 1.828806540636e-02 1 KSP Residual norm 7.390325185746e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.659571588896e-02 1 KSP Residual norm 5.280284947190e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 5 SNES Function norm 1.659571588896e-02 0 KSP Residual norm 1.659571588896e-02 1 KSP Residual norm 8.311611006756e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.475636107890e-02 1 KSP Residual norm 1.890387034462e-04 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.635129206894e-02 1 KSP Residual norm 8.192897103059e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 6 SNES Function norm 1.635129206894e-02 0 KSP Residual norm 1.635129206894e-02 1 KSP Residual norm 9.049749505225e-04 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.634946381608e-02 1 KSP Residual norm 9.038879304284e-04 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.635085731721e-02 1 KSP Residual norm 9.050535714933e-04 Linear solve converged due to CONVERGED_RTOL iterations 1 7 SNES Function norm 1.635085731721e-02 0 KSP Residual norm 1.635085731721e-02 1 KSP Residual norm 1.248783858778e-03 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.635003937816e-02 1 KSP Residual norm 1.248746884526e-03 Linear solve converged due to CONVERGED_RTOL 
iterations 1 8 SNES Function norm 1.635003937816e-02 0 KSP Residual norm 1.635003937816e-02 1 KSP Residual norm 1.631242803481e-02 2 KSP Residual norm 1.551854905139e-02 3 KSP Residual norm 7.586491369138e-03 4 KSP Residual norm 2.304721992110e-04 Linear solve converged due to CONVERGED_RTOL iterations 4 0 KSP Residual norm 1.635003936717e-02 1 KSP Residual norm 1.631242801091e-02 2 KSP Residual norm 1.551784322549e-02 3 KSP Residual norm 7.706018759197e-03 4 KSP Residual norm 2.652435205967e-04 Linear solve converged due to CONVERGED_RTOL iterations 4 9 SNES Function norm 1.282692400720e+06 Nonlinear solve did not converge due to DIVERGED_DTOL iterations 9 0 SNES Function norm 2.854104379157e-02 0 KSP Residual norm 2.854104379157e-02 1 KSP Residual norm 5.698410682663e-04 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 8.618385788411e-02 1 KSP Residual norm 9.179214337527e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 8.618385788411e-02 0 KSP Residual norm 8.618385788411e-02 1 KSP Residual norm 9.659611612697e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 2.763975792565e-02 1 KSP Residual norm 2.674728432525e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 2.763975792565e-02 0 KSP Residual norm 2.763975792565e-02 1 KSP Residual norm 1.836844517206e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 2.773540750989e-03 1 KSP Residual norm 1.340353004310e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 3 SNES Function norm 2.773540750989e-03 0 KSP Residual norm 2.773540750989e-03 1 KSP Residual norm 5.382019472069e-06 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 3.441017533038e-04 1 KSP Residual norm 5.218147408275e-07 Linear solve converged due to CONVERGED_RTOL iterations 1 4 SNES Function norm 3.441017533038e-04 0 KSP Residual norm 3.441017533038e-04 1 KSP Residual norm 4.811955757758e-07 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 3.991606932979e-06 1 KSP Residual norm 8.670005394864e-09 Linear solve converged due to CONVERGED_RTOL iterations 1 5 SNES Function norm 3.991606932979e-06 0 KSP Residual norm 3.991606932979e-06 1 KSP Residual norm 8.920137762569e-09 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 7.982782045945e-10 1 KSP Residual norm 3.185856555648e-12 Linear solve converged due to CONVERGED_RTOL iterations 1 6 SNES Function norm 7.982782045945e-10 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 6 0 SNES Function norm 3.475518085003e-04 0 KSP Residual norm 3.475518085003e-04 1 KSP Residual norm 2.328087365783e-04 2 KSP Residual norm 5.062641920970e-09 Linear solve converged due to CONVERGED_RTOL iterations 2 0 KSP Residual norm 3.256441726122e-02 1 KSP Residual norm 8.957860841275e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 3.256441726122e-02 0 KSP Residual norm 3.256441726122e-02 1 KSP Residual norm 5.886144296665e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 4.876417342600e-03 1 KSP Residual norm 5.064113356579e-06 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 4.876417342600e-03 0 KSP Residual norm 4.876417342600e-03 1 KSP Residual norm 7.010281488349e-06 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 6.144574635801e-04 1 KSP Residual norm 
9.768499579865e-07 Linear solve converged due to CONVERGED_RTOL iterations 1 3 SNES Function norm 6.144574635801e-04 0 KSP Residual norm 6.144574635801e-04 1 KSP Residual norm 8.324370478600e-07 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 8.721913471286e-06 1 KSP Residual norm 1.370545428753e-08 Linear solve converged due to CONVERGED_RTOL iterations 1 4 SNES Function norm 8.721913471286e-06 0 KSP Residual norm 8.721913471286e-06 1 KSP Residual norm 1.353680905072e-08 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 3.399977638261e-09 1 KSP Residual norm 5.989016835446e-12 Linear solve converged due to CONVERGED_RTOL iterations 1 5 SNES Function norm 3.399977638261e-09 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 5 0 SNES Function norm 2.854104379157e-02 0 KSP Residual norm 2.854104379157e-02 1 KSP Residual norm 1.733028652796e-04 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 7.490800616133e-03 1 KSP Residual norm 8.231034393757e-06 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 7.490800616133e-03 0 KSP Residual norm 7.490800616133e-03 1 KSP Residual norm 8.349300836114e-06 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.394207859364e-04 1 KSP Residual norm 5.991798196295e-08 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 1.394207859364e-04 0 KSP Residual norm 1.394207859364e-04 1 KSP Residual norm 5.941984242417e-08 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 2.985895331367e-08 1 KSP Residual norm 1.522240176483e-11 Linear solve converged due to CONVERGED_RTOL iterations 1 3 SNES Function norm 2.985895331367e-08 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 3 0 SNES Function norm 1.026110861069e-03 0 KSP Residual norm 1.026110861069e-03 1 KSP Residual norm 1.768297003974e-04 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 7.742520550463e-03 1 KSP Residual norm 4.379214132544e-06 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 7.742520550463e-03 0 KSP Residual norm 7.742520550463e-03 1 KSP Residual norm 4.526371657285e-06 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 2.574467856482e-05 1 KSP Residual norm 2.285626422848e-08 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 2.574467856482e-05 0 KSP Residual norm 2.574467856482e-05 1 KSP Residual norm 2.274220735999e-08 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 4.797870687759e-09 1 KSP Residual norm 2.841037171618e-12 Linear solve converged due to CONVERGED_RTOL iterations 1 3 SNES Function norm 4.797870687759e-09 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 3 0 SNES Function norm 2.854104379157e-02 0 KSP Residual norm 2.854104379157e-02 1 KSP Residual norm 5.979305049211e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 8.195748548281e-04 1 KSP Residual norm 1.017772307359e-06 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 8.195748548281e-04 0 KSP Residual norm 8.195748548281e-04 1 KSP Residual norm 1.026961867317e-06 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.717269750886e-06 1 KSP Residual norm 4.675717893044e-10 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 
1.717269750886e-06 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 0 SNES Function norm 1.325295461903e-03 0 KSP Residual norm 1.325295461903e-03 1 KSP Residual norm 7.421950279125e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.330639207844e-03 1 KSP Residual norm 3.764830059967e-07 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 1.330639207844e-03 0 KSP Residual norm 1.330639207844e-03 1 KSP Residual norm 3.764587772248e-07 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 2.146996236879e-07 1 KSP Residual norm 1.459926588396e-10 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 2.146996236879e-07 0 KSP Residual norm 2.146996236879e-07 1 KSP Residual norm 1.470407475325e-10 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 2.171052440924e-11 1 KSP Residual norm 1.397958787200e-15 Linear solve converged due to CONVERGED_RTOL iterations 1 3 SNES Function norm 2.171052440924e-11 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 3 0 SNES Function norm 2.854104379157e-02 0 KSP Residual norm 2.854104379157e-02 1 KSP Residual norm 3.067250018894e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.608120509564e-04 1 KSP Residual norm 2.171575389779e-07 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 1.608120509564e-04 0 KSP Residual norm 1.608120509564e-04 1 KSP Residual norm 2.240227170220e-07 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 9.106906778906e-08 1 KSP Residual norm 1.799636201836e-11 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 9.106906778906e-08 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 0 SNES Function norm 1.418151440967e-03 0 KSP Residual norm 1.418151440967e-03 1 KSP Residual norm 3.515151918636e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 2.938727941997e-04 1 KSP Residual norm 6.040565904176e-08 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 2.938727941997e-04 0 KSP Residual norm 2.938727941997e-04 1 KSP Residual norm 6.089297895562e-08 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.093551284505e-08 1 KSP Residual norm 2.891462873664e-12 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 1.093551284505e-08 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 0 SNES Function norm 2.854104379157e-02 0 KSP Residual norm 2.854104379157e-02 1 KSP Residual norm 2.050867218313e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 3.054570961709e-05 1 KSP Residual norm 4.596885059357e-08 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 3.054570961709e-05 0 KSP Residual norm 3.054570961709e-05 1 KSP Residual norm 4.828909801995e-08 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.248104282690e-08 1 KSP Residual norm 8.831951540120e-13 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 1.248104282690e-08 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 0 SNES Function norm 1.460190420374e-03 0 KSP Residual norm 1.460190420374e-03 1 KSP Residual norm 1.580519450452e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 
5.884986293527e-05 1 KSP Residual norm 9.020937669734e-09 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 5.884986293527e-05 0 KSP Residual norm 5.884986293527e-05 1 KSP Residual norm 9.054367061995e-09 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.698647259662e-09 1 KSP Residual norm 1.767366286599e-13 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 1.698647259662e-09 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 0 SNES Function norm 2.854104379157e-02 0 KSP Residual norm 2.854104379157e-02 1 KSP Residual norm 1.834492501580e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 5.663374849972e-06 1 KSP Residual norm 9.752865227095e-09 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 5.663374849972e-06 0 KSP Residual norm 5.663374849972e-06 1 KSP Residual norm 1.014567990183e-08 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 2.600155584863e-09 1 KSP Residual norm 8.580477366667e-14 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 2.600155584863e-09 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 0 SNES Function norm 1.479145244919e-03 0 KSP Residual norm 1.479145244919e-03 1 KSP Residual norm 6.804553002336e-06 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.118291636442e-05 1 KSP Residual norm 1.397149309582e-09 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 1.118291636442e-05 0 KSP Residual norm 1.118291636442e-05 1 KSP Residual norm 1.386385106529e-09 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 2.800461695540e-10 1 KSP Residual norm 3.217319587127e-14 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 2.800461695540e-10 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 0 SNES Function norm 2.854104379157e-02 0 KSP Residual norm 2.854104379157e-02 1 KSP Residual norm 1.776151497065e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.076105708636e-06 1 KSP Residual norm 2.155085223926e-09 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 1.076105708636e-06 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 0 SNES Function norm 1.489507915803e-03 0 KSP Residual norm 1.489507915803e-03 1 KSP Residual norm 2.921538148329e-06 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 2.172405782412e-06 1 KSP Residual norm 2.719699969205e-10 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 2.172405782412e-06 0 KSP Residual norm 2.172405782412e-06 1 KSP Residual norm 2.731813669182e-10 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 5.323734019212e-11 1 KSP Residual norm 6.095298326433e-15 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 5.323734019212e-11 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 0 SNES Function norm 2.854104379157e-02 0 KSP Residual norm 2.854104379157e-02 1 KSP Residual norm 1.746846293829e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 2.376406135686e-07 1 KSP Residual norm 5.310287317591e-10 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 2.376406135686e-07 Nonlinear solve converged due to 
CONVERGED_FNORM_RELATIVE iterations 1 0 SNES Function norm 1.497239199277e-03 0 KSP Residual norm 1.497239199277e-03 1 KSP Residual norm 1.251830706962e-06 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 5.023668872625e-07 1 KSP Residual norm 7.656373075632e-11 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 5.023668872625e-07 0 KSP Residual norm 5.023668872625e-07 1 KSP Residual norm 7.473091471413e-11 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 2.003331377435e-11 1 KSP Residual norm 1.347408932811e-15 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 2.003331377435e-11 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 0 SNES Function norm 2.854104379157e-02 0 KSP Residual norm 2.854104379157e-02 1 KSP Residual norm 1.757419385096e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 7.310279661922e-08 1 KSP Residual norm 1.417918576189e-10 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 7.310279661922e-08 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 0 SNES Function norm 1.503392083665e-03 0 KSP Residual norm 1.503392083665e-03 1 KSP Residual norm 5.327548050375e-07 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.611770751552e-07 1 KSP Residual norm 2.611036263545e-11 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 1.611770751552e-07 0 KSP Residual norm 1.611770751552e-07 1 KSP Residual norm 2.607692240234e-11 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.627978707015e-11 1 KSP Residual norm 4.254078683385e-16 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 1.627978707015e-11 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 0 SNES Function norm 2.854104379157e-02 0 KSP Residual norm 2.854104379157e-02 1 KSP Residual norm 1.747957348554e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 2.999731614900e-08 1 KSP Residual norm 3.643836796648e-11 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 2.999731614900e-08 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 0 SNES Function norm 1.507740473908e-03 0 KSP Residual norm 1.507740473908e-03 1 KSP Residual norm 2.263876724496e-07 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 6.390049329095e-08 1 KSP Residual norm 9.745723464126e-12 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 6.390049329095e-08 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: TSStep has failed due to DIVERGED_STEP_REJECTED -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
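A minimal sketch, in C, of the policy Qi describes: grow the step by a fixed percentage after a successful nonlinear solve, halve it after a failure, implemented here in a TSPreStep hook that inspects the previous step's SNES result. The growth and shrink factors are placeholders, and the sketch assumes time-step adaptivity is otherwise disabled (e.g. -ts_adapt_type none) so it does not fight the controller. As Matt notes below, a custom TSAdapt (see TS ex53) or the DSP controller is the more robust route, and whether a failed step is actually retried is governed by TS itself (e.g. TSSetMaxSNESFailures), which this sketch does not change.

#include <petscts.h>

/* Adjust dt before each step based on how the previous nonlinear solve went. */
static PetscErrorCode AdjustStepPreStep(TS ts)
{
  SNES                snes;
  SNESConvergedReason reason;
  PetscReal           dt;

  PetscFunctionBeginUser;
  PetscCall(TSGetSNES(ts, &snes));
  PetscCall(SNESGetConvergedReason(snes, &reason));
  if (reason == SNES_CONVERGED_ITERATING) PetscFunctionReturn(0); /* no previous solve yet */
  PetscCall(TSGetTimeStep(ts, &dt));
  if (reason < 0) dt *= 0.5;  /* last nonlinear solve failed: halve    */
  else            dt *= 1.05; /* last nonlinear solve converged: grow  */
  PetscCall(TSSetTimeStep(ts, dt));
  PetscFunctionReturn(0);
}

/* Registration, somewhere after TSCreate()/TSSetFromOptions():
     PetscCall(TSSetPreStep(ts, AdjustStepPreStep));              */

Unlike the monitor-based snippet quoted above, this only rescales the step size between steps; it never hard-codes absolute dt values.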
URL: From knepley at gmail.com Sun May 7 09:52:24 2023 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 7 May 2023 10:52:24 -0400 Subject: [petsc-users] Step size setting in TS In-Reply-To: <49FF8939-EEB3-4001-ACC1-DAE61AA674E9@msu.edu> References: <1c78d9b4bba544d5bf26683dc039c5bb@lanl.gov> <49FF8939-EEB3-4001-ACC1-DAE61AA674E9@msu.edu> Message-ID: On Sun, May 7, 2023 at 10:43?AM Tang, Qi wrote: > Agree. Not sure adapt_type cfl is what we want. > > For fully implicit, we normally increase the time step by certain percent > when it successfully solves the problem for n time steps. While if it > fails, we reduce time step by half and re-solve. Can we achieve something > similar using TS adapt? > You can certainly write your own adaptor.I do this in TS ex53 since it has an impulsive start, and to be accurate, I need a first timestep that is very small and consistently so. Also the DSP is a PID controller, so it is doing something morally similar. Thanks, Matt > Qi > > On May 6, 2023, at 7:35 PM, Matthew Knepley wrote: > > ? > On Sat, May 6, 2023 at 7:25?PM Jorti, Zakariae via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> Hello, >> >> >> I have a time-dependent model that I solve using TSSolve. >> >> And I am trying to adaptively change the step size (dt). >> >> I found that there are some TSAdapt schemes already available. >> >> I have tried TSADAPTBASIC and TSADAPTCFL. >> >> The former runs without any problems, whereas the latter yields the >> following error: >> " >> [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> [0]PETSC ERROR: No support for this operation for this object type >> [0]PETSC ERROR: Step rejection not implemented. The CFL implementation is >> incomplete/unusable >> " >> >> >> How/where the step rejection should be implemented? Are there any >> examples available? >> > > I would not recommend CFL, which is partially implemented at best. The DSP > adaptivity is great. There is a recent paper by > David Ketcheson, Lisandro Dalcin, Matteo Parsani, and collaborators which > shows that you can set this up to completely > handle complicated flows, and it is much better than a simple CFL > condition. > > Thanks, > > Matt > > >> Besides, I was also attempting to change the step size through the >> monitor: >> >> >> " >> SNES snes; >> TSGetSNES(ts, & snes); >> SNESConvergedReason reason = SNES_CONVERGED_ITERATING; >> PetscCall(SNESGetConvergedReason(snes, &reason)); >> >> if(reason < 0){ >> TSSetTimeStep(ts,28618.7); >> } >> else{ >> TSSetTimeStep(ts,57237.4); >> } >> " >> >> But when I try this solution, the TSStep seems to diverge even though the >> SNES solver converges (see log file below). >> Am I doing something wrong here by changing the value of the step size >> inside the monitor? >> >> >> Thank you. 
>> >> Best, >> >> >> Zakariae >> >> >> >> >> >> >> >> ----------------------------------------------------------------- >> >> Timestep 0: step size = 57237.4, time = 0., >> 0 SNES Function norm 2.854104379157e-02 >> 0 KSP Residual norm 2.854104379157e-02 >> 1 KSP Residual norm 8.393431982129e-04 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 1.985106864162e-01 >> 1 KSP Residual norm 2.509248490095e-04 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 1 SNES Function norm 1.985106864162e-01 >> 0 KSP Residual norm 1.985106864162e-01 >> 1 KSP Residual norm 2.739697200542e-04 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 1.885623182909e-01 >> 1 KSP Residual norm 8.784869181255e-04 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 1.664114929279e-01 >> 1 KSP Residual norm 2.277701872167e-04 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 2 SNES Function norm 1.664114929279e-01 >> 0 KSP Residual norm 1.664114929279e-01 >> 1 KSP Residual norm 2.062369048201e-04 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 6.155996487462e-02 >> 1 KSP Residual norm 8.830688649637e-05 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 3 SNES Function norm 6.155996487462e-02 >> 0 KSP Residual norm 6.155996487462e-02 >> 1 KSP Residual norm 6.231091742747e-05 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 1.828806540636e-02 >> 1 KSP Residual norm 3.623270770725e-05 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 4 SNES Function norm 1.828806540636e-02 >> 0 KSP Residual norm 1.828806540636e-02 >> 1 KSP Residual norm 7.390325185746e-05 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 1.659571588896e-02 >> 1 KSP Residual norm 5.280284947190e-05 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 5 SNES Function norm 1.659571588896e-02 >> 0 KSP Residual norm 1.659571588896e-02 >> 1 KSP Residual norm 8.311611006756e-05 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 1.475636107890e-02 >> 1 KSP Residual norm 1.890387034462e-04 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 1.635129206894e-02 >> 1 KSP Residual norm 8.192897103059e-05 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 6 SNES Function norm 1.635129206894e-02 >> 0 KSP Residual norm 1.635129206894e-02 >> 1 KSP Residual norm 9.049749505225e-04 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 1.634946381608e-02 >> 1 KSP Residual norm 9.038879304284e-04 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 1.635085731721e-02 >> 1 KSP Residual norm 9.050535714933e-04 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 7 SNES Function norm 1.635085731721e-02 >> 0 KSP Residual norm 1.635085731721e-02 >> 1 KSP Residual norm 1.248783858778e-03 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 1.635003937816e-02 >> 1 KSP Residual norm 1.248746884526e-03 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 8 SNES Function norm 1.635003937816e-02 >> 0 KSP Residual norm 1.635003937816e-02 >> 1 KSP Residual norm 1.631242803481e-02 >> 2 KSP Residual norm 1.551854905139e-02 >> 3 KSP Residual norm 7.586491369138e-03 >> 4 KSP Residual norm 2.304721992110e-04 >> Linear solve converged due to 
CONVERGED_RTOL iterations 4 >> 0 KSP Residual norm 1.635003936717e-02 >> 1 KSP Residual norm 1.631242801091e-02 >> 2 KSP Residual norm 1.551784322549e-02 >> 3 KSP Residual norm 7.706018759197e-03 >> 4 KSP Residual norm 2.652435205967e-04 >> Linear solve converged due to CONVERGED_RTOL iterations 4 >> 9 SNES Function norm 1.282692400720e+06 >> Nonlinear solve did not converge due to DIVERGED_DTOL iterations 9 >> 0 SNES Function norm 2.854104379157e-02 >> 0 KSP Residual norm 2.854104379157e-02 >> 1 KSP Residual norm 5.698410682663e-04 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 8.618385788411e-02 >> 1 KSP Residual norm 9.179214337527e-05 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 1 SNES Function norm 8.618385788411e-02 >> 0 KSP Residual norm 8.618385788411e-02 >> 1 KSP Residual norm 9.659611612697e-05 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 2.763975792565e-02 >> 1 KSP Residual norm 2.674728432525e-05 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 2 SNES Function norm 2.763975792565e-02 >> 0 KSP Residual norm 2.763975792565e-02 >> 1 KSP Residual norm 1.836844517206e-05 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 2.773540750989e-03 >> 1 KSP Residual norm 1.340353004310e-05 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 3 SNES Function norm 2.773540750989e-03 >> 0 KSP Residual norm 2.773540750989e-03 >> 1 KSP Residual norm 5.382019472069e-06 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 3.441017533038e-04 >> 1 KSP Residual norm 5.218147408275e-07 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 4 SNES Function norm 3.441017533038e-04 >> 0 KSP Residual norm 3.441017533038e-04 >> 1 KSP Residual norm 4.811955757758e-07 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 3.991606932979e-06 >> 1 KSP Residual norm 8.670005394864e-09 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 5 SNES Function norm 3.991606932979e-06 >> 0 KSP Residual norm 3.991606932979e-06 >> 1 KSP Residual norm 8.920137762569e-09 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 7.982782045945e-10 >> 1 KSP Residual norm 3.185856555648e-12 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 6 SNES Function norm 7.982782045945e-10 >> Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 6 >> 0 SNES Function norm 3.475518085003e-04 >> 0 KSP Residual norm 3.475518085003e-04 >> 1 KSP Residual norm 2.328087365783e-04 >> 2 KSP Residual norm 5.062641920970e-09 >> Linear solve converged due to CONVERGED_RTOL iterations 2 >> 0 KSP Residual norm 3.256441726122e-02 >> 1 KSP Residual norm 8.957860841275e-05 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 1 SNES Function norm 3.256441726122e-02 >> 0 KSP Residual norm 3.256441726122e-02 >> 1 KSP Residual norm 5.886144296665e-05 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 4.876417342600e-03 >> 1 KSP Residual norm 5.064113356579e-06 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 2 SNES Function norm 4.876417342600e-03 >> 0 KSP Residual norm 4.876417342600e-03 >> 1 KSP Residual norm 7.010281488349e-06 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 6.144574635801e-04 >> 1 KSP Residual norm 9.768499579865e-07 >> Linear solve converged due to CONVERGED_RTOL 
iterations 1 >> 3 SNES Function norm 6.144574635801e-04 >> 0 KSP Residual norm 6.144574635801e-04 >> 1 KSP Residual norm 8.324370478600e-07 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 8.721913471286e-06 >> 1 KSP Residual norm 1.370545428753e-08 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 4 SNES Function norm 8.721913471286e-06 >> 0 KSP Residual norm 8.721913471286e-06 >> 1 KSP Residual norm 1.353680905072e-08 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 3.399977638261e-09 >> 1 KSP Residual norm 5.989016835446e-12 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 5 SNES Function norm 3.399977638261e-09 >> Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 5 >> 0 SNES Function norm 2.854104379157e-02 >> 0 KSP Residual norm 2.854104379157e-02 >> 1 KSP Residual norm 1.733028652796e-04 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 7.490800616133e-03 >> 1 KSP Residual norm 8.231034393757e-06 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 1 SNES Function norm 7.490800616133e-03 >> 0 KSP Residual norm 7.490800616133e-03 >> 1 KSP Residual norm 8.349300836114e-06 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 1.394207859364e-04 >> 1 KSP Residual norm 5.991798196295e-08 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 2 SNES Function norm 1.394207859364e-04 >> 0 KSP Residual norm 1.394207859364e-04 >> 1 KSP Residual norm 5.941984242417e-08 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 2.985895331367e-08 >> 1 KSP Residual norm 1.522240176483e-11 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 3 SNES Function norm 2.985895331367e-08 >> Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 3 >> 0 SNES Function norm 1.026110861069e-03 >> 0 KSP Residual norm 1.026110861069e-03 >> 1 KSP Residual norm 1.768297003974e-04 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 7.742520550463e-03 >> 1 KSP Residual norm 4.379214132544e-06 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 1 SNES Function norm 7.742520550463e-03 >> 0 KSP Residual norm 7.742520550463e-03 >> 1 KSP Residual norm 4.526371657285e-06 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 2.574467856482e-05 >> 1 KSP Residual norm 2.285626422848e-08 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 2 SNES Function norm 2.574467856482e-05 >> 0 KSP Residual norm 2.574467856482e-05 >> 1 KSP Residual norm 2.274220735999e-08 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 4.797870687759e-09 >> 1 KSP Residual norm 2.841037171618e-12 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 3 SNES Function norm 4.797870687759e-09 >> Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 3 >> 0 SNES Function norm 2.854104379157e-02 >> 0 KSP Residual norm 2.854104379157e-02 >> 1 KSP Residual norm 5.979305049211e-05 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 8.195748548281e-04 >> 1 KSP Residual norm 1.017772307359e-06 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 1 SNES Function norm 8.195748548281e-04 >> 0 KSP Residual norm 8.195748548281e-04 >> 1 KSP Residual norm 1.026961867317e-06 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 
KSP Residual norm 1.717269750886e-06 >> 1 KSP Residual norm 4.675717893044e-10 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 2 SNES Function norm 1.717269750886e-06 >> Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 >> 0 SNES Function norm 1.325295461903e-03 >> 0 KSP Residual norm 1.325295461903e-03 >> 1 KSP Residual norm 7.421950279125e-05 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 1.330639207844e-03 >> 1 KSP Residual norm 3.764830059967e-07 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 1 SNES Function norm 1.330639207844e-03 >> 0 KSP Residual norm 1.330639207844e-03 >> 1 KSP Residual norm 3.764587772248e-07 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 2.146996236879e-07 >> 1 KSP Residual norm 1.459926588396e-10 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 2 SNES Function norm 2.146996236879e-07 >> 0 KSP Residual norm 2.146996236879e-07 >> 1 KSP Residual norm 1.470407475325e-10 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 2.171052440924e-11 >> 1 KSP Residual norm 1.397958787200e-15 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 3 SNES Function norm 2.171052440924e-11 >> Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 3 >> 0 SNES Function norm 2.854104379157e-02 >> 0 KSP Residual norm 2.854104379157e-02 >> 1 KSP Residual norm 3.067250018894e-05 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 1.608120509564e-04 >> 1 KSP Residual norm 2.171575389779e-07 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 1 SNES Function norm 1.608120509564e-04 >> 0 KSP Residual norm 1.608120509564e-04 >> 1 KSP Residual norm 2.240227170220e-07 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 9.106906778906e-08 >> 1 KSP Residual norm 1.799636201836e-11 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 2 SNES Function norm 9.106906778906e-08 >> Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 >> 0 SNES Function norm 1.418151440967e-03 >> 0 KSP Residual norm 1.418151440967e-03 >> 1 KSP Residual norm 3.515151918636e-05 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 2.938727941997e-04 >> 1 KSP Residual norm 6.040565904176e-08 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 1 SNES Function norm 2.938727941997e-04 >> 0 KSP Residual norm 2.938727941997e-04 >> 1 KSP Residual norm 6.089297895562e-08 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 1.093551284505e-08 >> 1 KSP Residual norm 2.891462873664e-12 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 2 SNES Function norm 1.093551284505e-08 >> Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 >> 0 SNES Function norm 2.854104379157e-02 >> 0 KSP Residual norm 2.854104379157e-02 >> 1 KSP Residual norm 2.050867218313e-05 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 3.054570961709e-05 >> 1 KSP Residual norm 4.596885059357e-08 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 1 SNES Function norm 3.054570961709e-05 >> 0 KSP Residual norm 3.054570961709e-05 >> 1 KSP Residual norm 4.828909801995e-08 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 1.248104282690e-08 >> 1 KSP Residual norm 8.831951540120e-13 >> Linear 
solve converged due to CONVERGED_RTOL iterations 1 >> 2 SNES Function norm 1.248104282690e-08 >> Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 >> 0 SNES Function norm 1.460190420374e-03 >> 0 KSP Residual norm 1.460190420374e-03 >> 1 KSP Residual norm 1.580519450452e-05 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 5.884986293527e-05 >> 1 KSP Residual norm 9.020937669734e-09 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 1 SNES Function norm 5.884986293527e-05 >> 0 KSP Residual norm 5.884986293527e-05 >> 1 KSP Residual norm 9.054367061995e-09 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 1.698647259662e-09 >> 1 KSP Residual norm 1.767366286599e-13 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 2 SNES Function norm 1.698647259662e-09 >> Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 >> 0 SNES Function norm 2.854104379157e-02 >> 0 KSP Residual norm 2.854104379157e-02 >> 1 KSP Residual norm 1.834492501580e-05 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 5.663374849972e-06 >> 1 KSP Residual norm 9.752865227095e-09 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 1 SNES Function norm 5.663374849972e-06 >> 0 KSP Residual norm 5.663374849972e-06 >> 1 KSP Residual norm 1.014567990183e-08 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 2.600155584863e-09 >> 1 KSP Residual norm 8.580477366667e-14 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 2 SNES Function norm 2.600155584863e-09 >> Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 >> 0 SNES Function norm 1.479145244919e-03 >> 0 KSP Residual norm 1.479145244919e-03 >> 1 KSP Residual norm 6.804553002336e-06 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 1.118291636442e-05 >> 1 KSP Residual norm 1.397149309582e-09 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 1 SNES Function norm 1.118291636442e-05 >> 0 KSP Residual norm 1.118291636442e-05 >> 1 KSP Residual norm 1.386385106529e-09 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 2.800461695540e-10 >> 1 KSP Residual norm 3.217319587127e-14 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 2 SNES Function norm 2.800461695540e-10 >> Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 >> 0 SNES Function norm 2.854104379157e-02 >> 0 KSP Residual norm 2.854104379157e-02 >> 1 KSP Residual norm 1.776151497065e-05 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 1.076105708636e-06 >> 1 KSP Residual norm 2.155085223926e-09 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 1 SNES Function norm 1.076105708636e-06 >> Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 >> 0 SNES Function norm 1.489507915803e-03 >> 0 KSP Residual norm 1.489507915803e-03 >> 1 KSP Residual norm 2.921538148329e-06 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 2.172405782412e-06 >> 1 KSP Residual norm 2.719699969205e-10 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 1 SNES Function norm 2.172405782412e-06 >> 0 KSP Residual norm 2.172405782412e-06 >> 1 KSP Residual norm 2.731813669182e-10 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 5.323734019212e-11 >> 1 KSP Residual norm 
6.095298326433e-15 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 2 SNES Function norm 5.323734019212e-11 >> Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 >> 0 SNES Function norm 2.854104379157e-02 >> 0 KSP Residual norm 2.854104379157e-02 >> 1 KSP Residual norm 1.746846293829e-05 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 2.376406135686e-07 >> 1 KSP Residual norm 5.310287317591e-10 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 1 SNES Function norm 2.376406135686e-07 >> Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 >> 0 SNES Function norm 1.497239199277e-03 >> 0 KSP Residual norm 1.497239199277e-03 >> 1 KSP Residual norm 1.251830706962e-06 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 5.023668872625e-07 >> 1 KSP Residual norm 7.656373075632e-11 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 1 SNES Function norm 5.023668872625e-07 >> 0 KSP Residual norm 5.023668872625e-07 >> 1 KSP Residual norm 7.473091471413e-11 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 2.003331377435e-11 >> 1 KSP Residual norm 1.347408932811e-15 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 2 SNES Function norm 2.003331377435e-11 >> Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 >> 0 SNES Function norm 2.854104379157e-02 >> 0 KSP Residual norm 2.854104379157e-02 >> 1 KSP Residual norm 1.757419385096e-05 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 7.310279661922e-08 >> 1 KSP Residual norm 1.417918576189e-10 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 1 SNES Function norm 7.310279661922e-08 >> Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 >> 0 SNES Function norm 1.503392083665e-03 >> 0 KSP Residual norm 1.503392083665e-03 >> 1 KSP Residual norm 5.327548050375e-07 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 1.611770751552e-07 >> 1 KSP Residual norm 2.611036263545e-11 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 1 SNES Function norm 1.611770751552e-07 >> 0 KSP Residual norm 1.611770751552e-07 >> 1 KSP Residual norm 2.607692240234e-11 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 1.627978707015e-11 >> 1 KSP Residual norm 4.254078683385e-16 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 2 SNES Function norm 1.627978707015e-11 >> Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 >> 0 SNES Function norm 2.854104379157e-02 >> 0 KSP Residual norm 2.854104379157e-02 >> 1 KSP Residual norm 1.747957348554e-05 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 2.999731614900e-08 >> 1 KSP Residual norm 3.643836796648e-11 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 1 SNES Function norm 2.999731614900e-08 >> Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 >> 0 SNES Function norm 1.507740473908e-03 >> 0 KSP Residual norm 1.507740473908e-03 >> 1 KSP Residual norm 2.263876724496e-07 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 0 KSP Residual norm 6.390049329095e-08 >> 1 KSP Residual norm 9.745723464126e-12 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> 1 SNES Function norm 6.390049329095e-08 >> Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE 
iterations 1 >> [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> [0]PETSC ERROR: TSStep has failed due to DIVERGED_STEP_REJECTED >> >> >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Sun May 7 10:38:55 2023 From: bsmith at petsc.dev (Barry Smith) Date: Sun, 7 May 2023 11:38:55 -0400 Subject: [petsc-users] Help with KSPSetConvergenceTest In-Reply-To: References: Message-ID: <02578B71-E871-4DC1-A06E-8F96990E5C13@petsc.dev> The code will not work as written because KSPConvergedDefault() requires a context created with KSPConvergedDefaultCreate(). Here is a starting point for what you need, in main integer*8 defaultctx extern MyKSPConverged, KSPConvergedDefaultDestroy KSPDefaultConvergedCreate(defaultctx,ierr) KSPSetConvergenceTest (ksp,MyKSPConverged,defaultctx,KSPConvergedDefaultDestroy, ierr) subroutine MyKSPConverged(ksp,n,rnorm,flag,defaultctx,ierr) KSP ksp PetscErrorCode ierr PetscInt n integer*8 defaultctx KSPConvergedReason flag PetscReal rnorm if (n>1) then call KSPConvergedDefault(ksp, n, rnorm, flag,defaultctx, ierr) else flag = 0 endif ierr = 0 end subroutine MyKSPConverged > On May 7, 2023, at 10:09 AM, Matthew Knepley wrote: > > On Sun, May 7, 2023 at 10:02?AM Edoardo alinovi > wrote: >> Thanks, >> >> Is this a reasonable thing to do if I want to replicate what KSP is doing by default? > > Yes. The other option is to pass along 'dummy' > > Thanks, > > Matt > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Sun May 7 10:47:07 2023 From: bsmith at petsc.dev (Barry Smith) Date: Sun, 7 May 2023 11:47:07 -0400 Subject: [petsc-users] Help with KSPSetConvergenceTest In-Reply-To: <02578B71-E871-4DC1-A06E-8F96990E5C13@petsc.dev> References: <02578B71-E871-4DC1-A06E-8F96990E5C13@petsc.dev> Message-ID: Note, you must call the default test on iteration 0. This is how it determines the initial residual for relative residual tests etc. I recommend not having the if () test at all. Instead, always call the default convergence test first and then change the flag value it provides if needed before doing any custom checking. > On May 7, 2023, at 11:38 AM, Barry Smith wrote: > > > The code will not work as written because KSPConvergedDefault() requires a context created with KSPConvergedDefaultCreate(). 
>
> Here is a starting point for what you need, in main
>
> integer*8 defaultctx
> extern MyKSPConverged, KSPConvergedDefaultDestroy
>
> KSPDefaultConvergedCreate(defaultctx,ierr)
> KSPSetConvergenceTest (ksp,MyKSPConverged,defaultctx,KSPConvergedDefaultDestroy, ierr)
>
> subroutine MyKSPConverged(ksp,n,rnorm,flag,defaultctx,ierr)
>
> KSP ksp
> PetscErrorCode ierr
> PetscInt n
> integer*8 defaultctx
> KSPConvergedReason flag
> PetscReal rnorm
>
> if (n>1) then
> call KSPConvergedDefault(ksp, n, rnorm, flag,defaultctx, ierr)
> else
> flag = 0
> endif
> ierr = 0
>
> end subroutine MyKSPConverged
>
>> On May 7, 2023, at 10:09 AM, Matthew Knepley wrote:
>>
>> On Sun, May 7, 2023 at 10:02 AM Edoardo alinovi wrote:
>>> Thanks,
>>>
>>> Is this a reasonable thing to do if I want to replicate what KSP is doing by default?
>>
>> Yes. The other option is to pass along 'dummy'
>>
>> Thanks,
>>
>> Matt
>>
>> -- 
>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
>> -- Norbert Wiener
>>
>> https://www.cse.buffalo.edu/~knepley/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From bsmith at petsc.dev  Sun May  7 11:40:27 2023
From: bsmith at petsc.dev (Barry Smith)
Date: Sun, 7 May 2023 12:40:27 -0400
Subject: [petsc-users] Help with KSPSetConvergenceTest
In-Reply-To: 
References: <02578B71-E871-4DC1-A06E-8F96990E5C13@petsc.dev>
Message-ID: <7BACC494-991D-478F-8585-882AB997A7CA@petsc.dev>

Working example in https://gitlab.com/petsc/petsc/-/merge_requests/6430

> On May 7, 2023, at 11:47 AM, Barry Smith wrote:
>
> Note, you must call the default test on iteration 0. This is how it determines the initial residual for relative residual tests etc.
>
> I recommend not having the if () test at all. Instead, always call the default convergence test first and then change the flag value it provides if needed before doing any custom checking.
>
>> On May 7, 2023, at 11:38 AM, Barry Smith wrote:
>>
>> The code will not work as written because KSPConvergedDefault() requires a context created with KSPConvergedDefaultCreate().
>>
>> Here is a starting point for what you need, in main
>>
>> integer*8 defaultctx
>> extern MyKSPConverged, KSPConvergedDefaultDestroy
>>
>> KSPDefaultConvergedCreate(defaultctx,ierr)
>> KSPSetConvergenceTest (ksp,MyKSPConverged,defaultctx,KSPConvergedDefaultDestroy, ierr)
>>
>> subroutine MyKSPConverged(ksp,n,rnorm,flag,defaultctx,ierr)
>>
>> KSP ksp
>> PetscErrorCode ierr
>> PetscInt n
>> integer*8 defaultctx
>> KSPConvergedReason flag
>> PetscReal rnorm
>>
>> if (n>1) then
>> call KSPConvergedDefault(ksp, n, rnorm, flag,defaultctx, ierr)
>> else
>> flag = 0
>> endif
>> ierr = 0
>>
>> end subroutine MyKSPConverged
>>
>>> On May 7, 2023, at 10:09 AM, Matthew Knepley wrote:
>>>
>>> On Sun, May 7, 2023 at 10:02 AM Edoardo alinovi wrote:
>>>> Thanks,
>>>>
>>>> Is this a reasonable thing to do if I want to replicate what KSP is doing by default?
>>>
>>> Yes. The other option is to pass along 'dummy'
>>>
>>> Thanks,
>>>
>>> Matt
>>>
>>> -- 
>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
>>> -- Norbert Wiener
>>>
>>> https://www.cse.buffalo.edu/~knepley/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From edoardo.alinovi at gmail.com  Sun May  7 13:22:57 2023
From: edoardo.alinovi at gmail.com (Edoardo alinovi)
Date: Sun, 7 May 2023 20:22:57 +0200
Subject: [petsc-users] Help with KSPSetConvergenceTest
In-Reply-To: <7BACC494-991D-478F-8585-882AB997A7CA@petsc.dev>
References: <02578B71-E871-4DC1-A06E-8F96990E5C13@petsc.dev>
	<7BACC494-991D-478F-8585-882AB997A7CA@petsc.dev>
Message-ID: 

Hello Barry,

Mega! Thank you Berry much for providing me with a working example! I ended up in writing this:

call KSPConvergedDefault(ksp, n, rnorm, flag, PETSC_NULL_FUNCTION, ierr)
if (n

From bsmith at petsc.dev  Sun May  7 13:31:28 2023
From: bsmith at petsc.dev (Barry Smith)
Date: Sun, 7 May 2023 14:31:28 -0400
Subject: [petsc-users] Help with KSPSetConvergenceTest
In-Reply-To: 
References: <02578B71-E871-4DC1-A06E-8F96990E5C13@petsc.dev>
	<7BACC494-991D-478F-8585-882AB997A7CA@petsc.dev>
Message-ID: 

> On May 7, 2023, at 2:22 PM, Edoardo alinovi wrote:
>
> Hello Barry,
>
> Mega! Thank you Berry much for providing me with a working example! I ended up in writing this:
>
> call KSPConvergedDefault(ksp, n, rnorm, flag, PETSC_NULL_FUNCTION, ierr)

   This should not work. The argument to KSPConvergedDefault is a context, not a function

> if (n
> flag = 0
> endif
> ierr = 0
>
> and it looks working but I'll take advantage of your suggestion ;) Is KSPConvergedDefaultDestroy mandatory?

   You could pass PETSC_NULL_FUNCTION instead of KSPConvergedDefaultDestroy, but there is no reason to.

>
> I know it's easy code, but maybe you might have a think to add this control and expose it as the max number of iterations in KSP. I can tell you it is very much used, I found myself many times in need to tell the solver to iterate regardless of the tolerances criteria. It happens for example in the RANS equation, especially omega. Sometimes they tend to stall and you do want to tighten the tolerances for a bunch of iters, or you might not know if they do while iterating!
>
> Cheers
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
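[Editorial sketch added for clarity; it is not part of the archived thread.] Putting Barry's advice together (create the context with KSPConvergedDefaultCreate(), call the default test on every iteration including iteration 0, and only then adjust the flag), a minimal Fortran version of a "minimum number of iterations" test might look like the sketch below. The routine names and declarations follow the snippets quoted above rather than a checked build; the variable min_its and its value are purely illustrative, and the authoritative version is the working example in merge request 6430 cited by Barry (a built-in minimum-iterations control was later added as well, see the add-ksp-min-its merge request further down this archive).

      ! in the main program, assuming ksp has already been created
      integer*8 defaultctx
      external MyKSPConverged
      external KSPConvergedDefaultDestroy

      call KSPConvergedDefaultCreate(defaultctx, ierr)
      call KSPSetConvergenceTest(ksp, MyKSPConverged, defaultctx, KSPConvergedDefaultDestroy, ierr)

      ! the convergence test itself
      subroutine MyKSPConverged(ksp, n, rnorm, flag, defaultctx, ierr)
      KSP ksp
      PetscInt n
      PetscReal rnorm
      KSPConvergedReason flag
      integer*8 defaultctx
      PetscErrorCode ierr
      PetscInt min_its
      parameter (min_its = 5)   ! illustrative value, not taken from the thread

      ! always run the default test first, also on iteration 0, so the
      ! initial residual is recorded for the relative tolerance check
      call KSPConvergedDefault(ksp, n, rnorm, flag, defaultctx, ierr)

      ! then override a positive (converged) reason while below the
      ! minimum iteration count; flag = 0 means "keep iterating", and
      ! negative (diverged) reasons are deliberately left untouched
      if (n .lt. min_its .and. flag .gt. 0) then
         flag = 0
      endif
      ierr = 0
      end subroutine MyKSPConverged
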
From edoardo.alinovi at gmail.com  Sun May  7 14:03:27 2023
From: edoardo.alinovi at gmail.com (Edoardo alinovi)
Date: Sun, 7 May 2023 21:03:27 +0200
Subject: [petsc-users] Help with KSPSetConvergenceTest
In-Reply-To: 
References: <02578B71-E871-4DC1-A06E-8F96990E5C13@petsc.dev>
	<7BACC494-991D-478F-8585-882AB997A7CA@petsc.dev>
Message-ID: 

Thanks Barry, I have fixed the error. Indeed, it's odd it is working. Compiler magic I guess!
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From danyang.su at gmail.com  Mon May  8 01:11:09 2023
From: danyang.su at gmail.com (Danyang Su)
Date: Sun, 07 May 2023 23:11:09 -0700
Subject: [petsc-users] Fortran preprocessor not work in pets-dev
In-Reply-To: <6fb10d7b-9b19-b4c7-64a3-188a9eb63b97@mcs.anl.gov>
References: <4FD9CD30-8E71-4A71-85FA-0A14C3BBA46D@gmail.com>
	<8983bfe9-f87e-a3ec-278f-2bfd4ba56838@mcs.anl.gov>
	<6fb10d7b-9b19-b4c7-64a3-188a9eb63b97@mcs.anl.gov>
Message-ID: 

Hi Satish,

Exactly. Something went wrong when I pulled the remote content last week. I made a clean download of the dev version and the problem is solved.

Thanks,

Danyang

On 2023-05-07, 7:14 AM, "Satish Balay" wrote:

Perhaps you are not using the latest 'main' (or release) branch?

I get (with current main):

$ mpiexec -n 4 ./petsc_fppflags
compiled by STANDARD_FORTRAN compiler
called by rank 0
called by rank 1
called by rank 2
called by rank 3

There was an issue with the early petsc-3.19 release - here one had to reorder the lines from:

FPPFLAGS =
include ${PETSC_DIR}/lib/petsc/conf/variables
include ${PETSC_DIR}/lib/petsc/conf/rules

to

include ${PETSC_DIR}/lib/petsc/conf/variables
include ${PETSC_DIR}/lib/petsc/conf/rules
FPPFLAGS =

But this is fixed in latest release and main branches.

Satish

On Sun, 7 May 2023, Danyang Su wrote:

> Hi Satish,
>
> Sorry, this is a typo when copy to the email. I use FPPFLAGS in the makefile. Not sure why this occurs.
>
> Actually not only the preprocessor fails, the petsc initialize does not work either. Attached is a very simple fortran code and below is the test results. Looks like the petsc is not properly installed. I am working on macOS Monterey version 12.5 (Intel Xeon W processor).
>
> Compiled using petsc-3.18
> (base) ? petsc-dev-fppflags mpiexec -n 4 ./petsc_fppflags
> compiled by STANDARD_FORTRAN compiler
> called by rank 0
> called by rank 1
> called by rank 2
> called by rank 3
>
> compiled using petsc-dev
> (base) ? petsc-dev-fppflags mpiexec -n 4 ./petsc_fppflags
> called by rank 2
> called by rank 2
> called by rank 2
> called by rank 2
>
> Thanks,
>
> Danyang
>
> On 2023-05-06, 10:22 PM, "Satish Balay" wrote:
>
> On Sat, 6 May 2023, Danyang Su wrote:
>
> > Hi All,
> >
> > My code has some FPP. It works fine in PETSc 3.18 and earlier version, but stops working in the latest PETSc-Dev. For example the following FPP STANDARD_FORTRAN is not recognized.
> >
> > #ifdef STANDARD_FORTRAN
> > 1 format(15x,1000a15)
> > 2 format(1pe15.6e3,1000(1pe15.6e3))
> > #else
> > 1 format(15x,a15)
> > 2 format(1pe15.6e3,(1pe15.6e3))
> > #endif
> >
> > In the makefile, I define the preprocessor as PPFLAGS.
> >
> > PPFLAGS := -DLINUX -DRELEASE -DRELEASE_X64 -DSTANDARD_FORTRAN
>
> Shouldn't this be FPPFLAGS?
>
> Can you send us a simple test case [with the makefile] that we can try to demonstrate this problem?
>
> Satish
>
> > > > exe: $(OBJS) chkopts > > > > -${FLINKER} $(FFLAGS) $(FPPFLAGS) $(CPPFLAGS) -o $(EXENAME) $(OBJS) ${PETSC_LIB} ${LIS_LIB} ${DLIB} ${SLIB} > > > > > > > > Any idea on this problem? > > > > > > > > All the best, > > > > > > > > > > > > > > > > From sebastian.blauth at itwm.fraunhofer.de Mon May 8 01:31:55 2023 From: sebastian.blauth at itwm.fraunhofer.de (Sebastian Blauth) Date: Mon, 8 May 2023 08:31:55 +0200 Subject: [petsc-users] Scalable Solver for Incompressible Flow In-Reply-To: <3287ff5f-5ac1-fdff-52d1-97888568c098@itwm.fraunhofer.de> References: <87cz3i7fj1.fsf@jedbrown.org> <3287ff5f-5ac1-fdff-52d1-97888568c098@itwm.fraunhofer.de> Message-ID: Hello everyone, I wanted to briefly follow up on my question (see my last reply). Does anyone know / have an idea why the LSC preconditioner in PETSc does not seem to scale well with the problem size (the outer fgmres solver I am using nearly scale nearly linearly with the problem size in my example). I have also already tried using -ksp_diagonal_scale but the results are identical. Any help is really appreciated. Thanks a lot, Sebastian On 03.05.2023 09:07, Sebastian Blauth wrote: > First of all, yes you are correct that I am trying to solve the > stationary incompressible Navier Stokes equations. > > On 02.05.2023 21:33, Matthew Knepley wrote: >> On Tue, May 2, 2023 at 2:29?PM Jed Brown > > wrote: >> >> ??? Sebastian Blauth > ??? > writes: >> >> ???? > I agree with your comment for the Stokes equations - for these, I >> ??? have >> ???? > already tried and used the pressure mass matrix as part of a >> ??? (additive) >> ???? > block preconditioner and it gave mesh independent results. >> ???? > >> ???? > However, for the Navier Stokes equations, is the Schur complement >> ??? really >> ???? > spectrally equivalent to the pressure mass matrix? >> >> ??? No, it's not. You'd want something like PCD (better, but not >> ??? algebraic) or LSC. >> > > I would like to take a look at the LSC preconditioner. For this, I did > also not achieve mesh-independent results. I am using the following > options (I know that the tolerances are too high at the moment, but it > should just illustrate the behavior w.r.t. mesh refinement). Again, I am > using a simple 2D channel problem for testing purposes. > > I am using the following options > > -ksp_type fgmres > -ksp_gmres_restart 100 > -ksp_gmres_cgs_refinement_type refine_ifneeded > -ksp_max_it 1000 > -ksp_rtol 1e-10 > -ksp_atol 1e-30 > -pc_type fieldsplit > -pc_fieldsplit_type schur > -pc_fieldsplit_schur_fact_type full > -pc_fieldsplit_schur_precondition self > -fieldsplit_0_ksp_type preonly > -fieldsplit_0_pc_type lu > -fieldsplit_1_ksp_type gmres > -fieldsplit_1_ksp_pc_side right > -fieldsplit_1_ksp_gmres_restart 100 > -fieldsplit_1_ksp_gmres_cgs_refinement_type refine_ifneeded > -fieldsplit_1_ksp_max_it 1000 > -fieldsplit_1_ksp_rtol 1e-10 > -fieldsplit_1_ksp_atol 1e-30 > -fieldsplit_1_pc_type lsc > -fieldsplit_1_lsc_ksp_type preonly > -fieldsplit_1_lsc_pc_type lu > -fieldsplit_1_ksp_converged_reason > > Again, the direct solvers are used so that only the influence of the LSC > preconditioner is seen. I have suitable preconditioners for all of these > available (using boomeramg). > > At the bottom, I attach the output for different discretizations. As you > can see there, the number of iterations increases nearly linearly with > the problem size. > > I think that the problem could occur due to a wrong scaling. 
In your > docs https://petsc.org/release/manualpages/PC/PCLSC/ , you write that > the LSC preconditioner is implemented as > > ?? inv(S) \approx inv(A10 A01) A10 A00 A01 inv(A10 A01) > > However, in the book of Elman, Sylvester and Wathen (Finite Elements and > Fast Iterative Solvers), the LSC preconditioner is defined as > > ??? inv(S) \approx inv(A10 inv(T) A01) A10 inv(T) A00 inv(T) A01 > inv(A10 inv(T) A01) > > where T = diag(Q) and Q is the velocity mass matrix. > > There is an options -pc_lsc_scale_diag, which states that it uses the > diagonal of A for scaling. I suppose, that this means, that the diagonal > of the A00 block is used for scaling - however, this is not the right > scaling, is it? Even in the source code for the LSC preconditioner, in > /src/ksp/pc/impls/lsc/lsc.c it is mentioned, that a mass matrix should > be used... > Is there any way to implement this in PETSc? Maybe by supplying the mass > matrix as Pmat? > > Thanks a lot in advance, > Sebastian > >> >> I think you can do a better job than that using something like >> >> https://arxiv.org/abs/1810.03315 >> >> Basically, you use an augmented Lagrangian thing to make the Schur >> complement well-conditioned, >> and then use a special smoother to handle that perturbation. >> >> ???? > And even if it is, the convergence is only good for small >> ??? Reynolds numbers, for moderately high ones the convergence really >> ??? deteriorates. This is why I am trying to make >> ??? fieldsplit_schur_precondition selfp work better (this is, if I >> ??? understand it correctly, a SIMPLE type preconditioner). >> >> ??? SIMPLE is for short time steps (not too far from resolving CFL) and >> ??? bad for steady. This taxonomy is useful, though the problems are >> ??? super academic and they don't use high aspect ratio. >> > > Okay, I get that I cannot expect the SIMPLE preconditioning > (schur_precondition selfp) to work efficiently. I guess the reason it > works for small time steps (for the instationary equations) is that the > velocity block becomes diagonally dominant in this case, so that diag(A) > is a good approximation of A. > > >> ??? https://doi.org/10.1016/j.jcp.2007.09.026 >> ??? >> >> >> ?? ?Thanks, >> >> ?? ? ? Matt >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which >> their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> > > > And here is the output of my scaling tests > > 8x8 discretization > > Newton solver:? iter,? abs. residual (abs. tol),? rel. residual (rel. tol) > > Newton solver:???? 0,????? 1.023e+03 (1.00e-30),????? 1.000e+00 (1.00e-10) > ? Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 38 > Newton solver:???? 1,????? 1.313e+03 (1.00e-30),????? 1.283e+00 (1.00e-10) > ? Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 76 > Newton solver:???? 2,????? 1.198e+02 (1.00e-30),????? 1.171e-01 (1.00e-10) > ? Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 74 > Newton solver:???? 3,????? 7.249e-01 (1.00e-30),????? 7.084e-04 (1.00e-10) > ? Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 74 > Newton solver:???? 4,????? 3.883e-05 (1.00e-30),????? 3.795e-08 (1.00e-10) > ? Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 74 > Newton solver:???? 5,????? 2.778e-12 (1.00e-30),????? 2.714e-15 (1.00e-10) > > > > 16x16 discretization > > Newton solver:? iter,? abs. residual (abs. tol),? 
rel. residual (rel. tol) > > Newton solver:???? 0,????? 1.113e+03 (1.00e-30),????? 1.000e+00 (1.00e-10) > ? Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 62 > Newton solver:???? 1,????? 8.316e+02 (1.00e-30),????? 7.475e-01 (1.00e-10) > ? Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations > 141 > Newton solver:???? 2,????? 5.806e+01 (1.00e-30),????? 5.218e-02 (1.00e-10) > ? Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations > 119 > Newton solver:???? 3,????? 3.309e-01 (1.00e-30),????? 2.974e-04 (1.00e-10) > ? Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations > 118 > Newton solver:???? 4,????? 9.085e-06 (1.00e-30),????? 8.166e-09 (1.00e-10) > ? Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations > 120 > Newton solver:???? 5,????? 3.475e-12 (1.00e-30),????? 3.124e-15 (1.00e-10) > > > > 32x32 discretization > > Newton solver:? iter,? abs. residual (abs. tol),? rel. residual (rel. tol) > > Newton solver:???? 0,????? 1.330e+03 (1.00e-30),????? 1.000e+00 (1.00e-10) > ? Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 98 > Newton solver:???? 1,????? 5.913e+02 (1.00e-30),????? 4.445e-01 (1.00e-10) > ? Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations > 183 > Newton solver:???? 2,????? 3.214e+01 (1.00e-30),????? 2.416e-02 (1.00e-10) > ? Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations > 152 > Newton solver:???? 3,????? 2.059e-01 (1.00e-30),????? 1.547e-04 (1.00e-10) > ? Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations > 151 > Newton solver:???? 4,????? 6.949e-06 (1.00e-30),????? 5.223e-09 (1.00e-10) > ? Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations > 149 > Newton solver:???? 5,????? 5.300e-12 (1.00e-30),????? 3.983e-15 (1.00e-10) > > > > 64x64 discretization > > Newton solver:? iter,? abs. residual (abs. tol),? rel. residual (rel. tol) > > Newton solver:???? 0,????? 1.707e+03 (1.00e-30),????? 1.000e+00 (1.00e-10) > ? Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations > 198 > Newton solver:???? 1,????? 4.259e+02 (1.00e-30),????? 2.494e-01 (1.00e-10) > ? Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations > 357 > Newton solver:???? 2,????? 1.706e+01 (1.00e-30),????? 9.993e-03 (1.00e-10) > ? Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations > 266 > Newton solver:???? 3,????? 1.134e-01 (1.00e-30),????? 6.639e-05 (1.00e-10) > ? Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations > 261 > Newton solver:???? 4,????? 4.285e-06 (1.00e-30),????? 2.510e-09 (1.00e-10) > ? Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations > 263 > Newton solver:???? 5,????? 9.650e-12 (1.00e-30),????? 5.652e-15 (1.00e-10) > -- Dr. 
Sebastian Blauth Fraunhofer-Institut f?r Techno- und Wirtschaftsmathematik ITWM Abteilung Transportvorg?nge Fraunhofer-Platz 1, 67663 Kaiserslautern Telefon: +49 631 31600-4968 sebastian.blauth at itwm.fraunhofer.de www.itwm.fraunhofer.de From mfadams at lbl.gov Mon May 8 06:27:32 2023 From: mfadams at lbl.gov (Mark Adams) Date: Mon, 8 May 2023 07:27:32 -0400 Subject: [petsc-users] Scalable Solver for Incompressible Flow In-Reply-To: References: <87cz3i7fj1.fsf@jedbrown.org> <3287ff5f-5ac1-fdff-52d1-97888568c098@itwm.fraunhofer.de> Message-ID: On Mon, May 8, 2023 at 2:32?AM Sebastian Blauth < sebastian.blauth at itwm.fraunhofer.de> wrote: > Hello everyone, > > I wanted to briefly follow up on my question (see my last reply). > Does anyone know / have an idea why the LSC preconditioner in PETSc does > not seem to scale well with the problem size (the outer fgmres solver I > am using nearly scale nearly linearly with the problem size in my example). > I have also already tried using -ksp_diagonal_scale but the results are > identical. > Any help is really appreciated. > I would start by finding results in the literature that you like, in that they are on a problem similar to yours and you like the performance, and reproduce them in your code. If you can do that then you have a research problem to see how to get your problem to work well. If your solver does not reproduce published results then we might be able to provide some advice. Mark > > Thanks a lot, > Sebastian > > On 03.05.2023 09:07, Sebastian Blauth wrote: > > First of all, yes you are correct that I am trying to solve the > > stationary incompressible Navier Stokes equations. > > > > On 02.05.2023 21:33, Matthew Knepley wrote: > >> On Tue, May 2, 2023 at 2:29?PM Jed Brown >> > wrote: > >> > >> Sebastian Blauth >> > writes: > >> > >> > I agree with your comment for the Stokes equations - for these, I > >> have > >> > already tried and used the pressure mass matrix as part of a > >> (additive) > >> > block preconditioner and it gave mesh independent results. > >> > > >> > However, for the Navier Stokes equations, is the Schur complement > >> really > >> > spectrally equivalent to the pressure mass matrix? > >> > >> No, it's not. You'd want something like PCD (better, but not > >> algebraic) or LSC. > >> > > > > I would like to take a look at the LSC preconditioner. For this, I did > > also not achieve mesh-independent results. I am using the following > > options (I know that the tolerances are too high at the moment, but it > > should just illustrate the behavior w.r.t. mesh refinement). Again, I am > > using a simple 2D channel problem for testing purposes. 
> > > > I am using the following options > > > > -ksp_type fgmres > > -ksp_gmres_restart 100 > > -ksp_gmres_cgs_refinement_type refine_ifneeded > > -ksp_max_it 1000 > > -ksp_rtol 1e-10 > > -ksp_atol 1e-30 > > -pc_type fieldsplit > > -pc_fieldsplit_type schur > > -pc_fieldsplit_schur_fact_type full > > -pc_fieldsplit_schur_precondition self > > -fieldsplit_0_ksp_type preonly > > -fieldsplit_0_pc_type lu > > -fieldsplit_1_ksp_type gmres > > -fieldsplit_1_ksp_pc_side right > > -fieldsplit_1_ksp_gmres_restart 100 > > -fieldsplit_1_ksp_gmres_cgs_refinement_type refine_ifneeded > > -fieldsplit_1_ksp_max_it 1000 > > -fieldsplit_1_ksp_rtol 1e-10 > > -fieldsplit_1_ksp_atol 1e-30 > > -fieldsplit_1_pc_type lsc > > -fieldsplit_1_lsc_ksp_type preonly > > -fieldsplit_1_lsc_pc_type lu > > -fieldsplit_1_ksp_converged_reason > > > > Again, the direct solvers are used so that only the influence of the LSC > > preconditioner is seen. I have suitable preconditioners for all of these > > available (using boomeramg). > > > > At the bottom, I attach the output for different discretizations. As you > > can see there, the number of iterations increases nearly linearly with > > the problem size. > > > > I think that the problem could occur due to a wrong scaling. In your > > docs https://petsc.org/release/manualpages/PC/PCLSC/ , you write that > > the LSC preconditioner is implemented as > > > > inv(S) \approx inv(A10 A01) A10 A00 A01 inv(A10 A01) > > > > However, in the book of Elman, Sylvester and Wathen (Finite Elements and > > Fast Iterative Solvers), the LSC preconditioner is defined as > > > > inv(S) \approx inv(A10 inv(T) A01) A10 inv(T) A00 inv(T) A01 > > inv(A10 inv(T) A01) > > > > where T = diag(Q) and Q is the velocity mass matrix. > > > > There is an options -pc_lsc_scale_diag, which states that it uses the > > diagonal of A for scaling. I suppose, that this means, that the diagonal > > of the A00 block is used for scaling - however, this is not the right > > scaling, is it? Even in the source code for the LSC preconditioner, in > > /src/ksp/pc/impls/lsc/lsc.c it is mentioned, that a mass matrix should > > be used... > > Is there any way to implement this in PETSc? Maybe by supplying the mass > > matrix as Pmat? > > > > Thanks a lot in advance, > > Sebastian > > > >> > >> I think you can do a better job than that using something like > >> > >> https://arxiv.org/abs/1810.03315 > >> > >> Basically, you use an augmented Lagrangian thing to make the Schur > >> complement well-conditioned, > >> and then use a special smoother to handle that perturbation. > >> > >> > And even if it is, the convergence is only good for small > >> Reynolds numbers, for moderately high ones the convergence really > >> deteriorates. This is why I am trying to make > >> fieldsplit_schur_precondition selfp work better (this is, if I > >> understand it correctly, a SIMPLE type preconditioner). > >> > >> SIMPLE is for short time steps (not too far from resolving CFL) and > >> bad for steady. This taxonomy is useful, though the problems are > >> super academic and they don't use high aspect ratio. > >> > > > > Okay, I get that I cannot expect the SIMPLE preconditioning > > (schur_precondition selfp) to work efficiently. I guess the reason it > > works for small time steps (for the instationary equations) is that the > > velocity block becomes diagonally dominant in this case, so that diag(A) > > is a good approximation of A. 
> > > > > >> https://doi.org/10.1016/j.jcp.2007.09.026 > >> > >> > >> > >> Thanks, > >> > >> Matt > >> > >> -- > >> What most experimenters take for granted before they begin their > >> experiments is infinitely more interesting than any results to which > >> their experiments lead. > >> -- Norbert Wiener > >> > >> https://www.cse.buffalo.edu/~knepley/ > >> > > > > > > And here is the output of my scaling tests > > > > 8x8 discretization > > > > Newton solver: iter, abs. residual (abs. tol), rel. residual (rel. > tol) > > > > Newton solver: 0, 1.023e+03 (1.00e-30), 1.000e+00 > (1.00e-10) > > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations > 38 > > Newton solver: 1, 1.313e+03 (1.00e-30), 1.283e+00 > (1.00e-10) > > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations > 76 > > Newton solver: 2, 1.198e+02 (1.00e-30), 1.171e-01 > (1.00e-10) > > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations > 74 > > Newton solver: 3, 7.249e-01 (1.00e-30), 7.084e-04 > (1.00e-10) > > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations > 74 > > Newton solver: 4, 3.883e-05 (1.00e-30), 3.795e-08 > (1.00e-10) > > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations > 74 > > Newton solver: 5, 2.778e-12 (1.00e-30), 2.714e-15 > (1.00e-10) > > > > > > > > 16x16 discretization > > > > Newton solver: iter, abs. residual (abs. tol), rel. residual (rel. > tol) > > > > Newton solver: 0, 1.113e+03 (1.00e-30), 1.000e+00 > (1.00e-10) > > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations > 62 > > Newton solver: 1, 8.316e+02 (1.00e-30), 7.475e-01 > (1.00e-10) > > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations > > 141 > > Newton solver: 2, 5.806e+01 (1.00e-30), 5.218e-02 > (1.00e-10) > > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations > > 119 > > Newton solver: 3, 3.309e-01 (1.00e-30), 2.974e-04 > (1.00e-10) > > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations > > 118 > > Newton solver: 4, 9.085e-06 (1.00e-30), 8.166e-09 > (1.00e-10) > > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations > > 120 > > Newton solver: 5, 3.475e-12 (1.00e-30), 3.124e-15 > (1.00e-10) > > > > > > > > 32x32 discretization > > > > Newton solver: iter, abs. residual (abs. tol), rel. residual (rel. > tol) > > > > Newton solver: 0, 1.330e+03 (1.00e-30), 1.000e+00 > (1.00e-10) > > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations > 98 > > Newton solver: 1, 5.913e+02 (1.00e-30), 4.445e-01 > (1.00e-10) > > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations > > 183 > > Newton solver: 2, 3.214e+01 (1.00e-30), 2.416e-02 > (1.00e-10) > > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations > > 152 > > Newton solver: 3, 2.059e-01 (1.00e-30), 1.547e-04 > (1.00e-10) > > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations > > 151 > > Newton solver: 4, 6.949e-06 (1.00e-30), 5.223e-09 > (1.00e-10) > > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations > > 149 > > Newton solver: 5, 5.300e-12 (1.00e-30), 3.983e-15 > (1.00e-10) > > > > > > > > 64x64 discretization > > > > Newton solver: iter, abs. residual (abs. tol), rel. residual (rel. 
> tol) > > > > Newton solver: 0, 1.707e+03 (1.00e-30), 1.000e+00 > (1.00e-10) > > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations > > 198 > > Newton solver: 1, 4.259e+02 (1.00e-30), 2.494e-01 > (1.00e-10) > > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations > > 357 > > Newton solver: 2, 1.706e+01 (1.00e-30), 9.993e-03 > (1.00e-10) > > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations > > 266 > > Newton solver: 3, 1.134e-01 (1.00e-30), 6.639e-05 > (1.00e-10) > > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations > > 261 > > Newton solver: 4, 4.285e-06 (1.00e-30), 2.510e-09 > (1.00e-10) > > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations > > 263 > > Newton solver: 5, 9.650e-12 (1.00e-30), 5.652e-15 > (1.00e-10) > > > > -- > Dr. Sebastian Blauth > Fraunhofer-Institut f?r > Techno- und Wirtschaftsmathematik ITWM > Abteilung Transportvorg?nge > Fraunhofer-Platz 1, 67663 Kaiserslautern > Telefon: +49 631 31600-4968 > sebastian.blauth at itwm.fraunhofer.de > www.itwm.fraunhofer.de > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon May 8 06:42:09 2023 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 8 May 2023 07:42:09 -0400 Subject: [petsc-users] Scalable Solver for Incompressible Flow In-Reply-To: References: <87cz3i7fj1.fsf@jedbrown.org> <3287ff5f-5ac1-fdff-52d1-97888568c098@itwm.fraunhofer.de> Message-ID: On Mon, May 8, 2023 at 7:27?AM Mark Adams wrote: > > > On Mon, May 8, 2023 at 2:32?AM Sebastian Blauth < > sebastian.blauth at itwm.fraunhofer.de> wrote: > >> Hello everyone, >> >> I wanted to briefly follow up on my question (see my last reply). >> Does anyone know / have an idea why the LSC preconditioner in PETSc does >> not seem to scale well with the problem size (the outer fgmres solver I >> am using nearly scale nearly linearly with the problem size in my >> example). >> I have also already tried using -ksp_diagonal_scale but the results are >> identical. >> Any help is really appreciated. >> > > I would start by finding results in the literature that you like, in that > they are on a problem similar to yours and you like the performance, and > reproduce them in your code. > If you can do that then you have a research problem to see how to get your > problem to work well. > If your solver does not reproduce published results then we might be > able to provide some advice. > I have not seen LSC scale well. The only thing I have seen to actually work is patch smoothing (I gave a paper reference). It could be that we have a bug in LSC, but I thought we verified it with the Shuttleworth paper. THanks, Matt > Mark > > >> >> Thanks a lot, >> Sebastian >> >> On 03.05.2023 09:07, Sebastian Blauth wrote: >> > First of all, yes you are correct that I am trying to solve the >> > stationary incompressible Navier Stokes equations. >> > >> > On 02.05.2023 21:33, Matthew Knepley wrote: >> >> On Tue, May 2, 2023 at 2:29?PM Jed Brown > >> > wrote: >> >> >> >> Sebastian Blauth > >> > writes: >> >> >> >> > I agree with your comment for the Stokes equations - for these, >> I >> >> have >> >> > already tried and used the pressure mass matrix as part of a >> >> (additive) >> >> > block preconditioner and it gave mesh independent results. >> >> > >> >> > However, for the Navier Stokes equations, is the Schur >> complement >> >> really >> >> > spectrally equivalent to the pressure mass matrix? >> >> >> >> No, it's not. 
You'd want something like PCD (better, but not >> >> algebraic) or LSC. >> >> >> > >> > I would like to take a look at the LSC preconditioner. For this, I did >> > also not achieve mesh-independent results. I am using the following >> > options (I know that the tolerances are too high at the moment, but it >> > should just illustrate the behavior w.r.t. mesh refinement). Again, I >> am >> > using a simple 2D channel problem for testing purposes. >> > >> > I am using the following options >> > >> > -ksp_type fgmres >> > -ksp_gmres_restart 100 >> > -ksp_gmres_cgs_refinement_type refine_ifneeded >> > -ksp_max_it 1000 >> > -ksp_rtol 1e-10 >> > -ksp_atol 1e-30 >> > -pc_type fieldsplit >> > -pc_fieldsplit_type schur >> > -pc_fieldsplit_schur_fact_type full >> > -pc_fieldsplit_schur_precondition self >> > -fieldsplit_0_ksp_type preonly >> > -fieldsplit_0_pc_type lu >> > -fieldsplit_1_ksp_type gmres >> > -fieldsplit_1_ksp_pc_side right >> > -fieldsplit_1_ksp_gmres_restart 100 >> > -fieldsplit_1_ksp_gmres_cgs_refinement_type refine_ifneeded >> > -fieldsplit_1_ksp_max_it 1000 >> > -fieldsplit_1_ksp_rtol 1e-10 >> > -fieldsplit_1_ksp_atol 1e-30 >> > -fieldsplit_1_pc_type lsc >> > -fieldsplit_1_lsc_ksp_type preonly >> > -fieldsplit_1_lsc_pc_type lu >> > -fieldsplit_1_ksp_converged_reason >> > >> > Again, the direct solvers are used so that only the influence of the >> LSC >> > preconditioner is seen. I have suitable preconditioners for all of >> these >> > available (using boomeramg). >> > >> > At the bottom, I attach the output for different discretizations. As >> you >> > can see there, the number of iterations increases nearly linearly with >> > the problem size. >> > >> > I think that the problem could occur due to a wrong scaling. In your >> > docs https://petsc.org/release/manualpages/PC/PCLSC/ , you write that >> > the LSC preconditioner is implemented as >> > >> > inv(S) \approx inv(A10 A01) A10 A00 A01 inv(A10 A01) >> > >> > However, in the book of Elman, Sylvester and Wathen (Finite Elements >> and >> > Fast Iterative Solvers), the LSC preconditioner is defined as >> > >> > inv(S) \approx inv(A10 inv(T) A01) A10 inv(T) A00 inv(T) A01 >> > inv(A10 inv(T) A01) >> > >> > where T = diag(Q) and Q is the velocity mass matrix. >> > >> > There is an options -pc_lsc_scale_diag, which states that it uses the >> > diagonal of A for scaling. I suppose, that this means, that the >> diagonal >> > of the A00 block is used for scaling - however, this is not the right >> > scaling, is it? Even in the source code for the LSC preconditioner, in >> > /src/ksp/pc/impls/lsc/lsc.c it is mentioned, that a mass matrix should >> > be used... >> > Is there any way to implement this in PETSc? Maybe by supplying the >> mass >> > matrix as Pmat? >> > >> > Thanks a lot in advance, >> > Sebastian >> > >> >> >> >> I think you can do a better job than that using something like >> >> >> >> https://arxiv.org/abs/1810.03315 >> >> >> >> Basically, you use an augmented Lagrangian thing to make the Schur >> >> complement well-conditioned, >> >> and then use a special smoother to handle that perturbation. >> >> >> >> > And even if it is, the convergence is only good for small >> >> Reynolds numbers, for moderately high ones the convergence really >> >> deteriorates. This is why I am trying to make >> >> fieldsplit_schur_precondition selfp work better (this is, if I >> >> understand it correctly, a SIMPLE type preconditioner). >> >> >> >> SIMPLE is for short time steps (not too far from resolving CFL) and >> >> bad for steady. 
This taxonomy is useful, though the problems are >> >> super academic and they don't use high aspect ratio. >> >> >> > >> > Okay, I get that I cannot expect the SIMPLE preconditioning >> > (schur_precondition selfp) to work efficiently. I guess the reason it >> > works for small time steps (for the instationary equations) is that the >> > velocity block becomes diagonally dominant in this case, so that >> diag(A) >> > is a good approximation of A. >> > >> > >> >> https://doi.org/10.1016/j.jcp.2007.09.026 >> >> >> >> >> >> >> >> Thanks, >> >> >> >> Matt >> >> >> >> -- >> >> What most experimenters take for granted before they begin their >> >> experiments is infinitely more interesting than any results to which >> >> their experiments lead. >> >> -- Norbert Wiener >> >> >> >> https://www.cse.buffalo.edu/~knepley/ >> >> >> > >> > >> > And here is the output of my scaling tests >> > >> > 8x8 discretization >> > >> > Newton solver: iter, abs. residual (abs. tol), rel. residual (rel. >> tol) >> > >> > Newton solver: 0, 1.023e+03 (1.00e-30), 1.000e+00 >> (1.00e-10) >> > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL >> iterations 38 >> > Newton solver: 1, 1.313e+03 (1.00e-30), 1.283e+00 >> (1.00e-10) >> > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL >> iterations 76 >> > Newton solver: 2, 1.198e+02 (1.00e-30), 1.171e-01 >> (1.00e-10) >> > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL >> iterations 74 >> > Newton solver: 3, 7.249e-01 (1.00e-30), 7.084e-04 >> (1.00e-10) >> > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL >> iterations 74 >> > Newton solver: 4, 3.883e-05 (1.00e-30), 3.795e-08 >> (1.00e-10) >> > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL >> iterations 74 >> > Newton solver: 5, 2.778e-12 (1.00e-30), 2.714e-15 >> (1.00e-10) >> > >> > >> > >> > 16x16 discretization >> > >> > Newton solver: iter, abs. residual (abs. tol), rel. residual (rel. >> tol) >> > >> > Newton solver: 0, 1.113e+03 (1.00e-30), 1.000e+00 >> (1.00e-10) >> > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL >> iterations 62 >> > Newton solver: 1, 8.316e+02 (1.00e-30), 7.475e-01 >> (1.00e-10) >> > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL >> iterations >> > 141 >> > Newton solver: 2, 5.806e+01 (1.00e-30), 5.218e-02 >> (1.00e-10) >> > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL >> iterations >> > 119 >> > Newton solver: 3, 3.309e-01 (1.00e-30), 2.974e-04 >> (1.00e-10) >> > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL >> iterations >> > 118 >> > Newton solver: 4, 9.085e-06 (1.00e-30), 8.166e-09 >> (1.00e-10) >> > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL >> iterations >> > 120 >> > Newton solver: 5, 3.475e-12 (1.00e-30), 3.124e-15 >> (1.00e-10) >> > >> > >> > >> > 32x32 discretization >> > >> > Newton solver: iter, abs. residual (abs. tol), rel. residual (rel. 
>> tol) >> > >> > Newton solver: 0, 1.330e+03 (1.00e-30), 1.000e+00 >> (1.00e-10) >> > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL >> iterations 98 >> > Newton solver: 1, 5.913e+02 (1.00e-30), 4.445e-01 >> (1.00e-10) >> > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL >> iterations >> > 183 >> > Newton solver: 2, 3.214e+01 (1.00e-30), 2.416e-02 >> (1.00e-10) >> > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL >> iterations >> > 152 >> > Newton solver: 3, 2.059e-01 (1.00e-30), 1.547e-04 >> (1.00e-10) >> > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL >> iterations >> > 151 >> > Newton solver: 4, 6.949e-06 (1.00e-30), 5.223e-09 >> (1.00e-10) >> > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL >> iterations >> > 149 >> > Newton solver: 5, 5.300e-12 (1.00e-30), 3.983e-15 >> (1.00e-10) >> > >> > >> > >> > 64x64 discretization >> > >> > Newton solver: iter, abs. residual (abs. tol), rel. residual (rel. >> tol) >> > >> > Newton solver: 0, 1.707e+03 (1.00e-30), 1.000e+00 >> (1.00e-10) >> > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL >> iterations >> > 198 >> > Newton solver: 1, 4.259e+02 (1.00e-30), 2.494e-01 >> (1.00e-10) >> > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL >> iterations >> > 357 >> > Newton solver: 2, 1.706e+01 (1.00e-30), 9.993e-03 >> (1.00e-10) >> > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL >> iterations >> > 266 >> > Newton solver: 3, 1.134e-01 (1.00e-30), 6.639e-05 >> (1.00e-10) >> > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL >> iterations >> > 261 >> > Newton solver: 4, 4.285e-06 (1.00e-30), 2.510e-09 >> (1.00e-10) >> > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL >> iterations >> > 263 >> > Newton solver: 5, 9.650e-12 (1.00e-30), 5.652e-15 >> (1.00e-10) >> > >> >> -- >> Dr. Sebastian Blauth >> Fraunhofer-Institut f?r >> Techno- und Wirtschaftsmathematik ITWM >> Abteilung Transportvorg?nge >> Fraunhofer-Platz 1, 67663 Kaiserslautern >> Telefon: +49 631 31600-4968 >> sebastian.blauth at itwm.fraunhofer.de >> www.itwm.fraunhofer.de >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From franz.pichler at v2c2.at Mon May 8 01:56:09 2023 From: franz.pichler at v2c2.at (Pichler, Franz) Date: Mon, 8 May 2023 06:56:09 +0000 Subject: [petsc-users] Petsc ObjectStateIncrease without proivate header Message-ID: Hello, i am using petsc in a single cpu setup where I have preassembled crs matrices that I wrap via PetSC's MatCreateSeqAIJWithArrays Functionality. Now I manipulate the values of these matrices (wohtout changing the sparsity) without using petsc, When I now want to solve again I have to call PetscObjectStateIncrease((PetscObject)petsc_A); So that oetsc actually solves again (otherwise thinking nothing hs changed , This means I have to include the private header #include Which makes a seamingless implementation of petsc into a cmake process more complicate (This guy has to be stated explicitly in the cmake process at the moment) I would like to resolve that by "going" around the private header, My first intuition was to increase the state by hand ((PetscObject)petsc_A_aux[the_sys])->state++; This is the definition of petscstateincrease in the header. 
This throws me an error: invalid use of incomplete type 'struct _p_PetscObject' (a compilation error).

Is there any elegant way around this?

This is the first time I use the petsc mailing list, so apologies for any beginner's mistake I did in formatting or anything else.

Best regards

Franz Pichler
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From stefano.zampini at gmail.com  Mon May  8 08:31:00 2023
From: stefano.zampini at gmail.com (Stefano Zampini)
Date: Mon, 8 May 2023 16:31:00 +0300
Subject: [petsc-users] Petsc ObjectStateIncrease without proivate header
In-Reply-To: 
References: 
Message-ID: 

You can achieve the same effect by calling MatAssemblyBegin/End

Il giorno lun 8 mag 2023 alle ore 15:54 Pichler, Franz <franz.pichler at v2c2.at> ha scritto:

> Hello,
> i am using petsc in a single cpu setup where I have preassembled crs
> matrices that I wrap via PetSC's MatCreateSeqAIJWithArrays Functionality.
>
> Now I manipulate the values of these matrices (wohtout changing the
> sparsity) without using petsc,
>
> When I now want to solve again I have to call
> PetscObjectStateIncrease((PetscObject)petsc_A);
>
> So that oetsc actually solves again (otherwise thinking nothing hs changed ,
>
> This means I have to include the private header
> #include
>
> Which makes a seamingless implementation of petsc into a cmake process
> more complicate (This guy has to be stated explicitly in the cmake process
> at the moment)
>
> I would like to resolve that by "going" around the private header,
> My first intuition was to increase the state by hand
> ((PetscObject)petsc_A_aux[the_sys])->state++;
> This is the definition of petscstateincrease in the header. This throws me an
> error: invalid use of incomplete type 'struct _p_PetscObject'
>
> compilation error.
>
> Is there any elegeant way around this?
>
> This is the first time I use the petsc mailing list so apologies for any
> beginners mistake I did in formatting or anything else.
>
> Best regards
>
> Franz Pichler
>

-- 
Stefano
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From jed at jedbrown.org  Mon May  8 08:45:51 2023
From: jed at jedbrown.org (Jed Brown)
Date: Mon, 08 May 2023 07:45:51 -0600
Subject: [petsc-users] Scalable Solver for Incompressible Flow
In-Reply-To: 
References: <87cz3i7fj1.fsf@jedbrown.org>
	<3287ff5f-5ac1-fdff-52d1-97888568c098@itwm.fraunhofer.de>
Message-ID: <8735479bsg.fsf@jedbrown.org>

Sebastian Blauth writes:

> Hello everyone,
>
> I wanted to briefly follow up on my question (see my last reply).
> Does anyone know / have an idea why the LSC preconditioner in PETSc does
> not seem to scale well with the problem size (the outer fgmres solver I
> am using nearly scale nearly linearly with the problem size in my example).

The implementation was tested on heterogeneous Stokes problems from geodynamics, and perhaps not on NS (or not with the discretization you're using).

https://doi.org/10.1016/j.pepi.2008.07.036

There is a comment about not having plumbing to provide a mass matrix. A few lines earlier there is code using PetscObjectQuery, and that same pattern could be applied for the mass matrix. If you're on a roughly uniform mesh, including the mass scaling will probably have little effect, but it could have a big impact in the presence of highly anisotropic elements or a broad range of scales.
I don't think LSC has gotten a lot of use since everyone I know who tried it has been sort of disappointed relative to other methods (e.g., inverse viscosity scaled mass matrix for heterogeneous Stokes, PCD for moderate Re Navier-Stokes). Of course there are no steady solutions to high Re so you either have a turbulence model or are time stepping. I'm not aware of work with LSC with turbulence models, and when time stepping, you have enough mass matrix that cheaper preconditioners are good enough. That said, it would be a great contribution to support this scaling. > I have also already tried using -ksp_diagonal_scale but the results are > identical. That's expected, and likely to mess up some MG implementations so I wouldn't recommend it. From danyang.su at gmail.com Mon May 8 17:50:09 2023 From: danyang.su at gmail.com (Danyang Su) Date: Mon, 08 May 2023 15:50:09 -0700 Subject: [petsc-users] Question on ISLocalToGlobalMappingGetIndices Fortran Interface Message-ID: <41BD5A47-A2CB-4A01-ADD2-AA6046F50B11@gmail.com> Dear PETSc-Users, Is there any changes in ISLocalToGlobalMappingGetIndices function after PETSc 3.17? In the previous PETSc version (<= 3.17), the function ?ISLocalToGlobalMappingGetIndices(ltogm,ltog,idltog,ierr)? works fine, even though the value of idltog looks out of bound (-11472655627481), https://www.mcs.anl.gov/petsc/petsc-3.14/src/ksp/ksp/tutorials/ex14f.F90.html. The value of idltog is not clear. In the latest PETSc version, ?this function can be called, but due to the extreme value of idltog, the code fails. I also tried to use ?ISLocalToGlobalMappingGetIndicesF90(ltogm,ltog,ierr)? () but no success. #if (PETSC_VERSION_MAJOR == 3 && PETSC_VERSION_MINOR <= 4) ??????? call DMDAGetGlobalIndicesF90(dmda_flow%da,PETSC_NULL_INTEGER,? & ???????????????????????????????????? idx,ierr) ??????? CHKERRQ(ierr) #else ??????? call DMGetLocalToGlobalMapping(dmda_flow%da,ltogm,ierr) ??????? CHKERRQ(ierr) ??????? call ISLocalToGlobalMappingGetIndices(ltogm,ltog,idltog,ierr) ??????? CHKERRQ(ierr) #endif ??????? ????????dof = dmda_flow%dof ????? ????????do ivol = 1, nngl #if (PETSC_VERSION_MAJOR == 3 && PETSC_VERSION_MINOR <= 4) ??????????? node_idx_lg2pg(ivol) = (idx(ivol*dof)+1)/dof #else ??????????? node_idx_lg2pg(ivol) = (ltog(ivol*dof + idltog)+1)/dof #endif ???????end do Any suggestions on that. Thanks, Danyang -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Mon May 8 17:54:26 2023 From: bsmith at petsc.dev (Barry Smith) Date: Mon, 8 May 2023 18:54:26 -0400 Subject: [petsc-users] Help with KSPSetConvergenceTest In-Reply-To: References: <02578B71-E871-4DC1-A06E-8F96990E5C13@petsc.dev> <7BACC494-991D-478F-8585-882AB997A7CA@petsc.dev> Message-ID: See https://gitlab.com/petsc/petsc/-/merge_requests/6436 with commit barry/2023-05-08/add-ksp-min-its > On May 7, 2023, at 2:22 PM, Edoardo alinovi wrote: > > Hello Barry, > > Mega! Thank you Berry much for providing me with a working example! I ended up in writing this: > > call KSPConvergedDefault(ksp, n, rnorm, flag, PETSC_NULL_FUNCTION, ierr) > if (n flag = 0 > endif > ierr = 0 > > and it looks working but I'll take advantage of your suggestion ;) Is KSPConvergedDefaultDestroy mandatory? > > I know it's easy code, but maybe you might have a think to add this control and expose it as the max number of iterations in KSP. I can tell you it is very much used, I found myself many times in need to tell the solver to iterate regardless of the tolerances criteria. 
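For reference, a minimal C sketch of such a minimum-iteration convergence test (the helper name and the threshold of 5 are illustrative; the Fortran version in the quote does the same thing by resetting the flag to 0):

#include <petscksp.h>

/* Sketch: defer to KSPConvergedDefault() but refuse to declare convergence
   before a minimum number of iterations. */
static PetscErrorCode ConvergedMinIterations(KSP ksp, PetscInt n, PetscReal rnorm, KSPConvergedReason *reason, void *ctx)
{
  PetscFunctionBeginUser;
  PetscCall(KSPConvergedDefault(ksp, n, rnorm, reason, ctx));
  if (n < 5 && *reason > 0) *reason = KSP_CONVERGED_ITERATING; /* keep iterating */
  PetscFunctionReturn(PETSC_SUCCESS);
}

/* registration on an existing KSP ksp:
     void *defaultctx;
     PetscCall(KSPConvergedDefaultCreate(&defaultctx));
     PetscCall(KSPSetConvergenceTest(ksp, ConvergedMinIterations, defaultctx, KSPConvergedDefaultDestroy));
*/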
It happens for example in the RANS equation, especially omega. Sometimes they tend to stall and you do want to tighten the tolerances for a bunch of iters, or you might not know if they do while iterating! > > Cheers -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Mon May 8 21:21:57 2023 From: mfadams at lbl.gov (Mark Adams) Date: Mon, 8 May 2023 22:21:57 -0400 Subject: [petsc-users] Question on ISLocalToGlobalMappingGetIndices Fortran Interface In-Reply-To: <41BD5A47-A2CB-4A01-ADD2-AA6046F50B11@gmail.com> References: <41BD5A47-A2CB-4A01-ADD2-AA6046F50B11@gmail.com> Message-ID: On Mon, May 8, 2023 at 6:50?PM Danyang Su wrote: > Dear PETSc-Users, > > > > Is there any changes in ISLocalToGlobalMappingGetIndices function after > PETSc 3.17? > > > > In the previous PETSc version (<= 3.17), the function > ?ISLocalToGlobalMappingGetIndices(ltogm,ltog,idltog,ierr)? works fine, even > though the value of idltog looks out of bound (-11472655627481), > https://www.mcs.anl.gov/petsc/petsc-3.14/src/ksp/ksp/tutorials/ex14f.F90.html. > The value of idltog is not clear. > > > > In the latest PETSc version, this function can be called, but due to the > extreme value of idltog, the code fails. I also tried to use > ?ISLocalToGlobalMappingGetIndicesF90(ltogm,ltog,ierr)? () but no success. > * You do want the latter: doc/changes/319.rst:- Deprecate ``ISLocalToGlobalMappingGetIndices()`` in favor of ``ISLocalToGlobalMappingGetIndicesF90()`` * You might look at a test: src/ksp/ksp/tutorials/ex14f.F90: PetscCall(ISLocalToGlobalMappingGetIndicesF90(ltogm,ltog,ierr)) * If you use 64 bit integers be careful. * You want to use a memory checker like Valgrind or Sanitize. Mark > > #if (PETSC_VERSION_MAJOR == 3 && PETSC_VERSION_MINOR <= 4) > > call DMDAGetGlobalIndicesF90(dmda_flow%da,PETSC_NULL_INTEGER, & > > idx,ierr) > > CHKERRQ(ierr) > > #else > > call DMGetLocalToGlobalMapping(dmda_flow%da,ltogm,ierr) > > CHKERRQ(ierr) > > call ISLocalToGlobalMappingGetIndices(ltogm,ltog,idltog,ierr) > > CHKERRQ(ierr) > > #endif > > > > dof = dmda_flow%dof > > > > do ivol = 1, nngl > > > > #if (PETSC_VERSION_MAJOR == 3 && PETSC_VERSION_MINOR <= 4) > > node_idx_lg2pg(ivol) = (idx(ivol*dof)+1)/dof > > #else > > node_idx_lg2pg(ivol) = (ltog(ivol*dof + idltog)+1)/dof > > #endif > > end do > > > > Any suggestions on that. > > > > Thanks, > > > > Danyang > -------------- next part -------------- An HTML attachment was scrubbed... URL: From danyang.su at gmail.com Mon May 8 23:24:01 2023 From: danyang.su at gmail.com (Danyang Su) Date: Mon, 08 May 2023 21:24:01 -0700 Subject: [petsc-users] Question on ISLocalToGlobalMappingGetIndices Fortran Interface In-Reply-To: References: <41BD5A47-A2CB-4A01-ADD2-AA6046F50B11@gmail.com> Message-ID: <248BE1B2-092D-4F29-9E9D-FFAF76C0B554@gmail.com> Thanks, Mark. Yes, it actually works when I update to ISLocalToGlobalMappingGetIndicesF90. I made a mistake reporting this does not work. Danyang From: Mark Adams Date: Monday, May 8, 2023 at 7:22 PM To: Danyang Su Cc: petsc-users Subject: Re: [petsc-users] Question on ISLocalToGlobalMappingGetIndices Fortran Interface On Mon, May 8, 2023 at 6:50?PM Danyang Su wrote: Dear PETSc-Users, Is there any changes in ISLocalToGlobalMappingGetIndices function after PETSc 3.17? In the previous PETSc version (<= 3.17), the function ?ISLocalToGlobalMappingGetIndices(ltogm,ltog,idltog,ierr)? 
works fine, even though the value of idltog looks out of bound (-11472655627481), https://www.mcs.anl.gov/petsc/petsc-3.14/src/ksp/ksp/tutorials/ex14f.F90.html. The value of idltog is not clear. In the latest PETSc version, this function can be called, but due to the extreme value of idltog, the code fails. I also tried to use ?ISLocalToGlobalMappingGetIndicesF90(ltogm,ltog,ierr)? () but no success. * You do want the latter: doc/changes/319.rst:- Deprecate ``ISLocalToGlobalMappingGetIndices()`` in favor of ``ISLocalToGlobalMappingGetIndicesF90()`` * You might look at a test: src/ksp/ksp/tutorials/ex14f.F90: PetscCall(ISLocalToGlobalMappingGetIndicesF90(ltogm,ltog,ierr)) * If you use 64 bit integers be careful. * You want to use a memory checker like Valgrind or Sanitize. Mark #if (PETSC_VERSION_MAJOR == 3 && PETSC_VERSION_MINOR <= 4) call DMDAGetGlobalIndicesF90(dmda_flow%da,PETSC_NULL_INTEGER, & idx,ierr) CHKERRQ(ierr) #else call DMGetLocalToGlobalMapping(dmda_flow%da,ltogm,ierr) CHKERRQ(ierr) call ISLocalToGlobalMappingGetIndices(ltogm,ltog,idltog,ierr) CHKERRQ(ierr) #endif dof = dmda_flow%dof do ivol = 1, nngl #if (PETSC_VERSION_MAJOR == 3 && PETSC_VERSION_MINOR <= 4) node_idx_lg2pg(ivol) = (idx(ivol*dof)+1)/dof #else node_idx_lg2pg(ivol) = (ltog(ivol*dof + idltog)+1)/dof #endif end do Any suggestions on that. Thanks, Danyang -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephan.koehler at math.tu-freiberg.de Tue May 9 03:15:38 2023 From: stephan.koehler at math.tu-freiberg.de (=?UTF-8?Q?Stephan_K=c3=b6hler?=) Date: Tue, 9 May 2023 10:15:38 +0200 Subject: [petsc-users] Bug report LMVM matrix class Message-ID: <8a652863-d467-7752-ed32-498a1f56a553@math.tu-freiberg.de> Dear PETSc/Tao team, it seems to be that there is a bug in the LMVM matrix class: The function MatMultAdd_LMVM, see, e.g., https://petsc.org/release/src/ksp/ksp/utils/lmvm/lmvmimpl.c.html at line 114, if the vectors Y and Z are the same, then the result is wrong, since the first MatMult overwrites also the value in Y. Best regards Stephan K?hler -- Stephan K?hler TU Bergakademie Freiberg Institut f?r numerische Mathematik und Optimierung Akademiestra?e 6 09599 Freiberg Geb?udeteil Mittelbau, Zimmer 2.07 Telefon: +49 (0)3731 39-3173 (B?ro) -------------- next part -------------- A non-text attachment was scrubbed... Name: OpenPGP_0xC9BF2C20DFE9F713.asc Type: application/pgp-keys Size: 758 bytes Desc: OpenPGP public key URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: OpenPGP_signature Type: application/pgp-signature Size: 236 bytes Desc: OpenPGP digital signature URL: From edoardo.alinovi at gmail.com Tue May 9 03:40:18 2023 From: edoardo.alinovi at gmail.com (Edoardo alinovi) Date: Tue, 9 May 2023 10:40:18 +0200 Subject: [petsc-users] Help with KSPSetConvergenceTest In-Reply-To: References: <02578B71-E871-4DC1-A06E-8F96990E5C13@petsc.dev> <7BACC494-991D-478F-8585-882AB997A7CA@petsc.dev> Message-ID: Mega! This is gonna be useful for many people :) Il Mar 9 Mag 2023, 00:54 Barry Smith ha scritto: > > See https://gitlab.com/petsc/petsc/-/merge_requests/6436 with commit > barry/2023-05-08/add-ksp-min-its > > > > On May 7, 2023, at 2:22 PM, Edoardo alinovi > wrote: > > Hello Barry, > > Mega! Thank you Berry much for providing me with a working example! 
I > ended up in writing this: > > > > > > * call KSPConvergedDefault(ksp, n, rnorm, flag, > PETSC_NULL_FUNCTION, ierr) if (n flag = 0 endif ierr = 0* > > and it looks working but I'll take advantage of your suggestion ;) Is > KSPConvergedDefaultDestroy mandatory? > > I know it's easy code, but maybe you might have a think to add this > control and expose it as the max number of iterations in KSP. I can tell > you it is very much used, I found myself many times in need to tell the > solver to iterate regardless of the tolerances criteria. It happens for > example in the RANS equation, especially omega. Sometimes they tend to > stall and you do want to tighten the tolerances for a bunch of iters, or > you might not know if they do while iterating! > > Cheers > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue May 9 07:10:12 2023 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 9 May 2023 08:10:12 -0400 Subject: [petsc-users] Bug report LMVM matrix class In-Reply-To: <8a652863-d467-7752-ed32-498a1f56a553@math.tu-freiberg.de> References: <8a652863-d467-7752-ed32-498a1f56a553@math.tu-freiberg.de> Message-ID: On Tue, May 9, 2023 at 4:15?AM Stephan K?hler < stephan.koehler at math.tu-freiberg.de> wrote: > Dear PETSc/Tao team, > > it seems to be that there is a bug in the LMVM matrix class: > > The function MatMultAdd_LMVM, see, e.g., > https://petsc.org/release/src/ksp/ksp/utils/lmvm/lmvmimpl.c.html at line > 114, if the vectors Y and Z are the same, then the result is wrong, > since the first MatMult overwrites also the value in Y. > Yes, the condition for MatMultAdd() is that X is not the same as Z, so we need to either disallow this case, or create a work vector in order to handle it. Todd, which should be done? Thanks, Matt Best regards > Stephan K?hler > > -- > Stephan K?hler > TU Bergakademie Freiberg > Institut f?r numerische Mathematik und Optimierung > > Akademiestra?e 6 > 09599 Freiberg > Geb?udeteil Mittelbau, Zimmer 2.07 > > Telefon: +49 (0)3731 39-3173 (B?ro) > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From leonardo.mutti01 at universitadipavia.it Tue May 9 09:02:00 2023 From: leonardo.mutti01 at universitadipavia.it (LEONARDO MUTTI) Date: Tue, 9 May 2023 16:02:00 +0200 Subject: [petsc-users] Understanding index sets for PCGASM In-Reply-To: <989A8495-06FF-4D8A-8B45-F3D991D0A486@petsc.dev> References: <989A8495-06FF-4D8A-8B45-F3D991D0A486@petsc.dev> Message-ID: Great thanks! I can now successfully run https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/ksp/tests/ex71f.F90. Going forward with my experiments, let me post a new code snippet (very similar to ex71f.F90) that I cannot get to work, probably I must be setting up the IS objects incorrectly. I have an 8x8 matrix A=diag(1,1,2,2,...,2) and a vector b=(0.5,...,0.5). We have only one processor, and I want to solve Ax=b using GASM. In particular, KSP is set to preonly, GASM is the preconditioner and it uses on each submatrix an lu direct solver (sub_ksp = preonly, sub_pc = lu). For the GASM algorithm, I divide A into diag(1,1) and diag(2,2,...,2). For simplicity I set 0 overlap. Now I want to use GASM to solve Ax=b. The code follows. 
#include #include #include USE petscmat USE petscksp USE petscpc USE MPI Mat :: A Vec :: b, x PetscInt :: M, I, J, ISLen, NSub PetscMPIInt :: size PetscErrorCode :: ierr PetscScalar :: v KSP :: ksp PC :: pc IS :: subdomains_IS(2), inflated_IS(2) PetscInt,DIMENSION(4) :: indices_first_domain PetscInt,DIMENSION(36) :: indices_second_domain call PetscInitialize(PETSC_NULL_CHARACTER, ierr) call MPI_Comm_size(PETSC_COMM_WORLD, size, ierr) !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! ! INTRO: create matrix and right hand side !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! WRITE(*,*) "Assembling A,b" M = 8 call MatCreateAIJ(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, & M, M, PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER,A, ierr) DO I=1,M DO J=1,M IF ((I .EQ. J) .AND. (I .LE. 2 )) THEN v = 1 ELSE IF ((I .EQ. J) .AND. (I .GT. 2 )) THEN v = 2 ELSE v = 0 ENDIF call MatSetValue(A, I-1, J-1, v, INSERT_VALUES, ierr) END DO END DO call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr) call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr) call VecCreate(PETSC_COMM_WORLD,b,ierr) call VecSetSizes(b, PETSC_DECIDE, M,ierr) call VecSetFromOptions(b,ierr) do I=1,M v = 0.5 call VecSetValue(b,I-1,v, INSERT_VALUES,ierr) end do call VecAssemblyBegin(b,ierr) call VecAssemblyEnd(b,ierr) !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! ! FIRST KSP/PC SETUP !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! WRITE(*,*) "KSP/PC first setup" call KSPCreate(PETSC_COMM_WORLD, ksp, ierr) call KSPSetOperators(ksp, A, A, ierr) call KSPSetType(ksp, 'preonly', ierr) call KSPGetPC(ksp, pc, ierr) call KSPSetUp(ksp, ierr) call PCSetType(pc, PCGASM, ierr) !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! ! GASM, SETTING SUBDOMAINS !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! WRITE(*,*) "Setting GASM subdomains" ! Let's create the subdomain IS and inflated_IS ! They are equal if no overlap is present ! They are 1: 0,1,8,9 ! 2: 10,...,15,18,...,23,...,58,...,63 indices_first_domain = [0,1,8,9] ! corresponds to diag(1,1) do I=0,5 do J=0,5 indices_second_domain(I*6+1+J) = 18 + J + 8*I ! corresponds to diag(2,2,...,2) !WRITE(*,*) I*6+1+J, 18 + J + 8*I end do end do ! Convert into IS ISLen = 4 call ISCreateGeneral(PETSC_COMM_WORLD,ISLen,indices_first_domain, & PETSC_COPY_VALUES, subdomains_IS(1), ierr) call ISCreateGeneral(PETSC_COMM_WORLD,ISLen,indices_first_domain, & PETSC_COPY_VALUES, inflated_IS(1), ierr) ISLen = 36 call ISCreateGeneral(PETSC_COMM_WORLD,ISLen,indices_second_domain, & PETSC_COPY_VALUES, subdomains_IS(2), ierr) call ISCreateGeneral(PETSC_COMM_WORLD,ISLen,indices_second_domain, & PETSC_COPY_VALUES, inflated_IS(2), ierr) NSub = 2 call PCGASMSetSubdomains(pc,NSub, & subdomains_IS,inflated_IS,ierr) call PCGASMDestroySubdomains(NSub, & subdomains_IS,inflated_IS,ierr) !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! ! GASM: SET SUBSOLVERS !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! WRITE(*,*) "Setting subsolvers for GASM" call PCSetUp(pc, ierr) ! should I add this? call PetscOptionsSetValue(PETSC_NULL_OPTIONS, & "-sub_pc_type", "lu", ierr) call PetscOptionsSetValue(PETSC_NULL_OPTIONS, & "-sub_ksp_type", "preonly", ierr) call KSPSetFromOptions(ksp, ierr) call PCSetFromOptions(pc, ierr) !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! ! DUMMY SOLUTION: DID IT WORK? !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! WRITE(*,*) "Solve" call VecDuplicate(b,x,ierr) call KSPSolve(ksp,b,x,ierr) call MatDestroy(A, ierr) call KSPDestroy(ksp, ierr) call PetscFinalize(ierr) This code is failing in multiple points. 
At call PCSetUp(pc, ierr) it produces: *[0]PETSC ERROR: Argument out of range* *[0]PETSC ERROR: Scatter indices in ix are out of range* *...* *[0]PETSC ERROR: #1 VecScatterCreate() at ***\src\vec\is\sf\INTERF~1\vscat.c:736* *[0]PETSC ERROR: #2 PCSetUp_GASM() at ***\src\ksp\pc\impls\gasm\gasm.c:433* *[0]PETSC ERROR: #3 PCSetUp() at ***\src\ksp\pc\INTERF~1\precon.c:994* And at call KSPSolve(ksp,b,x,ierr) it produces: *forrtl: severe (157): Program Exception - access violation* The index sets are setup coherently with the outputs of e.g. https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/ksp/tests/output/ex71f_1.out: in particular each element of the matrix A corresponds to a number from 0 to 63. Note that each submatrix does not represent some physical subdomain, the subdivision is just at the algebraic level. I thus have the following questions: - is this the correct way of creating the IS objects, given my objective at the beginning of the email? Is the ordering correct? - what am I doing wrong that is generating the above errors? Thanks for the patience and the time. Best, Leonardo Il giorno ven 5 mag 2023 alle ore 18:43 Barry Smith ha scritto: > > Added in *barry/2023-05-04/add-pcgasm-set-subdomains *see also > https://gitlab.com/petsc/petsc/-/merge_requests/6419 > > Barry > > > On May 4, 2023, at 11:23 AM, LEONARDO MUTTI < > leonardo.mutti01 at universitadipavia.it> wrote: > > Thank you for the help. > Adding to my example: > > > * call PCGASMSetSubdomains(pc,NSub, subdomains_IS, inflated_IS,ierr) > call PCGASMDestroySubdomains(NSub,subdomains_IS,inflated_IS,ierr)* > results in: > > * Error LNK2019 unresolved external symbol PCGASMDESTROYSUBDOMAINS > referenced in function ... * > > * Error LNK2019 unresolved external symbol PCGASMSETSUBDOMAINS > referenced in function ... * > I'm not sure if the interfaces are missing or if I have a compilation > problem. > Thank you again. > Best, > Leonardo > > Il giorno sab 29 apr 2023 alle ore 20:30 Barry Smith > ha scritto: > >> >> Thank you for the test code. I have a fix in the branch >> barry/2023-04-29/fix-pcasmcreatesubdomains2d >> with >> merge request https://gitlab.com/petsc/petsc/-/merge_requests/6394 >> >> The functions did not have proper Fortran stubs and interfaces so I >> had to provide them manually in the new branch. >> >> Use >> >> git fetch >> git checkout barry/2023-04-29/fix-pcasmcreatesubdomains2d >> >> ./configure etc >> >> Your now working test code is in src/ksp/ksp/tests/ex71f.F90 I had to >> change things slightly and I updated the error handling for the latest >> version. >> >> Please let us know if you have any later questions. >> >> Barry >> >> >> >> >> On Apr 28, 2023, at 12:07 PM, LEONARDO MUTTI < >> leonardo.mutti01 at universitadipavia.it> wrote: >> >> Hello. I am having a hard time understanding the index sets to feed >> PCGASMSetSubdomains, and I am working in Fortran (as a PETSc novice). 
To >> get more intuition on how the IS objects behave I tried the following >> minimal (non) working example, which should tile a 16x16 matrix into 16 >> square, non-overlapping submatrices: >> >> #include >> #include >> #include >> USE petscmat >> USE petscksp >> USE petscpc >> >> Mat :: A >> PetscInt :: M, NSubx, dof, overlap, NSub >> INTEGER :: I,J >> PetscErrorCode :: ierr >> PetscScalar :: v >> KSP :: ksp >> PC :: pc >> IS :: subdomains_IS, inflated_IS >> >> call PetscInitialize(PETSC_NULL_CHARACTER , ierr) >> >> !-----Create a dummy matrix >> M = 16 >> call MatCreateAIJ(MPI_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, >> & M, M, >> & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, >> & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, >> & A, ierr) >> >> DO I=1,M >> DO J=1,M >> v = I*J >> CALL MatSetValue (A,I-1,J-1,v, >> & INSERT_VALUES , ierr) >> END DO >> END DO >> >> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY , ierr) >> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY , ierr) >> >> !-----Create KSP and PC >> call KSPCreate(PETSC_COMM_WORLD,ksp, ierr) >> call KSPSetOperators(ksp,A,A, ierr) >> call KSPSetType(ksp,"bcgs",ierr) >> call KSPGetPC(ksp,pc,ierr) >> call KSPSetUp(ksp, ierr) >> call PCSetType(pc,PCGASM, ierr) >> call PCSetUp(pc , ierr) >> >> !-----GASM setup >> NSubx = 4 >> dof = 1 >> overlap = 0 >> >> call PCGASMCreateSubdomains2D(pc, >> & M, M, >> & NSubx, NSubx, >> & dof, overlap, >> & NSub, subdomains_IS, inflated_IS, ierr) >> >> call ISView(subdomains_IS, PETSC_VIEWER_STDOUT_WORLD, ierr) >> >> call KSPDestroy(ksp, ierr) >> call PetscFinalize(ierr) >> >> Running this on one processor, I get NSub = 4. >> If PCASM and PCASMCreateSubdomains2D are used instead, I get NSub = 16 as >> expected. >> Moreover, I get in the end "forrtl: severe (157): Program Exception - >> access violation". So: >> 1) why do I get two different results with ASM, and GASM? >> 2) why do I get access violation and how can I solve this? >> In fact, in C, subdomains_IS, inflated_IS should pointers to IS objects. >> As I see on the Fortran interface, the arguments to >> PCGASMCreateSubdomains2D are IS objects: >> >> subroutine PCGASMCreateSubdomains2D(a,b,c,d,e,f,g,h,i,j,z) >> import tPC,tIS >> PC a ! PC >> PetscInt b ! PetscInt >> PetscInt c ! PetscInt >> PetscInt d ! PetscInt >> PetscInt e ! PetscInt >> PetscInt f ! PetscInt >> PetscInt g ! PetscInt >> PetscInt h ! PetscInt >> IS i ! IS >> IS j ! IS >> PetscErrorCode z >> end subroutine PCGASMCreateSubdomains2D >> Thus: >> 3) what should be inside e.g., subdomains_IS? I expect it to contain, for >> every created subdomain, the list of rows and columns defining the subblock >> in the matrix, am I right? >> >> Context: I have a block-tridiagonal system arising from space-time finite >> elements, and I want to solve it with GMRES+PCGASM preconditioner, where >> each overlapping submatrix is on the diagonal and of size 3x3 blocks (and >> spanning multiple processes). This is PETSc 3.17.1 on Windows. >> >> Thanks in advance, >> Leonardo >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
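As an aside, the "Scatter indices in ix are out of range" failure reported above can be caught early by checking the index sets against the matrix size before PCSetUp(). A rough C sketch (untested, helper name illustrative; the Fortran calls are analogous):

#include <petscis.h>
#include <petscmat.h>

/* Sketch: verify every subdomain IS only contains valid global row indices of A,
   i.e. entries in [0, M) where M is the global size of A. */
static PetscErrorCode CheckSubdomainIS(Mat A, PetscInt nsub, IS subdomains[])
{
  PetscInt M, N;

  PetscFunctionBeginUser;
  PetscCall(MatGetSize(A, &M, &N));
  for (PetscInt i = 0; i < nsub; i++) {
    PetscInt lo, hi;
    PetscCall(ISGetMinMax(subdomains[i], &lo, &hi));
    PetscCheck(lo >= 0 && hi < M, PETSC_COMM_SELF, PETSC_ERR_ARG_OUTOFRANGE,
               "Subdomain %" PetscInt_FMT " has index range [%" PetscInt_FMT ",%" PetscInt_FMT "] outside [0,%" PetscInt_FMT ")", i, lo, hi, M);
  }
  PetscFunctionReturn(PETSC_SUCCESS);
}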
URL: From hongzhang at anl.gov Tue May 9 09:15:08 2023 From: hongzhang at anl.gov (Zhang, Hong) Date: Tue, 9 May 2023 14:15:08 +0000 Subject: [petsc-users] Step size setting in TS In-Reply-To: <1c78d9b4bba544d5bf26683dc039c5bb@lanl.gov> References: <1c78d9b4bba544d5bf26683dc039c5bb@lanl.gov> Message-ID: <39113A1A-497E-4BE8-A4A8-DC0AA75AB71E@anl.gov> On May 6, 2023, at 6:24 PM, Jorti, Zakariae via petsc-users wrote: Hello, I have a time-dependent model that I solve using TSSolve. And I am trying to adaptively change the step size (dt). I found that there are some TSAdapt schemes already available. I have tried TSADAPTBASIC and TSADAPTCFL. The former runs without any problems, whereas the latter yields the following error: " [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: No support for this operation for this object type [0]PETSC ERROR: Step rejection not implemented. The CFL implementation is incomplete/unusable " How/where the step rejection should be implemented? Are there any examples available? I think you need to compute the local step size based on CFL for Euler and set it with TSSetCFLTimeLocal(). Besides, I was also attempting to change the step size through the monitor: " SNES snes; TSGetSNES(ts, & snes); SNESConvergedReason reason = SNES_CONVERGED_ITERATING; PetscCall(SNESGetConvergedReason(snes, &reason)); if(reason < 0){ TSSetTimeStep(ts,28618.7); } else{ TSSetTimeStep(ts,57237.4); } " But when I try this solution, the TSStep seems to diverge even though the SNES solver converges (see log file below). Am I doing something wrong here by changing the value of the step size inside the monitor? TS usually checks convergence based on local truncation errors, not on whether SNES converges. It seems that a built-in TS adaptor is being used but you are overwriting the step size chosen by the adaptor. You can add -ts_adapt_type none to turn off the TS adaptor and control the step size by yourself. Alternatively, you can use the TS adaptor and tune the scaling factors with TSAdaptSetClip(). Hong (Mr.) Thank you. 
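A rough sketch of the two options Hong mentions (untested; assumes an already configured TS ts, the helper name and the numerical values are illustrative, 28618.7 being the value from the monitor snippet above):

#include <petscts.h>

static PetscErrorCode ConfigureStepControl(TS ts, PetscBool manual)
{
  TSAdapt adapt;

  PetscFunctionBeginUser;
  PetscCall(TSGetAdapt(ts, &adapt));
  if (manual) {
    /* option 1: turn the adaptor off (same as -ts_adapt_type none) and
       pick the step yourself, e.g. from a pre-step callback */
    PetscCall(TSAdaptSetType(adapt, TSADAPTNONE));
    PetscCall(TSSetTimeStep(ts, 28618.7));
  } else {
    /* option 2: keep the adaptor but clip how far it may shrink or grow dt */
    PetscCall(TSAdaptSetClip(adapt, 0.5, 2.0)); /* illustrative bounds */
  }
  PetscFunctionReturn(PETSC_SUCCESS);
}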
Best, Zakariae ----------------------------------------------------------------- Timestep 0: step size = 57237.4, time = 0., 0 SNES Function norm 2.854104379157e-02 0 KSP Residual norm 2.854104379157e-02 1 KSP Residual norm 8.393431982129e-04 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.985106864162e-01 1 KSP Residual norm 2.509248490095e-04 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 1.985106864162e-01 0 KSP Residual norm 1.985106864162e-01 1 KSP Residual norm 2.739697200542e-04 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.885623182909e-01 1 KSP Residual norm 8.784869181255e-04 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.664114929279e-01 1 KSP Residual norm 2.277701872167e-04 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 1.664114929279e-01 0 KSP Residual norm 1.664114929279e-01 1 KSP Residual norm 2.062369048201e-04 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 6.155996487462e-02 1 KSP Residual norm 8.830688649637e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 3 SNES Function norm 6.155996487462e-02 0 KSP Residual norm 6.155996487462e-02 1 KSP Residual norm 6.231091742747e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.828806540636e-02 1 KSP Residual norm 3.623270770725e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 4 SNES Function norm 1.828806540636e-02 0 KSP Residual norm 1.828806540636e-02 1 KSP Residual norm 7.390325185746e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.659571588896e-02 1 KSP Residual norm 5.280284947190e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 5 SNES Function norm 1.659571588896e-02 0 KSP Residual norm 1.659571588896e-02 1 KSP Residual norm 8.311611006756e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.475636107890e-02 1 KSP Residual norm 1.890387034462e-04 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.635129206894e-02 1 KSP Residual norm 8.192897103059e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 6 SNES Function norm 1.635129206894e-02 0 KSP Residual norm 1.635129206894e-02 1 KSP Residual norm 9.049749505225e-04 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.634946381608e-02 1 KSP Residual norm 9.038879304284e-04 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.635085731721e-02 1 KSP Residual norm 9.050535714933e-04 Linear solve converged due to CONVERGED_RTOL iterations 1 7 SNES Function norm 1.635085731721e-02 0 KSP Residual norm 1.635085731721e-02 1 KSP Residual norm 1.248783858778e-03 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.635003937816e-02 1 KSP Residual norm 1.248746884526e-03 Linear solve converged due to CONVERGED_RTOL iterations 1 8 SNES Function norm 1.635003937816e-02 0 KSP Residual norm 1.635003937816e-02 1 KSP Residual norm 1.631242803481e-02 2 KSP Residual norm 1.551854905139e-02 3 KSP Residual norm 7.586491369138e-03 4 KSP Residual norm 2.304721992110e-04 Linear solve converged due to CONVERGED_RTOL iterations 4 0 KSP Residual norm 1.635003936717e-02 1 KSP Residual norm 1.631242801091e-02 2 KSP Residual norm 1.551784322549e-02 3 KSP Residual norm 7.706018759197e-03 4 KSP Residual norm 2.652435205967e-04 Linear solve converged due to 
CONVERGED_RTOL iterations 4 9 SNES Function norm 1.282692400720e+06 Nonlinear solve did not converge due to DIVERGED_DTOL iterations 9 0 SNES Function norm 2.854104379157e-02 0 KSP Residual norm 2.854104379157e-02 1 KSP Residual norm 5.698410682663e-04 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 8.618385788411e-02 1 KSP Residual norm 9.179214337527e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 8.618385788411e-02 0 KSP Residual norm 8.618385788411e-02 1 KSP Residual norm 9.659611612697e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 2.763975792565e-02 1 KSP Residual norm 2.674728432525e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 2.763975792565e-02 0 KSP Residual norm 2.763975792565e-02 1 KSP Residual norm 1.836844517206e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 2.773540750989e-03 1 KSP Residual norm 1.340353004310e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 3 SNES Function norm 2.773540750989e-03 0 KSP Residual norm 2.773540750989e-03 1 KSP Residual norm 5.382019472069e-06 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 3.441017533038e-04 1 KSP Residual norm 5.218147408275e-07 Linear solve converged due to CONVERGED_RTOL iterations 1 4 SNES Function norm 3.441017533038e-04 0 KSP Residual norm 3.441017533038e-04 1 KSP Residual norm 4.811955757758e-07 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 3.991606932979e-06 1 KSP Residual norm 8.670005394864e-09 Linear solve converged due to CONVERGED_RTOL iterations 1 5 SNES Function norm 3.991606932979e-06 0 KSP Residual norm 3.991606932979e-06 1 KSP Residual norm 8.920137762569e-09 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 7.982782045945e-10 1 KSP Residual norm 3.185856555648e-12 Linear solve converged due to CONVERGED_RTOL iterations 1 6 SNES Function norm 7.982782045945e-10 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 6 0 SNES Function norm 3.475518085003e-04 0 KSP Residual norm 3.475518085003e-04 1 KSP Residual norm 2.328087365783e-04 2 KSP Residual norm 5.062641920970e-09 Linear solve converged due to CONVERGED_RTOL iterations 2 0 KSP Residual norm 3.256441726122e-02 1 KSP Residual norm 8.957860841275e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 3.256441726122e-02 0 KSP Residual norm 3.256441726122e-02 1 KSP Residual norm 5.886144296665e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 4.876417342600e-03 1 KSP Residual norm 5.064113356579e-06 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 4.876417342600e-03 0 KSP Residual norm 4.876417342600e-03 1 KSP Residual norm 7.010281488349e-06 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 6.144574635801e-04 1 KSP Residual norm 9.768499579865e-07 Linear solve converged due to CONVERGED_RTOL iterations 1 3 SNES Function norm 6.144574635801e-04 0 KSP Residual norm 6.144574635801e-04 1 KSP Residual norm 8.324370478600e-07 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 8.721913471286e-06 1 KSP Residual norm 1.370545428753e-08 Linear solve converged due to CONVERGED_RTOL iterations 1 4 SNES Function norm 8.721913471286e-06 0 KSP Residual norm 8.721913471286e-06 1 KSP Residual norm 1.353680905072e-08 Linear solve converged due to 
CONVERGED_RTOL iterations 1 0 KSP Residual norm 3.399977638261e-09 1 KSP Residual norm 5.989016835446e-12 Linear solve converged due to CONVERGED_RTOL iterations 1 5 SNES Function norm 3.399977638261e-09 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 5 0 SNES Function norm 2.854104379157e-02 0 KSP Residual norm 2.854104379157e-02 1 KSP Residual norm 1.733028652796e-04 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 7.490800616133e-03 1 KSP Residual norm 8.231034393757e-06 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 7.490800616133e-03 0 KSP Residual norm 7.490800616133e-03 1 KSP Residual norm 8.349300836114e-06 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.394207859364e-04 1 KSP Residual norm 5.991798196295e-08 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 1.394207859364e-04 0 KSP Residual norm 1.394207859364e-04 1 KSP Residual norm 5.941984242417e-08 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 2.985895331367e-08 1 KSP Residual norm 1.522240176483e-11 Linear solve converged due to CONVERGED_RTOL iterations 1 3 SNES Function norm 2.985895331367e-08 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 3 0 SNES Function norm 1.026110861069e-03 0 KSP Residual norm 1.026110861069e-03 1 KSP Residual norm 1.768297003974e-04 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 7.742520550463e-03 1 KSP Residual norm 4.379214132544e-06 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 7.742520550463e-03 0 KSP Residual norm 7.742520550463e-03 1 KSP Residual norm 4.526371657285e-06 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 2.574467856482e-05 1 KSP Residual norm 2.285626422848e-08 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 2.574467856482e-05 0 KSP Residual norm 2.574467856482e-05 1 KSP Residual norm 2.274220735999e-08 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 4.797870687759e-09 1 KSP Residual norm 2.841037171618e-12 Linear solve converged due to CONVERGED_RTOL iterations 1 3 SNES Function norm 4.797870687759e-09 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 3 0 SNES Function norm 2.854104379157e-02 0 KSP Residual norm 2.854104379157e-02 1 KSP Residual norm 5.979305049211e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 8.195748548281e-04 1 KSP Residual norm 1.017772307359e-06 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 8.195748548281e-04 0 KSP Residual norm 8.195748548281e-04 1 KSP Residual norm 1.026961867317e-06 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.717269750886e-06 1 KSP Residual norm 4.675717893044e-10 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 1.717269750886e-06 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 0 SNES Function norm 1.325295461903e-03 0 KSP Residual norm 1.325295461903e-03 1 KSP Residual norm 7.421950279125e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.330639207844e-03 1 KSP Residual norm 3.764830059967e-07 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 1.330639207844e-03 0 KSP Residual norm 1.330639207844e-03 1 KSP Residual norm 3.764587772248e-07 Linear solve 
converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 2.146996236879e-07 1 KSP Residual norm 1.459926588396e-10 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 2.146996236879e-07 0 KSP Residual norm 2.146996236879e-07 1 KSP Residual norm 1.470407475325e-10 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 2.171052440924e-11 1 KSP Residual norm 1.397958787200e-15 Linear solve converged due to CONVERGED_RTOL iterations 1 3 SNES Function norm 2.171052440924e-11 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 3 0 SNES Function norm 2.854104379157e-02 0 KSP Residual norm 2.854104379157e-02 1 KSP Residual norm 3.067250018894e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.608120509564e-04 1 KSP Residual norm 2.171575389779e-07 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 1.608120509564e-04 0 KSP Residual norm 1.608120509564e-04 1 KSP Residual norm 2.240227170220e-07 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 9.106906778906e-08 1 KSP Residual norm 1.799636201836e-11 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 9.106906778906e-08 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 0 SNES Function norm 1.418151440967e-03 0 KSP Residual norm 1.418151440967e-03 1 KSP Residual norm 3.515151918636e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 2.938727941997e-04 1 KSP Residual norm 6.040565904176e-08 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 2.938727941997e-04 0 KSP Residual norm 2.938727941997e-04 1 KSP Residual norm 6.089297895562e-08 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.093551284505e-08 1 KSP Residual norm 2.891462873664e-12 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 1.093551284505e-08 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 0 SNES Function norm 2.854104379157e-02 0 KSP Residual norm 2.854104379157e-02 1 KSP Residual norm 2.050867218313e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 3.054570961709e-05 1 KSP Residual norm 4.596885059357e-08 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 3.054570961709e-05 0 KSP Residual norm 3.054570961709e-05 1 KSP Residual norm 4.828909801995e-08 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.248104282690e-08 1 KSP Residual norm 8.831951540120e-13 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 1.248104282690e-08 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 0 SNES Function norm 1.460190420374e-03 0 KSP Residual norm 1.460190420374e-03 1 KSP Residual norm 1.580519450452e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 5.884986293527e-05 1 KSP Residual norm 9.020937669734e-09 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 5.884986293527e-05 0 KSP Residual norm 5.884986293527e-05 1 KSP Residual norm 9.054367061995e-09 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.698647259662e-09 1 KSP Residual norm 1.767366286599e-13 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 1.698647259662e-09 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 0 SNES 
Function norm 2.854104379157e-02 0 KSP Residual norm 2.854104379157e-02 1 KSP Residual norm 1.834492501580e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 5.663374849972e-06 1 KSP Residual norm 9.752865227095e-09 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 5.663374849972e-06 0 KSP Residual norm 5.663374849972e-06 1 KSP Residual norm 1.014567990183e-08 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 2.600155584863e-09 1 KSP Residual norm 8.580477366667e-14 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 2.600155584863e-09 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 0 SNES Function norm 1.479145244919e-03 0 KSP Residual norm 1.479145244919e-03 1 KSP Residual norm 6.804553002336e-06 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.118291636442e-05 1 KSP Residual norm 1.397149309582e-09 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 1.118291636442e-05 0 KSP Residual norm 1.118291636442e-05 1 KSP Residual norm 1.386385106529e-09 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 2.800461695540e-10 1 KSP Residual norm 3.217319587127e-14 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 2.800461695540e-10 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 0 SNES Function norm 2.854104379157e-02 0 KSP Residual norm 2.854104379157e-02 1 KSP Residual norm 1.776151497065e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.076105708636e-06 1 KSP Residual norm 2.155085223926e-09 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 1.076105708636e-06 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 0 SNES Function norm 1.489507915803e-03 0 KSP Residual norm 1.489507915803e-03 1 KSP Residual norm 2.921538148329e-06 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 2.172405782412e-06 1 KSP Residual norm 2.719699969205e-10 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 2.172405782412e-06 0 KSP Residual norm 2.172405782412e-06 1 KSP Residual norm 2.731813669182e-10 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 5.323734019212e-11 1 KSP Residual norm 6.095298326433e-15 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 5.323734019212e-11 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 0 SNES Function norm 2.854104379157e-02 0 KSP Residual norm 2.854104379157e-02 1 KSP Residual norm 1.746846293829e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 2.376406135686e-07 1 KSP Residual norm 5.310287317591e-10 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 2.376406135686e-07 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 0 SNES Function norm 1.497239199277e-03 0 KSP Residual norm 1.497239199277e-03 1 KSP Residual norm 1.251830706962e-06 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 5.023668872625e-07 1 KSP Residual norm 7.656373075632e-11 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 5.023668872625e-07 0 KSP Residual norm 5.023668872625e-07 1 KSP Residual norm 7.473091471413e-11 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 
2.003331377435e-11 1 KSP Residual norm 1.347408932811e-15 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 2.003331377435e-11 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 0 SNES Function norm 2.854104379157e-02 0 KSP Residual norm 2.854104379157e-02 1 KSP Residual norm 1.757419385096e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 7.310279661922e-08 1 KSP Residual norm 1.417918576189e-10 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 7.310279661922e-08 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 0 SNES Function norm 1.503392083665e-03 0 KSP Residual norm 1.503392083665e-03 1 KSP Residual norm 5.327548050375e-07 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.611770751552e-07 1 KSP Residual norm 2.611036263545e-11 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 1.611770751552e-07 0 KSP Residual norm 1.611770751552e-07 1 KSP Residual norm 2.607692240234e-11 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 1.627978707015e-11 1 KSP Residual norm 4.254078683385e-16 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 1.627978707015e-11 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 0 SNES Function norm 2.854104379157e-02 0 KSP Residual norm 2.854104379157e-02 1 KSP Residual norm 1.747957348554e-05 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 2.999731614900e-08 1 KSP Residual norm 3.643836796648e-11 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 2.999731614900e-08 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 0 SNES Function norm 1.507740473908e-03 0 KSP Residual norm 1.507740473908e-03 1 KSP Residual norm 2.263876724496e-07 Linear solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 6.390049329095e-08 1 KSP Residual norm 9.745723464126e-12 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 6.390049329095e-08 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: TSStep has failed due to DIVERGED_STEP_REJECTED -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue May 9 09:24:38 2023 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 9 May 2023 10:24:38 -0400 Subject: [petsc-users] Understanding index sets for PCGASM In-Reply-To: References: <989A8495-06FF-4D8A-8B45-F3D991D0A486@petsc.dev> Message-ID: On Tue, May 9, 2023 at 10:05?AM LEONARDO MUTTI < leonardo.mutti01 at universitadipavia.it> wrote: > Great thanks! I can now successfully run > https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/ksp/tests/ex71f.F90. > > Going forward with my experiments, let me post a new code snippet (very > similar to ex71f.F90) that I cannot get to work, probably I must be > setting up the IS objects incorrectly. > > I have an 8x8 matrix A=diag(1,1,2,2,...,2) and a vector b=(0.5,...,0.5). > We have only one processor, and I want to solve Ax=b using GASM. In > particular, KSP is set to preonly, GASM is the preconditioner and it uses > on each submatrix an lu direct solver (sub_ksp = preonly, sub_pc = lu). > > For the GASM algorithm, I divide A into diag(1,1) and diag(2,2,...,2). 
For > simplicity I set 0 overlap. Now I want to use GASM to solve Ax=b. The code > follows. > > #include > #include > #include > USE petscmat > USE petscksp > USE petscpc > USE MPI > > Mat :: A > Vec :: b, x > PetscInt :: M, I, J, ISLen, NSub > PetscMPIInt :: size > PetscErrorCode :: ierr > PetscScalar :: v > KSP :: ksp > PC :: pc > IS :: subdomains_IS(2), inflated_IS(2) > PetscInt,DIMENSION(4) :: indices_first_domain > PetscInt,DIMENSION(36) :: indices_second_domain > > call PetscInitialize(PETSC_NULL_CHARACTER, ierr) > call MPI_Comm_size(PETSC_COMM_WORLD, size, ierr) > > > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > ! INTRO: create matrix and right hand side > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > > WRITE(*,*) "Assembling A,b" > > M = 8 > call MatCreateAIJ(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, > & M, M, PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, > & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER,A, ierr) > DO I=1,M > DO J=1,M > IF ((I .EQ. J) .AND. (I .LE. 2 )) THEN > v = 1 > ELSE IF ((I .EQ. J) .AND. (I .GT. 2 )) THEN > v = 2 > ELSE > v = 0 > ENDIF > call MatSetValue(A, I-1, J-1, v, INSERT_VALUES, ierr) > END DO > END DO > > call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr) > > call VecCreate(PETSC_COMM_WORLD,b,ierr) > call VecSetSizes(b, PETSC_DECIDE, M,ierr) > call VecSetFromOptions(b,ierr) > > do I=1,M > v = 0.5 > call VecSetValue(b,I-1,v, INSERT_VALUES,ierr) > end do > > call VecAssemblyBegin(b,ierr) > call VecAssemblyEnd(b,ierr) > > > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > ! FIRST KSP/PC SETUP > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > > WRITE(*,*) "KSP/PC first setup" > > call KSPCreate(PETSC_COMM_WORLD, ksp, ierr) > call KSPSetOperators(ksp, A, A, ierr) > call KSPSetType(ksp, 'preonly', ierr) > call KSPGetPC(ksp, pc, ierr) > call KSPSetUp(ksp, ierr) > call PCSetType(pc, PCGASM, ierr) > > > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > ! GASM, SETTING SUBDOMAINS > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > > WRITE(*,*) "Setting GASM subdomains" > > ! Let's create the subdomain IS and inflated_IS > ! They are equal if no overlap is present > ! They are 1: 0,1,8,9 > ! 2: 10,...,15,18,...,23,...,58,...,63 > > indices_first_domain = [0,1,8,9] ! corresponds to diag(1,1) > do I=0,5 > do J=0,5 > indices_second_domain(I*6+1+J) = 18 + J + 8*I ! corresponds to > diag(2,2,...,2) > !WRITE(*,*) I*6+1+J, 18 + J + 8*I > end do > end do > > ! Convert into IS > ISLen = 4 > call ISCreateGeneral(PETSC_COMM_WORLD,ISLen,indices_first_domain, > & PETSC_COPY_VALUES, subdomains_IS(1), ierr) > call ISCreateGeneral(PETSC_COMM_WORLD,ISLen,indices_first_domain, > & PETSC_COPY_VALUES, inflated_IS(1), ierr) > ISLen = 36 > call ISCreateGeneral(PETSC_COMM_WORLD,ISLen,indices_second_domain, > & PETSC_COPY_VALUES, subdomains_IS(2), ierr) > call ISCreateGeneral(PETSC_COMM_WORLD,ISLen,indices_second_domain, > & PETSC_COPY_VALUES, inflated_IS(2), ierr) > > NSub = 2 > call PCGASMSetSubdomains(pc,NSub, > & subdomains_IS,inflated_IS,ierr) > call PCGASMDestroySubdomains(NSub, > & subdomains_IS,inflated_IS,ierr) > > > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > ! GASM: SET SUBSOLVERS > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > > WRITE(*,*) "Setting subsolvers for GASM" > > call PCSetUp(pc, ierr) ! should I add this? > > call PetscOptionsSetValue(PETSC_NULL_OPTIONS, > & "-sub_pc_type", "lu", ierr) > call PetscOptionsSetValue(PETSC_NULL_OPTIONS, > & "-sub_ksp_type", "preonly", ierr) > > call KSPSetFromOptions(ksp, ierr) > call PCSetFromOptions(pc, ierr) > > > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > ! 
DUMMY SOLUTION: DID IT WORK? > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > > WRITE(*,*) "Solve" > > call VecDuplicate(b,x,ierr) > call KSPSolve(ksp,b,x,ierr) > > call MatDestroy(A, ierr) > call KSPDestroy(ksp, ierr) > call PetscFinalize(ierr) > > This code is failing in multiple points. At call PCSetUp(pc, ierr) it > produces: > > *[0]PETSC ERROR: Argument out of range* > *[0]PETSC ERROR: Scatter indices in ix are out of range* > *...* > *[0]PETSC ERROR: #1 VecScatterCreate() at > ***\src\vec\is\sf\INTERF~1\vscat.c:736* > *[0]PETSC ERROR: #2 PCSetUp_GASM() at ***\src\ksp\pc\impls\gasm\gasm.c:433* > *[0]PETSC ERROR: #3 PCSetUp() at ***\src\ksp\pc\INTERF~1\precon.c:994* > > And at call KSPSolve(ksp,b,x,ierr) it produces: > > *forrtl: severe (157): Program Exception - access violation* > > > The index sets are setup coherently with the outputs of e.g. > https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/ksp/tests/output/ex71f_1.out: > in particular each element of the matrix A corresponds to a number from 0 > to 63. > This is not correct, I believe. The indices are row/col indices, not indices into dense blocks, so for your example, they are all in [0, 8]. Thanks, Matt > Note that each submatrix does not represent some physical subdomain, the > subdivision is just at the algebraic level. > I thus have the following questions: > > - is this the correct way of creating the IS objects, given my > objective at the beginning of the email? Is the ordering correct? > - what am I doing wrong that is generating the above errors? > > Thanks for the patience and the time. > Best, > Leonardo > > Il giorno ven 5 mag 2023 alle ore 18:43 Barry Smith ha > scritto: > >> >> Added in *barry/2023-05-04/add-pcgasm-set-subdomains *see also >> https://gitlab.com/petsc/petsc/-/merge_requests/6419 >> >> Barry >> >> >> On May 4, 2023, at 11:23 AM, LEONARDO MUTTI < >> leonardo.mutti01 at universitadipavia.it> wrote: >> >> Thank you for the help. >> Adding to my example: >> >> >> * call PCGASMSetSubdomains(pc,NSub, subdomains_IS, >> inflated_IS,ierr) call >> PCGASMDestroySubdomains(NSub,subdomains_IS,inflated_IS,ierr)* >> results in: >> >> * Error LNK2019 unresolved external symbol PCGASMDESTROYSUBDOMAINS >> referenced in function ... * >> >> * Error LNK2019 unresolved external symbol PCGASMSETSUBDOMAINS >> referenced in function ... * >> I'm not sure if the interfaces are missing or if I have a compilation >> problem. >> Thank you again. >> Best, >> Leonardo >> >> Il giorno sab 29 apr 2023 alle ore 20:30 Barry Smith >> ha scritto: >> >>> >>> Thank you for the test code. I have a fix in the branch >>> barry/2023-04-29/fix-pcasmcreatesubdomains2d >>> with >>> merge request https://gitlab.com/petsc/petsc/-/merge_requests/6394 >>> >>> The functions did not have proper Fortran stubs and interfaces so I >>> had to provide them manually in the new branch. >>> >>> Use >>> >>> git fetch >>> git checkout barry/2023-04-29/fix-pcasmcreatesubdomains2d >>> >>> ./configure etc >>> >>> Your now working test code is in src/ksp/ksp/tests/ex71f.F90 I had >>> to change things slightly and I updated the error handling for the latest >>> version. >>> >>> Please let us know if you have any later questions. >>> >>> Barry >>> >>> >>> >>> >>> On Apr 28, 2023, at 12:07 PM, LEONARDO MUTTI < >>> leonardo.mutti01 at universitadipavia.it> wrote: >>> >>> Hello. I am having a hard time understanding the index sets to feed >>> PCGASMSetSubdomains, and I am working in Fortran (as a PETSc novice). 
To >>> get more intuition on how the IS objects behave I tried the following >>> minimal (non) working example, which should tile a 16x16 matrix into 16 >>> square, non-overlapping submatrices: >>> >>> #include >>> #include >>> #include >>> USE petscmat >>> USE petscksp >>> USE petscpc >>> >>> Mat :: A >>> PetscInt :: M, NSubx, dof, overlap, NSub >>> INTEGER :: I,J >>> PetscErrorCode :: ierr >>> PetscScalar :: v >>> KSP :: ksp >>> PC :: pc >>> IS :: subdomains_IS, inflated_IS >>> >>> call PetscInitialize(PETSC_NULL_CHARACTER , ierr) >>> >>> !-----Create a dummy matrix >>> M = 16 >>> call MatCreateAIJ(MPI_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, >>> & M, M, >>> & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, >>> & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, >>> & A, ierr) >>> >>> DO I=1,M >>> DO J=1,M >>> v = I*J >>> CALL MatSetValue (A,I-1,J-1,v, >>> & INSERT_VALUES , ierr) >>> END DO >>> END DO >>> >>> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY , ierr) >>> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY , ierr) >>> >>> !-----Create KSP and PC >>> call KSPCreate(PETSC_COMM_WORLD,ksp, ierr) >>> call KSPSetOperators(ksp,A,A, ierr) >>> call KSPSetType(ksp,"bcgs",ierr) >>> call KSPGetPC(ksp,pc,ierr) >>> call KSPSetUp(ksp, ierr) >>> call PCSetType(pc,PCGASM, ierr) >>> call PCSetUp(pc , ierr) >>> >>> !-----GASM setup >>> NSubx = 4 >>> dof = 1 >>> overlap = 0 >>> >>> call PCGASMCreateSubdomains2D(pc, >>> & M, M, >>> & NSubx, NSubx, >>> & dof, overlap, >>> & NSub, subdomains_IS, inflated_IS, ierr) >>> >>> call ISView(subdomains_IS, PETSC_VIEWER_STDOUT_WORLD, ierr) >>> >>> call KSPDestroy(ksp, ierr) >>> call PetscFinalize(ierr) >>> >>> Running this on one processor, I get NSub = 4. >>> If PCASM and PCASMCreateSubdomains2D are used instead, I get NSub = 16 >>> as expected. >>> Moreover, I get in the end "forrtl: severe (157): Program Exception - >>> access violation". So: >>> 1) why do I get two different results with ASM, and GASM? >>> 2) why do I get access violation and how can I solve this? >>> In fact, in C, subdomains_IS, inflated_IS should pointers to IS objects. >>> As I see on the Fortran interface, the arguments to >>> PCGASMCreateSubdomains2D are IS objects: >>> >>> subroutine PCGASMCreateSubdomains2D(a,b,c,d,e,f,g,h,i,j,z) >>> import tPC,tIS >>> PC a ! PC >>> PetscInt b ! PetscInt >>> PetscInt c ! PetscInt >>> PetscInt d ! PetscInt >>> PetscInt e ! PetscInt >>> PetscInt f ! PetscInt >>> PetscInt g ! PetscInt >>> PetscInt h ! PetscInt >>> IS i ! IS >>> IS j ! IS >>> PetscErrorCode z >>> end subroutine PCGASMCreateSubdomains2D >>> Thus: >>> 3) what should be inside e.g., subdomains_IS? I expect it to contain, >>> for every created subdomain, the list of rows and columns defining the >>> subblock in the matrix, am I right? >>> >>> Context: I have a block-tridiagonal system arising from space-time >>> finite elements, and I want to solve it with GMRES+PCGASM preconditioner, >>> where each overlapping submatrix is on the diagonal and of size 3x3 blocks >>> (and spanning multiple processes). This is PETSc 3.17.1 on Windows. >>> >>> Thanks in advance, >>> Leonardo >>> >>> >>> >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
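To make the remark about row/column indices concrete for the 8x8 example: with the split diag(1,1) / diag(2,...,2), the two index sets would simply hold the row numbers {0,1} and {2,...,7}. A rough sketch in C (untested, helper name illustrative; the Fortran interface is analogous, and on one process PETSC_COMM_SELF and PETSC_COMM_WORLD coincide):

#include <petscis.h>

/* Untested sketch: the subdomain index sets for the 8x8 example hold matrix row
   numbers in [0,8), not positions of individual matrix entries. */
static PetscErrorCode BuildSubdomainIS(IS subdomains[2])
{
  const PetscInt first[]  = {0, 1};
  const PetscInt second[] = {2, 3, 4, 5, 6, 7};

  PetscFunctionBeginUser;
  PetscCall(ISCreateGeneral(PETSC_COMM_SELF, 2, first, PETSC_COPY_VALUES, &subdomains[0]));
  PetscCall(ISCreateGeneral(PETSC_COMM_SELF, 6, second, PETSC_COPY_VALUES, &subdomains[1]));
  PetscFunctionReturn(PETSC_SUCCESS);
}

With zero overlap the inflated index sets would contain the same entries.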
URL: From bsmith at petsc.dev Tue May 9 10:14:07 2023 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 9 May 2023 11:14:07 -0400 Subject: [petsc-users] Final reminder to register for the PETSc user conference June 5-7 if you plan to attend Message-ID: <110DA069-35B4-42B0-84A6-C6F51A850B2C@petsc.dev> One final reminder to please register for the PETSc user conference at https://www.eventbrite.com/e/petsc-2023-user-meeting-tickets-494165441137 now if you plan to attend. We need numbers for ordering food etc. The preliminary agenda can be found at https://petsc.org/release/community/meetings/2023/#meeting Thanks Barry -------------- next part -------------- An HTML attachment was scrubbed... URL: From leonardo.mutti01 at universitadipavia.it Tue May 9 11:31:29 2023 From: leonardo.mutti01 at universitadipavia.it (LEONARDO MUTTI) Date: Tue, 9 May 2023 18:31:29 +0200 Subject: [petsc-users] Fwd: Understanding index sets for PCGASM In-Reply-To: References: <989A8495-06FF-4D8A-8B45-F3D991D0A486@petsc.dev> Message-ID: ---------- Forwarded message --------- Da: LEONARDO MUTTI Date: mar 9 mag 2023 alle ore 18:29 Subject: Re: [petsc-users] Understanding index sets for PCGASM To: Matthew Knepley Thank you for your answer, but I am still confused, sorry. Consider https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/ksp/tests/ex71f.F90 on one processor. Let M=12 for the sake of simplicity, i.e. we deal with a 12x12 2D grid, hence, a 144x144 matrix. Let NSubx = 3, so that on the grid we do 3 vertical and 3 horizontal subdivisions. We should obtain 9 subdomains that are grids of 4x4 nodes each, thus corresponding to 9 submatrices of size 16x16. In my run I obtain NSub = 9 (great) and subdomain_IS(i), i=1,...,9, reads: *IS Object: 1 MPI process* * type: general* *Number of indices in set 16* *0 0* *1 1* *2 2* *3 3* *4 12* *5 13* *6 14* *7 15* *8 24* *9 25* *10 26* *11 27* *12 36* *13 37* *14 38* *15 39* *IS Object: 1 MPI process* * type: general* *Number of indices in set 16* *0 4* *1 5* *2 6* *3 7* *4 16* *5 17* *6 18* *7 19* *8 28* *9 29* *10 30* *11 31* *12 40* *13 41* *14 42* *15 43* *IS Object: 1 MPI process* * type: general* *Number of indices in set 16* *0 8* *1 9* *2 10* *3 11* *4 20* *5 21* *6 22* *7 23* *8 32* *9 33* *10 34* *11 35* *12 44* *13 45* *14 46* *15 47* *IS Object: 1 MPI process* * type: general* *Number of indices in set 16* *0 48* *1 49* *2 50* *3 51* *4 60* *5 61* *6 62* *7 63* *8 72* *9 73* *10 74* *11 75* *12 84* *13 85* *14 86* *15 87* *IS Object: 1 MPI process* * type: general* *Number of indices in set 16* *0 52* *1 53* *2 54* *3 55* *4 64* *5 65* *6 66* *7 67* *8 76* *9 77* *10 78* *11 79* *12 88* *13 89* *14 90* *15 91* *IS Object: 1 MPI process* * type: general* *Number of indices in set 16* *0 56* *1 57* *2 58* *3 59* *4 68* *5 69* *6 70* *7 71* *8 80* *9 81* *10 82* *11 83* *12 92* *13 93* *14 94* *15 95* *IS Object: 1 MPI process* * type: general* *Number of indices in set 16* *0 96* *1 97* *2 98* *3 99* *4 108* *5 109* *6 110* *7 111* *8 120* *9 121* *10 122* *11 123* *12 132* *13 133* *14 134* *15 135* *IS Object: 1 MPI process* * type: general* *Number of indices in set 16* *0 100* *1 101* *2 102* *3 103* *4 112* *5 113* *6 114* *7 115* *8 124* *9 125* *10 126* *11 127* *12 136* *13 137* *14 138* *15 139* *IS Object: 1 MPI process* * type: general* *Number of indices in set 16* *0 104* *1 105* *2 106* *3 107* *4 116* *5 117* *6 118* *7 119* *8 128* *9 129* *10 130* *11 131* *12 140* *13 141* *14 142* *15 143* As you said, no number here reaches 144. 
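If I read the first of these sets correctly, it is exactly what the little Fortran sketch below would produce. This is only my own reconstruction of the pattern (not code taken from PETSc), assuming the nodes of the 12x12 grid are numbered row by row from 0 to 143 and that each IS simply lists the global node numbers of one 4x4 patch; the variable names are mine:

      ! Sketch: rebuild the first subdomain IS printed above,
      ! assuming row-by-row node numbering 0..143 on the 12x12 grid.
      PetscInt, DIMENSION(16) :: idx
      PetscInt :: I, J
      DO I = 0,3                         ! grid rows 0..3 of the patch
         DO J = 0,3                      ! grid columns 0..3 of the patch
            idx(I*4 + J + 1) = I*12 + J  ! gives 0,1,2,3,12,...,15,...,36,...,39
         END DO
      END DO
      ! then, as in my code above:
      ! call ISCreateGeneral(PETSC_COMM_WORLD, 16, idx,
      !    &                 PETSC_COPY_VALUES, subdomains_IS(1), ierr)

So each entry seems to be the global number of one grid node of the patch.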
But the number stored in subdomain_IS are 9x16= #subdomains x 16, whereas I would expect, also given your latest reply, 9x16x16x2=#subdomains x submatrix height x submatrix width x length of a (row,column) pair. It would really help me if you could briefly explain how the output above encodes the subdivision into subdomains. Many thanks again, Leonardo Il giorno mar 9 mag 2023 alle ore 16:24 Matthew Knepley ha scritto: > On Tue, May 9, 2023 at 10:05?AM LEONARDO MUTTI < > leonardo.mutti01 at universitadipavia.it> wrote: > >> Great thanks! I can now successfully run >> https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/ksp/tests/ex71f.F90. >> >> Going forward with my experiments, let me post a new code snippet (very >> similar to ex71f.F90) that I cannot get to work, probably I must be >> setting up the IS objects incorrectly. >> >> I have an 8x8 matrix A=diag(1,1,2,2,...,2) and a vector b=(0.5,...,0.5). >> We have only one processor, and I want to solve Ax=b using GASM. In >> particular, KSP is set to preonly, GASM is the preconditioner and it uses >> on each submatrix an lu direct solver (sub_ksp = preonly, sub_pc = lu). >> >> For the GASM algorithm, I divide A into diag(1,1) and diag(2,2,...,2). >> For simplicity I set 0 overlap. Now I want to use GASM to solve Ax=b. The >> code follows. >> >> #include >> #include >> #include >> USE petscmat >> USE petscksp >> USE petscpc >> USE MPI >> >> Mat :: A >> Vec :: b, x >> PetscInt :: M, I, J, ISLen, NSub >> PetscMPIInt :: size >> PetscErrorCode :: ierr >> PetscScalar :: v >> KSP :: ksp >> PC :: pc >> IS :: subdomains_IS(2), inflated_IS(2) >> PetscInt,DIMENSION(4) :: indices_first_domain >> PetscInt,DIMENSION(36) :: indices_second_domain >> >> call PetscInitialize(PETSC_NULL_CHARACTER, ierr) >> call MPI_Comm_size(PETSC_COMM_WORLD, size, ierr) >> >> >> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >> ! INTRO: create matrix and right hand side >> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >> >> WRITE(*,*) "Assembling A,b" >> >> M = 8 >> call MatCreateAIJ(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, >> & M, M, PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, >> & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER,A, ierr) >> DO I=1,M >> DO J=1,M >> IF ((I .EQ. J) .AND. (I .LE. 2 )) THEN >> v = 1 >> ELSE IF ((I .EQ. J) .AND. (I .GT. 2 )) THEN >> v = 2 >> ELSE >> v = 0 >> ENDIF >> call MatSetValue(A, I-1, J-1, v, INSERT_VALUES, ierr) >> END DO >> END DO >> >> call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr) >> call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr) >> >> call VecCreate(PETSC_COMM_WORLD,b,ierr) >> call VecSetSizes(b, PETSC_DECIDE, M,ierr) >> call VecSetFromOptions(b,ierr) >> >> do I=1,M >> v = 0.5 >> call VecSetValue(b,I-1,v, INSERT_VALUES,ierr) >> end do >> >> call VecAssemblyBegin(b,ierr) >> call VecAssemblyEnd(b,ierr) >> >> >> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >> ! FIRST KSP/PC SETUP >> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >> >> WRITE(*,*) "KSP/PC first setup" >> >> call KSPCreate(PETSC_COMM_WORLD, ksp, ierr) >> call KSPSetOperators(ksp, A, A, ierr) >> call KSPSetType(ksp, 'preonly', ierr) >> call KSPGetPC(ksp, pc, ierr) >> call KSPSetUp(ksp, ierr) >> call PCSetType(pc, PCGASM, ierr) >> >> >> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >> ! GASM, SETTING SUBDOMAINS >> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >> >> WRITE(*,*) "Setting GASM subdomains" >> >> ! Let's create the subdomain IS and inflated_IS >> ! They are equal if no overlap is present >> ! They are 1: 0,1,8,9 >> ! 2: 10,...,15,18,...,23,...,58,...,63 >> >> indices_first_domain = [0,1,8,9] ! 
corresponds to diag(1,1) >> do I=0,5 >> do J=0,5 >> indices_second_domain(I*6+1+J) = 18 + J + 8*I ! corresponds >> to diag(2,2,...,2) >> !WRITE(*,*) I*6+1+J, 18 + J + 8*I >> end do >> end do >> >> ! Convert into IS >> ISLen = 4 >> call ISCreateGeneral(PETSC_COMM_WORLD,ISLen,indices_first_domain, >> & PETSC_COPY_VALUES, subdomains_IS(1), ierr) >> call ISCreateGeneral(PETSC_COMM_WORLD,ISLen,indices_first_domain, >> & PETSC_COPY_VALUES, inflated_IS(1), ierr) >> ISLen = 36 >> call ISCreateGeneral(PETSC_COMM_WORLD,ISLen,indices_second_domain, >> & PETSC_COPY_VALUES, subdomains_IS(2), ierr) >> call ISCreateGeneral(PETSC_COMM_WORLD,ISLen,indices_second_domain, >> & PETSC_COPY_VALUES, inflated_IS(2), ierr) >> >> NSub = 2 >> call PCGASMSetSubdomains(pc,NSub, >> & subdomains_IS,inflated_IS,ierr) >> call PCGASMDestroySubdomains(NSub, >> & subdomains_IS,inflated_IS,ierr) >> >> >> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >> ! GASM: SET SUBSOLVERS >> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >> >> WRITE(*,*) "Setting subsolvers for GASM" >> >> call PCSetUp(pc, ierr) ! should I add this? >> >> call PetscOptionsSetValue(PETSC_NULL_OPTIONS, >> & "-sub_pc_type", "lu", ierr) >> call PetscOptionsSetValue(PETSC_NULL_OPTIONS, >> & "-sub_ksp_type", "preonly", ierr) >> >> call KSPSetFromOptions(ksp, ierr) >> call PCSetFromOptions(pc, ierr) >> >> >> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >> ! DUMMY SOLUTION: DID IT WORK? >> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >> >> WRITE(*,*) "Solve" >> >> call VecDuplicate(b,x,ierr) >> call KSPSolve(ksp,b,x,ierr) >> >> call MatDestroy(A, ierr) >> call KSPDestroy(ksp, ierr) >> call PetscFinalize(ierr) >> >> This code is failing in multiple points. At call PCSetUp(pc, ierr) it >> produces: >> >> *[0]PETSC ERROR: Argument out of range* >> *[0]PETSC ERROR: Scatter indices in ix are out of range* >> *...* >> *[0]PETSC ERROR: #1 VecScatterCreate() at >> ***\src\vec\is\sf\INTERF~1\vscat.c:736* >> *[0]PETSC ERROR: #2 PCSetUp_GASM() at >> ***\src\ksp\pc\impls\gasm\gasm.c:433* >> *[0]PETSC ERROR: #3 PCSetUp() at ***\src\ksp\pc\INTERF~1\precon.c:994* >> >> And at call KSPSolve(ksp,b,x,ierr) it produces: >> >> *forrtl: severe (157): Program Exception - access violation* >> >> >> The index sets are setup coherently with the outputs of e.g. >> https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/ksp/tests/output/ex71f_1.out: >> in particular each element of the matrix A corresponds to a number from 0 >> to 63. >> > > This is not correct, I believe. The indices are row/col indices, not > indices into dense blocks, so for > your example, they are all in [0, 8]. > > Thanks, > > Matt > > >> Note that each submatrix does not represent some physical subdomain, the >> subdivision is just at the algebraic level. >> I thus have the following questions: >> >> - is this the correct way of creating the IS objects, given my >> objective at the beginning of the email? Is the ordering correct? >> - what am I doing wrong that is generating the above errors? >> >> Thanks for the patience and the time. >> Best, >> Leonardo >> >> Il giorno ven 5 mag 2023 alle ore 18:43 Barry Smith >> ha scritto: >> >>> >>> Added in *barry/2023-05-04/add-pcgasm-set-subdomains *see also >>> https://gitlab.com/petsc/petsc/-/merge_requests/6419 >>> >>> Barry >>> >>> >>> On May 4, 2023, at 11:23 AM, LEONARDO MUTTI < >>> leonardo.mutti01 at universitadipavia.it> wrote: >>> >>> Thank you for the help. 
>>> Adding to my example: >>> >>> >>> * call PCGASMSetSubdomains(pc,NSub, subdomains_IS, >>> inflated_IS,ierr) call >>> PCGASMDestroySubdomains(NSub,subdomains_IS,inflated_IS,ierr)* >>> results in: >>> >>> * Error LNK2019 unresolved external symbol PCGASMDESTROYSUBDOMAINS >>> referenced in function ... * >>> >>> * Error LNK2019 unresolved external symbol PCGASMSETSUBDOMAINS >>> referenced in function ... * >>> I'm not sure if the interfaces are missing or if I have a compilation >>> problem. >>> Thank you again. >>> Best, >>> Leonardo >>> >>> Il giorno sab 29 apr 2023 alle ore 20:30 Barry Smith >>> ha scritto: >>> >>>> >>>> Thank you for the test code. I have a fix in the branch >>>> barry/2023-04-29/fix-pcasmcreatesubdomains2d >>>> with >>>> merge request https://gitlab.com/petsc/petsc/-/merge_requests/6394 >>>> >>>> The functions did not have proper Fortran stubs and interfaces so I >>>> had to provide them manually in the new branch. >>>> >>>> Use >>>> >>>> git fetch >>>> git checkout barry/2023-04-29/fix-pcasmcreatesubdomains2d >>>> >>>> ./configure etc >>>> >>>> Your now working test code is in src/ksp/ksp/tests/ex71f.F90 I had >>>> to change things slightly and I updated the error handling for the latest >>>> version. >>>> >>>> Please let us know if you have any later questions. >>>> >>>> Barry >>>> >>>> >>>> >>>> >>>> On Apr 28, 2023, at 12:07 PM, LEONARDO MUTTI < >>>> leonardo.mutti01 at universitadipavia.it> wrote: >>>> >>>> Hello. I am having a hard time understanding the index sets to feed >>>> PCGASMSetSubdomains, and I am working in Fortran (as a PETSc novice). To >>>> get more intuition on how the IS objects behave I tried the following >>>> minimal (non) working example, which should tile a 16x16 matrix into 16 >>>> square, non-overlapping submatrices: >>>> >>>> #include >>>> #include >>>> #include >>>> USE petscmat >>>> USE petscksp >>>> USE petscpc >>>> >>>> Mat :: A >>>> PetscInt :: M, NSubx, dof, overlap, NSub >>>> INTEGER :: I,J >>>> PetscErrorCode :: ierr >>>> PetscScalar :: v >>>> KSP :: ksp >>>> PC :: pc >>>> IS :: subdomains_IS, inflated_IS >>>> >>>> call PetscInitialize(PETSC_NULL_CHARACTER , ierr) >>>> >>>> !-----Create a dummy matrix >>>> M = 16 >>>> call MatCreateAIJ(MPI_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, >>>> & M, M, >>>> & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, >>>> & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, >>>> & A, ierr) >>>> >>>> DO I=1,M >>>> DO J=1,M >>>> v = I*J >>>> CALL MatSetValue (A,I-1,J-1,v, >>>> & INSERT_VALUES , ierr) >>>> END DO >>>> END DO >>>> >>>> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY , ierr) >>>> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY , ierr) >>>> >>>> !-----Create KSP and PC >>>> call KSPCreate(PETSC_COMM_WORLD,ksp, ierr) >>>> call KSPSetOperators(ksp,A,A, ierr) >>>> call KSPSetType(ksp,"bcgs",ierr) >>>> call KSPGetPC(ksp,pc,ierr) >>>> call KSPSetUp(ksp, ierr) >>>> call PCSetType(pc,PCGASM, ierr) >>>> call PCSetUp(pc , ierr) >>>> >>>> !-----GASM setup >>>> NSubx = 4 >>>> dof = 1 >>>> overlap = 0 >>>> >>>> call PCGASMCreateSubdomains2D(pc, >>>> & M, M, >>>> & NSubx, NSubx, >>>> & dof, overlap, >>>> & NSub, subdomains_IS, inflated_IS, ierr) >>>> >>>> call ISView(subdomains_IS, PETSC_VIEWER_STDOUT_WORLD, ierr) >>>> >>>> call KSPDestroy(ksp, ierr) >>>> call PetscFinalize(ierr) >>>> >>>> Running this on one processor, I get NSub = 4. >>>> If PCASM and PCASMCreateSubdomains2D are used instead, I get NSub = 16 >>>> as expected. >>>> Moreover, I get in the end "forrtl: severe (157): Program Exception - >>>> access violation". 
So: >>>> 1) why do I get two different results with ASM, and GASM? >>>> 2) why do I get access violation and how can I solve this? >>>> In fact, in C, subdomains_IS, inflated_IS should pointers to IS >>>> objects. As I see on the Fortran interface, the arguments to >>>> PCGASMCreateSubdomains2D are IS objects: >>>> >>>> subroutine PCGASMCreateSubdomains2D(a,b,c,d,e,f,g,h,i,j,z) >>>> import tPC,tIS >>>> PC a ! PC >>>> PetscInt b ! PetscInt >>>> PetscInt c ! PetscInt >>>> PetscInt d ! PetscInt >>>> PetscInt e ! PetscInt >>>> PetscInt f ! PetscInt >>>> PetscInt g ! PetscInt >>>> PetscInt h ! PetscInt >>>> IS i ! IS >>>> IS j ! IS >>>> PetscErrorCode z >>>> end subroutine PCGASMCreateSubdomains2D >>>> Thus: >>>> 3) what should be inside e.g., subdomains_IS? I expect it to contain, >>>> for every created subdomain, the list of rows and columns defining the >>>> subblock in the matrix, am I right? >>>> >>>> Context: I have a block-tridiagonal system arising from space-time >>>> finite elements, and I want to solve it with GMRES+PCGASM preconditioner, >>>> where each overlapping submatrix is on the diagonal and of size 3x3 blocks >>>> (and spanning multiple processes). This is PETSc 3.17.1 on Windows. >>>> >>>> Thanks in advance, >>>> Leonardo >>>> >>>> >>>> >>> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Tue May 9 11:32:27 2023 From: mfadams at lbl.gov (Mark Adams) Date: Tue, 9 May 2023 12:32:27 -0400 Subject: [petsc-users] PCMG questions Message-ID: I have a MG hierarchy that I construct manually with DMRefine and DMPlexExtrude. * The solver works great with chevy/sor but with chevy/sor it converges slowly or I get indefinite PC errors from CG. And the eigen estimates in cheby are really high, like 10-15. * I tried turning galerkin=none and I got this error. Any thoughts on either of these issues? Thanks, Mark [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Must call DMShellSetGlobalVector() or DMShellSetCreateGlobalVector() [0]PETSC ERROR: WARNING! There are option(s) set that were not used! Could be the program crashed before they were used or a spelling mistake, etc! [0]PETSC ERROR: Option left: name:-ksp_converged_reason (no value) source: command line [0]PETSC ERROR: Option left: name:-mg_levels_esteig_ksp_type value: cg source: command line [0]PETSC ERROR: Option left: name:-mg_levels_pc_type value: sor source: command line [0]PETSC ERROR: Option left: name:-options_left (no value) source: command line [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
[0]PETSC ERROR: Petsc Development GIT revision: v3.19.1-224-g9ed82936d20 GIT Date: 2023-05-07 12:33:48 -0400 [0]PETSC ERROR: ./ex96 on a arch-macosx-gnu-O named MarksMac-302.local by markadams Tue May 9 12:26:52 2023 [0]PETSC ERROR: Configure options CFLAGS="-g -Wall" CXXFLAGS="-g -Wall" COPTFLAGS=-O CXXOPTFLAGS=-O --with-cc=/usr/local/opt/llvm/bin/clang --with-cxx=/usr/local/opt/llvm/bin/clang++ --download-mpich --with-strict-petscerrorcode --download-triangle=1 --with-x=0 --with-debugging=0 --download-hdf5=1 PETSC_ARCH=arch-macosx-gnu-O [0]PETSC ERROR: #1 DMCreateGlobalVector_Shell() at /Users/markadams/Codes/petsc/src/dm/impls/shell/dmshell.c:210 [0]PETSC ERROR: #2 DMCreateGlobalVector() at /Users/markadams/Codes/petsc/src/dm/interface/dm.c:1022 [0]PETSC ERROR: #3 DMGetNamedGlobalVector() at /Users/markadams/Codes/petsc/src/dm/interface/dmget.c:377 [0]PETSC ERROR: #4 DMRestrictHook_SNESVecSol() at /Users/markadams/Codes/petsc/src/snes/interface/snes.c:649 [0]PETSC ERROR: #5 DMRestrict() at /Users/markadams/Codes/petsc/src/dm/interface/dm.c:3407 [0]PETSC ERROR: #6 PCSetUp_MG() at /Users/markadams/Codes/petsc/src/ksp/pc/impls/mg/mg.c:1074 [0]PETSC ERROR: #7 PCSetUp() at /Users/markadams/Codes/petsc/src/ksp/pc/interface/precon.c:994 [0]PETSC ERROR: #8 KSPSetUp() at /Users/markadams/Codes/petsc/src/ksp/ksp/interface/itfunc.c:406 [0]PETSC ERROR: #9 KSPSolve_Private() at /Users/markadams/Codes/petsc/src/ksp/ksp/interface/itfunc.c:824 [0]PETSC ERROR: #10 KSPSolve() at /Users/markadams/Codes/petsc/src/ksp/ksp/interface/itfunc.c:1070 [0]PETSC ERROR: #11 SNESSolve_KSPONLY() at /Users/markadams/Codes/petsc/src/snes/impls/ksponly/ksponly.c:48 [0]PETSC ERROR: #12 SNESSolve() at /Users/markadams/Codes/petsc/src/snes/interface/snes.c:4663 [0]PETSC ERROR: #13 main() at ex96.c:433 -------------- next part -------------- An HTML attachment was scrubbed... URL: From leonardo.mutti01 at universitadipavia.it Tue May 9 11:44:57 2023 From: leonardo.mutti01 at universitadipavia.it (LEONARDO MUTTI) Date: Tue, 9 May 2023 18:44:57 +0200 Subject: [petsc-users] Understanding index sets for PCGASM In-Reply-To: References: <989A8495-06FF-4D8A-8B45-F3D991D0A486@petsc.dev> Message-ID: Partial typo: I expect 9x(16+16) numbers to be stored in subdomain_IS : # subdomains x (row indices of the submatrix + col indices of the submatrix). Il giorno mar 9 mag 2023 alle ore 18:31 LEONARDO MUTTI < leonardo.mutti01 at universitadipavia.it> ha scritto: > > > ---------- Forwarded message --------- > Da: LEONARDO MUTTI > Date: mar 9 mag 2023 alle ore 18:29 > Subject: Re: [petsc-users] Understanding index sets for PCGASM > To: Matthew Knepley > > > Thank you for your answer, but I am still confused, sorry. > Consider > https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/ksp/tests/ex71f.F90 on > one processor. > Let M=12 for the sake of simplicity, i.e. we deal with a 12x12 2D grid, > hence, a 144x144 matrix. > Let NSubx = 3, so that on the grid we do 3 vertical and 3 horizontal > subdivisions. > We should obtain 9 subdomains that are grids of 4x4 nodes each, thus > corresponding to 9 submatrices of size 16x16. 
> In my run I obtain NSub = 9 (great) and subdomain_IS(i), i=1,...,9, reads: > > *IS Object: 1 MPI process* > * type: general* > *Number of indices in set 16* > *0 0* > *1 1* > *2 2* > *3 3* > *4 12* > *5 13* > *6 14* > *7 15* > *8 24* > *9 25* > *10 26* > *11 27* > *12 36* > *13 37* > *14 38* > *15 39* > *IS Object: 1 MPI process* > * type: general* > *Number of indices in set 16* > *0 4* > *1 5* > *2 6* > *3 7* > *4 16* > *5 17* > *6 18* > *7 19* > *8 28* > *9 29* > *10 30* > *11 31* > *12 40* > *13 41* > *14 42* > *15 43* > *IS Object: 1 MPI process* > * type: general* > *Number of indices in set 16* > *0 8* > *1 9* > *2 10* > *3 11* > *4 20* > *5 21* > *6 22* > *7 23* > *8 32* > *9 33* > *10 34* > *11 35* > *12 44* > *13 45* > *14 46* > *15 47* > *IS Object: 1 MPI process* > * type: general* > *Number of indices in set 16* > *0 48* > *1 49* > *2 50* > *3 51* > *4 60* > *5 61* > *6 62* > *7 63* > *8 72* > *9 73* > *10 74* > *11 75* > *12 84* > *13 85* > *14 86* > *15 87* > *IS Object: 1 MPI process* > * type: general* > *Number of indices in set 16* > *0 52* > *1 53* > *2 54* > *3 55* > *4 64* > *5 65* > *6 66* > *7 67* > *8 76* > *9 77* > *10 78* > *11 79* > *12 88* > *13 89* > *14 90* > *15 91* > *IS Object: 1 MPI process* > * type: general* > *Number of indices in set 16* > *0 56* > *1 57* > *2 58* > *3 59* > *4 68* > *5 69* > *6 70* > *7 71* > *8 80* > *9 81* > *10 82* > *11 83* > *12 92* > *13 93* > *14 94* > *15 95* > *IS Object: 1 MPI process* > * type: general* > *Number of indices in set 16* > *0 96* > *1 97* > *2 98* > *3 99* > *4 108* > *5 109* > *6 110* > *7 111* > *8 120* > *9 121* > *10 122* > *11 123* > *12 132* > *13 133* > *14 134* > *15 135* > *IS Object: 1 MPI process* > * type: general* > *Number of indices in set 16* > *0 100* > *1 101* > *2 102* > *3 103* > *4 112* > *5 113* > *6 114* > *7 115* > *8 124* > *9 125* > *10 126* > *11 127* > *12 136* > *13 137* > *14 138* > *15 139* > *IS Object: 1 MPI process* > * type: general* > *Number of indices in set 16* > *0 104* > *1 105* > *2 106* > *3 107* > *4 116* > *5 117* > *6 118* > *7 119* > *8 128* > *9 129* > *10 130* > *11 131* > *12 140* > *13 141* > *14 142* > *15 143* > > As you said, no number here reaches 144. > But the number stored in subdomain_IS are 9x16= #subdomains x 16, whereas > I would expect, also given your latest reply, 9x16x16x2=#subdomains x > submatrix height x submatrix width x length of a (row,column) pair. > It would really help me if you could briefly explain how the output above > encodes the subdivision into subdomains. > Many thanks again, > Leonardo > > > > Il giorno mar 9 mag 2023 alle ore 16:24 Matthew Knepley > ha scritto: > >> On Tue, May 9, 2023 at 10:05?AM LEONARDO MUTTI < >> leonardo.mutti01 at universitadipavia.it> wrote: >> >>> Great thanks! I can now successfully run >>> https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/ksp/tests/ex71f.F90. >>> >>> Going forward with my experiments, let me post a new code snippet (very >>> similar to ex71f.F90) that I cannot get to work, probably I must be >>> setting up the IS objects incorrectly. >>> >>> I have an 8x8 matrix A=diag(1,1,2,2,...,2) and a >>> vector b=(0.5,...,0.5). We have only one processor, and I want to solve >>> Ax=b using GASM. In particular, KSP is set to preonly, GASM is the >>> preconditioner and it uses on each submatrix an lu direct solver (sub_ksp = >>> preonly, sub_pc = lu). >>> >>> For the GASM algorithm, I divide A into diag(1,1) and diag(2,2,...,2). >>> For simplicity I set 0 overlap. 
Now I want to use GASM to solve Ax=b. The >>> code follows. >>> >>> #include >>> #include >>> #include >>> USE petscmat >>> USE petscksp >>> USE petscpc >>> USE MPI >>> >>> Mat :: A >>> Vec :: b, x >>> PetscInt :: M, I, J, ISLen, NSub >>> PetscMPIInt :: size >>> PetscErrorCode :: ierr >>> PetscScalar :: v >>> KSP :: ksp >>> PC :: pc >>> IS :: subdomains_IS(2), inflated_IS(2) >>> PetscInt,DIMENSION(4) :: indices_first_domain >>> PetscInt,DIMENSION(36) :: indices_second_domain >>> >>> call PetscInitialize(PETSC_NULL_CHARACTER, ierr) >>> call MPI_Comm_size(PETSC_COMM_WORLD, size, ierr) >>> >>> >>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>> ! INTRO: create matrix and right hand side >>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>> >>> WRITE(*,*) "Assembling A,b" >>> >>> M = 8 >>> call MatCreateAIJ(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, >>> & M, M, PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, >>> & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER,A, ierr) >>> DO I=1,M >>> DO J=1,M >>> IF ((I .EQ. J) .AND. (I .LE. 2 )) THEN >>> v = 1 >>> ELSE IF ((I .EQ. J) .AND. (I .GT. 2 )) THEN >>> v = 2 >>> ELSE >>> v = 0 >>> ENDIF >>> call MatSetValue(A, I-1, J-1, v, INSERT_VALUES, ierr) >>> END DO >>> END DO >>> >>> call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr) >>> call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr) >>> >>> call VecCreate(PETSC_COMM_WORLD,b,ierr) >>> call VecSetSizes(b, PETSC_DECIDE, M,ierr) >>> call VecSetFromOptions(b,ierr) >>> >>> do I=1,M >>> v = 0.5 >>> call VecSetValue(b,I-1,v, INSERT_VALUES,ierr) >>> end do >>> >>> call VecAssemblyBegin(b,ierr) >>> call VecAssemblyEnd(b,ierr) >>> >>> >>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>> ! FIRST KSP/PC SETUP >>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>> >>> WRITE(*,*) "KSP/PC first setup" >>> >>> call KSPCreate(PETSC_COMM_WORLD, ksp, ierr) >>> call KSPSetOperators(ksp, A, A, ierr) >>> call KSPSetType(ksp, 'preonly', ierr) >>> call KSPGetPC(ksp, pc, ierr) >>> call KSPSetUp(ksp, ierr) >>> call PCSetType(pc, PCGASM, ierr) >>> >>> >>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>> ! GASM, SETTING SUBDOMAINS >>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>> >>> WRITE(*,*) "Setting GASM subdomains" >>> >>> ! Let's create the subdomain IS and inflated_IS >>> ! They are equal if no overlap is present >>> ! They are 1: 0,1,8,9 >>> ! 2: 10,...,15,18,...,23,...,58,...,63 >>> >>> indices_first_domain = [0,1,8,9] ! corresponds to diag(1,1) >>> do I=0,5 >>> do J=0,5 >>> indices_second_domain(I*6+1+J) = 18 + J + 8*I ! corresponds >>> to diag(2,2,...,2) >>> !WRITE(*,*) I*6+1+J, 18 + J + 8*I >>> end do >>> end do >>> >>> ! Convert into IS >>> ISLen = 4 >>> call ISCreateGeneral(PETSC_COMM_WORLD,ISLen,indices_first_domain, >>> & PETSC_COPY_VALUES, subdomains_IS(1), ierr) >>> call ISCreateGeneral(PETSC_COMM_WORLD,ISLen,indices_first_domain, >>> & PETSC_COPY_VALUES, inflated_IS(1), ierr) >>> ISLen = 36 >>> call ISCreateGeneral(PETSC_COMM_WORLD,ISLen,indices_second_domain, >>> & PETSC_COPY_VALUES, subdomains_IS(2), ierr) >>> call ISCreateGeneral(PETSC_COMM_WORLD,ISLen,indices_second_domain, >>> & PETSC_COPY_VALUES, inflated_IS(2), ierr) >>> >>> NSub = 2 >>> call PCGASMSetSubdomains(pc,NSub, >>> & subdomains_IS,inflated_IS,ierr) >>> call PCGASMDestroySubdomains(NSub, >>> & subdomains_IS,inflated_IS,ierr) >>> >>> >>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>> ! GASM: SET SUBSOLVERS >>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>> >>> WRITE(*,*) "Setting subsolvers for GASM" >>> >>> call PCSetUp(pc, ierr) ! should I add this? 
>>> >>> call PetscOptionsSetValue(PETSC_NULL_OPTIONS, >>> & "-sub_pc_type", "lu", ierr) >>> call PetscOptionsSetValue(PETSC_NULL_OPTIONS, >>> & "-sub_ksp_type", "preonly", ierr) >>> >>> call KSPSetFromOptions(ksp, ierr) >>> call PCSetFromOptions(pc, ierr) >>> >>> >>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>> ! DUMMY SOLUTION: DID IT WORK? >>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>> >>> WRITE(*,*) "Solve" >>> >>> call VecDuplicate(b,x,ierr) >>> call KSPSolve(ksp,b,x,ierr) >>> >>> call MatDestroy(A, ierr) >>> call KSPDestroy(ksp, ierr) >>> call PetscFinalize(ierr) >>> >>> This code is failing in multiple points. At call PCSetUp(pc, ierr) it >>> produces: >>> >>> *[0]PETSC ERROR: Argument out of range* >>> *[0]PETSC ERROR: Scatter indices in ix are out of range* >>> *...* >>> *[0]PETSC ERROR: #1 VecScatterCreate() at >>> ***\src\vec\is\sf\INTERF~1\vscat.c:736* >>> *[0]PETSC ERROR: #2 PCSetUp_GASM() at >>> ***\src\ksp\pc\impls\gasm\gasm.c:433* >>> *[0]PETSC ERROR: #3 PCSetUp() at ***\src\ksp\pc\INTERF~1\precon.c:994* >>> >>> And at call KSPSolve(ksp,b,x,ierr) it produces: >>> >>> *forrtl: severe (157): Program Exception - access violation* >>> >>> >>> The index sets are setup coherently with the outputs of e.g. >>> https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/ksp/tests/output/ex71f_1.out: >>> in particular each element of the matrix A corresponds to a number from 0 >>> to 63. >>> >> >> This is not correct, I believe. The indices are row/col indices, not >> indices into dense blocks, so for >> your example, they are all in [0, 8]. >> >> Thanks, >> >> Matt >> >> >>> Note that each submatrix does not represent some physical subdomain, the >>> subdivision is just at the algebraic level. >>> I thus have the following questions: >>> >>> - is this the correct way of creating the IS objects, given my >>> objective at the beginning of the email? Is the ordering correct? >>> - what am I doing wrong that is generating the above errors? >>> >>> Thanks for the patience and the time. >>> Best, >>> Leonardo >>> >>> Il giorno ven 5 mag 2023 alle ore 18:43 Barry Smith >>> ha scritto: >>> >>>> >>>> Added in *barry/2023-05-04/add-pcgasm-set-subdomains *see also >>>> https://gitlab.com/petsc/petsc/-/merge_requests/6419 >>>> >>>> Barry >>>> >>>> >>>> On May 4, 2023, at 11:23 AM, LEONARDO MUTTI < >>>> leonardo.mutti01 at universitadipavia.it> wrote: >>>> >>>> Thank you for the help. >>>> Adding to my example: >>>> >>>> >>>> * call PCGASMSetSubdomains(pc,NSub, subdomains_IS, >>>> inflated_IS,ierr) call >>>> PCGASMDestroySubdomains(NSub,subdomains_IS,inflated_IS,ierr)* >>>> results in: >>>> >>>> * Error LNK2019 unresolved external symbol PCGASMDESTROYSUBDOMAINS >>>> referenced in function ... * >>>> >>>> * Error LNK2019 unresolved external symbol PCGASMSETSUBDOMAINS >>>> referenced in function ... * >>>> I'm not sure if the interfaces are missing or if I have a compilation >>>> problem. >>>> Thank you again. >>>> Best, >>>> Leonardo >>>> >>>> Il giorno sab 29 apr 2023 alle ore 20:30 Barry Smith >>>> ha scritto: >>>> >>>>> >>>>> Thank you for the test code. I have a fix in the branch >>>>> barry/2023-04-29/fix-pcasmcreatesubdomains2d >>>>> with >>>>> merge request https://gitlab.com/petsc/petsc/-/merge_requests/6394 >>>>> >>>>> The functions did not have proper Fortran stubs and interfaces so I >>>>> had to provide them manually in the new branch. 
>>>>> >>>>> Use >>>>> >>>>> git fetch >>>>> git checkout barry/2023-04-29/fix-pcasmcreatesubdomains2d >>>>> >>>>> ./configure etc >>>>> >>>>> Your now working test code is in src/ksp/ksp/tests/ex71f.F90 I had >>>>> to change things slightly and I updated the error handling for the latest >>>>> version. >>>>> >>>>> Please let us know if you have any later questions. >>>>> >>>>> Barry >>>>> >>>>> >>>>> >>>>> >>>>> On Apr 28, 2023, at 12:07 PM, LEONARDO MUTTI < >>>>> leonardo.mutti01 at universitadipavia.it> wrote: >>>>> >>>>> Hello. I am having a hard time understanding the index sets to feed >>>>> PCGASMSetSubdomains, and I am working in Fortran (as a PETSc novice). To >>>>> get more intuition on how the IS objects behave I tried the following >>>>> minimal (non) working example, which should tile a 16x16 matrix into 16 >>>>> square, non-overlapping submatrices: >>>>> >>>>> #include >>>>> #include >>>>> #include >>>>> USE petscmat >>>>> USE petscksp >>>>> USE petscpc >>>>> >>>>> Mat :: A >>>>> PetscInt :: M, NSubx, dof, overlap, NSub >>>>> INTEGER :: I,J >>>>> PetscErrorCode :: ierr >>>>> PetscScalar :: v >>>>> KSP :: ksp >>>>> PC :: pc >>>>> IS :: subdomains_IS, inflated_IS >>>>> >>>>> call PetscInitialize(PETSC_NULL_CHARACTER , ierr) >>>>> >>>>> !-----Create a dummy matrix >>>>> M = 16 >>>>> call MatCreateAIJ(MPI_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, >>>>> & M, M, >>>>> & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, >>>>> & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, >>>>> & A, ierr) >>>>> >>>>> DO I=1,M >>>>> DO J=1,M >>>>> v = I*J >>>>> CALL MatSetValue (A,I-1,J-1,v, >>>>> & INSERT_VALUES , ierr) >>>>> END DO >>>>> END DO >>>>> >>>>> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY , ierr) >>>>> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY , ierr) >>>>> >>>>> !-----Create KSP and PC >>>>> call KSPCreate(PETSC_COMM_WORLD,ksp, ierr) >>>>> call KSPSetOperators(ksp,A,A, ierr) >>>>> call KSPSetType(ksp,"bcgs",ierr) >>>>> call KSPGetPC(ksp,pc,ierr) >>>>> call KSPSetUp(ksp, ierr) >>>>> call PCSetType(pc,PCGASM, ierr) >>>>> call PCSetUp(pc , ierr) >>>>> >>>>> !-----GASM setup >>>>> NSubx = 4 >>>>> dof = 1 >>>>> overlap = 0 >>>>> >>>>> call PCGASMCreateSubdomains2D(pc, >>>>> & M, M, >>>>> & NSubx, NSubx, >>>>> & dof, overlap, >>>>> & NSub, subdomains_IS, inflated_IS, ierr) >>>>> >>>>> call ISView(subdomains_IS, PETSC_VIEWER_STDOUT_WORLD, ierr) >>>>> >>>>> call KSPDestroy(ksp, ierr) >>>>> call PetscFinalize(ierr) >>>>> >>>>> Running this on one processor, I get NSub = 4. >>>>> If PCASM and PCASMCreateSubdomains2D are used instead, I get NSub = 16 >>>>> as expected. >>>>> Moreover, I get in the end "forrtl: severe (157): Program Exception - >>>>> access violation". So: >>>>> 1) why do I get two different results with ASM, and GASM? >>>>> 2) why do I get access violation and how can I solve this? >>>>> In fact, in C, subdomains_IS, inflated_IS should pointers to IS >>>>> objects. As I see on the Fortran interface, the arguments to >>>>> PCGASMCreateSubdomains2D are IS objects: >>>>> >>>>> subroutine PCGASMCreateSubdomains2D(a,b,c,d,e,f,g,h,i,j,z) >>>>> import tPC,tIS >>>>> PC a ! PC >>>>> PetscInt b ! PetscInt >>>>> PetscInt c ! PetscInt >>>>> PetscInt d ! PetscInt >>>>> PetscInt e ! PetscInt >>>>> PetscInt f ! PetscInt >>>>> PetscInt g ! PetscInt >>>>> PetscInt h ! PetscInt >>>>> IS i ! IS >>>>> IS j ! IS >>>>> PetscErrorCode z >>>>> end subroutine PCGASMCreateSubdomains2D >>>>> Thus: >>>>> 3) what should be inside e.g., subdomains_IS? 
I expect it to contain, >>>>> for every created subdomain, the list of rows and columns defining the >>>>> subblock in the matrix, am I right? >>>>> >>>>> Context: I have a block-tridiagonal system arising from space-time >>>>> finite elements, and I want to solve it with GMRES+PCGASM preconditioner, >>>>> where each overlapping submatrix is on the diagonal and of size 3x3 blocks >>>>> (and spanning multiple processes). This is PETSc 3.17.1 on Windows. >>>>> >>>>> Thanks in advance, >>>>> Leonardo >>>>> >>>>> >>>>> >>>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Tue May 9 13:45:37 2023 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 9 May 2023 14:45:37 -0400 Subject: [petsc-users] Understanding index sets for PCGASM In-Reply-To: References: <989A8495-06FF-4D8A-8B45-F3D991D0A486@petsc.dev> Message-ID: <65FAB3E9-8D08-4AEE-874E-636EB2C76A29@petsc.dev> It is simplier than you are making it out to be. Each IS[] is a list of rows (and columns) in the sub (domain) matrix. In your case with the matrix of 144 by 144 the indices will go from 0 to 143. In your simple Fortran code you have a completely different problem. A matrix with 8 rows and columns. In that case if you want the first IS to represent just the first row (and column) in the matrix then it should contain only 0. The second submatrix which is all rows (but the first) should have 1,2,3,4,5,6,7 I do not understand why your code has >>>> indices_first_domain = [0,1,8,9] ! corresponds to diag(1,1) it should just be 0 > On May 9, 2023, at 12:44 PM, LEONARDO MUTTI wrote: > > Partial typo: I expect 9x(16+16) numbers to be stored in subdomain_IS : # subdomains x (row indices of the submatrix + col indices of the submatrix). > > Il giorno mar 9 mag 2023 alle ore 18:31 LEONARDO MUTTI > ha scritto: >> >> >> ---------- Forwarded message --------- >> Da: LEONARDO MUTTI > >> Date: mar 9 mag 2023 alle ore 18:29 >> Subject: Re: [petsc-users] Understanding index sets for PCGASM >> To: Matthew Knepley > >> >> >> Thank you for your answer, but I am still confused, sorry. >> Consider https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/ksp/tests/ex71f.F90 on one processor. >> Let M=12 for the sake of simplicity, i.e. we deal with a 12x12 2D grid, hence, a 144x144 matrix. >> Let NSubx = 3, so that on the grid we do 3 vertical and 3 horizontal subdivisions. >> We should obtain 9 subdomains that are grids of 4x4 nodes each, thus corresponding to 9 submatrices of size 16x16. 
>> In my run I obtain NSub = 9 (great) and subdomain_IS(i), i=1,...,9, reads: >> >> IS Object: 1 MPI process >> type: general >> Number of indices in set 16 >> 0 0 >> 1 1 >> 2 2 >> 3 3 >> 4 12 >> 5 13 >> 6 14 >> 7 15 >> 8 24 >> 9 25 >> 10 26 >> 11 27 >> 12 36 >> 13 37 >> 14 38 >> 15 39 >> IS Object: 1 MPI process >> type: general >> Number of indices in set 16 >> 0 4 >> 1 5 >> 2 6 >> 3 7 >> 4 16 >> 5 17 >> 6 18 >> 7 19 >> 8 28 >> 9 29 >> 10 30 >> 11 31 >> 12 40 >> 13 41 >> 14 42 >> 15 43 >> IS Object: 1 MPI process >> type: general >> Number of indices in set 16 >> 0 8 >> 1 9 >> 2 10 >> 3 11 >> 4 20 >> 5 21 >> 6 22 >> 7 23 >> 8 32 >> 9 33 >> 10 34 >> 11 35 >> 12 44 >> 13 45 >> 14 46 >> 15 47 >> IS Object: 1 MPI process >> type: general >> Number of indices in set 16 >> 0 48 >> 1 49 >> 2 50 >> 3 51 >> 4 60 >> 5 61 >> 6 62 >> 7 63 >> 8 72 >> 9 73 >> 10 74 >> 11 75 >> 12 84 >> 13 85 >> 14 86 >> 15 87 >> IS Object: 1 MPI process >> type: general >> Number of indices in set 16 >> 0 52 >> 1 53 >> 2 54 >> 3 55 >> 4 64 >> 5 65 >> 6 66 >> 7 67 >> 8 76 >> 9 77 >> 10 78 >> 11 79 >> 12 88 >> 13 89 >> 14 90 >> 15 91 >> IS Object: 1 MPI process >> type: general >> Number of indices in set 16 >> 0 56 >> 1 57 >> 2 58 >> 3 59 >> 4 68 >> 5 69 >> 6 70 >> 7 71 >> 8 80 >> 9 81 >> 10 82 >> 11 83 >> 12 92 >> 13 93 >> 14 94 >> 15 95 >> IS Object: 1 MPI process >> type: general >> Number of indices in set 16 >> 0 96 >> 1 97 >> 2 98 >> 3 99 >> 4 108 >> 5 109 >> 6 110 >> 7 111 >> 8 120 >> 9 121 >> 10 122 >> 11 123 >> 12 132 >> 13 133 >> 14 134 >> 15 135 >> IS Object: 1 MPI process >> type: general >> Number of indices in set 16 >> 0 100 >> 1 101 >> 2 102 >> 3 103 >> 4 112 >> 5 113 >> 6 114 >> 7 115 >> 8 124 >> 9 125 >> 10 126 >> 11 127 >> 12 136 >> 13 137 >> 14 138 >> 15 139 >> IS Object: 1 MPI process >> type: general >> Number of indices in set 16 >> 0 104 >> 1 105 >> 2 106 >> 3 107 >> 4 116 >> 5 117 >> 6 118 >> 7 119 >> 8 128 >> 9 129 >> 10 130 >> 11 131 >> 12 140 >> 13 141 >> 14 142 >> 15 143 >> >> As you said, no number here reaches 144. >> But the number stored in subdomain_IS are 9x16= #subdomains x 16, whereas I would expect, also given your latest reply, 9x16x16x2=#subdomains x submatrix height x submatrix width x length of a (row,column) pair. >> It would really help me if you could briefly explain how the output above encodes the subdivision into subdomains. >> Many thanks again, >> Leonardo >> >> >> Il giorno mar 9 mag 2023 alle ore 16:24 Matthew Knepley > ha scritto: >>> On Tue, May 9, 2023 at 10:05?AM LEONARDO MUTTI > wrote: >>>> Great thanks! I can now successfully run https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/ksp/tests/ex71f.F90. >>>> >>>> Going forward with my experiments, let me post a new code snippet (very similar to ex71f.F90) that I cannot get to work, probably I must be setting up the IS objects incorrectly. >>>> >>>> I have an 8x8 matrix A=diag(1,1,2,2,...,2) and a vector b=(0.5,...,0.5). We have only one processor, and I want to solve Ax=b using GASM. In particular, KSP is set to preonly, GASM is the preconditioner and it uses on each submatrix an lu direct solver (sub_ksp = preonly, sub_pc = lu). >>>> >>>> For the GASM algorithm, I divide A into diag(1,1) and diag(2,2,...,2). For simplicity I set 0 overlap. Now I want to use GASM to solve Ax=b. The code follows. 
>>>> >>>> #include >>>> #include >>>> #include >>>> USE petscmat >>>> USE petscksp >>>> USE petscpc >>>> USE MPI >>>> >>>> Mat :: A >>>> Vec :: b, x >>>> PetscInt :: M, I, J, ISLen, NSub >>>> PetscMPIInt :: size >>>> PetscErrorCode :: ierr >>>> PetscScalar :: v >>>> KSP :: ksp >>>> PC :: pc >>>> IS :: subdomains_IS(2), inflated_IS(2) >>>> PetscInt,DIMENSION(4) :: indices_first_domain >>>> PetscInt,DIMENSION(36) :: indices_second_domain >>>> >>>> call PetscInitialize(PETSC_NULL_CHARACTER, ierr) >>>> call MPI_Comm_size(PETSC_COMM_WORLD, size, ierr) >>>> >>>> >>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>> ! INTRO: create matrix and right hand side >>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>> >>>> WRITE(*,*) "Assembling A,b" >>>> >>>> M = 8 >>>> call MatCreateAIJ(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, >>>> & M, M, PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, >>>> & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER,A, ierr) >>>> DO I=1,M >>>> DO J=1,M >>>> IF ((I .EQ. J) .AND. (I .LE. 2 )) THEN >>>> v = 1 >>>> ELSE IF ((I .EQ. J) .AND. (I .GT. 2 )) THEN >>>> v = 2 >>>> ELSE >>>> v = 0 >>>> ENDIF >>>> call MatSetValue(A, I-1, J-1, v, INSERT_VALUES, ierr) >>>> END DO >>>> END DO >>>> >>>> call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr) >>>> call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr) >>>> >>>> call VecCreate(PETSC_COMM_WORLD,b,ierr) >>>> call VecSetSizes(b, PETSC_DECIDE, M,ierr) >>>> call VecSetFromOptions(b,ierr) >>>> >>>> do I=1,M >>>> v = 0.5 >>>> call VecSetValue(b,I-1,v, INSERT_VALUES,ierr) >>>> end do >>>> >>>> call VecAssemblyBegin(b,ierr) >>>> call VecAssemblyEnd(b,ierr) >>>> >>>> >>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>> ! FIRST KSP/PC SETUP >>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>> >>>> WRITE(*,*) "KSP/PC first setup" >>>> >>>> call KSPCreate(PETSC_COMM_WORLD, ksp, ierr) >>>> call KSPSetOperators(ksp, A, A, ierr) >>>> call KSPSetType(ksp, 'preonly', ierr) >>>> call KSPGetPC(ksp, pc, ierr) >>>> call KSPSetUp(ksp, ierr) >>>> call PCSetType(pc, PCGASM, ierr) >>>> >>>> >>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>> ! GASM, SETTING SUBDOMAINS >>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>> >>>> WRITE(*,*) "Setting GASM subdomains" >>>> >>>> ! Let's create the subdomain IS and inflated_IS >>>> ! They are equal if no overlap is present >>>> ! They are 1: 0,1,8,9 >>>> ! 2: 10,...,15,18,...,23,...,58,...,63 >>>> >>>> indices_first_domain = [0,1,8,9] ! corresponds to diag(1,1) >>>> do I=0,5 >>>> do J=0,5 >>>> indices_second_domain(I*6+1+J) = 18 + J + 8*I ! corresponds to diag(2,2,...,2) >>>> !WRITE(*,*) I*6+1+J, 18 + J + 8*I >>>> end do >>>> end do >>>> >>>> ! Convert into IS >>>> ISLen = 4 >>>> call ISCreateGeneral(PETSC_COMM_WORLD,ISLen,indices_first_domain, >>>> & PETSC_COPY_VALUES, subdomains_IS(1), ierr) >>>> call ISCreateGeneral(PETSC_COMM_WORLD,ISLen,indices_first_domain, >>>> & PETSC_COPY_VALUES, inflated_IS(1), ierr) >>>> ISLen = 36 >>>> call ISCreateGeneral(PETSC_COMM_WORLD,ISLen,indices_second_domain, >>>> & PETSC_COPY_VALUES, subdomains_IS(2), ierr) >>>> call ISCreateGeneral(PETSC_COMM_WORLD,ISLen,indices_second_domain, >>>> & PETSC_COPY_VALUES, inflated_IS(2), ierr) >>>> >>>> NSub = 2 >>>> call PCGASMSetSubdomains(pc,NSub, >>>> & subdomains_IS,inflated_IS,ierr) >>>> call PCGASMDestroySubdomains(NSub, >>>> & subdomains_IS,inflated_IS,ierr) >>>> >>>> >>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>> ! GASM: SET SUBSOLVERS >>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>> >>>> WRITE(*,*) "Setting subsolvers for GASM" >>>> >>>> call PCSetUp(pc, ierr) ! should I add this? 
>>>> >>>> call PetscOptionsSetValue(PETSC_NULL_OPTIONS, >>>> & "-sub_pc_type", "lu", ierr) >>>> call PetscOptionsSetValue(PETSC_NULL_OPTIONS, >>>> & "-sub_ksp_type", "preonly", ierr) >>>> >>>> call KSPSetFromOptions(ksp, ierr) >>>> call PCSetFromOptions(pc, ierr) >>>> >>>> >>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>> ! DUMMY SOLUTION: DID IT WORK? >>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>> >>>> WRITE(*,*) "Solve" >>>> >>>> call VecDuplicate(b,x,ierr) >>>> call KSPSolve(ksp,b,x,ierr) >>>> >>>> call MatDestroy(A, ierr) >>>> call KSPDestroy(ksp, ierr) >>>> call PetscFinalize(ierr) >>>> >>>> This code is failing in multiple points. At call PCSetUp(pc, ierr) it produces: >>>> >>>> [0]PETSC ERROR: Argument out of range >>>> [0]PETSC ERROR: Scatter indices in ix are out of range >>>> ... >>>> [0]PETSC ERROR: #1 VecScatterCreate() at ***\src\vec\is\sf\INTERF~1\vscat.c:736 >>>> [0]PETSC ERROR: #2 PCSetUp_GASM() at ***\src\ksp\pc\impls\gasm\gasm.c:433 >>>> [0]PETSC ERROR: #3 PCSetUp() at ***\src\ksp\pc\INTERF~1\precon.c:994 >>>> >>>> And at call KSPSolve(ksp,b,x,ierr) it produces: >>>> >>>> forrtl: severe (157): Program Exception - access violation >>>> >>>> The index sets are setup coherently with the outputs of e.g. https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/ksp/tests/output/ex71f_1.out: in particular each element of the matrix A corresponds to a number from 0 to 63. >>> >>> This is not correct, I believe. The indices are row/col indices, not indices into dense blocks, so for >>> your example, they are all in [0, 8]. >>> >>> Thanks, >>> >>> Matt >>> >>>> Note that each submatrix does not represent some physical subdomain, the subdivision is just at the algebraic level. >>>> I thus have the following questions: >>>> is this the correct way of creating the IS objects, given my objective at the beginning of the email? Is the ordering correct? >>>> what am I doing wrong that is generating the above errors? >>>> Thanks for the patience and the time. >>>> Best, >>>> Leonardo >>>> >>>> Il giorno ven 5 mag 2023 alle ore 18:43 Barry Smith > ha scritto: >>>>> >>>>> Added in barry/2023-05-04/add-pcgasm-set-subdomains see also https://gitlab.com/petsc/petsc/-/merge_requests/6419 >>>>> >>>>> Barry >>>>> >>>>> >>>>>> On May 4, 2023, at 11:23 AM, LEONARDO MUTTI > wrote: >>>>>> >>>>>> Thank you for the help. >>>>>> Adding to my example: >>>>>> call PCGASMSetSubdomains(pc,NSub, subdomains_IS, inflated_IS,ierr) >>>>>> call PCGASMDestroySubdomains(NSub,subdomains_IS,inflated_IS,ierr) >>>>>> results in: >>>>>> Error LNK2019 unresolved external symbol PCGASMDESTROYSUBDOMAINS referenced in function ... >>>>>> Error LNK2019 unresolved external symbol PCGASMSETSUBDOMAINS referenced in function ... >>>>>> I'm not sure if the interfaces are missing or if I have a compilation problem. >>>>>> Thank you again. >>>>>> Best, >>>>>> Leonardo >>>>>> >>>>>> Il giorno sab 29 apr 2023 alle ore 20:30 Barry Smith > ha scritto: >>>>>>> >>>>>>> Thank you for the test code. I have a fix in the branch barry/2023-04-29/fix-pcasmcreatesubdomains2d with merge request https://gitlab.com/petsc/petsc/-/merge_requests/6394 >>>>>>> >>>>>>> The functions did not have proper Fortran stubs and interfaces so I had to provide them manually in the new branch. 
>>>>>>> >>>>>>> Use >>>>>>> >>>>>>> git fetch >>>>>>> git checkout barry/2023-04-29/fix-pcasmcreatesubdomains2d >>>>>>> ./configure etc >>>>>>> >>>>>>> Your now working test code is in src/ksp/ksp/tests/ex71f.F90 I had to change things slightly and I updated the error handling for the latest version. >>>>>>> >>>>>>> Please let us know if you have any later questions. >>>>>>> >>>>>>> Barry >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>>> On Apr 28, 2023, at 12:07 PM, LEONARDO MUTTI > wrote: >>>>>>>> >>>>>>>> Hello. I am having a hard time understanding the index sets to feed PCGASMSetSubdomains, and I am working in Fortran (as a PETSc novice). To get more intuition on how the IS objects behave I tried the following minimal (non) working example, which should tile a 16x16 matrix into 16 square, non-overlapping submatrices: >>>>>>>> >>>>>>>> #include >>>>>>>> #include >>>>>>>> #include >>>>>>>> USE petscmat >>>>>>>> USE petscksp >>>>>>>> USE petscpc >>>>>>>> >>>>>>>> Mat :: A >>>>>>>> PetscInt :: M, NSubx, dof, overlap, NSub >>>>>>>> INTEGER :: I,J >>>>>>>> PetscErrorCode :: ierr >>>>>>>> PetscScalar :: v >>>>>>>> KSP :: ksp >>>>>>>> PC :: pc >>>>>>>> IS :: subdomains_IS, inflated_IS >>>>>>>> >>>>>>>> call PetscInitialize(PETSC_NULL_CHARACTER , ierr) >>>>>>>> >>>>>>>> !-----Create a dummy matrix >>>>>>>> M = 16 >>>>>>>> call MatCreateAIJ(MPI_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, >>>>>>>> & M, M, >>>>>>>> & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, >>>>>>>> & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, >>>>>>>> & A, ierr) >>>>>>>> >>>>>>>> DO I=1,M >>>>>>>> DO J=1,M >>>>>>>> v = I*J >>>>>>>> CALL MatSetValue (A,I-1,J-1,v, >>>>>>>> & INSERT_VALUES , ierr) >>>>>>>> END DO >>>>>>>> END DO >>>>>>>> >>>>>>>> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY , ierr) >>>>>>>> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY , ierr) >>>>>>>> >>>>>>>> !-----Create KSP and PC >>>>>>>> call KSPCreate(PETSC_COMM_WORLD,ksp, ierr) >>>>>>>> call KSPSetOperators(ksp,A,A, ierr) >>>>>>>> call KSPSetType(ksp,"bcgs",ierr) >>>>>>>> call KSPGetPC(ksp,pc,ierr) >>>>>>>> call KSPSetUp(ksp, ierr) >>>>>>>> call PCSetType(pc,PCGASM, ierr) >>>>>>>> call PCSetUp(pc , ierr) >>>>>>>> >>>>>>>> !-----GASM setup >>>>>>>> NSubx = 4 >>>>>>>> dof = 1 >>>>>>>> overlap = 0 >>>>>>>> >>>>>>>> call PCGASMCreateSubdomains2D(pc, >>>>>>>> & M, M, >>>>>>>> & NSubx, NSubx, >>>>>>>> & dof, overlap, >>>>>>>> & NSub, subdomains_IS, inflated_IS, ierr) >>>>>>>> >>>>>>>> call ISView(subdomains_IS, PETSC_VIEWER_STDOUT_WORLD, ierr) >>>>>>>> >>>>>>>> call KSPDestroy(ksp, ierr) >>>>>>>> call PetscFinalize(ierr) >>>>>>>> >>>>>>>> Running this on one processor, I get NSub = 4. >>>>>>>> If PCASM and PCASMCreateSubdomains2D are used instead, I get NSub = 16 as expected. >>>>>>>> Moreover, I get in the end "forrtl: severe (157): Program Exception - access violation". So: >>>>>>>> 1) why do I get two different results with ASM, and GASM? >>>>>>>> 2) why do I get access violation and how can I solve this? >>>>>>>> In fact, in C, subdomains_IS, inflated_IS should pointers to IS objects. As I see on the Fortran interface, the arguments to PCGASMCreateSubdomains2D are IS objects: >>>>>>>> >>>>>>>> subroutine PCGASMCreateSubdomains2D(a,b,c,d,e,f,g,h,i,j,z) >>>>>>>> import tPC,tIS >>>>>>>> PC a ! PC >>>>>>>> PetscInt b ! PetscInt >>>>>>>> PetscInt c ! PetscInt >>>>>>>> PetscInt d ! PetscInt >>>>>>>> PetscInt e ! PetscInt >>>>>>>> PetscInt f ! PetscInt >>>>>>>> PetscInt g ! PetscInt >>>>>>>> PetscInt h ! PetscInt >>>>>>>> IS i ! IS >>>>>>>> IS j ! 
IS >>>>>>>> PetscErrorCode z >>>>>>>> end subroutine PCGASMCreateSubdomains2D >>>>>>>> Thus: >>>>>>>> 3) what should be inside e.g., subdomains_IS? I expect it to contain, for every created subdomain, the list of rows and columns defining the subblock in the matrix, am I right? >>>>>>>> >>>>>>>> Context: I have a block-tridiagonal system arising from space-time finite elements, and I want to solve it with GMRES+PCGASM preconditioner, where each overlapping submatrix is on the diagonal and of size 3x3 blocks (and spanning multiple processes). This is PETSc 3.17.1 on Windows. >>>>>>>> >>>>>>>> Thanks in advance, >>>>>>>> Leonardo >>>>>>> >>>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Tue May 9 14:01:30 2023 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 9 May 2023 15:01:30 -0400 Subject: [petsc-users] PCMG questions In-Reply-To: References: Message-ID: > On May 9, 2023, at 12:32 PM, Mark Adams wrote: > > I have a MG hierarchy that I construct manually with DMRefine and DMPlexExtrude. > > * The solver works great with chevy/sor but with chevy/sor it converges slowly or I get indefinite PC errors from CG. And the eigen estimates in cheby are really high, like 10-15. So with Cheby/SOR it works great but with the exact same options Cheby/SOR it behaves poorly? Are you using some quantum computer and NERSc? > > * I tried turning galerkin=none and I got this error. This is because without Garkin it needs to restrict the current solution and then compute the coarse grid Jacobian. Since you did not provide a DM that has the ability to even generate coarse grid vectors the process can work. You need a DM that can provide the coarse grid vectors and restrict solutions. Did you forget to pass a DM to the solver? > > Any thoughts on either of these issues? > > Thanks, > Mark > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Must call DMShellSetGlobalVector() or DMShellSetCreateGlobalVector() > [0]PETSC ERROR: WARNING! There are option(s) set that were not used! Could be the program crashed before they were used or a spelling mistake, etc! > [0]PETSC ERROR: Option left: name:-ksp_converged_reason (no value) source: command line > [0]PETSC ERROR: Option left: name:-mg_levels_esteig_ksp_type value: cg source: command line > [0]PETSC ERROR: Option left: name:-mg_levels_pc_type value: sor source: command line > [0]PETSC ERROR: Option left: name:-options_left (no value) source: command line > [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
> [0]PETSC ERROR: Petsc Development GIT revision: v3.19.1-224-g9ed82936d20 GIT Date: 2023-05-07 12:33:48 -0400 > [0]PETSC ERROR: ./ex96 on a arch-macosx-gnu-O named MarksMac-302.local by markadams Tue May 9 12:26:52 2023 > [0]PETSC ERROR: Configure options CFLAGS="-g -Wall" CXXFLAGS="-g -Wall" COPTFLAGS=-O CXXOPTFLAGS=-O --with-cc=/usr/local/opt/llvm/bin/clang --with-cxx=/usr/local/opt/llvm/bin/clang++ --download-mpich --with-strict-petscerrorcode --download-triangle=1 --with-x=0 --with-debugging=0 --download-hdf5=1 PETSC_ARCH=arch-macosx-gnu-O > [0]PETSC ERROR: #1 DMCreateGlobalVector_Shell() at /Users/markadams/Codes/petsc/src/dm/impls/shell/dmshell.c:210 > [0]PETSC ERROR: #2 DMCreateGlobalVector() at /Users/markadams/Codes/petsc/src/dm/interface/dm.c:1022 > [0]PETSC ERROR: #3 DMGetNamedGlobalVector() at /Users/markadams/Codes/petsc/src/dm/interface/dmget.c:377 > [0]PETSC ERROR: #4 DMRestrictHook_SNESVecSol() at /Users/markadams/Codes/petsc/src/snes/interface/snes.c:649 > [0]PETSC ERROR: #5 DMRestrict() at /Users/markadams/Codes/petsc/src/dm/interface/dm.c:3407 > [0]PETSC ERROR: #6 PCSetUp_MG() at /Users/markadams/Codes/petsc/src/ksp/pc/impls/mg/mg.c:1074 > [0]PETSC ERROR: #7 PCSetUp() at /Users/markadams/Codes/petsc/src/ksp/pc/interface/precon.c:994 > [0]PETSC ERROR: #8 KSPSetUp() at /Users/markadams/Codes/petsc/src/ksp/ksp/interface/itfunc.c:406 > [0]PETSC ERROR: #9 KSPSolve_Private() at /Users/markadams/Codes/petsc/src/ksp/ksp/interface/itfunc.c:824 > [0]PETSC ERROR: #10 KSPSolve() at /Users/markadams/Codes/petsc/src/ksp/ksp/interface/itfunc.c:1070 > [0]PETSC ERROR: #11 SNESSolve_KSPONLY() at /Users/markadams/Codes/petsc/src/snes/impls/ksponly/ksponly.c:48 > [0]PETSC ERROR: #12 SNESSolve() at /Users/markadams/Codes/petsc/src/snes/interface/snes.c:4663 > [0]PETSC ERROR: #13 main() at ex96.c:433 > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From liufield at gmail.com Tue May 9 14:11:45 2023 From: liufield at gmail.com (neil liu) Date: Tue, 9 May 2023 15:11:45 -0400 Subject: [petsc-users] About case ksp/tutorial/ex36.cxx Message-ID: Hello, Petsc Developers, I am trying to compile ksp/tutorial/ex36.cxx like make ex36, it shows an error " Documents/petsc-3.19.1/include/petscdmmoab.h:10:10: fatal error: moab/Core.hpp: No such file or directory #include /*I "moab/Core.hpp" I*/ ^~~~~~~~~~~~~~~ compilation terminated. " Did I miss something? In addition, I tried to know how to use DMAddboundary. It seems all the examples for DMAddboundary are related to a PetscFE object. Does that mean DMAddboundary can only be used with PetscFE? Thanks, Xiaodong -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue May 9 15:54:26 2023 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 9 May 2023 16:54:26 -0400 Subject: [petsc-users] About case ksp/tutorial/ex36.cxx In-Reply-To: References: Message-ID: On Tue, May 9, 2023 at 3:12?PM neil liu wrote: > Hello, Petsc Developers, > I am trying to compile ksp/tutorial/ex36.cxx like make ex36, > > it shows an error > " Documents/petsc-3.19.1/include/petscdmmoab.h:10:10: fatal error: > moab/Core.hpp: No such file or directory > #include /*I "moab/Core.hpp" I*/ > ^~~~~~~~~~~~~~~ > compilation terminated. " > > Did I miss something? > The example only works if you install Moab first. > In addition, I tried to know how to use DMAddboundary. It seems all the > examples for DMAddboundary are related to a PetscFE object. 
Does that mean > DMAddboundary can only be used with PetscFE? > Yes, that is true. Thanks, Matt > Thanks, > > Xiaodong > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Tue May 9 16:40:47 2023 From: mfadams at lbl.gov (Mark Adams) Date: Tue, 9 May 2023 17:40:47 -0400 Subject: [petsc-users] PCMG questions In-Reply-To: References: Message-ID: On Tue, May 9, 2023 at 3:01?PM Barry Smith wrote: > > > On May 9, 2023, at 12:32 PM, Mark Adams wrote: > > I have a MG hierarchy that I construct manually with DMRefine and > DMPlexExtrude. > > * The solver works great with chevy/sor but with chevy/sor it converges > slowly or I get indefinite PC errors from CG. And the eigen estimates in > cheby are really high, like 10-15. > > > So with Cheby/SOR it works great but with the exact same options > Cheby/SOR it behaves poorly? Are you using some quantum computer and NERSc? > It turned out that I had the sign wrong on my Laplacian point function and so the matrix was negative definite. I'm not sure what really happened exactly but it is sort of behaving better. It looks like my prolongation operator is garbage, the coarse grid correction does nothing (cg/jacobi converges in a little less that the number of MG iterations times the sum of pre and post smoothing steps), and the rows sums of P are not 1. Not sure what is going on there, but it is probably related to the DM hierarchy not being constructed correctly.... > > * I tried turning galerkin=none and I got this error. > > > This is because without Garkin it needs to restrict the current solution > and then compute the coarse grid Jacobian. Since you did not provide a DM > that has the ability to even generate coarse grid vectors the process can > work. You need a DM that can provide the coarse grid vectors and restrict > solutions. Did you forget to pass a DM to the solver? > The DM does everything. Similar to many examples but I've been checking with snes/tutorials/ex12.c today. I do call: PetscCall(DMSetCoarseDM(dmhierarchy[r], dmhierarchy[r-1])); But I am missing something else that goes on in DMRefineHierarchy, which I can't use because I am semi-coarsening. I probably have to build a section on each DM or something, but I have bigger fish to fry at this point. (I construct a 2D coarse grid, refine that a number of times and DMPlexExtrude each one the same amount (number and distance), the extruded direction is wrapped around a torus and made periodic. The fine grid now looks like, and will eventually be, the grids that tokamak codes use.) Thanks, Mark > > Any thoughts on either of these issues? > > Thanks, > Mark > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Must call DMShellSetGlobalVector() or > DMShellSetCreateGlobalVector() > [0]PETSC ERROR: WARNING! There are option(s) set that were not used! Could > be the program crashed before they were used or a spelling mistake, etc! 
> [0]PETSC ERROR: Option left: name:-ksp_converged_reason (no value) > source: command line > [0]PETSC ERROR: Option left: name:-mg_levels_esteig_ksp_type value: cg > source: command line > [0]PETSC ERROR: Option left: name:-mg_levels_pc_type value: sor source: > command line > [0]PETSC ERROR: Option left: name:-options_left (no value) source: > command line > [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. > [0]PETSC ERROR: Petsc Development GIT revision: v3.19.1-224-g9ed82936d20 > GIT Date: 2023-05-07 12:33:48 -0400 > [0]PETSC ERROR: ./ex96 on a arch-macosx-gnu-O named MarksMac-302.local by > markadams Tue May 9 12:26:52 2023 > [0]PETSC ERROR: Configure options CFLAGS="-g -Wall" CXXFLAGS="-g -Wall" > COPTFLAGS=-O CXXOPTFLAGS=-O --with-cc=/usr/local/opt/llvm/bin/clang > --with-cxx=/usr/local/opt/llvm/bin/clang++ --download-mpich > --with-strict-petscerrorcode --download-triangle=1 --with-x=0 > --with-debugging=0 --download-hdf5=1 PETSC_ARCH=arch-macosx-gnu-O > [0]PETSC ERROR: #1 DMCreateGlobalVector_Shell() at > /Users/markadams/Codes/petsc/src/dm/impls/shell/dmshell.c:210 > [0]PETSC ERROR: #2 DMCreateGlobalVector() at > /Users/markadams/Codes/petsc/src/dm/interface/dm.c:1022 > [0]PETSC ERROR: #3 DMGetNamedGlobalVector() at > /Users/markadams/Codes/petsc/src/dm/interface/dmget.c:377 > [0]PETSC ERROR: #4 DMRestrictHook_SNESVecSol() at > /Users/markadams/Codes/petsc/src/snes/interface/snes.c:649 > [0]PETSC ERROR: #5 DMRestrict() at > /Users/markadams/Codes/petsc/src/dm/interface/dm.c:3407 > [0]PETSC ERROR: #6 PCSetUp_MG() at > /Users/markadams/Codes/petsc/src/ksp/pc/impls/mg/mg.c:1074 > [0]PETSC ERROR: #7 PCSetUp() at > /Users/markadams/Codes/petsc/src/ksp/pc/interface/precon.c:994 > [0]PETSC ERROR: #8 KSPSetUp() at > /Users/markadams/Codes/petsc/src/ksp/ksp/interface/itfunc.c:406 > [0]PETSC ERROR: #9 KSPSolve_Private() at > /Users/markadams/Codes/petsc/src/ksp/ksp/interface/itfunc.c:824 > [0]PETSC ERROR: #10 KSPSolve() at > /Users/markadams/Codes/petsc/src/ksp/ksp/interface/itfunc.c:1070 > [0]PETSC ERROR: #11 SNESSolve_KSPONLY() at > /Users/markadams/Codes/petsc/src/snes/impls/ksponly/ksponly.c:48 > [0]PETSC ERROR: #12 SNESSolve() at > /Users/markadams/Codes/petsc/src/snes/interface/snes.c:4663 > [0]PETSC ERROR: #13 main() at ex96.c:433 > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Tue May 9 20:07:45 2023 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 9 May 2023 21:07:45 -0400 Subject: [petsc-users] PCMG questions In-Reply-To: References: Message-ID: <6FA6127D-D052-46A2-BFCB-082F55BBDE86@petsc.dev> > Must call DMShellSetGlobalVector() or DMShellSetCreateGlobalVector() > [0]PETSC ERROR: #1 DMCreateGlobalVector_Shell() at /Users/markadams/Codes/petsc/src/dm/impls/shell/dmshell.c:210 It looks like you have built a DMSHELL? You need to teach it how to generate global vectors since yours currently does not. Barry > On May 9, 2023, at 5:40 PM, Mark Adams wrote: > > > > On Tue, May 9, 2023 at 3:01?PM Barry Smith > wrote: >> >> >>> On May 9, 2023, at 12:32 PM, Mark Adams > wrote: >>> >>> I have a MG hierarchy that I construct manually with DMRefine and DMPlexExtrude. >>> >>> * The solver works great with chevy/sor but with chevy/sor it converges slowly or I get indefinite PC errors from CG. And the eigen estimates in cheby are really high, like 10-15. >> >> So with Cheby/SOR it works great but with the exact same options Cheby/SOR it behaves poorly? Are you using some quantum computer and NERSc? 
> > It turned out that I had the sign wrong on my Laplacian point function and so the matrix was negative definite. I'm not sure what really happened exactly but it is sort of behaving better. > It looks like my prolongation operator is garbage, the coarse grid correction does nothing (cg/jacobi converges in a little less that the number of MG iterations times the sum of pre and post smoothing steps), and the rows sums of P are not 1. > Not sure what is going on there, but it is probably related to the DM hierarchy not being constructed correctly.... >>> >>> * I tried turning galerkin=none and I got this error. >> >> This is because without Garkin it needs to restrict the current solution and then compute the coarse grid Jacobian. Since you did not provide a DM that has the ability to even generate coarse grid vectors the process can work. You need a DM that can provide the coarse grid vectors and restrict solutions. Did you forget to pass a DM to the solver? > > The DM does everything. Similar to many examples but I've been checking with snes/tutorials/ex12.c today. > I do call: > PetscCall(DMSetCoarseDM(dmhierarchy[r], dmhierarchy[r-1])); > But I am missing something else that goes on in DMRefineHierarchy, which I can't use because I am semi-coarsening. > I probably have to build a section on each DM or something, but I have bigger fish to fry at this point. > > (I construct a 2D coarse grid, refine that a number of times and DMPlexExtrude each one the same amount (number and distance), the extruded direction is wrapped around a torus and made periodic. > The fine grid now looks like, and will eventually be, the grids that tokamak codes use.) > > Thanks, > Mark > >>> >>> Any thoughts on either of these issues? >>> >>> Thanks, >>> Mark >>> >>> [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >>> [0]PETSC ERROR: Must call DMShellSetGlobalVector() or DMShellSetCreateGlobalVector() >>> [0]PETSC ERROR: WARNING! There are option(s) set that were not used! Could be the program crashed before they were used or a spelling mistake, etc! >>> [0]PETSC ERROR: Option left: name:-ksp_converged_reason (no value) source: command line >>> [0]PETSC ERROR: Option left: name:-mg_levels_esteig_ksp_type value: cg source: command line >>> [0]PETSC ERROR: Option left: name:-mg_levels_pc_type value: sor source: command line >>> [0]PETSC ERROR: Option left: name:-options_left (no value) source: command line >>> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
>>> [0]PETSC ERROR: Petsc Development GIT revision: v3.19.1-224-g9ed82936d20 GIT Date: 2023-05-07 12:33:48 -0400 >>> [0]PETSC ERROR: ./ex96 on a arch-macosx-gnu-O named MarksMac-302.local by markadams Tue May 9 12:26:52 2023 >>> [0]PETSC ERROR: Configure options CFLAGS="-g -Wall" CXXFLAGS="-g -Wall" COPTFLAGS=-O CXXOPTFLAGS=-O --with-cc=/usr/local/opt/llvm/bin/clang --with-cxx=/usr/local/opt/llvm/bin/clang++ --download-mpich --with-strict-petscerrorcode --download-triangle=1 --with-x=0 --with-debugging=0 --download-hdf5=1 PETSC_ARCH=arch-macosx-gnu-O >>> [0]PETSC ERROR: #1 DMCreateGlobalVector_Shell() at /Users/markadams/Codes/petsc/src/dm/impls/shell/dmshell.c:210 >>> [0]PETSC ERROR: #2 DMCreateGlobalVector() at /Users/markadams/Codes/petsc/src/dm/interface/dm.c:1022 >>> [0]PETSC ERROR: #3 DMGetNamedGlobalVector() at /Users/markadams/Codes/petsc/src/dm/interface/dmget.c:377 >>> [0]PETSC ERROR: #4 DMRestrictHook_SNESVecSol() at /Users/markadams/Codes/petsc/src/snes/interface/snes.c:649 >>> [0]PETSC ERROR: #5 DMRestrict() at /Users/markadams/Codes/petsc/src/dm/interface/dm.c:3407 >>> [0]PETSC ERROR: #6 PCSetUp_MG() at /Users/markadams/Codes/petsc/src/ksp/pc/impls/mg/mg.c:1074 >>> [0]PETSC ERROR: #7 PCSetUp() at /Users/markadams/Codes/petsc/src/ksp/pc/interface/precon.c:994 >>> [0]PETSC ERROR: #8 KSPSetUp() at /Users/markadams/Codes/petsc/src/ksp/ksp/interface/itfunc.c:406 >>> [0]PETSC ERROR: #9 KSPSolve_Private() at /Users/markadams/Codes/petsc/src/ksp/ksp/interface/itfunc.c:824 >>> [0]PETSC ERROR: #10 KSPSolve() at /Users/markadams/Codes/petsc/src/ksp/ksp/interface/itfunc.c:1070 >>> [0]PETSC ERROR: #11 SNESSolve_KSPONLY() at /Users/markadams/Codes/petsc/src/snes/impls/ksponly/ksponly.c:48 >>> [0]PETSC ERROR: #12 SNESSolve() at /Users/markadams/Codes/petsc/src/snes/interface/snes.c:4663 >>> [0]PETSC ERROR: #13 main() at ex96.c:433 >>> >>> >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Tue May 9 21:30:02 2023 From: mfadams at lbl.gov (Mark Adams) Date: Tue, 9 May 2023 22:30:02 -0400 Subject: [petsc-users] PCMG questions In-Reply-To: <6FA6127D-D052-46A2-BFCB-082F55BBDE86@petsc.dev> References: <6FA6127D-D052-46A2-BFCB-082F55BBDE86@petsc.dev> Message-ID: No, there is no DMSHELL. The code is in adams/snes-example-tokamac: src/dm/impls/plex/tests/ex96.c With Galerkin coarse grids, the solver does solve but it is wrong so I am going to focus on that. I am manually doing what is in DMRefineHierarchy_Plex and there are a few things that I am not doing. This code does not work in parallel but I need to redo my MPI syntax strategy here (building a tree in MPI), so don't look at that. (the first few coarse grids are attached, wip) Thanks, Mark On Tue, May 9, 2023 at 9:07?PM Barry Smith wrote: > > Must call DMShellSetGlobalVector() or DMShellSetCreateGlobalVector() > [0]PETSC ERROR: #1 DMCreateGlobalVector_Shell() at > /Users/markadams/Codes/petsc/src/dm/impls/shell/dmshell.c:210 > > > It looks like you have built a DMSHELL? You need to teach it how to > generate global vectors since yours currently does not. > > Barry > > > On May 9, 2023, at 5:40 PM, Mark Adams wrote: > > > > On Tue, May 9, 2023 at 3:01?PM Barry Smith wrote: > >> >> >> On May 9, 2023, at 12:32 PM, Mark Adams wrote: >> >> I have a MG hierarchy that I construct manually with DMRefine and >> DMPlexExtrude. >> >> * The solver works great with chevy/sor but with chevy/sor it converges >> slowly or I get indefinite PC errors from CG. 
And the eigen estimates in >> cheby are really high, like 10-15. >> >> >> So with Cheby/SOR it works great but with the exact same options >> Cheby/SOR it behaves poorly? Are you using some quantum computer and NERSc? >> > > It turned out that I had the sign wrong on my Laplacian point function and > so the matrix was negative definite. I'm not sure what really happened > exactly but it is sort of behaving better. > It looks like my prolongation operator is garbage, the coarse grid > correction does nothing (cg/jacobi converges in a little less that the > number of MG iterations times the sum of pre and post smoothing steps), and > the rows sums of P are not 1. > Not sure what is going on there, but it is probably related to the DM > hierarchy not being constructed correctly.... > >> >> * I tried turning galerkin=none and I got this error. >> >> >> This is because without Garkin it needs to restrict the current >> solution and then compute the coarse grid Jacobian. Since you did not >> provide a DM that has the ability to even generate coarse grid vectors the >> process can work. You need a DM that can provide the coarse grid vectors >> and restrict solutions. Did you forget to pass a DM to the solver? >> > > The DM does everything. Similar to many examples but I've been checking > with snes/tutorials/ex12.c today. > I do call: > PetscCall(DMSetCoarseDM(dmhierarchy[r], dmhierarchy[r-1])); > But I am missing something else that goes on in DMRefineHierarchy, which I > can't use because I am semi-coarsening. > I probably have to build a section on each DM or something, but I have > bigger fish to fry at this point. > > (I construct a 2D coarse grid, refine that a number of times and > DMPlexExtrude each one the same amount (number and distance), the extruded > direction is wrapped around a torus and made periodic. > The fine grid now looks like, and will eventually be, the grids that > tokamak codes use.) > > Thanks, > Mark > > >> >> Any thoughts on either of these issues? >> >> Thanks, >> Mark >> >> [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> [0]PETSC ERROR: Must call DMShellSetGlobalVector() or >> DMShellSetCreateGlobalVector() >> [0]PETSC ERROR: WARNING! There are option(s) set that were not used! >> Could be the program crashed before they were used or a spelling mistake, >> etc! >> [0]PETSC ERROR: Option left: name:-ksp_converged_reason (no value) >> source: command line >> [0]PETSC ERROR: Option left: name:-mg_levels_esteig_ksp_type value: cg >> source: command line >> [0]PETSC ERROR: Option left: name:-mg_levels_pc_type value: sor source: >> command line >> [0]PETSC ERROR: Option left: name:-options_left (no value) source: >> command line >> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
>> [0]PETSC ERROR: Petsc Development GIT revision: v3.19.1-224-g9ed82936d20 >> GIT Date: 2023-05-07 12:33:48 -0400 >> [0]PETSC ERROR: ./ex96 on a arch-macosx-gnu-O named MarksMac-302.local by >> markadams Tue May 9 12:26:52 2023 >> [0]PETSC ERROR: Configure options CFLAGS="-g -Wall" CXXFLAGS="-g -Wall" >> COPTFLAGS=-O CXXOPTFLAGS=-O --with-cc=/usr/local/opt/llvm/bin/clang >> --with-cxx=/usr/local/opt/llvm/bin/clang++ --download-mpich >> --with-strict-petscerrorcode --download-triangle=1 --with-x=0 >> --with-debugging=0 --download-hdf5=1 PETSC_ARCH=arch-macosx-gnu-O >> [0]PETSC ERROR: #1 DMCreateGlobalVector_Shell() at >> /Users/markadams/Codes/petsc/src/dm/impls/shell/dmshell.c:210 >> [0]PETSC ERROR: #2 DMCreateGlobalVector() at >> /Users/markadams/Codes/petsc/src/dm/interface/dm.c:1022 >> [0]PETSC ERROR: #3 DMGetNamedGlobalVector() at >> /Users/markadams/Codes/petsc/src/dm/interface/dmget.c:377 >> [0]PETSC ERROR: #4 DMRestrictHook_SNESVecSol() at >> /Users/markadams/Codes/petsc/src/snes/interface/snes.c:649 >> [0]PETSC ERROR: #5 DMRestrict() at >> /Users/markadams/Codes/petsc/src/dm/interface/dm.c:3407 >> [0]PETSC ERROR: #6 PCSetUp_MG() at >> /Users/markadams/Codes/petsc/src/ksp/pc/impls/mg/mg.c:1074 >> [0]PETSC ERROR: #7 PCSetUp() at >> /Users/markadams/Codes/petsc/src/ksp/pc/interface/precon.c:994 >> [0]PETSC ERROR: #8 KSPSetUp() at >> /Users/markadams/Codes/petsc/src/ksp/ksp/interface/itfunc.c:406 >> [0]PETSC ERROR: #9 KSPSolve_Private() at >> /Users/markadams/Codes/petsc/src/ksp/ksp/interface/itfunc.c:824 >> [0]PETSC ERROR: #10 KSPSolve() at >> /Users/markadams/Codes/petsc/src/ksp/ksp/interface/itfunc.c:1070 >> [0]PETSC ERROR: #11 SNESSolve_KSPONLY() at >> /Users/markadams/Codes/petsc/src/snes/impls/ksponly/ksponly.c:48 >> [0]PETSC ERROR: #12 SNESSolve() at >> /Users/markadams/Codes/petsc/src/snes/interface/snes.c:4663 >> [0]PETSC ERROR: #13 main() at ex96.c:433 >> >> >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Screenshot 2023-05-09 at 7.49.16 AM.png Type: image/png Size: 572809 bytes Desc: not available URL: From bsmith at petsc.dev Tue May 9 21:43:32 2023 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 9 May 2023 22:43:32 -0400 Subject: [petsc-users] PCMG questions In-Reply-To: References: <6FA6127D-D052-46A2-BFCB-082F55BBDE86@petsc.dev> Message-ID: <569581EB-6C65-4B8E-9A3E-C44FE7531AAC@petsc.dev> Based on the error message, there is a DMSHELL. Run with -start_in_debugger and put a breakpoint in DMCreate_Shell, and you'll find out where it is being called > On May 9, 2023, at 10:30 PM, Mark Adams wrote: > > No, there is no DMSHELL. > > The code is in adams/snes-example-tokamac: src/dm/impls/plex/tests/ex96.c > > With Galerkin coarse grids, the solver does solve but it is wrong so I am going to focus on that. > I am manually doing what is in DMRefineHierarchy_Plex and there are a few things that I am not doing. > This code does not work in parallel but I need to redo my MPI syntax strategy here (building a tree in MPI), so don't look at that. > (the first few coarse grids are attached, wip) > > Thanks, > Mark > > > On Tue, May 9, 2023 at 9:07?PM Barry Smith > wrote: >> >>> Must call DMShellSetGlobalVector() or DMShellSetCreateGlobalVector() >>> [0]PETSC ERROR: #1 DMCreateGlobalVector_Shell() at /Users/markadams/Codes/petsc/src/dm/impls/shell/dmshell.c:210 >> >> It looks like you have built a DMSHELL? 
You need to teach it how to generate global vectors since yours currently does not. >> >> Barry >> >> >>> On May 9, 2023, at 5:40 PM, Mark Adams > wrote: >>> >>> >>> >>> On Tue, May 9, 2023 at 3:01?PM Barry Smith > wrote: >>>> >>>> >>>>> On May 9, 2023, at 12:32 PM, Mark Adams > wrote: >>>>> >>>>> I have a MG hierarchy that I construct manually with DMRefine and DMPlexExtrude. >>>>> >>>>> * The solver works great with chevy/sor but with chevy/sor it converges slowly or I get indefinite PC errors from CG. And the eigen estimates in cheby are really high, like 10-15. >>>> >>>> So with Cheby/SOR it works great but with the exact same options Cheby/SOR it behaves poorly? Are you using some quantum computer and NERSc? >>> >>> It turned out that I had the sign wrong on my Laplacian point function and so the matrix was negative definite. I'm not sure what really happened exactly but it is sort of behaving better. >>> It looks like my prolongation operator is garbage, the coarse grid correction does nothing (cg/jacobi converges in a little less that the number of MG iterations times the sum of pre and post smoothing steps), and the rows sums of P are not 1. >>> Not sure what is going on there, but it is probably related to the DM hierarchy not being constructed correctly.... >>>>> >>>>> * I tried turning galerkin=none and I got this error. >>>> >>>> This is because without Garkin it needs to restrict the current solution and then compute the coarse grid Jacobian. Since you did not provide a DM that has the ability to even generate coarse grid vectors the process can work. You need a DM that can provide the coarse grid vectors and restrict solutions. Did you forget to pass a DM to the solver? >>> >>> The DM does everything. Similar to many examples but I've been checking with snes/tutorials/ex12.c today. >>> I do call: >>> PetscCall(DMSetCoarseDM(dmhierarchy[r], dmhierarchy[r-1])); >>> But I am missing something else that goes on in DMRefineHierarchy, which I can't use because I am semi-coarsening. >>> I probably have to build a section on each DM or something, but I have bigger fish to fry at this point. >>> >>> (I construct a 2D coarse grid, refine that a number of times and DMPlexExtrude each one the same amount (number and distance), the extruded direction is wrapped around a torus and made periodic. >>> The fine grid now looks like, and will eventually be, the grids that tokamak codes use.) >>> >>> Thanks, >>> Mark >>> >>>>> >>>>> Any thoughts on either of these issues? >>>>> >>>>> Thanks, >>>>> Mark >>>>> >>>>> [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >>>>> [0]PETSC ERROR: Must call DMShellSetGlobalVector() or DMShellSetCreateGlobalVector() >>>>> [0]PETSC ERROR: WARNING! There are option(s) set that were not used! Could be the program crashed before they were used or a spelling mistake, etc! >>>>> [0]PETSC ERROR: Option left: name:-ksp_converged_reason (no value) source: command line >>>>> [0]PETSC ERROR: Option left: name:-mg_levels_esteig_ksp_type value: cg source: command line >>>>> [0]PETSC ERROR: Option left: name:-mg_levels_pc_type value: sor source: command line >>>>> [0]PETSC ERROR: Option left: name:-options_left (no value) source: command line >>>>> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
>>>>> [0]PETSC ERROR: Petsc Development GIT revision: v3.19.1-224-g9ed82936d20 GIT Date: 2023-05-07 12:33:48 -0400 >>>>> [0]PETSC ERROR: ./ex96 on a arch-macosx-gnu-O named MarksMac-302.local by markadams Tue May 9 12:26:52 2023 >>>>> [0]PETSC ERROR: Configure options CFLAGS="-g -Wall" CXXFLAGS="-g -Wall" COPTFLAGS=-O CXXOPTFLAGS=-O --with-cc=/usr/local/opt/llvm/bin/clang --with-cxx=/usr/local/opt/llvm/bin/clang++ --download-mpich --with-strict-petscerrorcode --download-triangle=1 --with-x=0 --with-debugging=0 --download-hdf5=1 PETSC_ARCH=arch-macosx-gnu-O >>>>> [0]PETSC ERROR: #1 DMCreateGlobalVector_Shell() at /Users/markadams/Codes/petsc/src/dm/impls/shell/dmshell.c:210 >>>>> [0]PETSC ERROR: #2 DMCreateGlobalVector() at /Users/markadams/Codes/petsc/src/dm/interface/dm.c:1022 >>>>> [0]PETSC ERROR: #3 DMGetNamedGlobalVector() at /Users/markadams/Codes/petsc/src/dm/interface/dmget.c:377 >>>>> [0]PETSC ERROR: #4 DMRestrictHook_SNESVecSol() at /Users/markadams/Codes/petsc/src/snes/interface/snes.c:649 >>>>> [0]PETSC ERROR: #5 DMRestrict() at /Users/markadams/Codes/petsc/src/dm/interface/dm.c:3407 >>>>> [0]PETSC ERROR: #6 PCSetUp_MG() at /Users/markadams/Codes/petsc/src/ksp/pc/impls/mg/mg.c:1074 >>>>> [0]PETSC ERROR: #7 PCSetUp() at /Users/markadams/Codes/petsc/src/ksp/pc/interface/precon.c:994 >>>>> [0]PETSC ERROR: #8 KSPSetUp() at /Users/markadams/Codes/petsc/src/ksp/ksp/interface/itfunc.c:406 >>>>> [0]PETSC ERROR: #9 KSPSolve_Private() at /Users/markadams/Codes/petsc/src/ksp/ksp/interface/itfunc.c:824 >>>>> [0]PETSC ERROR: #10 KSPSolve() at /Users/markadams/Codes/petsc/src/ksp/ksp/interface/itfunc.c:1070 >>>>> [0]PETSC ERROR: #11 SNESSolve_KSPONLY() at /Users/markadams/Codes/petsc/src/snes/impls/ksponly/ksponly.c:48 >>>>> [0]PETSC ERROR: #12 SNESSolve() at /Users/markadams/Codes/petsc/src/snes/interface/snes.c:4663 >>>>> [0]PETSC ERROR: #13 main() at ex96.c:433 >>>>> >>>>> >>>>> >>>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Jiannan_Tu at uml.edu Tue May 9 21:59:24 2023 From: Jiannan_Tu at uml.edu (Tu, Jiannan) Date: Wed, 10 May 2023 02:59:24 +0000 Subject: [petsc-users] Unconditional jump or move depends on uninitialised value(s) In-Reply-To: References: <295E7E1A-1649-435F-AE65-F061F287513F@petsc.dev> <3F1BD989-8516-4649-A385-5F94FD1A9470@petsc.dev> <0B7BA32F-03CE-44F2-A9A3-4584B2D7AB94@anl.gov> <9523EDF9-7C02-4872-9E0E-1DFCBCB28066@anl.gov> <88895B56-2BEF-49BF-B6E9-F75186E712D9@anl.gov> Message-ID: I am using PETSC SNES to solve a nonlinear equation system resulted from discretization of partial differential equations. When I use Valgrind to check my program, there are lots of errors of ?Unconditional jump or move depends on uninitialised value(s)? produced. The errors occur in the function routine set by SNESSetFunction(snes, NULL, formfunctions, ¶ms). Even using only one MPI process such errors occur. It seems solution vector X is not initialized through SNESSolve(). But if the formfunctions() is called directly from main(), there are no errors. I really don?t understand why. Could you please help me identify what is going wrong? 
Thank you very much, Jiannan ------------------------------------------------- The error message is like the following example Conditional jump or move depends on uninitialised value(s) ==866758== at 0xA19178C: sqrt (w_sqrt_compat.c:31) ==866758== by 0x4EA9E4C: VecNorm_Seq (bvec2.c:227) ==866758== by 0x4F705C8: VecNorm (rvector.c:228) ==866758== by 0x65D5B16: SNESSolve_NEWTONLS (ls.c:179) ==866758== by 0x673093F: SNESSolve (snes.c:4809) ==866758== by 0x12F607: main (iditm3d.cpp:138) ==866758== Uninitialised value was created by a stack allocation ==866758== at 0x11AC36: functions(Field***, Field***, Field***, int, int, int, int, int, int, AppCtx*, Field***) (formfunctions.cpp:16) functions() is called within the formfunctions(). The code snippet using SNES solver is int formfunctions(SNES, Vec, Vec, void *ctx); int jacobian(SNES, Vec, Mat, Mat, void *ctx); DMDACreate3d(MPI_COMM_WORLD,DM_BOUNDARY_GHOSTED,DM_BOUNDARY_GHOSTED,DM_BOUNDARY_PERIODIC, DMDA_STENCIL_STAR,a1,a2,a3,PETSC_DECIDE,PETSC_DECIDE,1,a4,2,NULL,NULL,NULL,&da); DMSetFromOptions(da); DMSetUp(da); DMCreateGlobalVector(da, &X); VecDuplicate(X, ¶ms.U); VecDuplicate(X, ¶ms.Xn); /* set up grids and related geometric parameters. Set up initial solution vector X */ if (initialize(da, X, ¶ms) < 0) exit(-1); SNES snes; SNESCreate(MPI_COMM_WORLD, &snes); SNESSetType(snes, SNESNEWTONLS); SNESSetFromOptions(snes); SNESSetDM(snes, da); SNESSetFunction(snes, NULL, formfunctions, ¶ms); KSP ksp; SNESGetKSP(snes, &ksp); KSPSetType(ksp, KSPFGMRES); KSPSetFromOptions(ksp); PC pc; KSPGetPC(ksp, &pc); PCSetType(pc, PCJACOBI); PCSetFromOptions(pc); Mat A; DMSetMatrixPreallocateOnly(da, PETSC_FALSE); DMSetMatType(da, MATMPIAIJ); DMDASetBlockFills(da, dfill, ofill); DMCreateMatrix(da, &A); SNESSetJacobian(snes, A, A, jacobian, ¶ms); SNESSetSolution(snes, X); //set initial guess of the solution SNESSolve(snes, PETSC_NULL, X); //iterative solver to find the solution From: Zhang, Hong Sent: Tuesday, February 21, 2023 11:21 PM To: Tu, Jiannan Cc: Barry Smith; Hong Zhang; Constantinescu, Emil M.; petsc-users Subject: Re: [petsc-users] TS failed due to diverged_step_rejected CAUTION: This email was sent from outside the UMass Lowell network. On Feb 21, 2023, at 8:54 PM, Tu, Jiannan wrote: CN or BEular doesn?t work. They produce negative densities at the lower boundary even RHS functions are positive. So for TS, all equations must include udot? You have algebraic constraints only for the boundary points. For all the other points, you must have udot in IFunction. I recommend you to take a look at the example src/ts/tutorials/ex25.c Hong (Mr.) Thank you, Jiannan From: Zhang, Hong Sent: Monday, February 20, 2023 11:07 AM To: Tu, Jiannan Cc: Barry Smith; Hong Zhang; Constantinescu, Emil M.; petsc-users Subject: Re: [petsc-users] TS failed due to diverged_step_rejected CAUTION: This email was sent from outside the UMass Lowell network. If you have to include the boundary points, I would suggest starting from a fully implicit solver such as CN or BEuler with a finite-difference approximated Jacobian. When this works for a small scale setting, you can build up more functionalities such as IMEX and analytical Jacobians and extend the problem to a larger scale. But the udot issue needs to be fixed in the first place. Hong (Mr.) On Feb 19, 2023, at 9:23 PM, Tu, Jiannan wrote: It is the second order derivative of, say electron temperature = 0 at the boundary. 
I am not sure how I can exclude the boundary points because the values of unknowns must be specified at the boundary. Are there any other solvers, e.g., CN, good to solve the equation system? Thank you, Jiannan From: Zhang, Hong Sent: Sunday, February 19, 2023 4:48 PM To: Tu, Jiannan Cc: Barry Smith; Hong Zhang; Constantinescu, Emil M.; petsc-users Subject: Re: [petsc-users] TS failed due to diverged_step_rejected CAUTION: This email was sent from outside the UMass Lowell network. It is fine to drop udot for the boundary points, but you need to keep udot for all the other points. In addition, which boundary condition do you use in IFunction? The way you are treating the boundary points actually leads to a system of differential-algebraic equations, which could be difficult to solve with the ARKIMEX solver. Can you try to exclude the boundary points from the computational domain so that you will have just a system of ODEs? Hong (Mr.) On Feb 18, 2023, at 4:28 PM, Tu, Jiannan wrote: Thanks for the instruction. This is the boundary condition and there is no udot in the equation. I think this is the way to define IFunction at the boundary. Maybe I?m wrong? Or is there some way to introduce udot into the specification of the equation at the boundary from the aspect of the implementation for TS? Thank you, Jiannan From: Zhang, Hong Sent: Saturday, February 18, 2023 12:40 PM To: Tu, Jiannan Cc: Barry Smith; Hong Zhang; Constantinescu, Emil M.; petsc-users Subject: Re: [petsc-users] TS failed due to diverged_step_rejected You don't often get email from hongzhang at anl.gov. Learn why this is important CAUTION: This email was sent from outside the UMass Lowell network. On Feb 18, 2023, at 8:44 AM, Tu, Jiannan wrote: The RHS function at the bottom boundary is determined by the boundary condition, which is the second order derivative = 0, i.e. G(u) = 2*X[i=1] ? X[i=2]. Then in IFunction, F(u, udot) = X[i=0]. This might be the problem. Your F(u, udot) is missing udot according to your description. Take a simple ODE udot = f(u) + g(u) for example. One way to partition this ODE is to define F = udot - f(u) as the IFunction and G = g(u) as the RHSFunction. Hong (Mr.) Thank you, Jiannan From: Zhang, Hong Sent: Friday, February 17, 2023 11:54 PM To: Tu, Jiannan Cc: Barry Smith; Hong Zhang; Constantinescu, Emil M.; petsc-users Subject: Re: [petsc-users] TS failed due to diverged_step_rejected You don't often get email from hongzhang at anl.gov. Learn why this is important CAUTION: This email was sent from outside the UMass Lowell network. On Feb 17, 2023, at 6:19 PM, Tu, Jiannan wrote: I need to find out what causes negative temperature first. Following is the message with adaptivity turned off. The G(u) gives right-hand equation for electron temperature at bottom boundary. The F(u, u?) function is F(u, u?) = X = G(u) and the jacobian element is d F(u, u?) / dX =1. This looks strange. Can you elaborate a bit on your partitioned ODE? For example, how are your F(u,udot) (IFunction) and G(u) (RHSFunction) defined? A good IMEX example can be found at ts/tutorial/advection-diffusion-reaction/ex5.c (and reaction_diffusion.c). Hong (Mr.) The solution from TSStep is checked for positivity of densities and temperatures. >From the message below, it is seen that G(u) > 0 (I added output of right-hand equation for electron temperature). The solution for electron temperature X should be X * jacobian element = G(u) > 0 since jacobian element = 1. I don?t understand why it becomes negative. 
Is my understanding of TS formula incorrect? Thank you, Jiannan ---------------------------------- G(u) = 1.86534e-07 0 SNES Function norm 2.274473072183e+03 1 SNES Function norm 8.641749325070e-04 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 G(u) = 1.86534e-07 0 SNES Function norm 8.716501970511e-02 1 SNES Function norm 2.213263548813e-04 2 SNES Function norm 2.779985176426e-08 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 2 G(u) = 1.86534e-07 0 SNES Function norm 3.177195995186e-01 1 SNES Function norm 3.607702491344e-04 2 SNES Function norm 4.345809629121e-08 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 2 G(u) = 1.86534e-07 TSAdapt none arkimex 0:3 step 0 accepted t=42960 + 2.189e-02 dt=2.189e-02 electron temperature = -3.6757e-15 at (i, j, k) = (0, 1, 0) From: Barry Smith Sent: Friday, February 17, 2023 3:45 PM To: Tu, Jiannan; Hong Zhang; Emil Constantinescu Cc: petsc-users Subject: Re: [petsc-users] TS failed due to diverged_step_rejected CAUTION: This email was sent from outside the UMass Lowell network. On Feb 17, 2023, at 3:32 PM, Tu, Jiannan wrote: The ts_type arkimex is used. There is right hand-side function RHSFunction set by TSSetRHSFunction() and also stiff function set by TSSetIFunction(). With adaptivity shut off, TS can finish its first time step after the 3rd ?Nonlinear solve converged due to ??. The solution gives negative electron and neutral temperatures at the bottom boundary. I need to fix the negative temperatures and see how the code works. BTW, what is this ts_adapt? Is it by default on? It is default for some of the TSTypes (in particular, the better ones). It adapts the timestep to ensure some local error estimate is below a certain tolerance. As Matt notes normally as it tries smaller and smaller time steps the local error estimate would get smaller and smaller; this is not happening here, hence the error. Have you tried with the argument -ts_arkimex_fully_implicit ? I am not an expert but my guess is something is "odd" about your functions, either the RHSFunction or the Function or both. Do you have a hierarchy of models for your problem? Could you try runs with fewer terms in your functions, that may be producing the difficulties? If you can determine what triggers the problem with the local error estimators, that might help the experts in ODE solution (not me) determine what could be going wrong. Barry Thank you, Jiannan From: Matthew Knepley Sent: Friday, February 17, 2023 3:15 PM To: Tu, Jiannan Cc: Barry Smith; petsc-users Subject: Re: [petsc-users] TS failed due to diverged_step_rejected CAUTION: This email was sent from outside the UMass Lowell network. I am not sure what TS you are using, but the estimate of the local truncation error is 91.4, and does not seem to change when you make the step smaller, so something is off. You can shut off the adaptivity using -ts_adapt_type none Thanks, Matt On Fri, Feb 17, 2023 at 3:01 PM Tu, Jiannan > wrote: These are what I got with the options you suggested. 
Thank you, Jiannan ------------------------------------------------------------------------------- 0 SNES Function norm 2.274473072186e+03 1 SNES Function norm 1.673091274668e-03 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 0 SNES Function norm 8.715428433630e-02 1 SNES Function norm 4.995727626692e-04 2 SNES Function norm 5.498018152230e-08 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 2 0 SNES Function norm 3.237461568254e-01 1 SNES Function norm 7.988531005091e-04 2 SNES Function norm 1.280948196292e-07 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 2 TSAdapt basic arkimex 0:3 step 0 rejected t=42960 + 2.189e-02 dt=4.374e-03 wlte= 91.4 wltea= -1 wlter= -1 0 SNES Function norm 2.274473072186e+03 1 SNES Function norm 4.881903203545e-04 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 0 SNES Function norm 7.562592690785e-02 1 SNES Function norm 1.143078818923e-04 2 SNES Function norm 9.834547907735e-09 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 2 0 SNES Function norm 2.683968949758e-01 1 SNES Function norm 1.838028436639e-04 2 SNES Function norm 9.470813523140e-09 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 2 TSAdapt basic arkimex 0:3 step 0 rejected t=42960 + 4.374e-03 dt=4.374e-04 wlte= 91.4 wltea= -1 wlter= -1 0 SNES Function norm 2.274473072186e+03 1 SNES Function norm 1.821562431175e-04 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 0 SNES Function norm 1.005443458812e-01 1 SNES Function norm 3.633336946661e-05 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 0 SNES Function norm 1.515368382715e-01 1 SNES Function norm 3.389298316830e-05 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 TSAdapt basic arkimex 0:3 step 0 rejected t=42960 + 4.374e-04 dt=4.374e-05 wlte= 91.4 wltea= -1 wlter= -1 0 SNES Function norm 2.274473072186e+03 1 SNES Function norm 4.541003359206e-05 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 0 SNES Function norm 1.713800906043e-01 1 SNES Function norm 1.179958172167e-05 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 0 SNES Function norm 2.020265094117e-01 1 SNES Function norm 1.513971290464e-05 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 TSAdapt basic arkimex 0:3 step 0 rejected t=42960 + 4.374e-05 dt=4.374e-06 wlte= 91.4 wltea= -1 wlter= -1 0 SNES Function norm 2.274473072186e+03 1 SNES Function norm 6.090269704320e-06 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 0 SNES Function norm 2.136603895703e-01 1 SNES Function norm 1.877474016012e-06 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 0 SNES Function norm 3.127812462507e-01 1 SNES Function norm 2.713146825704e-06 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 TSAdapt basic arkimex 0:3 step 0 rejected t=42960 + 4.374e-06 dt=4.374e-07 wlte= 91.4 wltea= -1 wlter= -1 0 SNES Function norm 2.274473072186e+03 1 SNES Function norm 2.793512213059e-06 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 0 SNES Function norm 2.205196267430e-01 1 SNES Function norm 2.572653773308e-06 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 0 SNES Function norm 3.260057361977e-01 1 SNES Function norm 2.705816087598e-06 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 TSAdapt basic arkimex 0:3 step 0 rejected t=42960 + 4.374e-07 dt=4.374e-08 wlte= 91.4 wltea= -1 wlter= -1 0 SNES Function 
norm 2.274473072186e+03 1 SNES Function norm 2.764855860446e-05 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 0 SNES Function norm 2.212505522844e-01 1 SNES Function norm 2.958996472386e-05 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 0 SNES Function norm 3.273222034162e-01 1 SNES Function norm 2.994512887620e-05 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 TSAdapt basic arkimex 0:3 step 0 rejected t=42960 + 4.374e-08 dt=4.374e-09 wlte= 91.4 wltea= -1 wlter= -1 0 SNES Function norm 2.274473072186e+03 1 SNES Function norm 3.317240589134e-04 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 0 SNES Function norm 2.213246532918e-01 1 SNES Function norm 2.799468604767e-04 Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 0 SNES Function norm 3.274570888397e-01 1 SNES Function norm 3.066048050994e-04 Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 TSAdapt basic arkimex 0:3 step 0 rejected t=42960 + 4.374e-09 dt=4.374e-10 wlte= 91.4 wltea= -1 wlter= -1 0 SNES Function norm 2.274473072189e+03 1 SNES Function norm 2.653507278572e-03 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 0 SNES Function norm 2.213869585841e-01 1 SNES Function norm 2.177156902895e-03 Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 0 SNES Function norm 3.275136370365e-01 1 SNES Function norm 1.962849131557e-03 Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 TSAdapt basic arkimex 0:3 step 0 rejected t=42960 + 4.374e-10 dt=4.374e-11 wlte= 91.4 wltea= -1 wlter= -1 0 SNES Function norm 2.274473072218e+03 1 SNES Function norm 5.664907315679e-03 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 0 SNES Function norm 2.223208399368e-01 1 SNES Function norm 5.688863091415e-03 Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 0 SNES Function norm 3.287121218919e-01 1 SNES Function norm 4.085338521320e-03 Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 TSAdapt basic arkimex 0:3 step 0 rejected t=42960 + 4.374e-11 dt=4.374e-12 wlte= 91.4 wltea= -1 wlter= -1 0 SNES Function norm 2.274473071968e+03 1 SNES Function norm 4.694691905235e-04 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 0 SNES Function norm 2.211786508657e-01 1 SNES Function norm 1.503497433939e-04 Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 0 SNES Function norm 3.272667798977e-01 1 SNES Function norm 2.176132327279e-04 Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 TSAdapt basic arkimex 0:3 step 0 rejected t=42960 + 4.374e-12 dt=4.374e-13 wlte= 91.4 wltea= -1 wlter= -1 [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: [0]PETSC ERROR: TSStep has failed due to DIVERGED_STEP_REJECTED [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.16.6, Mar 30, 2022 [0]PETSC ERROR: ./iditm3d on a named office by jtu Fri Feb 17 14:54:22 2023 [0]PETSC ERROR: Configure options --prefix=/usr/local --with-mpi-dir=/usr/local --with-fc=0 --with-openmp --with-hdf5-dir=/usr/local --download-f2cblaslapack=1 [0]PETSC ERROR: #1 TSStep() at /home/jtu/Downloads/petsc-3.16.6/src/ts/interface/ts.c:3583 From: Barry Smith Sent: Friday, February 17, 2023 12:58 PM To: Tu, Jiannan Cc: petsc-users Subject: Re: [petsc-users] TS failed due to diverged_step_rejected CAUTION: This email was sent from outside the UMass Lowell network. Can you please run with also the options -ts_monitor -ts_adapt_monitor ? The output is confusing because it prints that the Nonlinear solve has converged but then TSStep has failed due to DIVERGED_STEP_REJECTED which seems contradictory On Feb 17, 2023, at 12:09 PM, Tu, Jiannan > wrote: My code uses TS to solve a set of multi-fluid MHD equations. The jacobian is provided with function F(t, u, u'). Both linear and nonlinear solvers converge but snes repeats itself until gets "TSStep has failed due to diverged_step_rejected." Is it because I used TSStep rather than TSSolve? I have checked the condition number. The condition number with pc_type asm is about 1 (without precondition it is about 4x10^4). The maximum ratio of off-diagonal jacobian element over diagonal element is about 21. Could you help me to identify what is going wrong? Thank you very much! Jiannan --------------------------------------------------------------------------------------------------- Run command with options mpiexec -n $1 ./iditm3d -ts_type arkimex -snes_tyep ngmres -ksp_type gmres -pc_type asm \ -ts_rtol 1.0e-4 -ts_atol 1.0e-4 -snes_monitor -snes_rtol 1.0e-4 -snes_atol 1.0e-4 \ -snes_converged_reason The output message is Start time advancing ... 
0 SNES Function norm 2.274473072186e+03 1 SNES Function norm 1.673091274668e-03 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 0 SNES Function norm 8.715428433630e-02 1 SNES Function norm 4.995727626692e-04 2 SNES Function norm 5.498018152230e-08 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 2 0 SNES Function norm 3.237461568254e-01 1 SNES Function norm 7.988531005091e-04 2 SNES Function norm 1.280948196292e-07 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 2 0 SNES Function norm 2.274473072186e+03 1 SNES Function norm 4.881903203545e-04 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 0 SNES Function norm 7.562592690785e-02 1 SNES Function norm 1.143078818923e-04 2 SNES Function norm 9.834547907735e-09 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 2 0 SNES Function norm 2.683968949758e-01 1 SNES Function norm 1.838028436639e-04 2 SNES Function norm 9.470813523140e-09 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 2 0 SNES Function norm 2.274473072186e+03 1 SNES Function norm 1.821562431175e-04 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 0 SNES Function norm 1.005443458812e-01 1 SNES Function norm 3.633336946661e-05 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 0 SNES Function norm 1.515368382715e-01 1 SNES Function norm 3.389298316830e-05 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 0 SNES Function norm 2.274473072186e+03 1 SNES Function norm 4.541003359206e-05 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 0 SNES Function norm 1.713800906043e-01 1 SNES Function norm 1.179958172167e-05 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 0 SNES Function norm 2.020265094117e-01 1 SNES Function norm 1.513971290464e-05 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 0 SNES Function norm 2.274473072186e+03 1 SNES Function norm 6.090269704320e-06 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 0 SNES Function norm 2.136603895703e-01 1 SNES Function norm 1.877474016012e-06 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 0 SNES Function norm 3.127812462507e-01 1 SNES Function norm 2.713146825704e-06 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 0 SNES Function norm 2.274473072186e+03 1 SNES Function norm 2.793512213059e-06 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 0 SNES Function norm 2.205196267430e-01 1 SNES Function norm 2.572653773308e-06 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 0 SNES Function norm 3.260057361977e-01 1 SNES Function norm 2.705816087598e-06 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 0 SNES Function norm 2.274473072186e+03 1 SNES Function norm 2.764855860446e-05 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 0 SNES Function norm 2.212505522844e-01 1 SNES Function norm 2.958996472386e-05 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 0 SNES Function norm 3.273222034162e-01 1 SNES Function norm 2.994512887620e-05 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 0 SNES Function norm 2.274473072186e+03 1 SNES Function norm 3.317240589134e-04 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 0 SNES Function norm 2.213246532918e-01 1 SNES Function norm 2.799468604767e-04 Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 0 SNES 
Function norm 3.274570888397e-01 1 SNES Function norm 3.066048050994e-04 Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 0 SNES Function norm 2.274473072189e+03 1 SNES Function norm 2.653507278572e-03 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 0 SNES Function norm 2.213869585841e-01 1 SNES Function norm 2.177156902895e-03 Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 0 SNES Function norm 3.275136370365e-01 1 SNES Function norm 1.962849131557e-03 Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 0 SNES Function norm 2.274473072218e+03 1 SNES Function norm 5.664907315679e-03 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 0 SNES Function norm 2.223208399368e-01 1 SNES Function norm 5.688863091415e-03 Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 0 SNES Function norm 3.287121218919e-01 1 SNES Function norm 4.085338521320e-03 Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 0 SNES Function norm 2.274473071968e+03 1 SNES Function norm 4.694691905235e-04 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 0 SNES Function norm 2.211786508657e-01 1 SNES Function norm 1.503497433939e-04 Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 0 SNES Function norm 3.272667798977e-01 1 SNES Function norm 2.176132327279e-04 Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: [0]PETSC ERROR: TSStep has failed due to DIVERGED_STEP_REJECTED [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.16.6, Mar 30, 2022 [0]PETSC ERROR: ./iditm3d on a named office by jtu Fri Feb 17 11:59:43 2023 [0]PETSC ERROR: Configure options --prefix=/usr/local --with-mpi-dir=/usr/local --with-fc=0 --with-openmp --with-hdf5-dir=/usr/local --download-f2cblaslapack=1 [0]PETSC ERROR: #1 TSStep() at /home/jtu/Downloads/petsc-3.16.6/src/ts/interface/ts.c:3583 -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Tue May 9 22:16:13 2023 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 9 May 2023 23:16:13 -0400 Subject: [petsc-users] Unconditional jump or move depends on uninitialised value(s) In-Reply-To: References: <295E7E1A-1649-435F-AE65-F061F287513F@petsc.dev> <3F1BD989-8516-4649-A385-5F94FD1A9470@petsc.dev> <0B7BA32F-03CE-44F2-A9A3-4584B2D7AB94@anl.gov> <9523EDF9-7C02-4872-9E0E-1DFCBCB28066@anl.gov> <88895B56-2BEF-49BF-B6E9-F75186E712D9@anl.gov> Message-ID: <1E04F37F-2286-404D-AB87-9FDD5E878DC4@petsc.dev> It would have been best to send formfunctions.cpp: since that is where the problem is. Likely you have a local variable (hence the message "stack-allocation") in that function that is not initialized but that you use to fill up the function array values. Valgrind does not detect all uses of unitialized memory, it only detects them when they would change a flow direction in the code, like in an if () test. This is why the problem pops up in the VecNorm() and not if you just call your function from main(). 
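A minimal sketch of that failure mode (not taken from the poster's formfunctions.cpp; the names FormFunction, AppCtx, flux, and coef below are hypothetical): a local scalar on the stack is left unassigned on one branch and then copied into the residual vector.

#include <petscsnes.h>

typedef struct {
  PetscReal coef;
} AppCtx;

/* Residual callback for SNESSetFunction(); illustration only. */
static PetscErrorCode FormFunction(SNES snes, Vec X, Vec F, void *ctx)
{
  AppCtx            *user = (AppCtx *)ctx;
  const PetscScalar *x;
  PetscScalar       *f;
  PetscScalar        flux;            /* BUG: never assigned when i == 0 */
  PetscInt           i, n;

  PetscFunctionBeginUser;
  PetscCall(VecGetLocalSize(X, &n));
  PetscCall(VecGetArrayRead(X, &x));
  PetscCall(VecGetArray(F, &f));
  for (i = 0; i < n; i++) {
    if (i > 0) flux = user->coef * (x[i] - x[i - 1]);
    f[i] = x[i] + flux;               /* uninitialized stack value enters F at i == 0 */
  }
  PetscCall(VecRestoreArray(F, &f));
  PetscCall(VecRestoreArrayRead(X, &x));
  PetscFunctionReturn(PETSC_SUCCESS);
}

Valgrind stays quiet while the garbage is only stored into F; it first complains when the value steers a branch, which here is the sqrt() inside VecNorm() during SNESSolve(), so the report points at bvec2.c rather than at the offending assignment. Initializing such locals at their declaration (e.g. flux = 0.0) makes the report disappear and usually exposes the real logic error.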
There are compiler options to detect the use of used declared variables that are not initialized which might help you find the problem at compile time. Barry > On May 9, 2023, at 10:59 PM, Tu, Jiannan wrote: > > I am using PETSC SNES to solve a nonlinear equation system resulted from discretization of partial differential equations. When I use Valgrind to check my program, there are lots of errors of ?Unconditional jump or move depends on uninitialised value(s)? produced. The errors occur in the function routine set by SNESSetFunction(snes, NULL, formfunctions, ¶ms). Even using only one MPI process such errors occur. It seems solution vector X is not initialized through SNESSolve(). But if the formfunctions() is called directly from main(), there are no errors. I really don?t understand why. Could you please help me identify what is going wrong? > > Thank you very much, > > Jiannan > > ------------------------------------------------- > The error message is like the following example > > Conditional jump or move depends on uninitialised value(s) > ==866758== at 0xA19178C: sqrt (w_sqrt_compat.c:31) > ==866758== by 0x4EA9E4C: VecNorm_Seq (bvec2.c:227) > ==866758== by 0x4F705C8: VecNorm (rvector.c:228) > ==866758== by 0x65D5B16: SNESSolve_NEWTONLS (ls.c:179) > ==866758== by 0x673093F: SNESSolve (snes.c:4809) > ==866758== by 0x12F607: main (iditm3d.cpp:138) > ==866758== Uninitialised value was created by a stack allocation > ==866758== at 0x11AC36: functions(Field***, Field***, Field***, int, int, int, int, int, int, AppCtx*, Field***) (formfunctions.cpp:16) > > functions() is called within the formfunctions(). The code snippet using SNES solver is > > int formfunctions(SNES, Vec, Vec, void *ctx); > int jacobian(SNES, Vec, Mat, Mat, void *ctx); > > DMDACreate3d(MPI_COMM_WORLD,DM_BOUNDARY_GHOSTED,DM_BOUNDARY_GHOSTED,DM_BOUNDARY_PERIODIC, > DMDA_STENCIL_STAR,a1,a2,a3,PETSC_DECIDE,PETSC_DECIDE,1,a4,2,NULL,NULL,NULL,&da); > > DMSetFromOptions(da); > DMSetUp(da); > > DMCreateGlobalVector(da, &X); > VecDuplicate(X, ¶ms.U); > VecDuplicate(X, ¶ms.Xn); > > /* set up grids and related geometric parameters. Set up initial solution vector X */ > if (initialize(da, X, ¶ms) < 0) exit(-1); > > SNES snes; > SNESCreate(MPI_COMM_WORLD, &snes); > SNESSetType(snes, SNESNEWTONLS); > SNESSetFromOptions(snes); > SNESSetDM(snes, da); > SNESSetFunction(snes, NULL, formfunctions, ¶ms); > > KSP ksp; > SNESGetKSP(snes, &ksp); > KSPSetType(ksp, KSPFGMRES); > KSPSetFromOptions(ksp); > > PC pc; > KSPGetPC(ksp, &pc); > PCSetType(pc, PCJACOBI); > PCSetFromOptions(pc); > > Mat A; > DMSetMatrixPreallocateOnly(da, PETSC_FALSE); > DMSetMatType(da, MATMPIAIJ); > DMDASetBlockFills(da, dfill, ofill); > DMCreateMatrix(da, &A); > > SNESSetJacobian(snes, A, A, jacobian, ¶ms); > > SNESSetSolution(snes, X); //set initial guess of the solution > SNESSolve(snes, PETSC_NULL, X); //iterative solver to find the solution > > From: Zhang, Hong > Sent: Tuesday, February 21, 2023 11:21 PM > To: Tu, Jiannan > Cc: Barry Smith ; Hong Zhang ; Constantinescu, Emil M. ; petsc-users > Subject: Re: [petsc-users] TS failed due to diverged_step_rejected > > CAUTION: This email was sent from outside the UMass Lowell network. > > > > > On Feb 21, 2023, at 8:54 PM, Tu, Jiannan > wrote: > > CN or BEular doesn?t work. They produce negative densities at the lower boundary even RHS functions are positive. So for TS, all equations must include udot? > > You have algebraic constraints only for the boundary points. 
For all the other points, you must have udot in IFunction. I recommend you to take a look at the example src/ts/tutorials/ex25.c > > Hong (Mr.) > > > > Thank you, > Jiannan > > From: Zhang, Hong > Sent: Monday, February 20, 2023 11:07 AM > To: Tu, Jiannan > Cc: Barry Smith ; Hong Zhang ; Constantinescu, Emil M. ; petsc-users > Subject: Re: [petsc-users] TS failed due to diverged_step_rejected > > CAUTION: This email was sent from outside the UMass Lowell network. > > If you have to include the boundary points, I would suggest starting from a fully implicit solver such as CN or BEuler with a finite-difference approximated Jacobian. When this works for a small scale setting, you can build up more functionalities such as IMEX and analytical Jacobians and extend the problem to a larger scale. But the udot issue needs to be fixed in the first place. > > Hong (Mr.) > > > > On Feb 19, 2023, at 9:23 PM, Tu, Jiannan > wrote: > > It is the second order derivative of, say electron temperature = 0 at the boundary. > > I am not sure how I can exclude the boundary points because the values of unknowns must be specified at the boundary. Are there any other solvers, e.g., CN, good to solve the equation system? > > Thank you, > Jiannan > > > From: Zhang, Hong > Sent: Sunday, February 19, 2023 4:48 PM > To: Tu, Jiannan > Cc: Barry Smith ; Hong Zhang ; Constantinescu, Emil M. ; petsc-users > Subject: Re: [petsc-users] TS failed due to diverged_step_rejected > > CAUTION: This email was sent from outside the UMass Lowell network. > > It is fine to drop udot for the boundary points, but you need to keep udot for all the other points. > > In addition, which boundary condition do you use in IFunction? The way you are treating the boundary points actually leads to a system of differential-algebraic equations, which could be difficult to solve with the ARKIMEX solver. Can you try to exclude the boundary points from the computational domain so that you will have just a system of ODEs? > > Hong (Mr.) > > > > > On Feb 18, 2023, at 4:28 PM, Tu, Jiannan > wrote: > > Thanks for the instruction. This is the boundary condition and there is no udot in the equation. I think this is the way to define IFunction at the boundary. Maybe I?m wrong? Or is there some way to introduce udot into the specification of the equation at the boundary from the aspect of the implementation for TS? > > Thank you, > Jiannan > > From: Zhang, Hong > Sent: Saturday, February 18, 2023 12:40 PM > To: Tu, Jiannan > Cc: Barry Smith ; Hong Zhang ; Constantinescu, Emil M. ; petsc-users > Subject: Re: [petsc-users] TS failed due to diverged_step_rejected > > You don't often get email from hongzhang at anl.gov . Learn why this is important > CAUTION: This email was sent from outside the UMass Lowell network. > > > > > > > > On Feb 18, 2023, at 8:44 AM, Tu, Jiannan > wrote: > > The RHS function at the bottom boundary is determined by the boundary condition, which is the second order derivative = 0, i.e. G(u) = 2*X[i=1] ? X[i=2]. Then in IFunction, F(u, udot) = X[i=0]. > > This might be the problem. Your F(u, udot) is missing udot according to your description. Take a simple ODE udot = f(u) + g(u) for example. One way to partition this ODE is to define F = udot - f(u) as the IFunction and G = g(u) as the RHSFunction. > > Hong (Mr.) > > > > > > > Thank you, > Jiannan > > > From: Zhang, Hong > Sent: Friday, February 17, 2023 11:54 PM > To: Tu, Jiannan > Cc: Barry Smith ; Hong Zhang ; Constantinescu, Emil M. 
; petsc-users > Subject: Re: [petsc-users] TS failed due to diverged_step_rejected > > You don't often get email from hongzhang at anl.gov . Learn why this is important > CAUTION: This email was sent from outside the UMass Lowell network. > > > > > > > > > On Feb 17, 2023, at 6:19 PM, Tu, Jiannan > wrote: > > I need to find out what causes negative temperature first. Following is the message with adaptivity turned off. The G(u) gives right-hand equation for electron temperature at bottom boundary. The F(u, u?) function is F(u, u?) = X = G(u) and the jacobian element is d F(u, u?) / dX =1. > > This looks strange. Can you elaborate a bit on your partitioned ODE? For example, how are your F(u,udot) (IFunction) and G(u) (RHSFunction) defined? > > A good IMEX example can be found at ts/tutorial/advection-diffusion-reaction/ex5.c (and reaction_diffusion.c). > > Hong (Mr.) > > > > > > > The solution from TSStep is checked for positivity of densities and temperatures. > > From the message below, it is seen that G(u) > 0 (I added output of right-hand equation for electron temperature). The solution for electron temperature X should be X * jacobian element = G(u) > 0 since jacobian element = 1. I don?t understand why it becomes negative. Is my understanding of TS formula incorrect? > > Thank you, > Jiannan > > ---------------------------------- > G(u) = 1.86534e-07 > 0 SNES Function norm 2.274473072183e+03 > 1 SNES Function norm 8.641749325070e-04 > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 > G(u) = 1.86534e-07 > 0 SNES Function norm 8.716501970511e-02 > 1 SNES Function norm 2.213263548813e-04 > 2 SNES Function norm 2.779985176426e-08 > Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 2 > G(u) = 1.86534e-07 > 0 SNES Function norm 3.177195995186e-01 > 1 SNES Function norm 3.607702491344e-04 > 2 SNES Function norm 4.345809629121e-08 > Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 2 > G(u) = 1.86534e-07 > TSAdapt none arkimex 0:3 step 0 accepted t=42960 + 2.189e-02 dt=2.189e-02 > electron temperature = -3.6757e-15 at (i, j, k) = (0, 1, 0) > > > > From: Barry Smith > Sent: Friday, February 17, 2023 3:45 PM > To: Tu, Jiannan ; Hong Zhang ; Emil Constantinescu > Cc: petsc-users > Subject: Re: [petsc-users] TS failed due to diverged_step_rejected > > CAUTION: This email was sent from outside the UMass Lowell network. > > > > > > > > > > On Feb 17, 2023, at 3:32 PM, Tu, Jiannan > wrote: > > The ts_type arkimex is used. There is right hand-side function RHSFunction set by TSSetRHSFunction() and also stiff function set by TSSetIFunction(). > > With adaptivity shut off, TS can finish its first time step after the 3rd ?Nonlinear solve converged due to ??. The solution gives negative electron and neutral temperatures at the bottom boundary. I need to fix the negative temperatures and see how the code works. > > BTW, what is this ts_adapt? Is it by default on? > > It is default for some of the TSTypes (in particular, the better ones). It adapts the timestep to ensure some local error estimate is below a certain tolerance. As Matt notes normally as it tries smaller and smaller time steps the local error estimate would get smaller and smaller; this is not happening here, hence the error. > > Have you tried with the argument -ts_arkimex_fully_implicit ? > > I am not an expert but my guess is something is "odd" about your functions, either the RHSFunction or the Function or both. Do you have a hierarchy of models for your problem? 
Could you try runs with fewer terms in your functions, that may be producing the difficulties? If you can determine what triggers the problem with the local error estimators, that might help the experts in ODE solution (not me) determine what could be going wrong. > > Barry > > > > > > > > > > > Thank you, > Jiannan > > From: Matthew Knepley > Sent: Friday, February 17, 2023 3:15 PM > To: Tu, Jiannan > Cc: Barry Smith ; petsc-users > Subject: Re: [petsc-users] TS failed due to diverged_step_rejected > > CAUTION: This email was sent from outside the UMass Lowell network. > > I am not sure what TS you are using, but the estimate of the local truncation error is 91.4, and does not seem > to change when you make the step smaller, so something is off. You can shut off the adaptivity using > > -ts_adapt_type none > > Thanks, > > Matt > > On Fri, Feb 17, 2023 at 3:01 PM Tu, Jiannan > wrote: > These are what I got with the options you suggested. > > Thank you, > Jiannan > > ------------------------------------------------------------------------------- > 0 SNES Function norm 2.274473072186e+03 > 1 SNES Function norm 1.673091274668e-03 > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 > 0 SNES Function norm 8.715428433630e-02 > 1 SNES Function norm 4.995727626692e-04 > 2 SNES Function norm 5.498018152230e-08 > Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 2 > 0 SNES Function norm 3.237461568254e-01 > 1 SNES Function norm 7.988531005091e-04 > 2 SNES Function norm 1.280948196292e-07 > Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 2 > TSAdapt basic arkimex 0:3 step 0 rejected t=42960 + 2.189e-02 dt=4.374e-03 wlte= 91.4 wltea= -1 wlter= -1 > 0 SNES Function norm 2.274473072186e+03 > 1 SNES Function norm 4.881903203545e-04 > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 > 0 SNES Function norm 7.562592690785e-02 > 1 SNES Function norm 1.143078818923e-04 > 2 SNES Function norm 9.834547907735e-09 > Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 2 > 0 SNES Function norm 2.683968949758e-01 > 1 SNES Function norm 1.838028436639e-04 > 2 SNES Function norm 9.470813523140e-09 > Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 2 > TSAdapt basic arkimex 0:3 step 0 rejected t=42960 + 4.374e-03 dt=4.374e-04 wlte= 91.4 wltea= -1 wlter= -1 > 0 SNES Function norm 2.274473072186e+03 > 1 SNES Function norm 1.821562431175e-04 > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 > 0 SNES Function norm 1.005443458812e-01 > 1 SNES Function norm 3.633336946661e-05 > Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 > 0 SNES Function norm 1.515368382715e-01 > 1 SNES Function norm 3.389298316830e-05 > Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 > TSAdapt basic arkimex 0:3 step 0 rejected t=42960 + 4.374e-04 dt=4.374e-05 wlte= 91.4 wltea= -1 wlter= -1 > 0 SNES Function norm 2.274473072186e+03 > 1 SNES Function norm 4.541003359206e-05 > Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 > 0 SNES Function norm 1.713800906043e-01 > 1 SNES Function norm 1.179958172167e-05 > Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 > 0 SNES Function norm 2.020265094117e-01 > 1 SNES Function norm 1.513971290464e-05 > Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 > TSAdapt basic arkimex 0:3 step 0 rejected t=42960 + 4.374e-05 dt=4.374e-06 wlte= 91.4 wltea= -1 wlter= -1 > 0 SNES Function norm 2.274473072186e+03 > 1 SNES 
Function norm 6.090269704320e-06 > Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 > 0 SNES Function norm 2.136603895703e-01 > 1 SNES Function norm 1.877474016012e-06 > Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 > 0 SNES Function norm 3.127812462507e-01 > 1 SNES Function norm 2.713146825704e-06 > Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 > TSAdapt basic arkimex 0:3 step 0 rejected t=42960 + 4.374e-06 dt=4.374e-07 wlte= 91.4 wltea= -1 wlter= -1 > 0 SNES Function norm 2.274473072186e+03 > 1 SNES Function norm 2.793512213059e-06 > Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 > 0 SNES Function norm 2.205196267430e-01 > 1 SNES Function norm 2.572653773308e-06 > Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 > 0 SNES Function norm 3.260057361977e-01 > 1 SNES Function norm 2.705816087598e-06 > Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 > TSAdapt basic arkimex 0:3 step 0 rejected t=42960 + 4.374e-07 dt=4.374e-08 wlte= 91.4 wltea= -1 wlter= -1 > 0 SNES Function norm 2.274473072186e+03 > 1 SNES Function norm 2.764855860446e-05 > Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 > 0 SNES Function norm 2.212505522844e-01 > 1 SNES Function norm 2.958996472386e-05 > Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 > 0 SNES Function norm 3.273222034162e-01 > 1 SNES Function norm 2.994512887620e-05 > Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 > TSAdapt basic arkimex 0:3 step 0 rejected t=42960 + 4.374e-08 dt=4.374e-09 wlte= 91.4 wltea= -1 wlter= -1 > 0 SNES Function norm 2.274473072186e+03 > 1 SNES Function norm 3.317240589134e-04 > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 > 0 SNES Function norm 2.213246532918e-01 > 1 SNES Function norm 2.799468604767e-04 > Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 > 0 SNES Function norm 3.274570888397e-01 > 1 SNES Function norm 3.066048050994e-04 > Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 > TSAdapt basic arkimex 0:3 step 0 rejected t=42960 + 4.374e-09 dt=4.374e-10 wlte= 91.4 wltea= -1 wlter= -1 > 0 SNES Function norm 2.274473072189e+03 > 1 SNES Function norm 2.653507278572e-03 > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 > 0 SNES Function norm 2.213869585841e-01 > 1 SNES Function norm 2.177156902895e-03 > Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 > 0 SNES Function norm 3.275136370365e-01 > 1 SNES Function norm 1.962849131557e-03 > Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 > TSAdapt basic arkimex 0:3 step 0 rejected t=42960 + 4.374e-10 dt=4.374e-11 wlte= 91.4 wltea= -1 wlter= -1 > 0 SNES Function norm 2.274473072218e+03 > 1 SNES Function norm 5.664907315679e-03 > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 > 0 SNES Function norm 2.223208399368e-01 > 1 SNES Function norm 5.688863091415e-03 > Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 > 0 SNES Function norm 3.287121218919e-01 > 1 SNES Function norm 4.085338521320e-03 > Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 > TSAdapt basic arkimex 0:3 step 0 rejected t=42960 + 4.374e-11 dt=4.374e-12 wlte= 91.4 wltea= -1 wlter= -1 > 0 SNES Function norm 2.274473071968e+03 > 1 SNES Function norm 4.694691905235e-04 > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 > 0 SNES Function 
norm 2.211786508657e-01 > 1 SNES Function norm 1.503497433939e-04 > Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 > 0 SNES Function norm 3.272667798977e-01 > 1 SNES Function norm 2.176132327279e-04 > Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 > TSAdapt basic arkimex 0:3 step 0 rejected t=42960 + 4.374e-12 dt=4.374e-13 wlte= 91.4 wltea= -1 wlter= -1 > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: > [0]PETSC ERROR: TSStep has failed due to DIVERGED_STEP_REJECTED > [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.16.6, Mar 30, 2022 > [0]PETSC ERROR: ./iditm3d on a named office by jtu Fri Feb 17 14:54:22 2023 > [0]PETSC ERROR: Configure options --prefix=/usr/local --with-mpi-dir=/usr/local --with-fc=0 --with-openmp --with-hdf5-dir=/usr/local --download-f2cblaslapack=1 > [0]PETSC ERROR: #1 TSStep() at /home/jtu/Downloads/petsc-3.16.6/src/ts/interface/ts.c:3583 > > > > From: Barry Smith > Sent: Friday, February 17, 2023 12:58 PM > To: Tu, Jiannan > Cc: petsc-users > Subject: Re: [petsc-users] TS failed due to diverged_step_rejected > > CAUTION: This email was sent from outside the UMass Lowell network. > > > Can you please run with also the options -ts_monitor -ts_adapt_monitor ? > > The output is confusing because it prints that the Nonlinear solve has converged but then TSStep has failed due to DIVERGED_STEP_REJECTED which seems contradictory > > > > On Feb 17, 2023, at 12:09 PM, Tu, Jiannan > wrote: > > My code uses TS to solve a set of multi-fluid MHD equations. The jacobian is provided with function F(t, u, u'). Both linear and nonlinear solvers converge but snes repeats itself until gets "TSStep has failed due to diverged_step_rejected." > > Is it because I used TSStep rather than TSSolve? I have checked the condition number. The condition number with pc_type asm is about 1 (without precondition it is about 4x10^4). The maximum ratio of off-diagonal jacobian element over diagonal element is about 21. > > Could you help me to identify what is going wrong? > > Thank you very much! > > Jiannan > > --------------------------------------------------------------------------------------------------- > Run command with options > > mpiexec -n $1 ./iditm3d -ts_type arkimex -snes_tyep ngmres -ksp_type gmres -pc_type asm \ > -ts_rtol 1.0e-4 -ts_atol 1.0e-4 -snes_monitor -snes_rtol 1.0e-4 -snes_atol 1.0e-4 \ > -snes_converged_reason > > The output message is > > Start time advancing ... 
> 0 SNES Function norm 2.274473072186e+03 > 1 SNES Function norm 1.673091274668e-03 > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 > 0 SNES Function norm 8.715428433630e-02 > 1 SNES Function norm 4.995727626692e-04 > 2 SNES Function norm 5.498018152230e-08 > Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 2 > 0 SNES Function norm 3.237461568254e-01 > 1 SNES Function norm 7.988531005091e-04 > 2 SNES Function norm 1.280948196292e-07 > Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 2 > 0 SNES Function norm 2.274473072186e+03 > 1 SNES Function norm 4.881903203545e-04 > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 > 0 SNES Function norm 7.562592690785e-02 > 1 SNES Function norm 1.143078818923e-04 > 2 SNES Function norm 9.834547907735e-09 > Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 2 > 0 SNES Function norm 2.683968949758e-01 > 1 SNES Function norm 1.838028436639e-04 > 2 SNES Function norm 9.470813523140e-09 > Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 2 > 0 SNES Function norm 2.274473072186e+03 > 1 SNES Function norm 1.821562431175e-04 > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 > 0 SNES Function norm 1.005443458812e-01 > 1 SNES Function norm 3.633336946661e-05 > Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 > 0 SNES Function norm 1.515368382715e-01 > 1 SNES Function norm 3.389298316830e-05 > Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 > 0 SNES Function norm 2.274473072186e+03 > 1 SNES Function norm 4.541003359206e-05 > Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 > 0 SNES Function norm 1.713800906043e-01 > 1 SNES Function norm 1.179958172167e-05 > Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 > 0 SNES Function norm 2.020265094117e-01 > 1 SNES Function norm 1.513971290464e-05 > Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 > 0 SNES Function norm 2.274473072186e+03 > 1 SNES Function norm 6.090269704320e-06 > Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 > 0 SNES Function norm 2.136603895703e-01 > 1 SNES Function norm 1.877474016012e-06 > Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 > 0 SNES Function norm 3.127812462507e-01 > 1 SNES Function norm 2.713146825704e-06 > Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 > 0 SNES Function norm 2.274473072186e+03 > 1 SNES Function norm 2.793512213059e-06 > Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 > 0 SNES Function norm 2.205196267430e-01 > 1 SNES Function norm 2.572653773308e-06 > Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 > 0 SNES Function norm 3.260057361977e-01 > 1 SNES Function norm 2.705816087598e-06 > Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 > 0 SNES Function norm 2.274473072186e+03 > 1 SNES Function norm 2.764855860446e-05 > Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 > 0 SNES Function norm 2.212505522844e-01 > 1 SNES Function norm 2.958996472386e-05 > Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 > 0 SNES Function norm 3.273222034162e-01 > 1 SNES Function norm 2.994512887620e-05 > Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 > 0 SNES Function norm 2.274473072186e+03 > 1 SNES Function norm 3.317240589134e-04 > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 > 0 SNES Function norm 
2.213246532918e-01 > 1 SNES Function norm 2.799468604767e-04 > Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 > 0 SNES Function norm 3.274570888397e-01 > 1 SNES Function norm 3.066048050994e-04 > Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 > 0 SNES Function norm 2.274473072189e+03 > 1 SNES Function norm 2.653507278572e-03 > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 > 0 SNES Function norm 2.213869585841e-01 > 1 SNES Function norm 2.177156902895e-03 > Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 > 0 SNES Function norm 3.275136370365e-01 > 1 SNES Function norm 1.962849131557e-03 > Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 > 0 SNES Function norm 2.274473072218e+03 > 1 SNES Function norm 5.664907315679e-03 > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 > 0 SNES Function norm 2.223208399368e-01 > 1 SNES Function norm 5.688863091415e-03 > Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 > 0 SNES Function norm 3.287121218919e-01 > 1 SNES Function norm 4.085338521320e-03 > Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 > 0 SNES Function norm 2.274473071968e+03 > 1 SNES Function norm 4.694691905235e-04 > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 > 0 SNES Function norm 2.211786508657e-01 > 1 SNES Function norm 1.503497433939e-04 > Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 > 0 SNES Function norm 3.272667798977e-01 > 1 SNES Function norm 2.176132327279e-04 > Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: > [0]PETSC ERROR: TSStep has failed due to DIVERGED_STEP_REJECTED > [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.16.6, Mar 30, 2022 > [0]PETSC ERROR: ./iditm3d on a named office by jtu Fri Feb 17 11:59:43 2023 > [0]PETSC ERROR: Configure options --prefix=/usr/local --with-mpi-dir=/usr/local --with-fc=0 --with-openmp --with-hdf5-dir=/usr/local --download-f2cblaslapack=1 > [0]PETSC ERROR: #1 TSStep() at /home/jtu/Downloads/petsc-3.16.6/src/ts/interface/ts.c:3583 > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From Jiannan_Tu at uml.edu Wed May 10 12:38:22 2023 From: Jiannan_Tu at uml.edu (Tu, Jiannan) Date: Wed, 10 May 2023 17:38:22 +0000 Subject: [petsc-users] Unconditional jump or move depends on uninitialised value(s) In-Reply-To: <1E04F37F-2286-404D-AB87-9FDD5E878DC4@petsc.dev> References: <295E7E1A-1649-435F-AE65-F061F287513F@petsc.dev> <3F1BD989-8516-4649-A385-5F94FD1A9470@petsc.dev> <0B7BA32F-03CE-44F2-A9A3-4584B2D7AB94@anl.gov> <9523EDF9-7C02-4872-9E0E-1DFCBCB28066@anl.gov> <88895B56-2BEF-49BF-B6E9-F75186E712D9@anl.gov> <1E04F37F-2286-404D-AB87-9FDD5E878DC4@petsc.dev> Message-ID: Hi Barry, Thank you for your advice. I didn't send the function because it is a little bit long. The function is attached. 
Your reply reminds me I should need also to check all the other local variables besides vector X (which the error messages give me the impression the array from the vector X is the cause). I'll see what gcc complier options I should use to detect the use of declared variables that are not initialized. Jiannan ________________________________ From: Barry Smith Sent: Tuesday, May 9, 2023 11:16 PM To: Tu, Jiannan Cc: petsc-users at mcs.anl.gov ; Zhang, Hong Subject: Re: [petsc-users]Unconditional jump or move depends on uninitialised value(s) CAUTION: This email was sent from outside the UMass Lowell network. It would have been best to send formfunctions.cpp: since that is where the problem is. Likely you have a local variable (hence the message "stack-allocation") in that function that is not initialized but that you use to fill up the function array values. Valgrind does not detect all uses of unitialized memory, it only detects them when they would change a flow direction in the code, like in an if () test. This is why the problem pops up in the VecNorm() and not if you just call your function from main(). There are compiler options to detect the use of used declared variables that are not initialized which might help you find the problem at compile time. Barry On May 9, 2023, at 10:59 PM, Tu, Jiannan wrote: I am using PETSC SNES to solve a nonlinear equation system resulted from discretization of partial differential equations. When I use Valgrind to check my program, there are lots of errors of ?Unconditional jump or move depends on uninitialised value(s)? produced. The errors occur in the function routine set by SNESSetFunction(snes, NULL, formfunctions, ¶ms). Even using only one MPI process such errors occur. It seems solution vector X is not initialized through SNESSolve(). But if the formfunctions() is called directly from main(), there are no errors. I really don?t understand why. Could you please help me identify what is going wrong? Thank you very much, Jiannan ------------------------------------------------- The error message is like the following example Conditional jump or move depends on uninitialised value(s) ==866758== at 0xA19178C: sqrt (w_sqrt_compat.c:31) ==866758== by 0x4EA9E4C: VecNorm_Seq (bvec2.c:227) ==866758== by 0x4F705C8: VecNorm (rvector.c:228) ==866758== by 0x65D5B16: SNESSolve_NEWTONLS (ls.c:179) ==866758== by 0x673093F: SNESSolve (snes.c:4809) ==866758== by 0x12F607: main (iditm3d.cpp:138) ==866758== Uninitialised value was created by a stack allocation ==866758== at 0x11AC36: functions(Field***, Field***, Field***, int, int, int, int, int, int, AppCtx*, Field***) (formfunctions.cpp:16) functions() is called within the formfunctions(). The code snippet using SNES solver is int formfunctions(SNES, Vec, Vec, void *ctx); int jacobian(SNES, Vec, Mat, Mat, void *ctx); DMDACreate3d(MPI_COMM_WORLD,DM_BOUNDARY_GHOSTED,DM_BOUNDARY_GHOSTED,DM_BOUNDARY_PERIODIC, DMDA_STENCIL_STAR,a1,a2,a3,PETSC_DECIDE,PETSC_DECIDE,1,a4,2,NULL,NULL,NULL,&da); DMSetFromOptions(da); DMSetUp(da); DMCreateGlobalVector(da, &X); VecDuplicate(X, ¶ms.U); VecDuplicate(X, ¶ms.Xn); /* set up grids and related geometric parameters. 
Set up initial solution vector X */ if (initialize(da, X, ¶ms) < 0) exit(-1); SNES snes; SNESCreate(MPI_COMM_WORLD, &snes); SNESSetType(snes, SNESNEWTONLS); SNESSetFromOptions(snes); SNESSetDM(snes, da); SNESSetFunction(snes, NULL, formfunctions, ¶ms); KSP ksp; SNESGetKSP(snes, &ksp); KSPSetType(ksp, KSPFGMRES); KSPSetFromOptions(ksp); PC pc; KSPGetPC(ksp, &pc); PCSetType(pc, PCJACOBI); PCSetFromOptions(pc); Mat A; DMSetMatrixPreallocateOnly(da, PETSC_FALSE); DMSetMatType(da, MATMPIAIJ); DMDASetBlockFills(da, dfill, ofill); DMCreateMatrix(da, &A); SNESSetJacobian(snes, A, A, jacobian, ¶ms); SNESSetSolution(snes, X); //set initial guess of the solution SNESSolve(snes, PETSC_NULL, X); //iterative solver to find the solution From: Zhang, Hong Sent: Tuesday, February 21, 2023 11:21 PM To: Tu, Jiannan Cc: Barry Smith; Hong Zhang; Constantinescu, Emil M.; petsc-users Subject: Re: [petsc-users] TS failed due to diverged_step_rejected CAUTION: This email was sent from outside the UMass Lowell network. On Feb 21, 2023, at 8:54 PM, Tu, Jiannan > wrote: CN or BEular doesn?t work. They produce negative densities at the lower boundary even RHS functions are positive. So for TS, all equations must include udot? You have algebraic constraints only for the boundary points. For all the other points, you must have udot in IFunction. I recommend you to take a look at the example src/ts/tutorials/ex25.c Hong (Mr.) Thank you, Jiannan From: Zhang, Hong Sent: Monday, February 20, 2023 11:07 AM To: Tu, Jiannan Cc: Barry Smith; Hong Zhang; Constantinescu, Emil M.; petsc-users Subject: Re: [petsc-users] TS failed due to diverged_step_rejected CAUTION: This email was sent from outside the UMass Lowell network. If you have to include the boundary points, I would suggest starting from a fully implicit solver such as CN or BEuler with a finite-difference approximated Jacobian. When this works for a small scale setting, you can build up more functionalities such as IMEX and analytical Jacobians and extend the problem to a larger scale. But the udot issue needs to be fixed in the first place. Hong (Mr.) On Feb 19, 2023, at 9:23 PM, Tu, Jiannan > wrote: It is the second order derivative of, say electron temperature = 0 at the boundary. I am not sure how I can exclude the boundary points because the values of unknowns must be specified at the boundary. Are there any other solvers, e.g., CN, good to solve the equation system? Thank you, Jiannan From: Zhang, Hong Sent: Sunday, February 19, 2023 4:48 PM To: Tu, Jiannan Cc: Barry Smith; Hong Zhang; Constantinescu, Emil M.; petsc-users Subject: Re: [petsc-users] TS failed due to diverged_step_rejected CAUTION: This email was sent from outside the UMass Lowell network. It is fine to drop udot for the boundary points, but you need to keep udot for all the other points. In addition, which boundary condition do you use in IFunction? The way you are treating the boundary points actually leads to a system of differential-algebraic equations, which could be difficult to solve with the ARKIMEX solver. Can you try to exclude the boundary points from the computational domain so that you will have just a system of ODEs? Hong (Mr.) On Feb 18, 2023, at 4:28 PM, Tu, Jiannan > wrote: Thanks for the instruction. This is the boundary condition and there is no udot in the equation. I think this is the way to define IFunction at the boundary. Maybe I?m wrong? Or is there some way to introduce udot into the specification of the equation at the boundary from the aspect of the implementation for TS? 
Thank you, Jiannan From: Zhang, Hong Sent: Saturday, February 18, 2023 12:40 PM To: Tu, Jiannan Cc: Barry Smith; Hong Zhang; Constantinescu, Emil M.; petsc-users Subject: Re: [petsc-users] TS failed due to diverged_step_rejected You don't often get email from hongzhang at anl.gov. Learn why this is important CAUTION: This email was sent from outside the UMass Lowell network. On Feb 18, 2023, at 8:44 AM, Tu, Jiannan > wrote: The RHS function at the bottom boundary is determined by the boundary condition, which is the second order derivative = 0, i.e. G(u) = 2*X[i=1] ? X[i=2]. Then in IFunction, F(u, udot) = X[i=0]. This might be the problem. Your F(u, udot) is missing udot according to your description. Take a simple ODE udot = f(u) + g(u) for example. One way to partition this ODE is to define F = udot - f(u) as the IFunction and G = g(u) as the RHSFunction. Hong (Mr.) Thank you, Jiannan From: Zhang, Hong Sent: Friday, February 17, 2023 11:54 PM To: Tu, Jiannan Cc: Barry Smith; Hong Zhang; Constantinescu, Emil M.; petsc-users Subject: Re: [petsc-users] TS failed due to diverged_step_rejected You don't often get email from hongzhang at anl.gov. Learn why this is important CAUTION: This email was sent from outside the UMass Lowell network. On Feb 17, 2023, at 6:19 PM, Tu, Jiannan > wrote: I need to find out what causes negative temperature first. Following is the message with adaptivity turned off. The G(u) gives right-hand equation for electron temperature at bottom boundary. The F(u, u?) function is F(u, u?) = X = G(u) and the jacobian element is d F(u, u?) / dX =1. This looks strange. Can you elaborate a bit on your partitioned ODE? For example, how are your F(u,udot) (IFunction) and G(u) (RHSFunction) defined? A good IMEX example can be found at ts/tutorial/advection-diffusion-reaction/ex5.c (and reaction_diffusion.c). Hong (Mr.) The solution from TSStep is checked for positivity of densities and temperatures. >From the message below, it is seen that G(u) > 0 (I added output of right-hand equation for electron temperature). The solution for electron temperature X should be X * jacobian element = G(u) > 0 since jacobian element = 1. I don?t understand why it becomes negative. Is my understanding of TS formula incorrect? Thank you, Jiannan ---------------------------------- G(u) = 1.86534e-07 0 SNES Function norm 2.274473072183e+03 1 SNES Function norm 8.641749325070e-04 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 G(u) = 1.86534e-07 0 SNES Function norm 8.716501970511e-02 1 SNES Function norm 2.213263548813e-04 2 SNES Function norm 2.779985176426e-08 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 2 G(u) = 1.86534e-07 0 SNES Function norm 3.177195995186e-01 1 SNES Function norm 3.607702491344e-04 2 SNES Function norm 4.345809629121e-08 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 2 G(u) = 1.86534e-07 TSAdapt none arkimex 0:3 step 0 accepted t=42960 + 2.189e-02 dt=2.189e-02 electron temperature = -3.6757e-15 at (i, j, k) = (0, 1, 0) From: Barry Smith Sent: Friday, February 17, 2023 3:45 PM To: Tu, Jiannan; Hong Zhang; Emil Constantinescu Cc: petsc-users Subject: Re: [petsc-users] TS failed due to diverged_step_rejected CAUTION: This email was sent from outside the UMass Lowell network. On Feb 17, 2023, at 3:32 PM, Tu, Jiannan > wrote: The ts_type arkimex is used. There is right hand-side function RHSFunction set by TSSetRHSFunction() and also stiff function set by TSSetIFunction(). 
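[A minimal sketch of the partition Hong recommends, for a single field on a 1-D DMDA: the boundary row carries only the algebraic residual (second derivative = 0, with no udot term), every other row keeps udot in the IFunction, and the RHSFunction then supplies only the non-stiff part (zero in the boundary row). The stiff term below is a stand-in, not code from the original program; the error-checking style assumes a recent PETSc (with 3.16 the CHKERRQ form applies), and the stencil width is assumed wide enough that u[1] and u[2] are visible on the rank owning i = 0. See src/ts/tutorials/ex25.c for a complete IMEX example.]

#include <petscts.h>
#include <petscdmda.h>

static PetscErrorCode IFunction(TS ts, PetscReal t, Vec U, Vec Udot, Vec F, void *ctx)
{
  DM                 da;
  DMDALocalInfo      info;
  Vec                Uloc;
  const PetscScalar *u, *udot;
  PetscScalar       *f;
  PetscInt           i;

  PetscFunctionBeginUser;
  PetscCall(TSGetDM(ts, &da));
  PetscCall(DMDAGetLocalInfo(da, &info));
  PetscCall(DMGetLocalVector(da, &Uloc));                 /* ghosted copy so u[1], u[2] are visible at i = 0 */
  PetscCall(DMGlobalToLocal(da, U, INSERT_VALUES, Uloc));
  PetscCall(DMDAVecGetArrayRead(da, Uloc, &u));
  PetscCall(DMDAVecGetArrayRead(da, Udot, &udot));
  PetscCall(DMDAVecGetArray(da, F, &f));
  for (i = info.xs; i < info.xs + info.xm; ++i) {
    if (i == 0) f[i] = u[0] - 2.0 * u[1] + u[2];          /* algebraic boundary residual, no udot */
    else        f[i] = udot[i] + u[i];                    /* udot - f(u), with a stand-in stiff part f(u) = -u */
  }
  PetscCall(DMDAVecRestoreArray(da, F, &f));
  PetscCall(DMDAVecRestoreArrayRead(da, Udot, &udot));
  PetscCall(DMDAVecRestoreArrayRead(da, Uloc, &u));
  PetscCall(DMRestoreLocalVector(da, &Uloc));
  PetscFunctionReturn(0);
}

[The matching RHSFunction would fill only the explicit part g(u) for i > 0 and set the boundary row to zero; the pair is registered with TSSetIFunction(ts, NULL, IFunction, &ctx) and TSSetRHSFunction(ts, NULL, RHSFunction, &ctx) before selecting TSARKIMEX.]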
With adaptivity shut off, TS can finish its first time step after the 3rd ?Nonlinear solve converged due to ??. The solution gives negative electron and neutral temperatures at the bottom boundary. I need to fix the negative temperatures and see how the code works. BTW, what is this ts_adapt? Is it by default on? It is default for some of the TSTypes (in particular, the better ones). It adapts the timestep to ensure some local error estimate is below a certain tolerance. As Matt notes normally as it tries smaller and smaller time steps the local error estimate would get smaller and smaller; this is not happening here, hence the error. Have you tried with the argument -ts_arkimex_fully_implicit ? I am not an expert but my guess is something is "odd" about your functions, either the RHSFunction or the Function or both. Do you have a hierarchy of models for your problem? Could you try runs with fewer terms in your functions, that may be producing the difficulties? If you can determine what triggers the problem with the local error estimators, that might help the experts in ODE solution (not me) determine what could be going wrong. Barry Thank you, Jiannan From: Matthew Knepley Sent: Friday, February 17, 2023 3:15 PM To: Tu, Jiannan Cc: Barry Smith; petsc-users Subject: Re: [petsc-users] TS failed due to diverged_step_rejected CAUTION: This email was sent from outside the UMass Lowell network. I am not sure what TS you are using, but the estimate of the local truncation error is 91.4, and does not seem to change when you make the step smaller, so something is off. You can shut off the adaptivity using -ts_adapt_type none Thanks, Matt On Fri, Feb 17, 2023 at 3:01 PM Tu, Jiannan > wrote: These are what I got with the options you suggested. Thank you, Jiannan ------------------------------------------------------------------------------- 0 SNES Function norm 2.274473072186e+03 1 SNES Function norm 1.673091274668e-03 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 0 SNES Function norm 8.715428433630e-02 1 SNES Function norm 4.995727626692e-04 2 SNES Function norm 5.498018152230e-08 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 2 0 SNES Function norm 3.237461568254e-01 1 SNES Function norm 7.988531005091e-04 2 SNES Function norm 1.280948196292e-07 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 2 TSAdapt basic arkimex 0:3 step 0 rejected t=42960 + 2.189e-02 dt=4.374e-03 wlte= 91.4 wltea= -1 wlter= -1 0 SNES Function norm 2.274473072186e+03 1 SNES Function norm 4.881903203545e-04 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 0 SNES Function norm 7.562592690785e-02 1 SNES Function norm 1.143078818923e-04 2 SNES Function norm 9.834547907735e-09 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 2 0 SNES Function norm 2.683968949758e-01 1 SNES Function norm 1.838028436639e-04 2 SNES Function norm 9.470813523140e-09 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 2 TSAdapt basic arkimex 0:3 step 0 rejected t=42960 + 4.374e-03 dt=4.374e-04 wlte= 91.4 wltea= -1 wlter= -1 0 SNES Function norm 2.274473072186e+03 1 SNES Function norm 1.821562431175e-04 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 0 SNES Function norm 1.005443458812e-01 1 SNES Function norm 3.633336946661e-05 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 0 SNES Function norm 1.515368382715e-01 1 SNES Function norm 3.389298316830e-05 Nonlinear solve converged due to CONVERGED_FNORM_ABS 
iterations 1 TSAdapt basic arkimex 0:3 step 0 rejected t=42960 + 4.374e-04 dt=4.374e-05 wlte= 91.4 wltea= -1 wlter= -1 0 SNES Function norm 2.274473072186e+03 1 SNES Function norm 4.541003359206e-05 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 0 SNES Function norm 1.713800906043e-01 1 SNES Function norm 1.179958172167e-05 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 0 SNES Function norm 2.020265094117e-01 1 SNES Function norm 1.513971290464e-05 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 TSAdapt basic arkimex 0:3 step 0 rejected t=42960 + 4.374e-05 dt=4.374e-06 wlte= 91.4 wltea= -1 wlter= -1 0 SNES Function norm 2.274473072186e+03 1 SNES Function norm 6.090269704320e-06 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 0 SNES Function norm 2.136603895703e-01 1 SNES Function norm 1.877474016012e-06 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 0 SNES Function norm 3.127812462507e-01 1 SNES Function norm 2.713146825704e-06 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 TSAdapt basic arkimex 0:3 step 0 rejected t=42960 + 4.374e-06 dt=4.374e-07 wlte= 91.4 wltea= -1 wlter= -1 0 SNES Function norm 2.274473072186e+03 1 SNES Function norm 2.793512213059e-06 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 0 SNES Function norm 2.205196267430e-01 1 SNES Function norm 2.572653773308e-06 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 0 SNES Function norm 3.260057361977e-01 1 SNES Function norm 2.705816087598e-06 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 TSAdapt basic arkimex 0:3 step 0 rejected t=42960 + 4.374e-07 dt=4.374e-08 wlte= 91.4 wltea= -1 wlter= -1 0 SNES Function norm 2.274473072186e+03 1 SNES Function norm 2.764855860446e-05 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 0 SNES Function norm 2.212505522844e-01 1 SNES Function norm 2.958996472386e-05 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 0 SNES Function norm 3.273222034162e-01 1 SNES Function norm 2.994512887620e-05 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 TSAdapt basic arkimex 0:3 step 0 rejected t=42960 + 4.374e-08 dt=4.374e-09 wlte= 91.4 wltea= -1 wlter= -1 0 SNES Function norm 2.274473072186e+03 1 SNES Function norm 3.317240589134e-04 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 0 SNES Function norm 2.213246532918e-01 1 SNES Function norm 2.799468604767e-04 Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 0 SNES Function norm 3.274570888397e-01 1 SNES Function norm 3.066048050994e-04 Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 TSAdapt basic arkimex 0:3 step 0 rejected t=42960 + 4.374e-09 dt=4.374e-10 wlte= 91.4 wltea= -1 wlter= -1 0 SNES Function norm 2.274473072189e+03 1 SNES Function norm 2.653507278572e-03 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 0 SNES Function norm 2.213869585841e-01 1 SNES Function norm 2.177156902895e-03 Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 0 SNES Function norm 3.275136370365e-01 1 SNES Function norm 1.962849131557e-03 Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 TSAdapt basic arkimex 0:3 step 0 rejected t=42960 + 4.374e-10 dt=4.374e-11 wlte= 91.4 wltea= -1 wlter= -1 0 SNES Function norm 2.274473072218e+03 1 SNES Function norm 5.664907315679e-03 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE 
iterations 1 0 SNES Function norm 2.223208399368e-01 1 SNES Function norm 5.688863091415e-03 Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 0 SNES Function norm 3.287121218919e-01 1 SNES Function norm 4.085338521320e-03 Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 TSAdapt basic arkimex 0:3 step 0 rejected t=42960 + 4.374e-11 dt=4.374e-12 wlte= 91.4 wltea= -1 wlter= -1 0 SNES Function norm 2.274473071968e+03 1 SNES Function norm 4.694691905235e-04 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 0 SNES Function norm 2.211786508657e-01 1 SNES Function norm 1.503497433939e-04 Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 0 SNES Function norm 3.272667798977e-01 1 SNES Function norm 2.176132327279e-04 Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 TSAdapt basic arkimex 0:3 step 0 rejected t=42960 + 4.374e-12 dt=4.374e-13 wlte= 91.4 wltea= -1 wlter= -1 [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: [0]PETSC ERROR: TSStep has failed due to DIVERGED_STEP_REJECTED [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.16.6, Mar 30, 2022 [0]PETSC ERROR: ./iditm3d on a named office by jtu Fri Feb 17 14:54:22 2023 [0]PETSC ERROR: Configure options --prefix=/usr/local --with-mpi-dir=/usr/local --with-fc=0 --with-openmp --with-hdf5-dir=/usr/local --download-f2cblaslapack=1 [0]PETSC ERROR: #1 TSStep() at /home/jtu/Downloads/petsc-3.16.6/src/ts/interface/ts.c:3583 From: Barry Smith Sent: Friday, February 17, 2023 12:58 PM To: Tu, Jiannan Cc: petsc-users Subject: Re: [petsc-users] TS failed due to diverged_step_rejected CAUTION: This email was sent from outside the UMass Lowell network. Can you please run with also the options -ts_monitor -ts_adapt_monitor ? The output is confusing because it prints that the Nonlinear solve has converged but then TSStep has failed due to DIVERGED_STEP_REJECTED which seems contradictory On Feb 17, 2023, at 12:09 PM, Tu, Jiannan > wrote: My code uses TS to solve a set of multi-fluid MHD equations. The jacobian is provided with function F(t, u, u'). Both linear and nonlinear solvers converge but snes repeats itself until gets "TSStep has failed due to diverged_step_rejected." Is it because I used TSStep rather than TSSolve? I have checked the condition number. The condition number with pc_type asm is about 1 (without precondition it is about 4x10^4). The maximum ratio of off-diagonal jacobian element over diagonal element is about 21. Could you help me to identify what is going wrong? Thank you very much! Jiannan --------------------------------------------------------------------------------------------------- Run command with options mpiexec -n $1 ./iditm3d -ts_type arkimex -snes_tyep ngmres -ksp_type gmres -pc_type asm \ -ts_rtol 1.0e-4 -ts_atol 1.0e-4 -snes_monitor -snes_rtol 1.0e-4 -snes_atol 1.0e-4 \ -snes_converged_reason The output message is Start time advancing ... 
0 SNES Function norm 2.274473072186e+03 1 SNES Function norm 1.673091274668e-03 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 0 SNES Function norm 8.715428433630e-02 1 SNES Function norm 4.995727626692e-04 2 SNES Function norm 5.498018152230e-08 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 2 0 SNES Function norm 3.237461568254e-01 1 SNES Function norm 7.988531005091e-04 2 SNES Function norm 1.280948196292e-07 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 2 0 SNES Function norm 2.274473072186e+03 1 SNES Function norm 4.881903203545e-04 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 0 SNES Function norm 7.562592690785e-02 1 SNES Function norm 1.143078818923e-04 2 SNES Function norm 9.834547907735e-09 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 2 0 SNES Function norm 2.683968949758e-01 1 SNES Function norm 1.838028436639e-04 2 SNES Function norm 9.470813523140e-09 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 2 0 SNES Function norm 2.274473072186e+03 1 SNES Function norm 1.821562431175e-04 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 0 SNES Function norm 1.005443458812e-01 1 SNES Function norm 3.633336946661e-05 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 0 SNES Function norm 1.515368382715e-01 1 SNES Function norm 3.389298316830e-05 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 0 SNES Function norm 2.274473072186e+03 1 SNES Function norm 4.541003359206e-05 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 0 SNES Function norm 1.713800906043e-01 1 SNES Function norm 1.179958172167e-05 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 0 SNES Function norm 2.020265094117e-01 1 SNES Function norm 1.513971290464e-05 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 0 SNES Function norm 2.274473072186e+03 1 SNES Function norm 6.090269704320e-06 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 0 SNES Function norm 2.136603895703e-01 1 SNES Function norm 1.877474016012e-06 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 0 SNES Function norm 3.127812462507e-01 1 SNES Function norm 2.713146825704e-06 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 0 SNES Function norm 2.274473072186e+03 1 SNES Function norm 2.793512213059e-06 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 0 SNES Function norm 2.205196267430e-01 1 SNES Function norm 2.572653773308e-06 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 0 SNES Function norm 3.260057361977e-01 1 SNES Function norm 2.705816087598e-06 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 0 SNES Function norm 2.274473072186e+03 1 SNES Function norm 2.764855860446e-05 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 0 SNES Function norm 2.212505522844e-01 1 SNES Function norm 2.958996472386e-05 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 0 SNES Function norm 3.273222034162e-01 1 SNES Function norm 2.994512887620e-05 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 1 0 SNES Function norm 2.274473072186e+03 1 SNES Function norm 3.317240589134e-04 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 0 SNES Function norm 2.213246532918e-01 1 SNES Function norm 2.799468604767e-04 Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 0 SNES 
Function norm 3.274570888397e-01 1 SNES Function norm 3.066048050994e-04 Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 0 SNES Function norm 2.274473072189e+03 1 SNES Function norm 2.653507278572e-03 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 0 SNES Function norm 2.213869585841e-01 1 SNES Function norm 2.177156902895e-03 Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 0 SNES Function norm 3.275136370365e-01 1 SNES Function norm 1.962849131557e-03 Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 0 SNES Function norm 2.274473072218e+03 1 SNES Function norm 5.664907315679e-03 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 0 SNES Function norm 2.223208399368e-01 1 SNES Function norm 5.688863091415e-03 Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 0 SNES Function norm 3.287121218919e-01 1 SNES Function norm 4.085338521320e-03 Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 0 SNES Function norm 2.274473071968e+03 1 SNES Function norm 4.694691905235e-04 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 0 SNES Function norm 2.211786508657e-01 1 SNES Function norm 1.503497433939e-04 Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 0 SNES Function norm 3.272667798977e-01 1 SNES Function norm 2.176132327279e-04 Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: [0]PETSC ERROR: TSStep has failed due to DIVERGED_STEP_REJECTED [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.16.6, Mar 30, 2022 [0]PETSC ERROR: ./iditm3d on a named office by jtu Fri Feb 17 11:59:43 2023 [0]PETSC ERROR: Configure options --prefix=/usr/local --with-mpi-dir=/usr/local --with-fc=0 --with-openmp --with-hdf5-dir=/usr/local --download-f2cblaslapack=1 [0]PETSC ERROR: #1 TSStep() at /home/jtu/Downloads/petsc-3.16.6/src/ts/interface/ts.c:3583 -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: formfunctions.cpp Type: text/x-c++src Size: 37248 bytes Desc: formfunctions.cpp URL: From jed at jedbrown.org Wed May 10 14:50:40 2023 From: jed at jedbrown.org (Jed Brown) Date: Wed, 10 May 2023 13:50:40 -0600 Subject: [petsc-users] issues with VecSetValues in petsc 3.19 In-Reply-To: References: Message-ID: <87bkist17z.fsf@jedbrown.org> Edoardo alinovi writes: > Hello Barry, > > Welcome to the party! Thank you guys for your precious suggestions, they > are really helpful! > > It's been a while since I am messing around and I have tested many > combinations. Schur + selfp is the best preconditioner, it converges within > 5 iters using gmres for inner solvers but it is not very fast and sometimes > multiplicative it's a better option as the inner iterations looks lighter. > If I tune the relative tolerance I get a huge speed up, but I am a bit less > confident about the results. 
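[For reference, one hedged example of the kind of Schur/selfp option set being compared in this thread; the split prefixes assume the default fieldsplit names and the inner tolerances are purely illustrative, not a recommendation:]

-ksp_type fgmres -ksp_rtol 1e-6 \
-pc_type fieldsplit -pc_fieldsplit_type schur \
-pc_fieldsplit_schur_fact_type full \
-pc_fieldsplit_schur_precondition selfp \
-fieldsplit_0_ksp_type gmres -fieldsplit_0_ksp_rtol 1e-2 \
-fieldsplit_1_ksp_type gmres -fieldsplit_1_ksp_rtol 1e-2 \
-ksp_monitor_true_residual

[The flexible outer Krylov method (fgmres) is used because the inner split solves are themselves iterative with loose tolerances.]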
The funny thing is that the default relative > tolerance if commercial CFD solver is huge, very often 0.1 -.- This depends on time step size and tolerances for quantities of interest. > If I may borrow your brain guys for a while, I would like to ask your > opinion about the multigrid they use in this paper: > https://www.aub.edu.lb/msfea/research/Documents/CFD-P18.pdf. > At some point they say: " The algorithm used in this work is a combination > of the ILU(0) [28] algorithm with an additive corrective multigrid method > [29] ", 29: https://www.tandfonline.com/doi/abs/10.1080/10407788608913491 It looks like this is a coupled solver with collocated dofs and thus I don't know how they stabilize in the incompressible limit (an inf-sup requirement is needed for stability). (This is based on a cursory read.) From FERRANJ2 at my.erau.edu Thu May 11 13:14:45 2023 From: FERRANJ2 at my.erau.edu (Ferrand, Jesus A.) Date: Thu, 11 May 2023 18:14:45 +0000 Subject: [petsc-users] DMPlex, is there an API to get a list of Boundary points? Message-ID: Greetings. I terms of dm-plex terminology, I need a list points corresponding to the boundary (i.e., height-1 points whose support is of size 1). Sorry if this is trivial, but I've been looking at the list of APIs for DM-Plex and couldn't find/discern something that addresses this need. I guess I could loop through all height-1 points and call DMPlexGetSupportSize() and assemble said list manually. The assumption for the plex I'm working with is that it has had DMPlexInterpolate()called on it. Sincerely: J.A. Ferrand Embry-Riddle Aeronautical University - Daytona Beach - FL Ph.D. Candidate, Aerospace Engineering M.Sc. Aerospace Engineering B.Sc. Aerospace Engineering B.Sc. Computational Mathematics Phone: (386)-843-1829 Email(s): ferranj2 at my.erau.edu jesus.ferrand at gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Thu May 11 13:20:46 2023 From: jed at jedbrown.org (Jed Brown) Date: Thu, 11 May 2023 12:20:46 -0600 Subject: [petsc-users] DMPlex, is there an API to get a list of Boundary points? In-Reply-To: References: Message-ID: <87h6sispa9.fsf@jedbrown.org> Boundary faces are often labeled already on a mesh, but you can use this to set a label for all boundary faces. https://petsc.org/main/manualpages/DMPlex/DMPlexMarkBoundaryFaces/ "Ferrand, Jesus A." writes: > Greetings. > > I terms of dm-plex terminology, I need a list points corresponding to the boundary (i.e., height-1 points whose support is of size 1). > Sorry if this is trivial, but I've been looking at the list of APIs for DM-Plex and couldn't find/discern something that addresses this need. > > I guess I could loop through all height-1 points and call DMPlexGetSupportSize() and assemble said list manually. > The assumption for the plex I'm working with is that it has had DMPlexInterpolate()called on it. > > > Sincerely: > > J.A. Ferrand > > Embry-Riddle Aeronautical University - Daytona Beach - FL > Ph.D. Candidate, Aerospace Engineering > > M.Sc. Aerospace Engineering > > B.Sc. Aerospace Engineering > > B.Sc. 
Computational Mathematics > > > Phone: (386)-843-1829 > > Email(s): ferranj2 at my.erau.edu > > jesus.ferrand at gmail.com From myoung.space.science at gmail.com Thu May 11 20:14:21 2023 From: myoung.space.science at gmail.com (Matthew Young) Date: Thu, 11 May 2023 21:14:21 -0400 Subject: [petsc-users] DMSWARM particle coordinates per rank Message-ID: Does setting up a PIC-type DMSWARM with an associated cell DM guarantee that each MPI rank will own the particles with coordinates inside the bounds of the portion of the grid it owns? --Matt ========================== Matthew Young, PhD (he/him) Research Scientist II Space Science Center University of New Hampshire Matthew.Young at unh.edu ========================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri May 12 04:14:59 2023 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 12 May 2023 05:14:59 -0400 Subject: [petsc-users] DMSWARM particle coordinates per rank In-Reply-To: References: Message-ID: On Thu, May 11, 2023 at 9:15?PM Matthew Young < myoung.space.science at gmail.com> wrote: > Does setting up a PIC-type DMSWARM with an associated cell DM guarantee > that each MPI rank will own the particles with coordinates inside the > bounds of the portion of the grid it owns? > There is a caveat that we are currently fixing. Swarm communication is setup to be nearest neighbor (since there is no coarse grid of bounding boxes). So if your particles are initially in the right place, and only move nearest neighbor, everything is fine. We are adding a hierarchy of bounding boxes so that we can communicate anywhere. Thanks, Matt > --Matt > ========================== > Matthew Young, PhD (he/him) > Research Scientist II > Space Science Center > University of New Hampshire > Matthew.Young at unh.edu > ========================== > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From myoung.space.science at gmail.com Fri May 12 08:39:49 2023 From: myoung.space.science at gmail.com (Matthew Young) Date: Fri, 12 May 2023 09:39:49 -0400 Subject: [petsc-users] DMSWARM particle coordinates per rank In-Reply-To: References: Message-ID: Got it. I'm specifically thinking about this in terms of the gather stage of my PIC code, where I loop over local particles to fill local density and flux arrays by linearly interpolating particle positions to the grid. The gather function currently assumes that the coordinates (i.e., the array representation of DMSwarmPICField_coor) of all particles on a given rank would correspond to only the global indices owned by that rank, via the relationship between indices and coordinates in the associated cell DM. Based on what you described, it sounds like I need to make sure that I initially lay down the particles so that their rank matches their coordinates. 
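[A minimal sketch of that initial layout, assuming a PIC-type swarm sw whose cell DM da has already been attached with DMSwarmSetCellDM(): particles are seeded only inside the locally owned patch, and DMSwarmMigrate() then fills in the rank/cellid fields. The uniform random sampler is just a placeholder.]

#include <stdlib.h>
#include <petscdmswarm.h>

/* seed npLocal particles inside the local bounding box of the cell DM, then migrate */
PetscReal  lmin[3], lmax[3];
PetscReal *coor;
PetscInt   p, d, bs, npLocal = 100;

PetscCall(DMGetLocalBoundingBox(da, lmin, lmax));
PetscCall(DMSwarmSetLocalSizes(sw, npLocal, 0));
PetscCall(DMSwarmGetField(sw, DMSwarmPICField_coor, &bs, NULL, (void **)&coor));
for (p = 0; p < npLocal; ++p) {
  for (d = 0; d < bs; ++d) {
    PetscReal r = (PetscReal)rand() / (PetscReal)RAND_MAX;  /* placeholder sampler */
    coor[p * bs + d] = lmin[d] + r * (lmax[d] - lmin[d]);
  }
}
PetscCall(DMSwarmRestoreField(sw, DMSwarmPICField_coor, &bs, NULL, (void **)&coor));
PetscCall(DMSwarmMigrate(sw, PETSC_TRUE)); /* locates particles, sets rank/cellid, removes sent copies */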
--Matt ========================== Matthew Young, PhD (he/him) Research Scientist II Space Science Center University of New Hampshire Matthew.Young at unh.edu ========================== On Fri, May 12, 2023 at 5:15?AM Matthew Knepley wrote: > On Thu, May 11, 2023 at 9:15?PM Matthew Young < > myoung.space.science at gmail.com> wrote: > >> Does setting up a PIC-type DMSWARM with an associated cell DM guarantee >> that each MPI rank will own the particles with coordinates inside the >> bounds of the portion of the grid it owns? >> > > There is a caveat that we are currently fixing. Swarm communication is > setup to be nearest neighbor (since there is no coarse grid of > bounding boxes). So if your particles are initially in the right place, and > only move nearest neighbor, everything is fine. We are adding a hierarchy > of bounding boxes so that we can communicate anywhere. > > Thanks, > > Matt > > >> --Matt >> ========================== >> Matthew Young, PhD (he/him) >> Research Scientist II >> Space Science Center >> University of New Hampshire >> Matthew.Young at unh.edu >> ========================== >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri May 12 09:05:26 2023 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 12 May 2023 10:05:26 -0400 Subject: [petsc-users] DMSWARM particle coordinates per rank In-Reply-To: References: Message-ID: On Fri, May 12, 2023 at 9:40?AM Matthew Young < myoung.space.science at gmail.com> wrote: > Got it. > > I'm specifically thinking about this in terms of the gather stage of my > PIC code, where I loop over local particles to fill local density and flux > arrays by linearly interpolating particle positions to the grid. The gather > function currently assumes that the coordinates (i.e., the array > representation of DMSwarmPICField_coor) of all particles on a given rank > would correspond to only the global indices owned by that rank, via the > relationship between indices and coordinates in the associated cell DM. > Based on what you described, it sounds like I need to make sure that I > initially lay down the particles so that their rank matches their > coordinates. > Right now, yes. I will fix that before August. Thanks, Matt > --Matt > ========================== > Matthew Young, PhD (he/him) > Research Scientist II > Space Science Center > University of New Hampshire > Matthew.Young at unh.edu > ========================== > > > On Fri, May 12, 2023 at 5:15?AM Matthew Knepley wrote: > >> On Thu, May 11, 2023 at 9:15?PM Matthew Young < >> myoung.space.science at gmail.com> wrote: >> >>> Does setting up a PIC-type DMSWARM with an associated cell DM guarantee >>> that each MPI rank will own the particles with coordinates inside the >>> bounds of the portion of the grid it owns? >>> >> >> There is a caveat that we are currently fixing. Swarm communication is >> setup to be nearest neighbor (since there is no coarse grid of >> bounding boxes). So if your particles are initially in the right place, and >> only move nearest neighbor, everything is fine. We are adding a hierarchy >> of bounding boxes so that we can communicate anywhere. 
>> >> Thanks, >> >> Matt >> >> >>> --Matt >>> ========================== >>> Matthew Young, PhD (he/him) >>> Research Scientist II >>> Space Science Center >>> University of New Hampshire >>> Matthew.Young at unh.edu >>> ========================== >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From myoung.space.science at gmail.com Fri May 12 09:33:42 2023 From: myoung.space.science at gmail.com (Matthew Young) Date: Fri, 12 May 2023 10:33:42 -0400 Subject: [petsc-users] DMSWARM particle coordinates per rank In-Reply-To: References: Message-ID: Okay, cool. Should I do that by explicitly setting each particle's DMSwarmField_rank when I assign its position? What role does DMSwarmMigrate play in all of this? --Matt ========================== Matthew Young, PhD (he/him) Research Scientist II Space Science Center University of New Hampshire Matthew.Young at unh.edu ========================== On Fri, May 12, 2023 at 10:05?AM Matthew Knepley wrote: > On Fri, May 12, 2023 at 9:40?AM Matthew Young < > myoung.space.science at gmail.com> wrote: > >> Got it. >> >> I'm specifically thinking about this in terms of the gather stage of my >> PIC code, where I loop over local particles to fill local density and flux >> arrays by linearly interpolating particle positions to the grid. The gather >> function currently assumes that the coordinates (i.e., the array >> representation of DMSwarmPICField_coor) of all particles on a given rank >> would correspond to only the global indices owned by that rank, via the >> relationship between indices and coordinates in the associated cell DM. >> Based on what you described, it sounds like I need to make sure that I >> initially lay down the particles so that their rank matches their >> coordinates. >> > > Right now, yes. I will fix that before August. > > Thanks, > > Matt > > >> --Matt >> ========================== >> Matthew Young, PhD (he/him) >> Research Scientist II >> Space Science Center >> University of New Hampshire >> Matthew.Young at unh.edu >> ========================== >> >> >> On Fri, May 12, 2023 at 5:15?AM Matthew Knepley >> wrote: >> >>> On Thu, May 11, 2023 at 9:15?PM Matthew Young < >>> myoung.space.science at gmail.com> wrote: >>> >>>> Does setting up a PIC-type DMSWARM with an associated cell DM guarantee >>>> that each MPI rank will own the particles with coordinates inside the >>>> bounds of the portion of the grid it owns? >>>> >>> >>> There is a caveat that we are currently fixing. Swarm communication is >>> setup to be nearest neighbor (since there is no coarse grid of >>> bounding boxes). So if your particles are initially in the right place, and >>> only move nearest neighbor, everything is fine. We are adding a hierarchy >>> of bounding boxes so that we can communicate anywhere. 
>>> >>> Thanks, >>> >>> Matt >>> >>> >>>> --Matt >>>> ========================== >>>> Matthew Young, PhD (he/him) >>>> Research Scientist II >>>> Space Science Center >>>> University of New Hampshire >>>> Matthew.Young at unh.edu >>>> ========================== >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri May 12 09:42:21 2023 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 12 May 2023 10:42:21 -0400 Subject: [petsc-users] DMSWARM particle coordinates per rank In-Reply-To: References: Message-ID: On Fri, May 12, 2023 at 10:34?AM Matthew Young < myoung.space.science at gmail.com> wrote: > Okay, cool. > Should I do that by explicitly setting each particle's DMSwarmField_rank > when I assign its position? > What role does DMSwarmMigrate play in all of this? > If you are using a DMDA or a DMPlex, then PETSc now how to do parallel point location (at least for neighbors). So if you add particles on the process which owns that part of the grid, and then call DMSwarmMigrate(), it will locate every particle, and set the 'rank' and 'cellid' correctly. Then if you move the particles (change the coordinates) and call DMSwarmMigrate(), it will move them to the correct process and reset those entries. Thanks, Matt > --Matt > ========================== > Matthew Young, PhD (he/him) > Research Scientist II > Space Science Center > University of New Hampshire > Matthew.Young at unh.edu > ========================== > > > On Fri, May 12, 2023 at 10:05?AM Matthew Knepley > wrote: > >> On Fri, May 12, 2023 at 9:40?AM Matthew Young < >> myoung.space.science at gmail.com> wrote: >> >>> Got it. >>> >>> I'm specifically thinking about this in terms of the gather stage of my >>> PIC code, where I loop over local particles to fill local density and flux >>> arrays by linearly interpolating particle positions to the grid. The gather >>> function currently assumes that the coordinates (i.e., the array >>> representation of DMSwarmPICField_coor) of all particles on a given rank >>> would correspond to only the global indices owned by that rank, via the >>> relationship between indices and coordinates in the associated cell DM. >>> Based on what you described, it sounds like I need to make sure that I >>> initially lay down the particles so that their rank matches their >>> coordinates. >>> >> >> Right now, yes. I will fix that before August. 
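A minimal sketch of the place-then-migrate pattern just described, assuming an existing cell DM (called dmda here). The helper name PlaceAndMigrate, the per-rank particle count npLocal, and the placeholder positions are all illustrative, not taken from any code in this thread.

```c
#include <petscdmswarm.h>

/* Sketch: create a PIC swarm on an existing cell DM, place particles inside the
   locally owned part of the grid, then let DMSwarmMigrate() set rank/cellid. */
static PetscErrorCode PlaceAndMigrate(DM dmda, PetscInt npLocal, DM *swarm)
{
  DM         sw;
  PetscInt   dim, p, d;
  PetscReal  lmin[3], lmax[3], *coor;

  PetscFunctionBeginUser;
  PetscCall(DMGetDimension(dmda, &dim));
  PetscCall(DMCreate(PetscObjectComm((PetscObject)dmda), &sw));
  PetscCall(DMSetType(sw, DMSWARM));
  PetscCall(DMSetDimension(sw, dim));
  PetscCall(DMSwarmSetType(sw, DMSWARM_PIC));
  PetscCall(DMSwarmSetCellDM(sw, dmda));
  PetscCall(DMSwarmFinalizeFieldRegister(sw));
  PetscCall(DMSwarmSetLocalSizes(sw, npLocal, 4));

  /* Bounding box of the locally owned grid, so initial positions match the rank */
  PetscCall(DMGetLocalBoundingBox(dmda, lmin, lmax));
  PetscCall(DMSwarmGetField(sw, DMSwarmPICField_coor, NULL, NULL, (void **)&coor));
  for (p = 0; p < npLocal; ++p) {
    /* Placeholder positions: real code would scatter points inside [lmin, lmax] */
    for (d = 0; d < dim; ++d) coor[p * dim + d] = lmin[d] + 0.5 * (lmax[d] - lmin[d]);
  }
  PetscCall(DMSwarmRestoreField(sw, DMSwarmPICField_coor, NULL, NULL, (void **)&coor));

  /* Point location: fills the built-in 'rank' and 'cellid' fields and moves strays */
  PetscCall(DMSwarmMigrate(sw, PETSC_TRUE));
  *swarm = sw;
  PetscFunctionReturn(PETSC_SUCCESS);
}
```

After DMSwarmMigrate() returns, the built-in 'rank' and 'cellid' fields are consistent with the particle coordinates, so a gather over DMSwarmPICField_coor can trust that every local particle lies in the locally owned part of the grid.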
>> >> Thanks, >> >> Matt >> >> >>> --Matt >>> ========================== >>> Matthew Young, PhD (he/him) >>> Research Scientist II >>> Space Science Center >>> University of New Hampshire >>> Matthew.Young at unh.edu >>> ========================== >>> >>> >>> On Fri, May 12, 2023 at 5:15?AM Matthew Knepley >>> wrote: >>> >>>> On Thu, May 11, 2023 at 9:15?PM Matthew Young < >>>> myoung.space.science at gmail.com> wrote: >>>> >>>>> Does setting up a PIC-type DMSWARM with an associated cell DM >>>>> guarantee that each MPI rank will own the particles with coordinates inside >>>>> the bounds of the portion of the grid it owns? >>>>> >>>> >>>> There is a caveat that we are currently fixing. Swarm communication is >>>> setup to be nearest neighbor (since there is no coarse grid of >>>> bounding boxes). So if your particles are initially in the right place, and >>>> only move nearest neighbor, everything is fine. We are adding a hierarchy >>>> of bounding boxes so that we can communicate anywhere. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> --Matt >>>>> ========================== >>>>> Matthew Young, PhD (he/him) >>>>> Research Scientist II >>>>> Space Science Center >>>>> University of New Hampshire >>>>> Matthew.Young at unh.edu >>>>> ========================== >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From yangzongze at gmail.com Sat May 13 05:08:23 2023 From: yangzongze at gmail.com (Zongze Yang) Date: Sat, 13 May 2023 18:08:23 +0800 Subject: [petsc-users] How to find the map between the high order coordinates of DMPlex and vertex numbering? In-Reply-To: References: <2640A1A9-101C-4DFB-BFA4-C64AF231732A@gmail.com> Message-ID: Hi, Matt, There seem to be ongoing issues with projecting high-order coordinates from a gmsh file to other spaces. I would like to inquire whether there are any plans to resolve this problem. Thank you for your attention to this matter. Best wishes, Zongze On Sat, 18 Jun 2022 at 20:31, Zongze Yang wrote: > Thank you for your reply. May I ask for some references on the order of > the dofs on PETSc's FE Space (especially high order elements)? > > Thanks, > > Zongze > > Matthew Knepley ?2022?6?18??? 20:02??? > >> On Sat, Jun 18, 2022 at 2:16 AM Zongze Yang wrote: >> >>> In order to check if I made mistakes in the python code, I try to use c >>> code to show the issue on DMProjectCoordinates. The code and mesh file is >>> attached. >>> If the code is correct, there must be something wrong with >>> `DMProjectCoordinates` or `DMPlexCreateGmshFromFile` for high-order mesh. >>> >> >> Something is definitely wrong with high order, periodic simplices from >> Gmsh. We had not tested that case. I am at a conference and cannot look at >> it for a week. 
>> My suspicion is that the space we make when reading in the Gmsh >> coordinates does not match the values (wrong order). >> >> Thanks, >> >> Matt >> >> >>> The command and the output are listed below: (Obviously the bounding box >>> is changed.) >>> ``` >>> $ ./test_gmsh_load_2rd -filename cube-p2.msh -old_fe_view -new_fe_view >>> Old Bounding Box: >>> 0: lo = 0. hi = 1. >>> 1: lo = 0. hi = 1. >>> 2: lo = 0. hi = 1. >>> PetscFE Object: OldCoordinatesFE 1 MPI processes >>> type: basic >>> Basic Finite Element in 3 dimensions with 3 components >>> PetscSpace Object: P2 1 MPI processes >>> type: sum >>> Space in 3 variables with 3 components, size 30 >>> Sum space of 3 concatenated subspaces (all identical) >>> PetscSpace Object: sum component (sumcomp_) 1 MPI processes >>> type: poly >>> Space in 3 variables with 1 components, size 10 >>> Polynomial space of degree 2 >>> PetscDualSpace Object: P2 1 MPI processes >>> type: lagrange >>> Dual space with 3 components, size 30 >>> Discontinuous Lagrange dual space >>> Quadrature of order 5 on 27 points (dim 3) >>> PetscFE Object: NewCoordinatesFE 1 MPI processes >>> type: basic >>> Basic Finite Element in 3 dimensions with 3 components >>> PetscSpace Object: P2 1 MPI processes >>> type: sum >>> Space in 3 variables with 3 components, size 30 >>> Sum space of 3 concatenated subspaces (all identical) >>> PetscSpace Object: sum component (sumcomp_) 1 MPI processes >>> type: poly >>> Space in 3 variables with 1 components, size 10 >>> Polynomial space of degree 2 >>> PetscDualSpace Object: P2 1 MPI processes >>> type: lagrange >>> Dual space with 3 components, size 30 >>> Continuous Lagrange dual space >>> Quadrature of order 5 on 27 points (dim 3) >>> New Bounding Box: >>> 0: lo = 2.5624e-17 hi = 8. >>> 1: lo = -9.23372e-17 hi = 7. >>> 2: lo = 2.72091e-17 hi = 8.5 >>> ``` >>> >>> Thanks, >>> Zongze >>> >>> Zongze Yang ?2022?6?17??? 14:54??? >>> >>>> I tried the projection operation. However, it seems that the projection >>>> gives the wrong solution. After projection, the bounding box is changed! >>>> See logs below. 
>>>> >>>> First, I patch the petsc4py by adding `DMProjectCoordinates`: >>>> ``` >>>> diff --git a/src/binding/petsc4py/src/PETSc/DM.pyx >>>> b/src/binding/petsc4py/src/PETSc/DM.pyx >>>> index d8a58d183a..dbcdb280f1 100644 >>>> --- a/src/binding/petsc4py/src/PETSc/DM.pyx >>>> +++ b/src/binding/petsc4py/src/PETSc/DM.pyx >>>> @@ -307,6 +307,12 @@ cdef class DM(Object): >>>> PetscINCREF(c.obj) >>>> return c >>>> >>>> + def projectCoordinates(self, FE fe=None): >>>> + if fe is None: >>>> + CHKERR( DMProjectCoordinates(self.dm, NULL) ) >>>> + else: >>>> + CHKERR( DMProjectCoordinates(self.dm, fe.fe) ) >>>> + >>>> def getBoundingBox(self): >>>> cdef PetscInt i,dim=0 >>>> CHKERR( DMGetCoordinateDim(self.dm, &dim) ) >>>> diff --git a/src/binding/petsc4py/src/PETSc/petscdm.pxi >>>> b/src/binding/petsc4py/src/PETSc/petscdm.pxi >>>> index 514b6fa472..c778e39884 100644 >>>> --- a/src/binding/petsc4py/src/PETSc/petscdm.pxi >>>> +++ b/src/binding/petsc4py/src/PETSc/petscdm.pxi >>>> @@ -90,6 +90,7 @@ cdef extern from * nogil: >>>> int DMGetCoordinateDim(PetscDM,PetscInt*) >>>> int DMSetCoordinateDim(PetscDM,PetscInt) >>>> int DMLocalizeCoordinates(PetscDM) >>>> + int DMProjectCoordinates(PetscDM, PetscFE) >>>> >>>> int DMCreateInterpolation(PetscDM,PetscDM,PetscMat*,PetscVec*) >>>> int DMCreateInjection(PetscDM,PetscDM,PetscMat*) >>>> ``` >>>> >>>> Then in python, I load a mesh and project the coordinates to P2: >>>> ``` >>>> import firedrake as fd >>>> from firedrake.petsc import PETSc >>>> >>>> # plex = fd.mesh._from_gmsh('test-fd-load-p2.msh') >>>> plex = fd.mesh._from_gmsh('test-fd-load-p2-rect.msh') >>>> print('old bbox:', plex.getBoundingBox()) >>>> >>>> dim = plex.getDimension() >>>> # (dim, nc, isSimplex, k, >>>> qorder, comm=None) >>>> fe_new = PETSc.FE().createLagrange(dim, dim, True, 2, >>>> PETSc.DETERMINE) >>>> plex.projectCoordinates(fe_new) >>>> fe_new.view() >>>> >>>> print('new bbox:', plex.getBoundingBox()) >>>> ``` >>>> >>>> The output is (The bounding box is changed!) >>>> ``` >>>> >>>> old bbox: ((0.0, 1.0), (0.0, 1.0), (0.0, 1.0)) >>>> PetscFE Object: P2 1 MPI processes >>>> type: basic >>>> Basic Finite Element in 3 dimensions with 3 components >>>> PetscSpace Object: P2 1 MPI processes >>>> type: sum >>>> Space in 3 variables with 3 components, size 30 >>>> Sum space of 3 concatenated subspaces (all identical) >>>> PetscSpace Object: sum component (sumcomp_) 1 MPI processes >>>> type: poly >>>> Space in 3 variables with 1 components, size 10 >>>> Polynomial space of degree 2 >>>> PetscDualSpace Object: P2 1 MPI processes >>>> type: lagrange >>>> Dual space with 3 components, size 30 >>>> Continuous Lagrange dual space >>>> Quadrature of order 5 on 27 points (dim 3) >>>> new bbox: ((-6.530133708576188e-17, 36.30670832662781), (-3.899962995254311e-17, 36.2406171632539), (-8.8036464152166e-17, 36.111577025012224)) >>>> >>>> ``` >>>> >>>> >>>> By the way, for the original DG coordinates, where can I find the relation of the closure and the order of the dofs for the cell? >>>> >>>> >>>> Thanks! >>>> >>>> >>>> Zongze >>>> >>>> >>>> >>>> Matthew Knepley ?2022?6?17??? 01:11??? >>>> >>>>> On Thu, Jun 16, 2022 at 12:06 PM Zongze Yang >>>>> wrote: >>>>> >>>>>> >>>>>> >>>>>> ? 2022?6?16??23:22?Matthew Knepley ??? >>>>>> >>>>>> ? >>>>>> On Thu, Jun 16, 2022 at 11:11 AM Zongze Yang >>>>>> wrote: >>>>>> >>>>>>> Hi, if I load a `gmsh` file with second-order elements, the >>>>>>> coordinates will be stored in a DG-P2 space. 
After obtaining the >>>>>>> coordinates of a cell, how can I map the coordinates to vertex and edge? >>>>>>> >>>>>> >>>>>> By default, they are stored as P2, not DG. >>>>>> >>>>>> >>>>>> I checked the coordinates vector, and found the dogs only defined on >>>>>> cell other than vertex and edge, so I said they are stored as DG. >>>>>> Then the function DMPlexVecGetClosure >>>>>> seems return >>>>>> the coordinates in lex order. >>>>>> >>>>>> Some code in reading gmsh file reads that >>>>>> >>>>>> >>>>>> 1756: if (isSimplex) continuity = PETSC_FALSE >>>>>> ; /* XXX >>>>>> FIXME Requires DMPlexSetClosurePermutationLexicographic() */ >>>>>> >>>>>> >>>>>> 1758: GmshCreateFE(comm, NULL, isSimplex, continuity, nodeType, >>>>>> dim, coordDim, order, &fe) >>>>>> >>>>>> >>>>>> The continuity is set to false for simplex. >>>>>> >>>>> >>>>> Oh, yes. That needs to be fixed. For now, you can just project it to >>>>> P2 if you want using >>>>> >>>>> https://petsc.org/main/docs/manualpages/DM/DMProjectCoordinates/ >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> Thanks, >>>>>> Zongze >>>>>> >>>>>> You can ask for the coordinates of a vertex or an edge directly using >>>>>> >>>>>> >>>>>> https://petsc.org/main/docs/manualpages/DMPLEX/DMPlexPointLocalRead/ >>>>>> >>>>>> by giving the vertex or edge point. You can get all the coordinates >>>>>> on a cell, in the closure order, using >>>>>> >>>>>> https://petsc.org/main/docs/manualpages/DMPLEX/DMPlexVecGetClosure/ >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>>> Below is some code load the gmsh file, I want to know the relation >>>>>>> between `cl` and `cell_coords`. >>>>>>> >>>>>>> ``` >>>>>>> import firedrake as fd >>>>>>> import numpy as np >>>>>>> >>>>>>> # Load gmsh file (2rd) >>>>>>> plex = fd.mesh._from_gmsh('test-fd-load-p2-rect.msh') >>>>>>> >>>>>>> cs, ce = plex.getHeightStratum(0) >>>>>>> >>>>>>> cdm = plex.getCoordinateDM() >>>>>>> csec = dm.getCoordinateSection() >>>>>>> coords_gvec = dm.getCoordinates() >>>>>>> >>>>>>> for i in range(cs, ce): >>>>>>> cell_coords = cdm.getVecClosure(csec, coords_gvec, i) >>>>>>> print(f'coordinates for cell {i} :\n{cell_coords.reshape([-1, >>>>>>> 3])}') >>>>>>> cl = dm.getTransitiveClosure(i) >>>>>>> print('closure:', cl) >>>>>>> break >>>>>>> ``` >>>>>>> >>>>>>> Best wishes, >>>>>>> Zongze >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>> >>>>>> >>>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From edoardo.alinovi at gmail.com Sat May 13 05:35:26 2023 From: edoardo.alinovi at gmail.com (Edoardo alinovi) Date: Sat, 13 May 2023 12:35:26 +0200 Subject: [petsc-users] Help with KSPSetConvergenceTest In-Reply-To: References: <02578B71-E871-4DC1-A06E-8F96990E5C13@petsc.dev> <7BACC494-991D-478F-8585-882AB997A7CA@petsc.dev> Message-ID: Hello Barry, I have seen you guys merged in main the minimum tolerance stuff. After compiling that branch, I have tried to call KSPSetMinimumIterations(this%ksp, this%minIter, ierr), but the compiler cannot find the function. I have included this module as standard practice: #include "petsc/finclude/petscksp.h" use petscksp Maybe I am missing something else? Thank you! -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefano.zampini at gmail.com Sat May 13 05:46:15 2023 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Sat, 13 May 2023 13:46:15 +0300 Subject: [petsc-users] Help with KSPSetConvergenceTest In-Reply-To: References: <02578B71-E871-4DC1-A06E-8F96990E5C13@petsc.dev> <7BACC494-991D-478F-8585-882AB997A7CA@petsc.dev> Message-ID: Run make allfortranstubs to generate the fortran interfaces, then make On Sat, May 13, 2023, 13:35 Edoardo alinovi wrote: > Hello Barry, > > I have seen you guys merged in main the minimum tolerance stuff. > > After compiling that branch, I have tried to > call KSPSetMinimumIterations(this%ksp, this%minIter, ierr), but the > compiler cannot find the function. > > I have included this module as standard practice: > #include "petsc/finclude/petscksp.h" > use petscksp > > Maybe I am missing something else? > > Thank you! > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From edoardo.alinovi at gmail.com Sat May 13 05:51:09 2023 From: edoardo.alinovi at gmail.com (Edoardo alinovi) Date: Sat, 13 May 2023 12:51:09 +0200 Subject: [petsc-users] Help with KSPSetConvergenceTest In-Reply-To: References: <02578B71-E871-4DC1-A06E-8F96990E5C13@petsc.dev> <7BACC494-991D-478F-8585-882AB997A7CA@petsc.dev> Message-ID: Ciao Stefano, My bad, maybe it is not fully merged yet :) [image: image.png] -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 148914 bytes Desc: not available URL: From knepley at gmail.com Sun May 14 03:44:03 2023 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 14 May 2023 04:44:03 -0400 Subject: [petsc-users] How to find the map between the high order coordinates of DMPlex and vertex numbering? In-Reply-To: References: <2640A1A9-101C-4DFB-BFA4-C64AF231732A@gmail.com> Message-ID: On Sat, May 13, 2023 at 6:08?AM Zongze Yang wrote: > Hi, Matt, > > There seem to be ongoing issues with projecting high-order coordinates > from a gmsh file to other spaces. I would like to inquire whether there are > any plans to resolve this problem. > > Thank you for your attention to this matter. > Yes, I will look at it. The important thing is to have a good test. Here are the higher order geometry tests https://gitlab.com/petsc/petsc/-/blob/main/src/dm/impls/plex/tests/ex33.c I take shapes with known volume, mesh them with higher order geometry, and look at the convergence to the true volume. Could you add a GMsh test, meaning the .msh file and known volume, and I will fix it? Thanks, Matt > Best wishes, > Zongze > > > On Sat, 18 Jun 2022 at 20:31, Zongze Yang wrote: > >> Thank you for your reply. 
May I ask for some references on the order of >> the dofs on PETSc's FE Space (especially high order elements)? >> >> Thanks, >> >> Zongze >> >> Matthew Knepley ?2022?6?18??? 20:02??? >> >>> On Sat, Jun 18, 2022 at 2:16 AM Zongze Yang >>> wrote: >>> >>>> In order to check if I made mistakes in the python code, I try to use c >>>> code to show the issue on DMProjectCoordinates. The code and mesh file is >>>> attached. >>>> If the code is correct, there must be something wrong with >>>> `DMProjectCoordinates` or `DMPlexCreateGmshFromFile` for high-order mesh. >>>> >>> >>> Something is definitely wrong with high order, periodic simplices from >>> Gmsh. We had not tested that case. I am at a conference and cannot look at >>> it for a week. >>> My suspicion is that the space we make when reading in the Gmsh >>> coordinates does not match the values (wrong order). >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> The command and the output are listed below: (Obviously the bounding >>>> box is changed.) >>>> ``` >>>> $ ./test_gmsh_load_2rd -filename cube-p2.msh -old_fe_view -new_fe_view >>>> Old Bounding Box: >>>> 0: lo = 0. hi = 1. >>>> 1: lo = 0. hi = 1. >>>> 2: lo = 0. hi = 1. >>>> PetscFE Object: OldCoordinatesFE 1 MPI processes >>>> type: basic >>>> Basic Finite Element in 3 dimensions with 3 components >>>> PetscSpace Object: P2 1 MPI processes >>>> type: sum >>>> Space in 3 variables with 3 components, size 30 >>>> Sum space of 3 concatenated subspaces (all identical) >>>> PetscSpace Object: sum component (sumcomp_) 1 MPI processes >>>> type: poly >>>> Space in 3 variables with 1 components, size 10 >>>> Polynomial space of degree 2 >>>> PetscDualSpace Object: P2 1 MPI processes >>>> type: lagrange >>>> Dual space with 3 components, size 30 >>>> Discontinuous Lagrange dual space >>>> Quadrature of order 5 on 27 points (dim 3) >>>> PetscFE Object: NewCoordinatesFE 1 MPI processes >>>> type: basic >>>> Basic Finite Element in 3 dimensions with 3 components >>>> PetscSpace Object: P2 1 MPI processes >>>> type: sum >>>> Space in 3 variables with 3 components, size 30 >>>> Sum space of 3 concatenated subspaces (all identical) >>>> PetscSpace Object: sum component (sumcomp_) 1 MPI processes >>>> type: poly >>>> Space in 3 variables with 1 components, size 10 >>>> Polynomial space of degree 2 >>>> PetscDualSpace Object: P2 1 MPI processes >>>> type: lagrange >>>> Dual space with 3 components, size 30 >>>> Continuous Lagrange dual space >>>> Quadrature of order 5 on 27 points (dim 3) >>>> New Bounding Box: >>>> 0: lo = 2.5624e-17 hi = 8. >>>> 1: lo = -9.23372e-17 hi = 7. >>>> 2: lo = 2.72091e-17 hi = 8.5 >>>> ``` >>>> >>>> Thanks, >>>> Zongze >>>> >>>> Zongze Yang ?2022?6?17??? 14:54??? >>>> >>>>> I tried the projection operation. However, it seems that the >>>>> projection gives the wrong solution. After projection, the bounding box is >>>>> changed! See logs below. 
>>>>> >>>>> First, I patch the petsc4py by adding `DMProjectCoordinates`: >>>>> ``` >>>>> diff --git a/src/binding/petsc4py/src/PETSc/DM.pyx >>>>> b/src/binding/petsc4py/src/PETSc/DM.pyx >>>>> index d8a58d183a..dbcdb280f1 100644 >>>>> --- a/src/binding/petsc4py/src/PETSc/DM.pyx >>>>> +++ b/src/binding/petsc4py/src/PETSc/DM.pyx >>>>> @@ -307,6 +307,12 @@ cdef class DM(Object): >>>>> PetscINCREF(c.obj) >>>>> return c >>>>> >>>>> + def projectCoordinates(self, FE fe=None): >>>>> + if fe is None: >>>>> + CHKERR( DMProjectCoordinates(self.dm, NULL) ) >>>>> + else: >>>>> + CHKERR( DMProjectCoordinates(self.dm, fe.fe) ) >>>>> + >>>>> def getBoundingBox(self): >>>>> cdef PetscInt i,dim=0 >>>>> CHKERR( DMGetCoordinateDim(self.dm, &dim) ) >>>>> diff --git a/src/binding/petsc4py/src/PETSc/petscdm.pxi >>>>> b/src/binding/petsc4py/src/PETSc/petscdm.pxi >>>>> index 514b6fa472..c778e39884 100644 >>>>> --- a/src/binding/petsc4py/src/PETSc/petscdm.pxi >>>>> +++ b/src/binding/petsc4py/src/PETSc/petscdm.pxi >>>>> @@ -90,6 +90,7 @@ cdef extern from * nogil: >>>>> int DMGetCoordinateDim(PetscDM,PetscInt*) >>>>> int DMSetCoordinateDim(PetscDM,PetscInt) >>>>> int DMLocalizeCoordinates(PetscDM) >>>>> + int DMProjectCoordinates(PetscDM, PetscFE) >>>>> >>>>> int DMCreateInterpolation(PetscDM,PetscDM,PetscMat*,PetscVec*) >>>>> int DMCreateInjection(PetscDM,PetscDM,PetscMat*) >>>>> ``` >>>>> >>>>> Then in python, I load a mesh and project the coordinates to P2: >>>>> ``` >>>>> import firedrake as fd >>>>> from firedrake.petsc import PETSc >>>>> >>>>> # plex = fd.mesh._from_gmsh('test-fd-load-p2.msh') >>>>> plex = fd.mesh._from_gmsh('test-fd-load-p2-rect.msh') >>>>> print('old bbox:', plex.getBoundingBox()) >>>>> >>>>> dim = plex.getDimension() >>>>> # (dim, nc, isSimplex, k, >>>>> qorder, comm=None) >>>>> fe_new = PETSc.FE().createLagrange(dim, dim, True, 2, >>>>> PETSc.DETERMINE) >>>>> plex.projectCoordinates(fe_new) >>>>> fe_new.view() >>>>> >>>>> print('new bbox:', plex.getBoundingBox()) >>>>> ``` >>>>> >>>>> The output is (The bounding box is changed!) >>>>> ``` >>>>> >>>>> old bbox: ((0.0, 1.0), (0.0, 1.0), (0.0, 1.0)) >>>>> PetscFE Object: P2 1 MPI processes >>>>> type: basic >>>>> Basic Finite Element in 3 dimensions with 3 components >>>>> PetscSpace Object: P2 1 MPI processes >>>>> type: sum >>>>> Space in 3 variables with 3 components, size 30 >>>>> Sum space of 3 concatenated subspaces (all identical) >>>>> PetscSpace Object: sum component (sumcomp_) 1 MPI processes >>>>> type: poly >>>>> Space in 3 variables with 1 components, size 10 >>>>> Polynomial space of degree 2 >>>>> PetscDualSpace Object: P2 1 MPI processes >>>>> type: lagrange >>>>> Dual space with 3 components, size 30 >>>>> Continuous Lagrange dual space >>>>> Quadrature of order 5 on 27 points (dim 3) >>>>> new bbox: ((-6.530133708576188e-17, 36.30670832662781), (-3.899962995254311e-17, 36.2406171632539), (-8.8036464152166e-17, 36.111577025012224)) >>>>> >>>>> ``` >>>>> >>>>> >>>>> By the way, for the original DG coordinates, where can I find the relation of the closure and the order of the dofs for the cell? >>>>> >>>>> >>>>> Thanks! >>>>> >>>>> >>>>> Zongze >>>>> >>>>> >>>>> >>>>> Matthew Knepley ?2022?6?17??? 01:11??? >>>>> >>>>>> On Thu, Jun 16, 2022 at 12:06 PM Zongze Yang >>>>>> wrote: >>>>>> >>>>>>> >>>>>>> >>>>>>> ? 2022?6?16??23:22?Matthew Knepley ??? >>>>>>> >>>>>>> ? 
>>>>>>> On Thu, Jun 16, 2022 at 11:11 AM Zongze Yang >>>>>>> wrote: >>>>>>> >>>>>>>> Hi, if I load a `gmsh` file with second-order elements, the >>>>>>>> coordinates will be stored in a DG-P2 space. After obtaining the >>>>>>>> coordinates of a cell, how can I map the coordinates to vertex and edge? >>>>>>>> >>>>>>> >>>>>>> By default, they are stored as P2, not DG. >>>>>>> >>>>>>> >>>>>>> I checked the coordinates vector, and found the dogs only defined on >>>>>>> cell other than vertex and edge, so I said they are stored as DG. >>>>>>> Then the function DMPlexVecGetClosure >>>>>>> seems return >>>>>>> the coordinates in lex order. >>>>>>> >>>>>>> Some code in reading gmsh file reads that >>>>>>> >>>>>>> >>>>>>> 1756: if (isSimplex) continuity = PETSC_FALSE >>>>>>> ; /* XXX >>>>>>> FIXME Requires DMPlexSetClosurePermutationLexicographic() */ >>>>>>> >>>>>>> >>>>>>> 1758: GmshCreateFE(comm, NULL, isSimplex, continuity, nodeType, >>>>>>> dim, coordDim, order, &fe) >>>>>>> >>>>>>> >>>>>>> The continuity is set to false for simplex. >>>>>>> >>>>>> >>>>>> Oh, yes. That needs to be fixed. For now, you can just project it to >>>>>> P2 if you want using >>>>>> >>>>>> https://petsc.org/main/docs/manualpages/DM/DMProjectCoordinates/ >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>>> Thanks, >>>>>>> Zongze >>>>>>> >>>>>>> You can ask for the coordinates of a vertex or an edge directly using >>>>>>> >>>>>>> >>>>>>> https://petsc.org/main/docs/manualpages/DMPLEX/DMPlexPointLocalRead/ >>>>>>> >>>>>>> by giving the vertex or edge point. You can get all the coordinates >>>>>>> on a cell, in the closure order, using >>>>>>> >>>>>>> >>>>>>> https://petsc.org/main/docs/manualpages/DMPLEX/DMPlexVecGetClosure/ >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Matt >>>>>>> >>>>>>> >>>>>>>> Below is some code load the gmsh file, I want to know the relation >>>>>>>> between `cl` and `cell_coords`. >>>>>>>> >>>>>>>> ``` >>>>>>>> import firedrake as fd >>>>>>>> import numpy as np >>>>>>>> >>>>>>>> # Load gmsh file (2rd) >>>>>>>> plex = fd.mesh._from_gmsh('test-fd-load-p2-rect.msh') >>>>>>>> >>>>>>>> cs, ce = plex.getHeightStratum(0) >>>>>>>> >>>>>>>> cdm = plex.getCoordinateDM() >>>>>>>> csec = dm.getCoordinateSection() >>>>>>>> coords_gvec = dm.getCoordinates() >>>>>>>> >>>>>>>> for i in range(cs, ce): >>>>>>>> cell_coords = cdm.getVecClosure(csec, coords_gvec, i) >>>>>>>> print(f'coordinates for cell {i} :\n{cell_coords.reshape([-1, >>>>>>>> 3])}') >>>>>>>> cl = dm.getTransitiveClosure(i) >>>>>>>> print('closure:', cl) >>>>>>>> break >>>>>>>> ``` >>>>>>>> >>>>>>>> Best wishes, >>>>>>>> Zongze >>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their >>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>> experiments lead. >>>>>>> -- Norbert Wiener >>>>>>> >>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>> >>>>>> >>>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. 
>>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From yangzongze at gmail.com Sun May 14 08:20:58 2023 From: yangzongze at gmail.com (Zongze Yang) Date: Sun, 14 May 2023 21:20:58 +0800 Subject: [petsc-users] How to find the map between the high order coordinates of DMPlex and vertex numbering? In-Reply-To: References: <2640A1A9-101C-4DFB-BFA4-C64AF231732A@gmail.com> Message-ID: Hi, Matt, The issue has been resolved while testing on the latest version of PETSc. It seems that the problem has been fixed in the following merge request: https://gitlab.com/petsc/petsc/-/merge_requests/5970 I sincerely apologize for any inconvenience caused by my previous message. However, I would like to provide you with additional information regarding the test files. Attached to this email, you will find two Gmsh files: "square_2rd.msh" and "square_3rd.msh." These files contain high-order triangulated mesh data for the unit square. ``` $ ./ex33 -coord_space 0 -dm_plex_filename square_2rd.msh -dm_plex_gmsh_project -dm_plex_gmsh_project_petscdualspace_lagrange_continuity true -dm_plex_gmsh_project_fe_view -volume 1 PetscFE Object: P2 1 MPI process type: basic Basic Finite Element in 2 dimensions with 2 components PetscSpace Object: P2 1 MPI process type: sum Space in 2 variables with 2 components, size 12 Sum space of 2 concatenated subspaces (all identical) PetscSpace Object: sum component (sumcomp_) 1 MPI process type: poly Space in 2 variables with 1 components, size 6 Polynomial space of degree 2 PetscDualSpace Object: P2 1 MPI process type: lagrange Dual space with 2 components, size 12 Continuous Lagrange dual space Quadrature on a triangle of order 5 on 9 points (dim 2) Volume: 1. $ ./ex33 -coord_space 0 -dm_plex_filename square_3rd.msh -dm_plex_gmsh_project -dm_plex_gmsh_project_petscdualspace_lagrange_continuity true -dm_plex_gmsh_project_fe_view -volume 1 PetscFE Object: P3 1 MPI process type: basic Basic Finite Element in 2 dimensions with 2 components PetscSpace Object: P3 1 MPI process type: sum Space in 2 variables with 2 components, size 20 Sum space of 2 concatenated subspaces (all identical) PetscSpace Object: sum component (sumcomp_) 1 MPI process type: poly Space in 2 variables with 1 components, size 10 Polynomial space of degree 3 PetscDualSpace Object: P3 1 MPI process type: lagrange Dual space with 2 components, size 20 Continuous Lagrange dual space Quadrature on a triangle of order 7 on 16 points (dim 2) Volume: 1. ``` Thank you for your attention and understanding. I apologize once again for my previous oversight. Best wishes, Zongze On Sun, 14 May 2023 at 16:44, Matthew Knepley wrote: > On Sat, May 13, 2023 at 6:08?AM Zongze Yang wrote: > >> Hi, Matt, >> >> There seem to be ongoing issues with projecting high-order coordinates >> from a gmsh file to other spaces. I would like to inquire whether there are >> any plans to resolve this problem. >> >> Thank you for your attention to this matter. >> > > Yes, I will look at it. The important thing is to have a good test. 
Here > are the higher order geometry tests > > > https://gitlab.com/petsc/petsc/-/blob/main/src/dm/impls/plex/tests/ex33.c > > I take shapes with known volume, mesh them with higher order geometry, and > look at the convergence to the true volume. Could you add a GMsh test, > meaning the .msh file and known volume, and I will fix it? > > Thanks, > > Matt > > >> Best wishes, >> Zongze >> >> >> On Sat, 18 Jun 2022 at 20:31, Zongze Yang wrote: >> >>> Thank you for your reply. May I ask for some references on the order of >>> the dofs on PETSc's FE Space (especially high order elements)? >>> >>> Thanks, >>> >>> Zongze >>> >>> Matthew Knepley ?2022?6?18??? 20:02??? >>> >>>> On Sat, Jun 18, 2022 at 2:16 AM Zongze Yang >>>> wrote: >>>> >>>>> In order to check if I made mistakes in the python code, I try to use >>>>> c code to show the issue on DMProjectCoordinates. The code and mesh file is >>>>> attached. >>>>> If the code is correct, there must be something wrong with >>>>> `DMProjectCoordinates` or `DMPlexCreateGmshFromFile` for high-order mesh. >>>>> >>>> >>>> Something is definitely wrong with high order, periodic simplices from >>>> Gmsh. We had not tested that case. I am at a conference and cannot look at >>>> it for a week. >>>> My suspicion is that the space we make when reading in the Gmsh >>>> coordinates does not match the values (wrong order). >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> The command and the output are listed below: (Obviously the bounding >>>>> box is changed.) >>>>> ``` >>>>> $ ./test_gmsh_load_2rd -filename cube-p2.msh -old_fe_view -new_fe_view >>>>> Old Bounding Box: >>>>> 0: lo = 0. hi = 1. >>>>> 1: lo = 0. hi = 1. >>>>> 2: lo = 0. hi = 1. >>>>> PetscFE Object: OldCoordinatesFE 1 MPI processes >>>>> type: basic >>>>> Basic Finite Element in 3 dimensions with 3 components >>>>> PetscSpace Object: P2 1 MPI processes >>>>> type: sum >>>>> Space in 3 variables with 3 components, size 30 >>>>> Sum space of 3 concatenated subspaces (all identical) >>>>> PetscSpace Object: sum component (sumcomp_) 1 MPI processes >>>>> type: poly >>>>> Space in 3 variables with 1 components, size 10 >>>>> Polynomial space of degree 2 >>>>> PetscDualSpace Object: P2 1 MPI processes >>>>> type: lagrange >>>>> Dual space with 3 components, size 30 >>>>> Discontinuous Lagrange dual space >>>>> Quadrature of order 5 on 27 points (dim 3) >>>>> PetscFE Object: NewCoordinatesFE 1 MPI processes >>>>> type: basic >>>>> Basic Finite Element in 3 dimensions with 3 components >>>>> PetscSpace Object: P2 1 MPI processes >>>>> type: sum >>>>> Space in 3 variables with 3 components, size 30 >>>>> Sum space of 3 concatenated subspaces (all identical) >>>>> PetscSpace Object: sum component (sumcomp_) 1 MPI processes >>>>> type: poly >>>>> Space in 3 variables with 1 components, size 10 >>>>> Polynomial space of degree 2 >>>>> PetscDualSpace Object: P2 1 MPI processes >>>>> type: lagrange >>>>> Dual space with 3 components, size 30 >>>>> Continuous Lagrange dual space >>>>> Quadrature of order 5 on 27 points (dim 3) >>>>> New Bounding Box: >>>>> 0: lo = 2.5624e-17 hi = 8. >>>>> 1: lo = -9.23372e-17 hi = 7. >>>>> 2: lo = 2.72091e-17 hi = 8.5 >>>>> ``` >>>>> >>>>> Thanks, >>>>> Zongze >>>>> >>>>> Zongze Yang ?2022?6?17??? 14:54??? >>>>> >>>>>> I tried the projection operation. However, it seems that the >>>>>> projection gives the wrong solution. After projection, the bounding box is >>>>>> changed! See logs below. 
>>>>>> >>>>>> First, I patch the petsc4py by adding `DMProjectCoordinates`: >>>>>> ``` >>>>>> diff --git a/src/binding/petsc4py/src/PETSc/DM.pyx >>>>>> b/src/binding/petsc4py/src/PETSc/DM.pyx >>>>>> index d8a58d183a..dbcdb280f1 100644 >>>>>> --- a/src/binding/petsc4py/src/PETSc/DM.pyx >>>>>> +++ b/src/binding/petsc4py/src/PETSc/DM.pyx >>>>>> @@ -307,6 +307,12 @@ cdef class DM(Object): >>>>>> PetscINCREF(c.obj) >>>>>> return c >>>>>> >>>>>> + def projectCoordinates(self, FE fe=None): >>>>>> + if fe is None: >>>>>> + CHKERR( DMProjectCoordinates(self.dm, NULL) ) >>>>>> + else: >>>>>> + CHKERR( DMProjectCoordinates(self.dm, fe.fe) ) >>>>>> + >>>>>> def getBoundingBox(self): >>>>>> cdef PetscInt i,dim=0 >>>>>> CHKERR( DMGetCoordinateDim(self.dm, &dim) ) >>>>>> diff --git a/src/binding/petsc4py/src/PETSc/petscdm.pxi >>>>>> b/src/binding/petsc4py/src/PETSc/petscdm.pxi >>>>>> index 514b6fa472..c778e39884 100644 >>>>>> --- a/src/binding/petsc4py/src/PETSc/petscdm.pxi >>>>>> +++ b/src/binding/petsc4py/src/PETSc/petscdm.pxi >>>>>> @@ -90,6 +90,7 @@ cdef extern from * nogil: >>>>>> int DMGetCoordinateDim(PetscDM,PetscInt*) >>>>>> int DMSetCoordinateDim(PetscDM,PetscInt) >>>>>> int DMLocalizeCoordinates(PetscDM) >>>>>> + int DMProjectCoordinates(PetscDM, PetscFE) >>>>>> >>>>>> int DMCreateInterpolation(PetscDM,PetscDM,PetscMat*,PetscVec*) >>>>>> int DMCreateInjection(PetscDM,PetscDM,PetscMat*) >>>>>> ``` >>>>>> >>>>>> Then in python, I load a mesh and project the coordinates to P2: >>>>>> ``` >>>>>> import firedrake as fd >>>>>> from firedrake.petsc import PETSc >>>>>> >>>>>> # plex = fd.mesh._from_gmsh('test-fd-load-p2.msh') >>>>>> plex = fd.mesh._from_gmsh('test-fd-load-p2-rect.msh') >>>>>> print('old bbox:', plex.getBoundingBox()) >>>>>> >>>>>> dim = plex.getDimension() >>>>>> # (dim, nc, isSimplex, k, >>>>>> qorder, comm=None) >>>>>> fe_new = PETSc.FE().createLagrange(dim, dim, True, 2, >>>>>> PETSc.DETERMINE) >>>>>> plex.projectCoordinates(fe_new) >>>>>> fe_new.view() >>>>>> >>>>>> print('new bbox:', plex.getBoundingBox()) >>>>>> ``` >>>>>> >>>>>> The output is (The bounding box is changed!) >>>>>> ``` >>>>>> >>>>>> old bbox: ((0.0, 1.0), (0.0, 1.0), (0.0, 1.0)) >>>>>> PetscFE Object: P2 1 MPI processes >>>>>> type: basic >>>>>> Basic Finite Element in 3 dimensions with 3 components >>>>>> PetscSpace Object: P2 1 MPI processes >>>>>> type: sum >>>>>> Space in 3 variables with 3 components, size 30 >>>>>> Sum space of 3 concatenated subspaces (all identical) >>>>>> PetscSpace Object: sum component (sumcomp_) 1 MPI processes >>>>>> type: poly >>>>>> Space in 3 variables with 1 components, size 10 >>>>>> Polynomial space of degree 2 >>>>>> PetscDualSpace Object: P2 1 MPI processes >>>>>> type: lagrange >>>>>> Dual space with 3 components, size 30 >>>>>> Continuous Lagrange dual space >>>>>> Quadrature of order 5 on 27 points (dim 3) >>>>>> new bbox: ((-6.530133708576188e-17, 36.30670832662781), (-3.899962995254311e-17, 36.2406171632539), (-8.8036464152166e-17, 36.111577025012224)) >>>>>> >>>>>> ``` >>>>>> >>>>>> >>>>>> By the way, for the original DG coordinates, where can I find the relation of the closure and the order of the dofs for the cell? >>>>>> >>>>>> >>>>>> Thanks! >>>>>> >>>>>> >>>>>> Zongze >>>>>> >>>>>> >>>>>> >>>>>> Matthew Knepley ?2022?6?17??? 01:11??? >>>>>> >>>>>>> On Thu, Jun 16, 2022 at 12:06 PM Zongze Yang >>>>>>> wrote: >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> ? 2022?6?16??23:22?Matthew Knepley ??? >>>>>>>> >>>>>>>> ? 
>>>>>>>> On Thu, Jun 16, 2022 at 11:11 AM Zongze Yang >>>>>>>> wrote: >>>>>>>> >>>>>>>>> Hi, if I load a `gmsh` file with second-order elements, the >>>>>>>>> coordinates will be stored in a DG-P2 space. After obtaining the >>>>>>>>> coordinates of a cell, how can I map the coordinates to vertex and edge? >>>>>>>>> >>>>>>>> >>>>>>>> By default, they are stored as P2, not DG. >>>>>>>> >>>>>>>> >>>>>>>> I checked the coordinates vector, and found the dogs only defined >>>>>>>> on cell other than vertex and edge, so I said they are stored as DG. >>>>>>>> Then the function DMPlexVecGetClosure >>>>>>>> seems return >>>>>>>> the coordinates in lex order. >>>>>>>> >>>>>>>> Some code in reading gmsh file reads that >>>>>>>> >>>>>>>> >>>>>>>> 1756: if (isSimplex) continuity = PETSC_FALSE >>>>>>>> ; /* XXX >>>>>>>> FIXME Requires DMPlexSetClosurePermutationLexicographic() */ >>>>>>>> >>>>>>>> >>>>>>>> 1758: GmshCreateFE(comm, NULL, isSimplex, continuity, >>>>>>>> nodeType, dim, coordDim, order, &fe) >>>>>>>> >>>>>>>> >>>>>>>> The continuity is set to false for simplex. >>>>>>>> >>>>>>> >>>>>>> Oh, yes. That needs to be fixed. For now, you can just project it to >>>>>>> P2 if you want using >>>>>>> >>>>>>> https://petsc.org/main/docs/manualpages/DM/DMProjectCoordinates/ >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Matt >>>>>>> >>>>>>> >>>>>>>> Thanks, >>>>>>>> Zongze >>>>>>>> >>>>>>>> You can ask for the coordinates of a vertex or an edge directly >>>>>>>> using >>>>>>>> >>>>>>>> >>>>>>>> https://petsc.org/main/docs/manualpages/DMPLEX/DMPlexPointLocalRead/ >>>>>>>> >>>>>>>> by giving the vertex or edge point. You can get all the coordinates >>>>>>>> on a cell, in the closure order, using >>>>>>>> >>>>>>>> >>>>>>>> https://petsc.org/main/docs/manualpages/DMPLEX/DMPlexVecGetClosure/ >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Matt >>>>>>>> >>>>>>>> >>>>>>>>> Below is some code load the gmsh file, I want to know the relation >>>>>>>>> between `cl` and `cell_coords`. >>>>>>>>> >>>>>>>>> ``` >>>>>>>>> import firedrake as fd >>>>>>>>> import numpy as np >>>>>>>>> >>>>>>>>> # Load gmsh file (2rd) >>>>>>>>> plex = fd.mesh._from_gmsh('test-fd-load-p2-rect.msh') >>>>>>>>> >>>>>>>>> cs, ce = plex.getHeightStratum(0) >>>>>>>>> >>>>>>>>> cdm = plex.getCoordinateDM() >>>>>>>>> csec = dm.getCoordinateSection() >>>>>>>>> coords_gvec = dm.getCoordinates() >>>>>>>>> >>>>>>>>> for i in range(cs, ce): >>>>>>>>> cell_coords = cdm.getVecClosure(csec, coords_gvec, i) >>>>>>>>> print(f'coordinates for cell {i} :\n{cell_coords.reshape([-1, >>>>>>>>> 3])}') >>>>>>>>> cl = dm.getTransitiveClosure(i) >>>>>>>>> print('closure:', cl) >>>>>>>>> break >>>>>>>>> ``` >>>>>>>>> >>>>>>>>> Best wishes, >>>>>>>>> Zongze >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> What most experimenters take for granted before they begin their >>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>> experiments lead. >>>>>>>> -- Norbert Wiener >>>>>>>> >>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their >>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>> experiments lead. >>>>>>> -- Norbert Wiener >>>>>>> >>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>> >>>>>>> >>>>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. 
>>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: square_3rd.msh Type: application/octet-stream Size: 1670 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: square_2rd.msh Type: application/octet-stream Size: 1072 bytes Desc: not available URL: From jed at jedbrown.org Sun May 14 09:36:23 2023 From: jed at jedbrown.org (Jed Brown) Date: Sun, 14 May 2023 08:36:23 -0600 Subject: [petsc-users] How to find the map between the high order coordinates of DMPlex and vertex numbering? In-Reply-To: References: <2640A1A9-101C-4DFB-BFA4-C64AF231732A@gmail.com> Message-ID: <87r0rjm13s.fsf@jedbrown.org> Good to hear this works for you. I believe there is still a problem with high order tetrahedral elements (we've been coping with it for months and someone asked last week) and plan to look at it as soon as possible now that my semester finished. Zongze Yang writes: > Hi, Matt, > > The issue has been resolved while testing on the latest version of PETSc. > It seems that the problem has been fixed in the following merge request: > https://gitlab.com/petsc/petsc/-/merge_requests/5970 > > I sincerely apologize for any inconvenience caused by my previous message. > However, I would like to provide you with additional information regarding > the test files. Attached to this email, you will find two Gmsh files: > "square_2rd.msh" and "square_3rd.msh." These files contain high-order > triangulated mesh data for the unit square. > > ``` > $ ./ex33 -coord_space 0 -dm_plex_filename square_2rd.msh > -dm_plex_gmsh_project > -dm_plex_gmsh_project_petscdualspace_lagrange_continuity true > -dm_plex_gmsh_project_fe_view -volume 1 > PetscFE Object: P2 1 MPI process > type: basic > Basic Finite Element in 2 dimensions with 2 components > PetscSpace Object: P2 1 MPI process > type: sum > Space in 2 variables with 2 components, size 12 > Sum space of 2 concatenated subspaces (all identical) > PetscSpace Object: sum component (sumcomp_) 1 MPI process > type: poly > Space in 2 variables with 1 components, size 6 > Polynomial space of degree 2 > PetscDualSpace Object: P2 1 MPI process > type: lagrange > Dual space with 2 components, size 12 > Continuous Lagrange dual space > Quadrature on a triangle of order 5 on 9 points (dim 2) > Volume: 1. > $ ./ex33 -coord_space 0 -dm_plex_filename square_3rd.msh > -dm_plex_gmsh_project > -dm_plex_gmsh_project_petscdualspace_lagrange_continuity true > -dm_plex_gmsh_project_fe_view -volume 1 > PetscFE Object: P3 1 MPI process > type: basic > Basic Finite Element in 2 dimensions with 2 components > PetscSpace Object: P3 1 MPI process > type: sum > Space in 2 variables with 2 components, size 20 > Sum space of 2 concatenated subspaces (all identical) > PetscSpace Object: sum component (sumcomp_) 1 MPI process > type: poly > Space in 2 variables with 1 components, size 10 > Polynomial space of degree 3 > PetscDualSpace Object: P3 1 MPI process > type: lagrange > Dual space with 2 components, size 20 > Continuous Lagrange dual space > Quadrature on a triangle of order 7 on 16 points (dim 2) > Volume: 1. 
> ``` > > Thank you for your attention and understanding. I apologize once again for > my previous oversight. > > Best wishes, > Zongze > > > On Sun, 14 May 2023 at 16:44, Matthew Knepley wrote: > >> On Sat, May 13, 2023 at 6:08?AM Zongze Yang wrote: >> >>> Hi, Matt, >>> >>> There seem to be ongoing issues with projecting high-order coordinates >>> from a gmsh file to other spaces. I would like to inquire whether there are >>> any plans to resolve this problem. >>> >>> Thank you for your attention to this matter. >>> >> >> Yes, I will look at it. The important thing is to have a good test. Here >> are the higher order geometry tests >> >> >> https://gitlab.com/petsc/petsc/-/blob/main/src/dm/impls/plex/tests/ex33.c >> >> I take shapes with known volume, mesh them with higher order geometry, and >> look at the convergence to the true volume. Could you add a GMsh test, >> meaning the .msh file and known volume, and I will fix it? >> >> Thanks, >> >> Matt >> >> >>> Best wishes, >>> Zongze >>> >>> >>> On Sat, 18 Jun 2022 at 20:31, Zongze Yang wrote: >>> >>>> Thank you for your reply. May I ask for some references on the order of >>>> the dofs on PETSc's FE Space (especially high order elements)? >>>> >>>> Thanks, >>>> >>>> Zongze >>>> >>>> Matthew Knepley ?2022?6?18??? 20:02??? >>>> >>>>> On Sat, Jun 18, 2022 at 2:16 AM Zongze Yang >>>>> wrote: >>>>> >>>>>> In order to check if I made mistakes in the python code, I try to use >>>>>> c code to show the issue on DMProjectCoordinates. The code and mesh file is >>>>>> attached. >>>>>> If the code is correct, there must be something wrong with >>>>>> `DMProjectCoordinates` or `DMPlexCreateGmshFromFile` for high-order mesh. >>>>>> >>>>> >>>>> Something is definitely wrong with high order, periodic simplices from >>>>> Gmsh. We had not tested that case. I am at a conference and cannot look at >>>>> it for a week. >>>>> My suspicion is that the space we make when reading in the Gmsh >>>>> coordinates does not match the values (wrong order). >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> The command and the output are listed below: (Obviously the bounding >>>>>> box is changed.) >>>>>> ``` >>>>>> $ ./test_gmsh_load_2rd -filename cube-p2.msh -old_fe_view -new_fe_view >>>>>> Old Bounding Box: >>>>>> 0: lo = 0. hi = 1. >>>>>> 1: lo = 0. hi = 1. >>>>>> 2: lo = 0. hi = 1. 
>>>>>> PetscFE Object: OldCoordinatesFE 1 MPI processes >>>>>> type: basic >>>>>> Basic Finite Element in 3 dimensions with 3 components >>>>>> PetscSpace Object: P2 1 MPI processes >>>>>> type: sum >>>>>> Space in 3 variables with 3 components, size 30 >>>>>> Sum space of 3 concatenated subspaces (all identical) >>>>>> PetscSpace Object: sum component (sumcomp_) 1 MPI processes >>>>>> type: poly >>>>>> Space in 3 variables with 1 components, size 10 >>>>>> Polynomial space of degree 2 >>>>>> PetscDualSpace Object: P2 1 MPI processes >>>>>> type: lagrange >>>>>> Dual space with 3 components, size 30 >>>>>> Discontinuous Lagrange dual space >>>>>> Quadrature of order 5 on 27 points (dim 3) >>>>>> PetscFE Object: NewCoordinatesFE 1 MPI processes >>>>>> type: basic >>>>>> Basic Finite Element in 3 dimensions with 3 components >>>>>> PetscSpace Object: P2 1 MPI processes >>>>>> type: sum >>>>>> Space in 3 variables with 3 components, size 30 >>>>>> Sum space of 3 concatenated subspaces (all identical) >>>>>> PetscSpace Object: sum component (sumcomp_) 1 MPI processes >>>>>> type: poly >>>>>> Space in 3 variables with 1 components, size 10 >>>>>> Polynomial space of degree 2 >>>>>> PetscDualSpace Object: P2 1 MPI processes >>>>>> type: lagrange >>>>>> Dual space with 3 components, size 30 >>>>>> Continuous Lagrange dual space >>>>>> Quadrature of order 5 on 27 points (dim 3) >>>>>> New Bounding Box: >>>>>> 0: lo = 2.5624e-17 hi = 8. >>>>>> 1: lo = -9.23372e-17 hi = 7. >>>>>> 2: lo = 2.72091e-17 hi = 8.5 >>>>>> ``` >>>>>> >>>>>> Thanks, >>>>>> Zongze >>>>>> >>>>>> Zongze Yang ?2022?6?17??? 14:54??? >>>>>> >>>>>>> I tried the projection operation. However, it seems that the >>>>>>> projection gives the wrong solution. After projection, the bounding box is >>>>>>> changed! See logs below. 
>>>>>>> >>>>>>> First, I patch the petsc4py by adding `DMProjectCoordinates`: >>>>>>> ``` >>>>>>> diff --git a/src/binding/petsc4py/src/PETSc/DM.pyx >>>>>>> b/src/binding/petsc4py/src/PETSc/DM.pyx >>>>>>> index d8a58d183a..dbcdb280f1 100644 >>>>>>> --- a/src/binding/petsc4py/src/PETSc/DM.pyx >>>>>>> +++ b/src/binding/petsc4py/src/PETSc/DM.pyx >>>>>>> @@ -307,6 +307,12 @@ cdef class DM(Object): >>>>>>> PetscINCREF(c.obj) >>>>>>> return c >>>>>>> >>>>>>> + def projectCoordinates(self, FE fe=None): >>>>>>> + if fe is None: >>>>>>> + CHKERR( DMProjectCoordinates(self.dm, NULL) ) >>>>>>> + else: >>>>>>> + CHKERR( DMProjectCoordinates(self.dm, fe.fe) ) >>>>>>> + >>>>>>> def getBoundingBox(self): >>>>>>> cdef PetscInt i,dim=0 >>>>>>> CHKERR( DMGetCoordinateDim(self.dm, &dim) ) >>>>>>> diff --git a/src/binding/petsc4py/src/PETSc/petscdm.pxi >>>>>>> b/src/binding/petsc4py/src/PETSc/petscdm.pxi >>>>>>> index 514b6fa472..c778e39884 100644 >>>>>>> --- a/src/binding/petsc4py/src/PETSc/petscdm.pxi >>>>>>> +++ b/src/binding/petsc4py/src/PETSc/petscdm.pxi >>>>>>> @@ -90,6 +90,7 @@ cdef extern from * nogil: >>>>>>> int DMGetCoordinateDim(PetscDM,PetscInt*) >>>>>>> int DMSetCoordinateDim(PetscDM,PetscInt) >>>>>>> int DMLocalizeCoordinates(PetscDM) >>>>>>> + int DMProjectCoordinates(PetscDM, PetscFE) >>>>>>> >>>>>>> int DMCreateInterpolation(PetscDM,PetscDM,PetscMat*,PetscVec*) >>>>>>> int DMCreateInjection(PetscDM,PetscDM,PetscMat*) >>>>>>> ``` >>>>>>> >>>>>>> Then in python, I load a mesh and project the coordinates to P2: >>>>>>> ``` >>>>>>> import firedrake as fd >>>>>>> from firedrake.petsc import PETSc >>>>>>> >>>>>>> # plex = fd.mesh._from_gmsh('test-fd-load-p2.msh') >>>>>>> plex = fd.mesh._from_gmsh('test-fd-load-p2-rect.msh') >>>>>>> print('old bbox:', plex.getBoundingBox()) >>>>>>> >>>>>>> dim = plex.getDimension() >>>>>>> # (dim, nc, isSimplex, k, >>>>>>> qorder, comm=None) >>>>>>> fe_new = PETSc.FE().createLagrange(dim, dim, True, 2, >>>>>>> PETSc.DETERMINE) >>>>>>> plex.projectCoordinates(fe_new) >>>>>>> fe_new.view() >>>>>>> >>>>>>> print('new bbox:', plex.getBoundingBox()) >>>>>>> ``` >>>>>>> >>>>>>> The output is (The bounding box is changed!) >>>>>>> ``` >>>>>>> >>>>>>> old bbox: ((0.0, 1.0), (0.0, 1.0), (0.0, 1.0)) >>>>>>> PetscFE Object: P2 1 MPI processes >>>>>>> type: basic >>>>>>> Basic Finite Element in 3 dimensions with 3 components >>>>>>> PetscSpace Object: P2 1 MPI processes >>>>>>> type: sum >>>>>>> Space in 3 variables with 3 components, size 30 >>>>>>> Sum space of 3 concatenated subspaces (all identical) >>>>>>> PetscSpace Object: sum component (sumcomp_) 1 MPI processes >>>>>>> type: poly >>>>>>> Space in 3 variables with 1 components, size 10 >>>>>>> Polynomial space of degree 2 >>>>>>> PetscDualSpace Object: P2 1 MPI processes >>>>>>> type: lagrange >>>>>>> Dual space with 3 components, size 30 >>>>>>> Continuous Lagrange dual space >>>>>>> Quadrature of order 5 on 27 points (dim 3) >>>>>>> new bbox: ((-6.530133708576188e-17, 36.30670832662781), (-3.899962995254311e-17, 36.2406171632539), (-8.8036464152166e-17, 36.111577025012224)) >>>>>>> >>>>>>> ``` >>>>>>> >>>>>>> >>>>>>> By the way, for the original DG coordinates, where can I find the relation of the closure and the order of the dofs for the cell? >>>>>>> >>>>>>> >>>>>>> Thanks! >>>>>>> >>>>>>> >>>>>>> Zongze >>>>>>> >>>>>>> >>>>>>> >>>>>>> Matthew Knepley ?2022?6?17??? 01:11??? >>>>>>> >>>>>>>> On Thu, Jun 16, 2022 at 12:06 PM Zongze Yang >>>>>>>> wrote: >>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> ? 
2022?6?16??23:22?Matthew Knepley ??? >>>>>>>>> >>>>>>>>> ? >>>>>>>>> On Thu, Jun 16, 2022 at 11:11 AM Zongze Yang >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> Hi, if I load a `gmsh` file with second-order elements, the >>>>>>>>>> coordinates will be stored in a DG-P2 space. After obtaining the >>>>>>>>>> coordinates of a cell, how can I map the coordinates to vertex and edge? >>>>>>>>>> >>>>>>>>> >>>>>>>>> By default, they are stored as P2, not DG. >>>>>>>>> >>>>>>>>> >>>>>>>>> I checked the coordinates vector, and found the dogs only defined >>>>>>>>> on cell other than vertex and edge, so I said they are stored as DG. >>>>>>>>> Then the function DMPlexVecGetClosure >>>>>>>>> seems return >>>>>>>>> the coordinates in lex order. >>>>>>>>> >>>>>>>>> Some code in reading gmsh file reads that >>>>>>>>> >>>>>>>>> >>>>>>>>> 1756: if (isSimplex) continuity = PETSC_FALSE >>>>>>>>> ; /* XXX >>>>>>>>> FIXME Requires DMPlexSetClosurePermutationLexicographic() */ >>>>>>>>> >>>>>>>>> >>>>>>>>> 1758: GmshCreateFE(comm, NULL, isSimplex, continuity, >>>>>>>>> nodeType, dim, coordDim, order, &fe) >>>>>>>>> >>>>>>>>> >>>>>>>>> The continuity is set to false for simplex. >>>>>>>>> >>>>>>>> >>>>>>>> Oh, yes. That needs to be fixed. For now, you can just project it to >>>>>>>> P2 if you want using >>>>>>>> >>>>>>>> https://petsc.org/main/docs/manualpages/DM/DMProjectCoordinates/ >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Matt >>>>>>>> >>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Zongze >>>>>>>>> >>>>>>>>> You can ask for the coordinates of a vertex or an edge directly >>>>>>>>> using >>>>>>>>> >>>>>>>>> >>>>>>>>> https://petsc.org/main/docs/manualpages/DMPLEX/DMPlexPointLocalRead/ >>>>>>>>> >>>>>>>>> by giving the vertex or edge point. You can get all the coordinates >>>>>>>>> on a cell, in the closure order, using >>>>>>>>> >>>>>>>>> >>>>>>>>> https://petsc.org/main/docs/manualpages/DMPLEX/DMPlexVecGetClosure/ >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Matt >>>>>>>>> >>>>>>>>> >>>>>>>>>> Below is some code load the gmsh file, I want to know the relation >>>>>>>>>> between `cl` and `cell_coords`. >>>>>>>>>> >>>>>>>>>> ``` >>>>>>>>>> import firedrake as fd >>>>>>>>>> import numpy as np >>>>>>>>>> >>>>>>>>>> # Load gmsh file (2rd) >>>>>>>>>> plex = fd.mesh._from_gmsh('test-fd-load-p2-rect.msh') >>>>>>>>>> >>>>>>>>>> cs, ce = plex.getHeightStratum(0) >>>>>>>>>> >>>>>>>>>> cdm = plex.getCoordinateDM() >>>>>>>>>> csec = dm.getCoordinateSection() >>>>>>>>>> coords_gvec = dm.getCoordinates() >>>>>>>>>> >>>>>>>>>> for i in range(cs, ce): >>>>>>>>>> cell_coords = cdm.getVecClosure(csec, coords_gvec, i) >>>>>>>>>> print(f'coordinates for cell {i} :\n{cell_coords.reshape([-1, >>>>>>>>>> 3])}') >>>>>>>>>> cl = dm.getTransitiveClosure(i) >>>>>>>>>> print('closure:', cl) >>>>>>>>>> break >>>>>>>>>> ``` >>>>>>>>>> >>>>>>>>>> Best wishes, >>>>>>>>>> Zongze >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>> experiments lead. >>>>>>>>> -- Norbert Wiener >>>>>>>>> >>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> What most experimenters take for granted before they begin their >>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>> experiments lead. 
>>>>>>>> -- Norbert Wiener >>>>>>>> >>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>> >>>>>>>> >>>>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> From yangzongze at gmail.com Sun May 14 10:09:24 2023 From: yangzongze at gmail.com (Zongze Yang) Date: Sun, 14 May 2023 23:09:24 +0800 Subject: [petsc-users] How to find the map between the high order coordinates of DMPlex and vertex numbering? In-Reply-To: <87r0rjm13s.fsf@jedbrown.org> References: <2640A1A9-101C-4DFB-BFA4-C64AF231732A@gmail.com> <87r0rjm13s.fsf@jedbrown.org> Message-ID: Yes, you are correct. I have conducted tests using high-order 3D meshes of a unit cube, and regrettably, the tests have failed. I have attached the files for your reference. Kindly review the output provided below: ( The volume should be 1) ``` $ mpiexec -n 3 ./ex33 -coord_space 0 -dm_plex_filename cube_2rd.msh -dm_plex_gmsh_project -dm_plex_gmsh_project_petscdualspace_lagrange_continuity true -dm_plex_gmsh_project_fe_view PetscFE Object: P2 3 MPI processes type: basic Basic Finite Element in 3 dimensions with 3 components PetscSpace Object: P2 3 MPI processes type: sum Space in 3 variables with 3 components, size 30 Sum space of 3 concatenated subspaces (all identical) PetscSpace Object: sum component (sumcomp_) 3 MPI processes type: poly Space in 3 variables with 1 components, size 10 Polynomial space of degree 2 PetscDualSpace Object: P2 3 MPI processes type: lagrange Dual space with 3 components, size 30 Continuous Lagrange dual space Quadrature on a tetrahedron of order 5 on 27 points (dim 3) Volume: 0.46875 $ mpiexec -n 3 ./ex33 -coord_space 0 -dm_plex_filename cube_3rd.msh -dm_plex_gmsh_project -dm_plex_gmsh_project_petscdualspace_lagrange_continuity true -dm_plex_gmsh_project_fe_view PetscFE Object: P3 3 MPI processes type: basic Basic Finite Element in 3 dimensions with 3 components PetscSpace Object: P3 3 MPI processes type: sum Space in 3 variables with 3 components, size 60 Sum space of 3 concatenated subspaces (all identical) PetscSpace Object: sum component (sumcomp_) 3 MPI processes type: poly Space in 3 variables with 1 components, size 20 Polynomial space of degree 3 PetscDualSpace Object: P3 3 MPI processes type: lagrange Dual space with 3 components, size 60 Continuous Lagrange dual space Quadrature on a tetrahedron of order 7 on 64 points (dim 3) Volume: 0.536855 ``` Best wishes, Zongze On Sun, 14 May 2023 at 22:36, Jed Brown wrote: > Good to hear this works for you. I believe there is still a problem with > high order tetrahedral elements (we've been coping with it for months and > someone asked last week) and plan to look at it as soon as possible now > that my semester finished. > > Zongze Yang writes: > > > Hi, Matt, > > > > The issue has been resolved while testing on the latest version of PETSc. > > It seems that the problem has been fixed in the following merge request: > > https://gitlab.com/petsc/petsc/-/merge_requests/5970 > > > > I sincerely apologize for any inconvenience caused by my previous > message. 
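The projection route discussed throughout this thread (read the high-order Gmsh mesh, then re-project its discontinuous coordinate field onto a continuous Lagrange space with DMProjectCoordinates) can be sketched in plain C as follows. This is a minimal illustration only: the file name cube_2rd.msh is a placeholder, and for the high-order tetrahedral meshes reported above the projected coordinates come out wrong, so the sketch is mainly a way to reproduce and inspect the problem outside of Firedrake.

```c
#include <petscdmplex.h>
#include <petscfe.h>

int main(int argc, char **argv)
{
  DM        dm;
  PetscFE   fe;
  PetscInt  dim, cdim;
  PetscReal gmin[3], gmax[3];

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  /* Read the high-order Gmsh mesh; the coordinates arrive in a discontinuous P_k space */
  PetscCall(DMPlexCreateGmshFromFile(PETSC_COMM_WORLD, "cube_2rd.msh", PETSC_TRUE, &dm));
  PetscCall(DMGetDimension(dm, &dim));
  PetscCall(DMGetCoordinateDim(dm, &cdim));
  /* Build a continuous P2 Lagrange space with cdim components and re-project onto it */
  PetscCall(PetscFECreateLagrange(PETSC_COMM_WORLD, dim, cdim, PETSC_TRUE, 2, PETSC_DETERMINE, &fe));
  PetscCall(DMProjectCoordinates(dm, fe));
  PetscCall(PetscFEDestroy(&fe));
  /* Sanity check: for the unit cube the bounding box should remain [0,1]^3 */
  PetscCall(DMGetBoundingBox(dm, gmin, gmax));
  PetscCall(PetscPrintf(PETSC_COMM_WORLD, "bbox: [%g, %g] x [%g, %g] x [%g, %g]\n",
                        (double)gmin[0], (double)gmax[0], (double)gmin[1],
                        (double)gmax[1], (double)gmin[2], (double)gmax[2]));
  PetscCall(DMDestroy(&dm));
  PetscCall(PetscFinalize());
  return 0;
}
```

A stretched box like the one quoted earlier in the thread (bounds near 36 instead of 1) is the symptom that the projection picked up the coordinate values in the wrong order.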
> > However, I would like to provide you with additional information > regarding > > the test files. Attached to this email, you will find two Gmsh files: > > "square_2rd.msh" and "square_3rd.msh." These files contain high-order > > triangulated mesh data for the unit square. > > > > ``` > > $ ./ex33 -coord_space 0 -dm_plex_filename square_2rd.msh > > -dm_plex_gmsh_project > > -dm_plex_gmsh_project_petscdualspace_lagrange_continuity true > > -dm_plex_gmsh_project_fe_view -volume 1 > > PetscFE Object: P2 1 MPI process > > type: basic > > Basic Finite Element in 2 dimensions with 2 components > > PetscSpace Object: P2 1 MPI process > > type: sum > > Space in 2 variables with 2 components, size 12 > > Sum space of 2 concatenated subspaces (all identical) > > PetscSpace Object: sum component (sumcomp_) 1 MPI process > > type: poly > > Space in 2 variables with 1 components, size 6 > > Polynomial space of degree 2 > > PetscDualSpace Object: P2 1 MPI process > > type: lagrange > > Dual space with 2 components, size 12 > > Continuous Lagrange dual space > > Quadrature on a triangle of order 5 on 9 points (dim 2) > > Volume: 1. > > $ ./ex33 -coord_space 0 -dm_plex_filename square_3rd.msh > > -dm_plex_gmsh_project > > -dm_plex_gmsh_project_petscdualspace_lagrange_continuity true > > -dm_plex_gmsh_project_fe_view -volume 1 > > PetscFE Object: P3 1 MPI process > > type: basic > > Basic Finite Element in 2 dimensions with 2 components > > PetscSpace Object: P3 1 MPI process > > type: sum > > Space in 2 variables with 2 components, size 20 > > Sum space of 2 concatenated subspaces (all identical) > > PetscSpace Object: sum component (sumcomp_) 1 MPI process > > type: poly > > Space in 2 variables with 1 components, size 10 > > Polynomial space of degree 3 > > PetscDualSpace Object: P3 1 MPI process > > type: lagrange > > Dual space with 2 components, size 20 > > Continuous Lagrange dual space > > Quadrature on a triangle of order 7 on 16 points (dim 2) > > Volume: 1. > > ``` > > > > Thank you for your attention and understanding. I apologize once again > for > > my previous oversight. > > > > Best wishes, > > Zongze > > > > > > On Sun, 14 May 2023 at 16:44, Matthew Knepley wrote: > > > >> On Sat, May 13, 2023 at 6:08?AM Zongze Yang > wrote: > >> > >>> Hi, Matt, > >>> > >>> There seem to be ongoing issues with projecting high-order coordinates > >>> from a gmsh file to other spaces. I would like to inquire whether > there are > >>> any plans to resolve this problem. > >>> > >>> Thank you for your attention to this matter. > >>> > >> > >> Yes, I will look at it. The important thing is to have a good test. Here > >> are the higher order geometry tests > >> > >> > >> > https://gitlab.com/petsc/petsc/-/blob/main/src/dm/impls/plex/tests/ex33.c > >> > >> I take shapes with known volume, mesh them with higher order geometry, > and > >> look at the convergence to the true volume. Could you add a GMsh test, > >> meaning the .msh file and known volume, and I will fix it? > >> > >> Thanks, > >> > >> Matt > >> > >> > >>> Best wishes, > >>> Zongze > >>> > >>> > >>> On Sat, 18 Jun 2022 at 20:31, Zongze Yang > wrote: > >>> > >>>> Thank you for your reply. May I ask for some references on the order > of > >>>> the dofs on PETSc's FE Space (especially high order elements)? > >>>> > >>>> Thanks, > >>>> > >>>> Zongze > >>>> > >>>> Matthew Knepley ?2022?6?18??? 20:02??? 
> >>>> > >>>>> On Sat, Jun 18, 2022 at 2:16 AM Zongze Yang > >>>>> wrote: > >>>>> > >>>>>> In order to check if I made mistakes in the python code, I try to > use > >>>>>> c code to show the issue on DMProjectCoordinates. The code and mesh > file is > >>>>>> attached. > >>>>>> If the code is correct, there must be something wrong with > >>>>>> `DMProjectCoordinates` or `DMPlexCreateGmshFromFile` for high-order > mesh. > >>>>>> > >>>>> > >>>>> Something is definitely wrong with high order, periodic simplices > from > >>>>> Gmsh. We had not tested that case. I am at a conference and cannot > look at > >>>>> it for a week. > >>>>> My suspicion is that the space we make when reading in the Gmsh > >>>>> coordinates does not match the values (wrong order). > >>>>> > >>>>> Thanks, > >>>>> > >>>>> Matt > >>>>> > >>>>> > >>>>>> The command and the output are listed below: (Obviously the bounding > >>>>>> box is changed.) > >>>>>> ``` > >>>>>> $ ./test_gmsh_load_2rd -filename cube-p2.msh -old_fe_view > -new_fe_view > >>>>>> Old Bounding Box: > >>>>>> 0: lo = 0. hi = 1. > >>>>>> 1: lo = 0. hi = 1. > >>>>>> 2: lo = 0. hi = 1. > >>>>>> PetscFE Object: OldCoordinatesFE 1 MPI processes > >>>>>> type: basic > >>>>>> Basic Finite Element in 3 dimensions with 3 components > >>>>>> PetscSpace Object: P2 1 MPI processes > >>>>>> type: sum > >>>>>> Space in 3 variables with 3 components, size 30 > >>>>>> Sum space of 3 concatenated subspaces (all identical) > >>>>>> PetscSpace Object: sum component (sumcomp_) 1 MPI processes > >>>>>> type: poly > >>>>>> Space in 3 variables with 1 components, size 10 > >>>>>> Polynomial space of degree 2 > >>>>>> PetscDualSpace Object: P2 1 MPI processes > >>>>>> type: lagrange > >>>>>> Dual space with 3 components, size 30 > >>>>>> Discontinuous Lagrange dual space > >>>>>> Quadrature of order 5 on 27 points (dim 3) > >>>>>> PetscFE Object: NewCoordinatesFE 1 MPI processes > >>>>>> type: basic > >>>>>> Basic Finite Element in 3 dimensions with 3 components > >>>>>> PetscSpace Object: P2 1 MPI processes > >>>>>> type: sum > >>>>>> Space in 3 variables with 3 components, size 30 > >>>>>> Sum space of 3 concatenated subspaces (all identical) > >>>>>> PetscSpace Object: sum component (sumcomp_) 1 MPI processes > >>>>>> type: poly > >>>>>> Space in 3 variables with 1 components, size 10 > >>>>>> Polynomial space of degree 2 > >>>>>> PetscDualSpace Object: P2 1 MPI processes > >>>>>> type: lagrange > >>>>>> Dual space with 3 components, size 30 > >>>>>> Continuous Lagrange dual space > >>>>>> Quadrature of order 5 on 27 points (dim 3) > >>>>>> New Bounding Box: > >>>>>> 0: lo = 2.5624e-17 hi = 8. > >>>>>> 1: lo = -9.23372e-17 hi = 7. > >>>>>> 2: lo = 2.72091e-17 hi = 8.5 > >>>>>> ``` > >>>>>> > >>>>>> Thanks, > >>>>>> Zongze > >>>>>> > >>>>>> Zongze Yang ?2022?6?17??? 14:54??? > >>>>>> > >>>>>>> I tried the projection operation. However, it seems that the > >>>>>>> projection gives the wrong solution. After projection, the > bounding box is > >>>>>>> changed! See logs below. 
> >>>>>>> > >>>>>>> First, I patch the petsc4py by adding `DMProjectCoordinates`: > >>>>>>> ``` > >>>>>>> diff --git a/src/binding/petsc4py/src/PETSc/DM.pyx > >>>>>>> b/src/binding/petsc4py/src/PETSc/DM.pyx > >>>>>>> index d8a58d183a..dbcdb280f1 100644 > >>>>>>> --- a/src/binding/petsc4py/src/PETSc/DM.pyx > >>>>>>> +++ b/src/binding/petsc4py/src/PETSc/DM.pyx > >>>>>>> @@ -307,6 +307,12 @@ cdef class DM(Object): > >>>>>>> PetscINCREF(c.obj) > >>>>>>> return c > >>>>>>> > >>>>>>> + def projectCoordinates(self, FE fe=None): > >>>>>>> + if fe is None: > >>>>>>> + CHKERR( DMProjectCoordinates(self.dm, NULL) ) > >>>>>>> + else: > >>>>>>> + CHKERR( DMProjectCoordinates(self.dm, fe.fe) ) > >>>>>>> + > >>>>>>> def getBoundingBox(self): > >>>>>>> cdef PetscInt i,dim=0 > >>>>>>> CHKERR( DMGetCoordinateDim(self.dm, &dim) ) > >>>>>>> diff --git a/src/binding/petsc4py/src/PETSc/petscdm.pxi > >>>>>>> b/src/binding/petsc4py/src/PETSc/petscdm.pxi > >>>>>>> index 514b6fa472..c778e39884 100644 > >>>>>>> --- a/src/binding/petsc4py/src/PETSc/petscdm.pxi > >>>>>>> +++ b/src/binding/petsc4py/src/PETSc/petscdm.pxi > >>>>>>> @@ -90,6 +90,7 @@ cdef extern from * nogil: > >>>>>>> int DMGetCoordinateDim(PetscDM,PetscInt*) > >>>>>>> int DMSetCoordinateDim(PetscDM,PetscInt) > >>>>>>> int DMLocalizeCoordinates(PetscDM) > >>>>>>> + int DMProjectCoordinates(PetscDM, PetscFE) > >>>>>>> > >>>>>>> int DMCreateInterpolation(PetscDM,PetscDM,PetscMat*,PetscVec*) > >>>>>>> int DMCreateInjection(PetscDM,PetscDM,PetscMat*) > >>>>>>> ``` > >>>>>>> > >>>>>>> Then in python, I load a mesh and project the coordinates to P2: > >>>>>>> ``` > >>>>>>> import firedrake as fd > >>>>>>> from firedrake.petsc import PETSc > >>>>>>> > >>>>>>> # plex = fd.mesh._from_gmsh('test-fd-load-p2.msh') > >>>>>>> plex = fd.mesh._from_gmsh('test-fd-load-p2-rect.msh') > >>>>>>> print('old bbox:', plex.getBoundingBox()) > >>>>>>> > >>>>>>> dim = plex.getDimension() > >>>>>>> # (dim, nc, isSimplex, k, > >>>>>>> qorder, comm=None) > >>>>>>> fe_new = PETSc.FE().createLagrange(dim, dim, True, 2, > >>>>>>> PETSc.DETERMINE) > >>>>>>> plex.projectCoordinates(fe_new) > >>>>>>> fe_new.view() > >>>>>>> > >>>>>>> print('new bbox:', plex.getBoundingBox()) > >>>>>>> ``` > >>>>>>> > >>>>>>> The output is (The bounding box is changed!) > >>>>>>> ``` > >>>>>>> > >>>>>>> old bbox: ((0.0, 1.0), (0.0, 1.0), (0.0, 1.0)) > >>>>>>> PetscFE Object: P2 1 MPI processes > >>>>>>> type: basic > >>>>>>> Basic Finite Element in 3 dimensions with 3 components > >>>>>>> PetscSpace Object: P2 1 MPI processes > >>>>>>> type: sum > >>>>>>> Space in 3 variables with 3 components, size 30 > >>>>>>> Sum space of 3 concatenated subspaces (all identical) > >>>>>>> PetscSpace Object: sum component (sumcomp_) 1 MPI processes > >>>>>>> type: poly > >>>>>>> Space in 3 variables with 1 components, size 10 > >>>>>>> Polynomial space of degree 2 > >>>>>>> PetscDualSpace Object: P2 1 MPI processes > >>>>>>> type: lagrange > >>>>>>> Dual space with 3 components, size 30 > >>>>>>> Continuous Lagrange dual space > >>>>>>> Quadrature of order 5 on 27 points (dim 3) > >>>>>>> new bbox: ((-6.530133708576188e-17, 36.30670832662781), > (-3.899962995254311e-17, 36.2406171632539), (-8.8036464152166e-17, > 36.111577025012224)) > >>>>>>> > >>>>>>> ``` > >>>>>>> > >>>>>>> > >>>>>>> By the way, for the original DG coordinates, where can I find the > relation of the closure and the order of the dofs for the cell? > >>>>>>> > >>>>>>> > >>>>>>> Thanks! 
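Regarding the closure-ordering question above: one way to see which block of the array returned by DMPlexVecGetClosure belongs to which mesh point is to walk the transitive closure of the cell and the coordinate PetscSection side by side. The sketch below assumes no closure permutation has been set on the section; for the Gmsh DG coordinate field the per-cell values may additionally be permuted to the lexicographic order mentioned elsewhere in the thread, so treat it only as a starting point.

```c
#include <petscdmplex.h>
#include <petscsection.h>

/* Print, for one cell, which range of the DMPlexVecGetClosure array each
   closure point (cell, faces, edges, vertices) contributes */
static PetscErrorCode PrintCoordinateClosureLayout(DM dm, PetscInt cell)
{
  PetscSection cs;
  PetscInt    *closure = NULL, npoints, i, offset = 0;

  PetscFunctionBeginUser;
  PetscCall(DMGetCoordinateSection(dm, &cs));
  PetscCall(DMPlexGetTransitiveClosure(dm, cell, PETSC_TRUE, &npoints, &closure));
  /* closure[] holds (point, orientation) pairs in closure order; the values from
     DMPlexVecGetClosure are concatenated per point in that same order, with each
     point's block possibly permuted by its orientation */
  for (i = 0; i < npoints; ++i) {
    const PetscInt point = closure[2 * i];
    PetscInt       ndof;

    PetscCall(PetscSectionGetDof(cs, point, &ndof));
    if (ndof > 0) {
      PetscCall(PetscPrintf(PETSC_COMM_SELF, "point %d -> entries [%d, %d)\n",
                            (int)point, (int)offset, (int)(offset + ndof)));
      offset += ndof;
    }
  }
  PetscCall(DMPlexRestoreTransitiveClosure(dm, cell, PETSC_TRUE, &npoints, &closure));
  PetscFunctionReturn(0);
}
```

Running this next to the firedrake loop shown below makes it clear which entries of `cell_coords` belong to the cell interior and which to its faces, edges, and vertices.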
> >>>>>>> > >>>>>>> > >>>>>>> Zongze > >>>>>>> > >>>>>>> > >>>>>>> > >>>>>>> Matthew Knepley ?2022?6?17??? 01:11??? > >>>>>>> > >>>>>>>> On Thu, Jun 16, 2022 at 12:06 PM Zongze Yang < > yangzongze at gmail.com> > >>>>>>>> wrote: > >>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> ? 2022?6?16??23:22?Matthew Knepley ??? > >>>>>>>>> > >>>>>>>>> ? > >>>>>>>>> On Thu, Jun 16, 2022 at 11:11 AM Zongze Yang < > yangzongze at gmail.com> > >>>>>>>>> wrote: > >>>>>>>>> > >>>>>>>>>> Hi, if I load a `gmsh` file with second-order elements, the > >>>>>>>>>> coordinates will be stored in a DG-P2 space. After obtaining the > >>>>>>>>>> coordinates of a cell, how can I map the coordinates to vertex > and edge? > >>>>>>>>>> > >>>>>>>>> > >>>>>>>>> By default, they are stored as P2, not DG. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> I checked the coordinates vector, and found the dogs only defined > >>>>>>>>> on cell other than vertex and edge, so I said they are stored as > DG. > >>>>>>>>> Then the function DMPlexVecGetClosure > >>>>>>>>> < > https://petsc.org/main/docs/manualpages/DMPLEX/DMPlexVecGetClosure/> > seems return > >>>>>>>>> the coordinates in lex order. > >>>>>>>>> > >>>>>>>>> Some code in reading gmsh file reads that > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> 1756: if (isSimplex) continuity = PETSC_FALSE > >>>>>>>>> ; /* > XXX > >>>>>>>>> FIXME Requires DMPlexSetClosurePermutationLexicographic() */ > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> 1758: GmshCreateFE(comm, NULL, isSimplex, continuity, > >>>>>>>>> nodeType, dim, coordDim, order, &fe) > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> The continuity is set to false for simplex. > >>>>>>>>> > >>>>>>>> > >>>>>>>> Oh, yes. That needs to be fixed. For now, you can just project it > to > >>>>>>>> P2 if you want using > >>>>>>>> > >>>>>>>> > https://petsc.org/main/docs/manualpages/DM/DMProjectCoordinates/ > >>>>>>>> > >>>>>>>> Thanks, > >>>>>>>> > >>>>>>>> Matt > >>>>>>>> > >>>>>>>> > >>>>>>>>> Thanks, > >>>>>>>>> Zongze > >>>>>>>>> > >>>>>>>>> You can ask for the coordinates of a vertex or an edge directly > >>>>>>>>> using > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > https://petsc.org/main/docs/manualpages/DMPLEX/DMPlexPointLocalRead/ > >>>>>>>>> > >>>>>>>>> by giving the vertex or edge point. You can get all the > coordinates > >>>>>>>>> on a cell, in the closure order, using > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > https://petsc.org/main/docs/manualpages/DMPLEX/DMPlexVecGetClosure/ > >>>>>>>>> > >>>>>>>>> Thanks, > >>>>>>>>> > >>>>>>>>> Matt > >>>>>>>>> > >>>>>>>>> > >>>>>>>>>> Below is some code load the gmsh file, I want to know the > relation > >>>>>>>>>> between `cl` and `cell_coords`. 
> >>>>>>>>>> > >>>>>>>>>> ``` > >>>>>>>>>> import firedrake as fd > >>>>>>>>>> import numpy as np > >>>>>>>>>> > >>>>>>>>>> # Load gmsh file (2rd) > >>>>>>>>>> plex = fd.mesh._from_gmsh('test-fd-load-p2-rect.msh') > >>>>>>>>>> > >>>>>>>>>> cs, ce = plex.getHeightStratum(0) > >>>>>>>>>> > >>>>>>>>>> cdm = plex.getCoordinateDM() > >>>>>>>>>> csec = dm.getCoordinateSection() > >>>>>>>>>> coords_gvec = dm.getCoordinates() > >>>>>>>>>> > >>>>>>>>>> for i in range(cs, ce): > >>>>>>>>>> cell_coords = cdm.getVecClosure(csec, coords_gvec, i) > >>>>>>>>>> print(f'coordinates for cell {i} > :\n{cell_coords.reshape([-1, > >>>>>>>>>> 3])}') > >>>>>>>>>> cl = dm.getTransitiveClosure(i) > >>>>>>>>>> print('closure:', cl) > >>>>>>>>>> break > >>>>>>>>>> ``` > >>>>>>>>>> > >>>>>>>>>> Best wishes, > >>>>>>>>>> Zongze > >>>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> -- > >>>>>>>>> What most experimenters take for granted before they begin their > >>>>>>>>> experiments is infinitely more interesting than any results to > which their > >>>>>>>>> experiments lead. > >>>>>>>>> -- Norbert Wiener > >>>>>>>>> > >>>>>>>>> https://www.cse.buffalo.edu/~knepley/ > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>> > >>>>>>>> -- > >>>>>>>> What most experimenters take for granted before they begin their > >>>>>>>> experiments is infinitely more interesting than any results to > which their > >>>>>>>> experiments lead. > >>>>>>>> -- Norbert Wiener > >>>>>>>> > >>>>>>>> https://www.cse.buffalo.edu/~knepley/ > >>>>>>>> > >>>>>>>> > >>>>>>> > >>>>> > >>>>> -- > >>>>> What most experimenters take for granted before they begin their > >>>>> experiments is infinitely more interesting than any results to which > their > >>>>> experiments lead. > >>>>> -- Norbert Wiener > >>>>> > >>>>> https://www.cse.buffalo.edu/~knepley/ > >>>>> > >>>>> > >>>> > >> > >> -- > >> What most experimenters take for granted before they begin their > >> experiments is infinitely more interesting than any results to which > their > >> experiments lead. > >> -- Norbert Wiener > >> > >> https://www.cse.buffalo.edu/~knepley/ > >> > >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: cube_3rd.msh Type: application/octet-stream Size: 12223 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: cube_2rd.msh Type: application/octet-stream Size: 4926 bytes Desc: not available URL: From knepley at gmail.com Sun May 14 10:54:37 2023 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 14 May 2023 11:54:37 -0400 Subject: [petsc-users] How to find the map between the high order coordinates of DMPlex and vertex numbering? In-Reply-To: References: <2640A1A9-101C-4DFB-BFA4-C64AF231732A@gmail.com> Message-ID: On Sun, May 14, 2023 at 9:21?AM Zongze Yang wrote: > Hi, Matt, > > The issue has been resolved while testing on the latest version of PETSc. > It seems that the problem has been fixed in the following merge request: > https://gitlab.com/petsc/petsc/-/merge_requests/5970 > No problem. Glad it is working. > I sincerely apologize for any inconvenience caused by my previous message. > However, I would like to provide you with additional information regarding > the test files. Attached to this email, you will find two Gmsh files: > "square_2rd.msh" and "square_3rd.msh." These files contain high-order > triangulated mesh data for the unit square. 
> > ``` > $ ./ex33 -coord_space 0 -dm_plex_filename square_2rd.msh > -dm_plex_gmsh_project > -dm_plex_gmsh_project_petscdualspace_lagrange_continuity true > -dm_plex_gmsh_project_fe_view -volume 1 > PetscFE Object: P2 1 MPI process > type: basic > Basic Finite Element in 2 dimensions with 2 components > PetscSpace Object: P2 1 MPI process > type: sum > Space in 2 variables with 2 components, size 12 > Sum space of 2 concatenated subspaces (all identical) > PetscSpace Object: sum component (sumcomp_) 1 MPI process > type: poly > Space in 2 variables with 1 components, size 6 > Polynomial space of degree 2 > PetscDualSpace Object: P2 1 MPI process > type: lagrange > Dual space with 2 components, size 12 > Continuous Lagrange dual space > Quadrature on a triangle of order 5 on 9 points (dim 2) > Volume: 1. > $ ./ex33 -coord_space 0 -dm_plex_filename square_3rd.msh > -dm_plex_gmsh_project > -dm_plex_gmsh_project_petscdualspace_lagrange_continuity true > -dm_plex_gmsh_project_fe_view -volume 1 > PetscFE Object: P3 1 MPI process > type: basic > Basic Finite Element in 2 dimensions with 2 components > PetscSpace Object: P3 1 MPI process > type: sum > Space in 2 variables with 2 components, size 20 > Sum space of 2 concatenated subspaces (all identical) > PetscSpace Object: sum component (sumcomp_) 1 MPI process > type: poly > Space in 2 variables with 1 components, size 10 > Polynomial space of degree 3 > PetscDualSpace Object: P3 1 MPI process > type: lagrange > Dual space with 2 components, size 20 > Continuous Lagrange dual space > Quadrature on a triangle of order 7 on 16 points (dim 2) > Volume: 1. > ``` > > Thank you for your attention and understanding. I apologize once again for > my previous oversight. > Great! If you make an MR for this, you will be included on the next list of PETSc contributors. Otherwise, I can do it. Thanks, Matt > Best wishes, > Zongze > > > On Sun, 14 May 2023 at 16:44, Matthew Knepley wrote: > >> On Sat, May 13, 2023 at 6:08?AM Zongze Yang wrote: >> >>> Hi, Matt, >>> >>> There seem to be ongoing issues with projecting high-order coordinates >>> from a gmsh file to other spaces. I would like to inquire whether there are >>> any plans to resolve this problem. >>> >>> Thank you for your attention to this matter. >>> >> >> Yes, I will look at it. The important thing is to have a good test. Here >> are the higher order geometry tests >> >> >> https://gitlab.com/petsc/petsc/-/blob/main/src/dm/impls/plex/tests/ex33.c >> >> I take shapes with known volume, mesh them with higher order geometry, >> and look at the convergence to the true volume. Could you add a GMsh test, >> meaning the .msh file and known volume, and I will fix it? >> >> Thanks, >> >> Matt >> >> >>> Best wishes, >>> Zongze >>> >>> >>> On Sat, 18 Jun 2022 at 20:31, Zongze Yang wrote: >>> >>>> Thank you for your reply. May I ask for some references on the order of >>>> the dofs on PETSc's FE Space (especially high order elements)? >>>> >>>> Thanks, >>>> >>>> Zongze >>>> >>>> Matthew Knepley ?2022?6?18??? 20:02??? >>>> >>>>> On Sat, Jun 18, 2022 at 2:16 AM Zongze Yang >>>>> wrote: >>>>> >>>>>> In order to check if I made mistakes in the python code, I try to use >>>>>> c code to show the issue on DMProjectCoordinates. The code and mesh file is >>>>>> attached. >>>>>> If the code is correct, there must be something wrong with >>>>>> `DMProjectCoordinates` or `DMPlexCreateGmshFromFile` for high-order mesh. >>>>>> >>>>> >>>>> Something is definitely wrong with high order, periodic simplices from >>>>> Gmsh. 
We had not tested that case. I am at a conference and cannot look at >>>>> it for a week. >>>>> My suspicion is that the space we make when reading in the Gmsh >>>>> coordinates does not match the values (wrong order). >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> The command and the output are listed below: (Obviously the bounding >>>>>> box is changed.) >>>>>> ``` >>>>>> $ ./test_gmsh_load_2rd -filename cube-p2.msh -old_fe_view -new_fe_view >>>>>> Old Bounding Box: >>>>>> 0: lo = 0. hi = 1. >>>>>> 1: lo = 0. hi = 1. >>>>>> 2: lo = 0. hi = 1. >>>>>> PetscFE Object: OldCoordinatesFE 1 MPI processes >>>>>> type: basic >>>>>> Basic Finite Element in 3 dimensions with 3 components >>>>>> PetscSpace Object: P2 1 MPI processes >>>>>> type: sum >>>>>> Space in 3 variables with 3 components, size 30 >>>>>> Sum space of 3 concatenated subspaces (all identical) >>>>>> PetscSpace Object: sum component (sumcomp_) 1 MPI processes >>>>>> type: poly >>>>>> Space in 3 variables with 1 components, size 10 >>>>>> Polynomial space of degree 2 >>>>>> PetscDualSpace Object: P2 1 MPI processes >>>>>> type: lagrange >>>>>> Dual space with 3 components, size 30 >>>>>> Discontinuous Lagrange dual space >>>>>> Quadrature of order 5 on 27 points (dim 3) >>>>>> PetscFE Object: NewCoordinatesFE 1 MPI processes >>>>>> type: basic >>>>>> Basic Finite Element in 3 dimensions with 3 components >>>>>> PetscSpace Object: P2 1 MPI processes >>>>>> type: sum >>>>>> Space in 3 variables with 3 components, size 30 >>>>>> Sum space of 3 concatenated subspaces (all identical) >>>>>> PetscSpace Object: sum component (sumcomp_) 1 MPI processes >>>>>> type: poly >>>>>> Space in 3 variables with 1 components, size 10 >>>>>> Polynomial space of degree 2 >>>>>> PetscDualSpace Object: P2 1 MPI processes >>>>>> type: lagrange >>>>>> Dual space with 3 components, size 30 >>>>>> Continuous Lagrange dual space >>>>>> Quadrature of order 5 on 27 points (dim 3) >>>>>> New Bounding Box: >>>>>> 0: lo = 2.5624e-17 hi = 8. >>>>>> 1: lo = -9.23372e-17 hi = 7. >>>>>> 2: lo = 2.72091e-17 hi = 8.5 >>>>>> ``` >>>>>> >>>>>> Thanks, >>>>>> Zongze >>>>>> >>>>>> Zongze Yang ?2022?6?17??? 14:54??? >>>>>> >>>>>>> I tried the projection operation. However, it seems that the >>>>>>> projection gives the wrong solution. After projection, the bounding box is >>>>>>> changed! See logs below. 
>>>>>>> >>>>>>> First, I patch the petsc4py by adding `DMProjectCoordinates`: >>>>>>> ``` >>>>>>> diff --git a/src/binding/petsc4py/src/PETSc/DM.pyx >>>>>>> b/src/binding/petsc4py/src/PETSc/DM.pyx >>>>>>> index d8a58d183a..dbcdb280f1 100644 >>>>>>> --- a/src/binding/petsc4py/src/PETSc/DM.pyx >>>>>>> +++ b/src/binding/petsc4py/src/PETSc/DM.pyx >>>>>>> @@ -307,6 +307,12 @@ cdef class DM(Object): >>>>>>> PetscINCREF(c.obj) >>>>>>> return c >>>>>>> >>>>>>> + def projectCoordinates(self, FE fe=None): >>>>>>> + if fe is None: >>>>>>> + CHKERR( DMProjectCoordinates(self.dm, NULL) ) >>>>>>> + else: >>>>>>> + CHKERR( DMProjectCoordinates(self.dm, fe.fe) ) >>>>>>> + >>>>>>> def getBoundingBox(self): >>>>>>> cdef PetscInt i,dim=0 >>>>>>> CHKERR( DMGetCoordinateDim(self.dm, &dim) ) >>>>>>> diff --git a/src/binding/petsc4py/src/PETSc/petscdm.pxi >>>>>>> b/src/binding/petsc4py/src/PETSc/petscdm.pxi >>>>>>> index 514b6fa472..c778e39884 100644 >>>>>>> --- a/src/binding/petsc4py/src/PETSc/petscdm.pxi >>>>>>> +++ b/src/binding/petsc4py/src/PETSc/petscdm.pxi >>>>>>> @@ -90,6 +90,7 @@ cdef extern from * nogil: >>>>>>> int DMGetCoordinateDim(PetscDM,PetscInt*) >>>>>>> int DMSetCoordinateDim(PetscDM,PetscInt) >>>>>>> int DMLocalizeCoordinates(PetscDM) >>>>>>> + int DMProjectCoordinates(PetscDM, PetscFE) >>>>>>> >>>>>>> int DMCreateInterpolation(PetscDM,PetscDM,PetscMat*,PetscVec*) >>>>>>> int DMCreateInjection(PetscDM,PetscDM,PetscMat*) >>>>>>> ``` >>>>>>> >>>>>>> Then in python, I load a mesh and project the coordinates to P2: >>>>>>> ``` >>>>>>> import firedrake as fd >>>>>>> from firedrake.petsc import PETSc >>>>>>> >>>>>>> # plex = fd.mesh._from_gmsh('test-fd-load-p2.msh') >>>>>>> plex = fd.mesh._from_gmsh('test-fd-load-p2-rect.msh') >>>>>>> print('old bbox:', plex.getBoundingBox()) >>>>>>> >>>>>>> dim = plex.getDimension() >>>>>>> # (dim, nc, isSimplex, k, >>>>>>> qorder, comm=None) >>>>>>> fe_new = PETSc.FE().createLagrange(dim, dim, True, 2, >>>>>>> PETSc.DETERMINE) >>>>>>> plex.projectCoordinates(fe_new) >>>>>>> fe_new.view() >>>>>>> >>>>>>> print('new bbox:', plex.getBoundingBox()) >>>>>>> ``` >>>>>>> >>>>>>> The output is (The bounding box is changed!) >>>>>>> ``` >>>>>>> >>>>>>> old bbox: ((0.0, 1.0), (0.0, 1.0), (0.0, 1.0)) >>>>>>> PetscFE Object: P2 1 MPI processes >>>>>>> type: basic >>>>>>> Basic Finite Element in 3 dimensions with 3 components >>>>>>> PetscSpace Object: P2 1 MPI processes >>>>>>> type: sum >>>>>>> Space in 3 variables with 3 components, size 30 >>>>>>> Sum space of 3 concatenated subspaces (all identical) >>>>>>> PetscSpace Object: sum component (sumcomp_) 1 MPI processes >>>>>>> type: poly >>>>>>> Space in 3 variables with 1 components, size 10 >>>>>>> Polynomial space of degree 2 >>>>>>> PetscDualSpace Object: P2 1 MPI processes >>>>>>> type: lagrange >>>>>>> Dual space with 3 components, size 30 >>>>>>> Continuous Lagrange dual space >>>>>>> Quadrature of order 5 on 27 points (dim 3) >>>>>>> new bbox: ((-6.530133708576188e-17, 36.30670832662781), (-3.899962995254311e-17, 36.2406171632539), (-8.8036464152166e-17, 36.111577025012224)) >>>>>>> >>>>>>> ``` >>>>>>> >>>>>>> >>>>>>> By the way, for the original DG coordinates, where can I find the relation of the closure and the order of the dofs for the cell? >>>>>>> >>>>>>> >>>>>>> Thanks! >>>>>>> >>>>>>> >>>>>>> Zongze >>>>>>> >>>>>>> >>>>>>> >>>>>>> Matthew Knepley ?2022?6?17??? 01:11??? >>>>>>> >>>>>>>> On Thu, Jun 16, 2022 at 12:06 PM Zongze Yang >>>>>>>> wrote: >>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> ? 
2022?6?16??23:22?Matthew Knepley ??? >>>>>>>>> >>>>>>>>> ? >>>>>>>>> On Thu, Jun 16, 2022 at 11:11 AM Zongze Yang >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> Hi, if I load a `gmsh` file with second-order elements, the >>>>>>>>>> coordinates will be stored in a DG-P2 space. After obtaining the >>>>>>>>>> coordinates of a cell, how can I map the coordinates to vertex and edge? >>>>>>>>>> >>>>>>>>> >>>>>>>>> By default, they are stored as P2, not DG. >>>>>>>>> >>>>>>>>> >>>>>>>>> I checked the coordinates vector, and found the dogs only defined >>>>>>>>> on cell other than vertex and edge, so I said they are stored as DG. >>>>>>>>> Then the function DMPlexVecGetClosure >>>>>>>>> seems return >>>>>>>>> the coordinates in lex order. >>>>>>>>> >>>>>>>>> Some code in reading gmsh file reads that >>>>>>>>> >>>>>>>>> >>>>>>>>> 1756: if (isSimplex) continuity = PETSC_FALSE >>>>>>>>> ; /* >>>>>>>>> XXX FIXME Requires DMPlexSetClosurePermutationLexicographic() */ >>>>>>>>> >>>>>>>>> >>>>>>>>> 1758: GmshCreateFE(comm, NULL, isSimplex, continuity, >>>>>>>>> nodeType, dim, coordDim, order, &fe) >>>>>>>>> >>>>>>>>> >>>>>>>>> The continuity is set to false for simplex. >>>>>>>>> >>>>>>>> >>>>>>>> Oh, yes. That needs to be fixed. For now, you can just project it >>>>>>>> to P2 if you want using >>>>>>>> >>>>>>>> https://petsc.org/main/docs/manualpages/DM/DMProjectCoordinates/ >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Matt >>>>>>>> >>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Zongze >>>>>>>>> >>>>>>>>> You can ask for the coordinates of a vertex or an edge directly >>>>>>>>> using >>>>>>>>> >>>>>>>>> >>>>>>>>> https://petsc.org/main/docs/manualpages/DMPLEX/DMPlexPointLocalRead/ >>>>>>>>> >>>>>>>>> by giving the vertex or edge point. You can get all the >>>>>>>>> coordinates on a cell, in the closure order, using >>>>>>>>> >>>>>>>>> >>>>>>>>> https://petsc.org/main/docs/manualpages/DMPLEX/DMPlexVecGetClosure/ >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Matt >>>>>>>>> >>>>>>>>> >>>>>>>>>> Below is some code load the gmsh file, I want to know the >>>>>>>>>> relation between `cl` and `cell_coords`. >>>>>>>>>> >>>>>>>>>> ``` >>>>>>>>>> import firedrake as fd >>>>>>>>>> import numpy as np >>>>>>>>>> >>>>>>>>>> # Load gmsh file (2rd) >>>>>>>>>> plex = fd.mesh._from_gmsh('test-fd-load-p2-rect.msh') >>>>>>>>>> >>>>>>>>>> cs, ce = plex.getHeightStratum(0) >>>>>>>>>> >>>>>>>>>> cdm = plex.getCoordinateDM() >>>>>>>>>> csec = dm.getCoordinateSection() >>>>>>>>>> coords_gvec = dm.getCoordinates() >>>>>>>>>> >>>>>>>>>> for i in range(cs, ce): >>>>>>>>>> cell_coords = cdm.getVecClosure(csec, coords_gvec, i) >>>>>>>>>> print(f'coordinates for cell {i} :\n{cell_coords.reshape([-1, >>>>>>>>>> 3])}') >>>>>>>>>> cl = dm.getTransitiveClosure(i) >>>>>>>>>> print('closure:', cl) >>>>>>>>>> break >>>>>>>>>> ``` >>>>>>>>>> >>>>>>>>>> Best wishes, >>>>>>>>>> Zongze >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>> experiments lead. >>>>>>>>> -- Norbert Wiener >>>>>>>>> >>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> What most experimenters take for granted before they begin their >>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>> experiments lead. 
>>>>>>>> -- Norbert Wiener >>>>>>>> >>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>> >>>>>>>> >>>>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From kabdelaz at purdue.edu Sun May 14 11:06:24 2023 From: kabdelaz at purdue.edu (Khaled Nabil Shar Abdelaziz) Date: Sun, 14 May 2023 16:06:24 +0000 Subject: [petsc-users] SNESDMDASNESSetFunctionLocal in Fortran Message-ID: Hey there, I'm having a problem with the DMDASNESSetFunctionLocal() function in C and its Fortran counterpart. The thing is, in C, you can pass a bunch of variables using the ctx parameter, but in Fortran, it only seems to accept one variable. What's weird is that the SNESSetFunction() function has a similar ctx parameter, but in Fortran, it can handle multiple variables for ctx, unlike DMDASNESSetFunctionLocal(). Do you know if this is on purpose, or am I missing something? Thanks in advance! Best regards, Khaled -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sun May 14 11:24:02 2023 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 14 May 2023 12:24:02 -0400 Subject: [petsc-users] SNESDMDASNESSetFunctionLocal in Fortran In-Reply-To: References: Message-ID: On Sun, May 14, 2023 at 12:06?PM Khaled Nabil Shar Abdelaziz < kabdelaz at purdue.edu> wrote: > Hey there, > > > > I'm having a problem with the DMDASNESSetFunctionLocal() function in C and > its Fortran counterpart. The thing is, in C, you can pass a bunch of > variables using the ctx parameter, but in Fortran, it only seems to accept > one variable. > > > > What's weird is that the SNESSetFunction() function has a similar ctx > parameter, but in Fortran, it can handle multiple variables for ctx, unlike > DMDASNESSetFunctionLocal(). Do you know if this is on purpose, or am I > missing something? > I think we show how to do this here: https://gitlab.com/petsc/petsc/-/blob/main/src/snes/tutorials/ex5f90t.F90 Thanks, Matt > Thanks in advance! > > > > Best regards, > > Khaled > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From yangzongze at gmail.com Sun May 14 11:27:16 2023 From: yangzongze at gmail.com (Zongze Yang) Date: Mon, 15 May 2023 00:27:16 +0800 Subject: [petsc-users] How to find the map between the high order coordinates of DMPlex and vertex numbering? 
In-Reply-To: References: <2640A1A9-101C-4DFB-BFA4-C64AF231732A@gmail.com> Message-ID: On Sun, 14 May 2023 at 23:54, Matthew Knepley wrote: > On Sun, May 14, 2023 at 9:21?AM Zongze Yang wrote: > >> Hi, Matt, >> >> The issue has been resolved while testing on the latest version of PETSc. >> It seems that the problem has been fixed in the following merge request: >> https://gitlab.com/petsc/petsc/-/merge_requests/5970 >> > > No problem. Glad it is working. > > >> I sincerely apologize for any inconvenience caused by my previous >> message. However, I would like to provide you with additional information >> regarding the test files. Attached to this email, you will find two Gmsh >> files: "square_2rd.msh" and "square_3rd.msh." These files contain >> high-order triangulated mesh data for the unit square. >> >> ``` >> $ ./ex33 -coord_space 0 -dm_plex_filename square_2rd.msh >> -dm_plex_gmsh_project >> -dm_plex_gmsh_project_petscdualspace_lagrange_continuity true >> -dm_plex_gmsh_project_fe_view -volume 1 >> PetscFE Object: P2 1 MPI process >> type: basic >> Basic Finite Element in 2 dimensions with 2 components >> PetscSpace Object: P2 1 MPI process >> type: sum >> Space in 2 variables with 2 components, size 12 >> Sum space of 2 concatenated subspaces (all identical) >> PetscSpace Object: sum component (sumcomp_) 1 MPI process >> type: poly >> Space in 2 variables with 1 components, size 6 >> Polynomial space of degree 2 >> PetscDualSpace Object: P2 1 MPI process >> type: lagrange >> Dual space with 2 components, size 12 >> Continuous Lagrange dual space >> Quadrature on a triangle of order 5 on 9 points (dim 2) >> Volume: 1. >> $ ./ex33 -coord_space 0 -dm_plex_filename square_3rd.msh >> -dm_plex_gmsh_project >> -dm_plex_gmsh_project_petscdualspace_lagrange_continuity true >> -dm_plex_gmsh_project_fe_view -volume 1 >> PetscFE Object: P3 1 MPI process >> type: basic >> Basic Finite Element in 2 dimensions with 2 components >> PetscSpace Object: P3 1 MPI process >> type: sum >> Space in 2 variables with 2 components, size 20 >> Sum space of 2 concatenated subspaces (all identical) >> PetscSpace Object: sum component (sumcomp_) 1 MPI process >> type: poly >> Space in 2 variables with 1 components, size 10 >> Polynomial space of degree 3 >> PetscDualSpace Object: P3 1 MPI process >> type: lagrange >> Dual space with 2 components, size 20 >> Continuous Lagrange dual space >> Quadrature on a triangle of order 7 on 16 points (dim 2) >> Volume: 1. >> ``` >> >> Thank you for your attention and understanding. I apologize once again >> for my previous oversight. >> > > Great! If you make an MR for this, you will be included on the next list > of PETSc contributors. Otherwise, I can do it. > > I appreciate your offer to handle the MR. Please go ahead and take care of it. Thank you! Best Wishes, Zongze > Thanks, > > Matt > > >> Best wishes, >> Zongze >> >> >> On Sun, 14 May 2023 at 16:44, Matthew Knepley wrote: >> >>> On Sat, May 13, 2023 at 6:08?AM Zongze Yang >>> wrote: >>> >>>> Hi, Matt, >>>> >>>> There seem to be ongoing issues with projecting high-order coordinates >>>> from a gmsh file to other spaces. I would like to inquire whether there are >>>> any plans to resolve this problem. >>>> >>>> Thank you for your attention to this matter. >>>> >>> >>> Yes, I will look at it. The important thing is to have a good test. 
Here >>> are the higher order geometry tests >>> >>> >>> https://gitlab.com/petsc/petsc/-/blob/main/src/dm/impls/plex/tests/ex33.c >>> >>> I take shapes with known volume, mesh them with higher order geometry, >>> and look at the convergence to the true volume. Could you add a GMsh test, >>> meaning the .msh file and known volume, and I will fix it? >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> >>>> Best wishes, >>>> Zongze >>>> >>>> >>>> On Sat, 18 Jun 2022 at 20:31, Zongze Yang wrote: >>>> >>>>> Thank you for your reply. May I ask for some references on the order >>>>> of the dofs on PETSc's FE Space (especially high order elements)? >>>>> >>>>> Thanks, >>>>> >>>>> Zongze >>>>> >>>>> Matthew Knepley ?2022?6?18??? 20:02??? >>>>> >>>>>> On Sat, Jun 18, 2022 at 2:16 AM Zongze Yang >>>>>> wrote: >>>>>> >>>>>>> In order to check if I made mistakes in the python code, I try to >>>>>>> use c code to show the issue on DMProjectCoordinates. The code and mesh >>>>>>> file is attached. >>>>>>> If the code is correct, there must be something wrong with >>>>>>> `DMProjectCoordinates` or `DMPlexCreateGmshFromFile` for high-order mesh. >>>>>>> >>>>>> >>>>>> Something is definitely wrong with high order, periodic simplices >>>>>> from Gmsh. We had not tested that case. I am at a conference and cannot >>>>>> look at it for a week. >>>>>> My suspicion is that the space we make when reading in the Gmsh >>>>>> coordinates does not match the values (wrong order). >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>>> The command and the output are listed below: (Obviously the bounding >>>>>>> box is changed.) >>>>>>> ``` >>>>>>> $ ./test_gmsh_load_2rd -filename cube-p2.msh -old_fe_view >>>>>>> -new_fe_view >>>>>>> Old Bounding Box: >>>>>>> 0: lo = 0. hi = 1. >>>>>>> 1: lo = 0. hi = 1. >>>>>>> 2: lo = 0. hi = 1. >>>>>>> PetscFE Object: OldCoordinatesFE 1 MPI processes >>>>>>> type: basic >>>>>>> Basic Finite Element in 3 dimensions with 3 components >>>>>>> PetscSpace Object: P2 1 MPI processes >>>>>>> type: sum >>>>>>> Space in 3 variables with 3 components, size 30 >>>>>>> Sum space of 3 concatenated subspaces (all identical) >>>>>>> PetscSpace Object: sum component (sumcomp_) 1 MPI processes >>>>>>> type: poly >>>>>>> Space in 3 variables with 1 components, size 10 >>>>>>> Polynomial space of degree 2 >>>>>>> PetscDualSpace Object: P2 1 MPI processes >>>>>>> type: lagrange >>>>>>> Dual space with 3 components, size 30 >>>>>>> Discontinuous Lagrange dual space >>>>>>> Quadrature of order 5 on 27 points (dim 3) >>>>>>> PetscFE Object: NewCoordinatesFE 1 MPI processes >>>>>>> type: basic >>>>>>> Basic Finite Element in 3 dimensions with 3 components >>>>>>> PetscSpace Object: P2 1 MPI processes >>>>>>> type: sum >>>>>>> Space in 3 variables with 3 components, size 30 >>>>>>> Sum space of 3 concatenated subspaces (all identical) >>>>>>> PetscSpace Object: sum component (sumcomp_) 1 MPI processes >>>>>>> type: poly >>>>>>> Space in 3 variables with 1 components, size 10 >>>>>>> Polynomial space of degree 2 >>>>>>> PetscDualSpace Object: P2 1 MPI processes >>>>>>> type: lagrange >>>>>>> Dual space with 3 components, size 30 >>>>>>> Continuous Lagrange dual space >>>>>>> Quadrature of order 5 on 27 points (dim 3) >>>>>>> New Bounding Box: >>>>>>> 0: lo = 2.5624e-17 hi = 8. >>>>>>> 1: lo = -9.23372e-17 hi = 7. >>>>>>> 2: lo = 2.72091e-17 hi = 8.5 >>>>>>> ``` >>>>>>> >>>>>>> Thanks, >>>>>>> Zongze >>>>>>> >>>>>>> Zongze Yang ?2022?6?17??? 14:54??? >>>>>>> >>>>>>>> I tried the projection operation. 
However, it seems that the >>>>>>>> projection gives the wrong solution. After projection, the bounding box is >>>>>>>> changed! See logs below. >>>>>>>> >>>>>>>> First, I patch the petsc4py by adding `DMProjectCoordinates`: >>>>>>>> ``` >>>>>>>> diff --git a/src/binding/petsc4py/src/PETSc/DM.pyx >>>>>>>> b/src/binding/petsc4py/src/PETSc/DM.pyx >>>>>>>> index d8a58d183a..dbcdb280f1 100644 >>>>>>>> --- a/src/binding/petsc4py/src/PETSc/DM.pyx >>>>>>>> +++ b/src/binding/petsc4py/src/PETSc/DM.pyx >>>>>>>> @@ -307,6 +307,12 @@ cdef class DM(Object): >>>>>>>> PetscINCREF(c.obj) >>>>>>>> return c >>>>>>>> >>>>>>>> + def projectCoordinates(self, FE fe=None): >>>>>>>> + if fe is None: >>>>>>>> + CHKERR( DMProjectCoordinates(self.dm, NULL) ) >>>>>>>> + else: >>>>>>>> + CHKERR( DMProjectCoordinates(self.dm, fe.fe) ) >>>>>>>> + >>>>>>>> def getBoundingBox(self): >>>>>>>> cdef PetscInt i,dim=0 >>>>>>>> CHKERR( DMGetCoordinateDim(self.dm, &dim) ) >>>>>>>> diff --git a/src/binding/petsc4py/src/PETSc/petscdm.pxi >>>>>>>> b/src/binding/petsc4py/src/PETSc/petscdm.pxi >>>>>>>> index 514b6fa472..c778e39884 100644 >>>>>>>> --- a/src/binding/petsc4py/src/PETSc/petscdm.pxi >>>>>>>> +++ b/src/binding/petsc4py/src/PETSc/petscdm.pxi >>>>>>>> @@ -90,6 +90,7 @@ cdef extern from * nogil: >>>>>>>> int DMGetCoordinateDim(PetscDM,PetscInt*) >>>>>>>> int DMSetCoordinateDim(PetscDM,PetscInt) >>>>>>>> int DMLocalizeCoordinates(PetscDM) >>>>>>>> + int DMProjectCoordinates(PetscDM, PetscFE) >>>>>>>> >>>>>>>> int DMCreateInterpolation(PetscDM,PetscDM,PetscMat*,PetscVec*) >>>>>>>> int DMCreateInjection(PetscDM,PetscDM,PetscMat*) >>>>>>>> ``` >>>>>>>> >>>>>>>> Then in python, I load a mesh and project the coordinates to P2: >>>>>>>> ``` >>>>>>>> import firedrake as fd >>>>>>>> from firedrake.petsc import PETSc >>>>>>>> >>>>>>>> # plex = fd.mesh._from_gmsh('test-fd-load-p2.msh') >>>>>>>> plex = fd.mesh._from_gmsh('test-fd-load-p2-rect.msh') >>>>>>>> print('old bbox:', plex.getBoundingBox()) >>>>>>>> >>>>>>>> dim = plex.getDimension() >>>>>>>> # (dim, nc, isSimplex, k, >>>>>>>> qorder, comm=None) >>>>>>>> fe_new = PETSc.FE().createLagrange(dim, dim, True, 2, >>>>>>>> PETSc.DETERMINE) >>>>>>>> plex.projectCoordinates(fe_new) >>>>>>>> fe_new.view() >>>>>>>> >>>>>>>> print('new bbox:', plex.getBoundingBox()) >>>>>>>> ``` >>>>>>>> >>>>>>>> The output is (The bounding box is changed!) >>>>>>>> ``` >>>>>>>> >>>>>>>> old bbox: ((0.0, 1.0), (0.0, 1.0), (0.0, 1.0)) >>>>>>>> PetscFE Object: P2 1 MPI processes >>>>>>>> type: basic >>>>>>>> Basic Finite Element in 3 dimensions with 3 components >>>>>>>> PetscSpace Object: P2 1 MPI processes >>>>>>>> type: sum >>>>>>>> Space in 3 variables with 3 components, size 30 >>>>>>>> Sum space of 3 concatenated subspaces (all identical) >>>>>>>> PetscSpace Object: sum component (sumcomp_) 1 MPI processes >>>>>>>> type: poly >>>>>>>> Space in 3 variables with 1 components, size 10 >>>>>>>> Polynomial space of degree 2 >>>>>>>> PetscDualSpace Object: P2 1 MPI processes >>>>>>>> type: lagrange >>>>>>>> Dual space with 3 components, size 30 >>>>>>>> Continuous Lagrange dual space >>>>>>>> Quadrature of order 5 on 27 points (dim 3) >>>>>>>> new bbox: ((-6.530133708576188e-17, 36.30670832662781), (-3.899962995254311e-17, 36.2406171632539), (-8.8036464152166e-17, 36.111577025012224)) >>>>>>>> >>>>>>>> ``` >>>>>>>> >>>>>>>> >>>>>>>> By the way, for the original DG coordinates, where can I find the relation of the closure and the order of the dofs for the cell? >>>>>>>> >>>>>>>> >>>>>>>> Thanks! 
>>>>>>>> >>>>>>>> >>>>>>>> Zongze >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Matthew Knepley ?2022?6?17??? 01:11??? >>>>>>>> >>>>>>>>> On Thu, Jun 16, 2022 at 12:06 PM Zongze Yang >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> ? 2022?6?16??23:22?Matthew Knepley ??? >>>>>>>>>> >>>>>>>>>> ? >>>>>>>>>> On Thu, Jun 16, 2022 at 11:11 AM Zongze Yang < >>>>>>>>>> yangzongze at gmail.com> wrote: >>>>>>>>>> >>>>>>>>>>> Hi, if I load a `gmsh` file with second-order elements, the >>>>>>>>>>> coordinates will be stored in a DG-P2 space. After obtaining the >>>>>>>>>>> coordinates of a cell, how can I map the coordinates to vertex and edge? >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> By default, they are stored as P2, not DG. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> I checked the coordinates vector, and found the dogs only defined >>>>>>>>>> on cell other than vertex and edge, so I said they are stored as DG. >>>>>>>>>> Then the function DMPlexVecGetClosure >>>>>>>>>> seems return >>>>>>>>>> the coordinates in lex order. >>>>>>>>>> >>>>>>>>>> Some code in reading gmsh file reads that >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> 1756: if (isSimplex) continuity = PETSC_FALSE >>>>>>>>>> ; /* >>>>>>>>>> XXX FIXME Requires DMPlexSetClosurePermutationLexicographic() */ >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> 1758: GmshCreateFE(comm, NULL, isSimplex, continuity, >>>>>>>>>> nodeType, dim, coordDim, order, &fe) >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> The continuity is set to false for simplex. >>>>>>>>>> >>>>>>>>> >>>>>>>>> Oh, yes. That needs to be fixed. For now, you can just project it >>>>>>>>> to P2 if you want using >>>>>>>>> >>>>>>>>> https://petsc.org/main/docs/manualpages/DM/DMProjectCoordinates/ >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Matt >>>>>>>>> >>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Zongze >>>>>>>>>> >>>>>>>>>> You can ask for the coordinates of a vertex or an edge directly >>>>>>>>>> using >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> https://petsc.org/main/docs/manualpages/DMPLEX/DMPlexPointLocalRead/ >>>>>>>>>> >>>>>>>>>> by giving the vertex or edge point. You can get all the >>>>>>>>>> coordinates on a cell, in the closure order, using >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> https://petsc.org/main/docs/manualpages/DMPLEX/DMPlexVecGetClosure/ >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> Matt >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> Below is some code load the gmsh file, I want to know the >>>>>>>>>>> relation between `cl` and `cell_coords`. >>>>>>>>>>> >>>>>>>>>>> ``` >>>>>>>>>>> import firedrake as fd >>>>>>>>>>> import numpy as np >>>>>>>>>>> >>>>>>>>>>> # Load gmsh file (2rd) >>>>>>>>>>> plex = fd.mesh._from_gmsh('test-fd-load-p2-rect.msh') >>>>>>>>>>> >>>>>>>>>>> cs, ce = plex.getHeightStratum(0) >>>>>>>>>>> >>>>>>>>>>> cdm = plex.getCoordinateDM() >>>>>>>>>>> csec = dm.getCoordinateSection() >>>>>>>>>>> coords_gvec = dm.getCoordinates() >>>>>>>>>>> >>>>>>>>>>> for i in range(cs, ce): >>>>>>>>>>> cell_coords = cdm.getVecClosure(csec, coords_gvec, i) >>>>>>>>>>> print(f'coordinates for cell {i} >>>>>>>>>>> :\n{cell_coords.reshape([-1, 3])}') >>>>>>>>>>> cl = dm.getTransitiveClosure(i) >>>>>>>>>>> print('closure:', cl) >>>>>>>>>>> break >>>>>>>>>>> ``` >>>>>>>>>>> >>>>>>>>>>> Best wishes, >>>>>>>>>>> Zongze >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>>> experiments lead. 
>>>>>>>>>> -- Norbert Wiener >>>>>>>>>> >>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>> experiments lead. >>>>>>>>> -- Norbert Wiener >>>>>>>>> >>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>> >>>>>> >>>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sun May 14 15:24:31 2023 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 14 May 2023 16:24:31 -0400 Subject: [petsc-users] How to find the map between the high order coordinates of DMPlex and vertex numbering? In-Reply-To: References: <2640A1A9-101C-4DFB-BFA4-C64AF231732A@gmail.com> Message-ID: On Sun, May 14, 2023 at 12:27?PM Zongze Yang wrote: > > > > On Sun, 14 May 2023 at 23:54, Matthew Knepley wrote: > >> On Sun, May 14, 2023 at 9:21?AM Zongze Yang wrote: >> >>> Hi, Matt, >>> >>> The issue has been resolved while testing on the latest version of >>> PETSc. It seems that the problem has been fixed in the following merge >>> request: https://gitlab.com/petsc/petsc/-/merge_requests/5970 >>> >> >> No problem. Glad it is working. >> >> >>> I sincerely apologize for any inconvenience caused by my previous >>> message. However, I would like to provide you with additional information >>> regarding the test files. Attached to this email, you will find two Gmsh >>> files: "square_2rd.msh" and "square_3rd.msh." These files contain >>> high-order triangulated mesh data for the unit square. >>> >>> ``` >>> $ ./ex33 -coord_space 0 -dm_plex_filename square_2rd.msh >>> -dm_plex_gmsh_project >>> -dm_plex_gmsh_project_petscdualspace_lagrange_continuity true >>> -dm_plex_gmsh_project_fe_view -volume 1 >>> PetscFE Object: P2 1 MPI process >>> type: basic >>> Basic Finite Element in 2 dimensions with 2 components >>> PetscSpace Object: P2 1 MPI process >>> type: sum >>> Space in 2 variables with 2 components, size 12 >>> Sum space of 2 concatenated subspaces (all identical) >>> PetscSpace Object: sum component (sumcomp_) 1 MPI process >>> type: poly >>> Space in 2 variables with 1 components, size 6 >>> Polynomial space of degree 2 >>> PetscDualSpace Object: P2 1 MPI process >>> type: lagrange >>> Dual space with 2 components, size 12 >>> Continuous Lagrange dual space >>> Quadrature on a triangle of order 5 on 9 points (dim 2) >>> Volume: 1. 
>>> $ ./ex33 -coord_space 0 -dm_plex_filename square_3rd.msh >>> -dm_plex_gmsh_project >>> -dm_plex_gmsh_project_petscdualspace_lagrange_continuity true >>> -dm_plex_gmsh_project_fe_view -volume 1 >>> PetscFE Object: P3 1 MPI process >>> type: basic >>> Basic Finite Element in 2 dimensions with 2 components >>> PetscSpace Object: P3 1 MPI process >>> type: sum >>> Space in 2 variables with 2 components, size 20 >>> Sum space of 2 concatenated subspaces (all identical) >>> PetscSpace Object: sum component (sumcomp_) 1 MPI process >>> type: poly >>> Space in 2 variables with 1 components, size 10 >>> Polynomial space of degree 3 >>> PetscDualSpace Object: P3 1 MPI process >>> type: lagrange >>> Dual space with 2 components, size 20 >>> Continuous Lagrange dual space >>> Quadrature on a triangle of order 7 on 16 points (dim 2) >>> Volume: 1. >>> ``` >>> >>> Thank you for your attention and understanding. I apologize once again >>> for my previous oversight. >>> >> >> Great! If you make an MR for this, you will be included on the next list >> of PETSc contributors. Otherwise, I can do it. >> >> > I appreciate your offer to handle the MR. Please go ahead and take care of > it. Thank you! > I have created the MR with your tests. They are working for me: https://gitlab.com/petsc/petsc/-/merge_requests/6463 Thanks, Matt > Best Wishes, > Zongze > > >> Thanks, >> >> Matt >> >> >>> Best wishes, >>> Zongze >>> >>> >>> On Sun, 14 May 2023 at 16:44, Matthew Knepley wrote: >>> >>>> On Sat, May 13, 2023 at 6:08?AM Zongze Yang >>>> wrote: >>>> >>>>> Hi, Matt, >>>>> >>>>> There seem to be ongoing issues with projecting high-order coordinates >>>>> from a gmsh file to other spaces. I would like to inquire whether there are >>>>> any plans to resolve this problem. >>>>> >>>>> Thank you for your attention to this matter. >>>>> >>>> >>>> Yes, I will look at it. The important thing is to have a good test. >>>> Here are the higher order geometry tests >>>> >>>> >>>> https://gitlab.com/petsc/petsc/-/blob/main/src/dm/impls/plex/tests/ex33.c >>>> >>>> I take shapes with known volume, mesh them with higher order geometry, >>>> and look at the convergence to the true volume. Could you add a GMsh test, >>>> meaning the .msh file and known volume, and I will fix it? >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> >>>>> Best wishes, >>>>> Zongze >>>>> >>>>> >>>>> On Sat, 18 Jun 2022 at 20:31, Zongze Yang >>>>> wrote: >>>>> >>>>>> Thank you for your reply. May I ask for some references on the order >>>>>> of the dofs on PETSc's FE Space (especially high order elements)? >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Zongze >>>>>> >>>>>> Matthew Knepley ?2022?6?18??? 20:02??? >>>>>> >>>>>>> On Sat, Jun 18, 2022 at 2:16 AM Zongze Yang >>>>>>> wrote: >>>>>>> >>>>>>>> In order to check if I made mistakes in the python code, I try to >>>>>>>> use c code to show the issue on DMProjectCoordinates. The code and mesh >>>>>>>> file is attached. >>>>>>>> If the code is correct, there must be something wrong with >>>>>>>> `DMProjectCoordinates` or `DMPlexCreateGmshFromFile` for high-order mesh. >>>>>>>> >>>>>>> >>>>>>> Something is definitely wrong with high order, periodic simplices >>>>>>> from Gmsh. We had not tested that case. I am at a conference and cannot >>>>>>> look at it for a week. >>>>>>> My suspicion is that the space we make when reading in the Gmsh >>>>>>> coordinates does not match the values (wrong order). 
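One quick way to check that suspicion on a given mesh is to print the discretization attached to the coordinate DM next to the raw coordinate vector and compare both against the node ordering in the .msh file. A rough sketch, assuming the coordinate DM carries its discretization as field 0:

```c
#include <petscdmplex.h>

/* Dump the coordinate discretization and the raw coordinate dof values so the
   claimed space can be compared with the numbers read from the Gmsh file */
static PetscErrorCode ViewCoordinateSpace(DM dm)
{
  DM          cdm;
  PetscObject disc;
  Vec         X;

  PetscFunctionBeginUser;
  PetscCall(DMGetCoordinateDM(dm, &cdm));
  PetscCall(DMGetField(cdm, 0, NULL, &disc));
  PetscCall(PetscObjectView(disc, PETSC_VIEWER_STDOUT_WORLD)); /* degree, continuity, dual space */
  PetscCall(DMGetCoordinates(dm, &X));
  PetscCall(VecView(X, PETSC_VIEWER_STDOUT_WORLD));            /* the dof values themselves */
  PetscFunctionReturn(0);
}
```

In the FE views quoted in this thread, a "Discontinuous Lagrange dual space" line in that output is the signature of the DG coordinate storage under discussion.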
>>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Matt >>>>>>> >>>>>>> >>>>>>>> The command and the output are listed below: (Obviously the >>>>>>>> bounding box is changed.) >>>>>>>> ``` >>>>>>>> $ ./test_gmsh_load_2rd -filename cube-p2.msh -old_fe_view >>>>>>>> -new_fe_view >>>>>>>> Old Bounding Box: >>>>>>>> 0: lo = 0. hi = 1. >>>>>>>> 1: lo = 0. hi = 1. >>>>>>>> 2: lo = 0. hi = 1. >>>>>>>> PetscFE Object: OldCoordinatesFE 1 MPI processes >>>>>>>> type: basic >>>>>>>> Basic Finite Element in 3 dimensions with 3 components >>>>>>>> PetscSpace Object: P2 1 MPI processes >>>>>>>> type: sum >>>>>>>> Space in 3 variables with 3 components, size 30 >>>>>>>> Sum space of 3 concatenated subspaces (all identical) >>>>>>>> PetscSpace Object: sum component (sumcomp_) 1 MPI processes >>>>>>>> type: poly >>>>>>>> Space in 3 variables with 1 components, size 10 >>>>>>>> Polynomial space of degree 2 >>>>>>>> PetscDualSpace Object: P2 1 MPI processes >>>>>>>> type: lagrange >>>>>>>> Dual space with 3 components, size 30 >>>>>>>> Discontinuous Lagrange dual space >>>>>>>> Quadrature of order 5 on 27 points (dim 3) >>>>>>>> PetscFE Object: NewCoordinatesFE 1 MPI processes >>>>>>>> type: basic >>>>>>>> Basic Finite Element in 3 dimensions with 3 components >>>>>>>> PetscSpace Object: P2 1 MPI processes >>>>>>>> type: sum >>>>>>>> Space in 3 variables with 3 components, size 30 >>>>>>>> Sum space of 3 concatenated subspaces (all identical) >>>>>>>> PetscSpace Object: sum component (sumcomp_) 1 MPI processes >>>>>>>> type: poly >>>>>>>> Space in 3 variables with 1 components, size 10 >>>>>>>> Polynomial space of degree 2 >>>>>>>> PetscDualSpace Object: P2 1 MPI processes >>>>>>>> type: lagrange >>>>>>>> Dual space with 3 components, size 30 >>>>>>>> Continuous Lagrange dual space >>>>>>>> Quadrature of order 5 on 27 points (dim 3) >>>>>>>> New Bounding Box: >>>>>>>> 0: lo = 2.5624e-17 hi = 8. >>>>>>>> 1: lo = -9.23372e-17 hi = 7. >>>>>>>> 2: lo = 2.72091e-17 hi = 8.5 >>>>>>>> ``` >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Zongze >>>>>>>> >>>>>>>> Zongze Yang ?2022?6?17??? 14:54??? >>>>>>>> >>>>>>>>> I tried the projection operation. However, it seems that the >>>>>>>>> projection gives the wrong solution. After projection, the bounding box is >>>>>>>>> changed! See logs below. 
>>>>>>>>> >>>>>>>>> First, I patch the petsc4py by adding `DMProjectCoordinates`: >>>>>>>>> ``` >>>>>>>>> diff --git a/src/binding/petsc4py/src/PETSc/DM.pyx >>>>>>>>> b/src/binding/petsc4py/src/PETSc/DM.pyx >>>>>>>>> index d8a58d183a..dbcdb280f1 100644 >>>>>>>>> --- a/src/binding/petsc4py/src/PETSc/DM.pyx >>>>>>>>> +++ b/src/binding/petsc4py/src/PETSc/DM.pyx >>>>>>>>> @@ -307,6 +307,12 @@ cdef class DM(Object): >>>>>>>>> PetscINCREF(c.obj) >>>>>>>>> return c >>>>>>>>> >>>>>>>>> + def projectCoordinates(self, FE fe=None): >>>>>>>>> + if fe is None: >>>>>>>>> + CHKERR( DMProjectCoordinates(self.dm, NULL) ) >>>>>>>>> + else: >>>>>>>>> + CHKERR( DMProjectCoordinates(self.dm, fe.fe) ) >>>>>>>>> + >>>>>>>>> def getBoundingBox(self): >>>>>>>>> cdef PetscInt i,dim=0 >>>>>>>>> CHKERR( DMGetCoordinateDim(self.dm, &dim) ) >>>>>>>>> diff --git a/src/binding/petsc4py/src/PETSc/petscdm.pxi >>>>>>>>> b/src/binding/petsc4py/src/PETSc/petscdm.pxi >>>>>>>>> index 514b6fa472..c778e39884 100644 >>>>>>>>> --- a/src/binding/petsc4py/src/PETSc/petscdm.pxi >>>>>>>>> +++ b/src/binding/petsc4py/src/PETSc/petscdm.pxi >>>>>>>>> @@ -90,6 +90,7 @@ cdef extern from * nogil: >>>>>>>>> int DMGetCoordinateDim(PetscDM,PetscInt*) >>>>>>>>> int DMSetCoordinateDim(PetscDM,PetscInt) >>>>>>>>> int DMLocalizeCoordinates(PetscDM) >>>>>>>>> + int DMProjectCoordinates(PetscDM, PetscFE) >>>>>>>>> >>>>>>>>> int DMCreateInterpolation(PetscDM,PetscDM,PetscMat*,PetscVec*) >>>>>>>>> int DMCreateInjection(PetscDM,PetscDM,PetscMat*) >>>>>>>>> ``` >>>>>>>>> >>>>>>>>> Then in python, I load a mesh and project the coordinates to P2: >>>>>>>>> ``` >>>>>>>>> import firedrake as fd >>>>>>>>> from firedrake.petsc import PETSc >>>>>>>>> >>>>>>>>> # plex = fd.mesh._from_gmsh('test-fd-load-p2.msh') >>>>>>>>> plex = fd.mesh._from_gmsh('test-fd-load-p2-rect.msh') >>>>>>>>> print('old bbox:', plex.getBoundingBox()) >>>>>>>>> >>>>>>>>> dim = plex.getDimension() >>>>>>>>> # (dim, nc, isSimplex, k, >>>>>>>>> qorder, comm=None) >>>>>>>>> fe_new = PETSc.FE().createLagrange(dim, dim, True, 2, >>>>>>>>> PETSc.DETERMINE) >>>>>>>>> plex.projectCoordinates(fe_new) >>>>>>>>> fe_new.view() >>>>>>>>> >>>>>>>>> print('new bbox:', plex.getBoundingBox()) >>>>>>>>> ``` >>>>>>>>> >>>>>>>>> The output is (The bounding box is changed!) >>>>>>>>> ``` >>>>>>>>> >>>>>>>>> old bbox: ((0.0, 1.0), (0.0, 1.0), (0.0, 1.0)) >>>>>>>>> PetscFE Object: P2 1 MPI processes >>>>>>>>> type: basic >>>>>>>>> Basic Finite Element in 3 dimensions with 3 components >>>>>>>>> PetscSpace Object: P2 1 MPI processes >>>>>>>>> type: sum >>>>>>>>> Space in 3 variables with 3 components, size 30 >>>>>>>>> Sum space of 3 concatenated subspaces (all identical) >>>>>>>>> PetscSpace Object: sum component (sumcomp_) 1 MPI processes >>>>>>>>> type: poly >>>>>>>>> Space in 3 variables with 1 components, size 10 >>>>>>>>> Polynomial space of degree 2 >>>>>>>>> PetscDualSpace Object: P2 1 MPI processes >>>>>>>>> type: lagrange >>>>>>>>> Dual space with 3 components, size 30 >>>>>>>>> Continuous Lagrange dual space >>>>>>>>> Quadrature of order 5 on 27 points (dim 3) >>>>>>>>> new bbox: ((-6.530133708576188e-17, 36.30670832662781), (-3.899962995254311e-17, 36.2406171632539), (-8.8036464152166e-17, 36.111577025012224)) >>>>>>>>> >>>>>>>>> ``` >>>>>>>>> >>>>>>>>> >>>>>>>>> By the way, for the original DG coordinates, where can I find the relation of the closure and the order of the dofs for the cell? >>>>>>>>> >>>>>>>>> >>>>>>>>> Thanks! 
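One way to inspect that relation empirically is sketched below in C: for one cell, print the points of its transitive closure next to the coordinate dofs returned by DMPlexVecGetClosure and compare them by eye. It assumes a DM already created from the Gmsh file (for example via DMPlexCreateGmshFromFile) and deliberately does not assert what the ordering is, since that is the open question for the discontinuous Gmsh coordinates.

```
#include <petscdmplex.h>

/* Print closure points and closure coordinate dofs for the first cell */
static PetscErrorCode InspectCellClosure(DM dm)
{
  PetscSection cs;
  Vec          coords;
  PetscInt     cStart, cEnd, npts, nc, i;
  PetscInt    *closure    = NULL;
  PetscScalar *cellCoords = NULL;

  PetscFunctionBeginUser;
  PetscCall(DMGetCoordinateSection(dm, &cs));
  PetscCall(DMGetCoordinatesLocal(dm, &coords));
  PetscCall(DMPlexGetHeightStratum(dm, 0, &cStart, &cEnd));
  if (cEnd > cStart) {
    const PetscInt c = cStart;
    /* closure[] holds (point, orientation) pairs */
    PetscCall(DMPlexGetTransitiveClosure(dm, c, PETSC_TRUE, &npts, &closure));
    for (i = 0; i < npts; ++i) PetscCall(PetscPrintf(PETSC_COMM_SELF, "closure point %" PetscInt_FMT "\n", closure[2 * i]));
    PetscCall(DMPlexRestoreTransitiveClosure(dm, c, PETSC_TRUE, &npts, &closure));
    /* Coordinate dofs of the same cell, as laid out by the coordinate section */
    PetscCall(DMPlexVecGetClosure(dm, cs, coords, c, &nc, &cellCoords));
    for (i = 0; i < nc; ++i) PetscCall(PetscPrintf(PETSC_COMM_SELF, "coord dof %" PetscInt_FMT ": %g\n", i, (double)PetscRealPart(cellCoords[i])));
    PetscCall(DMPlexVecRestoreClosure(dm, cs, coords, c, &nc, &cellCoords));
  }
  PetscFunctionReturn(PETSC_SUCCESS);
}
```

For projected, continuous coordinates the dofs follow the usual closure order of the section; the messages above suggest the cell-only coordinates read from Gmsh come back in a lexicographic-style order, which is what the FIXME in the reader refers to.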
>>>>>>>>> >>>>>>>>> >>>>>>>>> Zongze >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Matthew Knepley ?2022?6?17??? 01:11??? >>>>>>>>> >>>>>>>>>> On Thu, Jun 16, 2022 at 12:06 PM Zongze Yang < >>>>>>>>>> yangzongze at gmail.com> wrote: >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> ? 2022?6?16??23:22?Matthew Knepley ??? >>>>>>>>>>> >>>>>>>>>>> ? >>>>>>>>>>> On Thu, Jun 16, 2022 at 11:11 AM Zongze Yang < >>>>>>>>>>> yangzongze at gmail.com> wrote: >>>>>>>>>>> >>>>>>>>>>>> Hi, if I load a `gmsh` file with second-order elements, the >>>>>>>>>>>> coordinates will be stored in a DG-P2 space. After obtaining the >>>>>>>>>>>> coordinates of a cell, how can I map the coordinates to vertex and edge? >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> By default, they are stored as P2, not DG. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> I checked the coordinates vector, and found the dogs only >>>>>>>>>>> defined on cell other than vertex and edge, so I said they are stored as DG. >>>>>>>>>>> Then the function DMPlexVecGetClosure >>>>>>>>>>> seems return >>>>>>>>>>> the coordinates in lex order. >>>>>>>>>>> >>>>>>>>>>> Some code in reading gmsh file reads that >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> 1756: if (isSimplex) continuity = PETSC_FALSE >>>>>>>>>>> ; /* >>>>>>>>>>> XXX FIXME Requires DMPlexSetClosurePermutationLexicographic() */ >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> 1758: GmshCreateFE(comm, NULL, isSimplex, continuity, >>>>>>>>>>> nodeType, dim, coordDim, order, &fe) >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> The continuity is set to false for simplex. >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Oh, yes. That needs to be fixed. For now, you can just project it >>>>>>>>>> to P2 if you want using >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> https://petsc.org/main/docs/manualpages/DM/DMProjectCoordinates/ >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> Matt >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Zongze >>>>>>>>>>> >>>>>>>>>>> You can ask for the coordinates of a vertex or an edge directly >>>>>>>>>>> using >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> https://petsc.org/main/docs/manualpages/DMPLEX/DMPlexPointLocalRead/ >>>>>>>>>>> >>>>>>>>>>> by giving the vertex or edge point. You can get all the >>>>>>>>>>> coordinates on a cell, in the closure order, using >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> https://petsc.org/main/docs/manualpages/DMPLEX/DMPlexVecGetClosure/ >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> >>>>>>>>>>> Matt >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> Below is some code load the gmsh file, I want to know the >>>>>>>>>>>> relation between `cl` and `cell_coords`. 
>>>>>>>>>>>> >>>>>>>>>>>> ``` >>>>>>>>>>>> import firedrake as fd >>>>>>>>>>>> import numpy as np >>>>>>>>>>>> >>>>>>>>>>>> # Load gmsh file (2rd) >>>>>>>>>>>> plex = fd.mesh._from_gmsh('test-fd-load-p2-rect.msh') >>>>>>>>>>>> >>>>>>>>>>>> cs, ce = plex.getHeightStratum(0) >>>>>>>>>>>> >>>>>>>>>>>> cdm = plex.getCoordinateDM() >>>>>>>>>>>> csec = dm.getCoordinateSection() >>>>>>>>>>>> coords_gvec = dm.getCoordinates() >>>>>>>>>>>> >>>>>>>>>>>> for i in range(cs, ce): >>>>>>>>>>>> cell_coords = cdm.getVecClosure(csec, coords_gvec, i) >>>>>>>>>>>> print(f'coordinates for cell {i} >>>>>>>>>>>> :\n{cell_coords.reshape([-1, 3])}') >>>>>>>>>>>> cl = dm.getTransitiveClosure(i) >>>>>>>>>>>> print('closure:', cl) >>>>>>>>>>>> break >>>>>>>>>>>> ``` >>>>>>>>>>>> >>>>>>>>>>>> Best wishes, >>>>>>>>>>>> Zongze >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>>>> experiments lead. >>>>>>>>>>> -- Norbert Wiener >>>>>>>>>>> >>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>>> experiments lead. >>>>>>>>>> -- Norbert Wiener >>>>>>>>>> >>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their >>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>> experiments lead. >>>>>>> -- Norbert Wiener >>>>>>> >>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>> >>>>>>> >>>>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From yangzongze at gmail.com Sun May 14 18:23:47 2023 From: yangzongze at gmail.com (Zongze Yang) Date: Mon, 15 May 2023 07:23:47 +0800 Subject: [petsc-users] How to find the map between the high order coordinates of DMPlex and vertex numbering? In-Reply-To: References: <2640A1A9-101C-4DFB-BFA4-C64AF231732A@gmail.com> Message-ID: Could you try to project the coordinates into the continuity space by enabling the option `-dm_plex_gmsh_project_petscdualspace_lagrange_continuity true`? Best wishes, Zongze On Mon, 15 May 2023 at 04:24, Matthew Knepley wrote: > On Sun, May 14, 2023 at 12:27?PM Zongze Yang wrote: > >> >> >> >> On Sun, 14 May 2023 at 23:54, Matthew Knepley wrote: >> >>> On Sun, May 14, 2023 at 9:21?AM Zongze Yang >>> wrote: >>> >>>> Hi, Matt, >>>> >>>> The issue has been resolved while testing on the latest version of >>>> PETSc. 
It seems that the problem has been fixed in the following merge >>>> request: https://gitlab.com/petsc/petsc/-/merge_requests/5970 >>>> >>> >>> No problem. Glad it is working. >>> >>> >>>> I sincerely apologize for any inconvenience caused by my previous >>>> message. However, I would like to provide you with additional information >>>> regarding the test files. Attached to this email, you will find two Gmsh >>>> files: "square_2rd.msh" and "square_3rd.msh." These files contain >>>> high-order triangulated mesh data for the unit square. >>>> >>>> ``` >>>> $ ./ex33 -coord_space 0 -dm_plex_filename square_2rd.msh >>>> -dm_plex_gmsh_project >>>> -dm_plex_gmsh_project_petscdualspace_lagrange_continuity true >>>> -dm_plex_gmsh_project_fe_view -volume 1 >>>> PetscFE Object: P2 1 MPI process >>>> type: basic >>>> Basic Finite Element in 2 dimensions with 2 components >>>> PetscSpace Object: P2 1 MPI process >>>> type: sum >>>> Space in 2 variables with 2 components, size 12 >>>> Sum space of 2 concatenated subspaces (all identical) >>>> PetscSpace Object: sum component (sumcomp_) 1 MPI process >>>> type: poly >>>> Space in 2 variables with 1 components, size 6 >>>> Polynomial space of degree 2 >>>> PetscDualSpace Object: P2 1 MPI process >>>> type: lagrange >>>> Dual space with 2 components, size 12 >>>> Continuous Lagrange dual space >>>> Quadrature on a triangle of order 5 on 9 points (dim 2) >>>> Volume: 1. >>>> $ ./ex33 -coord_space 0 -dm_plex_filename square_3rd.msh >>>> -dm_plex_gmsh_project >>>> -dm_plex_gmsh_project_petscdualspace_lagrange_continuity true >>>> -dm_plex_gmsh_project_fe_view -volume 1 >>>> PetscFE Object: P3 1 MPI process >>>> type: basic >>>> Basic Finite Element in 2 dimensions with 2 components >>>> PetscSpace Object: P3 1 MPI process >>>> type: sum >>>> Space in 2 variables with 2 components, size 20 >>>> Sum space of 2 concatenated subspaces (all identical) >>>> PetscSpace Object: sum component (sumcomp_) 1 MPI process >>>> type: poly >>>> Space in 2 variables with 1 components, size 10 >>>> Polynomial space of degree 3 >>>> PetscDualSpace Object: P3 1 MPI process >>>> type: lagrange >>>> Dual space with 2 components, size 20 >>>> Continuous Lagrange dual space >>>> Quadrature on a triangle of order 7 on 16 points (dim 2) >>>> Volume: 1. >>>> ``` >>>> >>>> Thank you for your attention and understanding. I apologize once again >>>> for my previous oversight. >>>> >>> >>> Great! If you make an MR for this, you will be included on the next list >>> of PETSc contributors. Otherwise, I can do it. >>> >>> >> I appreciate your offer to handle the MR. Please go ahead and take care >> of it. Thank you! >> > > I have created the MR with your tests. They are working for me: > > https://gitlab.com/petsc/petsc/-/merge_requests/6463 > > Thanks, > > Matt > > >> Best Wishes, >> Zongze >> >> >>> Thanks, >>> >>> Matt >>> >>> >>>> Best wishes, >>>> Zongze >>>> >>>> >>>> On Sun, 14 May 2023 at 16:44, Matthew Knepley >>>> wrote: >>>> >>>>> On Sat, May 13, 2023 at 6:08?AM Zongze Yang >>>>> wrote: >>>>> >>>>>> Hi, Matt, >>>>>> >>>>>> There seem to be ongoing issues with projecting high-order >>>>>> coordinates from a gmsh file to other spaces. I would like to inquire >>>>>> whether there are any plans to resolve this problem. >>>>>> >>>>>> Thank you for your attention to this matter. >>>>>> >>>>> >>>>> Yes, I will look at it. The important thing is to have a good test. 
>>>>> Here are the higher order geometry tests >>>>> >>>>> >>>>> https://gitlab.com/petsc/petsc/-/blob/main/src/dm/impls/plex/tests/ex33.c >>>>> >>>>> I take shapes with known volume, mesh them with higher order geometry, >>>>> and look at the convergence to the true volume. Could you add a GMsh test, >>>>> meaning the .msh file and known volume, and I will fix it? >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> >>>>>> Best wishes, >>>>>> Zongze >>>>>> >>>>>> >>>>>> On Sat, 18 Jun 2022 at 20:31, Zongze Yang >>>>>> wrote: >>>>>> >>>>>>> Thank you for your reply. May I ask for some references on the order >>>>>>> of the dofs on PETSc's FE Space (especially high order elements)? >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Zongze >>>>>>> >>>>>>> Matthew Knepley ?2022?6?18??? 20:02??? >>>>>>> >>>>>>>> On Sat, Jun 18, 2022 at 2:16 AM Zongze Yang >>>>>>>> wrote: >>>>>>>> >>>>>>>>> In order to check if I made mistakes in the python code, I try to >>>>>>>>> use c code to show the issue on DMProjectCoordinates. The code and mesh >>>>>>>>> file is attached. >>>>>>>>> If the code is correct, there must be something wrong with >>>>>>>>> `DMProjectCoordinates` or `DMPlexCreateGmshFromFile` for high-order mesh. >>>>>>>>> >>>>>>>> >>>>>>>> Something is definitely wrong with high order, periodic simplices >>>>>>>> from Gmsh. We had not tested that case. I am at a conference and cannot >>>>>>>> look at it for a week. >>>>>>>> My suspicion is that the space we make when reading in the Gmsh >>>>>>>> coordinates does not match the values (wrong order). >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Matt >>>>>>>> >>>>>>>> >>>>>>>>> The command and the output are listed below: (Obviously the >>>>>>>>> bounding box is changed.) >>>>>>>>> ``` >>>>>>>>> $ ./test_gmsh_load_2rd -filename cube-p2.msh -old_fe_view >>>>>>>>> -new_fe_view >>>>>>>>> Old Bounding Box: >>>>>>>>> 0: lo = 0. hi = 1. >>>>>>>>> 1: lo = 0. hi = 1. >>>>>>>>> 2: lo = 0. hi = 1. >>>>>>>>> PetscFE Object: OldCoordinatesFE 1 MPI processes >>>>>>>>> type: basic >>>>>>>>> Basic Finite Element in 3 dimensions with 3 components >>>>>>>>> PetscSpace Object: P2 1 MPI processes >>>>>>>>> type: sum >>>>>>>>> Space in 3 variables with 3 components, size 30 >>>>>>>>> Sum space of 3 concatenated subspaces (all identical) >>>>>>>>> PetscSpace Object: sum component (sumcomp_) 1 MPI processes >>>>>>>>> type: poly >>>>>>>>> Space in 3 variables with 1 components, size 10 >>>>>>>>> Polynomial space of degree 2 >>>>>>>>> PetscDualSpace Object: P2 1 MPI processes >>>>>>>>> type: lagrange >>>>>>>>> Dual space with 3 components, size 30 >>>>>>>>> Discontinuous Lagrange dual space >>>>>>>>> Quadrature of order 5 on 27 points (dim 3) >>>>>>>>> PetscFE Object: NewCoordinatesFE 1 MPI processes >>>>>>>>> type: basic >>>>>>>>> Basic Finite Element in 3 dimensions with 3 components >>>>>>>>> PetscSpace Object: P2 1 MPI processes >>>>>>>>> type: sum >>>>>>>>> Space in 3 variables with 3 components, size 30 >>>>>>>>> Sum space of 3 concatenated subspaces (all identical) >>>>>>>>> PetscSpace Object: sum component (sumcomp_) 1 MPI processes >>>>>>>>> type: poly >>>>>>>>> Space in 3 variables with 1 components, size 10 >>>>>>>>> Polynomial space of degree 2 >>>>>>>>> PetscDualSpace Object: P2 1 MPI processes >>>>>>>>> type: lagrange >>>>>>>>> Dual space with 3 components, size 30 >>>>>>>>> Continuous Lagrange dual space >>>>>>>>> Quadrature of order 5 on 27 points (dim 3) >>>>>>>>> New Bounding Box: >>>>>>>>> 0: lo = 2.5624e-17 hi = 8. >>>>>>>>> 1: lo = -9.23372e-17 hi = 7. 
>>>>>>>>> 2: lo = 2.72091e-17 hi = 8.5 >>>>>>>>> ``` >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Zongze >>>>>>>>> >>>>>>>>> Zongze Yang ?2022?6?17??? 14:54??? >>>>>>>>> >>>>>>>>>> I tried the projection operation. However, it seems that the >>>>>>>>>> projection gives the wrong solution. After projection, the bounding box is >>>>>>>>>> changed! See logs below. >>>>>>>>>> >>>>>>>>>> First, I patch the petsc4py by adding `DMProjectCoordinates`: >>>>>>>>>> ``` >>>>>>>>>> diff --git a/src/binding/petsc4py/src/PETSc/DM.pyx >>>>>>>>>> b/src/binding/petsc4py/src/PETSc/DM.pyx >>>>>>>>>> index d8a58d183a..dbcdb280f1 100644 >>>>>>>>>> --- a/src/binding/petsc4py/src/PETSc/DM.pyx >>>>>>>>>> +++ b/src/binding/petsc4py/src/PETSc/DM.pyx >>>>>>>>>> @@ -307,6 +307,12 @@ cdef class DM(Object): >>>>>>>>>> PetscINCREF(c.obj) >>>>>>>>>> return c >>>>>>>>>> >>>>>>>>>> + def projectCoordinates(self, FE fe=None): >>>>>>>>>> + if fe is None: >>>>>>>>>> + CHKERR( DMProjectCoordinates(self.dm, NULL) ) >>>>>>>>>> + else: >>>>>>>>>> + CHKERR( DMProjectCoordinates(self.dm, fe.fe) ) >>>>>>>>>> + >>>>>>>>>> def getBoundingBox(self): >>>>>>>>>> cdef PetscInt i,dim=0 >>>>>>>>>> CHKERR( DMGetCoordinateDim(self.dm, &dim) ) >>>>>>>>>> diff --git a/src/binding/petsc4py/src/PETSc/petscdm.pxi >>>>>>>>>> b/src/binding/petsc4py/src/PETSc/petscdm.pxi >>>>>>>>>> index 514b6fa472..c778e39884 100644 >>>>>>>>>> --- a/src/binding/petsc4py/src/PETSc/petscdm.pxi >>>>>>>>>> +++ b/src/binding/petsc4py/src/PETSc/petscdm.pxi >>>>>>>>>> @@ -90,6 +90,7 @@ cdef extern from * nogil: >>>>>>>>>> int DMGetCoordinateDim(PetscDM,PetscInt*) >>>>>>>>>> int DMSetCoordinateDim(PetscDM,PetscInt) >>>>>>>>>> int DMLocalizeCoordinates(PetscDM) >>>>>>>>>> + int DMProjectCoordinates(PetscDM, PetscFE) >>>>>>>>>> >>>>>>>>>> int >>>>>>>>>> DMCreateInterpolation(PetscDM,PetscDM,PetscMat*,PetscVec*) >>>>>>>>>> int DMCreateInjection(PetscDM,PetscDM,PetscMat*) >>>>>>>>>> ``` >>>>>>>>>> >>>>>>>>>> Then in python, I load a mesh and project the coordinates to P2: >>>>>>>>>> ``` >>>>>>>>>> import firedrake as fd >>>>>>>>>> from firedrake.petsc import PETSc >>>>>>>>>> >>>>>>>>>> # plex = fd.mesh._from_gmsh('test-fd-load-p2.msh') >>>>>>>>>> plex = fd.mesh._from_gmsh('test-fd-load-p2-rect.msh') >>>>>>>>>> print('old bbox:', plex.getBoundingBox()) >>>>>>>>>> >>>>>>>>>> dim = plex.getDimension() >>>>>>>>>> # (dim, nc, isSimplex, k, >>>>>>>>>> qorder, comm=None) >>>>>>>>>> fe_new = PETSc.FE().createLagrange(dim, dim, True, 2, >>>>>>>>>> PETSc.DETERMINE) >>>>>>>>>> plex.projectCoordinates(fe_new) >>>>>>>>>> fe_new.view() >>>>>>>>>> >>>>>>>>>> print('new bbox:', plex.getBoundingBox()) >>>>>>>>>> ``` >>>>>>>>>> >>>>>>>>>> The output is (The bounding box is changed!) 
>>>>>>>>>> ``` >>>>>>>>>> >>>>>>>>>> old bbox: ((0.0, 1.0), (0.0, 1.0), (0.0, 1.0)) >>>>>>>>>> PetscFE Object: P2 1 MPI processes >>>>>>>>>> type: basic >>>>>>>>>> Basic Finite Element in 3 dimensions with 3 components >>>>>>>>>> PetscSpace Object: P2 1 MPI processes >>>>>>>>>> type: sum >>>>>>>>>> Space in 3 variables with 3 components, size 30 >>>>>>>>>> Sum space of 3 concatenated subspaces (all identical) >>>>>>>>>> PetscSpace Object: sum component (sumcomp_) 1 MPI processes >>>>>>>>>> type: poly >>>>>>>>>> Space in 3 variables with 1 components, size 10 >>>>>>>>>> Polynomial space of degree 2 >>>>>>>>>> PetscDualSpace Object: P2 1 MPI processes >>>>>>>>>> type: lagrange >>>>>>>>>> Dual space with 3 components, size 30 >>>>>>>>>> Continuous Lagrange dual space >>>>>>>>>> Quadrature of order 5 on 27 points (dim 3) >>>>>>>>>> new bbox: ((-6.530133708576188e-17, 36.30670832662781), (-3.899962995254311e-17, 36.2406171632539), (-8.8036464152166e-17, 36.111577025012224)) >>>>>>>>>> >>>>>>>>>> ``` >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> By the way, for the original DG coordinates, where can I find the relation of the closure and the order of the dofs for the cell? >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks! >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Zongze >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Matthew Knepley ?2022?6?17??? 01:11??? >>>>>>>>>> >>>>>>>>>>> On Thu, Jun 16, 2022 at 12:06 PM Zongze Yang < >>>>>>>>>>> yangzongze at gmail.com> wrote: >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> ? 2022?6?16??23:22?Matthew Knepley ??? >>>>>>>>>>>> >>>>>>>>>>>> ? >>>>>>>>>>>> On Thu, Jun 16, 2022 at 11:11 AM Zongze Yang < >>>>>>>>>>>> yangzongze at gmail.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Hi, if I load a `gmsh` file with second-order elements, the >>>>>>>>>>>>> coordinates will be stored in a DG-P2 space. After obtaining the >>>>>>>>>>>>> coordinates of a cell, how can I map the coordinates to vertex and edge? >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> By default, they are stored as P2, not DG. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> I checked the coordinates vector, and found the dogs only >>>>>>>>>>>> defined on cell other than vertex and edge, so I said they are stored as DG. >>>>>>>>>>>> Then the function DMPlexVecGetClosure >>>>>>>>>>>> seems return >>>>>>>>>>>> the coordinates in lex order. >>>>>>>>>>>> >>>>>>>>>>>> Some code in reading gmsh file reads that >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> 1756: if (isSimplex) continuity = PETSC_FALSE >>>>>>>>>>>> ; /* >>>>>>>>>>>> XXX FIXME Requires DMPlexSetClosurePermutationLexicographic() */ >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> 1758: GmshCreateFE(comm, NULL, isSimplex, continuity, >>>>>>>>>>>> nodeType, dim, coordDim, order, &fe) >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> The continuity is set to false for simplex. >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Oh, yes. That needs to be fixed. For now, you can just project >>>>>>>>>>> it to P2 if you want using >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> https://petsc.org/main/docs/manualpages/DM/DMProjectCoordinates/ >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> >>>>>>>>>>> Matt >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Zongze >>>>>>>>>>>> >>>>>>>>>>>> You can ask for the coordinates of a vertex or an edge directly >>>>>>>>>>>> using >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> https://petsc.org/main/docs/manualpages/DMPLEX/DMPlexPointLocalRead/ >>>>>>>>>>>> >>>>>>>>>>>> by giving the vertex or edge point. 
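A small C sketch of that direct read is below. It assumes the coordinates carry dofs on vertices (the continuous case); for the cell-only DG layout discussed in this thread a vertex has no coordinate dofs and this does not apply. The helper name is made up for illustration.

```
#include <petscdmplex.h>

/* Print the coordinates stored at a single vertex point v (continuous layout assumed) */
static PetscErrorCode PrintVertexCoordinate(DM dm, PetscInt v)
{
  DM                 cdm;
  Vec                coords;
  const PetscScalar *array, *vc;
  PetscInt           cdim, d;

  PetscFunctionBeginUser;
  PetscCall(DMGetCoordinateDM(dm, &cdm)); /* the DM that carries the coordinate section */
  PetscCall(DMGetCoordinatesLocal(dm, &coords));
  PetscCall(DMGetCoordinateDim(dm, &cdim));
  PetscCall(VecGetArrayRead(coords, &array));
  PetscCall(DMPlexPointLocalRead(cdm, v, array, &vc)); /* vc now points at the dofs of v */
  PetscCall(PetscPrintf(PETSC_COMM_SELF, "vertex %" PetscInt_FMT ":", v));
  for (d = 0; d < cdim; ++d) PetscCall(PetscPrintf(PETSC_COMM_SELF, " %g", (double)PetscRealPart(vc[d])));
  PetscCall(PetscPrintf(PETSC_COMM_SELF, "\n"));
  PetscCall(VecRestoreArrayRead(coords, &array));
  PetscFunctionReturn(PETSC_SUCCESS);
}
```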
You can get all the >>>>>>>>>>>> coordinates on a cell, in the closure order, using >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> https://petsc.org/main/docs/manualpages/DMPLEX/DMPlexVecGetClosure/ >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> >>>>>>>>>>>> Matt >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> Below is some code load the gmsh file, I want to know the >>>>>>>>>>>>> relation between `cl` and `cell_coords`. >>>>>>>>>>>>> >>>>>>>>>>>>> ``` >>>>>>>>>>>>> import firedrake as fd >>>>>>>>>>>>> import numpy as np >>>>>>>>>>>>> >>>>>>>>>>>>> # Load gmsh file (2rd) >>>>>>>>>>>>> plex = fd.mesh._from_gmsh('test-fd-load-p2-rect.msh') >>>>>>>>>>>>> >>>>>>>>>>>>> cs, ce = plex.getHeightStratum(0) >>>>>>>>>>>>> >>>>>>>>>>>>> cdm = plex.getCoordinateDM() >>>>>>>>>>>>> csec = dm.getCoordinateSection() >>>>>>>>>>>>> coords_gvec = dm.getCoordinates() >>>>>>>>>>>>> >>>>>>>>>>>>> for i in range(cs, ce): >>>>>>>>>>>>> cell_coords = cdm.getVecClosure(csec, coords_gvec, i) >>>>>>>>>>>>> print(f'coordinates for cell {i} >>>>>>>>>>>>> :\n{cell_coords.reshape([-1, 3])}') >>>>>>>>>>>>> cl = dm.getTransitiveClosure(i) >>>>>>>>>>>>> print('closure:', cl) >>>>>>>>>>>>> break >>>>>>>>>>>>> ``` >>>>>>>>>>>>> >>>>>>>>>>>>> Best wishes, >>>>>>>>>>>>> Zongze >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> -- >>>>>>>>>>>> What most experimenters take for granted before they begin >>>>>>>>>>>> their experiments is infinitely more interesting than any results to which >>>>>>>>>>>> their experiments lead. >>>>>>>>>>>> -- Norbert Wiener >>>>>>>>>>>> >>>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>>>> experiments lead. >>>>>>>>>>> -- Norbert Wiener >>>>>>>>>>> >>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> What most experimenters take for granted before they begin their >>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>> experiments lead. >>>>>>>> -- Norbert Wiener >>>>>>>> >>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>> >>>>>>>> >>>>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon May 15 04:24:44 2023 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 15 May 2023 05:24:44 -0400 Subject: [petsc-users] How to find the map between the high order coordinates of DMPlex and vertex numbering? 
In-Reply-To: References: <2640A1A9-101C-4DFB-BFA4-C64AF231732A@gmail.com> Message-ID: On Sun, May 14, 2023 at 7:23?PM Zongze Yang wrote: > Could you try to project the coordinates into the continuity space by > enabling the option > `-dm_plex_gmsh_project_petscdualspace_lagrange_continuity true`? > There is a comment in the code about that: /* XXX FIXME Requires DMPlexSetClosurePermutationLexicographic() */ So what is currently done is you project into the discontinuous space from the GMsh coordinates, and then we get the continuous coordinates from those later. This is why we get the right answer. Thanks, Matt > Best wishes, > Zongze > > > On Mon, 15 May 2023 at 04:24, Matthew Knepley wrote: > >> On Sun, May 14, 2023 at 12:27?PM Zongze Yang >> wrote: >> >>> >>> >>> >>> On Sun, 14 May 2023 at 23:54, Matthew Knepley wrote: >>> >>>> On Sun, May 14, 2023 at 9:21?AM Zongze Yang >>>> wrote: >>>> >>>>> Hi, Matt, >>>>> >>>>> The issue has been resolved while testing on the latest version of >>>>> PETSc. It seems that the problem has been fixed in the following merge >>>>> request: https://gitlab.com/petsc/petsc/-/merge_requests/5970 >>>>> >>>> >>>> No problem. Glad it is working. >>>> >>>> >>>>> I sincerely apologize for any inconvenience caused by my previous >>>>> message. However, I would like to provide you with additional information >>>>> regarding the test files. Attached to this email, you will find two Gmsh >>>>> files: "square_2rd.msh" and "square_3rd.msh." These files contain >>>>> high-order triangulated mesh data for the unit square. >>>>> >>>>> ``` >>>>> $ ./ex33 -coord_space 0 -dm_plex_filename square_2rd.msh >>>>> -dm_plex_gmsh_project >>>>> -dm_plex_gmsh_project_petscdualspace_lagrange_continuity true >>>>> -dm_plex_gmsh_project_fe_view -volume 1 >>>>> PetscFE Object: P2 1 MPI process >>>>> type: basic >>>>> Basic Finite Element in 2 dimensions with 2 components >>>>> PetscSpace Object: P2 1 MPI process >>>>> type: sum >>>>> Space in 2 variables with 2 components, size 12 >>>>> Sum space of 2 concatenated subspaces (all identical) >>>>> PetscSpace Object: sum component (sumcomp_) 1 MPI process >>>>> type: poly >>>>> Space in 2 variables with 1 components, size 6 >>>>> Polynomial space of degree 2 >>>>> PetscDualSpace Object: P2 1 MPI process >>>>> type: lagrange >>>>> Dual space with 2 components, size 12 >>>>> Continuous Lagrange dual space >>>>> Quadrature on a triangle of order 5 on 9 points (dim 2) >>>>> Volume: 1. >>>>> $ ./ex33 -coord_space 0 -dm_plex_filename square_3rd.msh >>>>> -dm_plex_gmsh_project >>>>> -dm_plex_gmsh_project_petscdualspace_lagrange_continuity true >>>>> -dm_plex_gmsh_project_fe_view -volume 1 >>>>> PetscFE Object: P3 1 MPI process >>>>> type: basic >>>>> Basic Finite Element in 2 dimensions with 2 components >>>>> PetscSpace Object: P3 1 MPI process >>>>> type: sum >>>>> Space in 2 variables with 2 components, size 20 >>>>> Sum space of 2 concatenated subspaces (all identical) >>>>> PetscSpace Object: sum component (sumcomp_) 1 MPI process >>>>> type: poly >>>>> Space in 2 variables with 1 components, size 10 >>>>> Polynomial space of degree 3 >>>>> PetscDualSpace Object: P3 1 MPI process >>>>> type: lagrange >>>>> Dual space with 2 components, size 20 >>>>> Continuous Lagrange dual space >>>>> Quadrature on a triangle of order 7 on 16 points (dim 2) >>>>> Volume: 1. >>>>> ``` >>>>> >>>>> Thank you for your attention and understanding. I apologize once again >>>>> for my previous oversight. >>>>> >>>> >>>> Great! 
If you make an MR for this, you will be included on the next >>>> list of PETSc contributors. Otherwise, I can do it. >>>> >>>> >>> I appreciate your offer to handle the MR. Please go ahead and take care >>> of it. Thank you! >>> >> >> I have created the MR with your tests. They are working for me: >> >> https://gitlab.com/petsc/petsc/-/merge_requests/6463 >> >> Thanks, >> >> Matt >> >> >>> Best Wishes, >>> Zongze >>> >>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Best wishes, >>>>> Zongze >>>>> >>>>> >>>>> On Sun, 14 May 2023 at 16:44, Matthew Knepley >>>>> wrote: >>>>> >>>>>> On Sat, May 13, 2023 at 6:08?AM Zongze Yang >>>>>> wrote: >>>>>> >>>>>>> Hi, Matt, >>>>>>> >>>>>>> There seem to be ongoing issues with projecting high-order >>>>>>> coordinates from a gmsh file to other spaces. I would like to inquire >>>>>>> whether there are any plans to resolve this problem. >>>>>>> >>>>>>> Thank you for your attention to this matter. >>>>>>> >>>>>> >>>>>> Yes, I will look at it. The important thing is to have a good test. >>>>>> Here are the higher order geometry tests >>>>>> >>>>>> >>>>>> https://gitlab.com/petsc/petsc/-/blob/main/src/dm/impls/plex/tests/ex33.c >>>>>> >>>>>> I take shapes with known volume, mesh them with higher order >>>>>> geometry, and look at the convergence to the true volume. Could you add a >>>>>> GMsh test, meaning the .msh file and known volume, and I will fix it? >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>>> >>>>>>> Best wishes, >>>>>>> Zongze >>>>>>> >>>>>>> >>>>>>> On Sat, 18 Jun 2022 at 20:31, Zongze Yang >>>>>>> wrote: >>>>>>> >>>>>>>> Thank you for your reply. May I ask for some references on the >>>>>>>> order of the dofs on PETSc's FE Space (especially high order elements)? >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Zongze >>>>>>>> >>>>>>>> Matthew Knepley ?2022?6?18??? 20:02??? >>>>>>>> >>>>>>>>> On Sat, Jun 18, 2022 at 2:16 AM Zongze Yang >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> In order to check if I made mistakes in the python code, I try to >>>>>>>>>> use c code to show the issue on DMProjectCoordinates. The code and mesh >>>>>>>>>> file is attached. >>>>>>>>>> If the code is correct, there must be something wrong with >>>>>>>>>> `DMProjectCoordinates` or `DMPlexCreateGmshFromFile` for high-order mesh. >>>>>>>>>> >>>>>>>>> >>>>>>>>> Something is definitely wrong with high order, periodic simplices >>>>>>>>> from Gmsh. We had not tested that case. I am at a conference and cannot >>>>>>>>> look at it for a week. >>>>>>>>> My suspicion is that the space we make when reading in the Gmsh >>>>>>>>> coordinates does not match the values (wrong order). >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Matt >>>>>>>>> >>>>>>>>> >>>>>>>>>> The command and the output are listed below: (Obviously the >>>>>>>>>> bounding box is changed.) >>>>>>>>>> ``` >>>>>>>>>> $ ./test_gmsh_load_2rd -filename cube-p2.msh -old_fe_view >>>>>>>>>> -new_fe_view >>>>>>>>>> Old Bounding Box: >>>>>>>>>> 0: lo = 0. hi = 1. >>>>>>>>>> 1: lo = 0. hi = 1. >>>>>>>>>> 2: lo = 0. hi = 1. 
>>>>>>>>>> PetscFE Object: OldCoordinatesFE 1 MPI processes >>>>>>>>>> type: basic >>>>>>>>>> Basic Finite Element in 3 dimensions with 3 components >>>>>>>>>> PetscSpace Object: P2 1 MPI processes >>>>>>>>>> type: sum >>>>>>>>>> Space in 3 variables with 3 components, size 30 >>>>>>>>>> Sum space of 3 concatenated subspaces (all identical) >>>>>>>>>> PetscSpace Object: sum component (sumcomp_) 1 MPI processes >>>>>>>>>> type: poly >>>>>>>>>> Space in 3 variables with 1 components, size 10 >>>>>>>>>> Polynomial space of degree 2 >>>>>>>>>> PetscDualSpace Object: P2 1 MPI processes >>>>>>>>>> type: lagrange >>>>>>>>>> Dual space with 3 components, size 30 >>>>>>>>>> Discontinuous Lagrange dual space >>>>>>>>>> Quadrature of order 5 on 27 points (dim 3) >>>>>>>>>> PetscFE Object: NewCoordinatesFE 1 MPI processes >>>>>>>>>> type: basic >>>>>>>>>> Basic Finite Element in 3 dimensions with 3 components >>>>>>>>>> PetscSpace Object: P2 1 MPI processes >>>>>>>>>> type: sum >>>>>>>>>> Space in 3 variables with 3 components, size 30 >>>>>>>>>> Sum space of 3 concatenated subspaces (all identical) >>>>>>>>>> PetscSpace Object: sum component (sumcomp_) 1 MPI processes >>>>>>>>>> type: poly >>>>>>>>>> Space in 3 variables with 1 components, size 10 >>>>>>>>>> Polynomial space of degree 2 >>>>>>>>>> PetscDualSpace Object: P2 1 MPI processes >>>>>>>>>> type: lagrange >>>>>>>>>> Dual space with 3 components, size 30 >>>>>>>>>> Continuous Lagrange dual space >>>>>>>>>> Quadrature of order 5 on 27 points (dim 3) >>>>>>>>>> New Bounding Box: >>>>>>>>>> 0: lo = 2.5624e-17 hi = 8. >>>>>>>>>> 1: lo = -9.23372e-17 hi = 7. >>>>>>>>>> 2: lo = 2.72091e-17 hi = 8.5 >>>>>>>>>> ``` >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Zongze >>>>>>>>>> >>>>>>>>>> Zongze Yang ?2022?6?17??? 14:54??? >>>>>>>>>> >>>>>>>>>>> I tried the projection operation. However, it seems that the >>>>>>>>>>> projection gives the wrong solution. After projection, the bounding box is >>>>>>>>>>> changed! See logs below. 
>>>>>>>>>>> >>>>>>>>>>> First, I patch the petsc4py by adding `DMProjectCoordinates`: >>>>>>>>>>> ``` >>>>>>>>>>> diff --git a/src/binding/petsc4py/src/PETSc/DM.pyx >>>>>>>>>>> b/src/binding/petsc4py/src/PETSc/DM.pyx >>>>>>>>>>> index d8a58d183a..dbcdb280f1 100644 >>>>>>>>>>> --- a/src/binding/petsc4py/src/PETSc/DM.pyx >>>>>>>>>>> +++ b/src/binding/petsc4py/src/PETSc/DM.pyx >>>>>>>>>>> @@ -307,6 +307,12 @@ cdef class DM(Object): >>>>>>>>>>> PetscINCREF(c.obj) >>>>>>>>>>> return c >>>>>>>>>>> >>>>>>>>>>> + def projectCoordinates(self, FE fe=None): >>>>>>>>>>> + if fe is None: >>>>>>>>>>> + CHKERR( DMProjectCoordinates(self.dm, NULL) ) >>>>>>>>>>> + else: >>>>>>>>>>> + CHKERR( DMProjectCoordinates(self.dm, fe.fe) ) >>>>>>>>>>> + >>>>>>>>>>> def getBoundingBox(self): >>>>>>>>>>> cdef PetscInt i,dim=0 >>>>>>>>>>> CHKERR( DMGetCoordinateDim(self.dm, &dim) ) >>>>>>>>>>> diff --git a/src/binding/petsc4py/src/PETSc/petscdm.pxi >>>>>>>>>>> b/src/binding/petsc4py/src/PETSc/petscdm.pxi >>>>>>>>>>> index 514b6fa472..c778e39884 100644 >>>>>>>>>>> --- a/src/binding/petsc4py/src/PETSc/petscdm.pxi >>>>>>>>>>> +++ b/src/binding/petsc4py/src/PETSc/petscdm.pxi >>>>>>>>>>> @@ -90,6 +90,7 @@ cdef extern from * nogil: >>>>>>>>>>> int DMGetCoordinateDim(PetscDM,PetscInt*) >>>>>>>>>>> int DMSetCoordinateDim(PetscDM,PetscInt) >>>>>>>>>>> int DMLocalizeCoordinates(PetscDM) >>>>>>>>>>> + int DMProjectCoordinates(PetscDM, PetscFE) >>>>>>>>>>> >>>>>>>>>>> int >>>>>>>>>>> DMCreateInterpolation(PetscDM,PetscDM,PetscMat*,PetscVec*) >>>>>>>>>>> int DMCreateInjection(PetscDM,PetscDM,PetscMat*) >>>>>>>>>>> ``` >>>>>>>>>>> >>>>>>>>>>> Then in python, I load a mesh and project the coordinates to P2: >>>>>>>>>>> ``` >>>>>>>>>>> import firedrake as fd >>>>>>>>>>> from firedrake.petsc import PETSc >>>>>>>>>>> >>>>>>>>>>> # plex = fd.mesh._from_gmsh('test-fd-load-p2.msh') >>>>>>>>>>> plex = fd.mesh._from_gmsh('test-fd-load-p2-rect.msh') >>>>>>>>>>> print('old bbox:', plex.getBoundingBox()) >>>>>>>>>>> >>>>>>>>>>> dim = plex.getDimension() >>>>>>>>>>> # (dim, nc, isSimplex, k, >>>>>>>>>>> qorder, comm=None) >>>>>>>>>>> fe_new = PETSc.FE().createLagrange(dim, dim, True, 2, >>>>>>>>>>> PETSc.DETERMINE) >>>>>>>>>>> plex.projectCoordinates(fe_new) >>>>>>>>>>> fe_new.view() >>>>>>>>>>> >>>>>>>>>>> print('new bbox:', plex.getBoundingBox()) >>>>>>>>>>> ``` >>>>>>>>>>> >>>>>>>>>>> The output is (The bounding box is changed!) 
>>>>>>>>>>> ``` >>>>>>>>>>> >>>>>>>>>>> old bbox: ((0.0, 1.0), (0.0, 1.0), (0.0, 1.0)) >>>>>>>>>>> PetscFE Object: P2 1 MPI processes >>>>>>>>>>> type: basic >>>>>>>>>>> Basic Finite Element in 3 dimensions with 3 components >>>>>>>>>>> PetscSpace Object: P2 1 MPI processes >>>>>>>>>>> type: sum >>>>>>>>>>> Space in 3 variables with 3 components, size 30 >>>>>>>>>>> Sum space of 3 concatenated subspaces (all identical) >>>>>>>>>>> PetscSpace Object: sum component (sumcomp_) 1 MPI processes >>>>>>>>>>> type: poly >>>>>>>>>>> Space in 3 variables with 1 components, size 10 >>>>>>>>>>> Polynomial space of degree 2 >>>>>>>>>>> PetscDualSpace Object: P2 1 MPI processes >>>>>>>>>>> type: lagrange >>>>>>>>>>> Dual space with 3 components, size 30 >>>>>>>>>>> Continuous Lagrange dual space >>>>>>>>>>> Quadrature of order 5 on 27 points (dim 3) >>>>>>>>>>> new bbox: ((-6.530133708576188e-17, 36.30670832662781), (-3.899962995254311e-17, 36.2406171632539), (-8.8036464152166e-17, 36.111577025012224)) >>>>>>>>>>> >>>>>>>>>>> ``` >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> By the way, for the original DG coordinates, where can I find the relation of the closure and the order of the dofs for the cell? >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Thanks! >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Zongze >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Matthew Knepley ?2022?6?17??? 01:11??? >>>>>>>>>>> >>>>>>>>>>>> On Thu, Jun 16, 2022 at 12:06 PM Zongze Yang < >>>>>>>>>>>> yangzongze at gmail.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> ? 2022?6?16??23:22?Matthew Knepley ??? >>>>>>>>>>>>> >>>>>>>>>>>>> ? >>>>>>>>>>>>> On Thu, Jun 16, 2022 at 11:11 AM Zongze Yang < >>>>>>>>>>>>> yangzongze at gmail.com> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Hi, if I load a `gmsh` file with second-order elements, the >>>>>>>>>>>>>> coordinates will be stored in a DG-P2 space. After obtaining the >>>>>>>>>>>>>> coordinates of a cell, how can I map the coordinates to vertex and edge? >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> By default, they are stored as P2, not DG. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> I checked the coordinates vector, and found the dogs only >>>>>>>>>>>>> defined on cell other than vertex and edge, so I said they are stored as DG. >>>>>>>>>>>>> Then the function DMPlexVecGetClosure >>>>>>>>>>>>> seems return >>>>>>>>>>>>> the coordinates in lex order. >>>>>>>>>>>>> >>>>>>>>>>>>> Some code in reading gmsh file reads that >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> 1756: if (isSimplex) continuity = PETSC_FALSE >>>>>>>>>>>>> ; >>>>>>>>>>>>> /* XXX FIXME Requires DMPlexSetClosurePermutationLexicographic() */ >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> 1758: GmshCreateFE(comm, NULL, isSimplex, continuity, >>>>>>>>>>>>> nodeType, dim, coordDim, order, &fe) >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> The continuity is set to false for simplex. >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Oh, yes. That needs to be fixed. For now, you can just project >>>>>>>>>>>> it to P2 if you want using >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> https://petsc.org/main/docs/manualpages/DM/DMProjectCoordinates/ >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> >>>>>>>>>>>> Matt >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> Zongze >>>>>>>>>>>>> >>>>>>>>>>>>> You can ask for the coordinates of a vertex or an edge >>>>>>>>>>>>> directly using >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> https://petsc.org/main/docs/manualpages/DMPLEX/DMPlexPointLocalRead/ >>>>>>>>>>>>> >>>>>>>>>>>>> by giving the vertex or edge point. 
You can get all the >>>>>>>>>>>>> coordinates on a cell, in the closure order, using >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> https://petsc.org/main/docs/manualpages/DMPLEX/DMPlexVecGetClosure/ >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> >>>>>>>>>>>>> Matt >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> Below is some code load the gmsh file, I want to know the >>>>>>>>>>>>>> relation between `cl` and `cell_coords`. >>>>>>>>>>>>>> >>>>>>>>>>>>>> ``` >>>>>>>>>>>>>> import firedrake as fd >>>>>>>>>>>>>> import numpy as np >>>>>>>>>>>>>> >>>>>>>>>>>>>> # Load gmsh file (2rd) >>>>>>>>>>>>>> plex = fd.mesh._from_gmsh('test-fd-load-p2-rect.msh') >>>>>>>>>>>>>> >>>>>>>>>>>>>> cs, ce = plex.getHeightStratum(0) >>>>>>>>>>>>>> >>>>>>>>>>>>>> cdm = plex.getCoordinateDM() >>>>>>>>>>>>>> csec = dm.getCoordinateSection() >>>>>>>>>>>>>> coords_gvec = dm.getCoordinates() >>>>>>>>>>>>>> >>>>>>>>>>>>>> for i in range(cs, ce): >>>>>>>>>>>>>> cell_coords = cdm.getVecClosure(csec, coords_gvec, i) >>>>>>>>>>>>>> print(f'coordinates for cell {i} >>>>>>>>>>>>>> :\n{cell_coords.reshape([-1, 3])}') >>>>>>>>>>>>>> cl = dm.getTransitiveClosure(i) >>>>>>>>>>>>>> print('closure:', cl) >>>>>>>>>>>>>> break >>>>>>>>>>>>>> ``` >>>>>>>>>>>>>> >>>>>>>>>>>>>> Best wishes, >>>>>>>>>>>>>> Zongze >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> -- >>>>>>>>>>>>> What most experimenters take for granted before they begin >>>>>>>>>>>>> their experiments is infinitely more interesting than any results to which >>>>>>>>>>>>> their experiments lead. >>>>>>>>>>>>> -- Norbert Wiener >>>>>>>>>>>>> >>>>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> -- >>>>>>>>>>>> What most experimenters take for granted before they begin >>>>>>>>>>>> their experiments is infinitely more interesting than any results to which >>>>>>>>>>>> their experiments lead. >>>>>>>>>>>> -- Norbert Wiener >>>>>>>>>>>> >>>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>> experiments lead. >>>>>>>>> -- Norbert Wiener >>>>>>>>> >>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>> >>>>>> >>>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Mon May 15 07:02:51 2023 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 15 May 2023 08:02:51 -0400 Subject: [petsc-users] Issues creating DMPlex from higher order mesh generated by gmsh In-Reply-To: <3d6762c6d99c4d319e8e985c91bc739e@solid.lth.se> References: <3d6762c6d99c4d319e8e985c91bc739e@solid.lth.se> Message-ID: On Fri, May 5, 2023 at 10:55?AM Vilmer Dahlberg via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hi. > > > I'm trying to read a mesh of higher element order, in this example a mesh > consisting of 10-node tetrahedral elements, from gmsh, into PETSC. But It > looks like the mesh is not properly being loaded and converted into a > DMPlex. gmsh tells me it has generated a mesh with 7087 nodes, but when I > view my dm object it tells me it has 1081 0-cells. This is the printout I > get > Hi Vilmer, Plex makes a distinction between topological entities, like vertices, edges and cells, and the function spaces used to represent fields, like velocity or coordinates. When formats use "nodes", they mix the two concepts together. You see that if you add the number of vertices and edges, you get 7087, since for P2 there is a "node" on every edge. Is anything else wrong? Thanks, Matt > ... > > > Info : Done meshing order 2 (Wall 0.0169823s, CPU 0.016662s) > > Info : 7087 nodes 5838 elements > > ... > > DM Object: DM_0x84000000_0 1 MPI process > type: plex > DM_0x84000000_0 in 3 dimensions: > Number of 0-cells per rank: 1081 > Number of 1-cells per rank: 6006 > Number of 2-cells per rank: 9104 > Number of 3-cells per rank: 4178 > Labels: > celltype: 4 strata with value/size (0 (1081), 6 (4178), 3 (9104), 1 > (6006)) > depth: 4 strata with value/size (0 (1081), 1 (6006), 2 (9104), 3 (4178)) > Cell Sets: 1 strata with value/size (2 (4178)) > Face Sets: 6 strata with value/size (12 (190), 21 (242), 20 (242), 11 > (192), 22 (242), 10 (188)) > Field P2: > adjacency FEM > ... > > > To replicate the error try generating a mesh according to > > > https://gmsh.info/doc/texinfo/gmsh.html#t5 > > > > > setting the element order to 2, and then loading the mesh using > > > DMPlexCreateGmshFromFile > > > I don't have any issues when i set the element order to 1. > > > Thanks in advance, > > Vilmer > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Mon May 15 08:30:31 2023 From: jed at jedbrown.org (Jed Brown) Date: Mon, 15 May 2023 07:30:31 -0600 Subject: [petsc-users] Issues creating DMPlex from higher order mesh generated by gmsh In-Reply-To: References: <3d6762c6d99c4d319e8e985c91bc739e@solid.lth.se> Message-ID: <87o7mln2mg.fsf@jedbrown.org> Matthew Knepley writes: > On Fri, May 5, 2023 at 10:55?AM Vilmer Dahlberg via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> Hi. >> >> >> I'm trying to read a mesh of higher element order, in this example a mesh >> consisting of 10-node tetrahedral elements, from gmsh, into PETSC. But It >> looks like the mesh is not properly being loaded and converted into a >> DMPlex. gmsh tells me it has generated a mesh with 7087 nodes, but when I >> view my dm object it tells me it has 1081 0-cells. 
This is the printout I >> get >> > > Hi Vilmer, > > Plex makes a distinction between topological entities, like vertices, edges > and cells, and the function spaces used to represent fields, like velocity > or coordinates. When formats use "nodes", they mix the two concepts > together. > > You see that if you add the number of vertices and edges, you get 7087, > since for P2 there is a "node" on every edge. Is anything else wrong? Note that quadratic (and higher order) tets are broken with the Gmsh reader. It's been on my todo list for a while. As an example, this works when using linear elements (the projection makes them quadratic and visualization is correct), but is tangled when holes.msh is quadratic. $ $PETSC_ARCH/tests/dm/impls/plex/tutorials/ex1 -dm_plex_filename ~/meshes/holes.msh -dm_view cgns:s.cgns -dm_coord_petscspace_degree 2 -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: holes.geo URL: From knepley at gmail.com Mon May 15 08:42:20 2023 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 15 May 2023 09:42:20 -0400 Subject: [petsc-users] Issues creating DMPlex from higher order mesh generated by gmsh In-Reply-To: <87o7mln2mg.fsf@jedbrown.org> References: <3d6762c6d99c4d319e8e985c91bc739e@solid.lth.se> <87o7mln2mg.fsf@jedbrown.org> Message-ID: On Mon, May 15, 2023 at 9:30?AM Jed Brown wrote: > Matthew Knepley writes: > > > On Fri, May 5, 2023 at 10:55?AM Vilmer Dahlberg via petsc-users < > > petsc-users at mcs.anl.gov> wrote: > > > >> Hi. > >> > >> > >> I'm trying to read a mesh of higher element order, in this example a > mesh > >> consisting of 10-node tetrahedral elements, from gmsh, into PETSC. But > It > >> looks like the mesh is not properly being loaded and converted into a > >> DMPlex. gmsh tells me it has generated a mesh with 7087 nodes, but when > I > >> view my dm object it tells me it has 1081 0-cells. This is the printout > I > >> get > >> > > > > Hi Vilmer, > > > > Plex makes a distinction between topological entities, like vertices, > edges > > and cells, and the function spaces used to represent fields, like > velocity > > or coordinates. When formats use "nodes", they mix the two concepts > > together. > > > > You see that if you add the number of vertices and edges, you get 7087, > > since for P2 there is a "node" on every edge. Is anything else wrong? > > Note that quadratic (and higher order) tets are broken with the Gmsh > reader. It's been on my todo list for a while. > > As an example, this works when using linear elements (the projection makes > them quadratic and visualization is correct), but is tangled when holes.msh > is quadratic. > > $ $PETSC_ARCH/tests/dm/impls/plex/tutorials/ex1 -dm_plex_filename > ~/meshes/holes.msh -dm_view cgns:s.cgns -dm_coord_petscspace_degree 2 > Projection to the continuous space is broken because we do not have the lexicographic order on simplicies done. Are you sure you are projecting into the broken space? Thanks, Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From yangzongze at gmail.com Mon May 15 08:54:55 2023 From: yangzongze at gmail.com (Zongze Yang) Date: Mon, 15 May 2023 21:54:55 +0800 Subject: [petsc-users] How to find the map between the high order coordinates of DMPlex and vertex numbering? 
In-Reply-To: References: <2640A1A9-101C-4DFB-BFA4-C64AF231732A@gmail.com> Message-ID: On Mon, 15 May 2023 at 17:24, Matthew Knepley wrote: > On Sun, May 14, 2023 at 7:23?PM Zongze Yang wrote: > >> Could you try to project the coordinates into the continuity space by >> enabling the option >> `-dm_plex_gmsh_project_petscdualspace_lagrange_continuity true`? >> > > There is a comment in the code about that: > > /* XXX FIXME Requires DMPlexSetClosurePermutationLexicographic() */ > > So what is currently done is you project into the discontinuous space from > the GMsh coordinates, > and then we get the continuous coordinates from those later. This is why > we get the right answer. > > Sorry, I'm having difficulty understanding the comment and fully understanding your intended meaning. Are you saying that we can only project the space to a discontinuous space? Additionally, should we always set `dm_plex_gmsh_project_petscdualspace_lagrange_continuity` to false for high-order gmsh files? With the option set to `true`, I got the following error: ``` $ $PETSC_DIR/$PETSC_ARCH/tests/dm/impls/plex/tests/runex33_gmsh_3d_q2.sh -e "-dm_plex_gmsh_project_petscdualspace_lagrange_continuity true" not ok dm_impls_plex_tests-ex33_gmsh_3d_q2 # Error code: 77 # Volume: 0.46875 # [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- # [0]PETSC ERROR: Petsc has generated inconsistent data # [0]PETSC ERROR: Calculated volume 0.46875 != 1. actual volume (error 0.53125 > 1e-06 tol) # [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. # [0]PETSC ERROR: Petsc Development GIT revision: v3.19.1-294-g9cc24bc9b93 GIT Date: 2023-05-15 12:07:10 +0000 # [0]PETSC ERROR: ../ex33 on a arch-linux-c-debug named AMA-PC-RA18 by yzz Mon May 15 21:53:43 2023 # [0]PETSC ERROR: Configure options --CFLAGS=-I/opt/intel/oneapi/mkl/latest/include --CXXFLAGS=-I/opt/intel/oneapi/mkl/latest/include --LDFLAGS="-Wl,-rpath,/opt/intel/oneapi/mkl/latest/lib/intel64 -L/opt/intel/oneapi/mkl/latest/lib/intel64" --download-bison --download-chaco --download-cmake --download-eigen="/home/yzz/firedrake/complex-int32-mkl-X-debug/src/eigen-3.3.3.tgz " --download-fftw --download-hdf5 --download-hpddm --download-hwloc --download-libpng --download-metis --download-mmg --download-mpich --download-mumps --download-netcdf --download-p4est --download-parmmg --download-pastix --download-pnetcdf --download-ptscotch --download-scalapack --download-slepc --download-suitesparse --download-superlu_dist --download-tetgen --download-triangle --with-blaslapack-dir=/opt/intel/oneapi/mkl/latest --with-c2html=0 --with-debugging=1 --with-fortran-bindings=0 --with-mkl_cpardiso-dir=/opt/intel/oneapi/mkl/latest --with-mkl_pardiso-dir=/opt/intel/oneapi/mkl/latest --with-scalar-type=complex --with-shared-libraries=1 --with-x=1 --with-zlib PETSC_ARCH=arch-linux-c-debug # [0]PETSC ERROR: #1 CheckVolume() at /home/yzz/opt/petsc/src/dm/impls/plex/tests/ex33.c:246 # [0]PETSC ERROR: #2 main() at /home/yzz/opt/petsc/src/dm/impls/plex/tests/ex33.c:261 # [0]PETSC ERROR: PETSc Option Table entries: # [0]PETSC ERROR: -coord_space 0 (source: command line) # [0]PETSC ERROR: -dm_plex_filename /home/yzz/opt/petsc/share/petsc/datafiles/meshes/cube_q2.msh (source: command line) # [0]PETSC ERROR: -dm_plex_gmsh_project (source: command line) # [0]PETSC ERROR: -dm_plex_gmsh_project_petscdualspace_lagrange_continuity true (source: command line) # [0]PETSC ERROR: -tol 1e-6 (source: command line) # [0]PETSC ERROR: -volume 
1.0 (source: command line) # [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- # application called MPI_Abort(MPI_COMM_SELF, 77) - process 0 ok dm_impls_plex_tests-ex33_gmsh_3d_q2 # SKIP Command failed so no diff ``` Best wishes, Zongze Thanks, > > Matt > > >> Best wishes, >> Zongze >> >> >> On Mon, 15 May 2023 at 04:24, Matthew Knepley wrote: >> >>> On Sun, May 14, 2023 at 12:27?PM Zongze Yang >>> wrote: >>> >>>> >>>> >>>> >>>> On Sun, 14 May 2023 at 23:54, Matthew Knepley >>>> wrote: >>>> >>>>> On Sun, May 14, 2023 at 9:21?AM Zongze Yang >>>>> wrote: >>>>> >>>>>> Hi, Matt, >>>>>> >>>>>> The issue has been resolved while testing on the latest version of >>>>>> PETSc. It seems that the problem has been fixed in the following merge >>>>>> request: https://gitlab.com/petsc/petsc/-/merge_requests/5970 >>>>>> >>>>> >>>>> No problem. Glad it is working. >>>>> >>>>> >>>>>> I sincerely apologize for any inconvenience caused by my previous >>>>>> message. However, I would like to provide you with additional information >>>>>> regarding the test files. Attached to this email, you will find two Gmsh >>>>>> files: "square_2rd.msh" and "square_3rd.msh." These files contain >>>>>> high-order triangulated mesh data for the unit square. >>>>>> >>>>>> ``` >>>>>> $ ./ex33 -coord_space 0 -dm_plex_filename square_2rd.msh >>>>>> -dm_plex_gmsh_project >>>>>> -dm_plex_gmsh_project_petscdualspace_lagrange_continuity true >>>>>> -dm_plex_gmsh_project_fe_view -volume 1 >>>>>> PetscFE Object: P2 1 MPI process >>>>>> type: basic >>>>>> Basic Finite Element in 2 dimensions with 2 components >>>>>> PetscSpace Object: P2 1 MPI process >>>>>> type: sum >>>>>> Space in 2 variables with 2 components, size 12 >>>>>> Sum space of 2 concatenated subspaces (all identical) >>>>>> PetscSpace Object: sum component (sumcomp_) 1 MPI process >>>>>> type: poly >>>>>> Space in 2 variables with 1 components, size 6 >>>>>> Polynomial space of degree 2 >>>>>> PetscDualSpace Object: P2 1 MPI process >>>>>> type: lagrange >>>>>> Dual space with 2 components, size 12 >>>>>> Continuous Lagrange dual space >>>>>> Quadrature on a triangle of order 5 on 9 points (dim 2) >>>>>> Volume: 1. >>>>>> $ ./ex33 -coord_space 0 -dm_plex_filename square_3rd.msh >>>>>> -dm_plex_gmsh_project >>>>>> -dm_plex_gmsh_project_petscdualspace_lagrange_continuity true >>>>>> -dm_plex_gmsh_project_fe_view -volume 1 >>>>>> PetscFE Object: P3 1 MPI process >>>>>> type: basic >>>>>> Basic Finite Element in 2 dimensions with 2 components >>>>>> PetscSpace Object: P3 1 MPI process >>>>>> type: sum >>>>>> Space in 2 variables with 2 components, size 20 >>>>>> Sum space of 2 concatenated subspaces (all identical) >>>>>> PetscSpace Object: sum component (sumcomp_) 1 MPI process >>>>>> type: poly >>>>>> Space in 2 variables with 1 components, size 10 >>>>>> Polynomial space of degree 3 >>>>>> PetscDualSpace Object: P3 1 MPI process >>>>>> type: lagrange >>>>>> Dual space with 2 components, size 20 >>>>>> Continuous Lagrange dual space >>>>>> Quadrature on a triangle of order 7 on 16 points (dim 2) >>>>>> Volume: 1. >>>>>> ``` >>>>>> >>>>>> Thank you for your attention and understanding. I apologize once >>>>>> again for my previous oversight. >>>>>> >>>>> >>>>> Great! If you make an MR for this, you will be included on the next >>>>> list of PETSc contributors. Otherwise, I can do it. >>>>> >>>>> >>>> I appreciate your offer to handle the MR. Please go ahead and take care >>>> of it. Thank you! 
>>>> >>> >>> I have created the MR with your tests. They are working for me: >>> >>> https://gitlab.com/petsc/petsc/-/merge_requests/6463 >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Best Wishes, >>>> Zongze >>>> >>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> Best wishes, >>>>>> Zongze >>>>>> >>>>>> >>>>>> On Sun, 14 May 2023 at 16:44, Matthew Knepley >>>>>> wrote: >>>>>> >>>>>>> On Sat, May 13, 2023 at 6:08?AM Zongze Yang >>>>>>> wrote: >>>>>>> >>>>>>>> Hi, Matt, >>>>>>>> >>>>>>>> There seem to be ongoing issues with projecting high-order >>>>>>>> coordinates from a gmsh file to other spaces. I would like to inquire >>>>>>>> whether there are any plans to resolve this problem. >>>>>>>> >>>>>>>> Thank you for your attention to this matter. >>>>>>>> >>>>>>> >>>>>>> Yes, I will look at it. The important thing is to have a good test. >>>>>>> Here are the higher order geometry tests >>>>>>> >>>>>>> >>>>>>> https://gitlab.com/petsc/petsc/-/blob/main/src/dm/impls/plex/tests/ex33.c >>>>>>> >>>>>>> I take shapes with known volume, mesh them with higher order >>>>>>> geometry, and look at the convergence to the true volume. Could you add a >>>>>>> GMsh test, meaning the .msh file and known volume, and I will fix it? >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Matt >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> Best wishes, >>>>>>>> Zongze >>>>>>>> >>>>>>>> >>>>>>>> On Sat, 18 Jun 2022 at 20:31, Zongze Yang >>>>>>>> wrote: >>>>>>>> >>>>>>>>> Thank you for your reply. May I ask for some references on the >>>>>>>>> order of the dofs on PETSc's FE Space (especially high order elements)? >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Zongze >>>>>>>>> >>>>>>>>> Matthew Knepley ?2022?6?18??? 20:02??? >>>>>>>>> >>>>>>>>>> On Sat, Jun 18, 2022 at 2:16 AM Zongze Yang >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> In order to check if I made mistakes in the python code, I try >>>>>>>>>>> to use c code to show the issue on DMProjectCoordinates. The code and mesh >>>>>>>>>>> file is attached. >>>>>>>>>>> If the code is correct, there must be something wrong with >>>>>>>>>>> `DMProjectCoordinates` or `DMPlexCreateGmshFromFile` for high-order mesh. >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Something is definitely wrong with high order, periodic simplices >>>>>>>>>> from Gmsh. We had not tested that case. I am at a conference and cannot >>>>>>>>>> look at it for a week. >>>>>>>>>> My suspicion is that the space we make when reading in the Gmsh >>>>>>>>>> coordinates does not match the values (wrong order). >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> Matt >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> The command and the output are listed below: (Obviously the >>>>>>>>>>> bounding box is changed.) >>>>>>>>>>> ``` >>>>>>>>>>> $ ./test_gmsh_load_2rd -filename cube-p2.msh -old_fe_view >>>>>>>>>>> -new_fe_view >>>>>>>>>>> Old Bounding Box: >>>>>>>>>>> 0: lo = 0. hi = 1. >>>>>>>>>>> 1: lo = 0. hi = 1. >>>>>>>>>>> 2: lo = 0. hi = 1. 
>>>>>>>>>>> PetscFE Object: OldCoordinatesFE 1 MPI processes >>>>>>>>>>> type: basic >>>>>>>>>>> Basic Finite Element in 3 dimensions with 3 components >>>>>>>>>>> PetscSpace Object: P2 1 MPI processes >>>>>>>>>>> type: sum >>>>>>>>>>> Space in 3 variables with 3 components, size 30 >>>>>>>>>>> Sum space of 3 concatenated subspaces (all identical) >>>>>>>>>>> PetscSpace Object: sum component (sumcomp_) 1 MPI processes >>>>>>>>>>> type: poly >>>>>>>>>>> Space in 3 variables with 1 components, size 10 >>>>>>>>>>> Polynomial space of degree 2 >>>>>>>>>>> PetscDualSpace Object: P2 1 MPI processes >>>>>>>>>>> type: lagrange >>>>>>>>>>> Dual space with 3 components, size 30 >>>>>>>>>>> Discontinuous Lagrange dual space >>>>>>>>>>> Quadrature of order 5 on 27 points (dim 3) >>>>>>>>>>> PetscFE Object: NewCoordinatesFE 1 MPI processes >>>>>>>>>>> type: basic >>>>>>>>>>> Basic Finite Element in 3 dimensions with 3 components >>>>>>>>>>> PetscSpace Object: P2 1 MPI processes >>>>>>>>>>> type: sum >>>>>>>>>>> Space in 3 variables with 3 components, size 30 >>>>>>>>>>> Sum space of 3 concatenated subspaces (all identical) >>>>>>>>>>> PetscSpace Object: sum component (sumcomp_) 1 MPI processes >>>>>>>>>>> type: poly >>>>>>>>>>> Space in 3 variables with 1 components, size 10 >>>>>>>>>>> Polynomial space of degree 2 >>>>>>>>>>> PetscDualSpace Object: P2 1 MPI processes >>>>>>>>>>> type: lagrange >>>>>>>>>>> Dual space with 3 components, size 30 >>>>>>>>>>> Continuous Lagrange dual space >>>>>>>>>>> Quadrature of order 5 on 27 points (dim 3) >>>>>>>>>>> New Bounding Box: >>>>>>>>>>> 0: lo = 2.5624e-17 hi = 8. >>>>>>>>>>> 1: lo = -9.23372e-17 hi = 7. >>>>>>>>>>> 2: lo = 2.72091e-17 hi = 8.5 >>>>>>>>>>> ``` >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Zongze >>>>>>>>>>> >>>>>>>>>>> Zongze Yang ?2022?6?17??? 14:54??? >>>>>>>>>>> >>>>>>>>>>>> I tried the projection operation. However, it seems that the >>>>>>>>>>>> projection gives the wrong solution. After projection, the bounding box is >>>>>>>>>>>> changed! See logs below. 
>>>>>>>>>>>> >>>>>>>>>>>> First, I patch the petsc4py by adding `DMProjectCoordinates`: >>>>>>>>>>>> ``` >>>>>>>>>>>> diff --git a/src/binding/petsc4py/src/PETSc/DM.pyx >>>>>>>>>>>> b/src/binding/petsc4py/src/PETSc/DM.pyx >>>>>>>>>>>> index d8a58d183a..dbcdb280f1 100644 >>>>>>>>>>>> --- a/src/binding/petsc4py/src/PETSc/DM.pyx >>>>>>>>>>>> +++ b/src/binding/petsc4py/src/PETSc/DM.pyx >>>>>>>>>>>> @@ -307,6 +307,12 @@ cdef class DM(Object): >>>>>>>>>>>> PetscINCREF(c.obj) >>>>>>>>>>>> return c >>>>>>>>>>>> >>>>>>>>>>>> + def projectCoordinates(self, FE fe=None): >>>>>>>>>>>> + if fe is None: >>>>>>>>>>>> + CHKERR( DMProjectCoordinates(self.dm, NULL) ) >>>>>>>>>>>> + else: >>>>>>>>>>>> + CHKERR( DMProjectCoordinates(self.dm, fe.fe) ) >>>>>>>>>>>> + >>>>>>>>>>>> def getBoundingBox(self): >>>>>>>>>>>> cdef PetscInt i,dim=0 >>>>>>>>>>>> CHKERR( DMGetCoordinateDim(self.dm, &dim) ) >>>>>>>>>>>> diff --git a/src/binding/petsc4py/src/PETSc/petscdm.pxi >>>>>>>>>>>> b/src/binding/petsc4py/src/PETSc/petscdm.pxi >>>>>>>>>>>> index 514b6fa472..c778e39884 100644 >>>>>>>>>>>> --- a/src/binding/petsc4py/src/PETSc/petscdm.pxi >>>>>>>>>>>> +++ b/src/binding/petsc4py/src/PETSc/petscdm.pxi >>>>>>>>>>>> @@ -90,6 +90,7 @@ cdef extern from * nogil: >>>>>>>>>>>> int DMGetCoordinateDim(PetscDM,PetscInt*) >>>>>>>>>>>> int DMSetCoordinateDim(PetscDM,PetscInt) >>>>>>>>>>>> int DMLocalizeCoordinates(PetscDM) >>>>>>>>>>>> + int DMProjectCoordinates(PetscDM, PetscFE) >>>>>>>>>>>> >>>>>>>>>>>> int >>>>>>>>>>>> DMCreateInterpolation(PetscDM,PetscDM,PetscMat*,PetscVec*) >>>>>>>>>>>> int DMCreateInjection(PetscDM,PetscDM,PetscMat*) >>>>>>>>>>>> ``` >>>>>>>>>>>> >>>>>>>>>>>> Then in python, I load a mesh and project the coordinates to P2: >>>>>>>>>>>> ``` >>>>>>>>>>>> import firedrake as fd >>>>>>>>>>>> from firedrake.petsc import PETSc >>>>>>>>>>>> >>>>>>>>>>>> # plex = fd.mesh._from_gmsh('test-fd-load-p2.msh') >>>>>>>>>>>> plex = fd.mesh._from_gmsh('test-fd-load-p2-rect.msh') >>>>>>>>>>>> print('old bbox:', plex.getBoundingBox()) >>>>>>>>>>>> >>>>>>>>>>>> dim = plex.getDimension() >>>>>>>>>>>> # (dim, nc, isSimplex, k, >>>>>>>>>>>> qorder, comm=None) >>>>>>>>>>>> fe_new = PETSc.FE().createLagrange(dim, dim, True, 2, >>>>>>>>>>>> PETSc.DETERMINE) >>>>>>>>>>>> plex.projectCoordinates(fe_new) >>>>>>>>>>>> fe_new.view() >>>>>>>>>>>> >>>>>>>>>>>> print('new bbox:', plex.getBoundingBox()) >>>>>>>>>>>> ``` >>>>>>>>>>>> >>>>>>>>>>>> The output is (The bounding box is changed!) 
>>>>>>>>>>>> ``` >>>>>>>>>>>> >>>>>>>>>>>> old bbox: ((0.0, 1.0), (0.0, 1.0), (0.0, 1.0)) >>>>>>>>>>>> PetscFE Object: P2 1 MPI processes >>>>>>>>>>>> type: basic >>>>>>>>>>>> Basic Finite Element in 3 dimensions with 3 components >>>>>>>>>>>> PetscSpace Object: P2 1 MPI processes >>>>>>>>>>>> type: sum >>>>>>>>>>>> Space in 3 variables with 3 components, size 30 >>>>>>>>>>>> Sum space of 3 concatenated subspaces (all identical) >>>>>>>>>>>> PetscSpace Object: sum component (sumcomp_) 1 MPI processes >>>>>>>>>>>> type: poly >>>>>>>>>>>> Space in 3 variables with 1 components, size 10 >>>>>>>>>>>> Polynomial space of degree 2 >>>>>>>>>>>> PetscDualSpace Object: P2 1 MPI processes >>>>>>>>>>>> type: lagrange >>>>>>>>>>>> Dual space with 3 components, size 30 >>>>>>>>>>>> Continuous Lagrange dual space >>>>>>>>>>>> Quadrature of order 5 on 27 points (dim 3) >>>>>>>>>>>> new bbox: ((-6.530133708576188e-17, 36.30670832662781), (-3.899962995254311e-17, 36.2406171632539), (-8.8036464152166e-17, 36.111577025012224)) >>>>>>>>>>>> >>>>>>>>>>>> ``` >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> By the way, for the original DG coordinates, where can I find the relation of the closure and the order of the dofs for the cell? >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Thanks! >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Zongze >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Matthew Knepley ?2022?6?17??? 01:11??? >>>>>>>>>>>> >>>>>>>>>>>>> On Thu, Jun 16, 2022 at 12:06 PM Zongze Yang < >>>>>>>>>>>>> yangzongze at gmail.com> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> ? 2022?6?16??23:22?Matthew Knepley ??? >>>>>>>>>>>>>> >>>>>>>>>>>>>> ? >>>>>>>>>>>>>> On Thu, Jun 16, 2022 at 11:11 AM Zongze Yang < >>>>>>>>>>>>>> yangzongze at gmail.com> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hi, if I load a `gmsh` file with second-order elements, the >>>>>>>>>>>>>>> coordinates will be stored in a DG-P2 space. After obtaining the >>>>>>>>>>>>>>> coordinates of a cell, how can I map the coordinates to vertex and edge? >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> By default, they are stored as P2, not DG. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> I checked the coordinates vector, and found the dogs only >>>>>>>>>>>>>> defined on cell other than vertex and edge, so I said they are stored as DG. >>>>>>>>>>>>>> Then the function DMPlexVecGetClosure >>>>>>>>>>>>>> seems return >>>>>>>>>>>>>> the coordinates in lex order. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Some code in reading gmsh file reads that >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> 1756: if (isSimplex) continuity = PETSC_FALSE >>>>>>>>>>>>>> ; >>>>>>>>>>>>>> /* XXX FIXME Requires DMPlexSetClosurePermutationLexicographic() */ >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> 1758: GmshCreateFE(comm, NULL, isSimplex, continuity, >>>>>>>>>>>>>> nodeType, dim, coordDim, order, &fe) >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> The continuity is set to false for simplex. >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Oh, yes. That needs to be fixed. 
For now, you can just project >>>>>>>>>>>>> it to P2 if you want using >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> https://petsc.org/main/docs/manualpages/DM/DMProjectCoordinates/ >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> >>>>>>>>>>>>> Matt >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> Zongze >>>>>>>>>>>>>> >>>>>>>>>>>>>> You can ask for the coordinates of a vertex or an edge >>>>>>>>>>>>>> directly using >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> https://petsc.org/main/docs/manualpages/DMPLEX/DMPlexPointLocalRead/ >>>>>>>>>>>>>> >>>>>>>>>>>>>> by giving the vertex or edge point. You can get all the >>>>>>>>>>>>>> coordinates on a cell, in the closure order, using >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> https://petsc.org/main/docs/manualpages/DMPLEX/DMPlexVecGetClosure/ >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Matt >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Below is some code load the gmsh file, I want to know the >>>>>>>>>>>>>>> relation between `cl` and `cell_coords`. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> ``` >>>>>>>>>>>>>>> import firedrake as fd >>>>>>>>>>>>>>> import numpy as np >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> # Load gmsh file (2rd) >>>>>>>>>>>>>>> plex = fd.mesh._from_gmsh('test-fd-load-p2-rect.msh') >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> cs, ce = plex.getHeightStratum(0) >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> cdm = plex.getCoordinateDM() >>>>>>>>>>>>>>> csec = dm.getCoordinateSection() >>>>>>>>>>>>>>> coords_gvec = dm.getCoordinates() >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> for i in range(cs, ce): >>>>>>>>>>>>>>> cell_coords = cdm.getVecClosure(csec, coords_gvec, i) >>>>>>>>>>>>>>> print(f'coordinates for cell {i} >>>>>>>>>>>>>>> :\n{cell_coords.reshape([-1, 3])}') >>>>>>>>>>>>>>> cl = dm.getTransitiveClosure(i) >>>>>>>>>>>>>>> print('closure:', cl) >>>>>>>>>>>>>>> break >>>>>>>>>>>>>>> ``` >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Best wishes, >>>>>>>>>>>>>>> Zongze >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> -- >>>>>>>>>>>>>> What most experimenters take for granted before they begin >>>>>>>>>>>>>> their experiments is infinitely more interesting than any results to which >>>>>>>>>>>>>> their experiments lead. >>>>>>>>>>>>>> -- Norbert Wiener >>>>>>>>>>>>>> >>>>>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> -- >>>>>>>>>>>>> What most experimenters take for granted before they begin >>>>>>>>>>>>> their experiments is infinitely more interesting than any results to which >>>>>>>>>>>>> their experiments lead. >>>>>>>>>>>>> -- Norbert Wiener >>>>>>>>>>>>> >>>>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>>> experiments lead. >>>>>>>>>> -- Norbert Wiener >>>>>>>>>> >>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their >>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>> experiments lead. >>>>>>> -- Norbert Wiener >>>>>>> >>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>> >>>>>>> >>>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. 
>>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From xiongziming2010 at gmail.com Mon May 15 09:03:52 2023 From: xiongziming2010 at gmail.com (ziming xiong) Date: Mon, 15 May 2023 16:03:52 +0200 Subject: [petsc-users] problem for using PCBDDC Message-ID: Hello sir, I am a PhD student and am trying to use the PCBDDC method in petsc to solve the matrix, but the final result is wrong. So I would like to ask you a few questions. First I will describe the flow of my code, I first used the finite element method to build the total matrix in CSR format (total boundary conditions have been imposed), where I did not build the total matrix, but only the parameters ia, ja,value in CSR format, through which the parameters of the metis (xadj, adjncy) are derived. The matrix is successfully divided into 2 subdomains using metis. After getting the global index of the points of each subdomain by the part parameter of metis. I apply ISLocalToGlobalMappingCreate to case mapping and use ISGlobalToLocalMappingApply to convert the global index of points within each process to local index and use MatSetValueLocal to populate the corresponding subdomain matrix for each process. Here I am missing the relationship of the boundary points between subdomains, and by using ISGlobalToLocalMappingApply (I use IS_GTOLM_MASK to get the points outside the subdomains converted to -1) I can get the index of the missing relationship in the global matrix as well as the value. After creating the global MATIS use MatISSetLocalMat to synchronize the subdomain matrix to the global MATIS. After using MatSetValues to add the relationship of the boundary points between subdomains into the global MATIS. The final calculation is performed, but the final result is not correct. My question is: 1. in PetscCall(MatAssemblyBegin(matIS, MAT_FINAL_ASSEMBLY)). PetscCall(MatAssemblyEnd(matIS, MAT_FINAL_ASSEMBLY)). After that, when viewing the matrix by PetscCall(MatView(matIS,PETSC_VIEWER_STDOUT_WORLD));, each process will output the non-zero items of the matrix separately, but this index is the local index is this normal? 2. I found that after using MatSetValues to add the relationship of boundary points between subdomains into the global MATIS, the calculation result does not change. Why is this? Can I interpolate directly into the global MATIS if I know the global matrix index of the missing relations? Best regards, Ziming XIONG -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefano.zampini at gmail.com Mon May 15 09:22:13 2023 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Mon, 15 May 2023 17:22:13 +0300 Subject: [petsc-users] problem for using PCBDDC In-Reply-To: References: Message-ID: BDDC is a domain decomposition solver of the non-overlapping type and cannot be used on assembled operators. If you want to use it, you need to restructure your code a bit. 
I presume from your message that your current approach is

1) generate_assembled_csr
2) decompose_csr? or decompose_mesh?
3) get_subdomain_relevant_entries
4) set in local matrix

This is wrong, since you are summing up redundant matrix values in the final MATIS format (you can check that MatMultEqual returns false if you compare the assembled operator and the MATIS operator).

You should restructure your code as

1) decompose mesh
2) generate_csr_only_for_local_subdomain
3) set values in local ordering into the MATIS object

You can start with a simple two-cell problem, each cell assigned to a different process, to understand how to move forward. You can play with src/ksp/ksp/tutorials/ex71.c, which uses a structured grid, to understand how to set up a MATIS for a PDE solve.

Hope this helps

On Mon, 15 May 2023 at 17:08, ziming xiong < xiongziming2010 at gmail.com> wrote:

> Hello sir,
> I am a PhD student and am trying to use the PCBDDC method in petsc to
> solve the matrix, but the final result is wrong. So I would like to ask you
> a few questions.
> First I will describe the flow of my code, I first used the finite element
> method to build the total matrix in CSR format (total boundary conditions
> have been imposed), where I did not build the total matrix, but only the
> parameters ia, ja,value in CSR format, through which the parameters of the
> metis (xadj, adjncy) are derived. The matrix is successfully divided into 2
> subdomains using metis. After getting the global index of the points of
> each subdomain by the part parameter of metis. I apply
> ISLocalToGlobalMappingCreate to case mapping and use
> ISGlobalToLocalMappingApply to convert the global index of points within
> each process to local index and use MatSetValueLocal to populate the
> corresponding subdomain matrix for each process. Here I am missing the
> relationship of the boundary points between subdomains, and by using
> ISGlobalToLocalMappingApply (I use IS_GTOLM_MASK to get the points outside
> the subdomains converted to -1) I can get the index of the missing
> relationship in the global matrix as well as the value. After creating the
> global MATIS use MatISSetLocalMat to synchronize the subdomain matrix to
> the global MATIS. After using MatSetValues to add the relationship of the
> boundary points between subdomains into the global MATIS. The final
> calculation is performed, but the final result is not correct.
> My question is:
> 1. in PetscCall(MatAssemblyBegin(matIS, MAT_FINAL_ASSEMBLY)).
> PetscCall(MatAssemblyEnd(matIS, MAT_FINAL_ASSEMBLY)).
> After that, when viewing the matrix by
> PetscCall(MatView(matIS,PETSC_VIEWER_STDOUT_WORLD));, each process will
> output the non-zero items of the matrix separately, but this index is the
> local index is this normal?
> 2. I found that after using MatSetValues to add the relationship of
> boundary points between subdomains into the global MATIS, the calculation
> result does not change. Why is this? Can I interpolate directly into the
> global MATIS if I know the global matrix index of the missing relations?
>
>
> Best regards,
> Ziming XIONG
>
>

--
Stefano
-------------- next part --------------
An HTML attachment was scrubbed...
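A minimal sketch of the restructured flow described above, for readers who want to see it as code: it assumes one subdomain per MPI rank, the mesh/partition helpers (num_global_dofs, num_local_dofs, local_to_global, num_local_elements, element_matrix) are hypothetical placeholders for the application's own data, the element size of 4 is arbitrary and preallocation is omitted, and only the PETSc calls (MatCreateIS, MatSetValuesLocal, PCBDDC) are meant literally.

```
#include <petscksp.h>

/* Hypothetical application-side helpers (placeholders, not PETSc calls):
   - num_global_dofs():   total number of dofs over all subdomains
   - num_local_dofs():    dofs touched by this rank's subdomain, interface included
   - local_to_global():   array of size num_local_dofs(), local -> global numbering
   - num_local_elements(), element_matrix(): the subdomain's element loop,
     returning *local* dof indices and the element stiffness values */
extern PetscInt        num_global_dofs(void);
extern PetscInt        num_local_dofs(void);
extern const PetscInt *local_to_global(void);
extern PetscInt        num_local_elements(void);
extern void            element_matrix(PetscInt e, PetscInt idx[4], PetscScalar vals[16]);

int main(int argc, char **argv)
{
  Mat                    A;
  Vec                    x, b;
  KSP                    ksp;
  PC                     pc;
  ISLocalToGlobalMapping l2g;

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));

  /* Local-to-global numbering of the subdomain dofs (interface dofs appear in
     the mapping of every subdomain that touches them). */
  PetscCall(ISLocalToGlobalMappingCreate(PETSC_COMM_WORLD, 1, num_local_dofs(),
                                         local_to_global(), PETSC_COPY_VALUES, &l2g));

  /* MATIS stores an unassembled (Neumann) matrix per subdomain: A = sum_i R_i^T A_i R_i.
     Do not pre-sum interface couplings across subdomains, and do not add them
     again afterwards with global MatSetValues. */
  PetscCall(MatCreateIS(PETSC_COMM_WORLD, 1, PETSC_DECIDE, PETSC_DECIDE,
                        num_global_dofs(), num_global_dofs(), l2g, l2g, &A));

  for (PetscInt e = 0; e < num_local_elements(); e++) {
    PetscInt    idx[4];  /* local indices of this element's dofs */
    PetscScalar vals[16];
    element_matrix(e, idx, vals);
    PetscCall(MatSetValuesLocal(A, 4, idx, 4, idx, vals, ADD_VALUES));
  }
  /* Alternative: assemble a SeqAIJ matrix of size num_local_dofs() yourself and
     hand it over with MatISSetLocalMat(A, Alocal). */
  PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
  PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));

  PetscCall(MatCreateVecs(A, &x, &b));
  /* ... fill b in global numbering (VecSetValues + VecAssemblyBegin/End) ... */

  PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
  PetscCall(KSPSetOperators(ksp, A, A));
  PetscCall(KSPGetPC(ksp, &pc));
  PetscCall(PCSetType(pc, PCBDDC)); /* BDDC consumes the unassembled MATIS operator */
  PetscCall(KSPSetFromOptions(ksp));
  PetscCall(KSPSolve(ksp, b, x));

  PetscCall(KSPDestroy(&ksp));
  PetscCall(VecDestroy(&x));
  PetscCall(VecDestroy(&b));
  PetscCall(MatDestroy(&A));
  PetscCall(ISLocalToGlobalMappingDestroy(&l2g));
  PetscCall(PetscFinalize());
  return 0;
}
```

The two-cell experiment suggested above fits this skeleton directly: each rank's element loop then contains a single element, and the dofs on the shared face appear, with that subdomain's own contribution, in both local matrices.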
URL: From marcos.vanella at nist.gov Mon May 15 10:04:00 2023 From: marcos.vanella at nist.gov (Vanella, Marcos (Fed)) Date: Mon, 15 May 2023 15:04:00 +0000 Subject: [petsc-users] Compiling PETSC with Intel OneAPI compilers and OpenMPI Message-ID: Hello, I'm trying to compile the PETSc library version 3.19.1 with OpenMPI 4.1.4 and the OneAPI 2022 Update 2 Intel Compiler suite on a Mac with OSX Ventura 13.3.1. I can compile PETSc in debug mode with this configure and make lines. I can run the PETSC tests, which seem fine. When I compile the library in optimized mode, either using -O3 or O1, for example configuring with: $ ./configure --prefix=/opt/petsc-oneapi22u3 --with-blaslapack-dir=/opt/intel/oneapi/mkl/2022.2.1 COPTFLAGS='-m64 -O1 -g -diag-disable=10441' CXXOPTFLAGS='-m64 -O1 -g -diag-disable=10441' FOPTFLAGS='-m64 -O1 -g' LDFLAGS='-m64' --with-debugging=0 --with-shared-libraries=0 --download-make and using mpicc (icc), mpif90 (ifort) from Open MPI, the static lib compiles. Yet, I see right off the bat this segfault error in the first PETSc example: $ make PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 PETSC_ARCH=arch-darwin-c-opt test /Users/mnv/Documents/Software/petsc-3.19.1/arch-darwin-c-opt/bin/make --no-print-directory -f /Users/mnv/Documents/Software/petsc-3.19.1/gmakefile.test PETSC_ARCH=arch-darwin-c-opt PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 test /opt/intel/oneapi/intelpython/latest/bin/python3 /Users/mnv/Documents/Software/petsc-3.19.1/config/gmakegentest.py --petsc-dir=/Users/mnv/Documents/Software/petsc-3.19.1 --petsc-arch=arch-darwin-c-opt --testdir=./arch-darwin-c-opt/tests Using MAKEFLAGS: --no-print-directory -- PETSC_ARCH=arch-darwin-c-opt PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 CC arch-darwin-c-opt/tests/sys/classes/draw/tests/ex1.o In file included from /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsys.h(44), from /Users/mnv/Documents/Software/petsc-3.19.1/src/sys/classes/draw/tests/ex1.c(4): /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsystypes.h(68): warning #2621: attribute "warn_unused_result" does not apply here PETSC_ERROR_CODE_TYPEDEF enum PETSC_ERROR_CODE_NODISCARD { ^ CLINKER arch-darwin-c-opt/tests/sys/classes/draw/tests/ex1 TEST arch-darwin-c-opt/tests/counts/sys_classes_draw_tests-ex1_1.counts not ok sys_classes_draw_tests-ex1_1 # Error code: 139 #?????[excess:98681] *** Process received signal *** #?????[excess:98681] Signal: Segmentation fault: 11 (11) #?????[excess:98681] Signal code: Address not mapped (1) #?????[excess:98681] Failing at address: 0x7f #?????[excess:98681] *** End of error message *** #?????-------------------------------------------------------------------------- #?????Primary job terminated normally, but 1 process returned #?????a non-zero exit code. Per user-direction, the job has been aborted. #?????-------------------------------------------------------------------------- #?????-------------------------------------------------------------------------- #?????mpiexec noticed that process rank 0 with PID 0 on node excess exited on signal 11 (Segmentation fault: 11). #?????-------------------------------------------------------------------------- ok sys_classes_draw_tests-ex1_1 # SKIP Command failed so no diff I see the same segfault error in all PETSc examples. Any help is mostly appreciated, I'm starting to work with PETSc. Our plan is to use the linear solver from PETSc for the Poisson equation on our numerical scheme and test this on a GPU cluster. 
So also, any guideline on how to interface PETSc with a fortran code and personal experience is also most appreciated! Marcos -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon May 15 10:28:31 2023 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 15 May 2023 11:28:31 -0400 Subject: [petsc-users] How to find the map between the high order coordinates of DMPlex and vertex numbering? In-Reply-To: References: <2640A1A9-101C-4DFB-BFA4-C64AF231732A@gmail.com> Message-ID: On Mon, May 15, 2023 at 9:55?AM Zongze Yang wrote: > On Mon, 15 May 2023 at 17:24, Matthew Knepley wrote: > >> On Sun, May 14, 2023 at 7:23?PM Zongze Yang wrote: >> >>> Could you try to project the coordinates into the continuity space by >>> enabling the option >>> `-dm_plex_gmsh_project_petscdualspace_lagrange_continuity true`? >>> >> >> There is a comment in the code about that: >> >> /* XXX FIXME Requires DMPlexSetClosurePermutationLexicographic() */ >> >> So what is currently done is you project into the discontinuous space >> from the GMsh coordinates, >> and then we get the continuous coordinates from those later. This is why >> we get the right answer. >> >> > Sorry, I'm having difficulty understanding the comment and fully > understanding your intended meaning. Are you saying that we can only > project the space to a discontinuous space? > For higher order simplices, because we do not have the mapping to the GMsh order yet. > Additionally, should we always set > `dm_plex_gmsh_project_petscdualspace_lagrange_continuity` to false for > high-order gmsh files? > This is done automatically if you do not override it. > With the option set to `true`, I got the following error: > Yes, do not do that. Thanks, Matt > ``` > $ $PETSC_DIR/$PETSC_ARCH/tests/dm/impls/plex/tests/runex33_gmsh_3d_q2.sh > -e "-dm_plex_gmsh_project_petscdualspace_lagrange_continuity true" > not ok dm_impls_plex_tests-ex33_gmsh_3d_q2 # Error code: 77 > # Volume: 0.46875 > # [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > # [0]PETSC ERROR: Petsc has generated inconsistent data > # [0]PETSC ERROR: Calculated volume 0.46875 != 1. actual volume > (error 0.53125 > 1e-06 tol) > # [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble > shooting. 
> # [0]PETSC ERROR: Petsc Development GIT revision: > v3.19.1-294-g9cc24bc9b93 GIT Date: 2023-05-15 12:07:10 +0000 > # [0]PETSC ERROR: ../ex33 on a arch-linux-c-debug named AMA-PC-RA18 > by yzz Mon May 15 21:53:43 2023 > # [0]PETSC ERROR: Configure options > --CFLAGS=-I/opt/intel/oneapi/mkl/latest/include > --CXXFLAGS=-I/opt/intel/oneapi/mkl/latest/include > --LDFLAGS="-Wl,-rpath,/opt/intel/oneapi/mkl/latest/lib/intel64 > -L/opt/intel/oneapi/mkl/latest/lib/intel64" --download-bison > --download-chaco --download-cmake > --download-eigen="/home/yzz/firedrake/complex-int32-mkl-X-debug/src/eigen-3.3.3.tgz > " --download-fftw --download-hdf5 --download-hpddm --download-hwloc > --download-libpng --download-metis --download-mmg --download-mpich > --download-mumps --download-netcdf --download-p4est --download-parmmg > --download-pastix --download-pnetcdf --download-ptscotch > --download-scalapack --download-slepc --download-suitesparse > --download-superlu_dist --download-tetgen --download-triangle > --with-blaslapack-dir=/opt/intel/oneapi/mkl/latest --with-c2html=0 > --with-debugging=1 --with-fortran-bindings=0 > --with-mkl_cpardiso-dir=/opt/intel/oneapi/mkl/latest > --with-mkl_pardiso-dir=/opt/intel/oneapi/mkl/latest > --with-scalar-type=complex --with-shared-libraries=1 --with-x=1 --with-zlib > PETSC_ARCH=arch-linux-c-debug > # [0]PETSC ERROR: #1 CheckVolume() at > /home/yzz/opt/petsc/src/dm/impls/plex/tests/ex33.c:246 > # [0]PETSC ERROR: #2 main() at > /home/yzz/opt/petsc/src/dm/impls/plex/tests/ex33.c:261 > # [0]PETSC ERROR: PETSc Option Table entries: > # [0]PETSC ERROR: -coord_space 0 (source: command line) > # [0]PETSC ERROR: -dm_plex_filename > /home/yzz/opt/petsc/share/petsc/datafiles/meshes/cube_q2.msh (source: > command line) > # [0]PETSC ERROR: -dm_plex_gmsh_project (source: command line) > # [0]PETSC ERROR: > -dm_plex_gmsh_project_petscdualspace_lagrange_continuity true (source: > command line) > # [0]PETSC ERROR: -tol 1e-6 (source: command line) > # [0]PETSC ERROR: -volume 1.0 (source: command line) > # [0]PETSC ERROR: ----------------End of Error Message -------send > entire error message to petsc-maint at mcs.anl.gov---------- > # application called MPI_Abort(MPI_COMM_SELF, 77) - process 0 > ok dm_impls_plex_tests-ex33_gmsh_3d_q2 # SKIP Command failed so no diff > ``` > > > Best wishes, > Zongze > > Thanks, >> >> Matt >> >> >>> Best wishes, >>> Zongze >>> >>> >>> On Mon, 15 May 2023 at 04:24, Matthew Knepley wrote: >>> >>>> On Sun, May 14, 2023 at 12:27?PM Zongze Yang >>>> wrote: >>>> >>>>> >>>>> >>>>> >>>>> On Sun, 14 May 2023 at 23:54, Matthew Knepley >>>>> wrote: >>>>> >>>>>> On Sun, May 14, 2023 at 9:21?AM Zongze Yang >>>>>> wrote: >>>>>> >>>>>>> Hi, Matt, >>>>>>> >>>>>>> The issue has been resolved while testing on the latest version of >>>>>>> PETSc. It seems that the problem has been fixed in the following merge >>>>>>> request: https://gitlab.com/petsc/petsc/-/merge_requests/5970 >>>>>>> >>>>>> >>>>>> No problem. Glad it is working. >>>>>> >>>>>> >>>>>>> I sincerely apologize for any inconvenience caused by my previous >>>>>>> message. However, I would like to provide you with additional information >>>>>>> regarding the test files. Attached to this email, you will find two Gmsh >>>>>>> files: "square_2rd.msh" and "square_3rd.msh." These files contain >>>>>>> high-order triangulated mesh data for the unit square. 
>>>>>>> >>>>>>> ``` >>>>>>> $ ./ex33 -coord_space 0 -dm_plex_filename square_2rd.msh >>>>>>> -dm_plex_gmsh_project >>>>>>> -dm_plex_gmsh_project_petscdualspace_lagrange_continuity true >>>>>>> -dm_plex_gmsh_project_fe_view -volume 1 >>>>>>> PetscFE Object: P2 1 MPI process >>>>>>> type: basic >>>>>>> Basic Finite Element in 2 dimensions with 2 components >>>>>>> PetscSpace Object: P2 1 MPI process >>>>>>> type: sum >>>>>>> Space in 2 variables with 2 components, size 12 >>>>>>> Sum space of 2 concatenated subspaces (all identical) >>>>>>> PetscSpace Object: sum component (sumcomp_) 1 MPI process >>>>>>> type: poly >>>>>>> Space in 2 variables with 1 components, size 6 >>>>>>> Polynomial space of degree 2 >>>>>>> PetscDualSpace Object: P2 1 MPI process >>>>>>> type: lagrange >>>>>>> Dual space with 2 components, size 12 >>>>>>> Continuous Lagrange dual space >>>>>>> Quadrature on a triangle of order 5 on 9 points (dim 2) >>>>>>> Volume: 1. >>>>>>> $ ./ex33 -coord_space 0 -dm_plex_filename square_3rd.msh >>>>>>> -dm_plex_gmsh_project >>>>>>> -dm_plex_gmsh_project_petscdualspace_lagrange_continuity true >>>>>>> -dm_plex_gmsh_project_fe_view -volume 1 >>>>>>> PetscFE Object: P3 1 MPI process >>>>>>> type: basic >>>>>>> Basic Finite Element in 2 dimensions with 2 components >>>>>>> PetscSpace Object: P3 1 MPI process >>>>>>> type: sum >>>>>>> Space in 2 variables with 2 components, size 20 >>>>>>> Sum space of 2 concatenated subspaces (all identical) >>>>>>> PetscSpace Object: sum component (sumcomp_) 1 MPI process >>>>>>> type: poly >>>>>>> Space in 2 variables with 1 components, size 10 >>>>>>> Polynomial space of degree 3 >>>>>>> PetscDualSpace Object: P3 1 MPI process >>>>>>> type: lagrange >>>>>>> Dual space with 2 components, size 20 >>>>>>> Continuous Lagrange dual space >>>>>>> Quadrature on a triangle of order 7 on 16 points (dim 2) >>>>>>> Volume: 1. >>>>>>> ``` >>>>>>> >>>>>>> Thank you for your attention and understanding. I apologize once >>>>>>> again for my previous oversight. >>>>>>> >>>>>> >>>>>> Great! If you make an MR for this, you will be included on the next >>>>>> list of PETSc contributors. Otherwise, I can do it. >>>>>> >>>>>> >>>>> I appreciate your offer to handle the MR. Please go ahead and take >>>>> care of it. Thank you! >>>>> >>>> >>>> I have created the MR with your tests. They are working for me: >>>> >>>> https://gitlab.com/petsc/petsc/-/merge_requests/6463 >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Best Wishes, >>>>> Zongze >>>>> >>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>>> Best wishes, >>>>>>> Zongze >>>>>>> >>>>>>> >>>>>>> On Sun, 14 May 2023 at 16:44, Matthew Knepley >>>>>>> wrote: >>>>>>> >>>>>>>> On Sat, May 13, 2023 at 6:08?AM Zongze Yang >>>>>>>> wrote: >>>>>>>> >>>>>>>>> Hi, Matt, >>>>>>>>> >>>>>>>>> There seem to be ongoing issues with projecting high-order >>>>>>>>> coordinates from a gmsh file to other spaces. I would like to inquire >>>>>>>>> whether there are any plans to resolve this problem. >>>>>>>>> >>>>>>>>> Thank you for your attention to this matter. >>>>>>>>> >>>>>>>> >>>>>>>> Yes, I will look at it. The important thing is to have a good test. >>>>>>>> Here are the higher order geometry tests >>>>>>>> >>>>>>>> >>>>>>>> https://gitlab.com/petsc/petsc/-/blob/main/src/dm/impls/plex/tests/ex33.c >>>>>>>> >>>>>>>> I take shapes with known volume, mesh them with higher order >>>>>>>> geometry, and look at the convergence to the true volume. 
Could you add a >>>>>>>> GMsh test, meaning the .msh file and known volume, and I will fix it? >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Matt >>>>>>>> >>>>>>>> >>>>>>>>> >>>>>>>>> Best wishes, >>>>>>>>> Zongze >>>>>>>>> >>>>>>>>> >>>>>>>>> On Sat, 18 Jun 2022 at 20:31, Zongze Yang >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> Thank you for your reply. May I ask for some references on the >>>>>>>>>> order of the dofs on PETSc's FE Space (especially high order elements)? >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> Zongze >>>>>>>>>> >>>>>>>>>> Matthew Knepley ?2022?6?18??? 20:02??? >>>>>>>>>> >>>>>>>>>>> On Sat, Jun 18, 2022 at 2:16 AM Zongze Yang < >>>>>>>>>>> yangzongze at gmail.com> wrote: >>>>>>>>>>> >>>>>>>>>>>> In order to check if I made mistakes in the python code, I try >>>>>>>>>>>> to use c code to show the issue on DMProjectCoordinates. The code and mesh >>>>>>>>>>>> file is attached. >>>>>>>>>>>> If the code is correct, there must be something wrong with >>>>>>>>>>>> `DMProjectCoordinates` or `DMPlexCreateGmshFromFile` for high-order mesh. >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Something is definitely wrong with high order, periodic >>>>>>>>>>> simplices from Gmsh. We had not tested that case. I am at a conference and >>>>>>>>>>> cannot look at it for a week. >>>>>>>>>>> My suspicion is that the space we make when reading in the Gmsh >>>>>>>>>>> coordinates does not match the values (wrong order). >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> >>>>>>>>>>> Matt >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> The command and the output are listed below: (Obviously the >>>>>>>>>>>> bounding box is changed.) >>>>>>>>>>>> ``` >>>>>>>>>>>> $ ./test_gmsh_load_2rd -filename cube-p2.msh -old_fe_view >>>>>>>>>>>> -new_fe_view >>>>>>>>>>>> Old Bounding Box: >>>>>>>>>>>> 0: lo = 0. hi = 1. >>>>>>>>>>>> 1: lo = 0. hi = 1. >>>>>>>>>>>> 2: lo = 0. hi = 1. >>>>>>>>>>>> PetscFE Object: OldCoordinatesFE 1 MPI processes >>>>>>>>>>>> type: basic >>>>>>>>>>>> Basic Finite Element in 3 dimensions with 3 components >>>>>>>>>>>> PetscSpace Object: P2 1 MPI processes >>>>>>>>>>>> type: sum >>>>>>>>>>>> Space in 3 variables with 3 components, size 30 >>>>>>>>>>>> Sum space of 3 concatenated subspaces (all identical) >>>>>>>>>>>> PetscSpace Object: sum component (sumcomp_) 1 MPI >>>>>>>>>>>> processes >>>>>>>>>>>> type: poly >>>>>>>>>>>> Space in 3 variables with 1 components, size 10 >>>>>>>>>>>> Polynomial space of degree 2 >>>>>>>>>>>> PetscDualSpace Object: P2 1 MPI processes >>>>>>>>>>>> type: lagrange >>>>>>>>>>>> Dual space with 3 components, size 30 >>>>>>>>>>>> Discontinuous Lagrange dual space >>>>>>>>>>>> Quadrature of order 5 on 27 points (dim 3) >>>>>>>>>>>> PetscFE Object: NewCoordinatesFE 1 MPI processes >>>>>>>>>>>> type: basic >>>>>>>>>>>> Basic Finite Element in 3 dimensions with 3 components >>>>>>>>>>>> PetscSpace Object: P2 1 MPI processes >>>>>>>>>>>> type: sum >>>>>>>>>>>> Space in 3 variables with 3 components, size 30 >>>>>>>>>>>> Sum space of 3 concatenated subspaces (all identical) >>>>>>>>>>>> PetscSpace Object: sum component (sumcomp_) 1 MPI >>>>>>>>>>>> processes >>>>>>>>>>>> type: poly >>>>>>>>>>>> Space in 3 variables with 1 components, size 10 >>>>>>>>>>>> Polynomial space of degree 2 >>>>>>>>>>>> PetscDualSpace Object: P2 1 MPI processes >>>>>>>>>>>> type: lagrange >>>>>>>>>>>> Dual space with 3 components, size 30 >>>>>>>>>>>> Continuous Lagrange dual space >>>>>>>>>>>> Quadrature of order 5 on 27 points (dim 3) >>>>>>>>>>>> New Bounding Box: >>>>>>>>>>>> 0: lo = 2.5624e-17 hi = 8. 
>>>>>>>>>>>> 1: lo = -9.23372e-17 hi = 7. >>>>>>>>>>>> 2: lo = 2.72091e-17 hi = 8.5 >>>>>>>>>>>> ``` >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Zongze >>>>>>>>>>>> >>>>>>>>>>>> Zongze Yang ?2022?6?17??? 14:54??? >>>>>>>>>>>> >>>>>>>>>>>>> I tried the projection operation. However, it seems that the >>>>>>>>>>>>> projection gives the wrong solution. After projection, the bounding box is >>>>>>>>>>>>> changed! See logs below. >>>>>>>>>>>>> >>>>>>>>>>>>> First, I patch the petsc4py by adding `DMProjectCoordinates`: >>>>>>>>>>>>> ``` >>>>>>>>>>>>> diff --git a/src/binding/petsc4py/src/PETSc/DM.pyx >>>>>>>>>>>>> b/src/binding/petsc4py/src/PETSc/DM.pyx >>>>>>>>>>>>> index d8a58d183a..dbcdb280f1 100644 >>>>>>>>>>>>> --- a/src/binding/petsc4py/src/PETSc/DM.pyx >>>>>>>>>>>>> +++ b/src/binding/petsc4py/src/PETSc/DM.pyx >>>>>>>>>>>>> @@ -307,6 +307,12 @@ cdef class DM(Object): >>>>>>>>>>>>> PetscINCREF(c.obj) >>>>>>>>>>>>> return c >>>>>>>>>>>>> >>>>>>>>>>>>> + def projectCoordinates(self, FE fe=None): >>>>>>>>>>>>> + if fe is None: >>>>>>>>>>>>> + CHKERR( DMProjectCoordinates(self.dm, NULL) ) >>>>>>>>>>>>> + else: >>>>>>>>>>>>> + CHKERR( DMProjectCoordinates(self.dm, fe.fe) ) >>>>>>>>>>>>> + >>>>>>>>>>>>> def getBoundingBox(self): >>>>>>>>>>>>> cdef PetscInt i,dim=0 >>>>>>>>>>>>> CHKERR( DMGetCoordinateDim(self.dm, &dim) ) >>>>>>>>>>>>> diff --git a/src/binding/petsc4py/src/PETSc/petscdm.pxi >>>>>>>>>>>>> b/src/binding/petsc4py/src/PETSc/petscdm.pxi >>>>>>>>>>>>> index 514b6fa472..c778e39884 100644 >>>>>>>>>>>>> --- a/src/binding/petsc4py/src/PETSc/petscdm.pxi >>>>>>>>>>>>> +++ b/src/binding/petsc4py/src/PETSc/petscdm.pxi >>>>>>>>>>>>> @@ -90,6 +90,7 @@ cdef extern from * nogil: >>>>>>>>>>>>> int DMGetCoordinateDim(PetscDM,PetscInt*) >>>>>>>>>>>>> int DMSetCoordinateDim(PetscDM,PetscInt) >>>>>>>>>>>>> int DMLocalizeCoordinates(PetscDM) >>>>>>>>>>>>> + int DMProjectCoordinates(PetscDM, PetscFE) >>>>>>>>>>>>> >>>>>>>>>>>>> int >>>>>>>>>>>>> DMCreateInterpolation(PetscDM,PetscDM,PetscMat*,PetscVec*) >>>>>>>>>>>>> int DMCreateInjection(PetscDM,PetscDM,PetscMat*) >>>>>>>>>>>>> ``` >>>>>>>>>>>>> >>>>>>>>>>>>> Then in python, I load a mesh and project the coordinates to >>>>>>>>>>>>> P2: >>>>>>>>>>>>> ``` >>>>>>>>>>>>> import firedrake as fd >>>>>>>>>>>>> from firedrake.petsc import PETSc >>>>>>>>>>>>> >>>>>>>>>>>>> # plex = fd.mesh._from_gmsh('test-fd-load-p2.msh') >>>>>>>>>>>>> plex = fd.mesh._from_gmsh('test-fd-load-p2-rect.msh') >>>>>>>>>>>>> print('old bbox:', plex.getBoundingBox()) >>>>>>>>>>>>> >>>>>>>>>>>>> dim = plex.getDimension() >>>>>>>>>>>>> # (dim, nc, isSimplex, k, >>>>>>>>>>>>> qorder, comm=None) >>>>>>>>>>>>> fe_new = PETSc.FE().createLagrange(dim, dim, True, 2, >>>>>>>>>>>>> PETSc.DETERMINE) >>>>>>>>>>>>> plex.projectCoordinates(fe_new) >>>>>>>>>>>>> fe_new.view() >>>>>>>>>>>>> >>>>>>>>>>>>> print('new bbox:', plex.getBoundingBox()) >>>>>>>>>>>>> ``` >>>>>>>>>>>>> >>>>>>>>>>>>> The output is (The bounding box is changed!) 
>>>>>>>>>>>>> ``` >>>>>>>>>>>>> >>>>>>>>>>>>> old bbox: ((0.0, 1.0), (0.0, 1.0), (0.0, 1.0)) >>>>>>>>>>>>> PetscFE Object: P2 1 MPI processes >>>>>>>>>>>>> type: basic >>>>>>>>>>>>> Basic Finite Element in 3 dimensions with 3 components >>>>>>>>>>>>> PetscSpace Object: P2 1 MPI processes >>>>>>>>>>>>> type: sum >>>>>>>>>>>>> Space in 3 variables with 3 components, size 30 >>>>>>>>>>>>> Sum space of 3 concatenated subspaces (all identical) >>>>>>>>>>>>> PetscSpace Object: sum component (sumcomp_) 1 MPI processes >>>>>>>>>>>>> type: poly >>>>>>>>>>>>> Space in 3 variables with 1 components, size 10 >>>>>>>>>>>>> Polynomial space of degree 2 >>>>>>>>>>>>> PetscDualSpace Object: P2 1 MPI processes >>>>>>>>>>>>> type: lagrange >>>>>>>>>>>>> Dual space with 3 components, size 30 >>>>>>>>>>>>> Continuous Lagrange dual space >>>>>>>>>>>>> Quadrature of order 5 on 27 points (dim 3) >>>>>>>>>>>>> new bbox: ((-6.530133708576188e-17, 36.30670832662781), (-3.899962995254311e-17, 36.2406171632539), (-8.8036464152166e-17, 36.111577025012224)) >>>>>>>>>>>>> >>>>>>>>>>>>> ``` >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> By the way, for the original DG coordinates, where can I find the relation of the closure and the order of the dofs for the cell? >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks! >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Zongze >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Matthew Knepley ?2022?6?17??? 01:11??? >>>>>>>>>>>>> >>>>>>>>>>>>>> On Thu, Jun 16, 2022 at 12:06 PM Zongze Yang < >>>>>>>>>>>>>> yangzongze at gmail.com> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> ? 2022?6?16??23:22?Matthew Knepley ??? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> ? >>>>>>>>>>>>>>> On Thu, Jun 16, 2022 at 11:11 AM Zongze Yang < >>>>>>>>>>>>>>> yangzongze at gmail.com> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Hi, if I load a `gmsh` file with second-order elements, the >>>>>>>>>>>>>>>> coordinates will be stored in a DG-P2 space. After obtaining the >>>>>>>>>>>>>>>> coordinates of a cell, how can I map the coordinates to vertex and edge? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> By default, they are stored as P2, not DG. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I checked the coordinates vector, and found the dogs only >>>>>>>>>>>>>>> defined on cell other than vertex and edge, so I said they are stored as DG. >>>>>>>>>>>>>>> Then the function DMPlexVecGetClosure >>>>>>>>>>>>>>> seems return >>>>>>>>>>>>>>> the coordinates in lex order. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Some code in reading gmsh file reads that >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 1756: if (isSimplex) continuity = PETSC_FALSE >>>>>>>>>>>>>>> ; >>>>>>>>>>>>>>> /* XXX FIXME Requires DMPlexSetClosurePermutationLexicographic() */ >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 1758: GmshCreateFE(comm, NULL, isSimplex, continuity, >>>>>>>>>>>>>>> nodeType, dim, coordDim, order, &fe) >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> The continuity is set to false for simplex. >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Oh, yes. That needs to be fixed. 
For now, you can just >>>>>>>>>>>>>> project it to P2 if you want using >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> https://petsc.org/main/docs/manualpages/DM/DMProjectCoordinates/ >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Matt >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> Zongze >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> You can ask for the coordinates of a vertex or an edge >>>>>>>>>>>>>>> directly using >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> https://petsc.org/main/docs/manualpages/DMPLEX/DMPlexPointLocalRead/ >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> by giving the vertex or edge point. You can get all the >>>>>>>>>>>>>>> coordinates on a cell, in the closure order, using >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> https://petsc.org/main/docs/manualpages/DMPLEX/DMPlexVecGetClosure/ >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Matt >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Below is some code load the gmsh file, I want to know the >>>>>>>>>>>>>>>> relation between `cl` and `cell_coords`. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> ``` >>>>>>>>>>>>>>>> import firedrake as fd >>>>>>>>>>>>>>>> import numpy as np >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> # Load gmsh file (2rd) >>>>>>>>>>>>>>>> plex = fd.mesh._from_gmsh('test-fd-load-p2-rect.msh') >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> cs, ce = plex.getHeightStratum(0) >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> cdm = plex.getCoordinateDM() >>>>>>>>>>>>>>>> csec = dm.getCoordinateSection() >>>>>>>>>>>>>>>> coords_gvec = dm.getCoordinates() >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> for i in range(cs, ce): >>>>>>>>>>>>>>>> cell_coords = cdm.getVecClosure(csec, coords_gvec, i) >>>>>>>>>>>>>>>> print(f'coordinates for cell {i} >>>>>>>>>>>>>>>> :\n{cell_coords.reshape([-1, 3])}') >>>>>>>>>>>>>>>> cl = dm.getTransitiveClosure(i) >>>>>>>>>>>>>>>> print('closure:', cl) >>>>>>>>>>>>>>>> break >>>>>>>>>>>>>>>> ``` >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Best wishes, >>>>>>>>>>>>>>>> Zongze >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>> What most experimenters take for granted before they begin >>>>>>>>>>>>>>> their experiments is infinitely more interesting than any results to which >>>>>>>>>>>>>>> their experiments lead. >>>>>>>>>>>>>>> -- Norbert Wiener >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> -- >>>>>>>>>>>>>> What most experimenters take for granted before they begin >>>>>>>>>>>>>> their experiments is infinitely more interesting than any results to which >>>>>>>>>>>>>> their experiments lead. >>>>>>>>>>>>>> -- Norbert Wiener >>>>>>>>>>>>>> >>>>>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>>>> experiments lead. >>>>>>>>>>> -- Norbert Wiener >>>>>>>>>>> >>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> What most experimenters take for granted before they begin their >>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>> experiments lead. 
>>>>>>>> -- Norbert Wiener >>>>>>>> >>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>> >>>>>> >>>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon May 15 11:08:09 2023 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 15 May 2023 12:08:09 -0400 Subject: [petsc-users] Compiling PETSC with Intel OneAPI compilers and OpenMPI In-Reply-To: References: Message-ID: On Mon, May 15, 2023 at 11:19?AM Vanella, Marcos (Fed) via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hello, I'm trying to compile the PETSc library version 3.19.1 with OpenMPI > 4.1.4 and the OneAPI 2022 Update 2 Intel Compiler suite on a Mac with OSX > Ventura 13.3.1. > I can compile PETSc in debug mode with this configure and make lines. I > can run the PETSC tests, which seem fine. > When I compile the library in optimized mode, either using -O3 or O1, for > example configuring with: > I hate to yell "compiler bug" when this happens, but it sure seems like one. Can you just use --with-debugging=0 without the custom COPTFLAGS, CXXOPTFLAGS, FOPTFLAGS? If that works, it is almost certainly a compiler bug. If not, then we can go in the debugger and see what is failing. Thanks, Matt > $ ./configure --prefix=/opt/petsc-oneapi22u3 > --with-blaslapack-dir=/opt/intel/oneapi/mkl/2022.2.1 COPTFLAGS='-m64 -O1 -g > -diag-disable=10441' CXXOPTFLAGS='-m64 -O1 -g -diag-disable=10441' > FOPTFLAGS='-m64 -O1 -g' LDFLAGS='-m64' --with-debugging=0 > --with-shared-libraries=0 --download-make > > and using mpicc (icc), mpif90 (ifort) from Open MPI, the static lib > compiles. 
Yet, I see right off the bat this segfault error in the first > PETSc example: > > $ make PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 > PETSC_ARCH=arch-darwin-c-opt test > /Users/mnv/Documents/Software/petsc-3.19.1/arch-darwin-c-opt/bin/make > --no-print-directory -f > /Users/mnv/Documents/Software/petsc-3.19.1/gmakefile.test > PETSC_ARCH=arch-darwin-c-opt > PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 test > /opt/intel/oneapi/intelpython/latest/bin/python3 > /Users/mnv/Documents/Software/petsc-3.19.1/config/gmakegentest.py > --petsc-dir=/Users/mnv/Documents/Software/petsc-3.19.1 > --petsc-arch=arch-darwin-c-opt --testdir=./arch-darwin-c-opt/tests > Using MAKEFLAGS: --no-print-directory -- PETSC_ARCH=arch-darwin-c-opt > PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 > CC arch-darwin-c-opt/tests/sys/classes/draw/tests/ex1.o > In file included from > /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsys.h(44), > from > /Users/mnv/Documents/Software/petsc-3.19.1/src/sys/classes/draw/tests/ex1.c(4): > /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsystypes.h(68): > warning #2621: attribute "warn_unused_result" does not apply here > PETSC_ERROR_CODE_TYPEDEF enum PETSC_ERROR_CODE_NODISCARD { > ^ > > CLINKER arch-darwin-c-opt/tests/sys/classes/draw/tests/ex1 > TEST > arch-darwin-c-opt/tests/counts/sys_classes_draw_tests-ex1_1.counts > not ok sys_classes_draw_tests-ex1_1 *# Error code: 139* > *# [excess:98681] *** Process received signal **** > *# [excess:98681] Signal: Segmentation fault: 11 (11)* > *# [excess:98681] Signal code: Address not mapped (1)* > *# [excess:98681] Failing at address: 0x7f* > *# [excess:98681] *** End of error message **** > # > -------------------------------------------------------------------------- > # Primary job terminated normally, but 1 process returned > # a non-zero exit code. Per user-direction, the job has been aborted. > # > -------------------------------------------------------------------------- > # > -------------------------------------------------------------------------- > # mpiexec noticed that process rank 0 with PID 0 on node excess exited on > signal 11 (Segmentation fault: 11). > # > -------------------------------------------------------------------------- > ok sys_classes_draw_tests-ex1_1 # SKIP Command failed so no diff > > I see the same segfault error in all PETSc examples. > Any help is mostly appreciated, I'm starting to work with PETSc. Our plan > is to use the linear solver from PETSc for the Poisson equation on our > numerical scheme and test this on a GPU cluster. So also, any guideline on > how to interface PETSc with a fortran code and personal experience is also > most appreciated! > > Marcos > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Mon May 15 11:10:46 2023 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 15 May 2023 21:40:46 +0530 (IST) Subject: [petsc-users] Compiling PETSC with Intel OneAPI compilers and OpenMPI In-Reply-To: References: Message-ID: I see Intel compilers here are building x86_64 binaries - that get run on the Arm M1 CPU - perhaps there are issues here with this mode of usage.. > I'm starting to work with PETSc. 
Our plan is to use the linear solver from PETSc for the Poisson equation on our numerical scheme and test this on a GPU cluster. What does intel compilers provide you for this use case? Why not use xcode/clang with gfortran here - i.e native ARM binaries? Satish On Mon, 15 May 2023, Vanella, Marcos (Fed) via petsc-users wrote: > Hello, I'm trying to compile the PETSc library version 3.19.1 with OpenMPI 4.1.4 and the OneAPI 2022 Update 2 Intel Compiler suite on a Mac with OSX Ventura 13.3.1. > I can compile PETSc in debug mode with this configure and make lines. I can run the PETSC tests, which seem fine. > When I compile the library in optimized mode, either using -O3 or O1, for example configuring with: > > $ ./configure --prefix=/opt/petsc-oneapi22u3 --with-blaslapack-dir=/opt/intel/oneapi/mkl/2022.2.1 COPTFLAGS='-m64 -O1 -g -diag-disable=10441' CXXOPTFLAGS='-m64 -O1 -g -diag-disable=10441' FOPTFLAGS='-m64 -O1 -g' LDFLAGS='-m64' --with-debugging=0 --with-shared-libraries=0 --download-make > > and using mpicc (icc), mpif90 (ifort) from Open MPI, the static lib compiles. Yet, I see right off the bat this segfault error in the first PETSc example: > > $ make PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 PETSC_ARCH=arch-darwin-c-opt test > /Users/mnv/Documents/Software/petsc-3.19.1/arch-darwin-c-opt/bin/make --no-print-directory -f /Users/mnv/Documents/Software/petsc-3.19.1/gmakefile.test PETSC_ARCH=arch-darwin-c-opt PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 test > /opt/intel/oneapi/intelpython/latest/bin/python3 /Users/mnv/Documents/Software/petsc-3.19.1/config/gmakegentest.py --petsc-dir=/Users/mnv/Documents/Software/petsc-3.19.1 --petsc-arch=arch-darwin-c-opt --testdir=./arch-darwin-c-opt/tests > Using MAKEFLAGS: --no-print-directory -- PETSC_ARCH=arch-darwin-c-opt PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 > CC arch-darwin-c-opt/tests/sys/classes/draw/tests/ex1.o > In file included from /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsys.h(44), > from /Users/mnv/Documents/Software/petsc-3.19.1/src/sys/classes/draw/tests/ex1.c(4): > /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsystypes.h(68): warning #2621: attribute "warn_unused_result" does not apply here > PETSC_ERROR_CODE_TYPEDEF enum PETSC_ERROR_CODE_NODISCARD { > ^ > > CLINKER arch-darwin-c-opt/tests/sys/classes/draw/tests/ex1 > TEST arch-darwin-c-opt/tests/counts/sys_classes_draw_tests-ex1_1.counts > not ok sys_classes_draw_tests-ex1_1 # Error code: 139 > #?????[excess:98681] *** Process received signal *** > #?????[excess:98681] Signal: Segmentation fault: 11 (11) > #?????[excess:98681] Signal code: Address not mapped (1) > #?????[excess:98681] Failing at address: 0x7f > #?????[excess:98681] *** End of error message *** > #?????-------------------------------------------------------------------------- > #?????Primary job terminated normally, but 1 process returned > #?????a non-zero exit code. Per user-direction, the job has been aborted. > #?????-------------------------------------------------------------------------- > #?????-------------------------------------------------------------------------- > #?????mpiexec noticed that process rank 0 with PID 0 on node excess exited on signal 11 (Segmentation fault: 11). > #?????-------------------------------------------------------------------------- > ok sys_classes_draw_tests-ex1_1 # SKIP Command failed so no diff > > I see the same segfault error in all PETSc examples. > Any help is mostly appreciated, I'm starting to work with PETSc. 
Our plan is to use the linear solver from PETSc for the Poisson equation on our numerical scheme and test this on a GPU cluster. So also, any guideline on how to interface PETSc with a fortran code and personal experience is also most appreciated! > > Marcos > > > > From marcos.vanella at nist.gov Mon May 15 11:20:34 2023 From: marcos.vanella at nist.gov (Vanella, Marcos (Fed)) Date: Mon, 15 May 2023 16:20:34 +0000 Subject: [petsc-users] Compiling PETSC with Intel OneAPI compilers and OpenMPI In-Reply-To: References: Message-ID: Thank you Matt I'll try this and let you know. Marcos ________________________________ From: Matthew Knepley Sent: Monday, May 15, 2023 12:08 PM To: Vanella, Marcos (Fed) Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Compiling PETSC with Intel OneAPI compilers and OpenMPI On Mon, May 15, 2023 at 11:19?AM Vanella, Marcos (Fed) via petsc-users > wrote: Hello, I'm trying to compile the PETSc library version 3.19.1 with OpenMPI 4.1.4 and the OneAPI 2022 Update 2 Intel Compiler suite on a Mac with OSX Ventura 13.3.1. I can compile PETSc in debug mode with this configure and make lines. I can run the PETSC tests, which seem fine. When I compile the library in optimized mode, either using -O3 or O1, for example configuring with: I hate to yell "compiler bug" when this happens, but it sure seems like one. Can you just use --with-debugging=0 without the custom COPTFLAGS, CXXOPTFLAGS, FOPTFLAGS? If that works, it is almost certainly a compiler bug. If not, then we can go in the debugger and see what is failing. Thanks, Matt $ ./configure --prefix=/opt/petsc-oneapi22u3 --with-blaslapack-dir=/opt/intel/oneapi/mkl/2022.2.1 COPTFLAGS='-m64 -O1 -g -diag-disable=10441' CXXOPTFLAGS='-m64 -O1 -g -diag-disable=10441' FOPTFLAGS='-m64 -O1 -g' LDFLAGS='-m64' --with-debugging=0 --with-shared-libraries=0 --download-make and using mpicc (icc), mpif90 (ifort) from Open MPI, the static lib compiles. 
Yet, I see right off the bat this segfault error in the first PETSc example: $ make PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 PETSC_ARCH=arch-darwin-c-opt test /Users/mnv/Documents/Software/petsc-3.19.1/arch-darwin-c-opt/bin/make --no-print-directory -f /Users/mnv/Documents/Software/petsc-3.19.1/gmakefile.test PETSC_ARCH=arch-darwin-c-opt PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 test /opt/intel/oneapi/intelpython/latest/bin/python3 /Users/mnv/Documents/Software/petsc-3.19.1/config/gmakegentest.py --petsc-dir=/Users/mnv/Documents/Software/petsc-3.19.1 --petsc-arch=arch-darwin-c-opt --testdir=./arch-darwin-c-opt/tests Using MAKEFLAGS: --no-print-directory -- PETSC_ARCH=arch-darwin-c-opt PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 CC arch-darwin-c-opt/tests/sys/classes/draw/tests/ex1.o In file included from /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsys.h(44), from /Users/mnv/Documents/Software/petsc-3.19.1/src/sys/classes/draw/tests/ex1.c(4): /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsystypes.h(68): warning #2621: attribute "warn_unused_result" does not apply here PETSC_ERROR_CODE_TYPEDEF enum PETSC_ERROR_CODE_NODISCARD { ^ CLINKER arch-darwin-c-opt/tests/sys/classes/draw/tests/ex1 TEST arch-darwin-c-opt/tests/counts/sys_classes_draw_tests-ex1_1.counts not ok sys_classes_draw_tests-ex1_1 # Error code: 139 #?????[excess:98681] *** Process received signal *** #?????[excess:98681] Signal: Segmentation fault: 11 (11) #?????[excess:98681] Signal code: Address not mapped (1) #?????[excess:98681] Failing at address: 0x7f #?????[excess:98681] *** End of error message *** #?????-------------------------------------------------------------------------- #?????Primary job terminated normally, but 1 process returned #?????a non-zero exit code. Per user-direction, the job has been aborted. #?????-------------------------------------------------------------------------- #?????-------------------------------------------------------------------------- #?????mpiexec noticed that process rank 0 with PID 0 on node excess exited on signal 11 (Segmentation fault: 11). #?????-------------------------------------------------------------------------- ok sys_classes_draw_tests-ex1_1 # SKIP Command failed so no diff I see the same segfault error in all PETSc examples. Any help is mostly appreciated, I'm starting to work with PETSc. Our plan is to use the linear solver from PETSc for the Poisson equation on our numerical scheme and test this on a GPU cluster. So also, any guideline on how to interface PETSc with a fortran code and personal experience is also most appreciated! Marcos -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From marcos.vanella at nist.gov Mon May 15 11:24:38 2023 From: marcos.vanella at nist.gov (Vanella, Marcos (Fed)) Date: Mon, 15 May 2023 16:24:38 +0000 Subject: [petsc-users] Compiling PETSC with Intel OneAPI compilers and OpenMPI In-Reply-To: References: Message-ID: Hi Satish, well turns out this is not an M1 Mac, it is an older Intel Mac (2019). I'm trying to get a local computer to do development and tests, but I also have access to linux clusters with GPU which we plan to go to next. Thanks for the suggestion, I might also try compiling a gcc/gfortran version of the lib on this computer. 
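(For reference, a minimal configure sketch for that clang + gfortran route, assuming the Xcode command line tools and a Homebrew gfortran are installed; the option choices below are illustrative and are not taken from this thread:

$ ./configure --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --download-openmpi --download-fblaslapack --with-debugging=0

If a Homebrew Open MPI is already installed, its mpicc/mpicxx/mpif90 wrappers, or --with-mpi-dir pointing at the Homebrew prefix, could be used instead of --download-openmpi.)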
Marcos ________________________________ From: Satish Balay Sent: Monday, May 15, 2023 12:10 PM To: Vanella, Marcos (Fed) Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Compiling PETSC with Intel OneAPI compilers and OpenMPI I see Intel compilers here are building x86_64 binaries - that get run on the Arm M1 CPU - perhaps there are issues here with this mode of usage.. > I'm starting to work with PETSc. Our plan is to use the linear solver from PETSc for the Poisson equation on our numerical scheme and test this on a GPU cluster. What does intel compilers provide you for this use case? Why not use xcode/clang with gfortran here - i.e native ARM binaries? Satish On Mon, 15 May 2023, Vanella, Marcos (Fed) via petsc-users wrote: > Hello, I'm trying to compile the PETSc library version 3.19.1 with OpenMPI 4.1.4 and the OneAPI 2022 Update 2 Intel Compiler suite on a Mac with OSX Ventura 13.3.1. > I can compile PETSc in debug mode with this configure and make lines. I can run the PETSC tests, which seem fine. > When I compile the library in optimized mode, either using -O3 or O1, for example configuring with: > > $ ./configure --prefix=/opt/petsc-oneapi22u3 --with-blaslapack-dir=/opt/intel/oneapi/mkl/2022.2.1 COPTFLAGS='-m64 -O1 -g -diag-disable=10441' CXXOPTFLAGS='-m64 -O1 -g -diag-disable=10441' FOPTFLAGS='-m64 -O1 -g' LDFLAGS='-m64' --with-debugging=0 --with-shared-libraries=0 --download-make > > and using mpicc (icc), mpif90 (ifort) from Open MPI, the static lib compiles. Yet, I see right off the bat this segfault error in the first PETSc example: > > $ make PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 PETSC_ARCH=arch-darwin-c-opt test > /Users/mnv/Documents/Software/petsc-3.19.1/arch-darwin-c-opt/bin/make --no-print-directory -f /Users/mnv/Documents/Software/petsc-3.19.1/gmakefile.test PETSC_ARCH=arch-darwin-c-opt PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 test > /opt/intel/oneapi/intelpython/latest/bin/python3 /Users/mnv/Documents/Software/petsc-3.19.1/config/gmakegentest.py --petsc-dir=/Users/mnv/Documents/Software/petsc-3.19.1 --petsc-arch=arch-darwin-c-opt --testdir=./arch-darwin-c-opt/tests > Using MAKEFLAGS: --no-print-directory -- PETSC_ARCH=arch-darwin-c-opt PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 > CC arch-darwin-c-opt/tests/sys/classes/draw/tests/ex1.o > In file included from /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsys.h(44), > from /Users/mnv/Documents/Software/petsc-3.19.1/src/sys/classes/draw/tests/ex1.c(4): > /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsystypes.h(68): warning #2621: attribute "warn_unused_result" does not apply here > PETSC_ERROR_CODE_TYPEDEF enum PETSC_ERROR_CODE_NODISCARD { > ^ > > CLINKER arch-darwin-c-opt/tests/sys/classes/draw/tests/ex1 > TEST arch-darwin-c-opt/tests/counts/sys_classes_draw_tests-ex1_1.counts > not ok sys_classes_draw_tests-ex1_1 # Error code: 139 > #?????[excess:98681] *** Process received signal *** > #?????[excess:98681] Signal: Segmentation fault: 11 (11) > #?????[excess:98681] Signal code: Address not mapped (1) > #?????[excess:98681] Failing at address: 0x7f > #?????[excess:98681] *** End of error message *** > #?????-------------------------------------------------------------------------- > #?????Primary job terminated normally, but 1 process returned > #?????a non-zero exit code. Per user-direction, the job has been aborted. 
> #?????-------------------------------------------------------------------------- > #?????-------------------------------------------------------------------------- > #?????mpiexec noticed that process rank 0 with PID 0 on node excess exited on signal 11 (Segmentation fault: 11). > #?????-------------------------------------------------------------------------- > ok sys_classes_draw_tests-ex1_1 # SKIP Command failed so no diff > > I see the same segfault error in all PETSc examples. > Any help is mostly appreciated, I'm starting to work with PETSc. Our plan is to use the linear solver from PETSc for the Poisson equation on our numerical scheme and test this on a GPU cluster. So also, any guideline on how to interface PETSc with a fortran code and personal experience is also most appreciated! > > Marcos > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Mon May 15 11:48:49 2023 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 15 May 2023 22:18:49 +0530 (IST) Subject: [petsc-users] Compiling PETSC with Intel OneAPI compilers and OpenMPI In-Reply-To: References: Message-ID: <6de054c1-43b6-1c83-c8eb-5617f2f354f7@mcs.anl.gov> Ops - for some reason I assumed this build is on Mac M1. [likely due to the usage of '-m64' - that was strange].. But yeah - our general usage on Mac is with xcode/clang and brew gfortran (on both Intel and ARM CPUs) - and unless you need Intel compilers for specific needs - clang/gfortran should work better for this development work. Satish On Mon, 15 May 2023, Vanella, Marcos (Fed) via petsc-users wrote: > Hi Satish, well turns out this is not an M1 Mac, it is an older Intel Mac (2019). > I'm trying to get a local computer to do development and tests, but I also have access to linux clusters with GPU which we plan to go to next. > Thanks for the suggestion, I might also try compiling a gcc/gfortran version of the lib on this computer. > Marcos > ________________________________ > From: Satish Balay > Sent: Monday, May 15, 2023 12:10 PM > To: Vanella, Marcos (Fed) > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Compiling PETSC with Intel OneAPI compilers and OpenMPI > > I see Intel compilers here are building x86_64 binaries - that get run on the Arm M1 CPU - perhaps there are issues here with this mode of usage.. > > > I'm starting to work with PETSc. Our plan is to use the linear solver from PETSc for the Poisson equation on our numerical scheme and test this on a GPU cluster. > > What does intel compilers provide you for this use case? > > Why not use xcode/clang with gfortran here - i.e native ARM binaries? > > > Satish > > On Mon, 15 May 2023, Vanella, Marcos (Fed) via petsc-users wrote: > > > Hello, I'm trying to compile the PETSc library version 3.19.1 with OpenMPI 4.1.4 and the OneAPI 2022 Update 2 Intel Compiler suite on a Mac with OSX Ventura 13.3.1. > > I can compile PETSc in debug mode with this configure and make lines. I can run the PETSC tests, which seem fine. > > When I compile the library in optimized mode, either using -O3 or O1, for example configuring with: > > > > $ ./configure --prefix=/opt/petsc-oneapi22u3 --with-blaslapack-dir=/opt/intel/oneapi/mkl/2022.2.1 COPTFLAGS='-m64 -O1 -g -diag-disable=10441' CXXOPTFLAGS='-m64 -O1 -g -diag-disable=10441' FOPTFLAGS='-m64 -O1 -g' LDFLAGS='-m64' --with-debugging=0 --with-shared-libraries=0 --download-make > > > > and using mpicc (icc), mpif90 (ifort) from Open MPI, the static lib compiles. 
Yet, I see right off the bat this segfault error in the first PETSc example: > > > > $ make PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 PETSC_ARCH=arch-darwin-c-opt test > > /Users/mnv/Documents/Software/petsc-3.19.1/arch-darwin-c-opt/bin/make --no-print-directory -f /Users/mnv/Documents/Software/petsc-3.19.1/gmakefile.test PETSC_ARCH=arch-darwin-c-opt PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 test > > /opt/intel/oneapi/intelpython/latest/bin/python3 /Users/mnv/Documents/Software/petsc-3.19.1/config/gmakegentest.py --petsc-dir=/Users/mnv/Documents/Software/petsc-3.19.1 --petsc-arch=arch-darwin-c-opt --testdir=./arch-darwin-c-opt/tests > > Using MAKEFLAGS: --no-print-directory -- PETSC_ARCH=arch-darwin-c-opt PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 > > CC arch-darwin-c-opt/tests/sys/classes/draw/tests/ex1.o > > In file included from /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsys.h(44), > > from /Users/mnv/Documents/Software/petsc-3.19.1/src/sys/classes/draw/tests/ex1.c(4): > > /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsystypes.h(68): warning #2621: attribute "warn_unused_result" does not apply here > > PETSC_ERROR_CODE_TYPEDEF enum PETSC_ERROR_CODE_NODISCARD { > > ^ > > > > CLINKER arch-darwin-c-opt/tests/sys/classes/draw/tests/ex1 > > TEST arch-darwin-c-opt/tests/counts/sys_classes_draw_tests-ex1_1.counts > > not ok sys_classes_draw_tests-ex1_1 # Error code: 139 > > #?????[excess:98681] *** Process received signal *** > > #?????[excess:98681] Signal: Segmentation fault: 11 (11) > > #?????[excess:98681] Signal code: Address not mapped (1) > > #?????[excess:98681] Failing at address: 0x7f > > #?????[excess:98681] *** End of error message *** > > #?????-------------------------------------------------------------------------- > > #?????Primary job terminated normally, but 1 process returned > > #?????a non-zero exit code. Per user-direction, the job has been aborted. > > #?????-------------------------------------------------------------------------- > > #?????-------------------------------------------------------------------------- > > #?????mpiexec noticed that process rank 0 with PID 0 on node excess exited on signal 11 (Segmentation fault: 11). > > #?????-------------------------------------------------------------------------- > > ok sys_classes_draw_tests-ex1_1 # SKIP Command failed so no diff > > > > I see the same segfault error in all PETSc examples. > > Any help is mostly appreciated, I'm starting to work with PETSc. Our plan is to use the linear solver from PETSc for the Poisson equation on our numerical scheme and test this on a GPU cluster. So also, any guideline on how to interface PETSc with a fortran code and personal experience is also most appreciated! > > > > Marcos > > > > > > > > > From marcos.vanella at nist.gov Mon May 15 11:49:34 2023 From: marcos.vanella at nist.gov (Vanella, Marcos (Fed)) Date: Mon, 15 May 2023 16:49:34 +0000 Subject: [petsc-users] Compiling PETSC with Intel OneAPI compilers and OpenMPI In-Reply-To: References: Message-ID: Hi Matt, I configured the lib like this: $ ./configure --with-blaslapack-dir=/opt/intel/oneapi/mkl/2022.2.1 --with-debugging=0 --with-shared-libraries=0 --download-make and compiled. I still get some check segfault error. 
See below: $ make PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 PETSC_ARCH=arch-darwin-c-opt check Running check examples to verify correct installation Using PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 and PETSC_ARCH=arch-darwin-c-opt *******************Error detected during compile or link!******************* See https://petsc.org/release/faq/ /Users/mnv/Documents/Software/petsc-3.19.1/src/snes/tutorials ex19 ********************************************************************************* mpicc -Wl,-bind_at_load -Wl,-multiply_defined,suppress -Wl,-multiply_defined -Wl,suppress -Wl,-commons,use_dylibs -Wl,-search_paths_first -Wl,-no_compact_unwind -fPIC -wd1572 -Wno-unknown-pragmas -g -O3 -I/Users/mnv/Documents/Software/petsc-3.19.1/include -I/Users/mnv/Documents/Software/petsc-3.19.1/arch-darwin-c-opt/include -I/opt/X11/include -std=c99 ex19.c -L/Users/mnv/Documents/Software/petsc-3.19.1/arch-darwin-c-opt/lib -Wl,-rpath,/opt/intel/oneapi/mkl/2022.2.1/lib -L/opt/intel/oneapi/mkl/2022.2.1/lib -Wl,-rpath,/opt/X11/lib -L/opt/X11/lib -L/opt/openmpi414_oneapi22u3/lib -Wl,-rpath,/opt/intel/oneapi/compiler/2022.2.1/mac/compiler/lib -L/opt/intel/oneapi/tbb/2021.7.1/lib -L/opt/intel/oneapi/ippcp/2021.6.2/lib -L/opt/intel/oneapi/ipp/2021.6.2/lib -L/opt/intel/oneapi/dnnl/2022.2.1/cpu_iomp/lib -L/opt/intel/oneapi/dal/2021.7.1/lib -L/opt/intel/oneapi/compiler/2022.2.1/mac/compiler/lib -L/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/lib -Wl,-rpath,/opt/intel/oneapi/compiler/2022.2.1/mac/bin/intel64/../../compiler/lib -L/opt/intel/oneapi/compiler/2022.2.1/mac/bin/intel64/../../compiler/lib -Wl,-rpath,/Library/Developer/CommandLineTools/usr/lib/clang/14.0.3/lib/darwin -L/Library/Developer/CommandLineTools/usr/lib/clang/14.0.3/lib/darwin -lpetsc -lmkl_intel_lp64 -lmkl_core -lmkl_sequential -lpthread -lX11 -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -limf -lm -lz -lifport -lifcoremt -lsvml -lipgo -lirc -lpthread -lclang_rt.osx -lmpi -lopen-rte -lopen-pal -limf -lm -lz -lsvml -lirng -lc++ -lipgo -ldecimal -lirc -lclang_rt.osx -lmpi -lopen-rte -lopen-pal -limf -lm -lz -lsvml -lirng -lc++ -lipgo -ldecimal -lirc -lclang_rt.osx -o ex19 icc: remark #10441: The Intel(R) C++ Compiler Classic (ICC) is deprecated and will be removed from product release in the second half of 2023. The Intel(R) oneAPI DPC++/C++ Compiler (ICX) is the recommended compiler moving forward. Please transition to use this compiler. Use '-diag-disable=10441' to disable this message. 
In file included from /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsys.h(44), from /Users/mnv/Documents/Software/petsc-3.19.1/include/petscvec.h(9), from /Users/mnv/Documents/Software/petsc-3.19.1/include/petscmat.h(7), from /Users/mnv/Documents/Software/petsc-3.19.1/include/petscpc.h(7), from /Users/mnv/Documents/Software/petsc-3.19.1/include/petscksp.h(7), from /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsnes.h(7), from ex19.c(68): /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsystypes.h(68): warning #2621: attribute "warn_unused_result" does not apply here PETSC_ERROR_CODE_TYPEDEF enum PETSC_ERROR_CODE_NODISCARD { ^ Possible error running C/C++ src/snes/tutorials/ex19 with 1 MPI process See https://petsc.org/release/faq/ [excess:37807] *** Process received signal *** [excess:37807] Signal: Segmentation fault: 11 (11) [excess:37807] Signal code: Address not mapped (1) [excess:37807] Failing at address: 0x7f [excess:37807] *** End of error message *** -------------------------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code. Per user-direction, the job has been aborted. -------------------------------------------------------------------------- -------------------------------------------------------------------------- mpiexec noticed that process rank 0 with PID 0 on node excess exited on signal 11 (Segmentation fault: 11). -------------------------------------------------------------------------- Possible error running C/C++ src/snes/tutorials/ex19 with 2 MPI processes See https://petsc.org/release/faq/ [excess:37831] *** Process received signal *** [excess:37831] Signal: Segmentation fault: 11 (11) [excess:37831] Signal code: Address not mapped (1) [excess:37831] Failing at address: 0x7f [excess:37831] *** End of error message *** [excess:37832] *** Process received signal *** [excess:37832] Signal: Segmentation fault: 11 (11) [excess:37832] Signal code: Address not mapped (1) [excess:37832] Failing at address: 0x7f [excess:37832] *** End of error message *** -------------------------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code. Per user-direction, the job has been aborted. -------------------------------------------------------------------------- -------------------------------------------------------------------------- mpiexec noticed that process rank 1 with PID 0 on node excess exited on signal 11 (Segmentation fault: 11). -------------------------------------------------------------------------- Possible error running Fortran example src/snes/tutorials/ex5f with 1 MPI process See https://petsc.org/release/faq/ forrtl: severe (174): SIGSEGV, segmentation fault occurred Image PC Routine Line Source libifcoremt.dylib 000000010B7F7FE4 for__signal_handl Unknown Unknown libsystem_platfor 00007FF8024C25ED _sigtramp Unknown Unknown ex5f 00000001087AFA38 PetscGetArchType Unknown Unknown ex5f 000000010887913B PetscErrorPrintfI Unknown Unknown ex5f 000000010878D227 PetscInitialize_C Unknown Unknown ex5f 000000010879D289 petscinitializef_ Unknown Unknown ex5f 0000000108713C09 petscsys_mp_petsc Unknown Unknown ex5f 0000000108710B5D MAIN__ Unknown Unknown ex5f 0000000108710AEE main Unknown Unknown dyld 00007FF80213B41F start Unknown Unknown -------------------------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code. 
Per user-direction, the job has been aborted. -------------------------------------------------------------------------- -------------------------------------------------------------------------- mpiexec detected that one or more processes exited with non-zero status, thus causing the job to be terminated. The first process to do so was: Process name: [[48108,1],0] Exit code: 174 -------------------------------------------------------------------------- Completed test examples Error while running make check make[1]: *** [check] Error 1 make: *** [check] Error 2 ________________________________ From: Vanella, Marcos (Fed) Sent: Monday, May 15, 2023 12:20 PM To: Matthew Knepley Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Compiling PETSC with Intel OneAPI compilers and OpenMPI Thank you Matt I'll try this and let you know. Marcos ________________________________ From: Matthew Knepley Sent: Monday, May 15, 2023 12:08 PM To: Vanella, Marcos (Fed) Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Compiling PETSC with Intel OneAPI compilers and OpenMPI On Mon, May 15, 2023 at 11:19?AM Vanella, Marcos (Fed) via petsc-users > wrote: Hello, I'm trying to compile the PETSc library version 3.19.1 with OpenMPI 4.1.4 and the OneAPI 2022 Update 2 Intel Compiler suite on a Mac with OSX Ventura 13.3.1. I can compile PETSc in debug mode with this configure and make lines. I can run the PETSC tests, which seem fine. When I compile the library in optimized mode, either using -O3 or O1, for example configuring with: I hate to yell "compiler bug" when this happens, but it sure seems like one. Can you just use --with-debugging=0 without the custom COPTFLAGS, CXXOPTFLAGS, FOPTFLAGS? If that works, it is almost certainly a compiler bug. If not, then we can go in the debugger and see what is failing. Thanks, Matt $ ./configure --prefix=/opt/petsc-oneapi22u3 --with-blaslapack-dir=/opt/intel/oneapi/mkl/2022.2.1 COPTFLAGS='-m64 -O1 -g -diag-disable=10441' CXXOPTFLAGS='-m64 -O1 -g -diag-disable=10441' FOPTFLAGS='-m64 -O1 -g' LDFLAGS='-m64' --with-debugging=0 --with-shared-libraries=0 --download-make and using mpicc (icc), mpif90 (ifort) from Open MPI, the static lib compiles. 
Yet, I see right off the bat this segfault error in the first PETSc example: $ make PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 PETSC_ARCH=arch-darwin-c-opt test /Users/mnv/Documents/Software/petsc-3.19.1/arch-darwin-c-opt/bin/make --no-print-directory -f /Users/mnv/Documents/Software/petsc-3.19.1/gmakefile.test PETSC_ARCH=arch-darwin-c-opt PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 test /opt/intel/oneapi/intelpython/latest/bin/python3 /Users/mnv/Documents/Software/petsc-3.19.1/config/gmakegentest.py --petsc-dir=/Users/mnv/Documents/Software/petsc-3.19.1 --petsc-arch=arch-darwin-c-opt --testdir=./arch-darwin-c-opt/tests Using MAKEFLAGS: --no-print-directory -- PETSC_ARCH=arch-darwin-c-opt PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 CC arch-darwin-c-opt/tests/sys/classes/draw/tests/ex1.o In file included from /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsys.h(44), from /Users/mnv/Documents/Software/petsc-3.19.1/src/sys/classes/draw/tests/ex1.c(4): /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsystypes.h(68): warning #2621: attribute "warn_unused_result" does not apply here PETSC_ERROR_CODE_TYPEDEF enum PETSC_ERROR_CODE_NODISCARD { ^ CLINKER arch-darwin-c-opt/tests/sys/classes/draw/tests/ex1 TEST arch-darwin-c-opt/tests/counts/sys_classes_draw_tests-ex1_1.counts not ok sys_classes_draw_tests-ex1_1 # Error code: 139 #?????[excess:98681] *** Process received signal *** #?????[excess:98681] Signal: Segmentation fault: 11 (11) #?????[excess:98681] Signal code: Address not mapped (1) #?????[excess:98681] Failing at address: 0x7f #?????[excess:98681] *** End of error message *** #?????-------------------------------------------------------------------------- #?????Primary job terminated normally, but 1 process returned #?????a non-zero exit code. Per user-direction, the job has been aborted. #?????-------------------------------------------------------------------------- #?????-------------------------------------------------------------------------- #?????mpiexec noticed that process rank 0 with PID 0 on node excess exited on signal 11 (Segmentation fault: 11). #?????-------------------------------------------------------------------------- ok sys_classes_draw_tests-ex1_1 # SKIP Command failed so no diff I see the same segfault error in all PETSc examples. Any help is mostly appreciated, I'm starting to work with PETSc. Our plan is to use the linear solver from PETSc for the Poisson equation on our numerical scheme and test this on a GPU cluster. So also, any guideline on how to interface PETSc with a fortran code and personal experience is also most appreciated! Marcos -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon May 15 11:53:33 2023 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 15 May 2023 12:53:33 -0400 Subject: [petsc-users] Compiling PETSC with Intel OneAPI compilers and OpenMPI In-Reply-To: References: Message-ID: Send us $PETSC_ARCH/include/petscconf.h Thanks, Matt On Mon, May 15, 2023 at 12:49?PM Vanella, Marcos (Fed) < marcos.vanella at nist.gov> wrote: > Hi Matt, I configured the lib like this: > > $ ./configure --with-blaslapack-dir=/opt/intel/oneapi/mkl/2022.2.1 > --with-debugging=0 --with-shared-libraries=0 --download-make > > and compiled. 
I still get some check segfault error. See below: > > $ make PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 > PETSC_ARCH=arch-darwin-c-opt check > Running check examples to verify correct installation > Using PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 and > PETSC_ARCH=arch-darwin-c-opt > *******************Error detected during compile or > link!******************* > See https://petsc.org/release/faq/ > /Users/mnv/Documents/Software/petsc-3.19.1/src/snes/tutorials ex19 > > ********************************************************************************* > mpicc -Wl,-bind_at_load -Wl,-multiply_defined,suppress > -Wl,-multiply_defined -Wl,suppress -Wl,-commons,use_dylibs > -Wl,-search_paths_first -Wl,-no_compact_unwind -fPIC -wd1572 > -Wno-unknown-pragmas -g -O3 > -I/Users/mnv/Documents/Software/petsc-3.19.1/include > -I/Users/mnv/Documents/Software/petsc-3.19.1/arch-darwin-c-opt/include > -I/opt/X11/include -std=c99 ex19.c > -L/Users/mnv/Documents/Software/petsc-3.19.1/arch-darwin-c-opt/lib > -Wl,-rpath,/opt/intel/oneapi/mkl/2022.2.1/lib > -L/opt/intel/oneapi/mkl/2022.2.1/lib -Wl,-rpath,/opt/X11/lib -L/opt/X11/lib > -L/opt/openmpi414_oneapi22u3/lib > -Wl,-rpath,/opt/intel/oneapi/compiler/2022.2.1/mac/compiler/lib > -L/opt/intel/oneapi/tbb/2021.7.1/lib -L/opt/intel/oneapi/ippcp/2021.6.2/lib > -L/opt/intel/oneapi/ipp/2021.6.2/lib > -L/opt/intel/oneapi/dnnl/2022.2.1/cpu_iomp/lib > -L/opt/intel/oneapi/dal/2021.7.1/lib > -L/opt/intel/oneapi/compiler/2022.2.1/mac/compiler/lib > -L/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/lib > -Wl,-rpath,/opt/intel/oneapi/compiler/2022.2.1/mac/bin/intel64/../../compiler/lib > -L/opt/intel/oneapi/compiler/2022.2.1/mac/bin/intel64/../../compiler/lib > -Wl,-rpath,/Library/Developer/CommandLineTools/usr/lib/clang/14.0.3/lib/darwin > -L/Library/Developer/CommandLineTools/usr/lib/clang/14.0.3/lib/darwin > -lpetsc -lmkl_intel_lp64 -lmkl_core -lmkl_sequential -lpthread -lX11 > -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lopen-rte > -lopen-pal -limf -lm -lz -lifport -lifcoremt -lsvml -lipgo -lirc -lpthread > -lclang_rt.osx -lmpi -lopen-rte -lopen-pal -limf -lm -lz -lsvml -lirng > -lc++ -lipgo -ldecimal -lirc -lclang_rt.osx -lmpi -lopen-rte -lopen-pal > -limf -lm -lz -lsvml -lirng -lc++ -lipgo -ldecimal -lirc -lclang_rt.osx -o > ex19 > icc: remark #10441: The Intel(R) C++ Compiler Classic (ICC) is deprecated > and will be removed from product release in the second half of 2023. The > Intel(R) oneAPI DPC++/C++ Compiler (ICX) is the recommended compiler moving > forward. Please transition to use this compiler. Use '-diag-disable=10441' > to disable this message. 
> In file included from > /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsys.h(44), > from > /Users/mnv/Documents/Software/petsc-3.19.1/include/petscvec.h(9), > from > /Users/mnv/Documents/Software/petsc-3.19.1/include/petscmat.h(7), > from > /Users/mnv/Documents/Software/petsc-3.19.1/include/petscpc.h(7), > from > /Users/mnv/Documents/Software/petsc-3.19.1/include/petscksp.h(7), > from > /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsnes.h(7), > from ex19.c(68): > /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsystypes.h(68): > warning #2621: attribute "warn_unused_result" does not apply here > PETSC_ERROR_CODE_TYPEDEF enum PETSC_ERROR_CODE_NODISCARD { > ^ > > Possible error running C/C++ src/snes/tutorials/ex19 with 1 MPI process > See https://petsc.org/release/faq/ > [excess:37807] *** Process received signal *** > [excess:37807] Signal: Segmentation fault: 11 (11) > [excess:37807] Signal code: Address not mapped (1) > [excess:37807] Failing at address: 0x7f > [excess:37807] *** End of error message *** > -------------------------------------------------------------------------- > Primary job terminated normally, but 1 process returned > a non-zero exit code. Per user-direction, the job has been aborted. > -------------------------------------------------------------------------- > -------------------------------------------------------------------------- > mpiexec noticed that process rank 0 with PID 0 on node excess exited on > signal 11 (Segmentation fault: 11). > -------------------------------------------------------------------------- > Possible error running C/C++ src/snes/tutorials/ex19 with 2 MPI processes > See https://petsc.org/release/faq/ > [excess:37831] *** Process received signal *** > [excess:37831] Signal: Segmentation fault: 11 (11) > [excess:37831] Signal code: Address not mapped (1) > [excess:37831] Failing at address: 0x7f > [excess:37831] *** End of error message *** > [excess:37832] *** Process received signal *** > [excess:37832] Signal: Segmentation fault: 11 (11) > [excess:37832] Signal code: Address not mapped (1) > [excess:37832] Failing at address: 0x7f > [excess:37832] *** End of error message *** > -------------------------------------------------------------------------- > Primary job terminated normally, but 1 process returned > a non-zero exit code. Per user-direction, the job has been aborted. > -------------------------------------------------------------------------- > -------------------------------------------------------------------------- > mpiexec noticed that process rank 1 with PID 0 on node excess exited on > signal 11 (Segmentation fault: 11). 
> -------------------------------------------------------------------------- > Possible error running Fortran example src/snes/tutorials/ex5f with 1 MPI > process > See https://petsc.org/release/faq/ > forrtl: severe (174): SIGSEGV, segmentation fault occurred > Image PC Routine Line Source > > libifcoremt.dylib 000000010B7F7FE4 for__signal_handl Unknown Unknown > libsystem_platfor 00007FF8024C25ED _sigtramp Unknown Unknown > ex5f 00000001087AFA38 PetscGetArchType Unknown Unknown > ex5f 000000010887913B PetscErrorPrintfI Unknown Unknown > ex5f 000000010878D227 PetscInitialize_C Unknown Unknown > ex5f 000000010879D289 petscinitializef_ Unknown Unknown > ex5f 0000000108713C09 petscsys_mp_petsc Unknown Unknown > ex5f 0000000108710B5D MAIN__ Unknown Unknown > ex5f 0000000108710AEE main Unknown Unknown > dyld 00007FF80213B41F start Unknown Unknown > -------------------------------------------------------------------------- > Primary job terminated normally, but 1 process returned > a non-zero exit code. Per user-direction, the job has been aborted. > -------------------------------------------------------------------------- > -------------------------------------------------------------------------- > mpiexec detected that one or more processes exited with non-zero status, > thus causing > the job to be terminated. The first process to do so was: > > Process name: [[48108,1],0] > Exit code: 174 > -------------------------------------------------------------------------- > Completed test examples > Error while running make check > make[1]: *** [check] Error 1 > make: *** [check] Error 2 > > ------------------------------ > *From:* Vanella, Marcos (Fed) > *Sent:* Monday, May 15, 2023 12:20 PM > *To:* Matthew Knepley > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] Compiling PETSC with Intel OneAPI compilers > and OpenMPI > > Thank you Matt I'll try this and let you know. > Marcos > ------------------------------ > *From:* Matthew Knepley > *Sent:* Monday, May 15, 2023 12:08 PM > *To:* Vanella, Marcos (Fed) > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] Compiling PETSC with Intel OneAPI compilers > and OpenMPI > > On Mon, May 15, 2023 at 11:19?AM Vanella, Marcos (Fed) via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Hello, I'm trying to compile the PETSc library version 3.19.1 with OpenMPI > 4.1.4 and the OneAPI 2022 Update 2 Intel Compiler suite on a Mac with OSX > Ventura 13.3.1. > I can compile PETSc in debug mode with this configure and make lines. I > can run the PETSC tests, which seem fine. > When I compile the library in optimized mode, either using -O3 or O1, for > example configuring with: > > > I hate to yell "compiler bug" when this happens, but it sure seems like > one. Can you just use > > --with-debugging=0 > > without the custom COPTFLAGS, CXXOPTFLAGS, FOPTFLAGS? If that works, it is > almost > certainly a compiler bug. If not, then we can go in the debugger and see > what is failing. > > Thanks, > > Matt > > > $ ./configure --prefix=/opt/petsc-oneapi22u3 > --with-blaslapack-dir=/opt/intel/oneapi/mkl/2022.2.1 COPTFLAGS='-m64 -O1 -g > -diag-disable=10441' CXXOPTFLAGS='-m64 -O1 -g -diag-disable=10441' > FOPTFLAGS='-m64 -O1 -g' LDFLAGS='-m64' --with-debugging=0 > --with-shared-libraries=0 --download-make > > and using mpicc (icc), mpif90 (ifort) from Open MPI, the static lib > compiles. 
Yet, I see right off the bat this segfault error in the first > PETSc example: > > $ make PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 > PETSC_ARCH=arch-darwin-c-opt test > /Users/mnv/Documents/Software/petsc-3.19.1/arch-darwin-c-opt/bin/make > --no-print-directory -f > /Users/mnv/Documents/Software/petsc-3.19.1/gmakefile.test > PETSC_ARCH=arch-darwin-c-opt > PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 test > /opt/intel/oneapi/intelpython/latest/bin/python3 > /Users/mnv/Documents/Software/petsc-3.19.1/config/gmakegentest.py > --petsc-dir=/Users/mnv/Documents/Software/petsc-3.19.1 > --petsc-arch=arch-darwin-c-opt --testdir=./arch-darwin-c-opt/tests > Using MAKEFLAGS: --no-print-directory -- PETSC_ARCH=arch-darwin-c-opt > PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 > CC arch-darwin-c-opt/tests/sys/classes/draw/tests/ex1.o > In file included from > /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsys.h(44), > from > /Users/mnv/Documents/Software/petsc-3.19.1/src/sys/classes/draw/tests/ex1.c(4): > /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsystypes.h(68): > warning #2621: attribute "warn_unused_result" does not apply here > PETSC_ERROR_CODE_TYPEDEF enum PETSC_ERROR_CODE_NODISCARD { > ^ > > CLINKER arch-darwin-c-opt/tests/sys/classes/draw/tests/ex1 > TEST > arch-darwin-c-opt/tests/counts/sys_classes_draw_tests-ex1_1.counts > not ok sys_classes_draw_tests-ex1_1 *# Error code: 139* > *# [excess:98681] *** Process received signal **** > *# [excess:98681] Signal: Segmentation fault: 11 (11)* > *# [excess:98681] Signal code: Address not mapped (1)* > *# [excess:98681] Failing at address: 0x7f* > *# [excess:98681] *** End of error message **** > # > -------------------------------------------------------------------------- > # Primary job terminated normally, but 1 process returned > # a non-zero exit code. Per user-direction, the job has been aborted. > # > -------------------------------------------------------------------------- > # > -------------------------------------------------------------------------- > # mpiexec noticed that process rank 0 with PID 0 on node excess exited on > signal 11 (Segmentation fault: 11). > # > -------------------------------------------------------------------------- > ok sys_classes_draw_tests-ex1_1 # SKIP Command failed so no diff > > I see the same segfault error in all PETSc examples. > Any help is mostly appreciated, I'm starting to work with PETSc. Our plan > is to use the linear solver from PETSc for the Poisson equation on our > numerical scheme and test this on a GPU cluster. So also, any guideline on > how to interface PETSc with a fortran code and personal experience is also > most appreciated! > > Marcos > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From marcos.vanella at nist.gov Mon May 15 12:01:51 2023 From: marcos.vanella at nist.gov (Vanella, Marcos (Fed)) Date: Mon, 15 May 2023 17:01:51 +0000 Subject: [petsc-users] Compiling PETSC with Intel OneAPI compilers and OpenMPI In-Reply-To: <6de054c1-43b6-1c83-c8eb-5617f2f354f7@mcs.anl.gov> References: <6de054c1-43b6-1c83-c8eb-5617f2f354f7@mcs.anl.gov> Message-ID: Hi Satish, yes the -m64 flag tells the compilers the target cpu is intel 64. The only reason I'm trying to get PETSc working with intel is that the bundles for the software we release use Intel compilers for Linux, Mac and Windows (OneAPI intelMPI for linux and Windows, OpenMPI compiled with intel for MacOS). I'm just trying to get PETSc compiled with intel to maintain the scheme we have and keep these compilers, which would be handy if we are to release an alternative Poisson solver using PETSc in the future. For our research projects I'm thinking we'll use gcc/openmpi in linux clusters. Marcos ________________________________ From: Satish Balay Sent: Monday, May 15, 2023 12:48 PM To: Vanella, Marcos (Fed) Cc: petsc-users Subject: Re: [petsc-users] Compiling PETSC with Intel OneAPI compilers and OpenMPI Ops - for some reason I assumed this build is on Mac M1. [likely due to the usage of '-m64' - that was strange].. But yeah - our general usage on Mac is with xcode/clang and brew gfortran (on both Intel and ARM CPUs) - and unless you need Intel compilers for specific needs - clang/gfortran should work better for this development work. Satish On Mon, 15 May 2023, Vanella, Marcos (Fed) via petsc-users wrote: > Hi Satish, well turns out this is not an M1 Mac, it is an older Intel Mac (2019). > I'm trying to get a local computer to do development and tests, but I also have access to linux clusters with GPU which we plan to go to next. > Thanks for the suggestion, I might also try compiling a gcc/gfortran version of the lib on this computer. > Marcos > ________________________________ > From: Satish Balay > Sent: Monday, May 15, 2023 12:10 PM > To: Vanella, Marcos (Fed) > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Compiling PETSC with Intel OneAPI compilers and OpenMPI > > I see Intel compilers here are building x86_64 binaries - that get run on the Arm M1 CPU - perhaps there are issues here with this mode of usage.. > > > I'm starting to work with PETSc. Our plan is to use the linear solver from PETSc for the Poisson equation on our numerical scheme and test this on a GPU cluster. > > What does intel compilers provide you for this use case? > > Why not use xcode/clang with gfortran here - i.e native ARM binaries? > > > Satish > > On Mon, 15 May 2023, Vanella, Marcos (Fed) via petsc-users wrote: > > > Hello, I'm trying to compile the PETSc library version 3.19.1 with OpenMPI 4.1.4 and the OneAPI 2022 Update 2 Intel Compiler suite on a Mac with OSX Ventura 13.3.1. > > I can compile PETSc in debug mode with this configure and make lines. I can run the PETSC tests, which seem fine. > > When I compile the library in optimized mode, either using -O3 or O1, for example configuring with: > > > > $ ./configure --prefix=/opt/petsc-oneapi22u3 --with-blaslapack-dir=/opt/intel/oneapi/mkl/2022.2.1 COPTFLAGS='-m64 -O1 -g -diag-disable=10441' CXXOPTFLAGS='-m64 -O1 -g -diag-disable=10441' FOPTFLAGS='-m64 -O1 -g' LDFLAGS='-m64' --with-debugging=0 --with-shared-libraries=0 --download-make > > > > and using mpicc (icc), mpif90 (ifort) from Open MPI, the static lib compiles. 
Yet, I see right off the bat this segfault error in the first PETSc example: > > > > $ make PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 PETSC_ARCH=arch-darwin-c-opt test > > /Users/mnv/Documents/Software/petsc-3.19.1/arch-darwin-c-opt/bin/make --no-print-directory -f /Users/mnv/Documents/Software/petsc-3.19.1/gmakefile.test PETSC_ARCH=arch-darwin-c-opt PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 test > > /opt/intel/oneapi/intelpython/latest/bin/python3 /Users/mnv/Documents/Software/petsc-3.19.1/config/gmakegentest.py --petsc-dir=/Users/mnv/Documents/Software/petsc-3.19.1 --petsc-arch=arch-darwin-c-opt --testdir=./arch-darwin-c-opt/tests > > Using MAKEFLAGS: --no-print-directory -- PETSC_ARCH=arch-darwin-c-opt PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 > > CC arch-darwin-c-opt/tests/sys/classes/draw/tests/ex1.o > > In file included from /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsys.h(44), > > from /Users/mnv/Documents/Software/petsc-3.19.1/src/sys/classes/draw/tests/ex1.c(4): > > /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsystypes.h(68): warning #2621: attribute "warn_unused_result" does not apply here > > PETSC_ERROR_CODE_TYPEDEF enum PETSC_ERROR_CODE_NODISCARD { > > ^ > > > > CLINKER arch-darwin-c-opt/tests/sys/classes/draw/tests/ex1 > > TEST arch-darwin-c-opt/tests/counts/sys_classes_draw_tests-ex1_1.counts > > not ok sys_classes_draw_tests-ex1_1 # Error code: 139 > > #?????[excess:98681] *** Process received signal *** > > #?????[excess:98681] Signal: Segmentation fault: 11 (11) > > #?????[excess:98681] Signal code: Address not mapped (1) > > #?????[excess:98681] Failing at address: 0x7f > > #?????[excess:98681] *** End of error message *** > > #?????-------------------------------------------------------------------------- > > #?????Primary job terminated normally, but 1 process returned > > #?????a non-zero exit code. Per user-direction, the job has been aborted. > > #?????-------------------------------------------------------------------------- > > #?????-------------------------------------------------------------------------- > > #?????mpiexec noticed that process rank 0 with PID 0 on node excess exited on signal 11 (Segmentation fault: 11). > > #?????-------------------------------------------------------------------------- > > ok sys_classes_draw_tests-ex1_1 # SKIP Command failed so no diff > > > > I see the same segfault error in all PETSc examples. > > Any help is mostly appreciated, I'm starting to work with PETSc. Our plan is to use the linear solver from PETSc for the Poisson equation on our numerical scheme and test this on a GPU cluster. So also, any guideline on how to interface PETSc with a fortran code and personal experience is also most appreciated! > > > > Marcos > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From marcos.vanella at nist.gov Mon May 15 12:04:49 2023 From: marcos.vanella at nist.gov (Vanella, Marcos (Fed)) Date: Mon, 15 May 2023 17:04:49 +0000 Subject: [petsc-users] Compiling PETSC with Intel OneAPI compilers and OpenMPI In-Reply-To: References: Message-ID: Hi Matt, attached is the file. Thanks! 
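(A backtrace from the failing example could also help narrow this down, along the lines of the earlier suggestion to go into the debugger. A minimal sketch using lldb from the Xcode command line tools, assuming PETSC_DIR and PETSC_ARCH are set as above; these exact steps are illustrative and were not run in this thread:

$ cd $PETSC_DIR/src/snes/tutorials && make ex19
$ lldb ./ex19
(lldb) run
(lldb) bt

The bt output should name the PETSc routine that dereferences the bad address, e.g. PetscGetArchType as in the Fortran traceback above.)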
Marcos ________________________________ From: Matthew Knepley Sent: Monday, May 15, 2023 12:53 PM To: Vanella, Marcos (Fed) Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Compiling PETSC with Intel OneAPI compilers and OpenMPI Send us $PETSC_ARCH/include/petscconf.h Thanks, Matt On Mon, May 15, 2023 at 12:49?PM Vanella, Marcos (Fed) > wrote: Hi Matt, I configured the lib like this: $ ./configure --with-blaslapack-dir=/opt/intel/oneapi/mkl/2022.2.1 --with-debugging=0 --with-shared-libraries=0 --download-make and compiled. I still get some check segfault error. See below: $ make PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 PETSC_ARCH=arch-darwin-c-opt check Running check examples to verify correct installation Using PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 and PETSC_ARCH=arch-darwin-c-opt *******************Error detected during compile or link!******************* See https://petsc.org/release/faq/ /Users/mnv/Documents/Software/petsc-3.19.1/src/snes/tutorials ex19 ********************************************************************************* mpicc -Wl,-bind_at_load -Wl,-multiply_defined,suppress -Wl,-multiply_defined -Wl,suppress -Wl,-commons,use_dylibs -Wl,-search_paths_first -Wl,-no_compact_unwind -fPIC -wd1572 -Wno-unknown-pragmas -g -O3 -I/Users/mnv/Documents/Software/petsc-3.19.1/include -I/Users/mnv/Documents/Software/petsc-3.19.1/arch-darwin-c-opt/include -I/opt/X11/include -std=c99 ex19.c -L/Users/mnv/Documents/Software/petsc-3.19.1/arch-darwin-c-opt/lib -Wl,-rpath,/opt/intel/oneapi/mkl/2022.2.1/lib -L/opt/intel/oneapi/mkl/2022.2.1/lib -Wl,-rpath,/opt/X11/lib -L/opt/X11/lib -L/opt/openmpi414_oneapi22u3/lib -Wl,-rpath,/opt/intel/oneapi/compiler/2022.2.1/mac/compiler/lib -L/opt/intel/oneapi/tbb/2021.7.1/lib -L/opt/intel/oneapi/ippcp/2021.6.2/lib -L/opt/intel/oneapi/ipp/2021.6.2/lib -L/opt/intel/oneapi/dnnl/2022.2.1/cpu_iomp/lib -L/opt/intel/oneapi/dal/2021.7.1/lib -L/opt/intel/oneapi/compiler/2022.2.1/mac/compiler/lib -L/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/lib -Wl,-rpath,/opt/intel/oneapi/compiler/2022.2.1/mac/bin/intel64/../../compiler/lib -L/opt/intel/oneapi/compiler/2022.2.1/mac/bin/intel64/../../compiler/lib -Wl,-rpath,/Library/Developer/CommandLineTools/usr/lib/clang/14.0.3/lib/darwin -L/Library/Developer/CommandLineTools/usr/lib/clang/14.0.3/lib/darwin -lpetsc -lmkl_intel_lp64 -lmkl_core -lmkl_sequential -lpthread -lX11 -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -limf -lm -lz -lifport -lifcoremt -lsvml -lipgo -lirc -lpthread -lclang_rt.osx -lmpi -lopen-rte -lopen-pal -limf -lm -lz -lsvml -lirng -lc++ -lipgo -ldecimal -lirc -lclang_rt.osx -lmpi -lopen-rte -lopen-pal -limf -lm -lz -lsvml -lirng -lc++ -lipgo -ldecimal -lirc -lclang_rt.osx -o ex19 icc: remark #10441: The Intel(R) C++ Compiler Classic (ICC) is deprecated and will be removed from product release in the second half of 2023. The Intel(R) oneAPI DPC++/C++ Compiler (ICX) is the recommended compiler moving forward. Please transition to use this compiler. Use '-diag-disable=10441' to disable this message. 
In file included from /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsys.h(44), from /Users/mnv/Documents/Software/petsc-3.19.1/include/petscvec.h(9), from /Users/mnv/Documents/Software/petsc-3.19.1/include/petscmat.h(7), from /Users/mnv/Documents/Software/petsc-3.19.1/include/petscpc.h(7), from /Users/mnv/Documents/Software/petsc-3.19.1/include/petscksp.h(7), from /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsnes.h(7), from ex19.c(68): /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsystypes.h(68): warning #2621: attribute "warn_unused_result" does not apply here PETSC_ERROR_CODE_TYPEDEF enum PETSC_ERROR_CODE_NODISCARD { ^ Possible error running C/C++ src/snes/tutorials/ex19 with 1 MPI process See https://petsc.org/release/faq/ [excess:37807] *** Process received signal *** [excess:37807] Signal: Segmentation fault: 11 (11) [excess:37807] Signal code: Address not mapped (1) [excess:37807] Failing at address: 0x7f [excess:37807] *** End of error message *** -------------------------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code. Per user-direction, the job has been aborted. -------------------------------------------------------------------------- -------------------------------------------------------------------------- mpiexec noticed that process rank 0 with PID 0 on node excess exited on signal 11 (Segmentation fault: 11). -------------------------------------------------------------------------- Possible error running C/C++ src/snes/tutorials/ex19 with 2 MPI processes See https://petsc.org/release/faq/ [excess:37831] *** Process received signal *** [excess:37831] Signal: Segmentation fault: 11 (11) [excess:37831] Signal code: Address not mapped (1) [excess:37831] Failing at address: 0x7f [excess:37831] *** End of error message *** [excess:37832] *** Process received signal *** [excess:37832] Signal: Segmentation fault: 11 (11) [excess:37832] Signal code: Address not mapped (1) [excess:37832] Failing at address: 0x7f [excess:37832] *** End of error message *** -------------------------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code. Per user-direction, the job has been aborted. -------------------------------------------------------------------------- -------------------------------------------------------------------------- mpiexec noticed that process rank 1 with PID 0 on node excess exited on signal 11 (Segmentation fault: 11). -------------------------------------------------------------------------- Possible error running Fortran example src/snes/tutorials/ex5f with 1 MPI process See https://petsc.org/release/faq/ forrtl: severe (174): SIGSEGV, segmentation fault occurred Image PC Routine Line Source libifcoremt.dylib 000000010B7F7FE4 for__signal_handl Unknown Unknown libsystem_platfor 00007FF8024C25ED _sigtramp Unknown Unknown ex5f 00000001087AFA38 PetscGetArchType Unknown Unknown ex5f 000000010887913B PetscErrorPrintfI Unknown Unknown ex5f 000000010878D227 PetscInitialize_C Unknown Unknown ex5f 000000010879D289 petscinitializef_ Unknown Unknown ex5f 0000000108713C09 petscsys_mp_petsc Unknown Unknown ex5f 0000000108710B5D MAIN__ Unknown Unknown ex5f 0000000108710AEE main Unknown Unknown dyld 00007FF80213B41F start Unknown Unknown -------------------------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code. 
Per user-direction, the job has been aborted. -------------------------------------------------------------------------- -------------------------------------------------------------------------- mpiexec detected that one or more processes exited with non-zero status, thus causing the job to be terminated. The first process to do so was: Process name: [[48108,1],0] Exit code: 174 -------------------------------------------------------------------------- Completed test examples Error while running make check make[1]: *** [check] Error 1 make: *** [check] Error 2 ________________________________ From: Vanella, Marcos (Fed) > Sent: Monday, May 15, 2023 12:20 PM To: Matthew Knepley > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Compiling PETSC with Intel OneAPI compilers and OpenMPI Thank you Matt I'll try this and let you know. Marcos ________________________________ From: Matthew Knepley > Sent: Monday, May 15, 2023 12:08 PM To: Vanella, Marcos (Fed) > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Compiling PETSC with Intel OneAPI compilers and OpenMPI On Mon, May 15, 2023 at 11:19?AM Vanella, Marcos (Fed) via petsc-users > wrote: Hello, I'm trying to compile the PETSc library version 3.19.1 with OpenMPI 4.1.4 and the OneAPI 2022 Update 2 Intel Compiler suite on a Mac with OSX Ventura 13.3.1. I can compile PETSc in debug mode with this configure and make lines. I can run the PETSC tests, which seem fine. When I compile the library in optimized mode, either using -O3 or O1, for example configuring with: I hate to yell "compiler bug" when this happens, but it sure seems like one. Can you just use --with-debugging=0 without the custom COPTFLAGS, CXXOPTFLAGS, FOPTFLAGS? If that works, it is almost certainly a compiler bug. If not, then we can go in the debugger and see what is failing. Thanks, Matt $ ./configure --prefix=/opt/petsc-oneapi22u3 --with-blaslapack-dir=/opt/intel/oneapi/mkl/2022.2.1 COPTFLAGS='-m64 -O1 -g -diag-disable=10441' CXXOPTFLAGS='-m64 -O1 -g -diag-disable=10441' FOPTFLAGS='-m64 -O1 -g' LDFLAGS='-m64' --with-debugging=0 --with-shared-libraries=0 --download-make and using mpicc (icc), mpif90 (ifort) from Open MPI, the static lib compiles. 
Yet, I see right off the bat this segfault error in the first PETSc example: $ make PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 PETSC_ARCH=arch-darwin-c-opt test /Users/mnv/Documents/Software/petsc-3.19.1/arch-darwin-c-opt/bin/make --no-print-directory -f /Users/mnv/Documents/Software/petsc-3.19.1/gmakefile.test PETSC_ARCH=arch-darwin-c-opt PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 test /opt/intel/oneapi/intelpython/latest/bin/python3 /Users/mnv/Documents/Software/petsc-3.19.1/config/gmakegentest.py --petsc-dir=/Users/mnv/Documents/Software/petsc-3.19.1 --petsc-arch=arch-darwin-c-opt --testdir=./arch-darwin-c-opt/tests Using MAKEFLAGS: --no-print-directory -- PETSC_ARCH=arch-darwin-c-opt PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 CC arch-darwin-c-opt/tests/sys/classes/draw/tests/ex1.o In file included from /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsys.h(44), from /Users/mnv/Documents/Software/petsc-3.19.1/src/sys/classes/draw/tests/ex1.c(4): /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsystypes.h(68): warning #2621: attribute "warn_unused_result" does not apply here PETSC_ERROR_CODE_TYPEDEF enum PETSC_ERROR_CODE_NODISCARD { ^ CLINKER arch-darwin-c-opt/tests/sys/classes/draw/tests/ex1 TEST arch-darwin-c-opt/tests/counts/sys_classes_draw_tests-ex1_1.counts not ok sys_classes_draw_tests-ex1_1 # Error code: 139 #?????[excess:98681] *** Process received signal *** #?????[excess:98681] Signal: Segmentation fault: 11 (11) #?????[excess:98681] Signal code: Address not mapped (1) #?????[excess:98681] Failing at address: 0x7f #?????[excess:98681] *** End of error message *** #?????-------------------------------------------------------------------------- #?????Primary job terminated normally, but 1 process returned #?????a non-zero exit code. Per user-direction, the job has been aborted. #?????-------------------------------------------------------------------------- #?????-------------------------------------------------------------------------- #?????mpiexec noticed that process rank 0 with PID 0 on node excess exited on signal 11 (Segmentation fault: 11). #?????-------------------------------------------------------------------------- ok sys_classes_draw_tests-ex1_1 # SKIP Command failed so no diff I see the same segfault error in all PETSc examples. Any help is mostly appreciated, I'm starting to work with PETSc. Our plan is to use the linear solver from PETSc for the Poisson equation on our numerical scheme and test this on a GPU cluster. So also, any guideline on how to interface PETSc with a fortran code and personal experience is also most appreciated! Marcos -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
Name: petscconf.h URL: From samar.khatiwala at earth.ox.ac.uk Mon May 15 12:05:02 2023 From: samar.khatiwala at earth.ox.ac.uk (Samar Khatiwala) Date: Mon, 15 May 2023 17:05:02 +0000 Subject: [petsc-users] Compiling PETSC with Intel OneAPI compilers and OpenMPI In-Reply-To: References: Message-ID: Hi, for what it?s worth, clang + ifort from OneAPI 2023 update 1 works fine for me on both Intel and M2 Macs. So it might just be a matter of upgrading. Samar On May 15, 2023, at 5:53 PM, Matthew Knepley wrote: Send us $PETSC_ARCH/include/petscconf.h Thanks, Matt On Mon, May 15, 2023 at 12:49?PM Vanella, Marcos (Fed) > wrote: Hi Matt, I configured the lib like this: $ ./configure --with-blaslapack-dir=/opt/intel/oneapi/mkl/2022.2.1 --with-debugging=0 --with-shared-libraries=0 --download-make and compiled. I still get some check segfault error. See below: $ make PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 PETSC_ARCH=arch-darwin-c-opt check Running check examples to verify correct installation Using PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 and PETSC_ARCH=arch-darwin-c-opt *******************Error detected during compile or link!******************* See https://petsc.org/release/faq/ /Users/mnv/Documents/Software/petsc-3.19.1/src/snes/tutorials ex19 ********************************************************************************* mpicc -Wl,-bind_at_load -Wl,-multiply_defined,suppress -Wl,-multiply_defined -Wl,suppress -Wl,-commons,use_dylibs -Wl,-search_paths_first -Wl,-no_compact_unwind -fPIC -wd1572 -Wno-unknown-pragmas -g -O3 -I/Users/mnv/Documents/Software/petsc-3.19.1/include -I/Users/mnv/Documents/Software/petsc-3.19.1/arch-darwin-c-opt/include -I/opt/X11/include -std=c99 ex19.c -L/Users/mnv/Documents/Software/petsc-3.19.1/arch-darwin-c-opt/lib -Wl,-rpath,/opt/intel/oneapi/mkl/2022.2.1/lib -L/opt/intel/oneapi/mkl/2022.2.1/lib -Wl,-rpath,/opt/X11/lib -L/opt/X11/lib -L/opt/openmpi414_oneapi22u3/lib -Wl,-rpath,/opt/intel/oneapi/compiler/2022.2.1/mac/compiler/lib -L/opt/intel/oneapi/tbb/2021.7.1/lib -L/opt/intel/oneapi/ippcp/2021.6.2/lib -L/opt/intel/oneapi/ipp/2021.6.2/lib -L/opt/intel/oneapi/dnnl/2022.2.1/cpu_iomp/lib -L/opt/intel/oneapi/dal/2021.7.1/lib -L/opt/intel/oneapi/compiler/2022.2.1/mac/compiler/lib -L/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/lib -Wl,-rpath,/opt/intel/oneapi/compiler/2022.2.1/mac/bin/intel64/../../compiler/lib -L/opt/intel/oneapi/compiler/2022.2.1/mac/bin/intel64/../../compiler/lib -Wl,-rpath,/Library/Developer/CommandLineTools/usr/lib/clang/14.0.3/lib/darwin -L/Library/Developer/CommandLineTools/usr/lib/clang/14.0.3/lib/darwin -lpetsc -lmkl_intel_lp64 -lmkl_core -lmkl_sequential -lpthread -lX11 -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -limf -lm -lz -lifport -lifcoremt -lsvml -lipgo -lirc -lpthread -lclang_rt.osx -lmpi -lopen-rte -lopen-pal -limf -lm -lz -lsvml -lirng -lc++ -lipgo -ldecimal -lirc -lclang_rt.osx -lmpi -lopen-rte -lopen-pal -limf -lm -lz -lsvml -lirng -lc++ -lipgo -ldecimal -lirc -lclang_rt.osx -o ex19 icc: remark #10441: The Intel(R) C++ Compiler Classic (ICC) is deprecated and will be removed from product release in the second half of 2023. The Intel(R) oneAPI DPC++/C++ Compiler (ICX) is the recommended compiler moving forward. Please transition to use this compiler. Use '-diag-disable=10441' to disable this message. 
In file included from /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsys.h(44), from /Users/mnv/Documents/Software/petsc-3.19.1/include/petscvec.h(9), from /Users/mnv/Documents/Software/petsc-3.19.1/include/petscmat.h(7), from /Users/mnv/Documents/Software/petsc-3.19.1/include/petscpc.h(7), from /Users/mnv/Documents/Software/petsc-3.19.1/include/petscksp.h(7), from /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsnes.h(7), from ex19.c(68): /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsystypes.h(68): warning #2621: attribute "warn_unused_result" does not apply here PETSC_ERROR_CODE_TYPEDEF enum PETSC_ERROR_CODE_NODISCARD { ^ Possible error running C/C++ src/snes/tutorials/ex19 with 1 MPI process See https://petsc.org/release/faq/ [excess:37807] *** Process received signal *** [excess:37807] Signal: Segmentation fault: 11 (11) [excess:37807] Signal code: Address not mapped (1) [excess:37807] Failing at address: 0x7f [excess:37807] *** End of error message *** -------------------------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code. Per user-direction, the job has been aborted. -------------------------------------------------------------------------- -------------------------------------------------------------------------- mpiexec noticed that process rank 0 with PID 0 on node excess exited on signal 11 (Segmentation fault: 11). -------------------------------------------------------------------------- Possible error running C/C++ src/snes/tutorials/ex19 with 2 MPI processes See https://petsc.org/release/faq/ [excess:37831] *** Process received signal *** [excess:37831] Signal: Segmentation fault: 11 (11) [excess:37831] Signal code: Address not mapped (1) [excess:37831] Failing at address: 0x7f [excess:37831] *** End of error message *** [excess:37832] *** Process received signal *** [excess:37832] Signal: Segmentation fault: 11 (11) [excess:37832] Signal code: Address not mapped (1) [excess:37832] Failing at address: 0x7f [excess:37832] *** End of error message *** -------------------------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code. Per user-direction, the job has been aborted. -------------------------------------------------------------------------- -------------------------------------------------------------------------- mpiexec noticed that process rank 1 with PID 0 on node excess exited on signal 11 (Segmentation fault: 11). -------------------------------------------------------------------------- Possible error running Fortran example src/snes/tutorials/ex5f with 1 MPI process See https://petsc.org/release/faq/ forrtl: severe (174): SIGSEGV, segmentation fault occurred Image PC Routine Line Source libifcoremt.dylib 000000010B7F7FE4 for__signal_handl Unknown Unknown libsystem_platfor 00007FF8024C25ED _sigtramp Unknown Unknown ex5f 00000001087AFA38 PetscGetArchType Unknown Unknown ex5f 000000010887913B PetscErrorPrintfI Unknown Unknown ex5f 000000010878D227 PetscInitialize_C Unknown Unknown ex5f 000000010879D289 petscinitializef_ Unknown Unknown ex5f 0000000108713C09 petscsys_mp_petsc Unknown Unknown ex5f 0000000108710B5D MAIN__ Unknown Unknown ex5f 0000000108710AEE main Unknown Unknown dyld 00007FF80213B41F start Unknown Unknown -------------------------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code. 
Per user-direction, the job has been aborted. -------------------------------------------------------------------------- -------------------------------------------------------------------------- mpiexec detected that one or more processes exited with non-zero status, thus causing the job to be terminated. The first process to do so was: Process name: [[48108,1],0] Exit code: 174 -------------------------------------------------------------------------- Completed test examples Error while running make check make[1]: *** [check] Error 1 make: *** [check] Error 2 ________________________________ From: Vanella, Marcos (Fed) > Sent: Monday, May 15, 2023 12:20 PM To: Matthew Knepley > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Compiling PETSC with Intel OneAPI compilers and OpenMPI Thank you Matt I'll try this and let you know. Marcos ________________________________ From: Matthew Knepley > Sent: Monday, May 15, 2023 12:08 PM To: Vanella, Marcos (Fed) > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Compiling PETSC with Intel OneAPI compilers and OpenMPI On Mon, May 15, 2023 at 11:19?AM Vanella, Marcos (Fed) via petsc-users > wrote: Hello, I'm trying to compile the PETSc library version 3.19.1 with OpenMPI 4.1.4 and the OneAPI 2022 Update 2 Intel Compiler suite on a Mac with OSX Ventura 13.3.1. I can compile PETSc in debug mode with this configure and make lines. I can run the PETSC tests, which seem fine. When I compile the library in optimized mode, either using -O3 or O1, for example configuring with: I hate to yell "compiler bug" when this happens, but it sure seems like one. Can you just use --with-debugging=0 without the custom COPTFLAGS, CXXOPTFLAGS, FOPTFLAGS? If that works, it is almost certainly a compiler bug. If not, then we can go in the debugger and see what is failing. Thanks, Matt $ ./configure --prefix=/opt/petsc-oneapi22u3 --with-blaslapack-dir=/opt/intel/oneapi/mkl/2022.2.1 COPTFLAGS='-m64 -O1 -g -diag-disable=10441' CXXOPTFLAGS='-m64 -O1 -g -diag-disable=10441' FOPTFLAGS='-m64 -O1 -g' LDFLAGS='-m64' --with-debugging=0 --with-shared-libraries=0 --download-make and using mpicc (icc), mpif90 (ifort) from Open MPI, the static lib compiles. 
Yet, I see right off the bat this segfault error in the first PETSc example: $ make PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 PETSC_ARCH=arch-darwin-c-opt test /Users/mnv/Documents/Software/petsc-3.19.1/arch-darwin-c-opt/bin/make --no-print-directory -f /Users/mnv/Documents/Software/petsc-3.19.1/gmakefile.test PETSC_ARCH=arch-darwin-c-opt PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 test /opt/intel/oneapi/intelpython/latest/bin/python3 /Users/mnv/Documents/Software/petsc-3.19.1/config/gmakegentest.py --petsc-dir=/Users/mnv/Documents/Software/petsc-3.19.1 --petsc-arch=arch-darwin-c-opt --testdir=./arch-darwin-c-opt/tests Using MAKEFLAGS: --no-print-directory -- PETSC_ARCH=arch-darwin-c-opt PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 CC arch-darwin-c-opt/tests/sys/classes/draw/tests/ex1.o In file included from /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsys.h(44), from /Users/mnv/Documents/Software/petsc-3.19.1/src/sys/classes/draw/tests/ex1.c(4): /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsystypes.h(68): warning #2621: attribute "warn_unused_result" does not apply here PETSC_ERROR_CODE_TYPEDEF enum PETSC_ERROR_CODE_NODISCARD { ^ CLINKER arch-darwin-c-opt/tests/sys/classes/draw/tests/ex1 TEST arch-darwin-c-opt/tests/counts/sys_classes_draw_tests-ex1_1.counts not ok sys_classes_draw_tests-ex1_1 # Error code: 139 #?????[excess:98681] *** Process received signal *** #?????[excess:98681] Signal: Segmentation fault: 11 (11) #?????[excess:98681] Signal code: Address not mapped (1) #?????[excess:98681] Failing at address: 0x7f #?????[excess:98681] *** End of error message *** #?????-------------------------------------------------------------------------- #?????Primary job terminated normally, but 1 process returned #?????a non-zero exit code. Per user-direction, the job has been aborted. #?????-------------------------------------------------------------------------- #?????-------------------------------------------------------------------------- #?????mpiexec noticed that process rank 0 with PID 0 on node excess exited on signal 11 (Segmentation fault: 11). #?????-------------------------------------------------------------------------- ok sys_classes_draw_tests-ex1_1 # SKIP Command failed so no diff I see the same segfault error in all PETSc examples. Any help is mostly appreciated, I'm starting to work with PETSc. Our plan is to use the linear solver from PETSc for the Poisson equation on our numerical scheme and test this on a GPU cluster. So also, any guideline on how to interface PETSc with a fortran code and personal experience is also most appreciated! Marcos -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From marcos.vanella at nist.gov Mon May 15 12:07:13 2023 From: marcos.vanella at nist.gov (Vanella, Marcos (Fed)) Date: Mon, 15 May 2023 17:07:13 +0000 Subject: [petsc-users] Compiling PETSC with Intel OneAPI compilers and OpenMPI In-Reply-To: References: Message-ID: Hi Samar, what MPI library do you use? Did you compile it with clang instead of icc? 
Thanks, Marcos ________________________________ From: Samar Khatiwala Sent: Monday, May 15, 2023 1:05 PM To: Matthew Knepley Cc: Vanella, Marcos (Fed) ; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Compiling PETSC with Intel OneAPI compilers and OpenMPI Hi, for what it?s worth, clang + ifort from OneAPI 2023 update 1 works fine for me on both Intel and M2 Macs. So it might just be a matter of upgrading. Samar On May 15, 2023, at 5:53 PM, Matthew Knepley wrote: Send us $PETSC_ARCH/include/petscconf.h Thanks, Matt On Mon, May 15, 2023 at 12:49?PM Vanella, Marcos (Fed) > wrote: Hi Matt, I configured the lib like this: $ ./configure --with-blaslapack-dir=/opt/intel/oneapi/mkl/2022.2.1 --with-debugging=0 --with-shared-libraries=0 --download-make and compiled. I still get some check segfault error. See below: $ make PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 PETSC_ARCH=arch-darwin-c-opt check Running check examples to verify correct installation Using PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 and PETSC_ARCH=arch-darwin-c-opt *******************Error detected during compile or link!******************* See https://petsc.org/release/faq/ /Users/mnv/Documents/Software/petsc-3.19.1/src/snes/tutorials ex19 ********************************************************************************* mpicc -Wl,-bind_at_load -Wl,-multiply_defined,suppress -Wl,-multiply_defined -Wl,suppress -Wl,-commons,use_dylibs -Wl,-search_paths_first -Wl,-no_compact_unwind -fPIC -wd1572 -Wno-unknown-pragmas -g -O3 -I/Users/mnv/Documents/Software/petsc-3.19.1/include -I/Users/mnv/Documents/Software/petsc-3.19.1/arch-darwin-c-opt/include -I/opt/X11/include -std=c99 ex19.c -L/Users/mnv/Documents/Software/petsc-3.19.1/arch-darwin-c-opt/lib -Wl,-rpath,/opt/intel/oneapi/mkl/2022.2.1/lib -L/opt/intel/oneapi/mkl/2022.2.1/lib -Wl,-rpath,/opt/X11/lib -L/opt/X11/lib -L/opt/openmpi414_oneapi22u3/lib -Wl,-rpath,/opt/intel/oneapi/compiler/2022.2.1/mac/compiler/lib -L/opt/intel/oneapi/tbb/2021.7.1/lib -L/opt/intel/oneapi/ippcp/2021.6.2/lib -L/opt/intel/oneapi/ipp/2021.6.2/lib -L/opt/intel/oneapi/dnnl/2022.2.1/cpu_iomp/lib -L/opt/intel/oneapi/dal/2021.7.1/lib -L/opt/intel/oneapi/compiler/2022.2.1/mac/compiler/lib -L/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/lib -Wl,-rpath,/opt/intel/oneapi/compiler/2022.2.1/mac/bin/intel64/../../compiler/lib -L/opt/intel/oneapi/compiler/2022.2.1/mac/bin/intel64/../../compiler/lib -Wl,-rpath,/Library/Developer/CommandLineTools/usr/lib/clang/14.0.3/lib/darwin -L/Library/Developer/CommandLineTools/usr/lib/clang/14.0.3/lib/darwin -lpetsc -lmkl_intel_lp64 -lmkl_core -lmkl_sequential -lpthread -lX11 -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -limf -lm -lz -lifport -lifcoremt -lsvml -lipgo -lirc -lpthread -lclang_rt.osx -lmpi -lopen-rte -lopen-pal -limf -lm -lz -lsvml -lirng -lc++ -lipgo -ldecimal -lirc -lclang_rt.osx -lmpi -lopen-rte -lopen-pal -limf -lm -lz -lsvml -lirng -lc++ -lipgo -ldecimal -lirc -lclang_rt.osx -o ex19 icc: remark #10441: The Intel(R) C++ Compiler Classic (ICC) is deprecated and will be removed from product release in the second half of 2023. The Intel(R) oneAPI DPC++/C++ Compiler (ICX) is the recommended compiler moving forward. Please transition to use this compiler. Use '-diag-disable=10441' to disable this message. 
In file included from /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsys.h(44), from /Users/mnv/Documents/Software/petsc-3.19.1/include/petscvec.h(9), from /Users/mnv/Documents/Software/petsc-3.19.1/include/petscmat.h(7), from /Users/mnv/Documents/Software/petsc-3.19.1/include/petscpc.h(7), from /Users/mnv/Documents/Software/petsc-3.19.1/include/petscksp.h(7), from /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsnes.h(7), from ex19.c(68): /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsystypes.h(68): warning #2621: attribute "warn_unused_result" does not apply here PETSC_ERROR_CODE_TYPEDEF enum PETSC_ERROR_CODE_NODISCARD { ^ Possible error running C/C++ src/snes/tutorials/ex19 with 1 MPI process See https://petsc.org/release/faq/ [excess:37807] *** Process received signal *** [excess:37807] Signal: Segmentation fault: 11 (11) [excess:37807] Signal code: Address not mapped (1) [excess:37807] Failing at address: 0x7f [excess:37807] *** End of error message *** -------------------------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code. Per user-direction, the job has been aborted. -------------------------------------------------------------------------- -------------------------------------------------------------------------- mpiexec noticed that process rank 0 with PID 0 on node excess exited on signal 11 (Segmentation fault: 11). -------------------------------------------------------------------------- Possible error running C/C++ src/snes/tutorials/ex19 with 2 MPI processes See https://petsc.org/release/faq/ [excess:37831] *** Process received signal *** [excess:37831] Signal: Segmentation fault: 11 (11) [excess:37831] Signal code: Address not mapped (1) [excess:37831] Failing at address: 0x7f [excess:37831] *** End of error message *** [excess:37832] *** Process received signal *** [excess:37832] Signal: Segmentation fault: 11 (11) [excess:37832] Signal code: Address not mapped (1) [excess:37832] Failing at address: 0x7f [excess:37832] *** End of error message *** -------------------------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code. Per user-direction, the job has been aborted. -------------------------------------------------------------------------- -------------------------------------------------------------------------- mpiexec noticed that process rank 1 with PID 0 on node excess exited on signal 11 (Segmentation fault: 11). -------------------------------------------------------------------------- Possible error running Fortran example src/snes/tutorials/ex5f with 1 MPI process See https://petsc.org/release/faq/ forrtl: severe (174): SIGSEGV, segmentation fault occurred Image PC Routine Line Source libifcoremt.dylib 000000010B7F7FE4 for__signal_handl Unknown Unknown libsystem_platfor 00007FF8024C25ED _sigtramp Unknown Unknown ex5f 00000001087AFA38 PetscGetArchType Unknown Unknown ex5f 000000010887913B PetscErrorPrintfI Unknown Unknown ex5f 000000010878D227 PetscInitialize_C Unknown Unknown ex5f 000000010879D289 petscinitializef_ Unknown Unknown ex5f 0000000108713C09 petscsys_mp_petsc Unknown Unknown ex5f 0000000108710B5D MAIN__ Unknown Unknown ex5f 0000000108710AEE main Unknown Unknown dyld 00007FF80213B41F start Unknown Unknown -------------------------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code. 
Per user-direction, the job has been aborted. -------------------------------------------------------------------------- -------------------------------------------------------------------------- mpiexec detected that one or more processes exited with non-zero status, thus causing the job to be terminated. The first process to do so was: Process name: [[48108,1],0] Exit code: 174 -------------------------------------------------------------------------- Completed test examples Error while running make check make[1]: *** [check] Error 1 make: *** [check] Error 2 ________________________________ From: Vanella, Marcos (Fed) > Sent: Monday, May 15, 2023 12:20 PM To: Matthew Knepley > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Compiling PETSC with Intel OneAPI compilers and OpenMPI Thank you Matt I'll try this and let you know. Marcos ________________________________ From: Matthew Knepley > Sent: Monday, May 15, 2023 12:08 PM To: Vanella, Marcos (Fed) > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Compiling PETSC with Intel OneAPI compilers and OpenMPI On Mon, May 15, 2023 at 11:19?AM Vanella, Marcos (Fed) via petsc-users > wrote: Hello, I'm trying to compile the PETSc library version 3.19.1 with OpenMPI 4.1.4 and the OneAPI 2022 Update 2 Intel Compiler suite on a Mac with OSX Ventura 13.3.1. I can compile PETSc in debug mode with this configure and make lines. I can run the PETSC tests, which seem fine. When I compile the library in optimized mode, either using -O3 or O1, for example configuring with: I hate to yell "compiler bug" when this happens, but it sure seems like one. Can you just use --with-debugging=0 without the custom COPTFLAGS, CXXOPTFLAGS, FOPTFLAGS? If that works, it is almost certainly a compiler bug. If not, then we can go in the debugger and see what is failing. Thanks, Matt $ ./configure --prefix=/opt/petsc-oneapi22u3 --with-blaslapack-dir=/opt/intel/oneapi/mkl/2022.2.1 COPTFLAGS='-m64 -O1 -g -diag-disable=10441' CXXOPTFLAGS='-m64 -O1 -g -diag-disable=10441' FOPTFLAGS='-m64 -O1 -g' LDFLAGS='-m64' --with-debugging=0 --with-shared-libraries=0 --download-make and using mpicc (icc), mpif90 (ifort) from Open MPI, the static lib compiles. 
Yet, I see right off the bat this segfault error in the first PETSc example: $ make PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 PETSC_ARCH=arch-darwin-c-opt test /Users/mnv/Documents/Software/petsc-3.19.1/arch-darwin-c-opt/bin/make --no-print-directory -f /Users/mnv/Documents/Software/petsc-3.19.1/gmakefile.test PETSC_ARCH=arch-darwin-c-opt PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 test /opt/intel/oneapi/intelpython/latest/bin/python3 /Users/mnv/Documents/Software/petsc-3.19.1/config/gmakegentest.py --petsc-dir=/Users/mnv/Documents/Software/petsc-3.19.1 --petsc-arch=arch-darwin-c-opt --testdir=./arch-darwin-c-opt/tests Using MAKEFLAGS: --no-print-directory -- PETSC_ARCH=arch-darwin-c-opt PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 CC arch-darwin-c-opt/tests/sys/classes/draw/tests/ex1.o In file included from /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsys.h(44), from /Users/mnv/Documents/Software/petsc-3.19.1/src/sys/classes/draw/tests/ex1.c(4): /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsystypes.h(68): warning #2621: attribute "warn_unused_result" does not apply here PETSC_ERROR_CODE_TYPEDEF enum PETSC_ERROR_CODE_NODISCARD { ^ CLINKER arch-darwin-c-opt/tests/sys/classes/draw/tests/ex1 TEST arch-darwin-c-opt/tests/counts/sys_classes_draw_tests-ex1_1.counts not ok sys_classes_draw_tests-ex1_1 # Error code: 139 #?????[excess:98681] *** Process received signal *** #?????[excess:98681] Signal: Segmentation fault: 11 (11) #?????[excess:98681] Signal code: Address not mapped (1) #?????[excess:98681] Failing at address: 0x7f #?????[excess:98681] *** End of error message *** #?????-------------------------------------------------------------------------- #?????Primary job terminated normally, but 1 process returned #?????a non-zero exit code. Per user-direction, the job has been aborted. #?????-------------------------------------------------------------------------- #?????-------------------------------------------------------------------------- #?????mpiexec noticed that process rank 0 with PID 0 on node excess exited on signal 11 (Segmentation fault: 11). #?????-------------------------------------------------------------------------- ok sys_classes_draw_tests-ex1_1 # SKIP Command failed so no diff I see the same segfault error in all PETSc examples. Any help is mostly appreciated, I'm starting to work with PETSc. Our plan is to use the linear solver from PETSc for the Poisson equation on our numerical scheme and test this on a GPU cluster. So also, any guideline on how to interface PETSc with a fortran code and personal experience is also most appreciated! Marcos -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon May 15 12:14:13 2023 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 15 May 2023 13:14:13 -0400 Subject: [petsc-users] Compiling PETSC with Intel OneAPI compilers and OpenMPI In-Reply-To: References: Message-ID: On Mon, May 15, 2023 at 1:04?PM Vanella, Marcos (Fed) < marcos.vanella at nist.gov> wrote: > Hi Matt, attached is the file. 
>

Okay, you are failing in this function:

PetscErrorCode PetscGetArchType(char str[], size_t slen)
{
  PetscFunctionBegin;
#if defined(PETSC_ARCH)
  PetscCall(PetscStrncpy(str, PETSC_ARCH, slen - 1));
#else
  #error "$PETSC_ARCH/include/petscconf.h is missing PETSC_ARCH"
#endif
  PetscFunctionReturn(PETSC_SUCCESS);
}

Now, PETSC_ARCH is defined in the header you sent, so it is likely that some other header is being picked up by mistake from some other, broken build. I would completely clean out your PETSc installation and start from scratch (see the sketch after this message).

  Thanks,

    Matt

> Thanks! > Marcos > ------------------------------ > *From:* Matthew Knepley > *Sent:* Monday, May 15, 2023 12:53 PM > *To:* Vanella, Marcos (Fed) > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] Compiling PETSC with Intel OneAPI compilers > and OpenMPI > > Send us > > $PETSC_ARCH/include/petscconf.h > > Thanks, > > Matt > > On Mon, May 15, 2023 at 12:49 PM Vanella, Marcos (Fed) < > marcos.vanella at nist.gov> wrote: > > Hi Matt, I configured the lib like this: > > $ ./configure --with-blaslapack-dir=/opt/intel/oneapi/mkl/2022.2.1 > --with-debugging=0 --with-shared-libraries=0 --download-make > > and compiled. I still get some check segfault error. See below: > > $ make PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 > PETSC_ARCH=arch-darwin-c-opt check > Running check examples to verify correct installation > Using PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 and > PETSC_ARCH=arch-darwin-c-opt > *******************Error detected during compile or > link!******************* > See https://petsc.org/release/faq/ > /Users/mnv/Documents/Software/petsc-3.19.1/src/snes/tutorials ex19 > > ********************************************************************************* > mpicc -Wl,-bind_at_load -Wl,-multiply_defined,suppress > -Wl,-multiply_defined -Wl,suppress -Wl,-commons,use_dylibs > -Wl,-search_paths_first -Wl,-no_compact_unwind -fPIC -wd1572 > -Wno-unknown-pragmas -g -O3 > -I/Users/mnv/Documents/Software/petsc-3.19.1/include > -I/Users/mnv/Documents/Software/petsc-3.19.1/arch-darwin-c-opt/include > -I/opt/X11/include -std=c99 ex19.c > -L/Users/mnv/Documents/Software/petsc-3.19.1/arch-darwin-c-opt/lib > -Wl,-rpath,/opt/intel/oneapi/mkl/2022.2.1/lib > -L/opt/intel/oneapi/mkl/2022.2.1/lib -Wl,-rpath,/opt/X11/lib -L/opt/X11/lib > -L/opt/openmpi414_oneapi22u3/lib > -Wl,-rpath,/opt/intel/oneapi/compiler/2022.2.1/mac/compiler/lib > -L/opt/intel/oneapi/tbb/2021.7.1/lib -L/opt/intel/oneapi/ippcp/2021.6.2/lib > -L/opt/intel/oneapi/ipp/2021.6.2/lib > -L/opt/intel/oneapi/dnnl/2022.2.1/cpu_iomp/lib > -L/opt/intel/oneapi/dal/2021.7.1/lib > -L/opt/intel/oneapi/compiler/2022.2.1/mac/compiler/lib > -L/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/lib > -Wl,-rpath,/opt/intel/oneapi/compiler/2022.2.1/mac/bin/intel64/../../compiler/lib > -L/opt/intel/oneapi/compiler/2022.2.1/mac/bin/intel64/../../compiler/lib > -Wl,-rpath,/Library/Developer/CommandLineTools/usr/lib/clang/14.0.3/lib/darwin > -L/Library/Developer/CommandLineTools/usr/lib/clang/14.0.3/lib/darwin > -lpetsc -lmkl_intel_lp64 -lmkl_core -lmkl_sequential -lpthread -lX11 > -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lopen-rte > -lopen-pal -limf -lm -lz -lifport -lifcoremt -lsvml -lipgo -lirc -lpthread > -lclang_rt.osx -lmpi -lopen-rte -lopen-pal -limf -lm -lz -lsvml -lirng > -lc++ -lipgo -ldecimal -lirc -lclang_rt.osx -lmpi -lopen-rte -lopen-pal > -limf -lm -lz -lsvml -lirng -lc++ -lipgo -ldecimal -lirc -lclang_rt.osx -o > ex19 > icc: remark #10441: The
Intel(R) C++ Compiler Classic (ICC) is deprecated > and will be removed from product release in the second half of 2023. The > Intel(R) oneAPI DPC++/C++ Compiler (ICX) is the recommended compiler moving > forward. Please transition to use this compiler. Use '-diag-disable=10441' > to disable this message. > In file included from > /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsys.h(44), > from > /Users/mnv/Documents/Software/petsc-3.19.1/include/petscvec.h(9), > from > /Users/mnv/Documents/Software/petsc-3.19.1/include/petscmat.h(7), > from > /Users/mnv/Documents/Software/petsc-3.19.1/include/petscpc.h(7), > from > /Users/mnv/Documents/Software/petsc-3.19.1/include/petscksp.h(7), > from > /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsnes.h(7), > from ex19.c(68): > /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsystypes.h(68): > warning #2621: attribute "warn_unused_result" does not apply here > PETSC_ERROR_CODE_TYPEDEF enum PETSC_ERROR_CODE_NODISCARD { > ^ > > Possible error running C/C++ src/snes/tutorials/ex19 with 1 MPI process > See https://petsc.org/release/faq/ > [excess:37807] *** Process received signal *** > [excess:37807] Signal: Segmentation fault: 11 (11) > [excess:37807] Signal code: Address not mapped (1) > [excess:37807] Failing at address: 0x7f > [excess:37807] *** End of error message *** > -------------------------------------------------------------------------- > Primary job terminated normally, but 1 process returned > a non-zero exit code. Per user-direction, the job has been aborted. > -------------------------------------------------------------------------- > -------------------------------------------------------------------------- > mpiexec noticed that process rank 0 with PID 0 on node excess exited on > signal 11 (Segmentation fault: 11). > -------------------------------------------------------------------------- > Possible error running C/C++ src/snes/tutorials/ex19 with 2 MPI processes > See https://petsc.org/release/faq/ > [excess:37831] *** Process received signal *** > [excess:37831] Signal: Segmentation fault: 11 (11) > [excess:37831] Signal code: Address not mapped (1) > [excess:37831] Failing at address: 0x7f > [excess:37831] *** End of error message *** > [excess:37832] *** Process received signal *** > [excess:37832] Signal: Segmentation fault: 11 (11) > [excess:37832] Signal code: Address not mapped (1) > [excess:37832] Failing at address: 0x7f > [excess:37832] *** End of error message *** > -------------------------------------------------------------------------- > Primary job terminated normally, but 1 process returned > a non-zero exit code. Per user-direction, the job has been aborted. > -------------------------------------------------------------------------- > -------------------------------------------------------------------------- > mpiexec noticed that process rank 1 with PID 0 on node excess exited on > signal 11 (Segmentation fault: 11). 
> -------------------------------------------------------------------------- > Possible error running Fortran example src/snes/tutorials/ex5f with 1 MPI > process > See https://petsc.org/release/faq/ > forrtl: severe (174): SIGSEGV, segmentation fault occurred > Image PC Routine Line Source > > libifcoremt.dylib 000000010B7F7FE4 for__signal_handl Unknown Unknown > libsystem_platfor 00007FF8024C25ED _sigtramp Unknown Unknown > ex5f 00000001087AFA38 PetscGetArchType Unknown Unknown > ex5f 000000010887913B PetscErrorPrintfI Unknown Unknown > ex5f 000000010878D227 PetscInitialize_C Unknown Unknown > ex5f 000000010879D289 petscinitializef_ Unknown Unknown > ex5f 0000000108713C09 petscsys_mp_petsc Unknown Unknown > ex5f 0000000108710B5D MAIN__ Unknown Unknown > ex5f 0000000108710AEE main Unknown Unknown > dyld 00007FF80213B41F start Unknown Unknown > -------------------------------------------------------------------------- > Primary job terminated normally, but 1 process returned > a non-zero exit code. Per user-direction, the job has been aborted. > -------------------------------------------------------------------------- > -------------------------------------------------------------------------- > mpiexec detected that one or more processes exited with non-zero status, > thus causing > the job to be terminated. The first process to do so was: > > Process name: [[48108,1],0] > Exit code: 174 > -------------------------------------------------------------------------- > Completed test examples > Error while running make check > make[1]: *** [check] Error 1 > make: *** [check] Error 2 > > ------------------------------ > *From:* Vanella, Marcos (Fed) > *Sent:* Monday, May 15, 2023 12:20 PM > *To:* Matthew Knepley > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] Compiling PETSC with Intel OneAPI compilers > and OpenMPI > > Thank you Matt I'll try this and let you know. > Marcos > ------------------------------ > *From:* Matthew Knepley > *Sent:* Monday, May 15, 2023 12:08 PM > *To:* Vanella, Marcos (Fed) > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] Compiling PETSC with Intel OneAPI compilers > and OpenMPI > > On Mon, May 15, 2023 at 11:19?AM Vanella, Marcos (Fed) via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Hello, I'm trying to compile the PETSc library version 3.19.1 with OpenMPI > 4.1.4 and the OneAPI 2022 Update 2 Intel Compiler suite on a Mac with OSX > Ventura 13.3.1. > I can compile PETSc in debug mode with this configure and make lines. I > can run the PETSC tests, which seem fine. > When I compile the library in optimized mode, either using -O3 or O1, for > example configuring with: > > > I hate to yell "compiler bug" when this happens, but it sure seems like > one. Can you just use > > --with-debugging=0 > > without the custom COPTFLAGS, CXXOPTFLAGS, FOPTFLAGS? If that works, it is > almost > certainly a compiler bug. If not, then we can go in the debugger and see > what is failing. > > Thanks, > > Matt > > > $ ./configure --prefix=/opt/petsc-oneapi22u3 > --with-blaslapack-dir=/opt/intel/oneapi/mkl/2022.2.1 COPTFLAGS='-m64 -O1 -g > -diag-disable=10441' CXXOPTFLAGS='-m64 -O1 -g -diag-disable=10441' > FOPTFLAGS='-m64 -O1 -g' LDFLAGS='-m64' --with-debugging=0 > --with-shared-libraries=0 --download-make > > and using mpicc (icc), mpif90 (ifort) from Open MPI, the static lib > compiles. 
Yet, I see right off the bat this segfault error in the first > PETSc example: > > $ make PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 > PETSC_ARCH=arch-darwin-c-opt test > /Users/mnv/Documents/Software/petsc-3.19.1/arch-darwin-c-opt/bin/make > --no-print-directory -f > /Users/mnv/Documents/Software/petsc-3.19.1/gmakefile.test > PETSC_ARCH=arch-darwin-c-opt > PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 test > /opt/intel/oneapi/intelpython/latest/bin/python3 > /Users/mnv/Documents/Software/petsc-3.19.1/config/gmakegentest.py > --petsc-dir=/Users/mnv/Documents/Software/petsc-3.19.1 > --petsc-arch=arch-darwin-c-opt --testdir=./arch-darwin-c-opt/tests > Using MAKEFLAGS: --no-print-directory -- PETSC_ARCH=arch-darwin-c-opt > PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 > CC arch-darwin-c-opt/tests/sys/classes/draw/tests/ex1.o > In file included from > /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsys.h(44), > from > /Users/mnv/Documents/Software/petsc-3.19.1/src/sys/classes/draw/tests/ex1.c(4): > /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsystypes.h(68): > warning #2621: attribute "warn_unused_result" does not apply here > PETSC_ERROR_CODE_TYPEDEF enum PETSC_ERROR_CODE_NODISCARD { > ^ > > CLINKER arch-darwin-c-opt/tests/sys/classes/draw/tests/ex1 > TEST > arch-darwin-c-opt/tests/counts/sys_classes_draw_tests-ex1_1.counts > not ok sys_classes_draw_tests-ex1_1 *# Error code: 139* > *# [excess:98681] *** Process received signal **** > *# [excess:98681] Signal: Segmentation fault: 11 (11)* > *# [excess:98681] Signal code: Address not mapped (1)* > *# [excess:98681] Failing at address: 0x7f* > *# [excess:98681] *** End of error message **** > # > -------------------------------------------------------------------------- > # Primary job terminated normally, but 1 process returned > # a non-zero exit code. Per user-direction, the job has been aborted. > # > -------------------------------------------------------------------------- > # > -------------------------------------------------------------------------- > # mpiexec noticed that process rank 0 with PID 0 on node excess exited on > signal 11 (Segmentation fault: 11). > # > -------------------------------------------------------------------------- > ok sys_classes_draw_tests-ex1_1 # SKIP Command failed so no diff > > I see the same segfault error in all PETSc examples. > Any help is mostly appreciated, I'm starting to work with PETSc. Our plan > is to use the linear solver from PETSc for the Poisson equation on our > numerical scheme and test this on a GPU cluster. So also, any guideline on > how to interface PETSc with a fortran code and personal experience is also > most appreciated! > > Marcos > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
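For reference, a minimal sketch of the "clean out and start from scratch" step suggested above. The paths, PETSC_ARCH name, and configure options are reused from the commands already quoted in this thread; the exact option set is an assumption and should be whatever the rebuild is actually meant to use:

$ cd /Users/mnv/Documents/Software/petsc-3.19.1
$ rm -rf arch-darwin-c-opt    # removes the generated configuration, headers, and objects for this arch
$ ./configure --with-blaslapack-dir=/opt/intel/oneapi/mkl/2022.2.1 --with-debugging=0 --with-shared-libraries=0 --download-make
$ make PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 PETSC_ARCH=arch-darwin-c-opt all
$ make PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 PETSC_ARCH=arch-darwin-c-opt check

It is also worth making sure that no older install (for example the earlier --prefix=/opt/petsc-oneapi22u3 one) is still reachable through the compiler's include or library paths, since a stale petscconf.h from such an install would explain the mismatch described above.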
URL: From samar.khatiwala at earth.ox.ac.uk Mon May 15 12:22:18 2023 From: samar.khatiwala at earth.ox.ac.uk (Samar Khatiwala) Date: Mon, 15 May 2023 17:22:18 +0000 Subject: [petsc-users] Compiling PETSC with Intel OneAPI compilers and OpenMPI In-Reply-To: References: Message-ID: <17320D85-1186-4B51-929C-4D6D8D222BB6@earth.ox.ac.uk> Hi Marcos, Yes, I compiled with clang instead of icc (no particular reason for this; I tend to use gcc/clang). I use mpich4.1.1, which I first built with clang and ifort: FC=ifort ./configure --prefix=/usr/local/mpich4 --enable-two-level-namespace Samar On May 15, 2023, at 6:07 PM, Vanella, Marcos (Fed) wrote: Hi Samar, what MPI library do you use? Did you compile it with clang instead of icc? Thanks, Marcos ________________________________ From: Samar Khatiwala Sent: Monday, May 15, 2023 1:05 PM To: Matthew Knepley Cc: Vanella, Marcos (Fed) ; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Compiling PETSC with Intel OneAPI compilers and OpenMPI Hi, for what it?s worth, clang + ifort from OneAPI 2023 update 1 works fine for me on both Intel and M2 Macs. So it might just be a matter of upgrading. Samar On May 15, 2023, at 5:53 PM, Matthew Knepley wrote: Send us $PETSC_ARCH/include/petscconf.h Thanks, Matt On Mon, May 15, 2023 at 12:49?PM Vanella, Marcos (Fed) > wrote: Hi Matt, I configured the lib like this: $ ./configure --with-blaslapack-dir=/opt/intel/oneapi/mkl/2022.2.1 --with-debugging=0 --with-shared-libraries=0 --download-make and compiled. I still get some check segfault error. See below: $ make PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 PETSC_ARCH=arch-darwin-c-opt check Running check examples to verify correct installation Using PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 and PETSC_ARCH=arch-darwin-c-opt *******************Error detected during compile or link!******************* See https://petsc.org/release/faq/ /Users/mnv/Documents/Software/petsc-3.19.1/src/snes/tutorials ex19 ********************************************************************************* mpicc -Wl,-bind_at_load -Wl,-multiply_defined,suppress -Wl,-multiply_defined -Wl,suppress -Wl,-commons,use_dylibs -Wl,-search_paths_first -Wl,-no_compact_unwind -fPIC -wd1572 -Wno-unknown-pragmas -g -O3 -I/Users/mnv/Documents/Software/petsc-3.19.1/include -I/Users/mnv/Documents/Software/petsc-3.19.1/arch-darwin-c-opt/include -I/opt/X11/include -std=c99 ex19.c -L/Users/mnv/Documents/Software/petsc-3.19.1/arch-darwin-c-opt/lib -Wl,-rpath,/opt/intel/oneapi/mkl/2022.2.1/lib -L/opt/intel/oneapi/mkl/2022.2.1/lib -Wl,-rpath,/opt/X11/lib -L/opt/X11/lib -L/opt/openmpi414_oneapi22u3/lib -Wl,-rpath,/opt/intel/oneapi/compiler/2022.2.1/mac/compiler/lib -L/opt/intel/oneapi/tbb/2021.7.1/lib -L/opt/intel/oneapi/ippcp/2021.6.2/lib -L/opt/intel/oneapi/ipp/2021.6.2/lib -L/opt/intel/oneapi/dnnl/2022.2.1/cpu_iomp/lib -L/opt/intel/oneapi/dal/2021.7.1/lib -L/opt/intel/oneapi/compiler/2022.2.1/mac/compiler/lib -L/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/lib -Wl,-rpath,/opt/intel/oneapi/compiler/2022.2.1/mac/bin/intel64/../../compiler/lib -L/opt/intel/oneapi/compiler/2022.2.1/mac/bin/intel64/../../compiler/lib -Wl,-rpath,/Library/Developer/CommandLineTools/usr/lib/clang/14.0.3/lib/darwin -L/Library/Developer/CommandLineTools/usr/lib/clang/14.0.3/lib/darwin -lpetsc -lmkl_intel_lp64 -lmkl_core -lmkl_sequential -lpthread -lX11 -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -limf -lm -lz -lifport -lifcoremt -lsvml -lipgo -lirc -lpthread -lclang_rt.osx 
-lmpi -lopen-rte -lopen-pal -limf -lm -lz -lsvml -lirng -lc++ -lipgo -ldecimal -lirc -lclang_rt.osx -lmpi -lopen-rte -lopen-pal -limf -lm -lz -lsvml -lirng -lc++ -lipgo -ldecimal -lirc -lclang_rt.osx -o ex19 icc: remark #10441: The Intel(R) C++ Compiler Classic (ICC) is deprecated and will be removed from product release in the second half of 2023. The Intel(R) oneAPI DPC++/C++ Compiler (ICX) is the recommended compiler moving forward. Please transition to use this compiler. Use '-diag-disable=10441' to disable this message. In file included from /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsys.h(44), from /Users/mnv/Documents/Software/petsc-3.19.1/include/petscvec.h(9), from /Users/mnv/Documents/Software/petsc-3.19.1/include/petscmat.h(7), from /Users/mnv/Documents/Software/petsc-3.19.1/include/petscpc.h(7), from /Users/mnv/Documents/Software/petsc-3.19.1/include/petscksp.h(7), from /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsnes.h(7), from ex19.c(68): /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsystypes.h(68): warning #2621: attribute "warn_unused_result" does not apply here PETSC_ERROR_CODE_TYPEDEF enum PETSC_ERROR_CODE_NODISCARD { ^ Possible error running C/C++ src/snes/tutorials/ex19 with 1 MPI process See https://petsc.org/release/faq/ [excess:37807] *** Process received signal *** [excess:37807] Signal: Segmentation fault: 11 (11) [excess:37807] Signal code: Address not mapped (1) [excess:37807] Failing at address: 0x7f [excess:37807] *** End of error message *** -------------------------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code. Per user-direction, the job has been aborted. -------------------------------------------------------------------------- -------------------------------------------------------------------------- mpiexec noticed that process rank 0 with PID 0 on node excess exited on signal 11 (Segmentation fault: 11). -------------------------------------------------------------------------- Possible error running C/C++ src/snes/tutorials/ex19 with 2 MPI processes See https://petsc.org/release/faq/ [excess:37831] *** Process received signal *** [excess:37831] Signal: Segmentation fault: 11 (11) [excess:37831] Signal code: Address not mapped (1) [excess:37831] Failing at address: 0x7f [excess:37831] *** End of error message *** [excess:37832] *** Process received signal *** [excess:37832] Signal: Segmentation fault: 11 (11) [excess:37832] Signal code: Address not mapped (1) [excess:37832] Failing at address: 0x7f [excess:37832] *** End of error message *** -------------------------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code. Per user-direction, the job has been aborted. -------------------------------------------------------------------------- -------------------------------------------------------------------------- mpiexec noticed that process rank 1 with PID 0 on node excess exited on signal 11 (Segmentation fault: 11). 
-------------------------------------------------------------------------- Possible error running Fortran example src/snes/tutorials/ex5f with 1 MPI process See https://petsc.org/release/faq/ forrtl: severe (174): SIGSEGV, segmentation fault occurred Image PC Routine Line Source libifcoremt.dylib 000000010B7F7FE4 for__signal_handl Unknown Unknown libsystem_platfor 00007FF8024C25ED _sigtramp Unknown Unknown ex5f 00000001087AFA38 PetscGetArchType Unknown Unknown ex5f 000000010887913B PetscErrorPrintfI Unknown Unknown ex5f 000000010878D227 PetscInitialize_C Unknown Unknown ex5f 000000010879D289 petscinitializef_ Unknown Unknown ex5f 0000000108713C09 petscsys_mp_petsc Unknown Unknown ex5f 0000000108710B5D MAIN__ Unknown Unknown ex5f 0000000108710AEE main Unknown Unknown dyld 00007FF80213B41F start Unknown Unknown -------------------------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code. Per user-direction, the job has been aborted. -------------------------------------------------------------------------- -------------------------------------------------------------------------- mpiexec detected that one or more processes exited with non-zero status, thus causing the job to be terminated. The first process to do so was: Process name: [[48108,1],0] Exit code: 174 -------------------------------------------------------------------------- Completed test examples Error while running make check make[1]: *** [check] Error 1 make: *** [check] Error 2 ________________________________ From: Vanella, Marcos (Fed) > Sent: Monday, May 15, 2023 12:20 PM To: Matthew Knepley > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Compiling PETSC with Intel OneAPI compilers and OpenMPI Thank you Matt I'll try this and let you know. Marcos ________________________________ From: Matthew Knepley > Sent: Monday, May 15, 2023 12:08 PM To: Vanella, Marcos (Fed) > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Compiling PETSC with Intel OneAPI compilers and OpenMPI On Mon, May 15, 2023 at 11:19?AM Vanella, Marcos (Fed) via petsc-users > wrote: Hello, I'm trying to compile the PETSc library version 3.19.1 with OpenMPI 4.1.4 and the OneAPI 2022 Update 2 Intel Compiler suite on a Mac with OSX Ventura 13.3.1. I can compile PETSc in debug mode with this configure and make lines. I can run the PETSC tests, which seem fine. When I compile the library in optimized mode, either using -O3 or O1, for example configuring with: I hate to yell "compiler bug" when this happens, but it sure seems like one. Can you just use --with-debugging=0 without the custom COPTFLAGS, CXXOPTFLAGS, FOPTFLAGS? If that works, it is almost certainly a compiler bug. If not, then we can go in the debugger and see what is failing. Thanks, Matt $ ./configure --prefix=/opt/petsc-oneapi22u3 --with-blaslapack-dir=/opt/intel/oneapi/mkl/2022.2.1 COPTFLAGS='-m64 -O1 -g -diag-disable=10441' CXXOPTFLAGS='-m64 -O1 -g -diag-disable=10441' FOPTFLAGS='-m64 -O1 -g' LDFLAGS='-m64' --with-debugging=0 --with-shared-libraries=0 --download-make and using mpicc (icc), mpif90 (ifort) from Open MPI, the static lib compiles. 
Yet, I see right off the bat this segfault error in the first PETSc example: $ make PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 PETSC_ARCH=arch-darwin-c-opt test /Users/mnv/Documents/Software/petsc-3.19.1/arch-darwin-c-opt/bin/make --no-print-directory -f /Users/mnv/Documents/Software/petsc-3.19.1/gmakefile.test PETSC_ARCH=arch-darwin-c-opt PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 test /opt/intel/oneapi/intelpython/latest/bin/python3 /Users/mnv/Documents/Software/petsc-3.19.1/config/gmakegentest.py --petsc-dir=/Users/mnv/Documents/Software/petsc-3.19.1 --petsc-arch=arch-darwin-c-opt --testdir=./arch-darwin-c-opt/tests Using MAKEFLAGS: --no-print-directory -- PETSC_ARCH=arch-darwin-c-opt PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 CC arch-darwin-c-opt/tests/sys/classes/draw/tests/ex1.o In file included from /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsys.h(44), from /Users/mnv/Documents/Software/petsc-3.19.1/src/sys/classes/draw/tests/ex1.c(4): /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsystypes.h(68): warning #2621: attribute "warn_unused_result" does not apply here PETSC_ERROR_CODE_TYPEDEF enum PETSC_ERROR_CODE_NODISCARD { ^ CLINKER arch-darwin-c-opt/tests/sys/classes/draw/tests/ex1 TEST arch-darwin-c-opt/tests/counts/sys_classes_draw_tests-ex1_1.counts not ok sys_classes_draw_tests-ex1_1 # Error code: 139 #?????[excess:98681] *** Process received signal *** #?????[excess:98681] Signal: Segmentation fault: 11 (11) #?????[excess:98681] Signal code: Address not mapped (1) #?????[excess:98681] Failing at address: 0x7f #?????[excess:98681] *** End of error message *** #?????-------------------------------------------------------------------------- #?????Primary job terminated normally, but 1 process returned #?????a non-zero exit code. Per user-direction, the job has been aborted. #?????-------------------------------------------------------------------------- #?????-------------------------------------------------------------------------- #?????mpiexec noticed that process rank 0 with PID 0 on node excess exited on signal 11 (Segmentation fault: 11). #?????-------------------------------------------------------------------------- ok sys_classes_draw_tests-ex1_1 # SKIP Command failed so no diff I see the same segfault error in all PETSc examples. Any help is mostly appreciated, I'm starting to work with PETSc. Our plan is to use the linear solver from PETSc for the Poisson equation on our numerical scheme and test this on a GPU cluster. So also, any guideline on how to interface PETSc with a fortran code and personal experience is also most appreciated! Marcos -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
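Picking up Samar's clang + ifort route above, a minimal sketch of building an MPI with those compilers and pointing PETSc at it. The /usr/local/mpich4 prefix comes from Samar's configure line; the compiler variables and the remaining PETSc options are assumptions, kept consistent with the commands already shown in this thread:

# in the MPICH source tree (MPICH 4.x, as in Samar's message)
$ CC=clang CXX=clang++ FC=ifort ./configure --prefix=/usr/local/mpich4 --enable-two-level-namespace
$ make && make install

# in the PETSc source tree: use that MPI instead of the icc-built Open MPI
$ ./configure --with-mpi-dir=/usr/local/mpich4 --with-blaslapack-dir=/opt/intel/oneapi/mkl/2022.2.1 --with-debugging=0 --with-shared-libraries=0 --download-make

With --with-mpi-dir, PETSc's configure picks up the mpicc/mpif90 wrappers installed under that prefix, so no compiler flags need to be repeated on the PETSc side.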
URL: 

From marcos.vanella at nist.gov Mon May 15 13:05:12 2023
From: marcos.vanella at nist.gov (Vanella, Marcos (Fed))
Date: Mon, 15 May 2023 18:05:12 +0000
Subject: [petsc-users] Compiling PETSC with Intel OneAPI compilers and OpenMPI
In-Reply-To: <17320D85-1186-4B51-929C-4D6D8D222BB6@earth.ox.ac.uk>
References: <17320D85-1186-4B51-929C-4D6D8D222BB6@earth.ox.ac.uk>
Message-ID: 

Thank you Matt and Samar. It seems the segfaults I see are related to icc, which is not being updated anymore. The recommended Intel C compiler is icx, which is not released for Macs. I compiled the lib with gcc 13 + Open MPI from Homebrew and the tests are passing just fine in optimized mode. I will follow Samar's suggestion and build Open MPI with clang + ifort, then check that PETSc works fine with it. It might be time to get rid of icc in our bundle-building process for Macs; I keep getting this warning:

icc: remark #10441: The Intel(R) C++ Compiler Classic (ICC) is deprecated and will be removed from product release in the second half of 2023. The Intel(R) oneAPI DPC++/C++ Compiler (ICX) is the recommended compiler moving forward. Please transition to use this compiler.

Thanks for the help!
Marcos
________________________________
From: Samar Khatiwala
Sent: Monday, May 15, 2023 1:22 PM
To: Vanella, Marcos (Fed)
Cc: Matthew Knepley ; petsc-users at mcs.anl.gov
Subject: Re: [petsc-users] Compiling PETSC with Intel OneAPI compilers and OpenMPI

Hi Marcos,

Yes, I compiled with clang instead of icc (no particular reason for this; I tend to use gcc/clang). I use mpich4.1.1, which I first built with clang and ifort:

FC=ifort ./configure --prefix=/usr/local/mpich4 --enable-two-level-namespace

Samar

On May 15, 2023, at 6:07 PM, Vanella, Marcos (Fed) wrote:

Hi Samar, what MPI library do you use? Did you compile it with clang instead of icc?
Thanks,
Marcos
________________________________
From: Samar Khatiwala
Sent: Monday, May 15, 2023 1:05 PM
To: Matthew Knepley
Cc: Vanella, Marcos (Fed) ; petsc-users at mcs.anl.gov
Subject: Re: [petsc-users] Compiling PETSC with Intel OneAPI compilers and OpenMPI

Hi, for what it's worth, clang + ifort from OneAPI 2023 update 1 works fine for me on both Intel and M2 Macs. So it might just be a matter of upgrading.

Samar

On May 15, 2023, at 5:53 PM, Matthew Knepley wrote:

Send us

$PETSC_ARCH/include/petscconf.h

Thanks,

Matt

On Mon, May 15, 2023 at 12:49 PM Vanella, Marcos (Fed) > wrote:

Hi Matt, I configured the lib like this:

$ ./configure --with-blaslapack-dir=/opt/intel/oneapi/mkl/2022.2.1 --with-debugging=0 --with-shared-libraries=0 --download-make

and compiled. I still get some check segfault error.
See below: $ make PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 PETSC_ARCH=arch-darwin-c-opt check Running check examples to verify correct installation Using PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 and PETSC_ARCH=arch-darwin-c-opt *******************Error detected during compile or link!******************* See https://petsc.org/release/faq/ /Users/mnv/Documents/Software/petsc-3.19.1/src/snes/tutorials ex19 ********************************************************************************* mpicc -Wl,-bind_at_load -Wl,-multiply_defined,suppress -Wl,-multiply_defined -Wl,suppress -Wl,-commons,use_dylibs -Wl,-search_paths_first -Wl,-no_compact_unwind -fPIC -wd1572 -Wno-unknown-pragmas -g -O3 -I/Users/mnv/Documents/Software/petsc-3.19.1/include -I/Users/mnv/Documents/Software/petsc-3.19.1/arch-darwin-c-opt/include -I/opt/X11/include -std=c99 ex19.c -L/Users/mnv/Documents/Software/petsc-3.19.1/arch-darwin-c-opt/lib -Wl,-rpath,/opt/intel/oneapi/mkl/2022.2.1/lib -L/opt/intel/oneapi/mkl/2022.2.1/lib -Wl,-rpath,/opt/X11/lib -L/opt/X11/lib -L/opt/openmpi414_oneapi22u3/lib -Wl,-rpath,/opt/intel/oneapi/compiler/2022.2.1/mac/compiler/lib -L/opt/intel/oneapi/tbb/2021.7.1/lib -L/opt/intel/oneapi/ippcp/2021.6.2/lib -L/opt/intel/oneapi/ipp/2021.6.2/lib -L/opt/intel/oneapi/dnnl/2022.2.1/cpu_iomp/lib -L/opt/intel/oneapi/dal/2021.7.1/lib -L/opt/intel/oneapi/compiler/2022.2.1/mac/compiler/lib -L/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/lib -Wl,-rpath,/opt/intel/oneapi/compiler/2022.2.1/mac/bin/intel64/../../compiler/lib -L/opt/intel/oneapi/compiler/2022.2.1/mac/bin/intel64/../../compiler/lib -Wl,-rpath,/Library/Developer/CommandLineTools/usr/lib/clang/14.0.3/lib/darwin -L/Library/Developer/CommandLineTools/usr/lib/clang/14.0.3/lib/darwin -lpetsc -lmkl_intel_lp64 -lmkl_core -lmkl_sequential -lpthread -lX11 -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -limf -lm -lz -lifport -lifcoremt -lsvml -lipgo -lirc -lpthread -lclang_rt.osx -lmpi -lopen-rte -lopen-pal -limf -lm -lz -lsvml -lirng -lc++ -lipgo -ldecimal -lirc -lclang_rt.osx -lmpi -lopen-rte -lopen-pal -limf -lm -lz -lsvml -lirng -lc++ -lipgo -ldecimal -lirc -lclang_rt.osx -o ex19 icc: remark #10441: The Intel(R) C++ Compiler Classic (ICC) is deprecated and will be removed from product release in the second half of 2023. The Intel(R) oneAPI DPC++/C++ Compiler (ICX) is the recommended compiler moving forward. Please transition to use this compiler. Use '-diag-disable=10441' to disable this message. 
In file included from /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsys.h(44), from /Users/mnv/Documents/Software/petsc-3.19.1/include/petscvec.h(9), from /Users/mnv/Documents/Software/petsc-3.19.1/include/petscmat.h(7), from /Users/mnv/Documents/Software/petsc-3.19.1/include/petscpc.h(7), from /Users/mnv/Documents/Software/petsc-3.19.1/include/petscksp.h(7), from /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsnes.h(7), from ex19.c(68): /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsystypes.h(68): warning #2621: attribute "warn_unused_result" does not apply here PETSC_ERROR_CODE_TYPEDEF enum PETSC_ERROR_CODE_NODISCARD { ^ Possible error running C/C++ src/snes/tutorials/ex19 with 1 MPI process See https://petsc.org/release/faq/ [excess:37807] *** Process received signal *** [excess:37807] Signal: Segmentation fault: 11 (11) [excess:37807] Signal code: Address not mapped (1) [excess:37807] Failing at address: 0x7f [excess:37807] *** End of error message *** -------------------------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code. Per user-direction, the job has been aborted. -------------------------------------------------------------------------- -------------------------------------------------------------------------- mpiexec noticed that process rank 0 with PID 0 on node excess exited on signal 11 (Segmentation fault: 11). -------------------------------------------------------------------------- Possible error running C/C++ src/snes/tutorials/ex19 with 2 MPI processes See https://petsc.org/release/faq/ [excess:37831] *** Process received signal *** [excess:37831] Signal: Segmentation fault: 11 (11) [excess:37831] Signal code: Address not mapped (1) [excess:37831] Failing at address: 0x7f [excess:37831] *** End of error message *** [excess:37832] *** Process received signal *** [excess:37832] Signal: Segmentation fault: 11 (11) [excess:37832] Signal code: Address not mapped (1) [excess:37832] Failing at address: 0x7f [excess:37832] *** End of error message *** -------------------------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code. Per user-direction, the job has been aborted. -------------------------------------------------------------------------- -------------------------------------------------------------------------- mpiexec noticed that process rank 1 with PID 0 on node excess exited on signal 11 (Segmentation fault: 11). -------------------------------------------------------------------------- Possible error running Fortran example src/snes/tutorials/ex5f with 1 MPI process See https://petsc.org/release/faq/ forrtl: severe (174): SIGSEGV, segmentation fault occurred Image PC Routine Line Source libifcoremt.dylib 000000010B7F7FE4 for__signal_handl Unknown Unknown libsystem_platfor 00007FF8024C25ED _sigtramp Unknown Unknown ex5f 00000001087AFA38 PetscGetArchType Unknown Unknown ex5f 000000010887913B PetscErrorPrintfI Unknown Unknown ex5f 000000010878D227 PetscInitialize_C Unknown Unknown ex5f 000000010879D289 petscinitializef_ Unknown Unknown ex5f 0000000108713C09 petscsys_mp_petsc Unknown Unknown ex5f 0000000108710B5D MAIN__ Unknown Unknown ex5f 0000000108710AEE main Unknown Unknown dyld 00007FF80213B41F start Unknown Unknown -------------------------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code. 
Per user-direction, the job has been aborted. -------------------------------------------------------------------------- -------------------------------------------------------------------------- mpiexec detected that one or more processes exited with non-zero status, thus causing the job to be terminated. The first process to do so was: Process name: [[48108,1],0] Exit code: 174 -------------------------------------------------------------------------- Completed test examples Error while running make check make[1]: *** [check] Error 1 make: *** [check] Error 2 ________________________________ From: Vanella, Marcos (Fed) > Sent: Monday, May 15, 2023 12:20 PM To: Matthew Knepley > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Compiling PETSC with Intel OneAPI compilers and OpenMPI Thank you Matt I'll try this and let you know. Marcos ________________________________ From: Matthew Knepley > Sent: Monday, May 15, 2023 12:08 PM To: Vanella, Marcos (Fed) > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Compiling PETSC with Intel OneAPI compilers and OpenMPI On Mon, May 15, 2023 at 11:19?AM Vanella, Marcos (Fed) via petsc-users > wrote: Hello, I'm trying to compile the PETSc library version 3.19.1 with OpenMPI 4.1.4 and the OneAPI 2022 Update 2 Intel Compiler suite on a Mac with OSX Ventura 13.3.1. I can compile PETSc in debug mode with this configure and make lines. I can run the PETSC tests, which seem fine. When I compile the library in optimized mode, either using -O3 or O1, for example configuring with: I hate to yell "compiler bug" when this happens, but it sure seems like one. Can you just use --with-debugging=0 without the custom COPTFLAGS, CXXOPTFLAGS, FOPTFLAGS? If that works, it is almost certainly a compiler bug. If not, then we can go in the debugger and see what is failing. Thanks, Matt $ ./configure --prefix=/opt/petsc-oneapi22u3 --with-blaslapack-dir=/opt/intel/oneapi/mkl/2022.2.1 COPTFLAGS='-m64 -O1 -g -diag-disable=10441' CXXOPTFLAGS='-m64 -O1 -g -diag-disable=10441' FOPTFLAGS='-m64 -O1 -g' LDFLAGS='-m64' --with-debugging=0 --with-shared-libraries=0 --download-make and using mpicc (icc), mpif90 (ifort) from Open MPI, the static lib compiles. 
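For reference, the stripped-down configure Matt suggests would look something like this: keep --with-debugging=0, drop the custom COPTFLAGS/CXXOPTFLAGS/FOPTFLAGS so PETSc chooses its own optimization flags, and keep the prefix, MKL path and LDFLAGS from the original line above:

$ ./configure --prefix=/opt/petsc-oneapi22u3 --with-blaslapack-dir=/opt/intel/oneapi/mkl/2022.2.1 LDFLAGS='-m64' --with-debugging=0 --with-shared-libraries=0 --download-make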
Yet, I see right off the bat this segfault error in the first PETSc example: $ make PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 PETSC_ARCH=arch-darwin-c-opt test /Users/mnv/Documents/Software/petsc-3.19.1/arch-darwin-c-opt/bin/make --no-print-directory -f /Users/mnv/Documents/Software/petsc-3.19.1/gmakefile.test PETSC_ARCH=arch-darwin-c-opt PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 test /opt/intel/oneapi/intelpython/latest/bin/python3 /Users/mnv/Documents/Software/petsc-3.19.1/config/gmakegentest.py --petsc-dir=/Users/mnv/Documents/Software/petsc-3.19.1 --petsc-arch=arch-darwin-c-opt --testdir=./arch-darwin-c-opt/tests Using MAKEFLAGS: --no-print-directory -- PETSC_ARCH=arch-darwin-c-opt PETSC_DIR=/Users/mnv/Documents/Software/petsc-3.19.1 CC arch-darwin-c-opt/tests/sys/classes/draw/tests/ex1.o In file included from /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsys.h(44), from /Users/mnv/Documents/Software/petsc-3.19.1/src/sys/classes/draw/tests/ex1.c(4): /Users/mnv/Documents/Software/petsc-3.19.1/include/petscsystypes.h(68): warning #2621: attribute "warn_unused_result" does not apply here PETSC_ERROR_CODE_TYPEDEF enum PETSC_ERROR_CODE_NODISCARD { ^ CLINKER arch-darwin-c-opt/tests/sys/classes/draw/tests/ex1 TEST arch-darwin-c-opt/tests/counts/sys_classes_draw_tests-ex1_1.counts not ok sys_classes_draw_tests-ex1_1 # Error code: 139 #?????[excess:98681] *** Process received signal *** #?????[excess:98681] Signal: Segmentation fault: 11 (11) #?????[excess:98681] Signal code: Address not mapped (1) #?????[excess:98681] Failing at address: 0x7f #?????[excess:98681] *** End of error message *** #?????-------------------------------------------------------------------------- #?????Primary job terminated normally, but 1 process returned #?????a non-zero exit code. Per user-direction, the job has been aborted. #?????-------------------------------------------------------------------------- #?????-------------------------------------------------------------------------- #?????mpiexec noticed that process rank 0 with PID 0 on node excess exited on signal 11 (Segmentation fault: 11). #?????-------------------------------------------------------------------------- ok sys_classes_draw_tests-ex1_1 # SKIP Command failed so no diff I see the same segfault error in all PETSc examples. Any help is mostly appreciated, I'm starting to work with PETSc. Our plan is to use the linear solver from PETSc for the Poisson equation on our numerical scheme and test this on a GPU cluster. So also, any guideline on how to interface PETSc with a fortran code and personal experience is also most appreciated! Marcos -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From yangzongze at gmail.com Mon May 15 22:29:46 2023 From: yangzongze at gmail.com (Zongze Yang) Date: Tue, 16 May 2023 11:29:46 +0800 Subject: [petsc-users] How to find the map between the high order coordinates of DMPlex and vertex numbering? In-Reply-To: References: <2640A1A9-101C-4DFB-BFA4-C64AF231732A@gmail.com> Message-ID: Got it. Thank you for your explanation! 
Best wishes, Zongze On Mon, 15 May 2023 at 23:28, Matthew Knepley wrote: > On Mon, May 15, 2023 at 9:55?AM Zongze Yang wrote: > >> On Mon, 15 May 2023 at 17:24, Matthew Knepley wrote: >> >>> On Sun, May 14, 2023 at 7:23?PM Zongze Yang >>> wrote: >>> >>>> Could you try to project the coordinates into the continuity space by >>>> enabling the option >>>> `-dm_plex_gmsh_project_petscdualspace_lagrange_continuity true`? >>>> >>> >>> There is a comment in the code about that: >>> >>> /* XXX FIXME Requires DMPlexSetClosurePermutationLexicographic() */ >>> >>> So what is currently done is you project into the discontinuous space >>> from the GMsh coordinates, >>> and then we get the continuous coordinates from those later. This is why >>> we get the right answer. >>> >>> >> Sorry, I'm having difficulty understanding the comment and fully >> understanding your intended meaning. Are you saying that we can only >> project the space to a discontinuous space? >> > > For higher order simplices, because we do not have the mapping to the GMsh > order yet. > > >> Additionally, should we always set >> `dm_plex_gmsh_project_petscdualspace_lagrange_continuity` to false for >> high-order gmsh files? >> > > This is done automatically if you do not override it. > > >> With the option set to `true`, I got the following error: >> > > Yes, do not do that. > > Thanks, > > Matt > > >> ``` >> $ $PETSC_DIR/$PETSC_ARCH/tests/dm/impls/plex/tests/runex33_gmsh_3d_q2.sh >> -e "-dm_plex_gmsh_project_petscdualspace_lagrange_continuity true" >> not ok dm_impls_plex_tests-ex33_gmsh_3d_q2 # Error code: 77 >> # Volume: 0.46875 >> # [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> # [0]PETSC ERROR: Petsc has generated inconsistent data >> # [0]PETSC ERROR: Calculated volume 0.46875 != 1. actual volume >> (error 0.53125 > 1e-06 tol) >> # [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble >> shooting. 
>> # [0]PETSC ERROR: Petsc Development GIT revision: >> v3.19.1-294-g9cc24bc9b93 GIT Date: 2023-05-15 12:07:10 +0000 >> # [0]PETSC ERROR: ../ex33 on a arch-linux-c-debug named AMA-PC-RA18 >> by yzz Mon May 15 21:53:43 2023 >> # [0]PETSC ERROR: Configure options >> --CFLAGS=-I/opt/intel/oneapi/mkl/latest/include >> --CXXFLAGS=-I/opt/intel/oneapi/mkl/latest/include >> --LDFLAGS="-Wl,-rpath,/opt/intel/oneapi/mkl/latest/lib/intel64 >> -L/opt/intel/oneapi/mkl/latest/lib/intel64" --download-bison >> --download-chaco --download-cmake >> --download-eigen="/home/yzz/firedrake/complex-int32-mkl-X-debug/src/eigen-3.3.3.tgz >> " --download-fftw --download-hdf5 --download-hpddm --download-hwloc >> --download-libpng --download-metis --download-mmg --download-mpich >> --download-mumps --download-netcdf --download-p4est --download-parmmg >> --download-pastix --download-pnetcdf --download-ptscotch >> --download-scalapack --download-slepc --download-suitesparse >> --download-superlu_dist --download-tetgen --download-triangle >> --with-blaslapack-dir=/opt/intel/oneapi/mkl/latest --with-c2html=0 >> --with-debugging=1 --with-fortran-bindings=0 >> --with-mkl_cpardiso-dir=/opt/intel/oneapi/mkl/latest >> --with-mkl_pardiso-dir=/opt/intel/oneapi/mkl/latest >> --with-scalar-type=complex --with-shared-libraries=1 --with-x=1 --with-zlib >> PETSC_ARCH=arch-linux-c-debug >> # [0]PETSC ERROR: #1 CheckVolume() at >> /home/yzz/opt/petsc/src/dm/impls/plex/tests/ex33.c:246 >> # [0]PETSC ERROR: #2 main() at >> /home/yzz/opt/petsc/src/dm/impls/plex/tests/ex33.c:261 >> # [0]PETSC ERROR: PETSc Option Table entries: >> # [0]PETSC ERROR: -coord_space 0 (source: command line) >> # [0]PETSC ERROR: -dm_plex_filename >> /home/yzz/opt/petsc/share/petsc/datafiles/meshes/cube_q2.msh (source: >> command line) >> # [0]PETSC ERROR: -dm_plex_gmsh_project (source: command line) >> # [0]PETSC ERROR: >> -dm_plex_gmsh_project_petscdualspace_lagrange_continuity true (source: >> command line) >> # [0]PETSC ERROR: -tol 1e-6 (source: command line) >> # [0]PETSC ERROR: -volume 1.0 (source: command line) >> # [0]PETSC ERROR: ----------------End of Error Message -------send >> entire error message to petsc-maint at mcs.anl.gov---------- >> # application called MPI_Abort(MPI_COMM_SELF, 77) - process 0 >> ok dm_impls_plex_tests-ex33_gmsh_3d_q2 # SKIP Command failed so no diff >> ``` >> >> >> Best wishes, >> Zongze >> >> Thanks, >>> >>> Matt >>> >>> >>>> Best wishes, >>>> Zongze >>>> >>>> >>>> On Mon, 15 May 2023 at 04:24, Matthew Knepley >>>> wrote: >>>> >>>>> On Sun, May 14, 2023 at 12:27?PM Zongze Yang >>>>> wrote: >>>>> >>>>>> >>>>>> >>>>>> >>>>>> On Sun, 14 May 2023 at 23:54, Matthew Knepley >>>>>> wrote: >>>>>> >>>>>>> On Sun, May 14, 2023 at 9:21?AM Zongze Yang >>>>>>> wrote: >>>>>>> >>>>>>>> Hi, Matt, >>>>>>>> >>>>>>>> The issue has been resolved while testing on the latest version of >>>>>>>> PETSc. It seems that the problem has been fixed in the following merge >>>>>>>> request: https://gitlab.com/petsc/petsc/-/merge_requests/5970 >>>>>>>> >>>>>>> >>>>>>> No problem. Glad it is working. >>>>>>> >>>>>>> >>>>>>>> I sincerely apologize for any inconvenience caused by my previous >>>>>>>> message. However, I would like to provide you with additional information >>>>>>>> regarding the test files. Attached to this email, you will find two Gmsh >>>>>>>> files: "square_2rd.msh" and "square_3rd.msh." These files contain >>>>>>>> high-order triangulated mesh data for the unit square. 
>>>>>>>> >>>>>>>> ``` >>>>>>>> $ ./ex33 -coord_space 0 -dm_plex_filename square_2rd.msh >>>>>>>> -dm_plex_gmsh_project >>>>>>>> -dm_plex_gmsh_project_petscdualspace_lagrange_continuity true >>>>>>>> -dm_plex_gmsh_project_fe_view -volume 1 >>>>>>>> PetscFE Object: P2 1 MPI process >>>>>>>> type: basic >>>>>>>> Basic Finite Element in 2 dimensions with 2 components >>>>>>>> PetscSpace Object: P2 1 MPI process >>>>>>>> type: sum >>>>>>>> Space in 2 variables with 2 components, size 12 >>>>>>>> Sum space of 2 concatenated subspaces (all identical) >>>>>>>> PetscSpace Object: sum component (sumcomp_) 1 MPI process >>>>>>>> type: poly >>>>>>>> Space in 2 variables with 1 components, size 6 >>>>>>>> Polynomial space of degree 2 >>>>>>>> PetscDualSpace Object: P2 1 MPI process >>>>>>>> type: lagrange >>>>>>>> Dual space with 2 components, size 12 >>>>>>>> Continuous Lagrange dual space >>>>>>>> Quadrature on a triangle of order 5 on 9 points (dim 2) >>>>>>>> Volume: 1. >>>>>>>> $ ./ex33 -coord_space 0 -dm_plex_filename square_3rd.msh >>>>>>>> -dm_plex_gmsh_project >>>>>>>> -dm_plex_gmsh_project_petscdualspace_lagrange_continuity true >>>>>>>> -dm_plex_gmsh_project_fe_view -volume 1 >>>>>>>> PetscFE Object: P3 1 MPI process >>>>>>>> type: basic >>>>>>>> Basic Finite Element in 2 dimensions with 2 components >>>>>>>> PetscSpace Object: P3 1 MPI process >>>>>>>> type: sum >>>>>>>> Space in 2 variables with 2 components, size 20 >>>>>>>> Sum space of 2 concatenated subspaces (all identical) >>>>>>>> PetscSpace Object: sum component (sumcomp_) 1 MPI process >>>>>>>> type: poly >>>>>>>> Space in 2 variables with 1 components, size 10 >>>>>>>> Polynomial space of degree 3 >>>>>>>> PetscDualSpace Object: P3 1 MPI process >>>>>>>> type: lagrange >>>>>>>> Dual space with 2 components, size 20 >>>>>>>> Continuous Lagrange dual space >>>>>>>> Quadrature on a triangle of order 7 on 16 points (dim 2) >>>>>>>> Volume: 1. >>>>>>>> ``` >>>>>>>> >>>>>>>> Thank you for your attention and understanding. I apologize once >>>>>>>> again for my previous oversight. >>>>>>>> >>>>>>> >>>>>>> Great! If you make an MR for this, you will be included on the next >>>>>>> list of PETSc contributors. Otherwise, I can do it. >>>>>>> >>>>>>> >>>>>> I appreciate your offer to handle the MR. Please go ahead and take >>>>>> care of it. Thank you! >>>>>> >>>>> >>>>> I have created the MR with your tests. They are working for me: >>>>> >>>>> https://gitlab.com/petsc/petsc/-/merge_requests/6463 >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> Best Wishes, >>>>>> Zongze >>>>>> >>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Matt >>>>>>> >>>>>>> >>>>>>>> Best wishes, >>>>>>>> Zongze >>>>>>>> >>>>>>>> >>>>>>>> On Sun, 14 May 2023 at 16:44, Matthew Knepley >>>>>>>> wrote: >>>>>>>> >>>>>>>>> On Sat, May 13, 2023 at 6:08?AM Zongze Yang >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> Hi, Matt, >>>>>>>>>> >>>>>>>>>> There seem to be ongoing issues with projecting high-order >>>>>>>>>> coordinates from a gmsh file to other spaces. I would like to inquire >>>>>>>>>> whether there are any plans to resolve this problem. >>>>>>>>>> >>>>>>>>>> Thank you for your attention to this matter. >>>>>>>>>> >>>>>>>>> >>>>>>>>> Yes, I will look at it. The important thing is to have a good >>>>>>>>> test. 
Here are the higher order geometry tests >>>>>>>>> >>>>>>>>> >>>>>>>>> https://gitlab.com/petsc/petsc/-/blob/main/src/dm/impls/plex/tests/ex33.c >>>>>>>>> >>>>>>>>> I take shapes with known volume, mesh them with higher order >>>>>>>>> geometry, and look at the convergence to the true volume. Could you add a >>>>>>>>> GMsh test, meaning the .msh file and known volume, and I will fix it? >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Matt >>>>>>>>> >>>>>>>>> >>>>>>>>>> >>>>>>>>>> Best wishes, >>>>>>>>>> Zongze >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Sat, 18 Jun 2022 at 20:31, Zongze Yang >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> Thank you for your reply. May I ask for some references on the >>>>>>>>>>> order of the dofs on PETSc's FE Space (especially high order elements)? >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> >>>>>>>>>>> Zongze >>>>>>>>>>> >>>>>>>>>>> Matthew Knepley ?2022?6?18??? 20:02??? >>>>>>>>>>> >>>>>>>>>>>> On Sat, Jun 18, 2022 at 2:16 AM Zongze Yang < >>>>>>>>>>>> yangzongze at gmail.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> In order to check if I made mistakes in the python code, I try >>>>>>>>>>>>> to use c code to show the issue on DMProjectCoordinates. The code and mesh >>>>>>>>>>>>> file is attached. >>>>>>>>>>>>> If the code is correct, there must be something wrong with >>>>>>>>>>>>> `DMProjectCoordinates` or `DMPlexCreateGmshFromFile` for high-order mesh. >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Something is definitely wrong with high order, periodic >>>>>>>>>>>> simplices from Gmsh. We had not tested that case. I am at a conference and >>>>>>>>>>>> cannot look at it for a week. >>>>>>>>>>>> My suspicion is that the space we make when reading in the Gmsh >>>>>>>>>>>> coordinates does not match the values (wrong order). >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> >>>>>>>>>>>> Matt >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> The command and the output are listed below: (Obviously the >>>>>>>>>>>>> bounding box is changed.) >>>>>>>>>>>>> ``` >>>>>>>>>>>>> $ ./test_gmsh_load_2rd -filename cube-p2.msh -old_fe_view >>>>>>>>>>>>> -new_fe_view >>>>>>>>>>>>> Old Bounding Box: >>>>>>>>>>>>> 0: lo = 0. hi = 1. >>>>>>>>>>>>> 1: lo = 0. hi = 1. >>>>>>>>>>>>> 2: lo = 0. hi = 1. 
>>>>>>>>>>>>> PetscFE Object: OldCoordinatesFE 1 MPI processes >>>>>>>>>>>>> type: basic >>>>>>>>>>>>> Basic Finite Element in 3 dimensions with 3 components >>>>>>>>>>>>> PetscSpace Object: P2 1 MPI processes >>>>>>>>>>>>> type: sum >>>>>>>>>>>>> Space in 3 variables with 3 components, size 30 >>>>>>>>>>>>> Sum space of 3 concatenated subspaces (all identical) >>>>>>>>>>>>> PetscSpace Object: sum component (sumcomp_) 1 MPI >>>>>>>>>>>>> processes >>>>>>>>>>>>> type: poly >>>>>>>>>>>>> Space in 3 variables with 1 components, size 10 >>>>>>>>>>>>> Polynomial space of degree 2 >>>>>>>>>>>>> PetscDualSpace Object: P2 1 MPI processes >>>>>>>>>>>>> type: lagrange >>>>>>>>>>>>> Dual space with 3 components, size 30 >>>>>>>>>>>>> Discontinuous Lagrange dual space >>>>>>>>>>>>> Quadrature of order 5 on 27 points (dim 3) >>>>>>>>>>>>> PetscFE Object: NewCoordinatesFE 1 MPI processes >>>>>>>>>>>>> type: basic >>>>>>>>>>>>> Basic Finite Element in 3 dimensions with 3 components >>>>>>>>>>>>> PetscSpace Object: P2 1 MPI processes >>>>>>>>>>>>> type: sum >>>>>>>>>>>>> Space in 3 variables with 3 components, size 30 >>>>>>>>>>>>> Sum space of 3 concatenated subspaces (all identical) >>>>>>>>>>>>> PetscSpace Object: sum component (sumcomp_) 1 MPI >>>>>>>>>>>>> processes >>>>>>>>>>>>> type: poly >>>>>>>>>>>>> Space in 3 variables with 1 components, size 10 >>>>>>>>>>>>> Polynomial space of degree 2 >>>>>>>>>>>>> PetscDualSpace Object: P2 1 MPI processes >>>>>>>>>>>>> type: lagrange >>>>>>>>>>>>> Dual space with 3 components, size 30 >>>>>>>>>>>>> Continuous Lagrange dual space >>>>>>>>>>>>> Quadrature of order 5 on 27 points (dim 3) >>>>>>>>>>>>> New Bounding Box: >>>>>>>>>>>>> 0: lo = 2.5624e-17 hi = 8. >>>>>>>>>>>>> 1: lo = -9.23372e-17 hi = 7. >>>>>>>>>>>>> 2: lo = 2.72091e-17 hi = 8.5 >>>>>>>>>>>>> ``` >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> Zongze >>>>>>>>>>>>> >>>>>>>>>>>>> Zongze Yang ?2022?6?17??? 14:54??? >>>>>>>>>>>>> >>>>>>>>>>>>>> I tried the projection operation. However, it seems that the >>>>>>>>>>>>>> projection gives the wrong solution. After projection, the bounding box is >>>>>>>>>>>>>> changed! See logs below. 
>>>>>>>>>>>>>> >>>>>>>>>>>>>> First, I patch the petsc4py by adding `DMProjectCoordinates`: >>>>>>>>>>>>>> ``` >>>>>>>>>>>>>> diff --git a/src/binding/petsc4py/src/PETSc/DM.pyx >>>>>>>>>>>>>> b/src/binding/petsc4py/src/PETSc/DM.pyx >>>>>>>>>>>>>> index d8a58d183a..dbcdb280f1 100644 >>>>>>>>>>>>>> --- a/src/binding/petsc4py/src/PETSc/DM.pyx >>>>>>>>>>>>>> +++ b/src/binding/petsc4py/src/PETSc/DM.pyx >>>>>>>>>>>>>> @@ -307,6 +307,12 @@ cdef class DM(Object): >>>>>>>>>>>>>> PetscINCREF(c.obj) >>>>>>>>>>>>>> return c >>>>>>>>>>>>>> >>>>>>>>>>>>>> + def projectCoordinates(self, FE fe=None): >>>>>>>>>>>>>> + if fe is None: >>>>>>>>>>>>>> + CHKERR( DMProjectCoordinates(self.dm, NULL) ) >>>>>>>>>>>>>> + else: >>>>>>>>>>>>>> + CHKERR( DMProjectCoordinates(self.dm, fe.fe) ) >>>>>>>>>>>>>> + >>>>>>>>>>>>>> def getBoundingBox(self): >>>>>>>>>>>>>> cdef PetscInt i,dim=0 >>>>>>>>>>>>>> CHKERR( DMGetCoordinateDim(self.dm, &dim) ) >>>>>>>>>>>>>> diff --git a/src/binding/petsc4py/src/PETSc/petscdm.pxi >>>>>>>>>>>>>> b/src/binding/petsc4py/src/PETSc/petscdm.pxi >>>>>>>>>>>>>> index 514b6fa472..c778e39884 100644 >>>>>>>>>>>>>> --- a/src/binding/petsc4py/src/PETSc/petscdm.pxi >>>>>>>>>>>>>> +++ b/src/binding/petsc4py/src/PETSc/petscdm.pxi >>>>>>>>>>>>>> @@ -90,6 +90,7 @@ cdef extern from * nogil: >>>>>>>>>>>>>> int DMGetCoordinateDim(PetscDM,PetscInt*) >>>>>>>>>>>>>> int DMSetCoordinateDim(PetscDM,PetscInt) >>>>>>>>>>>>>> int DMLocalizeCoordinates(PetscDM) >>>>>>>>>>>>>> + int DMProjectCoordinates(PetscDM, PetscFE) >>>>>>>>>>>>>> >>>>>>>>>>>>>> int >>>>>>>>>>>>>> DMCreateInterpolation(PetscDM,PetscDM,PetscMat*,PetscVec*) >>>>>>>>>>>>>> int DMCreateInjection(PetscDM,PetscDM,PetscMat*) >>>>>>>>>>>>>> ``` >>>>>>>>>>>>>> >>>>>>>>>>>>>> Then in python, I load a mesh and project the coordinates to >>>>>>>>>>>>>> P2: >>>>>>>>>>>>>> ``` >>>>>>>>>>>>>> import firedrake as fd >>>>>>>>>>>>>> from firedrake.petsc import PETSc >>>>>>>>>>>>>> >>>>>>>>>>>>>> # plex = fd.mesh._from_gmsh('test-fd-load-p2.msh') >>>>>>>>>>>>>> plex = fd.mesh._from_gmsh('test-fd-load-p2-rect.msh') >>>>>>>>>>>>>> print('old bbox:', plex.getBoundingBox()) >>>>>>>>>>>>>> >>>>>>>>>>>>>> dim = plex.getDimension() >>>>>>>>>>>>>> # (dim, nc, isSimplex, k, >>>>>>>>>>>>>> qorder, comm=None) >>>>>>>>>>>>>> fe_new = PETSc.FE().createLagrange(dim, dim, True, 2, >>>>>>>>>>>>>> PETSc.DETERMINE) >>>>>>>>>>>>>> plex.projectCoordinates(fe_new) >>>>>>>>>>>>>> fe_new.view() >>>>>>>>>>>>>> >>>>>>>>>>>>>> print('new bbox:', plex.getBoundingBox()) >>>>>>>>>>>>>> ``` >>>>>>>>>>>>>> >>>>>>>>>>>>>> The output is (The bounding box is changed!) 
>>>>>>>>>>>>>> ``` >>>>>>>>>>>>>> >>>>>>>>>>>>>> old bbox: ((0.0, 1.0), (0.0, 1.0), (0.0, 1.0)) >>>>>>>>>>>>>> PetscFE Object: P2 1 MPI processes >>>>>>>>>>>>>> type: basic >>>>>>>>>>>>>> Basic Finite Element in 3 dimensions with 3 components >>>>>>>>>>>>>> PetscSpace Object: P2 1 MPI processes >>>>>>>>>>>>>> type: sum >>>>>>>>>>>>>> Space in 3 variables with 3 components, size 30 >>>>>>>>>>>>>> Sum space of 3 concatenated subspaces (all identical) >>>>>>>>>>>>>> PetscSpace Object: sum component (sumcomp_) 1 MPI processes >>>>>>>>>>>>>> type: poly >>>>>>>>>>>>>> Space in 3 variables with 1 components, size 10 >>>>>>>>>>>>>> Polynomial space of degree 2 >>>>>>>>>>>>>> PetscDualSpace Object: P2 1 MPI processes >>>>>>>>>>>>>> type: lagrange >>>>>>>>>>>>>> Dual space with 3 components, size 30 >>>>>>>>>>>>>> Continuous Lagrange dual space >>>>>>>>>>>>>> Quadrature of order 5 on 27 points (dim 3) >>>>>>>>>>>>>> new bbox: ((-6.530133708576188e-17, 36.30670832662781), (-3.899962995254311e-17, 36.2406171632539), (-8.8036464152166e-17, 36.111577025012224)) >>>>>>>>>>>>>> >>>>>>>>>>>>>> ``` >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> By the way, for the original DG coordinates, where can I find the relation of the closure and the order of the dofs for the cell? >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks! >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Zongze >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Matthew Knepley ?2022?6?17??? 01:11??? >>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Thu, Jun 16, 2022 at 12:06 PM Zongze Yang < >>>>>>>>>>>>>>> yangzongze at gmail.com> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> ? 2022?6?16??23:22?Matthew Knepley ??? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> ? >>>>>>>>>>>>>>>> On Thu, Jun 16, 2022 at 11:11 AM Zongze Yang < >>>>>>>>>>>>>>>> yangzongze at gmail.com> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Hi, if I load a `gmsh` file with second-order elements, >>>>>>>>>>>>>>>>> the coordinates will be stored in a DG-P2 space. After obtaining the >>>>>>>>>>>>>>>>> coordinates of a cell, how can I map the coordinates to vertex and edge? >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> By default, they are stored as P2, not DG. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I checked the coordinates vector, and found the dogs only >>>>>>>>>>>>>>>> defined on cell other than vertex and edge, so I said they are stored as DG. >>>>>>>>>>>>>>>> Then the function DMPlexVecGetClosure >>>>>>>>>>>>>>>> seems return >>>>>>>>>>>>>>>> the coordinates in lex order. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Some code in reading gmsh file reads that >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 1756: if (isSimplex) continuity = PETSC_FALSE >>>>>>>>>>>>>>>> ; >>>>>>>>>>>>>>>> /* XXX FIXME Requires DMPlexSetClosurePermutationLexicographic() */ >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 1758: GmshCreateFE(comm, NULL, isSimplex, continuity, >>>>>>>>>>>>>>>> nodeType, dim, coordDim, order, &fe) >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> The continuity is set to false for simplex. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Oh, yes. That needs to be fixed. 
For now, you can just >>>>>>>>>>>>>>> project it to P2 if you want using >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> https://petsc.org/main/docs/manualpages/DM/DMProjectCoordinates/ >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Matt >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> Zongze >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> You can ask for the coordinates of a vertex or an edge >>>>>>>>>>>>>>>> directly using >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> https://petsc.org/main/docs/manualpages/DMPLEX/DMPlexPointLocalRead/ >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> by giving the vertex or edge point. You can get all the >>>>>>>>>>>>>>>> coordinates on a cell, in the closure order, using >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> https://petsc.org/main/docs/manualpages/DMPLEX/DMPlexVecGetClosure/ >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Matt >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Below is some code load the gmsh file, I want to know the >>>>>>>>>>>>>>>>> relation between `cl` and `cell_coords`. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> ``` >>>>>>>>>>>>>>>>> import firedrake as fd >>>>>>>>>>>>>>>>> import numpy as np >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> # Load gmsh file (2rd) >>>>>>>>>>>>>>>>> plex = fd.mesh._from_gmsh('test-fd-load-p2-rect.msh') >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> cs, ce = plex.getHeightStratum(0) >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> cdm = plex.getCoordinateDM() >>>>>>>>>>>>>>>>> csec = dm.getCoordinateSection() >>>>>>>>>>>>>>>>> coords_gvec = dm.getCoordinates() >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> for i in range(cs, ce): >>>>>>>>>>>>>>>>> cell_coords = cdm.getVecClosure(csec, coords_gvec, i) >>>>>>>>>>>>>>>>> print(f'coordinates for cell {i} >>>>>>>>>>>>>>>>> :\n{cell_coords.reshape([-1, 3])}') >>>>>>>>>>>>>>>>> cl = dm.getTransitiveClosure(i) >>>>>>>>>>>>>>>>> print('closure:', cl) >>>>>>>>>>>>>>>>> break >>>>>>>>>>>>>>>>> ``` >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Best wishes, >>>>>>>>>>>>>>>>> Zongze >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>> What most experimenters take for granted before they begin >>>>>>>>>>>>>>>> their experiments is infinitely more interesting than any results to which >>>>>>>>>>>>>>>> their experiments lead. >>>>>>>>>>>>>>>> -- Norbert Wiener >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>> What most experimenters take for granted before they begin >>>>>>>>>>>>>>> their experiments is infinitely more interesting than any results to which >>>>>>>>>>>>>>> their experiments lead. >>>>>>>>>>>>>>> -- Norbert Wiener >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> -- >>>>>>>>>>>> What most experimenters take for granted before they begin >>>>>>>>>>>> their experiments is infinitely more interesting than any results to which >>>>>>>>>>>> their experiments lead. >>>>>>>>>>>> -- Norbert Wiener >>>>>>>>>>>> >>>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>> experiments lead. 
>>>>>>>>> -- Norbert Wiener >>>>>>>>> >>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their >>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>> experiments lead. >>>>>>> -- Norbert Wiener >>>>>>> >>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>> >>>>>>> >>>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wuktsinghua at gmail.com Tue May 16 09:27:09 2023 From: wuktsinghua at gmail.com (K. Wu) Date: Tue, 16 May 2023 16:27:09 +0200 Subject: [petsc-users] Interpolation between nodal and elemental field Message-ID: Hi all, Good day! I am currently working on interploating the nodal field vector I obtained to its corresponding elemental field vector. I am doing it in a simple way by using structured mesh, the element value is just the average of its corresponding nodal values. Is it possible to generate a connectivity matrix to implement this function? I also need this matrix later on to do a reverse transformation. I am wondering is there a way in PETSc to generate this connectivity matrix between nodal and elemental mesh. Or is there any better way to accomplish this? Thanks for your kind help! Best regards, Kai -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue May 16 09:29:47 2023 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 16 May 2023 10:29:47 -0400 Subject: [petsc-users] Interpolation between nodal and elemental field In-Reply-To: References: Message-ID: On Tue, May 16, 2023 at 10:27?AM K. Wu wrote: > Hi all, > > Good day! > > I am currently working on interploating the nodal field vector I obtained > to its corresponding elemental field vector. I am doing it in a simple way > by using structured mesh, the element value is just the average of its > corresponding nodal values. > > Is it possible to generate a connectivity matrix to implement this > function? I also need this matrix later on to do a reverse transformation. > > I am wondering is there a way in PETSc to generate this connectivity > matrix between nodal and elemental mesh. Or is there any better way to > accomplish this? > > Thanks for your kind help! > 1. Are you using a PETSc DM, or your own mesh? 2. In order to define what you mean, you have to attach function spaces (or something equivalent) to the two representations. Do you mean linear interpolation between vertices and constants on cells? Thanks, Matt > Best regards, > Kai > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From wuktsinghua at gmail.com Tue May 16 10:12:09 2023 From: wuktsinghua at gmail.com (K. Wu) Date: Tue, 16 May 2023 17:12:09 +0200 Subject: [petsc-users] Interpolation between nodal and elemental field In-Reply-To: References: Message-ID: Hi Matt, I am using two PETSc DM, one for vertices (nodes), and one for cells (elements). I want to do the following operation: x is a global vector generated from DM vertices, with length Nn. y is a global vector generated from DM cells, with length Ne. I want to generate a sparse matrix A with size Ne x Nn, so that y = A*x. The entry of A is set to a constant value, e.g., 1/4, if the corresponding vertex belongs to the cell. Hope I make it clear, thanks! Regards, Kai Matthew Knepley ?2023?5?16??? 16:29??? > On Tue, May 16, 2023 at 10:27?AM K. Wu wrote: > >> Hi all, >> >> Good day! >> >> I am currently working on interploating the nodal field vector I obtained >> to its corresponding elemental field vector. I am doing it in a simple way >> by using structured mesh, the element value is just the average of its >> corresponding nodal values. >> >> Is it possible to generate a connectivity matrix to implement this >> function? I also need this matrix later on to do a reverse transformation. >> >> I am wondering is there a way in PETSc to generate this connectivity >> matrix between nodal and elemental mesh. Or is there any better way to >> accomplish this? >> >> Thanks for your kind help! >> > > 1. Are you using a PETSc DM, or your own mesh? > > 2. In order to define what you mean, you have to attach function spaces > (or something equivalent) to the > two representations. Do you mean linear interpolation between vertices > and constants on cells? > > Thanks, > > Matt > > >> Best regards, >> Kai >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue May 16 10:28:30 2023 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 16 May 2023 11:28:30 -0400 Subject: [petsc-users] Interpolation between nodal and elemental field In-Reply-To: References: Message-ID: On Tue, May 16, 2023 at 11:12?AM K. Wu wrote: > Hi Matt, > > > > I am using two PETSc DM, one for vertices (nodes), and one for cells > (elements). > I assume you mean DMDA. > I want to do the following operation: > > x is a global vector generated from DM vertices, with length Nn. y is a > global vector generated from DM cells, with length Ne. I want to generate a > sparse matrix A with size Ne x Nn, so that y = A*x. > > The entry of A is set to a constant value, e.g., 1/4, if the corresponding > vertex belongs to the cell. > Yes, this is straightforward. First create the matrix with the same local sizes as the vectors from the DMDAs. Each row will have the same number of nonzeros (the number of vertices on each cell), so preallocation is easy. Finally, I would loop over the vertex grid, and put in weights for (i, j, k) and +1 in each direction for cell (i, j, k). Thanks, Matt > > > Hope I make it clear, thanks! > > > > Regards, > > Kai > > Matthew Knepley ?2023?5?16??? 16:29??? > >> On Tue, May 16, 2023 at 10:27?AM K. Wu wrote: >> >>> Hi all, >>> >>> Good day! 
>>> >>> I am currently working on interploating the nodal field vector I >>> obtained to its corresponding elemental field vector. I am doing it in a >>> simple way by using structured mesh, the element value is just the average >>> of its corresponding nodal values. >>> >>> Is it possible to generate a connectivity matrix to implement this >>> function? I also need this matrix later on to do a reverse transformation. >>> >>> I am wondering is there a way in PETSc to generate this connectivity >>> matrix between nodal and elemental mesh. Or is there any better way to >>> accomplish this? >>> >>> Thanks for your kind help! >>> >> >> 1. Are you using a PETSc DM, or your own mesh? >> >> 2. In order to define what you mean, you have to attach function spaces >> (or something equivalent) to the >> two representations. Do you mean linear interpolation between >> vertices and constants on cells? >> >> Thanks, >> >> Matt >> >> >>> Best regards, >>> Kai >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From wuktsinghua at gmail.com Tue May 16 10:44:00 2023 From: wuktsinghua at gmail.com (K. Wu) Date: Tue, 16 May 2023 17:44:00 +0200 Subject: [petsc-users] Interpolation between nodal and elemental field In-Reply-To: References: Message-ID: Got it, thanks Matt! Matthew Knepley ?2023?5?16??? 17:28??? > On Tue, May 16, 2023 at 11:12?AM K. Wu wrote: > >> Hi Matt, >> >> >> >> I am using two PETSc DM, one for vertices (nodes), and one for cells >> (elements). >> > > I assume you mean DMDA. > > >> I want to do the following operation: >> >> x is a global vector generated from DM vertices, with length Nn. y is a >> global vector generated from DM cells, with length Ne. I want to generate a >> sparse matrix A with size Ne x Nn, so that y = A*x. >> >> The entry of A is set to a constant value, e.g., 1/4, if the >> corresponding vertex belongs to the cell. >> > Yes, this is straightforward. First create the matrix with the same local > sizes as the vectors from the DMDAs. > Each row will have the same number of nonzeros (the number of vertices on > each cell), so preallocation is > easy. Finally, I would loop over the vertex grid, and put in weights for > (i, j, k) and +1 in each direction for cell (i, j, k). > > Thanks, > > Matt > > >> >> >> Hope I make it clear, thanks! >> >> >> >> Regards, >> >> Kai >> >> Matthew Knepley ?2023?5?16??? 16:29??? >> >>> On Tue, May 16, 2023 at 10:27?AM K. Wu wrote: >>> >>>> Hi all, >>>> >>>> Good day! >>>> >>>> I am currently working on interploating the nodal field vector I >>>> obtained to its corresponding elemental field vector. I am doing it in a >>>> simple way by using structured mesh, the element value is just the average >>>> of its corresponding nodal values. >>>> >>>> Is it possible to generate a connectivity matrix to implement this >>>> function? I also need this matrix later on to do a reverse transformation. >>>> >>>> I am wondering is there a way in PETSc to generate this connectivity >>>> matrix between nodal and elemental mesh. 
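Below is a minimal sketch of the cell-averaging operator Matt outlines above, assuming two 2D DMDAs with dof = 1, one holding Mv x Nv vertices and one holding (Mv-1) x (Nv-1) cells. The helper name, the fixed 1/4 weights, and the use of the application orderings (DMDAGetAO / AOApplicationToPetsc) to translate natural (i,j) indices into each DMDA's PETSc ordering are illustrative choices, not something prescribed in the thread:

```c
#include <petscdmda.h>

/* Sketch: daVert has Mv x Nv vertices (dof = 1), daCell has (Mv-1) x (Nv-1) cells (dof = 1). */
PetscErrorCode BuildCellAveragingMatrix(DM daVert, DM daCell, Mat *A)
{
  Vec      xVert, yCell;
  PetscInt nVert, nCell, Mv, xs, ys, xm, ym;
  AO       aoVert, aoCell;

  PetscFunctionBeginUser;
  /* Local row/column sizes of A are the local sizes of the two global vectors */
  PetscCall(DMGetGlobalVector(daCell, &yCell));
  PetscCall(DMGetGlobalVector(daVert, &xVert));
  PetscCall(VecGetLocalSize(yCell, &nCell));
  PetscCall(VecGetLocalSize(xVert, &nVert));
  PetscCall(DMRestoreGlobalVector(daCell, &yCell));
  PetscCall(DMRestoreGlobalVector(daVert, &xVert));

  /* Every row (one cell) has exactly 4 nonzeros in 2D: the 4 vertices of that cell */
  PetscCall(MatCreateAIJ(PETSC_COMM_WORLD, nCell, nVert, PETSC_DETERMINE, PETSC_DETERMINE, 4, NULL, 4, NULL, A));

  /* Natural (i,j) numbering is converted to the PETSc (parallel) ordering of each DMDA */
  PetscCall(DMDAGetAO(daVert, &aoVert));
  PetscCall(DMDAGetAO(daCell, &aoCell));
  PetscCall(DMDAGetInfo(daVert, NULL, &Mv, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL));

  PetscCall(DMDAGetCorners(daCell, &xs, &ys, NULL, &xm, &ym, NULL));
  for (PetscInt j = ys; j < ys + ym; ++j) {
    for (PetscInt i = xs; i < xs + xm; ++i) {
      PetscInt    row     = i + j * (Mv - 1); /* cell (i,j) in natural ordering, Mv-1 cells per grid row */
      PetscInt    cols[4] = {i + j * Mv, i + 1 + j * Mv, i + (j + 1) * Mv, i + 1 + (j + 1) * Mv};
      PetscScalar vals[4] = {0.25, 0.25, 0.25, 0.25};

      PetscCall(AOApplicationToPetsc(aoCell, 1, &row));
      PetscCall(AOApplicationToPetsc(aoVert, 4, cols));
      PetscCall(MatSetValues(*A, 1, &row, 4, cols, vals, INSERT_VALUES));
    }
  }
  PetscCall(MatAssemblyBegin(*A, MAT_FINAL_ASSEMBLY));
  PetscCall(MatAssemblyEnd(*A, MAT_FINAL_ASSEMBLY));
  PetscFunctionReturn(PETSC_SUCCESS);
}
```

Once assembled, y = A*x is just MatMult(A, x, y), and the reverse transformation Kai mentions could reuse the same matrix through MatMultTranspose, with whatever weighting is appropriate for that direction.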
Or is there any better way to >>>> accomplish this? >>>> >>>> Thanks for your kind help! >>>> >>> >>> 1. Are you using a PETSc DM, or your own mesh? >>> >>> 2. In order to define what you mean, you have to attach function spaces >>> (or something equivalent) to the >>> two representations. Do you mean linear interpolation between >>> vertices and constants on cells? >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Best regards, >>>> Kai >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kabdelaz at purdue.edu Tue May 16 19:07:47 2023 From: kabdelaz at purdue.edu (Khaled Nabil Shar Abdelaziz) Date: Wed, 17 May 2023 00:07:47 +0000 Subject: [petsc-users] SNESDMDASNESSetFunctionLocal in Fortran In-Reply-To: References: Message-ID: This is very useful! I was more focused on my search for an example with DMDASNES example instead of ctx. I guess from the looks of it, I can pass it directly in the ctx argument or I can pass the pointer using the interfaces defined in the referenced code you provided, correct? Another question I had is that functions has to be defined as external? Is that a general case? Or is it just when the subroutine is defined elsewhere so I have to define it as external. Because now when the solver calls the FormFunction, a NULL is being passed and I can?t access the x or F vectors. Thank you for your patience! I am still new to PETSc and learning how to use it. From: Matthew Knepley Sent: Sunday, May 14, 2023 12:24 PM To: Khaled Nabil Shar Abdelaziz Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] SNESDMDASNESSetFunctionLocal in Fortran ---- External Email: Use caution with attachments, links, or sharing data ---- On Sun, May 14, 2023 at 12:06?PM Khaled Nabil Shar Abdelaziz > wrote: Hey there, I'm having a problem with the DMDASNESSetFunctionLocal() function in C and its Fortran counterpart. The thing is, in C, you can pass a bunch of variables using the ctx parameter, but in Fortran, it only seems to accept one variable. What's weird is that the SNESSetFunction() function has a similar ctx parameter, but in Fortran, it can handle multiple variables for ctx, unlike DMDASNESSetFunctionLocal(). Do you know if this is on purpose, or am I missing something? I think we show how to do this here: https://gitlab.com/petsc/petsc/-/blob/main/src/snes/tutorials/ex5f90t.F90 Thanks, Matt Thanks in advance! Best regards, Khaled -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Tue May 16 19:29:50 2023 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 16 May 2023 20:29:50 -0400 Subject: [petsc-users] SNESDMDASNESSetFunctionLocal in Fortran In-Reply-To: References: Message-ID: On Tue, May 16, 2023 at 8:07?PM Khaled Nabil Shar Abdelaziz < kabdelaz at purdue.edu> wrote: > This is very useful! I was more focused on my search for an example with > DMDASNES example instead of ctx. > > > > I guess from the looks of it, I can pass it directly in the ctx argument > or I can pass the pointer using the interfaces defined in the referenced > code you provided, correct? > Yes. > Another question I had is that functions has to be defined as external? Is > that a general case? Or is it just when the subroutine is defined elsewhere > so I have to define it as external. Because now when the solver calls the > FormFunction, a NULL is being passed and I can?t access the x or F > vectors. > The symbol has to be exposed by the linker, so most times (unless everything is in the same file), it needs to be external. Thanks, Matt > Thank you for your patience! I am still new to PETSc and learning how to > use it. > > > > *From:* Matthew Knepley > *Sent:* Sunday, May 14, 2023 12:24 PM > *To:* Khaled Nabil Shar Abdelaziz > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] SNESDMDASNESSetFunctionLocal in Fortran > > > > ---- *External Email*: Use caution with attachments, links, or sharing > data ---- > > > > On Sun, May 14, 2023 at 12:06?PM Khaled Nabil Shar Abdelaziz < > kabdelaz at purdue.edu> wrote: > > Hey there, > > > > I'm having a problem with the DMDASNESSetFunctionLocal() function in C and > its Fortran counterpart. The thing is, in C, you can pass a bunch of > variables using the ctx parameter, but in Fortran, it only seems to accept > one variable. > > > > What's weird is that the SNESSetFunction() function has a similar ctx > parameter, but in Fortran, it can handle multiple variables for ctx, unlike > DMDASNESSetFunctionLocal(). Do you know if this is on purpose, or am I > missing something? > > > > I think we show how to do this here: > > > > > https://gitlab.com/petsc/petsc/-/blob/main/src/snes/tutorials/ex5f90t.F90 > > > > Thanks, > > > > Matt > > > > Thanks in advance! > > > > Best regards, > > Khaled > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From benno.fleischli at hslu.ch Wed May 17 07:55:35 2023 From: benno.fleischli at hslu.ch (Fleischli Benno HSLU T&A) Date: Wed, 17 May 2023 12:55:35 +0000 Subject: [petsc-users] Large MATMPIAIJ - 32bit integer overflow in nz value Message-ID: Dear PETSc developers I am creating a very large parallel sparse matrix (MATMPIAIJ) with PETSc. I write this matrix to disk. The number of non-zeros exceeds the maximum number a 32-bit integer can hold. When I read the matrix from disk i get an error because there was an overflow in the nz number. (see petsc-3.18.4/src/mat/impls/aij/seq/aij.c:4977) Obviously I could compile PETSc with 64bit integers (--with-64-bit-indices). 
But I wanted to ask if there is another way. Because the total number of nonzeros nz is the only numer that exceeds the 32bit limit. It would not be efficient to use 64bit integers everywhere just because of this single number. This how I configured PETSc: ./configure --download-fblaslapack --download-hpddm --download-hypre --with-debugging=0 COPTFLAGS='-O3 -march=native -mtune=native' CXXOPTFLAGS='-O3 -march=native -mtune=native' FOPTFLAGS='-O3 -march=native -mtune=native' --with-scalar-type=real (--with-mpi-dir=/home/benno/Libraries/openMPI) Kind Regards Benno ________________________________ Hochschule Luzern Technik & Architektur Institute for Mechanical Engineering and Energy Technology Competence Center Fluid Mechanics and Numerical Methods Benno Fleischli MSc in Mechanical Engineering / BSc in Electrical Engineering Wissenschaftlicher Mitarbeiter benno.fleischli at hslu.ch -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed May 17 08:08:37 2023 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 17 May 2023 09:08:37 -0400 Subject: [petsc-users] Large MATMPIAIJ - 32bit integer overflow in nz value In-Reply-To: References: Message-ID: On Wed, May 17, 2023 at 9:02?AM Fleischli Benno HSLU T&A < benno.fleischli at hslu.ch> wrote: > Dear PETSc developers > > I am creating a very large parallel sparse matrix (MATMPIAIJ) with PETSc. > I write this matrix to disk. > The number of non-zeros exceeds the maximum number a 32-bit integer can > hold. > When I read the matrix from disk i get an error because there was an > overflow in the nz number. > (see petsc-3.18.4/src/mat/impls/aij/seq/aij.c:4977) > > Obviously I could compile PETSc with 64bit integers > (--with-64-bit-indices). > But I wanted to ask if there is another way. Because the total number of > nonzeros nz is the only numer that exceeds the 32bit limit. > It would not be efficient to use 64bit integers everywhere just because of > this single number. > Integers tend to be a small part of the overall storage. How much does it increase the maximum overall storage for your problem? Thanks, Matt > This how I configured PETSc: > > ./configure --download-fblaslapack --download-hpddm --download-hypre > --with-debugging=0 > COPTFLAGS='-O3 -march=native -mtune=native' CXXOPTFLAGS='-O3 -march=native > -mtune=native' > FOPTFLAGS='-O3 -march=native -mtune=native' --with-scalar-type=real > (--with-mpi-dir=/home/benno/Libraries/openMPI) > > > > Kind Regards > > Benno > > > > ________________________________ > > > *Hochschule Luzern Technik & Architektur* > Institute for Mechanical Engineering and Energy Technology > > Competence Center Fluid Mechanics and Numerical Methods > > > *Benno Fleischli* > MSc in Mechanical Engineering / BSc in Electrical Engineering > > Wissenschaftlicher Mitarbeiter > > benno.fleischli at hslu.ch > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From berend.vanwachem at ovgu.de Wed May 17 09:00:13 2023 From: berend.vanwachem at ovgu.de (Berend van Wachem) Date: Wed, 17 May 2023 16:00:13 +0200 Subject: [petsc-users] DMGetCoordinatesLocal and DMPlexGetCellCoordinates in PETSc > 3.18 Message-ID: <2f3c494f-3d63-a2f4-d8cc-fba6893c0ebb@ovgu.de> Dear PETSc Team, We are using DMPlex, and we create a mesh using DMPlexCreateBoxMesh (.... ); and get a uniform mesh. The mesh is periodic. We typically want to "scale" the coordinates (vertices) of the mesh, and to achieve this, we call DMGetCoordinatesLocal(dm, &coordinates); and scale the entries in the Vector coordinates appropriately. and then DMSetCoordinatesLocal(dm, coordinates); After this, we localise the coordinates by calling DMLocalizeCoordinates(dm); This worked fine up to PETSc 3.18, but with versions after this, the coordinates we get from the call DMPlexGetCellCoordinates(dm, CellID, &isDG, &CoordSize, &ArrayCoordinates, &Coordinates); are no longer correct if the mesh is periodic. A number of the coordinates returned from calling DMPlexGetCellCoordinates are wrong. I think, this is because DMLocalizeCoordinates is now automatically called within the routine DMPlexCreateBoxMesh. So, my question is: How should we scale the coordinates from a periodic DMPlex mesh so that they are reflected correctly when calling both DMGetCoordinatesLocal and DMPlexGetCellCoordinates, with PETSc versions >= 3.18? Many thanks, Berend. From knepley at gmail.com Wed May 17 09:10:23 2023 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 17 May 2023 10:10:23 -0400 Subject: [petsc-users] DMGetCoordinatesLocal and DMPlexGetCellCoordinates in PETSc > 3.18 In-Reply-To: <2f3c494f-3d63-a2f4-d8cc-fba6893c0ebb@ovgu.de> References: <2f3c494f-3d63-a2f4-d8cc-fba6893c0ebb@ovgu.de> Message-ID: On Wed, May 17, 2023 at 10:02?AM Berend van Wachem wrote: > Dear PETSc Team, > > We are using DMPlex, and we create a mesh using > > DMPlexCreateBoxMesh (.... ); > > and get a uniform mesh. The mesh is periodic. > > We typically want to "scale" the coordinates (vertices) of the mesh, and > to achieve this, we call > > DMGetCoordinatesLocal(dm, &coordinates); > > and scale the entries in the Vector coordinates appropriately. > > and then > > DMSetCoordinatesLocal(dm, coordinates); > > > After this, we localise the coordinates by calling > > DMLocalizeCoordinates(dm); > > This worked fine up to PETSc 3.18, but with versions after this, the > coordinates we get from the call > > DMPlexGetCellCoordinates(dm, CellID, &isDG, &CoordSize, > &ArrayCoordinates, &Coordinates); > > are no longer correct if the mesh is periodic. A number of the > coordinates returned from calling DMPlexGetCellCoordinates are wrong. > > I think, this is because DMLocalizeCoordinates is now automatically > called within the routine DMPlexCreateBoxMesh. > > So, my question is: How should we scale the coordinates from a periodic > DMPlex mesh so that they are reflected correctly when calling both > DMGetCoordinatesLocal and DMPlexGetCellCoordinates, with PETSc versions > >= 3.18? > I think we might have to add an API function. For now, when you scale the coordinates, can you scale both copies? DMGetCoordinatesLocal() DMGetCellCoordinatesLocal(); and then set them back. Thanks, Matt > Many thanks, Berend. -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From berend.vanwachem at ovgu.de Wed May 17 09:19:12 2023 From: berend.vanwachem at ovgu.de (Berend van Wachem) Date: Wed, 17 May 2023 16:19:12 +0200 Subject: [petsc-users] DMGetCoordinatesLocal and DMPlexGetCellCoordinates in PETSc > 3.18 In-Reply-To: References: <2f3c494f-3d63-a2f4-d8cc-fba6893c0ebb@ovgu.de> Message-ID: <5181359d-c8d6-da6e-8b0a-3fb1c6183026@ovgu.de> Dear Matt, Thanks for getting back to me so quickly. If I scale each of the coordinates of the mesh (say, I want to cube each co-ordinate), and I do this for both: DMGetCoordinatesLocal(); DMGetCellCoordinatesLocal(); How do I know I am not cubing one coordinate multiple times? Thanks, Berend. On 5/17/23 16:10, Matthew Knepley wrote: > On Wed, May 17, 2023 at 10:02?AM Berend van Wachem > > wrote: > > Dear PETSc Team, > > We are using DMPlex, and we create a mesh using > > DMPlexCreateBoxMesh (.... ); > > and get a uniform mesh. The mesh is periodic. > > We typically want to "scale" the coordinates (vertices) of the mesh, > and > to achieve this, we call > > DMGetCoordinatesLocal(dm, &coordinates); > > and scale the entries in the Vector coordinates appropriately. > > and then > > DMSetCoordinatesLocal(dm, coordinates); > > > After this, we localise the coordinates by calling > > DMLocalizeCoordinates(dm); > > This worked fine up to PETSc 3.18, but with versions after this, the > coordinates we get from the call > > DMPlexGetCellCoordinates(dm, CellID, &isDG, &CoordSize, > &ArrayCoordinates, &Coordinates); > > are no longer correct if the mesh is periodic. A number of the > coordinates returned from calling DMPlexGetCellCoordinates are wrong. > > I think, this is because DMLocalizeCoordinates is now automatically > called within the routine DMPlexCreateBoxMesh. > > So, my question is: How should we scale the coordinates from a periodic > DMPlex mesh so that they are reflected correctly when calling both > DMGetCoordinatesLocal and DMPlexGetCellCoordinates, with PETSc versions > ?>= 3.18? > > > I think we might have to add an API function. For now, when you scale > the coordinates, > can you scale both copies? > > ? DMGetCoordinatesLocal() > ? DMGetCellCoordinatesLocal(); > > and then set them back. > > ? Thanks, > > ? ? ?Matt > > Many thanks, Berend. > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ From knepley at gmail.com Wed May 17 09:35:58 2023 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 17 May 2023 10:35:58 -0400 Subject: [petsc-users] DMGetCoordinatesLocal and DMPlexGetCellCoordinates in PETSc > 3.18 In-Reply-To: <5181359d-c8d6-da6e-8b0a-3fb1c6183026@ovgu.de> References: <2f3c494f-3d63-a2f4-d8cc-fba6893c0ebb@ovgu.de> <5181359d-c8d6-da6e-8b0a-3fb1c6183026@ovgu.de> Message-ID: On Wed, May 17, 2023 at 10:21?AM Berend van Wachem wrote: > Dear Matt, > > Thanks for getting back to me so quickly. > > If I scale each of the coordinates of the mesh (say, I want to cube each > co-ordinate), and I do this for both: > > DMGetCoordinatesLocal(); > DMGetCellCoordinatesLocal(); > > How do I know I am not cubing one coordinate multiple times? > Good question. Right now, the only connection between the two sets of coordinates is DMLocalizeCoordinates(). 
Since sometimes people want to do non-trivial things to coordinates, I prefer not to push in an API for "just" scaling, but I could be convinced the other way. Thanks, Matt > Thanks, Berend. > > On 5/17/23 16:10, Matthew Knepley wrote: > > On Wed, May 17, 2023 at 10:02?AM Berend van Wachem > > > wrote: > > > > Dear PETSc Team, > > > > We are using DMPlex, and we create a mesh using > > > > DMPlexCreateBoxMesh (.... ); > > > > and get a uniform mesh. The mesh is periodic. > > > > We typically want to "scale" the coordinates (vertices) of the mesh, > > and > > to achieve this, we call > > > > DMGetCoordinatesLocal(dm, &coordinates); > > > > and scale the entries in the Vector coordinates appropriately. > > > > and then > > > > DMSetCoordinatesLocal(dm, coordinates); > > > > > > After this, we localise the coordinates by calling > > > > DMLocalizeCoordinates(dm); > > > > This worked fine up to PETSc 3.18, but with versions after this, the > > coordinates we get from the call > > > > DMPlexGetCellCoordinates(dm, CellID, &isDG, &CoordSize, > > &ArrayCoordinates, &Coordinates); > > > > are no longer correct if the mesh is periodic. A number of the > > coordinates returned from calling DMPlexGetCellCoordinates are wrong. > > > > I think, this is because DMLocalizeCoordinates is now automatically > > called within the routine DMPlexCreateBoxMesh. > > > > So, my question is: How should we scale the coordinates from a > periodic > > DMPlex mesh so that they are reflected correctly when calling both > > DMGetCoordinatesLocal and DMPlexGetCellCoordinates, with PETSc > versions > > >= 3.18? > > > > > > I think we might have to add an API function. For now, when you scale > > the coordinates, > > can you scale both copies? > > > > DMGetCoordinatesLocal() > > DMGetCellCoordinatesLocal(); > > > > and then set them back. > > > > Thanks, > > > > Matt > > > > Many thanks, Berend. > > > > -- > > What most experimenters take for granted before they begin their > > experiments is infinitely more interesting than any results to which > > their experiments lead. > > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ < > http://www.cse.buffalo.edu/~knepley/> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From leonardo.mutti01 at universitadipavia.it Wed May 17 10:10:15 2023 From: leonardo.mutti01 at universitadipavia.it (Leonardo Mutti) Date: Wed, 17 May 2023 17:10:15 +0200 Subject: [petsc-users] Understanding index sets for PCGASM In-Reply-To: <6CE3B35C-E74E-43B5-A3DF-4D0D77E6A94C@petsc.dev> References: <989A8495-06FF-4D8A-8B45-F3D991D0A486@petsc.dev> <65FAB3E9-8D08-4AEE-874E-636EB2C76A29@petsc.dev> <6CE3B35C-E74E-43B5-A3DF-4D0D77E6A94C@petsc.dev> Message-ID: Dear developers, let me kindly ask for your help again. In the following snippet, a bi-diagonal matrix A is set up. It measures 8x8 blocks, each block is 2x2 elements. I would like to create the correct IS objects for PCGASM. The non-overlapping IS should be: [*0,1*], [*2,3*],[*4,5*], ..., [*14,15*]. The overlapping IS should be: [*0,1*], [0,1,*2,3*], [2,3,*4,5*], ..., [12,13, *14,15*] I am running the code with 4 processors. For some reason, after calling PCGASMDestroySubdomains the code crashes with severe (157): Program Exception - access violation. 
A visual inspection of the indices using ISView looks good. Thanks again, Leonardo Mat :: A Vec :: b PetscInt :: M,N_blocks,block_size,I,J,NSub,converged_reason,srank,erank,color,subcomm PetscMPIInt :: size PetscErrorCode :: ierr PetscScalar :: v KSP :: ksp PC :: pc IS,ALLOCATABLE :: subdomains_IS(:), inflated_IS(:) PetscInt :: NMPI,MYRANK,IERMPI INTEGER :: IS_counter, is_start, is_end call PetscInitialize(PETSC_NULL_CHARACTER, ierr) call PetscLogDefaultBegin(ierr) call MPI_COMM_SIZE(MPI_COMM_WORLD, NMPI, IERMPI) CALL MPI_COMM_RANK(MPI_COMM_WORLD, MYRANK,IERMPI) N_blocks = 8 block_size = 2 M = N_blocks * block_size ALLOCATE(subdomains_IS(N_blocks)) ALLOCATE(inflated_IS(N_blocks)) !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! ! ASSUMPTION: no block spans more than one rank (the inflated blocks can) !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! ! INTRO: create matrix and right hand side !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! ! How many inflated blocks span more than one rank? NMPI-1 ! call MatCreateAIJ(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, & M, M, PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER,A, ierr) call VecCreate(PETSC_COMM_WORLD,b,ierr) call VecSetSizes(b, PETSC_DECIDE, M,ierr) call VecSetFromOptions(b,ierr) DO I=(MYRANK*(M/NMPI)),((MYRANK+1)*(M/NMPI)-1) ! Set matrix v=1 call MatSetValue(A, I, I, v, INSERT_VALUES, ierr) IF (I-block_size .GE. 0) THEN v=-1 call MatSetValue(A, I, I-block_size, v, INSERT_VALUES, ierr) ENDIF ! Set rhs v = I call VecSetValue(b,I,v, INSERT_VALUES,ierr) END DO call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr) call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr) call VecAssemblyBegin(b,ierr) call VecAssemblyEnd(b,ierr) !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! ! FIRST KSP/PC SETUP !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! call KSPCreate(PETSC_COMM_WORLD, ksp, ierr) call KSPSetOperators(ksp, A, A, ierr) call KSPSetType(ksp, 'preonly', ierr) call KSPGetPC(ksp, pc, ierr) call PCSetType(pc, PCGASM, ierr) !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! !! GASM, SETTING SUBDOMAINS !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! DO IS_COUNTER=1,N_blocks srank = MAX(((IS_COUNTER-2)*block_size)/(M/NMPI),0) ! start rank reached by inflated block erank = MIN(((IS_COUNTER-1)*block_size)/(M/NMPI),NMPI-1) ! end rank reached by inflated block. Coincides with rank containing non-inflated block ! Create subcomms color = MPI_UNDEFINED IF (myrank == srank .or. myrank == erank) THEN color = 1 ENDIF call MPI_Comm_split(MPI_COMM_WORLD,color,MYRANK,subcomm,ierr) ! Create IS IF (srank .EQ. erank) THEN ! Block and overlap are on the same rank IF (MYRANK .EQ. srank) THEN call ISCreateStride(PETSC_COMM_SELF,block_size,(IS_COUNTER-1)*block_size,1,subdomains_IS(IS_COUNTER),ierr) IF (IS_COUNTER .EQ. 1) THEN ! the first block is not inflated call ISCreateStride(PETSC_COMM_SELF,block_size,(IS_COUNTER-1)*block_size,1,inflated_IS(IS_COUNTER),ierr) ELSE call ISCreateStride(PETSC_COMM_SELF,2*block_size,(IS_COUNTER-2)*block_size,1,inflated_IS(IS_COUNTER),ierr) ENDIF ENDIF else ! Block and overlap not on the same rank if (myrank == erank) then ! the block call ISCreateStride (subcomm,block_size,(IS_COUNTER-1)*block_size,1,subdomains_IS(IS_COUNTER),ierr) call ISCreateStride (subcomm,block_size,(IS_COUNTER-1)*block_size,1,inflated_IS(IS_COUNTER),ierr) endif if (myrank == srank) then ! 
the overlap call ISCreateStride (subcomm,block_size,(IS_COUNTER-2)*block_size,1,inflated_IS(IS_COUNTER),ierr) call ISCreateStride (subcomm,0,(IS_COUNTER-1)*block_size,1,subdomains_IS(IS_COUNTER),ierr) endif endif call MPI_Comm_free(subcomm, ierr) END DO ! Set the domains/subdomains NSub = N_blocks/NMPI is_start = 1 + myrank * NSub is_end = min(is_start + NSub, N_blocks) if (myrank + 1 < NMPI) then NSub = NSub + 1 endif call PCGASMSetSubdomains(pc,NSub,subdomains_IS(is_start:is_end),inflated_IS(is_start:is_end),ierr) call PCGASMDestroySubdomains(NSub,subdomains_IS(is_start:is_end),inflated_IS(is_start:is_end),ierr) call PetscOptionsSetValue(PETSC_NULL_OPTIONS,"-sub_ksp_type", "gmres", ierr) call PetscOptionsSetValue(PETSC_NULL_OPTIONS,"-sub_pc_type", "none", ierr) call PetscOptionsSetValue(PETSC_NULL_OPTIONS,"-pc_gasm_view_subdomains", "1", ierr) call KSPSetUp(ksp, ierr) call PCSetUp(pc, ierr) call KSPSetFromOptions(ksp, ierr) call PCSetFromOptions(pc, ierr) call KSPView(ksp,PETSC_VIEWER_STDOUT_WORLD, ierr) Il giorno mer 10 mag 2023 alle ore 03:02 Barry Smith ha scritto: > > > On May 9, 2023, at 4:58 PM, LEONARDO MUTTI < > leonardo.mutti01 at universitadipavia.it> wrote: > > In my notation diag(1,1) means a diagonal 2x2 matrix with 1,1 on the > diagonal, submatrix in the 8x8 diagonal matrix diag(1,1,2,2,...,2). > Am I then correct that the IS representing diag(1,1) is 0,1, and that > diag(2,2,...,2) is represented by 2,3,4,5,6,7? > > > I believe so > > Thanks, > Leonardo > > Il mar 9 mag 2023, 20:45 Barry Smith ha scritto: > >> >> It is simplier than you are making it out to be. Each IS[] is a list of >> rows (and columns) in the sub (domain) matrix. In your case with the matrix >> of 144 by 144 the indices will go from 0 to 143. >> >> In your simple Fortran code you have a completely different problem. A >> matrix with 8 rows and columns. In that case if you want the first IS to >> represent just the first row (and column) in the matrix then it should >> contain only 0. The second submatrix which is all rows (but the first) >> should have 1,2,3,4,5,6,7 >> >> I do not understand why your code has >> >> indices_first_domain = [0,1,8,9] ! corresponds to diag(1,1) >>>>> >>>> >> it should just be 0 >> >> >> >> >> >> On May 9, 2023, at 12:44 PM, LEONARDO MUTTI < >> leonardo.mutti01 at universitadipavia.it> wrote: >> >> Partial typo: I expect 9x(16+16) numbers to be stored in subdomain_IS : # >> subdomains x (row indices of the submatrix + col indices of the submatrix). >> >> Il giorno mar 9 mag 2023 alle ore 18:31 LEONARDO MUTTI < >> leonardo.mutti01 at universitadipavia.it> ha scritto: >> >>> >>> >>> ---------- Forwarded message --------- >>> Da: LEONARDO MUTTI >>> Date: mar 9 mag 2023 alle ore 18:29 >>> Subject: Re: [petsc-users] Understanding index sets for PCGASM >>> To: Matthew Knepley >>> >>> >>> Thank you for your answer, but I am still confused, sorry. >>> Consider >>> https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/ksp/tests/ex71f.F90 on >>> one processor. >>> Let M=12 for the sake of simplicity, i.e. we deal with a 12x12 2D grid, >>> hence, a 144x144 matrix. >>> Let NSubx = 3, so that on the grid we do 3 vertical and 3 horizontal >>> subdivisions. >>> We should obtain 9 subdomains that are grids of 4x4 nodes each, thus >>> corresponding to 9 submatrices of size 16x16. 
>>> In my run I obtain NSub = 9 (great) and subdomain_IS(i), i=1,...,9, >>> reads: >>> >>> *IS Object: 1 MPI process* >>> * type: general* >>> *Number of indices in set 16* >>> *0 0* >>> *1 1* >>> *2 2* >>> *3 3* >>> *4 12* >>> *5 13* >>> *6 14* >>> *7 15* >>> *8 24* >>> *9 25* >>> *10 26* >>> *11 27* >>> *12 36* >>> *13 37* >>> *14 38* >>> *15 39* >>> *IS Object: 1 MPI process* >>> * type: general* >>> *Number of indices in set 16* >>> *0 4* >>> *1 5* >>> *2 6* >>> *3 7* >>> *4 16* >>> *5 17* >>> *6 18* >>> *7 19* >>> *8 28* >>> *9 29* >>> *10 30* >>> *11 31* >>> *12 40* >>> *13 41* >>> *14 42* >>> *15 43* >>> *IS Object: 1 MPI process* >>> * type: general* >>> *Number of indices in set 16* >>> *0 8* >>> *1 9* >>> *2 10* >>> *3 11* >>> *4 20* >>> *5 21* >>> *6 22* >>> *7 23* >>> *8 32* >>> *9 33* >>> *10 34* >>> *11 35* >>> *12 44* >>> *13 45* >>> *14 46* >>> *15 47* >>> *IS Object: 1 MPI process* >>> * type: general* >>> *Number of indices in set 16* >>> *0 48* >>> *1 49* >>> *2 50* >>> *3 51* >>> *4 60* >>> *5 61* >>> *6 62* >>> *7 63* >>> *8 72* >>> *9 73* >>> *10 74* >>> *11 75* >>> *12 84* >>> *13 85* >>> *14 86* >>> *15 87* >>> *IS Object: 1 MPI process* >>> * type: general* >>> *Number of indices in set 16* >>> *0 52* >>> *1 53* >>> *2 54* >>> *3 55* >>> *4 64* >>> *5 65* >>> *6 66* >>> *7 67* >>> *8 76* >>> *9 77* >>> *10 78* >>> *11 79* >>> *12 88* >>> *13 89* >>> *14 90* >>> *15 91* >>> *IS Object: 1 MPI process* >>> * type: general* >>> *Number of indices in set 16* >>> *0 56* >>> *1 57* >>> *2 58* >>> *3 59* >>> *4 68* >>> *5 69* >>> *6 70* >>> *7 71* >>> *8 80* >>> *9 81* >>> *10 82* >>> *11 83* >>> *12 92* >>> *13 93* >>> *14 94* >>> *15 95* >>> *IS Object: 1 MPI process* >>> * type: general* >>> *Number of indices in set 16* >>> *0 96* >>> *1 97* >>> *2 98* >>> *3 99* >>> *4 108* >>> *5 109* >>> *6 110* >>> *7 111* >>> *8 120* >>> *9 121* >>> *10 122* >>> *11 123* >>> *12 132* >>> *13 133* >>> *14 134* >>> *15 135* >>> *IS Object: 1 MPI process* >>> * type: general* >>> *Number of indices in set 16* >>> *0 100* >>> *1 101* >>> *2 102* >>> *3 103* >>> *4 112* >>> *5 113* >>> *6 114* >>> *7 115* >>> *8 124* >>> *9 125* >>> *10 126* >>> *11 127* >>> *12 136* >>> *13 137* >>> *14 138* >>> *15 139* >>> *IS Object: 1 MPI process* >>> * type: general* >>> *Number of indices in set 16* >>> *0 104* >>> *1 105* >>> *2 106* >>> *3 107* >>> *4 116* >>> *5 117* >>> *6 118* >>> *7 119* >>> *8 128* >>> *9 129* >>> *10 130* >>> *11 131* >>> *12 140* >>> *13 141* >>> *14 142* >>> *15 143* >>> >>> As you said, no number here reaches 144. >>> But the number stored in subdomain_IS are 9x16= #subdomains x 16, >>> whereas I would expect, also given your latest reply, 9x16x16x2=#subdomains >>> x submatrix height x submatrix width x length of a (row,column) pair. >>> It would really help me if you could briefly explain how the output >>> above encodes the subdivision into subdomains. >>> Many thanks again, >>> Leonardo >>> >>> >>> >>> Il giorno mar 9 mag 2023 alle ore 16:24 Matthew Knepley < >>> knepley at gmail.com> ha scritto: >>> >>>> On Tue, May 9, 2023 at 10:05?AM LEONARDO MUTTI < >>>> leonardo.mutti01 at universitadipavia.it> wrote: >>>> >>>>> Great thanks! I can now successfully run >>>>> https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/ksp/tests/ex71f.F90 >>>>> . >>>>> >>>>> Going forward with my experiments, let me post a new code snippet >>>>> (very similar to ex71f.F90) that I cannot get to work, probably I must be >>>>> setting up the IS objects incorrectly. 
>>>>> >>>>> I have an 8x8 matrix A=diag(1,1,2,2,...,2) and a >>>>> vector b=(0.5,...,0.5). We have only one processor, and I want to solve >>>>> Ax=b using GASM. In particular, KSP is set to preonly, GASM is the >>>>> preconditioner and it uses on each submatrix an lu direct solver (sub_ksp = >>>>> preonly, sub_pc = lu). >>>>> >>>>> For the GASM algorithm, I divide A into diag(1,1) and diag(2,2,...,2). >>>>> For simplicity I set 0 overlap. Now I want to use GASM to solve Ax=b. The >>>>> code follows. >>>>> >>>>> #include >>>>> #include >>>>> #include >>>>> USE petscmat >>>>> USE petscksp >>>>> USE petscpc >>>>> USE MPI >>>>> >>>>> Mat :: A >>>>> Vec :: b, x >>>>> PetscInt :: M, I, J, ISLen, NSub >>>>> PetscMPIInt :: size >>>>> PetscErrorCode :: ierr >>>>> PetscScalar :: v >>>>> KSP :: ksp >>>>> PC :: pc >>>>> IS :: subdomains_IS(2), inflated_IS(2) >>>>> PetscInt,DIMENSION(4) :: indices_first_domain >>>>> PetscInt,DIMENSION(36) :: indices_second_domain >>>>> >>>>> call PetscInitialize(PETSC_NULL_CHARACTER, ierr) >>>>> call MPI_Comm_size(PETSC_COMM_WORLD, size, ierr) >>>>> >>>>> >>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>>> ! INTRO: create matrix and right hand side >>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>>> >>>>> WRITE(*,*) "Assembling A,b" >>>>> >>>>> M = 8 >>>>> call MatCreateAIJ(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, >>>>> & M, M, PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, >>>>> & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER,A, ierr) >>>>> DO I=1,M >>>>> DO J=1,M >>>>> IF ((I .EQ. J) .AND. (I .LE. 2 )) THEN >>>>> v = 1 >>>>> ELSE IF ((I .EQ. J) .AND. (I .GT. 2 )) THEN >>>>> v = 2 >>>>> ELSE >>>>> v = 0 >>>>> ENDIF >>>>> call MatSetValue(A, I-1, J-1, v, INSERT_VALUES, ierr) >>>>> END DO >>>>> END DO >>>>> >>>>> call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr) >>>>> call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr) >>>>> >>>>> call VecCreate(PETSC_COMM_WORLD,b,ierr) >>>>> call VecSetSizes(b, PETSC_DECIDE, M,ierr) >>>>> call VecSetFromOptions(b,ierr) >>>>> >>>>> do I=1,M >>>>> v = 0.5 >>>>> call VecSetValue(b,I-1,v, INSERT_VALUES,ierr) >>>>> end do >>>>> >>>>> call VecAssemblyBegin(b,ierr) >>>>> call VecAssemblyEnd(b,ierr) >>>>> >>>>> >>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>>> ! FIRST KSP/PC SETUP >>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>>> >>>>> WRITE(*,*) "KSP/PC first setup" >>>>> >>>>> call KSPCreate(PETSC_COMM_WORLD, ksp, ierr) >>>>> call KSPSetOperators(ksp, A, A, ierr) >>>>> call KSPSetType(ksp, 'preonly', ierr) >>>>> call KSPGetPC(ksp, pc, ierr) >>>>> call KSPSetUp(ksp, ierr) >>>>> call PCSetType(pc, PCGASM, ierr) >>>>> >>>>> >>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>>> ! GASM, SETTING SUBDOMAINS >>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>>> >>>>> WRITE(*,*) "Setting GASM subdomains" >>>>> >>>>> ! Let's create the subdomain IS and inflated_IS >>>>> ! They are equal if no overlap is present >>>>> ! They are 1: 0,1,8,9 >>>>> ! 2: 10,...,15,18,...,23,...,58,...,63 >>>>> >>>>> indices_first_domain = [0,1,8,9] ! corresponds to diag(1,1) >>>>> do I=0,5 >>>>> do J=0,5 >>>>> indices_second_domain(I*6+1+J) = 18 + J + 8*I ! >>>>> corresponds to diag(2,2,...,2) >>>>> !WRITE(*,*) I*6+1+J, 18 + J + 8*I >>>>> end do >>>>> end do >>>>> >>>>> ! 
Convert into IS >>>>> ISLen = 4 >>>>> call ISCreateGeneral(PETSC_COMM_WORLD,ISLen,indices_first_domain, >>>>> & PETSC_COPY_VALUES, subdomains_IS(1), ierr) >>>>> call ISCreateGeneral(PETSC_COMM_WORLD,ISLen,indices_first_domain, >>>>> & PETSC_COPY_VALUES, inflated_IS(1), ierr) >>>>> ISLen = 36 >>>>> call >>>>> ISCreateGeneral(PETSC_COMM_WORLD,ISLen,indices_second_domain, >>>>> & PETSC_COPY_VALUES, subdomains_IS(2), ierr) >>>>> call >>>>> ISCreateGeneral(PETSC_COMM_WORLD,ISLen,indices_second_domain, >>>>> & PETSC_COPY_VALUES, inflated_IS(2), ierr) >>>>> >>>>> NSub = 2 >>>>> call PCGASMSetSubdomains(pc,NSub, >>>>> & subdomains_IS,inflated_IS,ierr) >>>>> call PCGASMDestroySubdomains(NSub, >>>>> & subdomains_IS,inflated_IS,ierr) >>>>> >>>>> >>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>>> ! GASM: SET SUBSOLVERS >>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>>> >>>>> WRITE(*,*) "Setting subsolvers for GASM" >>>>> >>>>> call PCSetUp(pc, ierr) ! should I add this? >>>>> >>>>> call PetscOptionsSetValue(PETSC_NULL_OPTIONS, >>>>> & "-sub_pc_type", "lu", ierr) >>>>> call PetscOptionsSetValue(PETSC_NULL_OPTIONS, >>>>> & "-sub_ksp_type", "preonly", ierr) >>>>> >>>>> call KSPSetFromOptions(ksp, ierr) >>>>> call PCSetFromOptions(pc, ierr) >>>>> >>>>> >>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>>> ! DUMMY SOLUTION: DID IT WORK? >>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>>> >>>>> WRITE(*,*) "Solve" >>>>> >>>>> call VecDuplicate(b,x,ierr) >>>>> call KSPSolve(ksp,b,x,ierr) >>>>> >>>>> call MatDestroy(A, ierr) >>>>> call KSPDestroy(ksp, ierr) >>>>> call PetscFinalize(ierr) >>>>> >>>>> This code is failing in multiple points. At call PCSetUp(pc, ierr) it >>>>> produces: >>>>> >>>>> *[0]PETSC ERROR: Argument out of range* >>>>> *[0]PETSC ERROR: Scatter indices in ix are out of range* >>>>> *...* >>>>> *[0]PETSC ERROR: #1 VecScatterCreate() at >>>>> ***\src\vec\is\sf\INTERF~1\vscat.c:736* >>>>> *[0]PETSC ERROR: #2 PCSetUp_GASM() at >>>>> ***\src\ksp\pc\impls\gasm\gasm.c:433* >>>>> *[0]PETSC ERROR: #3 PCSetUp() at ***\src\ksp\pc\INTERF~1\precon.c:994* >>>>> >>>>> And at call KSPSolve(ksp,b,x,ierr) it produces: >>>>> >>>>> *forrtl: severe (157): Program Exception - access violation* >>>>> >>>>> >>>>> The index sets are setup coherently with the outputs of e.g. >>>>> https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/ksp/tests/output/ex71f_1.out: >>>>> in particular each element of the matrix A corresponds to a number from 0 >>>>> to 63. >>>>> >>>> >>>> This is not correct, I believe. The indices are row/col indices, not >>>> indices into dense blocks, so for >>>> your example, they are all in [0, 8]. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Note that each submatrix does not represent some physical subdomain, >>>>> the subdivision is just at the algebraic level. >>>>> I thus have the following questions: >>>>> >>>>> - is this the correct way of creating the IS objects, given my >>>>> objective at the beginning of the email? Is the ordering correct? >>>>> - what am I doing wrong that is generating the above errors? >>>>> >>>>> Thanks for the patience and the time. >>>>> Best, >>>>> Leonardo >>>>> >>>>> Il giorno ven 5 mag 2023 alle ore 18:43 Barry Smith >>>>> ha scritto: >>>>> >>>>>> >>>>>> Added in *barry/2023-05-04/add-pcgasm-set-subdomains *see also >>>>>> https://gitlab.com/petsc/petsc/-/merge_requests/6419 >>>>>> >>>>>> Barry >>>>>> >>>>>> >>>>>> On May 4, 2023, at 11:23 AM, LEONARDO MUTTI < >>>>>> leonardo.mutti01 at universitadipavia.it> wrote: >>>>>> >>>>>> Thank you for the help. 
>>>>>> Adding to my example: >>>>>> >>>>>> >>>>>> * call PCGASMSetSubdomains(pc,NSub, subdomains_IS, >>>>>> inflated_IS,ierr) call >>>>>> PCGASMDestroySubdomains(NSub,subdomains_IS,inflated_IS,ierr)* >>>>>> results in: >>>>>> >>>>>> * Error LNK2019 unresolved external symbol >>>>>> PCGASMDESTROYSUBDOMAINS referenced in function ... * >>>>>> >>>>>> * Error LNK2019 unresolved external symbol PCGASMSETSUBDOMAINS >>>>>> referenced in function ... * >>>>>> I'm not sure if the interfaces are missing or if I have a compilation >>>>>> problem. >>>>>> Thank you again. >>>>>> Best, >>>>>> Leonardo >>>>>> >>>>>> Il giorno sab 29 apr 2023 alle ore 20:30 Barry Smith < >>>>>> bsmith at petsc.dev> ha scritto: >>>>>> >>>>>>> >>>>>>> Thank you for the test code. I have a fix in the branch >>>>>>> barry/2023-04-29/fix-pcasmcreatesubdomains2d >>>>>>> with >>>>>>> merge request https://gitlab.com/petsc/petsc/-/merge_requests/6394 >>>>>>> >>>>>>> The functions did not have proper Fortran stubs and interfaces so >>>>>>> I had to provide them manually in the new branch. >>>>>>> >>>>>>> Use >>>>>>> >>>>>>> git fetch >>>>>>> git checkout barry/2023-04-29/fix-pcasmcreatesubdomains2d >>>>>>> >>>>>>> ./configure etc >>>>>>> >>>>>>> Your now working test code is in src/ksp/ksp/tests/ex71f.F90 I >>>>>>> had to change things slightly and I updated the error handling for the >>>>>>> latest version. >>>>>>> >>>>>>> Please let us know if you have any later questions. >>>>>>> >>>>>>> Barry >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> On Apr 28, 2023, at 12:07 PM, LEONARDO MUTTI < >>>>>>> leonardo.mutti01 at universitadipavia.it> wrote: >>>>>>> >>>>>>> Hello. I am having a hard time understanding the index sets to feed >>>>>>> PCGASMSetSubdomains, and I am working in Fortran (as a PETSc novice). 
To >>>>>>> get more intuition on how the IS objects behave I tried the following >>>>>>> minimal (non) working example, which should tile a 16x16 matrix into 16 >>>>>>> square, non-overlapping submatrices: >>>>>>> >>>>>>> #include >>>>>>> #include >>>>>>> #include >>>>>>> USE petscmat >>>>>>> USE petscksp >>>>>>> USE petscpc >>>>>>> >>>>>>> Mat :: A >>>>>>> PetscInt :: M, NSubx, dof, overlap, NSub >>>>>>> INTEGER :: I,J >>>>>>> PetscErrorCode :: ierr >>>>>>> PetscScalar :: v >>>>>>> KSP :: ksp >>>>>>> PC :: pc >>>>>>> IS :: subdomains_IS, inflated_IS >>>>>>> >>>>>>> call PetscInitialize(PETSC_NULL_CHARACTER , ierr) >>>>>>> >>>>>>> !-----Create a dummy matrix >>>>>>> M = 16 >>>>>>> call MatCreateAIJ(MPI_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, >>>>>>> & M, M, >>>>>>> & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, >>>>>>> & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, >>>>>>> & A, ierr) >>>>>>> >>>>>>> DO I=1,M >>>>>>> DO J=1,M >>>>>>> v = I*J >>>>>>> CALL MatSetValue (A,I-1,J-1,v, >>>>>>> & INSERT_VALUES , ierr) >>>>>>> END DO >>>>>>> END DO >>>>>>> >>>>>>> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY , ierr) >>>>>>> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY , ierr) >>>>>>> >>>>>>> !-----Create KSP and PC >>>>>>> call KSPCreate(PETSC_COMM_WORLD,ksp, ierr) >>>>>>> call KSPSetOperators(ksp,A,A, ierr) >>>>>>> call KSPSetType(ksp,"bcgs",ierr) >>>>>>> call KSPGetPC(ksp,pc,ierr) >>>>>>> call KSPSetUp(ksp, ierr) >>>>>>> call PCSetType(pc,PCGASM, ierr) >>>>>>> call PCSetUp(pc , ierr) >>>>>>> >>>>>>> !-----GASM setup >>>>>>> NSubx = 4 >>>>>>> dof = 1 >>>>>>> overlap = 0 >>>>>>> >>>>>>> call PCGASMCreateSubdomains2D(pc, >>>>>>> & M, M, >>>>>>> & NSubx, NSubx, >>>>>>> & dof, overlap, >>>>>>> & NSub, subdomains_IS, inflated_IS, ierr) >>>>>>> >>>>>>> call ISView(subdomains_IS, PETSC_VIEWER_STDOUT_WORLD, ierr) >>>>>>> >>>>>>> call KSPDestroy(ksp, ierr) >>>>>>> call PetscFinalize(ierr) >>>>>>> >>>>>>> Running this on one processor, I get NSub = 4. >>>>>>> If PCASM and PCASMCreateSubdomains2D are used instead, I get NSub = >>>>>>> 16 as expected. >>>>>>> Moreover, I get in the end "forrtl: severe (157): Program Exception >>>>>>> - access violation". So: >>>>>>> 1) why do I get two different results with ASM, and GASM? >>>>>>> 2) why do I get access violation and how can I solve this? >>>>>>> In fact, in C, subdomains_IS, inflated_IS should pointers to IS >>>>>>> objects. As I see on the Fortran interface, the arguments to >>>>>>> PCGASMCreateSubdomains2D are IS objects: >>>>>>> >>>>>>> subroutine PCGASMCreateSubdomains2D(a,b,c,d,e,f,g,h,i,j,z) >>>>>>> import tPC,tIS >>>>>>> PC a ! PC >>>>>>> PetscInt b ! PetscInt >>>>>>> PetscInt c ! PetscInt >>>>>>> PetscInt d ! PetscInt >>>>>>> PetscInt e ! PetscInt >>>>>>> PetscInt f ! PetscInt >>>>>>> PetscInt g ! PetscInt >>>>>>> PetscInt h ! PetscInt >>>>>>> IS i ! IS >>>>>>> IS j ! IS >>>>>>> PetscErrorCode z >>>>>>> end subroutine PCGASMCreateSubdomains2D >>>>>>> Thus: >>>>>>> 3) what should be inside e.g., subdomains_IS? I expect it to >>>>>>> contain, for every created subdomain, the list of rows and columns defining >>>>>>> the subblock in the matrix, am I right? >>>>>>> >>>>>>> Context: I have a block-tridiagonal system arising from space-time >>>>>>> finite elements, and I want to solve it with GMRES+PCGASM preconditioner, >>>>>>> where each overlapping submatrix is on the diagonal and of size 3x3 blocks >>>>>>> (and spanning multiple processes). This is PETSc 3.17.1 on Windows. 
>>>>>>> >>>>>>> Thanks in advance, >>>>>>> Leonardo >>>>>>> >>>>>>> >>>>>>> >>>>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed May 17 10:10:51 2023 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 17 May 2023 11:10:51 -0400 Subject: [petsc-users] Large MATMPIAIJ - 32bit integer overflow in nz value In-Reply-To: References: Message-ID: Yeah, this is silly. The check is just a "sanity-check" on the data in the file. We store redundant information in the matrix header in the file, header[3] is the total number of nonzeros in the matrix. When nz is too large, the correct value cannot fit in the header. Changing the file format would be a breaking change meaning old files could no longer be read (MatLoad() can still load binary files saved in 1995) which would be bad. I think the easiest way to handle this is to provide an option to avoid the unneeded (and doesn't work) sanity check. In other words, something like -mat_load_ignore_nz turns off the sanity check in MatLoad_MPIAIJ_Binary(). Is that something you could try in an MR or would you prefer we do it? Barry > On May 17, 2023, at 8:55 AM, Fleischli Benno HSLU T&A wrote: > > Dear PETSc developers > > I am creating a very large parallel sparse matrix (MATMPIAIJ) with PETSc. I write this matrix to disk. > The number of non-zeros exceeds the maximum number a 32-bit integer can hold. > When I read the matrix from disk i get an error because there was an overflow in the nz number. > (see petsc-3.18.4/src/mat/impls/aij/seq/aij.c:4977) > > Obviously I could compile PETSc with 64bit integers (--with-64-bit-indices). > But I wanted to ask if there is another way. Because the total number of nonzeros nz is the only numer that exceeds the 32bit limit. > It would not be efficient to use 64bit integers everywhere just because of this single number. > > This how I configured PETSc: > > ./configure --download-fblaslapack --download-hpddm --download-hypre --with-debugging=0 > COPTFLAGS='-O3 -march=native -mtune=native' CXXOPTFLAGS='-O3 -march=native -mtune=native' > FOPTFLAGS='-O3 -march=native -mtune=native' --with-scalar-type=real (--with-mpi-dir=/home/benno/Libraries/openMPI) > > > > Kind Regards > > Benno > > > > ________________________________ > Hochschule Luzern > Technik & Architektur > Institute for Mechanical Engineering and Energy Technology > Competence Center Fluid Mechanics and Numerical Methods > > Benno Fleischli > MSc in Mechanical Engineering / BSc in Electrical Engineering > Wissenschaftlicher Mitarbeiter > benno.fleischli at hslu.ch -------------- next part -------------- An HTML attachment was scrubbed... URL: From berend.vanwachem at ovgu.de Wed May 17 10:20:45 2023 From: berend.vanwachem at ovgu.de (Berend van Wachem) Date: Wed, 17 May 2023 17:20:45 +0200 Subject: [petsc-users] DMGetCoordinatesLocal and DMPlexGetCellCoordinates in PETSc > 3.18 In-Reply-To: References: <2f3c494f-3d63-a2f4-d8cc-fba6893c0ebb@ovgu.de> <5181359d-c8d6-da6e-8b0a-3fb1c6183026@ovgu.de> Message-ID: Dear Matt, Is there a way to 'redo' the DMLocalizeCoordinates() ? Or to undo it? Alternatively, can we make the calling of DMLocalizeCoordinates() in the DMPlexCreate...() routines optional? 
Otherwise, we would have to copy all arrays of coordinates from DMGetCoordinatesLocal() and DMGetCellCoordinatesLocal() before scaling them. Best regards, Berend. On 5/17/23 16:35, Matthew Knepley wrote: > On Wed, May 17, 2023 at 10:21?AM Berend van Wachem > wrote: > > Dear Matt, > > Thanks for getting back to me so quickly. > > If I scale each of the coordinates of the mesh (say, I want to cube each > co-ordinate), and I do this for both: > > DMGetCoordinatesLocal(); > DMGetCellCoordinatesLocal(); > > How do I know I am not cubing one coordinate multiple times? > > > Good question. Right now, the only connection between the two sets of coordinates is DMLocalizeCoordinates(). Since sometimes > people want to do non-trivial things to > coordinates, I prefer not to push in an API for "just" scaling, but I could be convinced > the other way. > > ? Thanks, > > ? ? ?Matt > > Thanks, Berend. > > On 5/17/23 16:10, Matthew Knepley wrote: > > On Wed, May 17, 2023 at 10:02?AM Berend van Wachem > > >> wrote: > > > >? ? ?Dear PETSc Team, > > > >? ? ?We are using DMPlex, and we create a mesh using > > > >? ? ?DMPlexCreateBoxMesh (.... ); > > > >? ? ?and get a uniform mesh. The mesh is periodic. > > > >? ? ?We typically want to "scale" the coordinates (vertices) of the mesh, > >? ? ?and > >? ? ?to achieve this, we call > > > >? ? ?DMGetCoordinatesLocal(dm, &coordinates); > > > >? ? ?and scale the entries in the Vector coordinates appropriately. > > > >? ? ?and then > > > >? ? ?DMSetCoordinatesLocal(dm, coordinates); > > > > > >? ? ?After this, we localise the coordinates by calling > > > >? ? ?DMLocalizeCoordinates(dm); > > > >? ? ?This worked fine up to PETSc 3.18, but with versions after this, the > >? ? ?coordinates we get from the call > > > >? ? ?DMPlexGetCellCoordinates(dm, CellID, &isDG, &CoordSize, > >? ? ?&ArrayCoordinates, &Coordinates); > > > >? ? ?are no longer correct if the mesh is periodic. A number of the > >? ? ?coordinates returned from calling DMPlexGetCellCoordinates are wrong. > > > >? ? ?I think, this is because DMLocalizeCoordinates is now automatically > >? ? ?called within the routine DMPlexCreateBoxMesh. > > > >? ? ?So, my question is: How should we scale the coordinates from a periodic > >? ? ?DMPlex mesh so that they are reflected correctly when calling both > >? ? ?DMGetCoordinatesLocal and DMPlexGetCellCoordinates, with PETSc versions > >? ? ? ?>= 3.18? > > > > > > I think we might have to add an API function. For now, when you scale > > the coordinates, > > can you scale both copies? > > > >? ? DMGetCoordinatesLocal() > >? ? DMGetCellCoordinatesLocal(); > > > > and then set them back. > > > >? ? Thanks, > > > >? ? ? ?Matt > > > >? ? ?Many thanks, Berend. > > > > -- > > What most experimenters take for granted before they begin their > > experiments is infinitely more interesting than any results to which > > their experiments lead. > > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to > which their experiments lead. 
> -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ From knepley at gmail.com Wed May 17 10:58:09 2023 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 17 May 2023 11:58:09 -0400 Subject: [petsc-users] DMGetCoordinatesLocal and DMPlexGetCellCoordinates in PETSc > 3.18 In-Reply-To: References: <2f3c494f-3d63-a2f4-d8cc-fba6893c0ebb@ovgu.de> <5181359d-c8d6-da6e-8b0a-3fb1c6183026@ovgu.de> Message-ID: On Wed, May 17, 2023 at 11:20?AM Berend van Wachem wrote: > Dear Matt, > > Is there a way to 'redo' the DMLocalizeCoordinates() ? Or to undo it? > Alternatively, can we make the calling of DMLocalizeCoordinates() in the > DMPlexCreate...() routines optional? > > Otherwise, we would have to copy all arrays of coordinates from > DMGetCoordinatesLocal() and DMGetCellCoordinatesLocal() before > scaling them. > I am likely not being clear. I think all you have to do is the following: DMGetCoordinatesLocal(dm, &xl); VecScale(xl, scale); DMSetCoordinatesLocal(dm, xl); DMGetCellCoordinatesLocal(dm, &xl); VecScale(xl, scale); DMSetCellCoordinatesLocal(dm, xl); Does this not work? Thanks, Matt Best regards, Berend. > > On 5/17/23 16:35, Matthew Knepley wrote: > > On Wed, May 17, 2023 at 10:21?AM Berend van Wachem < > berend.vanwachem at ovgu.de > wrote: > > > > Dear Matt, > > > > Thanks for getting back to me so quickly. > > > > If I scale each of the coordinates of the mesh (say, I want to cube > each > > co-ordinate), and I do this for both: > > > > DMGetCoordinatesLocal(); > > DMGetCellCoordinatesLocal(); > > > > How do I know I am not cubing one coordinate multiple times? > > > > > > Good question. Right now, the only connection between the two sets of > coordinates is DMLocalizeCoordinates(). Since sometimes > > people want to do non-trivial things to > > coordinates, I prefer not to push in an API for "just" scaling, but I > could be convinced > > the other way. > > > > Thanks, > > > > Matt > > > > Thanks, Berend. > > > > On 5/17/23 16:10, Matthew Knepley wrote: > > > On Wed, May 17, 2023 at 10:02?AM Berend van Wachem > > > > > >> wrote: > > > > > > Dear PETSc Team, > > > > > > We are using DMPlex, and we create a mesh using > > > > > > DMPlexCreateBoxMesh (.... ); > > > > > > and get a uniform mesh. The mesh is periodic. > > > > > > We typically want to "scale" the coordinates (vertices) of > the mesh, > > > and > > > to achieve this, we call > > > > > > DMGetCoordinatesLocal(dm, &coordinates); > > > > > > and scale the entries in the Vector coordinates appropriately. > > > > > > and then > > > > > > DMSetCoordinatesLocal(dm, coordinates); > > > > > > > > > After this, we localise the coordinates by calling > > > > > > DMLocalizeCoordinates(dm); > > > > > > This worked fine up to PETSc 3.18, but with versions after > this, the > > > coordinates we get from the call > > > > > > DMPlexGetCellCoordinates(dm, CellID, &isDG, &CoordSize, > > > &ArrayCoordinates, &Coordinates); > > > > > > are no longer correct if the mesh is periodic. A number of the > > > coordinates returned from calling DMPlexGetCellCoordinates > are wrong. > > > > > > I think, this is because DMLocalizeCoordinates is now > automatically > > > called within the routine DMPlexCreateBoxMesh. > > > > > > So, my question is: How should we scale the coordinates from > a periodic > > > DMPlex mesh so that they are reflected correctly when calling > both > > > DMGetCoordinatesLocal and DMPlexGetCellCoordinates, with > PETSc versions > > > >= 3.18? > > > > > > > > > I think we might have to add an API function. 
For now, when you > scale > > > the coordinates, > > > can you scale both copies? > > > > > > DMGetCoordinatesLocal() > > > DMGetCellCoordinatesLocal(); > > > > > > and then set them back. > > > > > > Thanks, > > > > > > Matt > > > > > > Many thanks, Berend. > > > > > > -- > > > What most experimenters take for granted before they begin their > > > experiments is infinitely more interesting than any results to > which > > > their experiments lead. > > > -- Norbert Wiener > > > > > > https://www.cse.buffalo.edu/~knepley/ < > https://www.cse.buffalo.edu/~knepley/> < > http://www.cse.buffalo.edu/~knepley/ > > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to > > which their experiments lead. > > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ < > http://www.cse.buffalo.edu/~knepley/> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexlindsay239 at gmail.com Wed May 17 11:47:16 2023 From: alexlindsay239 at gmail.com (Alexander Lindsay) Date: Wed, 17 May 2023 09:47:16 -0700 Subject: [petsc-users] Nested field split Message-ID: I've seen threads in the archives about nested field split but I'm not sure they match what I'm asking about. I'm doing a Schur field split for a porous version of incompressible Navier-Stokes. In addition to pressure and velocity fields, we have fluid and solid temperature fields. I plan to put all primal variables in one split and the pressure obviously in the Schur split. Now within the "primal variable split" a user is wondering whether we can do a further split, e.g. perhaps an additive split with the solid temperature split out from the velocities and fluid temperature (the former is almost pure conduction whereas the latter may be advection dominated). Is this possible? Alex -------------- next part -------------- An HTML attachment was scrubbed... URL: From berend.vanwachem at ovgu.de Wed May 17 13:01:22 2023 From: berend.vanwachem at ovgu.de (Berend van Wachem) Date: Wed, 17 May 2023 20:01:22 +0200 Subject: [petsc-users] DMGetCoordinatesLocal and DMPlexGetCellCoordinates in PETSc > 3.18 In-Reply-To: References: <2f3c494f-3d63-a2f4-d8cc-fba6893c0ebb@ovgu.de> <5181359d-c8d6-da6e-8b0a-3fb1c6183026@ovgu.de> Message-ID: <0241f3b5-16b4-786b-61a3-5f8ac64a3f5a@ovgu.de> Dear Matt, I tried it, but it doesn't seem to work. Attached is a very small working example illustrating the problem. I create a DMPlexBox Mesh, periodic in the Y direction. I then scale the Y coordinates with a factor 10, and add 1.0 to it. Both DMGetCoordinatesLocal and DMGetCellCoordinatesLocal. Then I evaluate the coordinates with DMPlexGetCellCoordinates. Most of the Y coordinates are correct, but not all of them - for instance, the minimum Y coordinate is 0.0, and this should be 1.0. Am I doing something wrong? Thanks and best regards, Berend. On 5/17/23 17:58, Matthew Knepley wrote: > On Wed, May 17, 2023 at 11:20?AM Berend van Wachem > wrote: > > Dear Matt, > > Is there a way to 'redo' the DMLocalizeCoordinates() ? Or to undo it? > Alternatively, can we make the calling of DMLocalizeCoordinates() in the? DMPlexCreate...() routines optional? 
> > Otherwise, we would have to copy all arrays of coordinates from DMGetCoordinatesLocal() and DMGetCellCoordinatesLocal() before > scaling them. > > > I am likely not being clear. I think all you have to do is the following: > > ? DMGetCoordinatesLocal(dm, &xl); > ? VecScale(xl, scale); > ? DMSetCoordinatesLocal(dm, xl); > ? DMGetCellCoordinatesLocal(dm, &xl); > ? VecScale(xl, scale); > ? DMSetCellCoordinatesLocal(dm, xl); > > Does this not work? > > ? Thanks, > > ? ? ?Matt > > Best regards, Berend. > > On 5/17/23 16:35, Matthew Knepley wrote: > > On Wed, May 17, 2023 at 10:21?AM Berend van Wachem > >> wrote: > > > >? ? ?Dear Matt, > > > >? ? ?Thanks for getting back to me so quickly. > > > >? ? ?If I scale each of the coordinates of the mesh (say, I want to cube each > >? ? ?co-ordinate), and I do this for both: > > > >? ? ?DMGetCoordinatesLocal(); > >? ? ?DMGetCellCoordinatesLocal(); > > > >? ? ?How do I know I am not cubing one coordinate multiple times? > > > > > > Good question. Right now, the only connection between the two sets of coordinates is DMLocalizeCoordinates(). Since > sometimes > > people want to do non-trivial things to > > coordinates, I prefer not to push in an API for "just" scaling, but I could be convinced > > the other way. > > > >? ? Thanks, > > > >? ? ? ?Matt > > > >? ? ?Thanks, Berend. > > > >? ? ?On 5/17/23 16:10, Matthew Knepley wrote: > >? ? ? > On Wed, May 17, 2023 at 10:02?AM Berend van Wachem > >? ? ? > > > >? ? ?>>> wrote: > >? ? ? > > >? ? ? >? ? ?Dear PETSc Team, > >? ? ? > > >? ? ? >? ? ?We are using DMPlex, and we create a mesh using > >? ? ? > > >? ? ? >? ? ?DMPlexCreateBoxMesh (.... ); > >? ? ? > > >? ? ? >? ? ?and get a uniform mesh. The mesh is periodic. > >? ? ? > > >? ? ? >? ? ?We typically want to "scale" the coordinates (vertices) of the mesh, > >? ? ? >? ? ?and > >? ? ? >? ? ?to achieve this, we call > >? ? ? > > >? ? ? >? ? ?DMGetCoordinatesLocal(dm, &coordinates); > >? ? ? > > >? ? ? >? ? ?and scale the entries in the Vector coordinates appropriately. > >? ? ? > > >? ? ? >? ? ?and then > >? ? ? > > >? ? ? >? ? ?DMSetCoordinatesLocal(dm, coordinates); > >? ? ? > > >? ? ? > > >? ? ? >? ? ?After this, we localise the coordinates by calling > >? ? ? > > >? ? ? >? ? ?DMLocalizeCoordinates(dm); > >? ? ? > > >? ? ? >? ? ?This worked fine up to PETSc 3.18, but with versions after this, the > >? ? ? >? ? ?coordinates we get from the call > >? ? ? > > >? ? ? >? ? ?DMPlexGetCellCoordinates(dm, CellID, &isDG, &CoordSize, > >? ? ? >? ? ?&ArrayCoordinates, &Coordinates); > >? ? ? > > >? ? ? >? ? ?are no longer correct if the mesh is periodic. A number of the > >? ? ? >? ? ?coordinates returned from calling DMPlexGetCellCoordinates are wrong. > >? ? ? > > >? ? ? >? ? ?I think, this is because DMLocalizeCoordinates is now automatically > >? ? ? >? ? ?called within the routine DMPlexCreateBoxMesh. > >? ? ? > > >? ? ? >? ? ?So, my question is: How should we scale the coordinates from a periodic > >? ? ? >? ? ?DMPlex mesh so that they are reflected correctly when calling both > >? ? ? >? ? ?DMGetCoordinatesLocal and DMPlexGetCellCoordinates, with PETSc versions > >? ? ? >? ? ? ?>= 3.18? > >? ? ? > > >? ? ? > > >? ? ? > I think we might have to add an API function. For now, when you scale > >? ? ? > the coordinates, > >? ? ? > can you scale both copies? > >? ? ? > > >? ? ? >? ? DMGetCoordinatesLocal() > >? ? ? >? ? DMGetCellCoordinatesLocal(); > >? ? ? > > >? ? ? > and then set them back. > >? ? ? > > >? ? ? >? ? Thanks, > >? ? ? > > >? ? ? >? ? ? ?Matt > >? 
? ? > > >? ? ? >? ? ?Many thanks, Berend. > >? ? ? > > >? ? ? > -- > >? ? ? > What most experimenters take for granted before they begin their > >? ? ? > experiments is infinitely more interesting than any results to which > >? ? ? > their experiments lead. > >? ? ? > -- Norbert Wiener > >? ? ? > > >? ? ? > https://www.cse.buffalo.edu/~knepley/ > > > >? ? ?>> > > > > > > > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any > results to > > which their experiments lead. > > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to > which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- A non-text attachment was scrubbed... Name: coordscaleexample.c Type: text/x-csrc Size: 3215 bytes Desc: not available URL: From bsmith at petsc.dev Wed May 17 13:59:15 2023 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 17 May 2023 14:59:15 -0400 Subject: [petsc-users] Nested field split In-Reply-To: References: Message-ID: <0BA9CB92-B356-4EC4-B46C-C7A47054A4CF@petsc.dev> Absolutely, that is fundamental to the design. In the simple case where all the degrees of freedom exist at the same grid points, hence storage is like u,v,t,p in the vector the nesting is trivial. You indicate the fields without using IS (don't even need to change any code) -pc_fieldsplit_0_fields 0,1,2 -fieldsplit_pc_fieldsplit_0_fields 0,1 Listing the two complimentary fields pc_fieldsplit_1_fields 3 -fieldsplit_pc_fieldsplit_1_fields 2 should be optional (I can't remember if it is smart enough to allow not listing them) If you have a staggered grid then indicating the fields is trickery (since you don't have the simple u,v,t,p layout of the degrees of freedom) > On May 17, 2023, at 12:47 PM, Alexander Lindsay wrote: > > I've seen threads in the archives about nested field split but I'm not sure they match what I'm asking about. > > I'm doing a Schur field split for a porous version of incompressible Navier-Stokes. In addition to pressure and velocity fields, we have fluid and solid temperature fields. I plan to put all primal variables in one split and the pressure obviously in the Schur split. Now within the "primal variable split" a user is wondering whether we can do a further split, e.g. perhaps an additive split with the solid temperature split out from the velocities and fluid temperature (the former is almost pure conduction whereas the latter may be advection dominated). Is this possible? > > Alex From alexlindsay239 at gmail.com Wed May 17 14:22:52 2023 From: alexlindsay239 at gmail.com (Alexander Lindsay) Date: Wed, 17 May 2023 12:22:52 -0700 Subject: [petsc-users] Nested field split In-Reply-To: <0BA9CB92-B356-4EC4-B46C-C7A47054A4CF@petsc.dev> References: <0BA9CB92-B356-4EC4-B46C-C7A47054A4CF@petsc.dev> Message-ID: Awesome, thanks Barry! On Wed, May 17, 2023 at 11:59?AM Barry Smith wrote: > > Absolutely, that is fundamental to the design. > > In the simple case where all the degrees of freedom exist at the same > grid points, hence storage is like u,v,t,p in the vector the nesting is > trivial. 
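(For the outer solve you describe, this sits on top of the usual Schur options, roughly

   -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_schur_fact_type full

with the primal block's preconditioner itself set to fieldsplit via its prefixed -pc_type option, e.g. something like -fieldsplit_0_pc_type fieldsplit; the exact prefixes for the inner split depend on how the splits end up being named, and -ksp_view will show them. This is only a sketch of how the pieces compose, not the precise option names for your setup.)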
You indicate the fields without using IS (don't even need to > change any code) > > -pc_fieldsplit_0_fields 0,1,2 > -fieldsplit_pc_fieldsplit_0_fields 0,1 > > Listing the two complimentary fields > pc_fieldsplit_1_fields 3 > -fieldsplit_pc_fieldsplit_1_fields 2 > should be optional (I can't remember if it is smart enough to allow not > listing them) > > If you have a staggered grid then indicating the fields is trickery (since > you don't have the simple u,v,t,p layout of the degrees of freedom) > > > > > On May 17, 2023, at 12:47 PM, Alexander Lindsay < > alexlindsay239 at gmail.com> wrote: > > > > I've seen threads in the archives about nested field split but I'm not > sure they match what I'm asking about. > > > > I'm doing a Schur field split for a porous version of incompressible > Navier-Stokes. In addition to pressure and velocity fields, we have fluid > and solid temperature fields. I plan to put all primal variables in one > split and the pressure obviously in the Schur split. Now within the "primal > variable split" a user is wondering whether we can do a further split, e.g. > perhaps an additive split with the solid temperature split out from the > velocities and fluid temperature (the former is almost pure conduction > whereas the latter may be advection dominated). Is this possible? > > > > Alex > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed May 17 14:23:34 2023 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 17 May 2023 15:23:34 -0400 Subject: [petsc-users] Nested field split In-Reply-To: <0BA9CB92-B356-4EC4-B46C-C7A47054A4CF@petsc.dev> References: <0BA9CB92-B356-4EC4-B46C-C7A47054A4CF@petsc.dev> Message-ID: On Wed, May 17, 2023 at 2:59?PM Barry Smith wrote: > > Absolutely, that is fundamental to the design. > > In the simple case where all the degrees of freedom exist at the same > grid points, hence storage is like u,v,t,p in the vector the nesting is > trivial. You indicate the fields without using IS (don't even need to > change any code) > > -pc_fieldsplit_0_fields 0,1,2 > -fieldsplit_pc_fieldsplit_0_fields 0,1 > > Listing the two complimentary fields > pc_fieldsplit_1_fields 3 > -fieldsplit_pc_fieldsplit_1_fields 2 > should be optional (I can't remember if it is smart enough to allow not > listing them) > > If you have a staggered grid then indicating the fields is trickery (since > you don't have the simple u,v,t,p layout of the degrees of freedom) > Here we do something similar. Thanks, Matt > > On May 17, 2023, at 12:47 PM, Alexander Lindsay < > alexlindsay239 at gmail.com> wrote: > > > > I've seen threads in the archives about nested field split but I'm not > sure they match what I'm asking about. > > > > I'm doing a Schur field split for a porous version of incompressible > Navier-Stokes. In addition to pressure and velocity fields, we have fluid > and solid temperature fields. I plan to put all primal variables in one > split and the pressure obviously in the Schur split. Now within the "primal > variable split" a user is wondering whether we can do a further split, e.g. > perhaps an additive split with the solid temperature split out from the > velocities and fluid temperature (the former is almost pure conduction > whereas the latter may be advection dominated). Is this possible? > > > > Alex > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed May 17 14:24:07 2023 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 17 May 2023 15:24:07 -0400 Subject: [petsc-users] Nested field split In-Reply-To: References: <0BA9CB92-B356-4EC4-B46C-C7A47054A4CF@petsc.dev> Message-ID: On Wed, May 17, 2023 at 3:23?PM Matthew Knepley wrote: > On Wed, May 17, 2023 at 2:59?PM Barry Smith wrote: > >> >> Absolutely, that is fundamental to the design. >> >> In the simple case where all the degrees of freedom exist at the same >> grid points, hence storage is like u,v,t,p in the vector the nesting is >> trivial. You indicate the fields without using IS (don't even need to >> change any code) >> >> -pc_fieldsplit_0_fields 0,1,2 >> -fieldsplit_pc_fieldsplit_0_fields 0,1 >> >> Listing the two complimentary fields >> pc_fieldsplit_1_fields 3 >> -fieldsplit_pc_fieldsplit_1_fields 2 >> should be optional (I can't remember if it is smart enough to allow not >> listing them) >> >> If you have a staggered grid then indicating the fields is trickery >> (since you don't have the simple u,v,t,p layout of the degrees of freedom) >> > > Here we do something similar. > The URL was missing: https://arxiv.org/abs/1808.08328 Thanks, Matt > Thanks, > > Matt > > >> > On May 17, 2023, at 12:47 PM, Alexander Lindsay < >> alexlindsay239 at gmail.com> wrote: >> > >> > I've seen threads in the archives about nested field split but I'm not >> sure they match what I'm asking about. >> > >> > I'm doing a Schur field split for a porous version of incompressible >> Navier-Stokes. In addition to pressure and velocity fields, we have fluid >> and solid temperature fields. I plan to put all primal variables in one >> split and the pressure obviously in the Schur split. Now within the "primal >> variable split" a user is wondering whether we can do a further split, e.g. >> perhaps an additive split with the solid temperature split out from the >> velocities and fluid temperature (the former is almost pure conduction >> whereas the latter may be advection dominated). Is this possible? >> > >> > Alex >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed May 17 14:54:53 2023 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 17 May 2023 15:54:53 -0400 Subject: [petsc-users] Understanding index sets for PCGASM In-Reply-To: References: <989A8495-06FF-4D8A-8B45-F3D991D0A486@petsc.dev> <65FAB3E9-8D08-4AEE-874E-636EB2C76A29@petsc.dev> <6CE3B35C-E74E-43B5-A3DF-4D0D77E6A94C@petsc.dev> Message-ID: > On May 17, 2023, at 11:10 AM, Leonardo Mutti wrote: > > Dear developers, let me kindly ask for your help again. > In the following snippet, a bi-diagonal matrix A is set up. It measures 8x8 blocks, each block is 2x2 elements. I would like to create the correct IS objects for PCGASM. > The non-overlapping IS should be: [0,1], [2,3],[4,5], ..., [14,15]. 
The overlapping IS should be: [0,1], [0,1,2,3], [2,3,4,5], ..., [12,13,14,15] > I am running the code with 4 processors. For some reason, after calling PCGASMDestroySubdomains the code crashes with severe (157): Program Exception - access violation. A visual inspection of the indices using ISView looks good. Likely memory corruption or use of an object or an array that was already freed. Best to use Valgrind to find the exact location of the mess. > Thanks again, > Leonardo > > Mat :: A > Vec :: b > PetscInt :: M,N_blocks,block_size,I,J,NSub,converged_reason,srank,erank,color,subcomm > PetscMPIInt :: size > PetscErrorCode :: ierr > PetscScalar :: v > KSP :: ksp > PC :: pc > IS,ALLOCATABLE :: subdomains_IS(:), inflated_IS(:) > PetscInt :: NMPI,MYRANK,IERMPI > INTEGER :: IS_counter, is_start, is_end > > call PetscInitialize(PETSC_NULL_CHARACTER, ierr) > call PetscLogDefaultBegin(ierr) > call MPI_COMM_SIZE(MPI_COMM_WORLD, NMPI, IERMPI) > CALL MPI_COMM_RANK(MPI_COMM_WORLD, MYRANK,IERMPI) > > N_blocks = 8 > block_size = 2 > M = N_blocks * block_size > > ALLOCATE(subdomains_IS(N_blocks)) > ALLOCATE(inflated_IS(N_blocks)) > > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > ! ASSUMPTION: no block spans more than one rank (the inflated blocks can) > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > ! INTRO: create matrix and right hand side > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > > ! How many inflated blocks span more than one rank? NMPI-1 ! > > call MatCreateAIJ(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, > & M, M, PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, > & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER,A, ierr) > call VecCreate(PETSC_COMM_WORLD,b,ierr) > call VecSetSizes(b, PETSC_DECIDE, M,ierr) > call VecSetFromOptions(b,ierr) > > DO I=(MYRANK*(M/NMPI)),((MYRANK+1)*(M/NMPI)-1) > > ! Set matrix > v=1 > call MatSetValue(A, I, I, v, INSERT_VALUES, ierr) > IF (I-block_size .GE. 0) THEN > v=-1 > call MatSetValue(A, I, I-block_size, v, INSERT_VALUES, ierr) > ENDIF > ! Set rhs > v = I > call VecSetValue(b,I,v, INSERT_VALUES,ierr) > > END DO > > call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr) > call VecAssemblyBegin(b,ierr) > call VecAssemblyEnd(b,ierr) > > > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > ! FIRST KSP/PC SETUP > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > > call KSPCreate(PETSC_COMM_WORLD, ksp, ierr) > call KSPSetOperators(ksp, A, A, ierr) > call KSPSetType(ksp, 'preonly', ierr) > call KSPGetPC(ksp, pc, ierr) > call PCSetType(pc, PCGASM, ierr) > > > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > !! GASM, SETTING SUBDOMAINS > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > > DO IS_COUNTER=1,N_blocks > > srank = MAX(((IS_COUNTER-2)*block_size)/(M/NMPI),0) ! start rank reached by inflated block > erank = MIN(((IS_COUNTER-1)*block_size)/(M/NMPI),NMPI-1) ! end rank reached by inflated block. Coincides with rank containing non-inflated block > > ! Create subcomms > color = MPI_UNDEFINED > IF (myrank == srank .or. myrank == erank) THEN > color = 1 > ENDIF > call MPI_Comm_split(MPI_COMM_WORLD,color,MYRANK,subcomm,ierr) > > ! Create IS > IF (srank .EQ. erank) THEN ! Block and overlap are on the same rank > IF (MYRANK .EQ. srank) THEN > call ISCreateStride(PETSC_COMM_SELF,block_size,(IS_COUNTER-1)*block_size,1,subdomains_IS(IS_COUNTER),ierr) > IF (IS_COUNTER .EQ. 1) THEN ! 
the first block is not inflated > call ISCreateStride(PETSC_COMM_SELF,block_size,(IS_COUNTER-1)*block_size,1,inflated_IS(IS_COUNTER),ierr) > ELSE > call ISCreateStride(PETSC_COMM_SELF,2*block_size,(IS_COUNTER-2)*block_size,1,inflated_IS(IS_COUNTER),ierr) > ENDIF > ENDIF > else ! Block and overlap not on the same rank > if (myrank == erank) then ! the block > call ISCreateStride(subcomm,block_size,(IS_COUNTER-1)*block_size,1,subdomains_IS(IS_COUNTER),ierr) > call ISCreateStride(subcomm,block_size,(IS_COUNTER-1)*block_size,1,inflated_IS(IS_COUNTER),ierr) > endif > if (myrank == srank) then ! the overlap > call ISCreateStride(subcomm,block_size,(IS_COUNTER-2)*block_size,1,inflated_IS(IS_COUNTER),ierr) > call ISCreateStride(subcomm,0,(IS_COUNTER-1)*block_size,1,subdomains_IS(IS_COUNTER),ierr) > endif > endif > > call MPI_Comm_free(subcomm, ierr) > END DO > > ! Set the domains/subdomains > NSub = N_blocks/NMPI > is_start = 1 + myrank * NSub > is_end = min(is_start + NSub, N_blocks) > if (myrank + 1 < NMPI) then > NSub = NSub + 1 > endif > > call PCGASMSetSubdomains(pc,NSub,subdomains_IS(is_start:is_end),inflated_IS(is_start:is_end),ierr) > call PCGASMDestroySubdomains(NSub,subdomains_IS(is_start:is_end),inflated_IS(is_start:is_end),ierr) > > call PetscOptionsSetValue(PETSC_NULL_OPTIONS,"-sub_ksp_type", "gmres", ierr) > call PetscOptionsSetValue(PETSC_NULL_OPTIONS,"-sub_pc_type", "none", ierr) > call PetscOptionsSetValue(PETSC_NULL_OPTIONS,"-pc_gasm_view_subdomains", "1", ierr) > > call KSPSetUp(ksp, ierr) > call PCSetUp(pc, ierr) > call KSPSetFromOptions(ksp, ierr) > call PCSetFromOptions(pc, ierr) > > call KSPView(ksp,PETSC_VIEWER_STDOUT_WORLD, ierr) > > > Il giorno mer 10 mag 2023 alle ore 03:02 Barry Smith > ha scritto: >> >> >>> On May 9, 2023, at 4:58 PM, LEONARDO MUTTI > wrote: >>> >>> In my notation diag(1,1) means a diagonal 2x2 matrix with 1,1 on the diagonal, submatrix in the 8x8 diagonal matrix diag(1,1,2,2,...,2). >>> Am I then correct that the IS representing diag(1,1) is 0,1, and that diag(2,2,...,2) is represented by 2,3,4,5,6,7? >> >> I believe so >> >>> Thanks, >>> Leonardo >>> >>> Il mar 9 mag 2023, 20:45 Barry Smith > ha scritto: >>>> >>>> It is simplier than you are making it out to be. Each IS[] is a list of rows (and columns) in the sub (domain) matrix. In your case with the matrix of 144 by 144 the indices will go from 0 to 143. >>>> >>>> In your simple Fortran code you have a completely different problem. A matrix with 8 rows and columns. In that case if you want the first IS to represent just the first row (and column) in the matrix then it should contain only 0. The second submatrix which is all rows (but the first) should have 1,2,3,4,5,6,7 >>>> >>>> I do not understand why your code has >>>> >>>>>>>> indices_first_domain = [0,1,8,9] ! corresponds to diag(1,1) >>>> >>>> it should just be 0 >>>> >>>> >>>> >>>> >>>> >>>>> On May 9, 2023, at 12:44 PM, LEONARDO MUTTI > wrote: >>>>> >>>>> Partial typo: I expect 9x(16+16) numbers to be stored in subdomain_IS : # subdomains x (row indices of the submatrix + col indices of the submatrix). >>>>> >>>>> Il giorno mar 9 mag 2023 alle ore 18:31 LEONARDO MUTTI > ha scritto: >>>>>> >>>>>> >>>>>> ---------- Forwarded message --------- >>>>>> Da: LEONARDO MUTTI > >>>>>> Date: mar 9 mag 2023 alle ore 18:29 >>>>>> Subject: Re: [petsc-users] Understanding index sets for PCGASM >>>>>> To: Matthew Knepley > >>>>>> >>>>>> >>>>>> Thank you for your answer, but I am still confused, sorry. 
>>>>>> Consider https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/ksp/tests/ex71f.F90 on one processor. >>>>>> Let M=12 for the sake of simplicity, i.e. we deal with a 12x12 2D grid, hence, a 144x144 matrix. >>>>>> Let NSubx = 3, so that on the grid we do 3 vertical and 3 horizontal subdivisions. >>>>>> We should obtain 9 subdomains that are grids of 4x4 nodes each, thus corresponding to 9 submatrices of size 16x16. >>>>>> In my run I obtain NSub = 9 (great) and subdomain_IS(i), i=1,...,9, reads: >>>>>> >>>>>> IS Object: 1 MPI process >>>>>> type: general >>>>>> Number of indices in set 16 >>>>>> 0 0 >>>>>> 1 1 >>>>>> 2 2 >>>>>> 3 3 >>>>>> 4 12 >>>>>> 5 13 >>>>>> 6 14 >>>>>> 7 15 >>>>>> 8 24 >>>>>> 9 25 >>>>>> 10 26 >>>>>> 11 27 >>>>>> 12 36 >>>>>> 13 37 >>>>>> 14 38 >>>>>> 15 39 >>>>>> IS Object: 1 MPI process >>>>>> type: general >>>>>> Number of indices in set 16 >>>>>> 0 4 >>>>>> 1 5 >>>>>> 2 6 >>>>>> 3 7 >>>>>> 4 16 >>>>>> 5 17 >>>>>> 6 18 >>>>>> 7 19 >>>>>> 8 28 >>>>>> 9 29 >>>>>> 10 30 >>>>>> 11 31 >>>>>> 12 40 >>>>>> 13 41 >>>>>> 14 42 >>>>>> 15 43 >>>>>> IS Object: 1 MPI process >>>>>> type: general >>>>>> Number of indices in set 16 >>>>>> 0 8 >>>>>> 1 9 >>>>>> 2 10 >>>>>> 3 11 >>>>>> 4 20 >>>>>> 5 21 >>>>>> 6 22 >>>>>> 7 23 >>>>>> 8 32 >>>>>> 9 33 >>>>>> 10 34 >>>>>> 11 35 >>>>>> 12 44 >>>>>> 13 45 >>>>>> 14 46 >>>>>> 15 47 >>>>>> IS Object: 1 MPI process >>>>>> type: general >>>>>> Number of indices in set 16 >>>>>> 0 48 >>>>>> 1 49 >>>>>> 2 50 >>>>>> 3 51 >>>>>> 4 60 >>>>>> 5 61 >>>>>> 6 62 >>>>>> 7 63 >>>>>> 8 72 >>>>>> 9 73 >>>>>> 10 74 >>>>>> 11 75 >>>>>> 12 84 >>>>>> 13 85 >>>>>> 14 86 >>>>>> 15 87 >>>>>> IS Object: 1 MPI process >>>>>> type: general >>>>>> Number of indices in set 16 >>>>>> 0 52 >>>>>> 1 53 >>>>>> 2 54 >>>>>> 3 55 >>>>>> 4 64 >>>>>> 5 65 >>>>>> 6 66 >>>>>> 7 67 >>>>>> 8 76 >>>>>> 9 77 >>>>>> 10 78 >>>>>> 11 79 >>>>>> 12 88 >>>>>> 13 89 >>>>>> 14 90 >>>>>> 15 91 >>>>>> IS Object: 1 MPI process >>>>>> type: general >>>>>> Number of indices in set 16 >>>>>> 0 56 >>>>>> 1 57 >>>>>> 2 58 >>>>>> 3 59 >>>>>> 4 68 >>>>>> 5 69 >>>>>> 6 70 >>>>>> 7 71 >>>>>> 8 80 >>>>>> 9 81 >>>>>> 10 82 >>>>>> 11 83 >>>>>> 12 92 >>>>>> 13 93 >>>>>> 14 94 >>>>>> 15 95 >>>>>> IS Object: 1 MPI process >>>>>> type: general >>>>>> Number of indices in set 16 >>>>>> 0 96 >>>>>> 1 97 >>>>>> 2 98 >>>>>> 3 99 >>>>>> 4 108 >>>>>> 5 109 >>>>>> 6 110 >>>>>> 7 111 >>>>>> 8 120 >>>>>> 9 121 >>>>>> 10 122 >>>>>> 11 123 >>>>>> 12 132 >>>>>> 13 133 >>>>>> 14 134 >>>>>> 15 135 >>>>>> IS Object: 1 MPI process >>>>>> type: general >>>>>> Number of indices in set 16 >>>>>> 0 100 >>>>>> 1 101 >>>>>> 2 102 >>>>>> 3 103 >>>>>> 4 112 >>>>>> 5 113 >>>>>> 6 114 >>>>>> 7 115 >>>>>> 8 124 >>>>>> 9 125 >>>>>> 10 126 >>>>>> 11 127 >>>>>> 12 136 >>>>>> 13 137 >>>>>> 14 138 >>>>>> 15 139 >>>>>> IS Object: 1 MPI process >>>>>> type: general >>>>>> Number of indices in set 16 >>>>>> 0 104 >>>>>> 1 105 >>>>>> 2 106 >>>>>> 3 107 >>>>>> 4 116 >>>>>> 5 117 >>>>>> 6 118 >>>>>> 7 119 >>>>>> 8 128 >>>>>> 9 129 >>>>>> 10 130 >>>>>> 11 131 >>>>>> 12 140 >>>>>> 13 141 >>>>>> 14 142 >>>>>> 15 143 >>>>>> >>>>>> As you said, no number here reaches 144. >>>>>> But the number stored in subdomain_IS are 9x16= #subdomains x 16, whereas I would expect, also given your latest reply, 9x16x16x2=#subdomains x submatrix height x submatrix width x length of a (row,column) pair. >>>>>> It would really help me if you could briefly explain how the output above encodes the subdivision into subdomains. 
>>>>>> Many thanks again, >>>>>> Leonardo >>>>>> >>>>>> >>>>>> Il giorno mar 9 mag 2023 alle ore 16:24 Matthew Knepley > ha scritto: >>>>>>> On Tue, May 9, 2023 at 10:05?AM LEONARDO MUTTI > wrote: >>>>>>>> Great thanks! I can now successfully run https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/ksp/tests/ex71f.F90. >>>>>>>> >>>>>>>> Going forward with my experiments, let me post a new code snippet (very similar to ex71f.F90) that I cannot get to work, probably I must be setting up the IS objects incorrectly. >>>>>>>> >>>>>>>> I have an 8x8 matrix A=diag(1,1,2,2,...,2) and a vector b=(0.5,...,0.5). We have only one processor, and I want to solve Ax=b using GASM. In particular, KSP is set to preonly, GASM is the preconditioner and it uses on each submatrix an lu direct solver (sub_ksp = preonly, sub_pc = lu). >>>>>>>> >>>>>>>> For the GASM algorithm, I divide A into diag(1,1) and diag(2,2,...,2). For simplicity I set 0 overlap. Now I want to use GASM to solve Ax=b. The code follows. >>>>>>>> >>>>>>>> #include >>>>>>>> #include >>>>>>>> #include >>>>>>>> USE petscmat >>>>>>>> USE petscksp >>>>>>>> USE petscpc >>>>>>>> USE MPI >>>>>>>> >>>>>>>> Mat :: A >>>>>>>> Vec :: b, x >>>>>>>> PetscInt :: M, I, J, ISLen, NSub >>>>>>>> PetscMPIInt :: size >>>>>>>> PetscErrorCode :: ierr >>>>>>>> PetscScalar :: v >>>>>>>> KSP :: ksp >>>>>>>> PC :: pc >>>>>>>> IS :: subdomains_IS(2), inflated_IS(2) >>>>>>>> PetscInt,DIMENSION(4) :: indices_first_domain >>>>>>>> PetscInt,DIMENSION(36) :: indices_second_domain >>>>>>>> >>>>>>>> call PetscInitialize(PETSC_NULL_CHARACTER, ierr) >>>>>>>> call MPI_Comm_size(PETSC_COMM_WORLD, size, ierr) >>>>>>>> >>>>>>>> >>>>>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>>>>>> ! INTRO: create matrix and right hand side >>>>>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>>>>>> >>>>>>>> WRITE(*,*) "Assembling A,b" >>>>>>>> >>>>>>>> M = 8 >>>>>>>> call MatCreateAIJ(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, >>>>>>>> & M, M, PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, >>>>>>>> & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER,A, ierr) >>>>>>>> DO I=1,M >>>>>>>> DO J=1,M >>>>>>>> IF ((I .EQ. J) .AND. (I .LE. 2 )) THEN >>>>>>>> v = 1 >>>>>>>> ELSE IF ((I .EQ. J) .AND. (I .GT. 2 )) THEN >>>>>>>> v = 2 >>>>>>>> ELSE >>>>>>>> v = 0 >>>>>>>> ENDIF >>>>>>>> call MatSetValue(A, I-1, J-1, v, INSERT_VALUES, ierr) >>>>>>>> END DO >>>>>>>> END DO >>>>>>>> >>>>>>>> call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr) >>>>>>>> call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr) >>>>>>>> >>>>>>>> call VecCreate(PETSC_COMM_WORLD,b,ierr) >>>>>>>> call VecSetSizes(b, PETSC_DECIDE, M,ierr) >>>>>>>> call VecSetFromOptions(b,ierr) >>>>>>>> >>>>>>>> do I=1,M >>>>>>>> v = 0.5 >>>>>>>> call VecSetValue(b,I-1,v, INSERT_VALUES,ierr) >>>>>>>> end do >>>>>>>> >>>>>>>> call VecAssemblyBegin(b,ierr) >>>>>>>> call VecAssemblyEnd(b,ierr) >>>>>>>> >>>>>>>> >>>>>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>>>>>> ! FIRST KSP/PC SETUP >>>>>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>>>>>> >>>>>>>> WRITE(*,*) "KSP/PC first setup" >>>>>>>> >>>>>>>> call KSPCreate(PETSC_COMM_WORLD, ksp, ierr) >>>>>>>> call KSPSetOperators(ksp, A, A, ierr) >>>>>>>> call KSPSetType(ksp, 'preonly', ierr) >>>>>>>> call KSPGetPC(ksp, pc, ierr) >>>>>>>> call KSPSetUp(ksp, ierr) >>>>>>>> call PCSetType(pc, PCGASM, ierr) >>>>>>>> >>>>>>>> >>>>>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>>>>>> ! GASM, SETTING SUBDOMAINS >>>>>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>>>>>> >>>>>>>> WRITE(*,*) "Setting GASM subdomains" >>>>>>>> >>>>>>>> ! 
Let's create the subdomain IS and inflated_IS >>>>>>>> ! They are equal if no overlap is present >>>>>>>> ! They are 1: 0,1,8,9 >>>>>>>> ! 2: 10,...,15,18,...,23,...,58,...,63 >>>>>>>> >>>>>>>> indices_first_domain = [0,1,8,9] ! corresponds to diag(1,1) >>>>>>>> do I=0,5 >>>>>>>> do J=0,5 >>>>>>>> indices_second_domain(I*6+1+J) = 18 + J + 8*I ! corresponds to diag(2,2,...,2) >>>>>>>> !WRITE(*,*) I*6+1+J, 18 + J + 8*I >>>>>>>> end do >>>>>>>> end do >>>>>>>> >>>>>>>> ! Convert into IS >>>>>>>> ISLen = 4 >>>>>>>> call ISCreateGeneral(PETSC_COMM_WORLD,ISLen,indices_first_domain, >>>>>>>> & PETSC_COPY_VALUES, subdomains_IS(1), ierr) >>>>>>>> call ISCreateGeneral(PETSC_COMM_WORLD,ISLen,indices_first_domain, >>>>>>>> & PETSC_COPY_VALUES, inflated_IS(1), ierr) >>>>>>>> ISLen = 36 >>>>>>>> call ISCreateGeneral(PETSC_COMM_WORLD,ISLen,indices_second_domain, >>>>>>>> & PETSC_COPY_VALUES, subdomains_IS(2), ierr) >>>>>>>> call ISCreateGeneral(PETSC_COMM_WORLD,ISLen,indices_second_domain, >>>>>>>> & PETSC_COPY_VALUES, inflated_IS(2), ierr) >>>>>>>> >>>>>>>> NSub = 2 >>>>>>>> call PCGASMSetSubdomains(pc,NSub, >>>>>>>> & subdomains_IS,inflated_IS,ierr) >>>>>>>> call PCGASMDestroySubdomains(NSub, >>>>>>>> & subdomains_IS,inflated_IS,ierr) >>>>>>>> >>>>>>>> >>>>>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>>>>>> ! GASM: SET SUBSOLVERS >>>>>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>>>>>> >>>>>>>> WRITE(*,*) "Setting subsolvers for GASM" >>>>>>>> >>>>>>>> call PCSetUp(pc, ierr) ! should I add this? >>>>>>>> >>>>>>>> call PetscOptionsSetValue(PETSC_NULL_OPTIONS, >>>>>>>> & "-sub_pc_type", "lu", ierr) >>>>>>>> call PetscOptionsSetValue(PETSC_NULL_OPTIONS, >>>>>>>> & "-sub_ksp_type", "preonly", ierr) >>>>>>>> >>>>>>>> call KSPSetFromOptions(ksp, ierr) >>>>>>>> call PCSetFromOptions(pc, ierr) >>>>>>>> >>>>>>>> >>>>>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>>>>>> ! DUMMY SOLUTION: DID IT WORK? >>>>>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>>>>>> >>>>>>>> WRITE(*,*) "Solve" >>>>>>>> >>>>>>>> call VecDuplicate(b,x,ierr) >>>>>>>> call KSPSolve(ksp,b,x,ierr) >>>>>>>> >>>>>>>> call MatDestroy(A, ierr) >>>>>>>> call KSPDestroy(ksp, ierr) >>>>>>>> call PetscFinalize(ierr) >>>>>>>> >>>>>>>> This code is failing in multiple points. At call PCSetUp(pc, ierr) it produces: >>>>>>>> >>>>>>>> [0]PETSC ERROR: Argument out of range >>>>>>>> [0]PETSC ERROR: Scatter indices in ix are out of range >>>>>>>> ... >>>>>>>> [0]PETSC ERROR: #1 VecScatterCreate() at ***\src\vec\is\sf\INTERF~1\vscat.c:736 >>>>>>>> [0]PETSC ERROR: #2 PCSetUp_GASM() at ***\src\ksp\pc\impls\gasm\gasm.c:433 >>>>>>>> [0]PETSC ERROR: #3 PCSetUp() at ***\src\ksp\pc\INTERF~1\precon.c:994 >>>>>>>> >>>>>>>> And at call KSPSolve(ksp,b,x,ierr) it produces: >>>>>>>> >>>>>>>> forrtl: severe (157): Program Exception - access violation >>>>>>>> >>>>>>>> The index sets are setup coherently with the outputs of e.g. https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/ksp/tests/output/ex71f_1.out: in particular each element of the matrix A corresponds to a number from 0 to 63. >>>>>>> >>>>>>> This is not correct, I believe. The indices are row/col indices, not indices into dense blocks, so for >>>>>>> your example, they are all in [0, 8]. >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Matt >>>>>>> >>>>>>>> Note that each submatrix does not represent some physical subdomain, the subdivision is just at the algebraic level. >>>>>>>> I thus have the following questions: >>>>>>>> is this the correct way of creating the IS objects, given my objective at the beginning of the email? 
Is the ordering correct? >>>>>>>> what am I doing wrong that is generating the above errors? >>>>>>>> Thanks for the patience and the time. >>>>>>>> Best, >>>>>>>> Leonardo >>>>>>>> >>>>>>>> Il giorno ven 5 mag 2023 alle ore 18:43 Barry Smith > ha scritto: >>>>>>>>> >>>>>>>>> Added in barry/2023-05-04/add-pcgasm-set-subdomains see also https://gitlab.com/petsc/petsc/-/merge_requests/6419 >>>>>>>>> >>>>>>>>> Barry >>>>>>>>> >>>>>>>>> >>>>>>>>>> On May 4, 2023, at 11:23 AM, LEONARDO MUTTI > wrote: >>>>>>>>>> >>>>>>>>>> Thank you for the help. >>>>>>>>>> Adding to my example: >>>>>>>>>> call PCGASMSetSubdomains(pc,NSub, subdomains_IS, inflated_IS,ierr) >>>>>>>>>> call PCGASMDestroySubdomains(NSub,subdomains_IS,inflated_IS,ierr) >>>>>>>>>> results in: >>>>>>>>>> Error LNK2019 unresolved external symbol PCGASMDESTROYSUBDOMAINS referenced in function ... >>>>>>>>>> Error LNK2019 unresolved external symbol PCGASMSETSUBDOMAINS referenced in function ... >>>>>>>>>> I'm not sure if the interfaces are missing or if I have a compilation problem. >>>>>>>>>> Thank you again. >>>>>>>>>> Best, >>>>>>>>>> Leonardo >>>>>>>>>> >>>>>>>>>> Il giorno sab 29 apr 2023 alle ore 20:30 Barry Smith > ha scritto: >>>>>>>>>>> >>>>>>>>>>> Thank you for the test code. I have a fix in the branch barry/2023-04-29/fix-pcasmcreatesubdomains2d with merge request https://gitlab.com/petsc/petsc/-/merge_requests/6394 >>>>>>>>>>> >>>>>>>>>>> The functions did not have proper Fortran stubs and interfaces so I had to provide them manually in the new branch. >>>>>>>>>>> >>>>>>>>>>> Use >>>>>>>>>>> >>>>>>>>>>> git fetch >>>>>>>>>>> git checkout barry/2023-04-29/fix-pcasmcreatesubdomains2d >>>>>>>>>>> ./configure etc >>>>>>>>>>> >>>>>>>>>>> Your now working test code is in src/ksp/ksp/tests/ex71f.F90 I had to change things slightly and I updated the error handling for the latest version. >>>>>>>>>>> >>>>>>>>>>> Please let us know if you have any later questions. >>>>>>>>>>> >>>>>>>>>>> Barry >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> On Apr 28, 2023, at 12:07 PM, LEONARDO MUTTI > wrote: >>>>>>>>>>>> >>>>>>>>>>>> Hello. I am having a hard time understanding the index sets to feed PCGASMSetSubdomains, and I am working in Fortran (as a PETSc novice). 
To get more intuition on how the IS objects behave I tried the following minimal (non) working example, which should tile a 16x16 matrix into 16 square, non-overlapping submatrices: >>>>>>>>>>>> >>>>>>>>>>>> #include >>>>>>>>>>>> #include >>>>>>>>>>>> #include >>>>>>>>>>>> USE petscmat >>>>>>>>>>>> USE petscksp >>>>>>>>>>>> USE petscpc >>>>>>>>>>>> >>>>>>>>>>>> Mat :: A >>>>>>>>>>>> PetscInt :: M, NSubx, dof, overlap, NSub >>>>>>>>>>>> INTEGER :: I,J >>>>>>>>>>>> PetscErrorCode :: ierr >>>>>>>>>>>> PetscScalar :: v >>>>>>>>>>>> KSP :: ksp >>>>>>>>>>>> PC :: pc >>>>>>>>>>>> IS :: subdomains_IS, inflated_IS >>>>>>>>>>>> >>>>>>>>>>>> call PetscInitialize(PETSC_NULL_CHARACTER , ierr) >>>>>>>>>>>> >>>>>>>>>>>> !-----Create a dummy matrix >>>>>>>>>>>> M = 16 >>>>>>>>>>>> call MatCreateAIJ(MPI_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, >>>>>>>>>>>> & M, M, >>>>>>>>>>>> & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, >>>>>>>>>>>> & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, >>>>>>>>>>>> & A, ierr) >>>>>>>>>>>> >>>>>>>>>>>> DO I=1,M >>>>>>>>>>>> DO J=1,M >>>>>>>>>>>> v = I*J >>>>>>>>>>>> CALL MatSetValue (A,I-1,J-1,v, >>>>>>>>>>>> & INSERT_VALUES , ierr) >>>>>>>>>>>> END DO >>>>>>>>>>>> END DO >>>>>>>>>>>> >>>>>>>>>>>> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY , ierr) >>>>>>>>>>>> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY , ierr) >>>>>>>>>>>> >>>>>>>>>>>> !-----Create KSP and PC >>>>>>>>>>>> call KSPCreate(PETSC_COMM_WORLD,ksp, ierr) >>>>>>>>>>>> call KSPSetOperators(ksp,A,A, ierr) >>>>>>>>>>>> call KSPSetType(ksp,"bcgs",ierr) >>>>>>>>>>>> call KSPGetPC(ksp,pc,ierr) >>>>>>>>>>>> call KSPSetUp(ksp, ierr) >>>>>>>>>>>> call PCSetType(pc,PCGASM, ierr) >>>>>>>>>>>> call PCSetUp(pc , ierr) >>>>>>>>>>>> >>>>>>>>>>>> !-----GASM setup >>>>>>>>>>>> NSubx = 4 >>>>>>>>>>>> dof = 1 >>>>>>>>>>>> overlap = 0 >>>>>>>>>>>> >>>>>>>>>>>> call PCGASMCreateSubdomains2D(pc, >>>>>>>>>>>> & M, M, >>>>>>>>>>>> & NSubx, NSubx, >>>>>>>>>>>> & dof, overlap, >>>>>>>>>>>> & NSub, subdomains_IS, inflated_IS, ierr) >>>>>>>>>>>> >>>>>>>>>>>> call ISView(subdomains_IS, PETSC_VIEWER_STDOUT_WORLD, ierr) >>>>>>>>>>>> >>>>>>>>>>>> call KSPDestroy(ksp, ierr) >>>>>>>>>>>> call PetscFinalize(ierr) >>>>>>>>>>>> >>>>>>>>>>>> Running this on one processor, I get NSub = 4. >>>>>>>>>>>> If PCASM and PCASMCreateSubdomains2D are used instead, I get NSub = 16 as expected. >>>>>>>>>>>> Moreover, I get in the end "forrtl: severe (157): Program Exception - access violation". So: >>>>>>>>>>>> 1) why do I get two different results with ASM, and GASM? >>>>>>>>>>>> 2) why do I get access violation and how can I solve this? >>>>>>>>>>>> In fact, in C, subdomains_IS, inflated_IS should pointers to IS objects. As I see on the Fortran interface, the arguments to PCGASMCreateSubdomains2D are IS objects: >>>>>>>>>>>> >>>>>>>>>>>> subroutine PCGASMCreateSubdomains2D(a,b,c,d,e,f,g,h,i,j,z) >>>>>>>>>>>> import tPC,tIS >>>>>>>>>>>> PC a ! PC >>>>>>>>>>>> PetscInt b ! PetscInt >>>>>>>>>>>> PetscInt c ! PetscInt >>>>>>>>>>>> PetscInt d ! PetscInt >>>>>>>>>>>> PetscInt e ! PetscInt >>>>>>>>>>>> PetscInt f ! PetscInt >>>>>>>>>>>> PetscInt g ! PetscInt >>>>>>>>>>>> PetscInt h ! PetscInt >>>>>>>>>>>> IS i ! IS >>>>>>>>>>>> IS j ! IS >>>>>>>>>>>> PetscErrorCode z >>>>>>>>>>>> end subroutine PCGASMCreateSubdomains2D >>>>>>>>>>>> Thus: >>>>>>>>>>>> 3) what should be inside e.g., subdomains_IS? I expect it to contain, for every created subdomain, the list of rows and columns defining the subblock in the matrix, am I right? 
>>>>>>>>>>>> >>>>>>>>>>>> Context: I have a block-tridiagonal system arising from space-time finite elements, and I want to solve it with GMRES+PCGASM preconditioner, where each overlapping submatrix is on the diagonal and of size 3x3 blocks (and spanning multiple processes). This is PETSc 3.17.1 on Windows. >>>>>>>>>>>> >>>>>>>>>>>> Thanks in advance, >>>>>>>>>>>> Leonardo >>>>>>>>>>> >>>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>>> -- Norbert Wiener >>>>>>> >>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed May 17 16:04:24 2023 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 17 May 2023 17:04:24 -0400 Subject: [petsc-users] DMGetCoordinatesLocal and DMPlexGetCellCoordinates in PETSc > 3.18 In-Reply-To: <0241f3b5-16b4-786b-61a3-5f8ac64a3f5a@ovgu.de> References: <2f3c494f-3d63-a2f4-d8cc-fba6893c0ebb@ovgu.de> <5181359d-c8d6-da6e-8b0a-3fb1c6183026@ovgu.de> <0241f3b5-16b4-786b-61a3-5f8ac64a3f5a@ovgu.de> Message-ID: On Wed, May 17, 2023 at 2:01?PM Berend van Wachem wrote: > Dear Matt, > > I tried it, but it doesn't seem to work. > Attached is a very small working example illustrating the problem. > I create a DMPlexBox Mesh, periodic in the Y direction. I then scale the Y > coordinates with a factor 10, and add 1.0 to it. Both > DMGetCoordinatesLocal and DMGetCellCoordinatesLocal. > Then I evaluate the coordinates with DMPlexGetCellCoordinates. Most of the > Y coordinates are correct, but not all of them - for > instance, the minimum Y coordinate is 0.0, and this should be 1.0. > > Am I doing something wrong? > Quickly, I see that a *= 10.0 + 1.0; is the same as a *= 11.0; not multiply by 10 and add 1. I will send it back when I get everything the way I want. Thanks, Matt > Thanks and best regards, > > Berend. > > On 5/17/23 17:58, Matthew Knepley wrote: > > On Wed, May 17, 2023 at 11:20?AM Berend van Wachem < > berend.vanwachem at ovgu.de > wrote: > > > > Dear Matt, > > > > Is there a way to 'redo' the DMLocalizeCoordinates() ? Or to undo it? > > Alternatively, can we make the calling of DMLocalizeCoordinates() in > the DMPlexCreate...() routines optional? > > > > Otherwise, we would have to copy all arrays of coordinates from > DMGetCoordinatesLocal() and DMGetCellCoordinatesLocal() before > > scaling them. > > > > > > I am likely not being clear. I think all you have to do is the following: > > > > DMGetCoordinatesLocal(dm, &xl); > > VecScale(xl, scale); > > DMSetCoordinatesLocal(dm, xl); > > DMGetCellCoordinatesLocal(dm, &xl); > > VecScale(xl, scale); > > DMSetCellCoordinatesLocal(dm, xl); > > > > Does this not work? > > > > Thanks, > > > > Matt > > > > Best regards, Berend. > > > > On 5/17/23 16:35, Matthew Knepley wrote: > > > On Wed, May 17, 2023 at 10:21?AM Berend van Wachem < > berend.vanwachem at ovgu.de > > >> > wrote: > > > > > > Dear Matt, > > > > > > Thanks for getting back to me so quickly. > > > > > > If I scale each of the coordinates of the mesh (say, I want > to cube each > > > co-ordinate), and I do this for both: > > > > > > DMGetCoordinatesLocal(); > > > DMGetCellCoordinatesLocal(); > > > > > > How do I know I am not cubing one coordinate multiple times? > > > > > > > > > Good question. Right now, the only connection between the two > sets of coordinates is DMLocalizeCoordinates(). 
Since > > sometimes > > > people want to do non-trivial things to > > > coordinates, I prefer not to push in an API for "just" scaling, > but I could be convinced > > > the other way. > > > > > > Thanks, > > > > > > Matt > > > > > > Thanks, Berend. > > > > > > On 5/17/23 16:10, Matthew Knepley wrote: > > > > On Wed, May 17, 2023 at 10:02?AM Berend van Wachem > > > > > > > > > > berend.vanwachem at ovgu.de>>>> wrote: > > > > > > > > Dear PETSc Team, > > > > > > > > We are using DMPlex, and we create a mesh using > > > > > > > > DMPlexCreateBoxMesh (.... ); > > > > > > > > and get a uniform mesh. The mesh is periodic. > > > > > > > > We typically want to "scale" the coordinates > (vertices) of the mesh, > > > > and > > > > to achieve this, we call > > > > > > > > DMGetCoordinatesLocal(dm, &coordinates); > > > > > > > > and scale the entries in the Vector coordinates > appropriately. > > > > > > > > and then > > > > > > > > DMSetCoordinatesLocal(dm, coordinates); > > > > > > > > > > > > After this, we localise the coordinates by calling > > > > > > > > DMLocalizeCoordinates(dm); > > > > > > > > This worked fine up to PETSc 3.18, but with versions > after this, the > > > > coordinates we get from the call > > > > > > > > DMPlexGetCellCoordinates(dm, CellID, &isDG, &CoordSize, > > > > &ArrayCoordinates, &Coordinates); > > > > > > > > are no longer correct if the mesh is periodic. A > number of the > > > > coordinates returned from calling > DMPlexGetCellCoordinates are wrong. > > > > > > > > I think, this is because DMLocalizeCoordinates is now > automatically > > > > called within the routine DMPlexCreateBoxMesh. > > > > > > > > So, my question is: How should we scale the > coordinates from a periodic > > > > DMPlex mesh so that they are reflected correctly when > calling both > > > > DMGetCoordinatesLocal and DMPlexGetCellCoordinates, > with PETSc versions > > > > >= 3.18? > > > > > > > > > > > > I think we might have to add an API function. For now, > when you scale > > > > the coordinates, > > > > can you scale both copies? > > > > > > > > DMGetCoordinatesLocal() > > > > DMGetCellCoordinatesLocal(); > > > > > > > > and then set them back. > > > > > > > > Thanks, > > > > > > > > Matt > > > > > > > > Many thanks, Berend. > > > > > > > > -- > > > > What most experimenters take for granted before they begin > their > > > > experiments is infinitely more interesting than any > results to which > > > > their experiments lead. > > > > -- Norbert Wiener > > > > > > > > https://www.cse.buffalo.edu/~knepley/ < > https://www.cse.buffalo.edu/~knepley/> > > https://www.cse.buffalo.edu/~knepley/>> < > http://www.cse.buffalo.edu/~knepley/ > > > > > http://www.cse.buffalo.edu/~knepley/>>> > > > > > > > > > > > > -- > > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any > > results to > > > which their experiments lead. > > > -- Norbert Wiener > > > > > > https://www.cse.buffalo.edu/~knepley/ < > https://www.cse.buffalo.edu/~knepley/> < > http://www.cse.buffalo.edu/~knepley/ > > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to > > which their experiments lead. > > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ < > http://www.cse.buffalo.edu/~knepley/> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From liufield at gmail.com Wed May 17 17:58:20 2023 From: liufield at gmail.com (neil liu) Date: Wed, 17 May 2023 18:58:20 -0400 Subject: [petsc-users] Using dmplexdistribute do parallel FEM code. Message-ID: Dear Petsc developers, I am writing my own code to calculate the FEM matrix. The following is my general framework, DMPlexCreateGmsh(); MPI_Comm_rank (Petsc_comm_world, &rank); DMPlexDistribute (.., .., &dmDist); dm = dmDist; //This can create separate dm s for different processors. (reordering.) MatCreate (Petsc_comm_world, &A) // Loop over every tetrahedral element to calculate the local matrix for each processor. Then we can get a local matrix A for each processor. *My question is : it seems we should build a global matrix B (assemble all the As for each partition) and then transfer B to KSP. KSP will do the parallelization correctly, right? * If that is right, I should define a whole domain matrix B before the partitioning (MatCreate (Petsc_comm_world, &B); ), and then use localtoglobal (which petsc function should I use? Do you have any examples.) map to add A to B at the right positions (MatSetValues) ? Does that make sense? Thanks, Xiaodong -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed May 17 18:30:31 2023 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 17 May 2023 19:30:31 -0400 Subject: [petsc-users] Using dmplexdistribute do parallel FEM code. In-Reply-To: References: Message-ID: On Wed, May 17, 2023 at 6:58?PM neil liu wrote: > Dear Petsc developers, > > I am writing my own code to calculate the FEM matrix. The following is my > general framework, > > DMPlexCreateGmsh(); > MPI_Comm_rank (Petsc_comm_world, &rank); > DMPlexDistribute (.., .., &dmDist); > > dm = dmDist; > //This can create separate dm s for different processors. (reordering.) > > MatCreate (Petsc_comm_world, &A) > // Loop over every tetrahedral element to calculate the local matrix for > each processor. Then we can get a local matrix A for each processor. > > *My question is : it seems we should build a global matrix B (assemble all > the As for each partition) and then transfer B to KSP. KSP will do the > parallelization correctly, right? * > I would not suggest this. The more common strategy is to assemble each element matrix directly into the global matrix B, by mapping the cell indices directly to global indices (rather than to local indices in the matrix A). You can do this in two stages. You can create a LocalToGlobalMapping in PETSc that maps every local index to a global index. Then you can assemble into B exactly as you would assemble into A by calling MatSetValuesLocal(). DMPlex handles these mappings for you automatically, but I realize that it is a large number of things to buy into. Thanks, Matt > If that is right, I should define a whole domain matrix B before the > partitioning (MatCreate (Petsc_comm_world, &B); ), and then use > localtoglobal (which petsc function should I use? Do you have any > examples.) map to add A to B at the right positions (MatSetValues) ? > > Does that make sense? > > Thanks, > > Xiaodong > > > > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
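A hedged sketch of the assembly pattern described above (variable names such as B, nlocal, l2g, edofs and Ke are placeholders introduced here, not taken from the original message; it is written in Fortran only to match the other examples in this digest, and the corresponding C calls have the same names). The idea is to build the local-to-global map once, attach it to the global matrix, and add each element matrix with MatSetValuesLocal inside the element loop, so no separate per-rank matrix is assembled at all:

    ISLocalToGlobalMapping :: ltog
    PetscInt               :: nlocal       ! number of locally numbered dofs
    PetscInt, allocatable  :: l2g(:)       ! l2g(i+1) = global dof of local dof i
    PetscInt               :: edofs(4)     ! local dof numbers of one tetrahedron
    PetscScalar            :: Ke(16)       ! 4x4 element matrix, stored row by row
    PetscErrorCode         :: ierr

    ! ... fill nlocal and l2g from the partitioned mesh; create and preallocate B ...
    call ISLocalToGlobalMappingCreate(PETSC_COMM_WORLD, 1, nlocal, l2g, PETSC_COPY_VALUES, ltog, ierr)
    call MatSetLocalToGlobalMapping(B, ltog, ltog, ierr)

    ! inside the loop over tetrahedra: edofs holds local numbers, PETSc maps them to global rows/columns
    call MatSetValuesLocal(B, 4, edofs, 4, edofs, Ke, ADD_VALUES, ierr)

    call MatAssemblyBegin(B, MAT_FINAL_ASSEMBLY, ierr)
    call MatAssemblyEnd(B, MAT_FINAL_ASSEMBLY, ierr)

This is only a fragment: it assumes B has already been created with its parallel layout, as in the outline above. When the DMPlex carries a PetscSection describing the dofs, DMCreateMatrix() can hand back a matrix with such a mapping already attached.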
-- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From leonardo.mutti01 at universitadipavia.it Wed May 17 18:51:58 2023 From: leonardo.mutti01 at universitadipavia.it (Leonardo Mutti) Date: Thu, 18 May 2023 01:51:58 +0200 Subject: [petsc-users] Understanding index sets for PCGASM In-Reply-To: References: <989A8495-06FF-4D8A-8B45-F3D991D0A486@petsc.dev> <65FAB3E9-8D08-4AEE-874E-636EB2C76A29@petsc.dev> <6CE3B35C-E74E-43B5-A3DF-4D0D77E6A94C@petsc.dev> Message-ID: Thanks for the reply. Even without Valgrind (which I can't use since I'm on Windows), by further simplifying the example, I was able to have PETSc display a more informative message. What I am doing wrong and what should be done differently, this is still unclear to me. The simplified code runs on 2 processors, I built a 4x4 matrix. The subdomains are now given by [0,1] and [2,3], with [2,3] inflating to [0,1,2,3]. Thank you again. Error: [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Out of memory. This could be due to allocating [0]PETSC ERROR: too large an object or bleeding by not properly [0]PETSC ERROR: destroying unneeded objects. [0]PETSC ERROR: Memory allocated 0 Memory used by process 0 [0]PETSC ERROR: Try running with -malloc_dump or -malloc_view for info. [0]PETSC ERROR: Memory requested 9437902811936987136 [0]PETSC ERROR: WARNING! There are option(s) set that were not used! Could be the program crashed before they were used or a spelling mistake, etc! [0]PETSC ERROR: Option left: name:-pc_gasm_view_subdomains value: 1 source: code [0]PETSC ERROR: Option left: name:-sub_ksp_type value: gmres source: code [0]PETSC ERROR: Option left: name:-sub_pc_type value: none source: code [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. [0]PETSC ERROR: Petsc Development GIT revision: v3.19.1-234-g43977f8d16 GIT Date: 2023-05-08 14:50:03 +0000 [...] [0]PETSC ERROR: #1 PetscMallocAlign() at ...\Sources\Git\Test-FV\PETSC-~1\src\sys\memory\mal.c:66 [0]PETSC ERROR: #2 PetscMallocA() at ...\Sources\Git\Test-FV\PETSC-~1\src\sys\memory\mal.c:411 [0]PETSC ERROR: #3 MatCreateSubMatrices_MPIAIJ() at ...\Sources\Git\Test-FV\PETSC-~1\src\mat\impls\aij\mpi\mpiov.c:2025 [0]PETSC ERROR: #4 MatCreateSubMatricesMPI_MPIXAIJ() at ...\Sources\Git\Test-FV\PETSC-~1\src\mat\impls\aij\mpi\mpiov.c:3136 [0]PETSC ERROR: #5 MatCreateSubMatricesMPI_MPIAIJ() at ...\Sources\Git\Test-FV\PETSC-~1\src\mat\impls\aij\mpi\mpiov.c:3208 [0]PETSC ERROR: #6 MatCreateSubMatricesMPI() at ...\Sources\Git\Test-FV\PETSC-~1\src\mat\INTERF~1\matrix.c:7071 [0]PETSC ERROR: #7 PCSetUp_GASM() at ...\Sources\Git\Test-FV\PETSC-~1\src\ksp\pc\impls\gasm\gasm.c:556 [0]PETSC ERROR: #8 PCSetUp() at ...\Sources\Git\Test-FV\PETSC-~1\src\ksp\pc\INTERF~1\precon.c:994 [0]PETSC ERROR: #9 KSPSetUp() at ...\Sources\Git\Test-FV\PETSC-~1\src\ksp\ksp\INTERF~1\itfunc.c:406 Code, the important bit is after ! GASM, SETTING SUBDOMAINS: Mat :: A Vec :: b PetscInt :: M,N_blocks,block_size,NSub,I PetscErrorCode :: ierr PetscScalar :: v KSP :: ksp PC :: pc IS :: subdomains_IS(2), inflated_IS(2) PetscInt :: NMPI,MYRANK,IERMPI INTEGER :: start call PetscInitialize(PETSC_NULL_CHARACTER, ierr) call PetscLogDefaultBegin(ierr) call MPI_COMM_SIZE(MPI_COMM_WORLD, NMPI, IERMPI) CALL MPI_COMM_RANK(MPI_COMM_WORLD, MYRANK,IERMPI) N_blocks = 2 block_size = 2 M = N_blocks * block_size ! 
INTRO: create matrix and right hand side, create IS call MatCreateAIJ(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, & M, M, PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER,A, ierr) call VecCreate(PETSC_COMM_WORLD,b,ierr) call VecSetSizes(b, PETSC_DECIDE, M,ierr) call VecSetFromOptions(b,ierr) DO I=(MYRANK*(M/NMPI)),((MYRANK+1)*(M/NMPI)-1) ! Set matrix v=1 call MatSetValue(A, I, I, v, INSERT_VALUES, ierr) IF (I-block_size .GE. 0) THEN v=-1 call MatSetValue(A, I, I-block_size, v, INSERT_VALUES, ierr) ENDIF ! Set rhs v = I call VecSetValue(b,I,v, INSERT_VALUES,ierr) END DO call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr) call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr) call VecAssemblyBegin(b,ierr) call VecAssemblyEnd(b,ierr) ! FIRST KSP/PC SETUP call KSPCreate(PETSC_COMM_WORLD, ksp, ierr) call KSPSetOperators(ksp, A, A, ierr) call KSPSetType(ksp, 'preonly', ierr) call KSPGetPC(ksp, pc, ierr) call PCSetType(pc, PCGASM, ierr) ! GASM, SETTING SUBDOMAINS if (myrank == 0) then call ISCreateStride(PETSC_COMM_SELF, 2, 0, 1, subdomains_IS(1), ierr) call ISCreateStride(PETSC_COMM_WORLD, 0, 0, 1, subdomains_IS(2), ierr) call ISCreateStride(PETSC_COMM_SELF, 2, 0, 1, inflated_IS(1), ierr) call ISCreateStride(PETSC_COMM_WORLD, 2, 0, 1, inflated_IS(2), ierr) start = 1 NSub = 2 else call ISCreateStride(PETSC_COMM_WORLD, 2, 2, 1, subdomains_IS(2), ierr) call ISCreateStride(PETSC_COMM_WORLD, 2, 2, 1, inflated_IS(2), ierr) start = 2 NSub = 1 endif call PCGASMSetSubdomains(pc,NSub,subdomains_IS(start:2),inflated_IS(start:2),ierr) call PCGASMDestroySubdomains(NSub,subdomains_IS(start:2),inflated_IS(start:2),ierr) ! GASM: SET SUBSOLVERS call PetscOptionsSetValue(PETSC_NULL_OPTIONS,"-sub_ksp_type", "gmres", ierr) call PetscOptionsSetValue(PETSC_NULL_OPTIONS,"-sub_pc_type", "none", ierr) call PetscOptionsSetValue(PETSC_NULL_OPTIONS,"-pc_gasm_view_subdomains", "1", ierr) call KSPSetUp(ksp, ierr) call PCSetUp(pc, ierr) call KSPSetFromOptions(ksp, ierr) call PCSetFromOptions(pc, ierr) call KSPView(ksp,PETSC_VIEWER_STDOUT_WORLD, ierr) call MatDestroy(A, ierr) call PetscFinalize(ierr) Il giorno mer 17 mag 2023 alle ore 21:55 Barry Smith ha scritto: > > > On May 17, 2023, at 11:10 AM, Leonardo Mutti < > leonardo.mutti01 at universitadipavia.it> wrote: > > Dear developers, let me kindly ask for your help again. > In the following snippet, a bi-diagonal matrix A is set up. It measures > 8x8 blocks, each block is 2x2 elements. I would like to create the > correct IS objects for PCGASM. > The non-overlapping IS should be: [*0,1*], [*2,3*],[*4,5*], ..., [*14,15* > ]. The overlapping IS should be: [*0,1*], [0,1,*2,3*], [2,3,*4,5*], ..., > [12,13,*14,15*] > I am running the code with 4 processors. For some reason, after calling PCGASMDestroySubdomains > the code crashes with severe (157): Program Exception - access violation. > A visual inspection of the indices using ISView looks good. > > > Likely memory corruption or use of an object or an array that was > already freed. Best to use Valgrind to find the exact location of the mess. 
> > > Thanks again, > Leonardo > > Mat :: A > Vec :: b > PetscInt :: > M,N_blocks,block_size,I,J,NSub,converged_reason,srank,erank,color,subcomm > PetscMPIInt :: size > PetscErrorCode :: ierr > PetscScalar :: v > KSP :: ksp > PC :: pc > IS,ALLOCATABLE :: subdomains_IS(:), inflated_IS(:) > PetscInt :: NMPI,MYRANK,IERMPI > INTEGER :: IS_counter, is_start, is_end > > call PetscInitialize(PETSC_NULL_CHARACTER, ierr) > call PetscLogDefaultBegin(ierr) > call MPI_COMM_SIZE(MPI_COMM_WORLD, NMPI, IERMPI) > CALL MPI_COMM_RANK(MPI_COMM_WORLD, MYRANK,IERMPI) > > N_blocks = 8 > block_size = 2 > M = N_blocks * block_size > > ALLOCATE(subdomains_IS(N_blocks)) > ALLOCATE(inflated_IS(N_blocks)) > > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > ! ASSUMPTION: no block spans more than one rank (the inflated blocks > can) > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > ! INTRO: create matrix and right hand side > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > > ! How many inflated blocks span more than one rank? NMPI-1 ! > > call MatCreateAIJ(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, > & M, M, PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, > & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER,A, ierr) > call VecCreate(PETSC_COMM_WORLD,b,ierr) > call VecSetSizes(b, PETSC_DECIDE, M,ierr) > call VecSetFromOptions(b,ierr) > > DO I=(MYRANK*(M/NMPI)),((MYRANK+1)*(M/NMPI)-1) > > ! Set matrix > v=1 > call MatSetValue(A, I, I, v, INSERT_VALUES, ierr) > IF (I-block_size .GE. 0) THEN > v=-1 > call MatSetValue(A, I, I-block_size, v, INSERT_VALUES, ierr) > ENDIF > ! Set rhs > v = I > call VecSetValue(b,I,v, INSERT_VALUES,ierr) > > END DO > > call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr) > call VecAssemblyBegin(b,ierr) > call VecAssemblyEnd(b,ierr) > > > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > ! FIRST KSP/PC SETUP > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > > call KSPCreate(PETSC_COMM_WORLD, ksp, ierr) > call KSPSetOperators(ksp, A, A, ierr) > call KSPSetType(ksp, 'preonly', ierr) > call KSPGetPC(ksp, pc, ierr) > call PCSetType(pc, PCGASM, ierr) > > > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > !! GASM, SETTING SUBDOMAINS > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > > DO IS_COUNTER=1,N_blocks > > srank = MAX(((IS_COUNTER-2)*block_size)/(M/NMPI),0) ! start rank > reached by inflated block > erank = MIN(((IS_COUNTER-1)*block_size)/(M/NMPI),NMPI-1) ! end > rank reached by inflated block. Coincides with rank containing non-inflated > block > > ! Create subcomms > color = MPI_UNDEFINED > IF (myrank == srank .or. myrank == erank) THEN > color = 1 > ENDIF > call MPI_Comm_split(MPI_COMM_WORLD,color,MYRANK,subcomm,ierr) > > > ! Create IS > IF (srank .EQ. erank) THEN ! Block and overlap are on the same > rank > IF (MYRANK .EQ. srank) THEN > call > ISCreateStride(PETSC_COMM_SELF,block_size,(IS_COUNTER-1)*block_size,1,subdomains_IS(IS_COUNTER),ierr) > IF (IS_COUNTER .EQ. 1) THEN ! the first block is not > inflated > call > ISCreateStride(PETSC_COMM_SELF,block_size,(IS_COUNTER-1)*block_size,1,inflated_IS(IS_COUNTER),ierr) > ELSE > call > ISCreateStride(PETSC_COMM_SELF,2*block_size,(IS_COUNTER-2)*block_size,1,inflated_IS(IS_COUNTER),ierr) > ENDIF > ENDIF > else ! Block and overlap not on the same rank > if (myrank == erank) then ! the block > call ISCreateStride > (subcomm,block_size,(IS_COUNTER-1)*block_size,1,subdomains_IS(IS_COUNTER),ierr) > call ISCreateStride > (subcomm,block_size,(IS_COUNTER-1)*block_size,1,inflated_IS(IS_COUNTER),ierr) > endif > if (myrank == srank) then ! 
the overlap > call ISCreateStride > (subcomm,block_size,(IS_COUNTER-2)*block_size,1,inflated_IS(IS_COUNTER),ierr) > call ISCreateStride > (subcomm,0,(IS_COUNTER-1)*block_size,1,subdomains_IS(IS_COUNTER),ierr) > endif > endif > > call MPI_Comm_free(subcomm, ierr) > END DO > > ! Set the domains/subdomains > NSub = N_blocks/NMPI > is_start = 1 + myrank * NSub > is_end = min(is_start + NSub, N_blocks) > if (myrank + 1 < NMPI) then > NSub = NSub + 1 > endif > > call > PCGASMSetSubdomains(pc,NSub,subdomains_IS(is_start:is_end),inflated_IS(is_start:is_end),ierr) > call > PCGASMDestroySubdomains(NSub,subdomains_IS(is_start:is_end),inflated_IS(is_start:is_end),ierr) > > call PetscOptionsSetValue(PETSC_NULL_OPTIONS,"-sub_ksp_type", > "gmres", ierr) > call PetscOptionsSetValue(PETSC_NULL_OPTIONS,"-sub_pc_type", "none", > ierr) > call > PetscOptionsSetValue(PETSC_NULL_OPTIONS,"-pc_gasm_view_subdomains", "1", > ierr) > > call KSPSetUp(ksp, ierr) > call PCSetUp(pc, ierr) > call KSPSetFromOptions(ksp, ierr) > call PCSetFromOptions(pc, ierr) > > call KSPView(ksp,PETSC_VIEWER_STDOUT_WORLD, ierr) > > > Il giorno mer 10 mag 2023 alle ore 03:02 Barry Smith > ha scritto: > >> >> >> On May 9, 2023, at 4:58 PM, LEONARDO MUTTI < >> leonardo.mutti01 at universitadipavia.it> wrote: >> >> In my notation diag(1,1) means a diagonal 2x2 matrix with 1,1 on the >> diagonal, submatrix in the 8x8 diagonal matrix diag(1,1,2,2,...,2). >> Am I then correct that the IS representing diag(1,1) is 0,1, and that >> diag(2,2,...,2) is represented by 2,3,4,5,6,7? >> >> >> I believe so >> >> Thanks, >> Leonardo >> >> Il mar 9 mag 2023, 20:45 Barry Smith ha scritto: >> >>> >>> It is simplier than you are making it out to be. Each IS[] is a list >>> of rows (and columns) in the sub (domain) matrix. In your case with the >>> matrix of 144 by 144 the indices will go from 0 to 143. >>> >>> In your simple Fortran code you have a completely different problem. A >>> matrix with 8 rows and columns. In that case if you want the first IS to >>> represent just the first row (and column) in the matrix then it should >>> contain only 0. The second submatrix which is all rows (but the first) >>> should have 1,2,3,4,5,6,7 >>> >>> I do not understand why your code has >>> >>> indices_first_domain = [0,1,8,9] ! corresponds to diag(1,1) >>>>>> >>>>> >>> it should just be 0 >>> >>> >>> >>> >>> >>> On May 9, 2023, at 12:44 PM, LEONARDO MUTTI < >>> leonardo.mutti01 at universitadipavia.it> wrote: >>> >>> Partial typo: I expect 9x(16+16) numbers to be stored in subdomain_IS : >>> # subdomains x (row indices of the submatrix + col indices of the >>> submatrix). >>> >>> Il giorno mar 9 mag 2023 alle ore 18:31 LEONARDO MUTTI < >>> leonardo.mutti01 at universitadipavia.it> ha scritto: >>> >>>> >>>> >>>> ---------- Forwarded message --------- >>>> Da: LEONARDO MUTTI >>>> Date: mar 9 mag 2023 alle ore 18:29 >>>> Subject: Re: [petsc-users] Understanding index sets for PCGASM >>>> To: Matthew Knepley >>>> >>>> >>>> Thank you for your answer, but I am still confused, sorry. >>>> Consider >>>> https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/ksp/tests/ex71f.F90 on >>>> one processor. >>>> Let M=12 for the sake of simplicity, i.e. we deal with a 12x12 2D grid, >>>> hence, a 144x144 matrix. >>>> Let NSubx = 3, so that on the grid we do 3 vertical and 3 horizontal >>>> subdivisions. >>>> We should obtain 9 subdomains that are grids of 4x4 nodes each, thus >>>> corresponding to 9 submatrices of size 16x16. 
>>>> In my run I obtain NSub = 9 (great) and subdomain_IS(i), i=1,...,9, >>>> reads: >>>> >>>> *IS Object: 1 MPI process* >>>> * type: general* >>>> *Number of indices in set 16* >>>> *0 0* >>>> *1 1* >>>> *2 2* >>>> *3 3* >>>> *4 12* >>>> *5 13* >>>> *6 14* >>>> *7 15* >>>> *8 24* >>>> *9 25* >>>> *10 26* >>>> *11 27* >>>> *12 36* >>>> *13 37* >>>> *14 38* >>>> *15 39* >>>> *IS Object: 1 MPI process* >>>> * type: general* >>>> *Number of indices in set 16* >>>> *0 4* >>>> *1 5* >>>> *2 6* >>>> *3 7* >>>> *4 16* >>>> *5 17* >>>> *6 18* >>>> *7 19* >>>> *8 28* >>>> *9 29* >>>> *10 30* >>>> *11 31* >>>> *12 40* >>>> *13 41* >>>> *14 42* >>>> *15 43* >>>> *IS Object: 1 MPI process* >>>> * type: general* >>>> *Number of indices in set 16* >>>> *0 8* >>>> *1 9* >>>> *2 10* >>>> *3 11* >>>> *4 20* >>>> *5 21* >>>> *6 22* >>>> *7 23* >>>> *8 32* >>>> *9 33* >>>> *10 34* >>>> *11 35* >>>> *12 44* >>>> *13 45* >>>> *14 46* >>>> *15 47* >>>> *IS Object: 1 MPI process* >>>> * type: general* >>>> *Number of indices in set 16* >>>> *0 48* >>>> *1 49* >>>> *2 50* >>>> *3 51* >>>> *4 60* >>>> *5 61* >>>> *6 62* >>>> *7 63* >>>> *8 72* >>>> *9 73* >>>> *10 74* >>>> *11 75* >>>> *12 84* >>>> *13 85* >>>> *14 86* >>>> *15 87* >>>> *IS Object: 1 MPI process* >>>> * type: general* >>>> *Number of indices in set 16* >>>> *0 52* >>>> *1 53* >>>> *2 54* >>>> *3 55* >>>> *4 64* >>>> *5 65* >>>> *6 66* >>>> *7 67* >>>> *8 76* >>>> *9 77* >>>> *10 78* >>>> *11 79* >>>> *12 88* >>>> *13 89* >>>> *14 90* >>>> *15 91* >>>> *IS Object: 1 MPI process* >>>> * type: general* >>>> *Number of indices in set 16* >>>> *0 56* >>>> *1 57* >>>> *2 58* >>>> *3 59* >>>> *4 68* >>>> *5 69* >>>> *6 70* >>>> *7 71* >>>> *8 80* >>>> *9 81* >>>> *10 82* >>>> *11 83* >>>> *12 92* >>>> *13 93* >>>> *14 94* >>>> *15 95* >>>> *IS Object: 1 MPI process* >>>> * type: general* >>>> *Number of indices in set 16* >>>> *0 96* >>>> *1 97* >>>> *2 98* >>>> *3 99* >>>> *4 108* >>>> *5 109* >>>> *6 110* >>>> *7 111* >>>> *8 120* >>>> *9 121* >>>> *10 122* >>>> *11 123* >>>> *12 132* >>>> *13 133* >>>> *14 134* >>>> *15 135* >>>> *IS Object: 1 MPI process* >>>> * type: general* >>>> *Number of indices in set 16* >>>> *0 100* >>>> *1 101* >>>> *2 102* >>>> *3 103* >>>> *4 112* >>>> *5 113* >>>> *6 114* >>>> *7 115* >>>> *8 124* >>>> *9 125* >>>> *10 126* >>>> *11 127* >>>> *12 136* >>>> *13 137* >>>> *14 138* >>>> *15 139* >>>> *IS Object: 1 MPI process* >>>> * type: general* >>>> *Number of indices in set 16* >>>> *0 104* >>>> *1 105* >>>> *2 106* >>>> *3 107* >>>> *4 116* >>>> *5 117* >>>> *6 118* >>>> *7 119* >>>> *8 128* >>>> *9 129* >>>> *10 130* >>>> *11 131* >>>> *12 140* >>>> *13 141* >>>> *14 142* >>>> *15 143* >>>> >>>> As you said, no number here reaches 144. >>>> But the number stored in subdomain_IS are 9x16= #subdomains x 16, >>>> whereas I would expect, also given your latest reply, 9x16x16x2=#subdomains >>>> x submatrix height x submatrix width x length of a (row,column) pair. >>>> It would really help me if you could briefly explain how the output >>>> above encodes the subdivision into subdomains. >>>> Many thanks again, >>>> Leonardo >>>> >>>> >>>> >>>> Il giorno mar 9 mag 2023 alle ore 16:24 Matthew Knepley < >>>> knepley at gmail.com> ha scritto: >>>> >>>>> On Tue, May 9, 2023 at 10:05?AM LEONARDO MUTTI < >>>>> leonardo.mutti01 at universitadipavia.it> wrote: >>>>> >>>>>> Great thanks! I can now successfully run >>>>>> https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/ksp/tests/ex71f.F90 >>>>>> . 
>>>>>> >>>>>> Going forward with my experiments, let me post a new code snippet >>>>>> (very similar to ex71f.F90) that I cannot get to work, probably I must be >>>>>> setting up the IS objects incorrectly. >>>>>> >>>>>> I have an 8x8 matrix A=diag(1,1,2,2,...,2) and a >>>>>> vector b=(0.5,...,0.5). We have only one processor, and I want to solve >>>>>> Ax=b using GASM. In particular, KSP is set to preonly, GASM is the >>>>>> preconditioner and it uses on each submatrix an lu direct solver (sub_ksp = >>>>>> preonly, sub_pc = lu). >>>>>> >>>>>> For the GASM algorithm, I divide A into diag(1,1) and >>>>>> diag(2,2,...,2). For simplicity I set 0 overlap. Now I want to use GASM to >>>>>> solve Ax=b. The code follows. >>>>>> >>>>>> #include >>>>>> #include >>>>>> #include >>>>>> USE petscmat >>>>>> USE petscksp >>>>>> USE petscpc >>>>>> USE MPI >>>>>> >>>>>> Mat :: A >>>>>> Vec :: b, x >>>>>> PetscInt :: M, I, J, ISLen, NSub >>>>>> PetscMPIInt :: size >>>>>> PetscErrorCode :: ierr >>>>>> PetscScalar :: v >>>>>> KSP :: ksp >>>>>> PC :: pc >>>>>> IS :: subdomains_IS(2), inflated_IS(2) >>>>>> PetscInt,DIMENSION(4) :: indices_first_domain >>>>>> PetscInt,DIMENSION(36) :: indices_second_domain >>>>>> >>>>>> call PetscInitialize(PETSC_NULL_CHARACTER, ierr) >>>>>> call MPI_Comm_size(PETSC_COMM_WORLD, size, ierr) >>>>>> >>>>>> >>>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>>>> ! INTRO: create matrix and right hand side >>>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>>>> >>>>>> WRITE(*,*) "Assembling A,b" >>>>>> >>>>>> M = 8 >>>>>> call MatCreateAIJ(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, >>>>>> & M, M, PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, >>>>>> & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER,A, ierr) >>>>>> DO I=1,M >>>>>> DO J=1,M >>>>>> IF ((I .EQ. J) .AND. (I .LE. 2 )) THEN >>>>>> v = 1 >>>>>> ELSE IF ((I .EQ. J) .AND. (I .GT. 2 )) THEN >>>>>> v = 2 >>>>>> ELSE >>>>>> v = 0 >>>>>> ENDIF >>>>>> call MatSetValue(A, I-1, J-1, v, INSERT_VALUES, ierr) >>>>>> END DO >>>>>> END DO >>>>>> >>>>>> call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr) >>>>>> call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr) >>>>>> >>>>>> call VecCreate(PETSC_COMM_WORLD,b,ierr) >>>>>> call VecSetSizes(b, PETSC_DECIDE, M,ierr) >>>>>> call VecSetFromOptions(b,ierr) >>>>>> >>>>>> do I=1,M >>>>>> v = 0.5 >>>>>> call VecSetValue(b,I-1,v, INSERT_VALUES,ierr) >>>>>> end do >>>>>> >>>>>> call VecAssemblyBegin(b,ierr) >>>>>> call VecAssemblyEnd(b,ierr) >>>>>> >>>>>> >>>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>>>> ! FIRST KSP/PC SETUP >>>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>>>> >>>>>> WRITE(*,*) "KSP/PC first setup" >>>>>> >>>>>> call KSPCreate(PETSC_COMM_WORLD, ksp, ierr) >>>>>> call KSPSetOperators(ksp, A, A, ierr) >>>>>> call KSPSetType(ksp, 'preonly', ierr) >>>>>> call KSPGetPC(ksp, pc, ierr) >>>>>> call KSPSetUp(ksp, ierr) >>>>>> call PCSetType(pc, PCGASM, ierr) >>>>>> >>>>>> >>>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>>>> ! GASM, SETTING SUBDOMAINS >>>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>>>> >>>>>> WRITE(*,*) "Setting GASM subdomains" >>>>>> >>>>>> ! Let's create the subdomain IS and inflated_IS >>>>>> ! They are equal if no overlap is present >>>>>> ! They are 1: 0,1,8,9 >>>>>> ! 2: 10,...,15,18,...,23,...,58,...,63 >>>>>> >>>>>> indices_first_domain = [0,1,8,9] ! corresponds to diag(1,1) >>>>>> do I=0,5 >>>>>> do J=0,5 >>>>>> indices_second_domain(I*6+1+J) = 18 + J + 8*I ! >>>>>> corresponds to diag(2,2,...,2) >>>>>> !WRITE(*,*) I*6+1+J, 18 + J + 8*I >>>>>> end do >>>>>> end do >>>>>> >>>>>> ! 
Convert into IS >>>>>> ISLen = 4 >>>>>> call >>>>>> ISCreateGeneral(PETSC_COMM_WORLD,ISLen,indices_first_domain, >>>>>> & PETSC_COPY_VALUES, subdomains_IS(1), ierr) >>>>>> call >>>>>> ISCreateGeneral(PETSC_COMM_WORLD,ISLen,indices_first_domain, >>>>>> & PETSC_COPY_VALUES, inflated_IS(1), ierr) >>>>>> ISLen = 36 >>>>>> call >>>>>> ISCreateGeneral(PETSC_COMM_WORLD,ISLen,indices_second_domain, >>>>>> & PETSC_COPY_VALUES, subdomains_IS(2), ierr) >>>>>> call >>>>>> ISCreateGeneral(PETSC_COMM_WORLD,ISLen,indices_second_domain, >>>>>> & PETSC_COPY_VALUES, inflated_IS(2), ierr) >>>>>> >>>>>> NSub = 2 >>>>>> call PCGASMSetSubdomains(pc,NSub, >>>>>> & subdomains_IS,inflated_IS,ierr) >>>>>> call PCGASMDestroySubdomains(NSub, >>>>>> & subdomains_IS,inflated_IS,ierr) >>>>>> >>>>>> >>>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>>>> ! GASM: SET SUBSOLVERS >>>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>>>> >>>>>> WRITE(*,*) "Setting subsolvers for GASM" >>>>>> >>>>>> call PCSetUp(pc, ierr) ! should I add this? >>>>>> >>>>>> call PetscOptionsSetValue(PETSC_NULL_OPTIONS, >>>>>> & "-sub_pc_type", "lu", ierr) >>>>>> call PetscOptionsSetValue(PETSC_NULL_OPTIONS, >>>>>> & "-sub_ksp_type", "preonly", ierr) >>>>>> >>>>>> call KSPSetFromOptions(ksp, ierr) >>>>>> call PCSetFromOptions(pc, ierr) >>>>>> >>>>>> >>>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>>>> ! DUMMY SOLUTION: DID IT WORK? >>>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>>>> >>>>>> WRITE(*,*) "Solve" >>>>>> >>>>>> call VecDuplicate(b,x,ierr) >>>>>> call KSPSolve(ksp,b,x,ierr) >>>>>> >>>>>> call MatDestroy(A, ierr) >>>>>> call KSPDestroy(ksp, ierr) >>>>>> call PetscFinalize(ierr) >>>>>> >>>>>> This code is failing in multiple points. At call PCSetUp(pc, ierr) >>>>>> it produces: >>>>>> >>>>>> *[0]PETSC ERROR: Argument out of range* >>>>>> *[0]PETSC ERROR: Scatter indices in ix are out of range* >>>>>> *...* >>>>>> *[0]PETSC ERROR: #1 VecScatterCreate() at >>>>>> ***\src\vec\is\sf\INTERF~1\vscat.c:736* >>>>>> *[0]PETSC ERROR: #2 PCSetUp_GASM() at >>>>>> ***\src\ksp\pc\impls\gasm\gasm.c:433* >>>>>> *[0]PETSC ERROR: #3 PCSetUp() at ***\src\ksp\pc\INTERF~1\precon.c:994* >>>>>> >>>>>> And at call KSPSolve(ksp,b,x,ierr) it produces: >>>>>> >>>>>> *forrtl: severe (157): Program Exception - access violation* >>>>>> >>>>>> >>>>>> The index sets are setup coherently with the outputs of e.g. >>>>>> https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/ksp/tests/output/ex71f_1.out: >>>>>> in particular each element of the matrix A corresponds to a number from 0 >>>>>> to 63. >>>>>> >>>>> >>>>> This is not correct, I believe. The indices are row/col indices, not >>>>> indices into dense blocks, so for >>>>> your example, they are all in [0, 8]. >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> Note that each submatrix does not represent some physical subdomain, >>>>>> the subdivision is just at the algebraic level. >>>>>> I thus have the following questions: >>>>>> >>>>>> - is this the correct way of creating the IS objects, given my >>>>>> objective at the beginning of the email? Is the ordering correct? >>>>>> - what am I doing wrong that is generating the above errors? >>>>>> >>>>>> Thanks for the patience and the time. 
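For concreteness, a small sketch (added here, not part of the original reply) of what the two index sets for this 8x8 example look like when they follow the remark above, i.e. when they contain matrix row (= column) indices between 0 and 7 rather than indices into dense 8x8 blocks. This matches the diag(1,1) / diag(2,...,2) correspondence confirmed elsewhere in this thread (rows {0,1} and {2,...,7}); it is not claimed to be the complete corrected program:

    PetscInt :: n1, n2
    PetscInt :: idx1(2), idx2(6)

    n1 = 2
    n2 = 6
    idx1 = [0, 1]                 ! the diag(1,1) block: rows 0 and 1
    idx2 = [2, 3, 4, 5, 6, 7]     ! the diag(2,...,2) block: rows 2 to 7

    call ISCreateGeneral(PETSC_COMM_WORLD, n1, idx1, PETSC_COPY_VALUES, subdomains_IS(1), ierr)
    call ISCreateGeneral(PETSC_COMM_WORLD, n2, idx2, PETSC_COPY_VALUES, subdomains_IS(2), ierr)
    ! with zero overlap, inflated_IS(1:2) are created from the same two index lists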
>>>>>> Best, >>>>>> Leonardo >>>>>> >>>>>> Il giorno ven 5 mag 2023 alle ore 18:43 Barry Smith >>>>>> ha scritto: >>>>>> >>>>>>> >>>>>>> Added in *barry/2023-05-04/add-pcgasm-set-subdomains *see also >>>>>>> https://gitlab.com/petsc/petsc/-/merge_requests/6419 >>>>>>> >>>>>>> Barry >>>>>>> >>>>>>> >>>>>>> On May 4, 2023, at 11:23 AM, LEONARDO MUTTI < >>>>>>> leonardo.mutti01 at universitadipavia.it> wrote: >>>>>>> >>>>>>> Thank you for the help. >>>>>>> Adding to my example: >>>>>>> >>>>>>> >>>>>>> * call PCGASMSetSubdomains(pc,NSub, subdomains_IS, >>>>>>> inflated_IS,ierr) call >>>>>>> PCGASMDestroySubdomains(NSub,subdomains_IS,inflated_IS,ierr)* >>>>>>> results in: >>>>>>> >>>>>>> * Error LNK2019 unresolved external symbol >>>>>>> PCGASMDESTROYSUBDOMAINS referenced in function ... * >>>>>>> >>>>>>> * Error LNK2019 unresolved external symbol PCGASMSETSUBDOMAINS >>>>>>> referenced in function ... * >>>>>>> I'm not sure if the interfaces are missing or if I have a >>>>>>> compilation problem. >>>>>>> Thank you again. >>>>>>> Best, >>>>>>> Leonardo >>>>>>> >>>>>>> Il giorno sab 29 apr 2023 alle ore 20:30 Barry Smith < >>>>>>> bsmith at petsc.dev> ha scritto: >>>>>>> >>>>>>>> >>>>>>>> Thank you for the test code. I have a fix in the branch >>>>>>>> barry/2023-04-29/fix-pcasmcreatesubdomains2d >>>>>>>> with >>>>>>>> merge request https://gitlab.com/petsc/petsc/-/merge_requests/6394 >>>>>>>> >>>>>>>> The functions did not have proper Fortran stubs and interfaces >>>>>>>> so I had to provide them manually in the new branch. >>>>>>>> >>>>>>>> Use >>>>>>>> >>>>>>>> git fetch >>>>>>>> git checkout barry/2023-04-29/fix-pcasmcreatesubdomains2d >>>>>>>> >>>>>>>> ./configure etc >>>>>>>> >>>>>>>> Your now working test code is in src/ksp/ksp/tests/ex71f.F90 I >>>>>>>> had to change things slightly and I updated the error handling for the >>>>>>>> latest version. >>>>>>>> >>>>>>>> Please let us know if you have any later questions. >>>>>>>> >>>>>>>> Barry >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Apr 28, 2023, at 12:07 PM, LEONARDO MUTTI < >>>>>>>> leonardo.mutti01 at universitadipavia.it> wrote: >>>>>>>> >>>>>>>> Hello. I am having a hard time understanding the index sets to feed >>>>>>>> PCGASMSetSubdomains, and I am working in Fortran (as a PETSc novice). 
To >>>>>>>> get more intuition on how the IS objects behave I tried the following >>>>>>>> minimal (non) working example, which should tile a 16x16 matrix into 16 >>>>>>>> square, non-overlapping submatrices: >>>>>>>> >>>>>>>> #include >>>>>>>> #include >>>>>>>> #include >>>>>>>> USE petscmat >>>>>>>> USE petscksp >>>>>>>> USE petscpc >>>>>>>> >>>>>>>> Mat :: A >>>>>>>> PetscInt :: M, NSubx, dof, overlap, NSub >>>>>>>> INTEGER :: I,J >>>>>>>> PetscErrorCode :: ierr >>>>>>>> PetscScalar :: v >>>>>>>> KSP :: ksp >>>>>>>> PC :: pc >>>>>>>> IS :: subdomains_IS, inflated_IS >>>>>>>> >>>>>>>> call PetscInitialize(PETSC_NULL_CHARACTER , ierr) >>>>>>>> >>>>>>>> !-----Create a dummy matrix >>>>>>>> M = 16 >>>>>>>> call MatCreateAIJ(MPI_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, >>>>>>>> & M, M, >>>>>>>> & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, >>>>>>>> & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, >>>>>>>> & A, ierr) >>>>>>>> >>>>>>>> DO I=1,M >>>>>>>> DO J=1,M >>>>>>>> v = I*J >>>>>>>> CALL MatSetValue (A,I-1,J-1,v, >>>>>>>> & INSERT_VALUES , ierr) >>>>>>>> END DO >>>>>>>> END DO >>>>>>>> >>>>>>>> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY , ierr) >>>>>>>> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY , ierr) >>>>>>>> >>>>>>>> !-----Create KSP and PC >>>>>>>> call KSPCreate(PETSC_COMM_WORLD,ksp, ierr) >>>>>>>> call KSPSetOperators(ksp,A,A, ierr) >>>>>>>> call KSPSetType(ksp,"bcgs",ierr) >>>>>>>> call KSPGetPC(ksp,pc,ierr) >>>>>>>> call KSPSetUp(ksp, ierr) >>>>>>>> call PCSetType(pc,PCGASM, ierr) >>>>>>>> call PCSetUp(pc , ierr) >>>>>>>> >>>>>>>> !-----GASM setup >>>>>>>> NSubx = 4 >>>>>>>> dof = 1 >>>>>>>> overlap = 0 >>>>>>>> >>>>>>>> call PCGASMCreateSubdomains2D(pc, >>>>>>>> & M, M, >>>>>>>> & NSubx, NSubx, >>>>>>>> & dof, overlap, >>>>>>>> & NSub, subdomains_IS, inflated_IS, ierr) >>>>>>>> >>>>>>>> call ISView(subdomains_IS, PETSC_VIEWER_STDOUT_WORLD, ierr) >>>>>>>> >>>>>>>> call KSPDestroy(ksp, ierr) >>>>>>>> call PetscFinalize(ierr) >>>>>>>> >>>>>>>> Running this on one processor, I get NSub = 4. >>>>>>>> If PCASM and PCASMCreateSubdomains2D are used instead, I get NSub = >>>>>>>> 16 as expected. >>>>>>>> Moreover, I get in the end "forrtl: severe (157): Program Exception >>>>>>>> - access violation". So: >>>>>>>> 1) why do I get two different results with ASM, and GASM? >>>>>>>> 2) why do I get access violation and how can I solve this? >>>>>>>> In fact, in C, subdomains_IS, inflated_IS should pointers to IS >>>>>>>> objects. As I see on the Fortran interface, the arguments to >>>>>>>> PCGASMCreateSubdomains2D are IS objects: >>>>>>>> >>>>>>>> subroutine PCGASMCreateSubdomains2D(a,b,c,d,e,f,g,h,i,j,z) >>>>>>>> import tPC,tIS >>>>>>>> PC a ! PC >>>>>>>> PetscInt b ! PetscInt >>>>>>>> PetscInt c ! PetscInt >>>>>>>> PetscInt d ! PetscInt >>>>>>>> PetscInt e ! PetscInt >>>>>>>> PetscInt f ! PetscInt >>>>>>>> PetscInt g ! PetscInt >>>>>>>> PetscInt h ! PetscInt >>>>>>>> IS i ! IS >>>>>>>> IS j ! IS >>>>>>>> PetscErrorCode z >>>>>>>> end subroutine PCGASMCreateSubdomains2D >>>>>>>> Thus: >>>>>>>> 3) what should be inside e.g., subdomains_IS? I expect it to >>>>>>>> contain, for every created subdomain, the list of rows and columns defining >>>>>>>> the subblock in the matrix, am I right? >>>>>>>> >>>>>>>> Context: I have a block-tridiagonal system arising from space-time >>>>>>>> finite elements, and I want to solve it with GMRES+PCGASM preconditioner, >>>>>>>> where each overlapping submatrix is on the diagonal and of size 3x3 blocks >>>>>>>> (and spanning multiple processes). 
This is PETSc 3.17.1 on Windows. >>>>>>>> >>>>>>>> Thanks in advance, >>>>>>>> Leonardo >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed May 17 21:26:20 2023 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 17 May 2023 22:26:20 -0400 Subject: [petsc-users] Understanding index sets for PCGASM In-Reply-To: References: <989A8495-06FF-4D8A-8B45-F3D991D0A486@petsc.dev> <65FAB3E9-8D08-4AEE-874E-636EB2C76A29@petsc.dev> <6CE3B35C-E74E-43B5-A3DF-4D0D77E6A94C@petsc.dev> Message-ID: <07222845-BC1C-4875-BBFC-17B01E08A529@petsc.dev> Yikes. Such huge numbers usually come from integer overflow or memory corruption. The code to decide on the memory that needs allocating is straightforward PetscErrorCode MatCreateSubMatrices_MPIAIJ(Mat C, PetscInt ismax, const IS isrow[], const IS iscol[], MatReuse scall, Mat *submat[]) { PetscInt nmax, nstages = 0, i, pos, max_no, nrow, ncol, in[2], out[2]; PetscBool rowflag, colflag, wantallmatrix = PETSC_FALSE; Mat_SeqAIJ *subc; Mat_SubSppt *smat; PetscFunctionBegin; /* Check for special case: each processor has a single IS */ if (C->submat_singleis) { /* flag is set in PCSetUp_ASM() to skip MPI_Allreduce() */ PetscCall(MatCreateSubMatrices_MPIAIJ_SingleIS(C, ismax, isrow, iscol, scall, submat)); C->submat_singleis = PETSC_FALSE; /* resume its default value in case C will be used for non-single IS */ PetscFunctionReturn(PETSC_SUCCESS); } /* Collect global wantallmatrix and nstages */ if (!C->cmap->N) nmax = 20 * 1000000 / sizeof(PetscInt); else nmax = 20 * 1000000 / (C->cmap->N * sizeof(PetscInt)); if (!nmax) nmax = 1; if (scall == MAT_INITIAL_MATRIX) { /* Collect global wantallmatrix and nstages */ if (ismax == 1 && C->rmap->N == C->cmap->N) { PetscCall(ISIdentity(*isrow, &rowflag)); PetscCall(ISIdentity(*iscol, &colflag)); PetscCall(ISGetLocalSize(*isrow, &nrow)); PetscCall(ISGetLocalSize(*iscol, &ncol)); if (rowflag && colflag && nrow == C->rmap->N && ncol == C->cmap->N) { wantallmatrix = PETSC_TRUE; PetscCall(PetscOptionsGetBool(((PetscObject)C)->options, ((PetscObject)C)->prefix, "-use_fast_submatrix", &wantallmatrix, NULL)); } } /* Determine the number of stages through which submatrices are done Each stage will extract nmax submatrices. nmax is determined by the matrix column dimension. If the original matrix has 20M columns, only one submatrix per stage is allowed, etc. */ nstages = ismax / nmax + ((ismax % nmax) ? 1 : 0); /* local nstages */ in[0] = -1 * (PetscInt)wantallmatrix; in[1] = nstages; PetscCall(MPIU_Allreduce(in, out, 2, MPIU_INT, MPI_MAX, PetscObjectComm((PetscObject)C))); wantallmatrix = (PetscBool)(-out[0]); nstages = out[1]; /* Make sure every processor loops through the global nstages */ } else { /* MAT_REUSE_MATRIX */ if (ismax) { subc = (Mat_SeqAIJ *)(*submat)[0]->data; smat = subc->submatis1; } else { /* (*submat)[0] is a dummy matrix */ smat = (Mat_SubSppt *)(*submat)[0]->data; } if (!smat) { /* smat is not generated by MatCreateSubMatrix_MPIAIJ_All(...,MAT_INITIAL_MATRIX,...) 
*/ wantallmatrix = PETSC_TRUE; } else if (smat->singleis) { PetscCall(MatCreateSubMatrices_MPIAIJ_SingleIS(C, ismax, isrow, iscol, scall, submat)); PetscFunctionReturn(PETSC_SUCCESS); } else { nstages = smat->nstages; } } if (wantallmatrix) { PetscCall(MatCreateSubMatrix_MPIAIJ_All(C, MAT_GET_VALUES, scall, submat)); PetscFunctionReturn(PETSC_SUCCESS); } /* Allocate memory to hold all the submatrices and dummy submatrices */ if (scall == MAT_INITIAL_MATRIX) PetscCall(PetscCalloc1(ismax + nstages, submat)); for (i = 0, pos = 0; i < nstages; i++) { if (pos + nmax <= ismax) max_no = nmax; else if (pos >= ismax) max_no = 0; else max_no = ismax - pos; PetscCall(MatCreateSubMatrices_MPIAIJ_Local(C, max_no, isrow + pos, iscol + pos, scall, *submat + pos)); if (!max_no) { if (scall == MAT_INITIAL_MATRIX) { /* submat[pos] is a dummy matrix */ smat = (Mat_SubSppt *)(*submat)[pos]->data; smat->nstages = nstages; } pos++; /* advance to next dummy matrix if any */ } else pos += max_no; } if (ismax && scall == MAT_INITIAL_MATRIX) { /* save nstages for reuse */ subc = (Mat_SeqAIJ *)(*submat)[0]->data; smat = subc->submatis1; smat->nstages = nstages; } PetscFunctionReturn(PETSC_SUCCESS); } The easiest way to debug would be to put a breakpoint in MatCreateSubMatrices_MPIAIJ on MPI rank zero and next through the subroutine to see where the crazy number appears that gets passed down in the line if (scall == MAT_INITIAL_MATRIX) PetscCall(PetscCalloc1(ismax + nstages, submat)); where either ismax or nstages has a crazy value. If you are using the GNU compilers you can use the command line options -start_in_debugger noxterm -debugger_ranks 0 to start the Gnu debugger. If you are using the Microsoft Windows compilers you will need to use their debugger, I don't know how to do that (and shudder at the thought :-). > On May 17, 2023, at 7:51 PM, Leonardo Mutti wrote: > > Thanks for the reply. Even without Valgrind (which I can't use since I'm on Windows), by further simplifying the example, I was able to have PETSc display a more informative message. > What I am doing wrong and what should be done differently, this is still unclear to me. > > The simplified code runs on 2 processors, I built a 4x4 matrix. > The subdomains are now given by [0,1] and [2,3], with [2,3] inflating to [0,1,2,3]. > > Thank you again. > > Error: > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Out of memory. This could be due to allocating > [0]PETSC ERROR: too large an object or bleeding by not properly > [0]PETSC ERROR: destroying unneeded objects. > [0]PETSC ERROR: Memory allocated 0 Memory used by process 0 > [0]PETSC ERROR: Try running with -malloc_dump or -malloc_view for info. > [0]PETSC ERROR: Memory requested 9437902811936987136 > [0]PETSC ERROR: WARNING! There are option(s) set that were not used! Could be the program crashed before they were used or a spelling mistake, etc! > [0]PETSC ERROR: Option left: name:-pc_gasm_view_subdomains value: 1 source: code > [0]PETSC ERROR: Option left: name:-sub_ksp_type value: gmres source: code > [0]PETSC ERROR: Option left: name:-sub_pc_type value: none source: code > [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. > [0]PETSC ERROR: Petsc Development GIT revision: v3.19.1-234-g43977f8d16 GIT Date: 2023-05-08 14:50:03 +0000 > [...] 
> [0]PETSC ERROR: #1 PetscMallocAlign() at ...\Sources\Git\Test-FV\PETSC-~1\src\sys\memory\mal.c:66 > [0]PETSC ERROR: #2 PetscMallocA() at ...\Sources\Git\Test-FV\PETSC-~1\src\sys\memory\mal.c:411 > [0]PETSC ERROR: #3 MatCreateSubMatrices_MPIAIJ() at ...\Sources\Git\Test-FV\PETSC-~1\src\mat\impls\aij\mpi\mpiov.c:2025 > [0]PETSC ERROR: #4 MatCreateSubMatricesMPI_MPIXAIJ() at ...\Sources\Git\Test-FV\PETSC-~1\src\mat\impls\aij\mpi\mpiov.c:3136 > [0]PETSC ERROR: #5 MatCreateSubMatricesMPI_MPIAIJ() at ...\Sources\Git\Test-FV\PETSC-~1\src\mat\impls\aij\mpi\mpiov.c:3208 > [0]PETSC ERROR: #6 MatCreateSubMatricesMPI() at ...\Sources\Git\Test-FV\PETSC-~1\src\mat\INTERF~1\matrix.c:7071 > [0]PETSC ERROR: #7 PCSetUp_GASM() at ...\Sources\Git\Test-FV\PETSC-~1\src\ksp\pc\impls\gasm\gasm.c:556 > [0]PETSC ERROR: #8 PCSetUp() at ...\Sources\Git\Test-FV\PETSC-~1\src\ksp\pc\INTERF~1\precon.c:994 > [0]PETSC ERROR: #9 KSPSetUp() at ...\Sources\Git\Test-FV\PETSC-~1\src\ksp\ksp\INTERF~1\itfunc.c:406 > > Code, the important bit is after ! GASM, SETTING SUBDOMAINS: > > Mat :: A > Vec :: b > PetscInt :: M,N_blocks,block_size,NSub,I > PetscErrorCode :: ierr > PetscScalar :: v > KSP :: ksp > PC :: pc > IS :: subdomains_IS(2), inflated_IS(2) > PetscInt :: NMPI,MYRANK,IERMPI > INTEGER :: start > > call PetscInitialize(PETSC_NULL_CHARACTER, ierr) > call PetscLogDefaultBegin(ierr) > call MPI_COMM_SIZE(MPI_COMM_WORLD, NMPI, IERMPI) > CALL MPI_COMM_RANK(MPI_COMM_WORLD, MYRANK,IERMPI) > > N_blocks = 2 > block_size = 2 > M = N_blocks * block_size > > > ! INTRO: create matrix and right hand side, create IS > > call MatCreateAIJ(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, > & M, M, PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, > & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER,A, ierr) > call VecCreate(PETSC_COMM_WORLD,b,ierr) > call VecSetSizes(b, PETSC_DECIDE, M,ierr) > call VecSetFromOptions(b,ierr) > > DO I=(MYRANK*(M/NMPI)),((MYRANK+1)*(M/NMPI)-1) > > ! Set matrix > v=1 > call MatSetValue(A, I, I, v, INSERT_VALUES, ierr) > IF (I-block_size .GE. 0) THEN > v=-1 > call MatSetValue(A, I, I-block_size, v, INSERT_VALUES, ierr) > ENDIF > ! Set rhs > v = I > call VecSetValue(b,I,v, INSERT_VALUES,ierr) > > END DO > > call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr) > call VecAssemblyBegin(b,ierr) > call VecAssemblyEnd(b,ierr) > > ! FIRST KSP/PC SETUP > > call KSPCreate(PETSC_COMM_WORLD, ksp, ierr) > call KSPSetOperators(ksp, A, A, ierr) > call KSPSetType(ksp, 'preonly', ierr) > call KSPGetPC(ksp, pc, ierr) > call PCSetType(pc, PCGASM, ierr) > > > ! GASM, SETTING SUBDOMAINS > > if (myrank == 0) then > call ISCreateStride(PETSC_COMM_SELF, 2, 0, 1, subdomains_IS(1), ierr) > call ISCreateStride(PETSC_COMM_WORLD, 0, 0, 1, subdomains_IS(2), ierr) > call ISCreateStride(PETSC_COMM_SELF, 2, 0, 1, inflated_IS(1), ierr) > call ISCreateStride(PETSC_COMM_WORLD, 2, 0, 1, inflated_IS(2), ierr) > start = 1 > NSub = 2 > else > call ISCreateStride(PETSC_COMM_WORLD, 2, 2, 1, subdomains_IS(2), ierr) > call ISCreateStride(PETSC_COMM_WORLD, 2, 2, 1, inflated_IS(2), ierr) > start = 2 > NSub = 1 > endif > > call PCGASMSetSubdomains(pc,NSub,subdomains_IS(start:2),inflated_IS(start:2),ierr) > call PCGASMDestroySubdomains(NSub,subdomains_IS(start:2),inflated_IS(start:2),ierr) > > ! 
GASM: SET SUBSOLVERS > > call PetscOptionsSetValue(PETSC_NULL_OPTIONS,"-sub_ksp_type", "gmres", ierr) > call PetscOptionsSetValue(PETSC_NULL_OPTIONS,"-sub_pc_type", "none", ierr) > call PetscOptionsSetValue(PETSC_NULL_OPTIONS,"-pc_gasm_view_subdomains", "1", ierr) > > call KSPSetUp(ksp, ierr) > call PCSetUp(pc, ierr) > call KSPSetFromOptions(ksp, ierr) > call PCSetFromOptions(pc, ierr) > call KSPView(ksp,PETSC_VIEWER_STDOUT_WORLD, ierr) > > call MatDestroy(A, ierr) > call PetscFinalize(ierr) > > > Il giorno mer 17 mag 2023 alle ore 21:55 Barry Smith > ha scritto: >> >> >>> On May 17, 2023, at 11:10 AM, Leonardo Mutti > wrote: >>> >>> Dear developers, let me kindly ask for your help again. >>> In the following snippet, a bi-diagonal matrix A is set up. It measures 8x8 blocks, each block is 2x2 elements. I would like to create the correct IS objects for PCGASM. >>> The non-overlapping IS should be: [0,1], [2,3],[4,5], ..., [14,15]. The overlapping IS should be: [0,1], [0,1,2,3], [2,3,4,5], ..., [12,13,14,15] >>> I am running the code with 4 processors. For some reason, after calling PCGASMDestroySubdomains the code crashes with severe (157): Program Exception - access violation. A visual inspection of the indices using ISView looks good. >> >> Likely memory corruption or use of an object or an array that was already freed. Best to use Valgrind to find the exact location of the mess. >> >> >>> Thanks again, >>> Leonardo >>> >>> Mat :: A >>> Vec :: b >>> PetscInt :: M,N_blocks,block_size,I,J,NSub,converged_reason,srank,erank,color,subcomm >>> PetscMPIInt :: size >>> PetscErrorCode :: ierr >>> PetscScalar :: v >>> KSP :: ksp >>> PC :: pc >>> IS,ALLOCATABLE :: subdomains_IS(:), inflated_IS(:) >>> PetscInt :: NMPI,MYRANK,IERMPI >>> INTEGER :: IS_counter, is_start, is_end >>> >>> call PetscInitialize(PETSC_NULL_CHARACTER, ierr) >>> call PetscLogDefaultBegin(ierr) >>> call MPI_COMM_SIZE(MPI_COMM_WORLD, NMPI, IERMPI) >>> CALL MPI_COMM_RANK(MPI_COMM_WORLD, MYRANK,IERMPI) >>> >>> N_blocks = 8 >>> block_size = 2 >>> M = N_blocks * block_size >>> >>> ALLOCATE(subdomains_IS(N_blocks)) >>> ALLOCATE(inflated_IS(N_blocks)) >>> >>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>> ! ASSUMPTION: no block spans more than one rank (the inflated blocks can) >>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>> >>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>> ! INTRO: create matrix and right hand side >>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>> >>> ! How many inflated blocks span more than one rank? NMPI-1 ! >>> >>> call MatCreateAIJ(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, >>> & M, M, PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, >>> & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER,A, ierr) >>> call VecCreate(PETSC_COMM_WORLD,b,ierr) >>> call VecSetSizes(b, PETSC_DECIDE, M,ierr) >>> call VecSetFromOptions(b,ierr) >>> >>> DO I=(MYRANK*(M/NMPI)),((MYRANK+1)*(M/NMPI)-1) >>> >>> ! Set matrix >>> v=1 >>> call MatSetValue(A, I, I, v, INSERT_VALUES, ierr) >>> IF (I-block_size .GE. 0) THEN >>> v=-1 >>> call MatSetValue(A, I, I-block_size, v, INSERT_VALUES, ierr) >>> ENDIF >>> ! Set rhs >>> v = I >>> call VecSetValue(b,I,v, INSERT_VALUES,ierr) >>> >>> END DO >>> >>> call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr) >>> call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr) >>> call VecAssemblyBegin(b,ierr) >>> call VecAssemblyEnd(b,ierr) >>> >>> >>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>> ! FIRST KSP/PC SETUP >>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
>>> >>> call KSPCreate(PETSC_COMM_WORLD, ksp, ierr) >>> call KSPSetOperators(ksp, A, A, ierr) >>> call KSPSetType(ksp, 'preonly', ierr) >>> call KSPGetPC(ksp, pc, ierr) >>> call PCSetType(pc, PCGASM, ierr) >>> >>> >>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>> !! GASM, SETTING SUBDOMAINS >>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>> >>> DO IS_COUNTER=1,N_blocks >>> >>> srank = MAX(((IS_COUNTER-2)*block_size)/(M/NMPI),0) ! start rank reached by inflated block >>> erank = MIN(((IS_COUNTER-1)*block_size)/(M/NMPI),NMPI-1) ! end rank reached by inflated block. Coincides with rank containing non-inflated block >>> >>> ! Create subcomms >>> color = MPI_UNDEFINED >>> IF (myrank == srank .or. myrank == erank) THEN >>> color = 1 >>> ENDIF >>> call MPI_Comm_split(MPI_COMM_WORLD,color,MYRANK,subcomm,ierr) >>> >>> ! Create IS >>> IF (srank .EQ. erank) THEN ! Block and overlap are on the same rank >>> IF (MYRANK .EQ. srank) THEN >>> call ISCreateStride(PETSC_COMM_SELF,block_size,(IS_COUNTER-1)*block_size,1,subdomains_IS(IS_COUNTER),ierr) >>> IF (IS_COUNTER .EQ. 1) THEN ! the first block is not inflated >>> call ISCreateStride(PETSC_COMM_SELF,block_size,(IS_COUNTER-1)*block_size,1,inflated_IS(IS_COUNTER),ierr) >>> ELSE >>> call ISCreateStride(PETSC_COMM_SELF,2*block_size,(IS_COUNTER-2)*block_size,1,inflated_IS(IS_COUNTER),ierr) >>> ENDIF >>> ENDIF >>> else ! Block and overlap not on the same rank >>> if (myrank == erank) then ! the block >>> call ISCreateStride(subcomm,block_size,(IS_COUNTER-1)*block_size,1,subdomains_IS(IS_COUNTER),ierr) >>> call ISCreateStride(subcomm,block_size,(IS_COUNTER-1)*block_size,1,inflated_IS(IS_COUNTER),ierr) >>> endif >>> if (myrank == srank) then ! the overlap >>> call ISCreateStride(subcomm,block_size,(IS_COUNTER-2)*block_size,1,inflated_IS(IS_COUNTER),ierr) >>> call ISCreateStride(subcomm,0,(IS_COUNTER-1)*block_size,1,subdomains_IS(IS_COUNTER),ierr) >>> endif >>> endif >>> >>> call MPI_Comm_free(subcomm, ierr) >>> END DO >>> >>> ! Set the domains/subdomains >>> NSub = N_blocks/NMPI >>> is_start = 1 + myrank * NSub >>> is_end = min(is_start + NSub, N_blocks) >>> if (myrank + 1 < NMPI) then >>> NSub = NSub + 1 >>> endif >>> >>> call PCGASMSetSubdomains(pc,NSub,subdomains_IS(is_start:is_end),inflated_IS(is_start:is_end),ierr) >>> call PCGASMDestroySubdomains(NSub,subdomains_IS(is_start:is_end),inflated_IS(is_start:is_end),ierr) >>> >>> call PetscOptionsSetValue(PETSC_NULL_OPTIONS,"-sub_ksp_type", "gmres", ierr) >>> call PetscOptionsSetValue(PETSC_NULL_OPTIONS,"-sub_pc_type", "none", ierr) >>> call PetscOptionsSetValue(PETSC_NULL_OPTIONS,"-pc_gasm_view_subdomains", "1", ierr) >>> >>> call KSPSetUp(ksp, ierr) >>> call PCSetUp(pc, ierr) >>> call KSPSetFromOptions(ksp, ierr) >>> call PCSetFromOptions(pc, ierr) >>> >>> call KSPView(ksp,PETSC_VIEWER_STDOUT_WORLD, ierr) >>> >>> >>> Il giorno mer 10 mag 2023 alle ore 03:02 Barry Smith > ha scritto: >>>> >>>> >>>>> On May 9, 2023, at 4:58 PM, LEONARDO MUTTI > wrote: >>>>> >>>>> In my notation diag(1,1) means a diagonal 2x2 matrix with 1,1 on the diagonal, submatrix in the 8x8 diagonal matrix diag(1,1,2,2,...,2). >>>>> Am I then correct that the IS representing diag(1,1) is 0,1, and that diag(2,2,...,2) is represented by 2,3,4,5,6,7? >>>> >>>> I believe so >>>> >>>>> Thanks, >>>>> Leonardo >>>>> >>>>> Il mar 9 mag 2023, 20:45 Barry Smith > ha scritto: >>>>>> >>>>>> It is simplier than you are making it out to be. Each IS[] is a list of rows (and columns) in the sub (domain) matrix. 
In your case with the matrix of 144 by 144 the indices will go from 0 to 143. >>>>>> >>>>>> In your simple Fortran code you have a completely different problem. A matrix with 8 rows and columns. In that case if you want the first IS to represent just the first row (and column) in the matrix then it should contain only 0. The second submatrix which is all rows (but the first) should have 1,2,3,4,5,6,7 >>>>>> >>>>>> I do not understand why your code has >>>>>> >>>>>>>>>> indices_first_domain = [0,1,8,9] ! corresponds to diag(1,1) >>>>>> >>>>>> it should just be 0 >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>>> On May 9, 2023, at 12:44 PM, LEONARDO MUTTI > wrote: >>>>>>> >>>>>>> Partial typo: I expect 9x(16+16) numbers to be stored in subdomain_IS : # subdomains x (row indices of the submatrix + col indices of the submatrix). >>>>>>> >>>>>>> Il giorno mar 9 mag 2023 alle ore 18:31 LEONARDO MUTTI > ha scritto: >>>>>>>> >>>>>>>> >>>>>>>> ---------- Forwarded message --------- >>>>>>>> Da: LEONARDO MUTTI > >>>>>>>> Date: mar 9 mag 2023 alle ore 18:29 >>>>>>>> Subject: Re: [petsc-users] Understanding index sets for PCGASM >>>>>>>> To: Matthew Knepley > >>>>>>>> >>>>>>>> >>>>>>>> Thank you for your answer, but I am still confused, sorry. >>>>>>>> Consider https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/ksp/tests/ex71f.F90 on one processor. >>>>>>>> Let M=12 for the sake of simplicity, i.e. we deal with a 12x12 2D grid, hence, a 144x144 matrix. >>>>>>>> Let NSubx = 3, so that on the grid we do 3 vertical and 3 horizontal subdivisions. >>>>>>>> We should obtain 9 subdomains that are grids of 4x4 nodes each, thus corresponding to 9 submatrices of size 16x16. >>>>>>>> In my run I obtain NSub = 9 (great) and subdomain_IS(i), i=1,...,9, reads: >>>>>>>> >>>>>>>> IS Object: 1 MPI process >>>>>>>> type: general >>>>>>>> Number of indices in set 16 >>>>>>>> 0 0 >>>>>>>> 1 1 >>>>>>>> 2 2 >>>>>>>> 3 3 >>>>>>>> 4 12 >>>>>>>> 5 13 >>>>>>>> 6 14 >>>>>>>> 7 15 >>>>>>>> 8 24 >>>>>>>> 9 25 >>>>>>>> 10 26 >>>>>>>> 11 27 >>>>>>>> 12 36 >>>>>>>> 13 37 >>>>>>>> 14 38 >>>>>>>> 15 39 >>>>>>>> IS Object: 1 MPI process >>>>>>>> type: general >>>>>>>> Number of indices in set 16 >>>>>>>> 0 4 >>>>>>>> 1 5 >>>>>>>> 2 6 >>>>>>>> 3 7 >>>>>>>> 4 16 >>>>>>>> 5 17 >>>>>>>> 6 18 >>>>>>>> 7 19 >>>>>>>> 8 28 >>>>>>>> 9 29 >>>>>>>> 10 30 >>>>>>>> 11 31 >>>>>>>> 12 40 >>>>>>>> 13 41 >>>>>>>> 14 42 >>>>>>>> 15 43 >>>>>>>> IS Object: 1 MPI process >>>>>>>> type: general >>>>>>>> Number of indices in set 16 >>>>>>>> 0 8 >>>>>>>> 1 9 >>>>>>>> 2 10 >>>>>>>> 3 11 >>>>>>>> 4 20 >>>>>>>> 5 21 >>>>>>>> 6 22 >>>>>>>> 7 23 >>>>>>>> 8 32 >>>>>>>> 9 33 >>>>>>>> 10 34 >>>>>>>> 11 35 >>>>>>>> 12 44 >>>>>>>> 13 45 >>>>>>>> 14 46 >>>>>>>> 15 47 >>>>>>>> IS Object: 1 MPI process >>>>>>>> type: general >>>>>>>> Number of indices in set 16 >>>>>>>> 0 48 >>>>>>>> 1 49 >>>>>>>> 2 50 >>>>>>>> 3 51 >>>>>>>> 4 60 >>>>>>>> 5 61 >>>>>>>> 6 62 >>>>>>>> 7 63 >>>>>>>> 8 72 >>>>>>>> 9 73 >>>>>>>> 10 74 >>>>>>>> 11 75 >>>>>>>> 12 84 >>>>>>>> 13 85 >>>>>>>> 14 86 >>>>>>>> 15 87 >>>>>>>> IS Object: 1 MPI process >>>>>>>> type: general >>>>>>>> Number of indices in set 16 >>>>>>>> 0 52 >>>>>>>> 1 53 >>>>>>>> 2 54 >>>>>>>> 3 55 >>>>>>>> 4 64 >>>>>>>> 5 65 >>>>>>>> 6 66 >>>>>>>> 7 67 >>>>>>>> 8 76 >>>>>>>> 9 77 >>>>>>>> 10 78 >>>>>>>> 11 79 >>>>>>>> 12 88 >>>>>>>> 13 89 >>>>>>>> 14 90 >>>>>>>> 15 91 >>>>>>>> IS Object: 1 MPI process >>>>>>>> type: general >>>>>>>> Number of indices in set 16 >>>>>>>> 0 56 >>>>>>>> 1 57 >>>>>>>> 2 58 >>>>>>>> 3 59 >>>>>>>> 4 68 >>>>>>>> 5 69 
>>>>>>>> 6 70 >>>>>>>> 7 71 >>>>>>>> 8 80 >>>>>>>> 9 81 >>>>>>>> 10 82 >>>>>>>> 11 83 >>>>>>>> 12 92 >>>>>>>> 13 93 >>>>>>>> 14 94 >>>>>>>> 15 95 >>>>>>>> IS Object: 1 MPI process >>>>>>>> type: general >>>>>>>> Number of indices in set 16 >>>>>>>> 0 96 >>>>>>>> 1 97 >>>>>>>> 2 98 >>>>>>>> 3 99 >>>>>>>> 4 108 >>>>>>>> 5 109 >>>>>>>> 6 110 >>>>>>>> 7 111 >>>>>>>> 8 120 >>>>>>>> 9 121 >>>>>>>> 10 122 >>>>>>>> 11 123 >>>>>>>> 12 132 >>>>>>>> 13 133 >>>>>>>> 14 134 >>>>>>>> 15 135 >>>>>>>> IS Object: 1 MPI process >>>>>>>> type: general >>>>>>>> Number of indices in set 16 >>>>>>>> 0 100 >>>>>>>> 1 101 >>>>>>>> 2 102 >>>>>>>> 3 103 >>>>>>>> 4 112 >>>>>>>> 5 113 >>>>>>>> 6 114 >>>>>>>> 7 115 >>>>>>>> 8 124 >>>>>>>> 9 125 >>>>>>>> 10 126 >>>>>>>> 11 127 >>>>>>>> 12 136 >>>>>>>> 13 137 >>>>>>>> 14 138 >>>>>>>> 15 139 >>>>>>>> IS Object: 1 MPI process >>>>>>>> type: general >>>>>>>> Number of indices in set 16 >>>>>>>> 0 104 >>>>>>>> 1 105 >>>>>>>> 2 106 >>>>>>>> 3 107 >>>>>>>> 4 116 >>>>>>>> 5 117 >>>>>>>> 6 118 >>>>>>>> 7 119 >>>>>>>> 8 128 >>>>>>>> 9 129 >>>>>>>> 10 130 >>>>>>>> 11 131 >>>>>>>> 12 140 >>>>>>>> 13 141 >>>>>>>> 14 142 >>>>>>>> 15 143 >>>>>>>> >>>>>>>> As you said, no number here reaches 144. >>>>>>>> But the number stored in subdomain_IS are 9x16= #subdomains x 16, whereas I would expect, also given your latest reply, 9x16x16x2=#subdomains x submatrix height x submatrix width x length of a (row,column) pair. >>>>>>>> It would really help me if you could briefly explain how the output above encodes the subdivision into subdomains. >>>>>>>> Many thanks again, >>>>>>>> Leonardo >>>>>>>> >>>>>>>> >>>>>>>> Il giorno mar 9 mag 2023 alle ore 16:24 Matthew Knepley > ha scritto: >>>>>>>>> On Tue, May 9, 2023 at 10:05?AM LEONARDO MUTTI > wrote: >>>>>>>>>> Great thanks! I can now successfully run https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/ksp/tests/ex71f.F90. >>>>>>>>>> >>>>>>>>>> Going forward with my experiments, let me post a new code snippet (very similar to ex71f.F90) that I cannot get to work, probably I must be setting up the IS objects incorrectly. >>>>>>>>>> >>>>>>>>>> I have an 8x8 matrix A=diag(1,1,2,2,...,2) and a vector b=(0.5,...,0.5). We have only one processor, and I want to solve Ax=b using GASM. In particular, KSP is set to preonly, GASM is the preconditioner and it uses on each submatrix an lu direct solver (sub_ksp = preonly, sub_pc = lu). >>>>>>>>>> >>>>>>>>>> For the GASM algorithm, I divide A into diag(1,1) and diag(2,2,...,2). For simplicity I set 0 overlap. Now I want to use GASM to solve Ax=b. The code follows. >>>>>>>>>> >>>>>>>>>> #include >>>>>>>>>> #include >>>>>>>>>> #include >>>>>>>>>> USE petscmat >>>>>>>>>> USE petscksp >>>>>>>>>> USE petscpc >>>>>>>>>> USE MPI >>>>>>>>>> >>>>>>>>>> Mat :: A >>>>>>>>>> Vec :: b, x >>>>>>>>>> PetscInt :: M, I, J, ISLen, NSub >>>>>>>>>> PetscMPIInt :: size >>>>>>>>>> PetscErrorCode :: ierr >>>>>>>>>> PetscScalar :: v >>>>>>>>>> KSP :: ksp >>>>>>>>>> PC :: pc >>>>>>>>>> IS :: subdomains_IS(2), inflated_IS(2) >>>>>>>>>> PetscInt,DIMENSION(4) :: indices_first_domain >>>>>>>>>> PetscInt,DIMENSION(36) :: indices_second_domain >>>>>>>>>> >>>>>>>>>> call PetscInitialize(PETSC_NULL_CHARACTER, ierr) >>>>>>>>>> call MPI_Comm_size(PETSC_COMM_WORLD, size, ierr) >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>>>>>>>> ! INTRO: create matrix and right hand side >>>>>>>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
>>>>>>>>>> >>>>>>>>>> WRITE(*,*) "Assembling A,b" >>>>>>>>>> >>>>>>>>>> M = 8 >>>>>>>>>> call MatCreateAIJ(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, >>>>>>>>>> & M, M, PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, >>>>>>>>>> & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER,A, ierr) >>>>>>>>>> DO I=1,M >>>>>>>>>> DO J=1,M >>>>>>>>>> IF ((I .EQ. J) .AND. (I .LE. 2 )) THEN >>>>>>>>>> v = 1 >>>>>>>>>> ELSE IF ((I .EQ. J) .AND. (I .GT. 2 )) THEN >>>>>>>>>> v = 2 >>>>>>>>>> ELSE >>>>>>>>>> v = 0 >>>>>>>>>> ENDIF >>>>>>>>>> call MatSetValue(A, I-1, J-1, v, INSERT_VALUES, ierr) >>>>>>>>>> END DO >>>>>>>>>> END DO >>>>>>>>>> >>>>>>>>>> call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr) >>>>>>>>>> call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr) >>>>>>>>>> >>>>>>>>>> call VecCreate(PETSC_COMM_WORLD,b,ierr) >>>>>>>>>> call VecSetSizes(b, PETSC_DECIDE, M,ierr) >>>>>>>>>> call VecSetFromOptions(b,ierr) >>>>>>>>>> >>>>>>>>>> do I=1,M >>>>>>>>>> v = 0.5 >>>>>>>>>> call VecSetValue(b,I-1,v, INSERT_VALUES,ierr) >>>>>>>>>> end do >>>>>>>>>> >>>>>>>>>> call VecAssemblyBegin(b,ierr) >>>>>>>>>> call VecAssemblyEnd(b,ierr) >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>>>>>>>> ! FIRST KSP/PC SETUP >>>>>>>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>>>>>>>> >>>>>>>>>> WRITE(*,*) "KSP/PC first setup" >>>>>>>>>> >>>>>>>>>> call KSPCreate(PETSC_COMM_WORLD, ksp, ierr) >>>>>>>>>> call KSPSetOperators(ksp, A, A, ierr) >>>>>>>>>> call KSPSetType(ksp, 'preonly', ierr) >>>>>>>>>> call KSPGetPC(ksp, pc, ierr) >>>>>>>>>> call KSPSetUp(ksp, ierr) >>>>>>>>>> call PCSetType(pc, PCGASM, ierr) >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>>>>>>>> ! GASM, SETTING SUBDOMAINS >>>>>>>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>>>>>>>> >>>>>>>>>> WRITE(*,*) "Setting GASM subdomains" >>>>>>>>>> >>>>>>>>>> ! Let's create the subdomain IS and inflated_IS >>>>>>>>>> ! They are equal if no overlap is present >>>>>>>>>> ! They are 1: 0,1,8,9 >>>>>>>>>> ! 2: 10,...,15,18,...,23,...,58,...,63 >>>>>>>>>> >>>>>>>>>> indices_first_domain = [0,1,8,9] ! corresponds to diag(1,1) >>>>>>>>>> do I=0,5 >>>>>>>>>> do J=0,5 >>>>>>>>>> indices_second_domain(I*6+1+J) = 18 + J + 8*I ! corresponds to diag(2,2,...,2) >>>>>>>>>> !WRITE(*,*) I*6+1+J, 18 + J + 8*I >>>>>>>>>> end do >>>>>>>>>> end do >>>>>>>>>> >>>>>>>>>> ! Convert into IS >>>>>>>>>> ISLen = 4 >>>>>>>>>> call ISCreateGeneral(PETSC_COMM_WORLD,ISLen,indices_first_domain, >>>>>>>>>> & PETSC_COPY_VALUES, subdomains_IS(1), ierr) >>>>>>>>>> call ISCreateGeneral(PETSC_COMM_WORLD,ISLen,indices_first_domain, >>>>>>>>>> & PETSC_COPY_VALUES, inflated_IS(1), ierr) >>>>>>>>>> ISLen = 36 >>>>>>>>>> call ISCreateGeneral(PETSC_COMM_WORLD,ISLen,indices_second_domain, >>>>>>>>>> & PETSC_COPY_VALUES, subdomains_IS(2), ierr) >>>>>>>>>> call ISCreateGeneral(PETSC_COMM_WORLD,ISLen,indices_second_domain, >>>>>>>>>> & PETSC_COPY_VALUES, inflated_IS(2), ierr) >>>>>>>>>> >>>>>>>>>> NSub = 2 >>>>>>>>>> call PCGASMSetSubdomains(pc,NSub, >>>>>>>>>> & subdomains_IS,inflated_IS,ierr) >>>>>>>>>> call PCGASMDestroySubdomains(NSub, >>>>>>>>>> & subdomains_IS,inflated_IS,ierr) >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>>>>>>>> ! GASM: SET SUBSOLVERS >>>>>>>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>>>>>>>> >>>>>>>>>> WRITE(*,*) "Setting subsolvers for GASM" >>>>>>>>>> >>>>>>>>>> call PCSetUp(pc, ierr) ! should I add this? 
>>>>>>>>>> >>>>>>>>>> call PetscOptionsSetValue(PETSC_NULL_OPTIONS, >>>>>>>>>> & "-sub_pc_type", "lu", ierr) >>>>>>>>>> call PetscOptionsSetValue(PETSC_NULL_OPTIONS, >>>>>>>>>> & "-sub_ksp_type", "preonly", ierr) >>>>>>>>>> >>>>>>>>>> call KSPSetFromOptions(ksp, ierr) >>>>>>>>>> call PCSetFromOptions(pc, ierr) >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>>>>>>>> ! DUMMY SOLUTION: DID IT WORK? >>>>>>>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>>>>>>>> >>>>>>>>>> WRITE(*,*) "Solve" >>>>>>>>>> >>>>>>>>>> call VecDuplicate(b,x,ierr) >>>>>>>>>> call KSPSolve(ksp,b,x,ierr) >>>>>>>>>> >>>>>>>>>> call MatDestroy(A, ierr) >>>>>>>>>> call KSPDestroy(ksp, ierr) >>>>>>>>>> call PetscFinalize(ierr) >>>>>>>>>> >>>>>>>>>> This code is failing in multiple points. At call PCSetUp(pc, ierr) it produces: >>>>>>>>>> >>>>>>>>>> [0]PETSC ERROR: Argument out of range >>>>>>>>>> [0]PETSC ERROR: Scatter indices in ix are out of range >>>>>>>>>> ... >>>>>>>>>> [0]PETSC ERROR: #1 VecScatterCreate() at ***\src\vec\is\sf\INTERF~1\vscat.c:736 >>>>>>>>>> [0]PETSC ERROR: #2 PCSetUp_GASM() at ***\src\ksp\pc\impls\gasm\gasm.c:433 >>>>>>>>>> [0]PETSC ERROR: #3 PCSetUp() at ***\src\ksp\pc\INTERF~1\precon.c:994 >>>>>>>>>> >>>>>>>>>> And at call KSPSolve(ksp,b,x,ierr) it produces: >>>>>>>>>> >>>>>>>>>> forrtl: severe (157): Program Exception - access violation >>>>>>>>>> >>>>>>>>>> The index sets are setup coherently with the outputs of e.g. https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/ksp/tests/output/ex71f_1.out: in particular each element of the matrix A corresponds to a number from 0 to 63. >>>>>>>>> >>>>>>>>> This is not correct, I believe. The indices are row/col indices, not indices into dense blocks, so for >>>>>>>>> your example, they are all in [0, 8]. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Matt >>>>>>>>> >>>>>>>>>> Note that each submatrix does not represent some physical subdomain, the subdivision is just at the algebraic level. >>>>>>>>>> I thus have the following questions: >>>>>>>>>> is this the correct way of creating the IS objects, given my objective at the beginning of the email? Is the ordering correct? >>>>>>>>>> what am I doing wrong that is generating the above errors? >>>>>>>>>> Thanks for the patience and the time. >>>>>>>>>> Best, >>>>>>>>>> Leonardo >>>>>>>>>> >>>>>>>>>> Il giorno ven 5 mag 2023 alle ore 18:43 Barry Smith > ha scritto: >>>>>>>>>>> >>>>>>>>>>> Added in barry/2023-05-04/add-pcgasm-set-subdomains see also https://gitlab.com/petsc/petsc/-/merge_requests/6419 >>>>>>>>>>> >>>>>>>>>>> Barry >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> On May 4, 2023, at 11:23 AM, LEONARDO MUTTI > wrote: >>>>>>>>>>>> >>>>>>>>>>>> Thank you for the help. >>>>>>>>>>>> Adding to my example: >>>>>>>>>>>> call PCGASMSetSubdomains(pc,NSub, subdomains_IS, inflated_IS,ierr) >>>>>>>>>>>> call PCGASMDestroySubdomains(NSub,subdomains_IS,inflated_IS,ierr) >>>>>>>>>>>> results in: >>>>>>>>>>>> Error LNK2019 unresolved external symbol PCGASMDESTROYSUBDOMAINS referenced in function ... >>>>>>>>>>>> Error LNK2019 unresolved external symbol PCGASMSETSUBDOMAINS referenced in function ... >>>>>>>>>>>> I'm not sure if the interfaces are missing or if I have a compilation problem. >>>>>>>>>>>> Thank you again. >>>>>>>>>>>> Best, >>>>>>>>>>>> Leonardo >>>>>>>>>>>> >>>>>>>>>>>> Il giorno sab 29 apr 2023 alle ore 20:30 Barry Smith > ha scritto: >>>>>>>>>>>>> >>>>>>>>>>>>> Thank you for the test code. 
I have a fix in the branch barry/2023-04-29/fix-pcasmcreatesubdomains2d with merge request https://gitlab.com/petsc/petsc/-/merge_requests/6394 >>>>>>>>>>>>> >>>>>>>>>>>>> The functions did not have proper Fortran stubs and interfaces so I had to provide them manually in the new branch. >>>>>>>>>>>>> >>>>>>>>>>>>> Use >>>>>>>>>>>>> >>>>>>>>>>>>> git fetch >>>>>>>>>>>>> git checkout barry/2023-04-29/fix-pcasmcreatesubdomains2d >>>>>>>>>>>>> ./configure etc >>>>>>>>>>>>> >>>>>>>>>>>>> Your now working test code is in src/ksp/ksp/tests/ex71f.F90 I had to change things slightly and I updated the error handling for the latest version. >>>>>>>>>>>>> >>>>>>>>>>>>> Please let us know if you have any later questions. >>>>>>>>>>>>> >>>>>>>>>>>>> Barry >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> On Apr 28, 2023, at 12:07 PM, LEONARDO MUTTI > wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> Hello. I am having a hard time understanding the index sets to feed PCGASMSetSubdomains, and I am working in Fortran (as a PETSc novice). To get more intuition on how the IS objects behave I tried the following minimal (non) working example, which should tile a 16x16 matrix into 16 square, non-overlapping submatrices: >>>>>>>>>>>>>> >>>>>>>>>>>>>> #include >>>>>>>>>>>>>> #include >>>>>>>>>>>>>> #include >>>>>>>>>>>>>> USE petscmat >>>>>>>>>>>>>> USE petscksp >>>>>>>>>>>>>> USE petscpc >>>>>>>>>>>>>> >>>>>>>>>>>>>> Mat :: A >>>>>>>>>>>>>> PetscInt :: M, NSubx, dof, overlap, NSub >>>>>>>>>>>>>> INTEGER :: I,J >>>>>>>>>>>>>> PetscErrorCode :: ierr >>>>>>>>>>>>>> PetscScalar :: v >>>>>>>>>>>>>> KSP :: ksp >>>>>>>>>>>>>> PC :: pc >>>>>>>>>>>>>> IS :: subdomains_IS, inflated_IS >>>>>>>>>>>>>> >>>>>>>>>>>>>> call PetscInitialize(PETSC_NULL_CHARACTER , ierr) >>>>>>>>>>>>>> >>>>>>>>>>>>>> !-----Create a dummy matrix >>>>>>>>>>>>>> M = 16 >>>>>>>>>>>>>> call MatCreateAIJ(MPI_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, >>>>>>>>>>>>>> & M, M, >>>>>>>>>>>>>> & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, >>>>>>>>>>>>>> & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, >>>>>>>>>>>>>> & A, ierr) >>>>>>>>>>>>>> >>>>>>>>>>>>>> DO I=1,M >>>>>>>>>>>>>> DO J=1,M >>>>>>>>>>>>>> v = I*J >>>>>>>>>>>>>> CALL MatSetValue (A,I-1,J-1,v, >>>>>>>>>>>>>> & INSERT_VALUES , ierr) >>>>>>>>>>>>>> END DO >>>>>>>>>>>>>> END DO >>>>>>>>>>>>>> >>>>>>>>>>>>>> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY , ierr) >>>>>>>>>>>>>> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY , ierr) >>>>>>>>>>>>>> >>>>>>>>>>>>>> !-----Create KSP and PC >>>>>>>>>>>>>> call KSPCreate(PETSC_COMM_WORLD,ksp, ierr) >>>>>>>>>>>>>> call KSPSetOperators(ksp,A,A, ierr) >>>>>>>>>>>>>> call KSPSetType(ksp,"bcgs",ierr) >>>>>>>>>>>>>> call KSPGetPC(ksp,pc,ierr) >>>>>>>>>>>>>> call KSPSetUp(ksp, ierr) >>>>>>>>>>>>>> call PCSetType(pc,PCGASM, ierr) >>>>>>>>>>>>>> call PCSetUp(pc , ierr) >>>>>>>>>>>>>> >>>>>>>>>>>>>> !-----GASM setup >>>>>>>>>>>>>> NSubx = 4 >>>>>>>>>>>>>> dof = 1 >>>>>>>>>>>>>> overlap = 0 >>>>>>>>>>>>>> >>>>>>>>>>>>>> call PCGASMCreateSubdomains2D(pc, >>>>>>>>>>>>>> & M, M, >>>>>>>>>>>>>> & NSubx, NSubx, >>>>>>>>>>>>>> & dof, overlap, >>>>>>>>>>>>>> & NSub, subdomains_IS, inflated_IS, ierr) >>>>>>>>>>>>>> >>>>>>>>>>>>>> call ISView(subdomains_IS, PETSC_VIEWER_STDOUT_WORLD, ierr) >>>>>>>>>>>>>> >>>>>>>>>>>>>> call KSPDestroy(ksp, ierr) >>>>>>>>>>>>>> call PetscFinalize(ierr) >>>>>>>>>>>>>> >>>>>>>>>>>>>> Running this on one processor, I get NSub = 4. >>>>>>>>>>>>>> If PCASM and PCASMCreateSubdomains2D are used instead, I get NSub = 16 as expected. 
>>>>>>>>>>>>>> Moreover, I get in the end "forrtl: severe (157): Program Exception - access violation". So: >>>>>>>>>>>>>> 1) why do I get two different results with ASM, and GASM? >>>>>>>>>>>>>> 2) why do I get access violation and how can I solve this? >>>>>>>>>>>>>> In fact, in C, subdomains_IS, inflated_IS should pointers to IS objects. As I see on the Fortran interface, the arguments to PCGASMCreateSubdomains2D are IS objects: >>>>>>>>>>>>>> >>>>>>>>>>>>>> subroutine PCGASMCreateSubdomains2D(a,b,c,d,e,f,g,h,i,j,z) >>>>>>>>>>>>>> import tPC,tIS >>>>>>>>>>>>>> PC a ! PC >>>>>>>>>>>>>> PetscInt b ! PetscInt >>>>>>>>>>>>>> PetscInt c ! PetscInt >>>>>>>>>>>>>> PetscInt d ! PetscInt >>>>>>>>>>>>>> PetscInt e ! PetscInt >>>>>>>>>>>>>> PetscInt f ! PetscInt >>>>>>>>>>>>>> PetscInt g ! PetscInt >>>>>>>>>>>>>> PetscInt h ! PetscInt >>>>>>>>>>>>>> IS i ! IS >>>>>>>>>>>>>> IS j ! IS >>>>>>>>>>>>>> PetscErrorCode z >>>>>>>>>>>>>> end subroutine PCGASMCreateSubdomains2D >>>>>>>>>>>>>> Thus: >>>>>>>>>>>>>> 3) what should be inside e.g., subdomains_IS? I expect it to contain, for every created subdomain, the list of rows and columns defining the subblock in the matrix, am I right? >>>>>>>>>>>>>> >>>>>>>>>>>>>> Context: I have a block-tridiagonal system arising from space-time finite elements, and I want to solve it with GMRES+PCGASM preconditioner, where each overlapping submatrix is on the diagonal and of size 3x3 blocks (and spanning multiple processes). This is PETSc 3.17.1 on Windows. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks in advance, >>>>>>>>>>>>>> Leonardo >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>>>>> -- Norbert Wiener >>>>>>>>> >>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>> >>>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From liufield at gmail.com Thu May 18 19:47:12 2023 From: liufield at gmail.com (neil liu) Date: Thu, 18 May 2023 20:47:12 -0400 Subject: [petsc-users] Using dmplexdistribute do parallel FEM code. In-Reply-To: References: Message-ID: Thanks, Matt. I am using the following steps to build a local to global mapping. Step 1) PetscSectionCreate (); PetscSectionSetNumFields(); PetscSectionSetChart (); //Set dof for each node PetscSectionSetup (s); Step 2) PetscCall(DMGetLocalToGlobalMapping(dm, <ogm)); PetscCall(ISLocalToGlobalMappingGetIndices(ltogm, &g_idx)); For 2D mesh from the example, https://wg-beginners.readthedocs.io/en/latest/tutorials/dm/plex/introductory_tutorial_plex.html the map from rank 0 (MPI) and rank 1(MPI) matches well with the global node ordering. But When I tried a 3D gmsh, (2 hexahedra (12 nodes ) were split into 12 tetrahedra). ( 2 processors in total ) Each processor handles 9 nodes separately. When I checked the local to global mapping, It seems the mapping is not right. For example, the mapping array is Local Global (left: local node number; right: global node number). provided. 0 0 1 3 2 5 3 1 4 6 5 8 6 2 7 9 8 11 But the coordinates between the local (left) and global nodes (right) are not the same, meaning they are not matching. Did I miss something? In addition, the simple 3D gmsh file has been attached if you need. 
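For reference, here is a minimal sketch, in C, of the Step 1 / Step 2 pattern described above, combined with the MatSetValuesLocal() assembly suggested in the reply quoted at the bottom of this message. It is not the actual code: the single field, the one-dof-per-vertex layout, and the routine name AssembleSketch are assumptions made purely for illustration.

/* Minimal sketch: attach a PetscSection to the distributed DMPlex, then use the
   DM's local-to-global mapping to assemble element contributions into the global
   matrix.  One field with one dof per vertex is an illustrative assumption. */
#include <petscdmplex.h>

static PetscErrorCode AssembleSketch(DM dm, Mat A)
{
  PetscSection           s;
  ISLocalToGlobalMapping ltogm;
  PetscInt               vStart, vEnd, v;

  PetscFunctionBeginUser;
  /* Step 1: describe the dof layout on the (already distributed) dm */
  PetscCall(PetscSectionCreate(PetscObjectComm((PetscObject)dm), &s));
  PetscCall(PetscSectionSetNumFields(s, 1));
  PetscCall(DMPlexGetDepthStratum(dm, 0, &vStart, &vEnd)); /* depth 0 = vertices */
  PetscCall(PetscSectionSetChart(s, vStart, vEnd));
  for (v = vStart; v < vEnd; ++v) {
    PetscCall(PetscSectionSetDof(s, v, 1));
    PetscCall(PetscSectionSetFieldDof(s, v, 0, 1));
  }
  PetscCall(PetscSectionSetUp(s));
  PetscCall(DMSetLocalSection(dm, s));
  PetscCall(PetscSectionDestroy(&s)); /* the DM keeps its own reference */

  /* Step 2: the mapping translates local (ghosted) dof numbers into global ones */
  PetscCall(DMGetLocalToGlobalMapping(dm, &ltogm));
  PetscCall(MatSetLocalToGlobalMapping(A, ltogm, ltogm));

  /* Element loop (omitted): compute the element matrix elemMat and the local dof
     indices idx of its nodes, then insert with
       MatSetValuesLocal(A, n, idx, n, idx, elemMat, ADD_VALUES);
     so the local-to-global translation happens during insertion. */
  PetscFunctionReturn(PETSC_SUCCESS);
}

With this pattern the element loop never manipulates global indices directly, which is the point of the MatSetValuesLocal() suggestion quoted below.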
*Gmsh file:* $MeshFormat 4.1 0 8 $EndMeshFormat $PhysicalNames 3 2 27 "Excit" 2 28 "PEC" 3 29 "volume" $EndPhysicalNames $Entities 8 12 6 1 1 0 0 0 0 2 0 0.25 0 0 3 0 0.25 0.25 0 4 0 0 0.25 0 5 1 0 0 0 6 1 0.25 0 0 10 1 0.25 0.25 0 14 1 0 0.25 0 1 0 0 0 0 0.25 0 0 2 1 -2 2 0 0.25 0 0 0.25 0.25 0 2 2 -3 3 0 0 0.25 0 0.25 0.25 0 2 3 -4 4 0 0 0 0 0 0.25 0 2 4 -1 6 1 0 0 1 0.25 0 0 2 5 -6 7 1 0.25 0 1 0.25 0.25 0 2 6 -10 8 1 0 0.25 1 0.25 0.25 0 2 10 -14 9 1 0 0 1 0 0.25 0 2 14 -5 11 0 0 0 1 0 0 0 2 1 -5 12 0 0.25 0 1 0.25 0 0 2 2 -6 16 0 0.25 0.25 1 0.25 0.25 0 2 3 -10 20 0 0 0.25 1 0 0.25 0 2 4 -14 1 0 0 0 0 0.25 0.25 1 27 4 1 2 3 4 13 0 0 0 1 0.25 0 1 28 4 1 12 -6 -11 17 0 0.25 0 1 0.25 0.25 0 4 2 16 -7 -12 21 0 0 0.25 1 0.25 0.25 1 28 4 3 20 -8 -16 25 0 0 0 1 0 0.25 0 4 4 11 -9 -20 26 1 0 0 1 0.25 0.25 1 28 4 6 7 8 9 1 0 0 0 1 0.25 0.25 1 29 6 -1 26 13 17 21 25 $EndEntities $Nodes 17 12 1 12 0 1 0 1 1 0 0 0 0 2 0 1 2 0 0.25 0 0 3 0 1 3 0 0.25 0.25 0 4 0 1 4 0 0 0.25 0 5 0 1 5 1 0 0 0 6 0 1 6 1 0.25 0 0 10 0 1 7 1 0.25 0.25 0 14 0 1 8 1 0 0.25 1 11 0 1 9 0.5 0 0 1 12 0 1 10 0.5 0.25 0 1 16 0 1 11 0.5 0.25 0.25 1 20 0 1 12 0.5 0 0.25 2 1 0 0 2 13 0 0 2 21 0 0 2 26 0 0 3 1 0 0 $EndNodes $Elements 5 24 1 24 2 1 2 2 1 1 2 4 2 4 2 3 2 13 2 4 3 9 1 2 4 9 2 10 5 5 9 10 6 5 10 6 2 21 2 4 7 11 3 4 8 11 4 12 9 7 11 12 10 7 12 8 2 26 2 2 11 5 6 8 12 8 6 7 3 1 4 12 13 1 2 4 12 14 10 9 12 2 15 2 9 12 1 16 9 10 12 8 17 6 5 8 10 18 10 5 8 9 19 4 2 3 11 20 10 12 11 2 21 2 12 11 4 22 12 10 11 7 23 6 8 7 10 24 10 8 7 12 $EndElements On Wed, May 17, 2023 at 7:30?PM Matthew Knepley wrote: > On Wed, May 17, 2023 at 6:58?PM neil liu wrote: > >> Dear Petsc developers, >> >> I am writing my own code to calculate the FEM matrix. The following is my >> general framework, >> >> DMPlexCreateGmsh(); >> MPI_Comm_rank (Petsc_comm_world, &rank); >> DMPlexDistribute (.., .., &dmDist); >> >> dm = dmDist; >> //This can create separate dm s for different processors. (reordering.) >> >> MatCreate (Petsc_comm_world, &A) >> // Loop over every tetrahedral element to calculate the local matrix for >> each processor. Then we can get a local matrix A for each processor. >> >> *My question is : it seems we should build a global matrix B (assemble >> all the As for each partition) and then transfer B to KSP. KSP will do the >> parallelization correctly, right? * >> > > I would not suggest this. The more common strategy is to assemble each > element matrix directly into the > global matrix B, by mapping the cell indices directly to global indices > (rather than to local indices in the matrix A). You can do this in two > stages. You can create a LocalToGlobalMapping in PETSc that maps > every local index to a global index. Then you can assemble into B exactly > as you would assemble into A by calling MatSetValuesLocal(). > > DMPlex handles these mappings for you automatically, but I realize that it > is a large number of things to buy into. > > Thanks, > > Matt > > >> If that is right, I should define a whole domain matrix B before the >> partitioning (MatCreate (Petsc_comm_world, &B); ), and then use >> localtoglobal (which petsc function should I use? Do you have any >> examples.) map to add A to B at the right positions (MatSetValues) ? >> >> Does that make sense? >> >> Thanks, >> >> Xiaodong >> >> >> >> >> >> >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. 
> -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From leonardo.mutti01 at universitadipavia.it Thu May 18 20:20:01 2023 From: leonardo.mutti01 at universitadipavia.it (Leonardo Mutti) Date: Fri, 19 May 2023 03:20:01 +0200 Subject: [petsc-users] Understanding index sets for PCGASM In-Reply-To: <07222845-BC1C-4875-BBFC-17B01E08A529@petsc.dev> References: <989A8495-06FF-4D8A-8B45-F3D991D0A486@petsc.dev> <65FAB3E9-8D08-4AEE-874E-636EB2C76A29@petsc.dev> <6CE3B35C-E74E-43B5-A3DF-4D0D77E6A94C@petsc.dev> <07222845-BC1C-4875-BBFC-17B01E08A529@petsc.dev> Message-ID: Many thanks for the hint. As for your last two sentences, I am in a similar situation. So, let me mention some last thoughts, in case they help locate the issue. If not, I will embark on the debugging adventure with Windows debuggers. Consider again the 4x4 matrix (represented by the IS [0,1,2,3], with rows 0,1 on rank 0 and 2,3 on rank 1). If I create the non-inflated IS [0,1,2], [3], with inflated IS [0,1,2], [1,2,3], no problems arise. Here is a comparison with the non-working case: - works: non-inflated IS: [0,1,2], [3], inflated IS: [0,1,2], [1,2,3] - doesn't work: non-inflated IS: [0,1], [2,3], inflated IS: [0,1], [1,2,3] One difference is that in the working case, the non-inflating subdomain [0,1,2], is not on a single process. Moreover, is it possible that the issue has something to do with PetscObjectReference as seen in /src/ksp/pc/impls/gasm/gasm.c line 1732, PCGASMCreateSubdomains2D? Thanks again Il giorno gio 18 mag 2023 alle ore 04:26 Barry Smith ha scritto: > > Yikes. Such huge numbers usually come from integer overflow or memory > corruption. > > The code to decide on the memory that needs allocating is straightforward > > PetscErrorCode MatCreateSubMatrices_MPIAIJ(Mat C, PetscInt ismax, const IS > isrow[], const IS iscol[], MatReuse scall, Mat *submat[]) > { > PetscInt nmax, nstages = 0, i, pos, max_no, nrow, ncol, in[2], > out[2]; > PetscBool rowflag, colflag, wantallmatrix = PETSC_FALSE; > Mat_SeqAIJ *subc; > Mat_SubSppt *smat; > > PetscFunctionBegin; > /* Check for special case: each processor has a single IS */ > if (C->submat_singleis) { /* flag is set in PCSetUp_ASM() to skip > MPI_Allreduce() */ > PetscCall(MatCreateSubMatrices_MPIAIJ_SingleIS(C, ismax, isrow, iscol, > scall, submat)); > C->submat_singleis = PETSC_FALSE; /* resume its default value in case > C will be used for non-single IS */ > PetscFunctionReturn(PETSC_SUCCESS); > } > > /* Collect global wantallmatrix and nstages */ > if (!C->cmap->N) nmax = 20 * 1000000 / sizeof(PetscInt); > else nmax = 20 * 1000000 / (C->cmap->N * sizeof(PetscInt)); > if (!nmax) nmax = 1; > > if (scall == MAT_INITIAL_MATRIX) { > /* Collect global wantallmatrix and nstages */ > if (ismax == 1 && C->rmap->N == C->cmap->N) { > PetscCall(ISIdentity(*isrow, &rowflag)); > PetscCall(ISIdentity(*iscol, &colflag)); > PetscCall(ISGetLocalSize(*isrow, &nrow)); > PetscCall(ISGetLocalSize(*iscol, &ncol)); > if (rowflag && colflag && nrow == C->rmap->N && ncol == C->cmap->N) { > wantallmatrix = PETSC_TRUE; > > PetscCall(PetscOptionsGetBool(((PetscObject)C)->options, > ((PetscObject)C)->prefix, "-use_fast_submatrix", &wantallmatrix, NULL)); > } > } > > /* Determine the number of stages through which submatrices are done > Each stage will extract nmax submatrices. > nmax is determined by the matrix column dimension. 
> If the original matrix has 20M columns, only one submatrix per > stage is allowed, etc. > */ > nstages = ismax / nmax + ((ismax % nmax) ? 1 : 0); /* local nstages */ > > in[0] = -1 * (PetscInt)wantallmatrix; > in[1] = nstages; > PetscCall(MPIU_Allreduce(in, out, 2, MPIU_INT, MPI_MAX, > PetscObjectComm((PetscObject)C))); > wantallmatrix = (PetscBool)(-out[0]); > nstages = out[1]; /* Make sure every processor loops through the > global nstages */ > > } else { /* MAT_REUSE_MATRIX */ > if (ismax) { > subc = (Mat_SeqAIJ *)(*submat)[0]->data; > smat = subc->submatis1; > } else { /* (*submat)[0] is a dummy matrix */ > smat = (Mat_SubSppt *)(*submat)[0]->data; > } > if (!smat) { > /* smat is not generated by > MatCreateSubMatrix_MPIAIJ_All(...,MAT_INITIAL_MATRIX,...) */ > wantallmatrix = PETSC_TRUE; > } else if (smat->singleis) { > PetscCall(MatCreateSubMatrices_MPIAIJ_SingleIS(C, ismax, isrow, > iscol, scall, submat)); > PetscFunctionReturn(PETSC_SUCCESS); > } else { > nstages = smat->nstages; > } > } > > if (wantallmatrix) { > PetscCall(MatCreateSubMatrix_MPIAIJ_All(C, MAT_GET_VALUES, scall, > submat)); > PetscFunctionReturn(PETSC_SUCCESS); > } > > /* Allocate memory to hold all the submatrices and dummy submatrices */ > if (scall == MAT_INITIAL_MATRIX) PetscCall(PetscCalloc1(ismax + nstages, > submat)); > > for (i = 0, pos = 0; i < nstages; i++) { > if (pos + nmax <= ismax) max_no = nmax; > else if (pos >= ismax) max_no = 0; > else max_no = ismax - pos; > > PetscCall(MatCreateSubMatrices_MPIAIJ_Local(C, max_no, isrow + pos, > iscol + pos, scall, *submat + pos)); > if (!max_no) { > if (scall == MAT_INITIAL_MATRIX) { /* submat[pos] is a dummy matrix > */ > smat = (Mat_SubSppt *)(*submat)[pos]->data; > smat->nstages = nstages; > } > pos++; /* advance to next dummy matrix if any */ > } else pos += max_no; > } > > if (ismax && scall == MAT_INITIAL_MATRIX) { > /* save nstages for reuse */ > subc = (Mat_SeqAIJ *)(*submat)[0]->data; > smat = subc->submatis1; > smat->nstages = nstages; > } > PetscFunctionReturn(PETSC_SUCCESS); > } > > The easiest way to debug would be to put a breakpoint in > MatCreateSubMatrices_MPIAIJ on MPI rank zero and next through the > subroutine to see where the crazy number appears that gets passed down in > the line if (scall == MAT_INITIAL_MATRIX) PetscCall(PetscCalloc1(ismax + > nstages, submat)); where either ismax or nstages has a crazy value. > > If you are using the GNU compilers you can use the command line options > -start_in_debugger noxterm -debugger_ranks 0 to start the Gnu debugger. If > you are using the Microsoft Windows compilers you will need to use their > debugger, I don't know how to do that (and shudder at the thought :-). > > On May 17, 2023, at 7:51 PM, Leonardo Mutti < > leonardo.mutti01 at universitadipavia.it> wrote: > > Thanks for the reply. Even without Valgrind (which I can't use since I'm > on Windows), by further simplifying the example, I was able to have PETSc > display a more informative message. > What I am doing wrong and what should be done differently, this is still > unclear to me. > > The simplified code runs on 2 processors, I built a 4x4 matrix. > The subdomains are now given by [0,1] and [2,3], with [2,3] inflating to > [0,1,2,3]. > > Thank you again. > > Error: > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Out of memory. 
This could be due to allocating > [0]PETSC ERROR: too large an object or bleeding by not properly > [0]PETSC ERROR: destroying unneeded objects. > [0]PETSC ERROR: Memory allocated 0 Memory used by process 0 > [0]PETSC ERROR: Try running with -malloc_dump or -malloc_view for info. > [0]PETSC ERROR: Memory requested 9437902811936987136 > [0]PETSC ERROR: WARNING! There are option(s) set that were not used! Could > be the program crashed before they were used or a spelling mistake, etc! > [0]PETSC ERROR: Option left: name:-pc_gasm_view_subdomains value: 1 > source: code > [0]PETSC ERROR: Option left: name:-sub_ksp_type value: gmres source: code > [0]PETSC ERROR: Option left: name:-sub_pc_type value: none source: code > [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. > [0]PETSC ERROR: Petsc Development GIT revision: v3.19.1-234-g43977f8d16 > GIT Date: 2023-05-08 14:50:03 +0000 > [...] > [0]PETSC ERROR: #1 PetscMallocAlign() at > ...\Sources\Git\Test-FV\PETSC-~1\src\sys\memory\mal.c:66 > [0]PETSC ERROR: #2 PetscMallocA() at > ...\Sources\Git\Test-FV\PETSC-~1\src\sys\memory\mal.c:411 > [0]PETSC ERROR: #3 MatCreateSubMatrices_MPIAIJ() at > ...\Sources\Git\Test-FV\PETSC-~1\src\mat\impls\aij\mpi\mpiov.c:2025 > [0]PETSC ERROR: #4 MatCreateSubMatricesMPI_MPIXAIJ() at > ...\Sources\Git\Test-FV\PETSC-~1\src\mat\impls\aij\mpi\mpiov.c:3136 > [0]PETSC ERROR: #5 MatCreateSubMatricesMPI_MPIAIJ() at > ...\Sources\Git\Test-FV\PETSC-~1\src\mat\impls\aij\mpi\mpiov.c:3208 > [0]PETSC ERROR: #6 MatCreateSubMatricesMPI() at > ...\Sources\Git\Test-FV\PETSC-~1\src\mat\INTERF~1\matrix.c:7071 > [0]PETSC ERROR: #7 PCSetUp_GASM() at > ...\Sources\Git\Test-FV\PETSC-~1\src\ksp\pc\impls\gasm\gasm.c:556 > [0]PETSC ERROR: #8 PCSetUp() at > ...\Sources\Git\Test-FV\PETSC-~1\src\ksp\pc\INTERF~1\precon.c:994 > [0]PETSC ERROR: #9 KSPSetUp() at > ...\Sources\Git\Test-FV\PETSC-~1\src\ksp\ksp\INTERF~1\itfunc.c:406 > > > Code, the important bit is after ! GASM, SETTING SUBDOMAINS: > > Mat :: A > Vec :: b > PetscInt :: M,N_blocks,block_size,NSub,I > PetscErrorCode :: ierr > PetscScalar :: v > KSP :: ksp > PC :: pc > IS :: subdomains_IS(2), inflated_IS(2) > PetscInt :: NMPI,MYRANK,IERMPI > INTEGER :: start > > call PetscInitialize(PETSC_NULL_CHARACTER, ierr) > call PetscLogDefaultBegin(ierr) > call MPI_COMM_SIZE(MPI_COMM_WORLD, NMPI, IERMPI) > CALL MPI_COMM_RANK(MPI_COMM_WORLD, MYRANK,IERMPI) > > N_blocks = 2 > block_size = 2 > M = N_blocks * block_size > > > ! INTRO: create matrix and right hand side, create IS > > call MatCreateAIJ(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, > & M, M, PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, > & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER,A, ierr) > call VecCreate(PETSC_COMM_WORLD,b,ierr) > call VecSetSizes(b, PETSC_DECIDE, M,ierr) > call VecSetFromOptions(b,ierr) > > DO I=(MYRANK*(M/NMPI)),((MYRANK+1)*(M/NMPI)-1) > > ! Set matrix > v=1 > call MatSetValue(A, I, I, v, INSERT_VALUES, ierr) > IF (I-block_size .GE. 0) THEN > v=-1 > call MatSetValue(A, I, I-block_size, v, INSERT_VALUES, ierr) > ENDIF > ! Set rhs > v = I > call VecSetValue(b,I,v, INSERT_VALUES,ierr) > > END DO > > call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr) > call VecAssemblyBegin(b,ierr) > call VecAssemblyEnd(b,ierr) > > ! FIRST KSP/PC SETUP > > call KSPCreate(PETSC_COMM_WORLD, ksp, ierr) > call KSPSetOperators(ksp, A, A, ierr) > call KSPSetType(ksp, 'preonly', ierr) > call KSPGetPC(ksp, pc, ierr) > call PCSetType(pc, PCGASM, ierr) > > > ! 
GASM, SETTING SUBDOMAINS > > if (myrank == 0) then > call ISCreateStride(PETSC_COMM_SELF, 2, 0, 1, subdomains_IS(1), > ierr) > call ISCreateStride(PETSC_COMM_WORLD, 0, 0, 1, subdomains_IS(2), > ierr) > call ISCreateStride(PETSC_COMM_SELF, 2, 0, 1, inflated_IS(1), > ierr) > call ISCreateStride(PETSC_COMM_WORLD, 2, 0, 1, inflated_IS(2), > ierr) > start = 1 > NSub = 2 > else > call ISCreateStride(PETSC_COMM_WORLD, 2, 2, 1, subdomains_IS(2), > ierr) > call ISCreateStride(PETSC_COMM_WORLD, 2, 2, 1, inflated_IS(2), > ierr) > start = 2 > NSub = 1 > endif > > call > PCGASMSetSubdomains(pc,NSub,subdomains_IS(start:2),inflated_IS(start:2),ierr) > call > PCGASMDestroySubdomains(NSub,subdomains_IS(start:2),inflated_IS(start:2),ierr) > > ! GASM: SET SUBSOLVERS > > call PetscOptionsSetValue(PETSC_NULL_OPTIONS,"-sub_ksp_type", > "gmres", ierr) > call PetscOptionsSetValue(PETSC_NULL_OPTIONS,"-sub_pc_type", "none", > ierr) > call > PetscOptionsSetValue(PETSC_NULL_OPTIONS,"-pc_gasm_view_subdomains", "1", > ierr) > > call KSPSetUp(ksp, ierr) > call PCSetUp(pc, ierr) > call KSPSetFromOptions(ksp, ierr) > call PCSetFromOptions(pc, ierr) > call KSPView(ksp,PETSC_VIEWER_STDOUT_WORLD, ierr) > > call MatDestroy(A, ierr) > call PetscFinalize(ierr) > > > Il giorno mer 17 mag 2023 alle ore 21:55 Barry Smith > ha scritto: > >> >> >> On May 17, 2023, at 11:10 AM, Leonardo Mutti < >> leonardo.mutti01 at universitadipavia.it> wrote: >> >> Dear developers, let me kindly ask for your help again. >> In the following snippet, a bi-diagonal matrix A is set up. It measures >> 8x8 blocks, each block is 2x2 elements. I would like to create the >> correct IS objects for PCGASM. >> The non-overlapping IS should be: [*0,1*], [*2,3*],[*4,5*], ..., [*14,15* >> ]. The overlapping IS should be: [*0,1*], [0,1,*2,3*], [2,3,*4,5*], ..., >> [12,13,*14,15*] >> I am running the code with 4 processors. For some reason, after calling PCGASMDestroySubdomains >> the code crashes with severe (157): Program Exception - access violation. >> A visual inspection of the indices using ISView looks good. >> >> >> Likely memory corruption or use of an object or an array that was >> already freed. Best to use Valgrind to find the exact location of the mess. >> >> >> Thanks again, >> Leonardo >> >> Mat :: A >> Vec :: b >> PetscInt :: >> M,N_blocks,block_size,I,J,NSub,converged_reason,srank,erank,color,subcomm >> PetscMPIInt :: size >> PetscErrorCode :: ierr >> PetscScalar :: v >> KSP :: ksp >> PC :: pc >> IS,ALLOCATABLE :: subdomains_IS(:), inflated_IS(:) >> PetscInt :: NMPI,MYRANK,IERMPI >> INTEGER :: IS_counter, is_start, is_end >> >> call PetscInitialize(PETSC_NULL_CHARACTER, ierr) >> call PetscLogDefaultBegin(ierr) >> call MPI_COMM_SIZE(MPI_COMM_WORLD, NMPI, IERMPI) >> CALL MPI_COMM_RANK(MPI_COMM_WORLD, MYRANK,IERMPI) >> >> N_blocks = 8 >> block_size = 2 >> M = N_blocks * block_size >> >> ALLOCATE(subdomains_IS(N_blocks)) >> ALLOCATE(inflated_IS(N_blocks)) >> >> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >> ! ASSUMPTION: no block spans more than one rank (the inflated >> blocks can) >> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >> >> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >> ! INTRO: create matrix and right hand side >> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >> >> ! How many inflated blocks span more than one rank? NMPI-1 ! 
>> >> call MatCreateAIJ(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, >> & M, M, PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, >> & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER,A, ierr) >> call VecCreate(PETSC_COMM_WORLD,b,ierr) >> call VecSetSizes(b, PETSC_DECIDE, M,ierr) >> call VecSetFromOptions(b,ierr) >> >> DO I=(MYRANK*(M/NMPI)),((MYRANK+1)*(M/NMPI)-1) >> >> ! Set matrix >> v=1 >> call MatSetValue(A, I, I, v, INSERT_VALUES, ierr) >> IF (I-block_size .GE. 0) THEN >> v=-1 >> call MatSetValue(A, I, I-block_size, v, INSERT_VALUES, ierr) >> ENDIF >> ! Set rhs >> v = I >> call VecSetValue(b,I,v, INSERT_VALUES,ierr) >> >> END DO >> >> call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr) >> call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr) >> call VecAssemblyBegin(b,ierr) >> call VecAssemblyEnd(b,ierr) >> >> >> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >> ! FIRST KSP/PC SETUP >> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >> >> call KSPCreate(PETSC_COMM_WORLD, ksp, ierr) >> call KSPSetOperators(ksp, A, A, ierr) >> call KSPSetType(ksp, 'preonly', ierr) >> call KSPGetPC(ksp, pc, ierr) >> call PCSetType(pc, PCGASM, ierr) >> >> >> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >> !! GASM, SETTING SUBDOMAINS >> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >> >> DO IS_COUNTER=1,N_blocks >> >> srank = MAX(((IS_COUNTER-2)*block_size)/(M/NMPI),0) ! start >> rank reached by inflated block >> erank = MIN(((IS_COUNTER-1)*block_size)/(M/NMPI),NMPI-1) ! end >> rank reached by inflated block. Coincides with rank containing non-inflated >> block >> >> ! Create subcomms >> color = MPI_UNDEFINED >> IF (myrank == srank .or. myrank == erank) THEN >> color = 1 >> ENDIF >> call MPI_Comm_split(MPI_COMM_WORLD,color,MYRANK,subcomm,ierr) >> >> >> ! Create IS >> IF (srank .EQ. erank) THEN ! Block and overlap are on the >> same rank >> IF (MYRANK .EQ. srank) THEN >> call >> ISCreateStride(PETSC_COMM_SELF,block_size,(IS_COUNTER-1)*block_size,1,subdomains_IS(IS_COUNTER),ierr) >> IF (IS_COUNTER .EQ. 1) THEN ! the first block is not >> inflated >> call >> ISCreateStride(PETSC_COMM_SELF,block_size,(IS_COUNTER-1)*block_size,1,inflated_IS(IS_COUNTER),ierr) >> ELSE >> call >> ISCreateStride(PETSC_COMM_SELF,2*block_size,(IS_COUNTER-2)*block_size,1,inflated_IS(IS_COUNTER),ierr) >> ENDIF >> ENDIF >> else ! Block and overlap not on the same rank >> if (myrank == erank) then ! the block >> call ISCreateStride >> (subcomm,block_size,(IS_COUNTER-1)*block_size,1,subdomains_IS(IS_COUNTER),ierr) >> call ISCreateStride >> (subcomm,block_size,(IS_COUNTER-1)*block_size,1,inflated_IS(IS_COUNTER),ierr) >> endif >> if (myrank == srank) then ! the overlap >> call ISCreateStride >> (subcomm,block_size,(IS_COUNTER-2)*block_size,1,inflated_IS(IS_COUNTER),ierr) >> call ISCreateStride >> (subcomm,0,(IS_COUNTER-1)*block_size,1,subdomains_IS(IS_COUNTER),ierr) >> endif >> endif >> >> call MPI_Comm_free(subcomm, ierr) >> END DO >> >> ! 
Set the domains/subdomains >> NSub = N_blocks/NMPI >> is_start = 1 + myrank * NSub >> is_end = min(is_start + NSub, N_blocks) >> if (myrank + 1 < NMPI) then >> NSub = NSub + 1 >> endif >> >> call >> PCGASMSetSubdomains(pc,NSub,subdomains_IS(is_start:is_end),inflated_IS(is_start:is_end),ierr) >> call >> PCGASMDestroySubdomains(NSub,subdomains_IS(is_start:is_end),inflated_IS(is_start:is_end),ierr) >> >> call PetscOptionsSetValue(PETSC_NULL_OPTIONS,"-sub_ksp_type", >> "gmres", ierr) >> call PetscOptionsSetValue(PETSC_NULL_OPTIONS,"-sub_pc_type", >> "none", ierr) >> call >> PetscOptionsSetValue(PETSC_NULL_OPTIONS,"-pc_gasm_view_subdomains", "1", >> ierr) >> >> call KSPSetUp(ksp, ierr) >> call PCSetUp(pc, ierr) >> call KSPSetFromOptions(ksp, ierr) >> call PCSetFromOptions(pc, ierr) >> >> call KSPView(ksp,PETSC_VIEWER_STDOUT_WORLD, ierr) >> >> >> Il giorno mer 10 mag 2023 alle ore 03:02 Barry Smith >> ha scritto: >> >>> >>> >>> On May 9, 2023, at 4:58 PM, LEONARDO MUTTI < >>> leonardo.mutti01 at universitadipavia.it> wrote: >>> >>> In my notation diag(1,1) means a diagonal 2x2 matrix with 1,1 on the >>> diagonal, submatrix in the 8x8 diagonal matrix diag(1,1,2,2,...,2). >>> Am I then correct that the IS representing diag(1,1) is 0,1, and that >>> diag(2,2,...,2) is represented by 2,3,4,5,6,7? >>> >>> >>> I believe so >>> >>> Thanks, >>> Leonardo >>> >>> Il mar 9 mag 2023, 20:45 Barry Smith ha scritto: >>> >>>> >>>> It is simplier than you are making it out to be. Each IS[] is a list >>>> of rows (and columns) in the sub (domain) matrix. In your case with the >>>> matrix of 144 by 144 the indices will go from 0 to 143. >>>> >>>> In your simple Fortran code you have a completely different problem. >>>> A matrix with 8 rows and columns. In that case if you want the first IS to >>>> represent just the first row (and column) in the matrix then it should >>>> contain only 0. The second submatrix which is all rows (but the first) >>>> should have 1,2,3,4,5,6,7 >>>> >>>> I do not understand why your code has >>>> >>>> indices_first_domain = [0,1,8,9] ! corresponds to diag(1,1) >>>>>>> >>>>>> >>>> it should just be 0 >>>> >>>> >>>> >>>> >>>> >>>> On May 9, 2023, at 12:44 PM, LEONARDO MUTTI < >>>> leonardo.mutti01 at universitadipavia.it> wrote: >>>> >>>> Partial typo: I expect 9x(16+16) numbers to be stored in subdomain_IS : >>>> # subdomains x (row indices of the submatrix + col indices of the >>>> submatrix). >>>> >>>> Il giorno mar 9 mag 2023 alle ore 18:31 LEONARDO MUTTI < >>>> leonardo.mutti01 at universitadipavia.it> ha scritto: >>>> >>>>> >>>>> >>>>> ---------- Forwarded message --------- >>>>> Da: LEONARDO MUTTI >>>>> Date: mar 9 mag 2023 alle ore 18:29 >>>>> Subject: Re: [petsc-users] Understanding index sets for PCGASM >>>>> To: Matthew Knepley >>>>> >>>>> >>>>> Thank you for your answer, but I am still confused, sorry. >>>>> Consider >>>>> https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/ksp/tests/ex71f.F90 on >>>>> one processor. >>>>> Let M=12 for the sake of simplicity, i.e. we deal with a 12x12 2D >>>>> grid, hence, a 144x144 matrix. >>>>> Let NSubx = 3, so that on the grid we do 3 vertical and 3 horizontal >>>>> subdivisions. >>>>> We should obtain 9 subdomains that are grids of 4x4 nodes each, thus >>>>> corresponding to 9 submatrices of size 16x16. 
>>>>> In my run I obtain NSub = 9 (great) and subdomain_IS(i), i=1,...,9, >>>>> reads: >>>>> >>>>> *IS Object: 1 MPI process* >>>>> * type: general* >>>>> *Number of indices in set 16* >>>>> *0 0* >>>>> *1 1* >>>>> *2 2* >>>>> *3 3* >>>>> *4 12* >>>>> *5 13* >>>>> *6 14* >>>>> *7 15* >>>>> *8 24* >>>>> *9 25* >>>>> *10 26* >>>>> *11 27* >>>>> *12 36* >>>>> *13 37* >>>>> *14 38* >>>>> *15 39* >>>>> *IS Object: 1 MPI process* >>>>> * type: general* >>>>> *Number of indices in set 16* >>>>> *0 4* >>>>> *1 5* >>>>> *2 6* >>>>> *3 7* >>>>> *4 16* >>>>> *5 17* >>>>> *6 18* >>>>> *7 19* >>>>> *8 28* >>>>> *9 29* >>>>> *10 30* >>>>> *11 31* >>>>> *12 40* >>>>> *13 41* >>>>> *14 42* >>>>> *15 43* >>>>> *IS Object: 1 MPI process* >>>>> * type: general* >>>>> *Number of indices in set 16* >>>>> *0 8* >>>>> *1 9* >>>>> *2 10* >>>>> *3 11* >>>>> *4 20* >>>>> *5 21* >>>>> *6 22* >>>>> *7 23* >>>>> *8 32* >>>>> *9 33* >>>>> *10 34* >>>>> *11 35* >>>>> *12 44* >>>>> *13 45* >>>>> *14 46* >>>>> *15 47* >>>>> *IS Object: 1 MPI process* >>>>> * type: general* >>>>> *Number of indices in set 16* >>>>> *0 48* >>>>> *1 49* >>>>> *2 50* >>>>> *3 51* >>>>> *4 60* >>>>> *5 61* >>>>> *6 62* >>>>> *7 63* >>>>> *8 72* >>>>> *9 73* >>>>> *10 74* >>>>> *11 75* >>>>> *12 84* >>>>> *13 85* >>>>> *14 86* >>>>> *15 87* >>>>> *IS Object: 1 MPI process* >>>>> * type: general* >>>>> *Number of indices in set 16* >>>>> *0 52* >>>>> *1 53* >>>>> *2 54* >>>>> *3 55* >>>>> *4 64* >>>>> *5 65* >>>>> *6 66* >>>>> *7 67* >>>>> *8 76* >>>>> *9 77* >>>>> *10 78* >>>>> *11 79* >>>>> *12 88* >>>>> *13 89* >>>>> *14 90* >>>>> *15 91* >>>>> *IS Object: 1 MPI process* >>>>> * type: general* >>>>> *Number of indices in set 16* >>>>> *0 56* >>>>> *1 57* >>>>> *2 58* >>>>> *3 59* >>>>> *4 68* >>>>> *5 69* >>>>> *6 70* >>>>> *7 71* >>>>> *8 80* >>>>> *9 81* >>>>> *10 82* >>>>> *11 83* >>>>> *12 92* >>>>> *13 93* >>>>> *14 94* >>>>> *15 95* >>>>> *IS Object: 1 MPI process* >>>>> * type: general* >>>>> *Number of indices in set 16* >>>>> *0 96* >>>>> *1 97* >>>>> *2 98* >>>>> *3 99* >>>>> *4 108* >>>>> *5 109* >>>>> *6 110* >>>>> *7 111* >>>>> *8 120* >>>>> *9 121* >>>>> *10 122* >>>>> *11 123* >>>>> *12 132* >>>>> *13 133* >>>>> *14 134* >>>>> *15 135* >>>>> *IS Object: 1 MPI process* >>>>> * type: general* >>>>> *Number of indices in set 16* >>>>> *0 100* >>>>> *1 101* >>>>> *2 102* >>>>> *3 103* >>>>> *4 112* >>>>> *5 113* >>>>> *6 114* >>>>> *7 115* >>>>> *8 124* >>>>> *9 125* >>>>> *10 126* >>>>> *11 127* >>>>> *12 136* >>>>> *13 137* >>>>> *14 138* >>>>> *15 139* >>>>> *IS Object: 1 MPI process* >>>>> * type: general* >>>>> *Number of indices in set 16* >>>>> *0 104* >>>>> *1 105* >>>>> *2 106* >>>>> *3 107* >>>>> *4 116* >>>>> *5 117* >>>>> *6 118* >>>>> *7 119* >>>>> *8 128* >>>>> *9 129* >>>>> *10 130* >>>>> *11 131* >>>>> *12 140* >>>>> *13 141* >>>>> *14 142* >>>>> *15 143* >>>>> >>>>> As you said, no number here reaches 144. >>>>> But the number stored in subdomain_IS are 9x16= #subdomains x 16, >>>>> whereas I would expect, also given your latest reply, 9x16x16x2=#subdomains >>>>> x submatrix height x submatrix width x length of a (row,column) pair. >>>>> It would really help me if you could briefly explain how the output >>>>> above encodes the subdivision into subdomains. 
>>>>> Many thanks again, >>>>> Leonardo >>>>> >>>>> >>>>> >>>>> Il giorno mar 9 mag 2023 alle ore 16:24 Matthew Knepley < >>>>> knepley at gmail.com> ha scritto: >>>>> >>>>>> On Tue, May 9, 2023 at 10:05?AM LEONARDO MUTTI < >>>>>> leonardo.mutti01 at universitadipavia.it> wrote: >>>>>> >>>>>>> Great thanks! I can now successfully run >>>>>>> https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/ksp/tests/ex71f.F90 >>>>>>> . >>>>>>> >>>>>>> Going forward with my experiments, let me post a new code snippet >>>>>>> (very similar to ex71f.F90) that I cannot get to work, probably I must be >>>>>>> setting up the IS objects incorrectly. >>>>>>> >>>>>>> I have an 8x8 matrix A=diag(1,1,2,2,...,2) and a >>>>>>> vector b=(0.5,...,0.5). We have only one processor, and I want to solve >>>>>>> Ax=b using GASM. In particular, KSP is set to preonly, GASM is the >>>>>>> preconditioner and it uses on each submatrix an lu direct solver (sub_ksp = >>>>>>> preonly, sub_pc = lu). >>>>>>> >>>>>>> For the GASM algorithm, I divide A into diag(1,1) and >>>>>>> diag(2,2,...,2). For simplicity I set 0 overlap. Now I want to use GASM to >>>>>>> solve Ax=b. The code follows. >>>>>>> >>>>>>> #include >>>>>>> #include >>>>>>> #include >>>>>>> USE petscmat >>>>>>> USE petscksp >>>>>>> USE petscpc >>>>>>> USE MPI >>>>>>> >>>>>>> Mat :: A >>>>>>> Vec :: b, x >>>>>>> PetscInt :: M, I, J, ISLen, NSub >>>>>>> PetscMPIInt :: size >>>>>>> PetscErrorCode :: ierr >>>>>>> PetscScalar :: v >>>>>>> KSP :: ksp >>>>>>> PC :: pc >>>>>>> IS :: subdomains_IS(2), inflated_IS(2) >>>>>>> PetscInt,DIMENSION(4) :: indices_first_domain >>>>>>> PetscInt,DIMENSION(36) :: indices_second_domain >>>>>>> >>>>>>> call PetscInitialize(PETSC_NULL_CHARACTER, ierr) >>>>>>> call MPI_Comm_size(PETSC_COMM_WORLD, size, ierr) >>>>>>> >>>>>>> >>>>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>>>>> ! INTRO: create matrix and right hand side >>>>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>>>>> >>>>>>> WRITE(*,*) "Assembling A,b" >>>>>>> >>>>>>> M = 8 >>>>>>> call MatCreateAIJ(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, >>>>>>> & M, M, PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, >>>>>>> & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER,A, ierr) >>>>>>> DO I=1,M >>>>>>> DO J=1,M >>>>>>> IF ((I .EQ. J) .AND. (I .LE. 2 )) THEN >>>>>>> v = 1 >>>>>>> ELSE IF ((I .EQ. J) .AND. (I .GT. 2 )) THEN >>>>>>> v = 2 >>>>>>> ELSE >>>>>>> v = 0 >>>>>>> ENDIF >>>>>>> call MatSetValue(A, I-1, J-1, v, INSERT_VALUES, ierr) >>>>>>> END DO >>>>>>> END DO >>>>>>> >>>>>>> call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr) >>>>>>> call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr) >>>>>>> >>>>>>> call VecCreate(PETSC_COMM_WORLD,b,ierr) >>>>>>> call VecSetSizes(b, PETSC_DECIDE, M,ierr) >>>>>>> call VecSetFromOptions(b,ierr) >>>>>>> >>>>>>> do I=1,M >>>>>>> v = 0.5 >>>>>>> call VecSetValue(b,I-1,v, INSERT_VALUES,ierr) >>>>>>> end do >>>>>>> >>>>>>> call VecAssemblyBegin(b,ierr) >>>>>>> call VecAssemblyEnd(b,ierr) >>>>>>> >>>>>>> >>>>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>>>>> ! FIRST KSP/PC SETUP >>>>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>>>>> >>>>>>> WRITE(*,*) "KSP/PC first setup" >>>>>>> >>>>>>> call KSPCreate(PETSC_COMM_WORLD, ksp, ierr) >>>>>>> call KSPSetOperators(ksp, A, A, ierr) >>>>>>> call KSPSetType(ksp, 'preonly', ierr) >>>>>>> call KSPGetPC(ksp, pc, ierr) >>>>>>> call KSPSetUp(ksp, ierr) >>>>>>> call PCSetType(pc, PCGASM, ierr) >>>>>>> >>>>>>> >>>>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>>>>> ! GASM, SETTING SUBDOMAINS >>>>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
>>>>>>> >>>>>>> WRITE(*,*) "Setting GASM subdomains" >>>>>>> >>>>>>> ! Let's create the subdomain IS and inflated_IS >>>>>>> ! They are equal if no overlap is present >>>>>>> ! They are 1: 0,1,8,9 >>>>>>> ! 2: 10,...,15,18,...,23,...,58,...,63 >>>>>>> >>>>>>> indices_first_domain = [0,1,8,9] ! corresponds to diag(1,1) >>>>>>> do I=0,5 >>>>>>> do J=0,5 >>>>>>> indices_second_domain(I*6+1+J) = 18 + J + 8*I ! >>>>>>> corresponds to diag(2,2,...,2) >>>>>>> !WRITE(*,*) I*6+1+J, 18 + J + 8*I >>>>>>> end do >>>>>>> end do >>>>>>> >>>>>>> ! Convert into IS >>>>>>> ISLen = 4 >>>>>>> call >>>>>>> ISCreateGeneral(PETSC_COMM_WORLD,ISLen,indices_first_domain, >>>>>>> & PETSC_COPY_VALUES, subdomains_IS(1), ierr) >>>>>>> call >>>>>>> ISCreateGeneral(PETSC_COMM_WORLD,ISLen,indices_first_domain, >>>>>>> & PETSC_COPY_VALUES, inflated_IS(1), ierr) >>>>>>> ISLen = 36 >>>>>>> call >>>>>>> ISCreateGeneral(PETSC_COMM_WORLD,ISLen,indices_second_domain, >>>>>>> & PETSC_COPY_VALUES, subdomains_IS(2), ierr) >>>>>>> call >>>>>>> ISCreateGeneral(PETSC_COMM_WORLD,ISLen,indices_second_domain, >>>>>>> & PETSC_COPY_VALUES, inflated_IS(2), ierr) >>>>>>> >>>>>>> NSub = 2 >>>>>>> call PCGASMSetSubdomains(pc,NSub, >>>>>>> & subdomains_IS,inflated_IS,ierr) >>>>>>> call PCGASMDestroySubdomains(NSub, >>>>>>> & subdomains_IS,inflated_IS,ierr) >>>>>>> >>>>>>> >>>>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>>>>> ! GASM: SET SUBSOLVERS >>>>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>>>>> >>>>>>> WRITE(*,*) "Setting subsolvers for GASM" >>>>>>> >>>>>>> call PCSetUp(pc, ierr) ! should I add this? >>>>>>> >>>>>>> call PetscOptionsSetValue(PETSC_NULL_OPTIONS, >>>>>>> & "-sub_pc_type", "lu", ierr) >>>>>>> call PetscOptionsSetValue(PETSC_NULL_OPTIONS, >>>>>>> & "-sub_ksp_type", "preonly", ierr) >>>>>>> >>>>>>> call KSPSetFromOptions(ksp, ierr) >>>>>>> call PCSetFromOptions(pc, ierr) >>>>>>> >>>>>>> >>>>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>>>>> ! DUMMY SOLUTION: DID IT WORK? >>>>>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! >>>>>>> >>>>>>> WRITE(*,*) "Solve" >>>>>>> >>>>>>> call VecDuplicate(b,x,ierr) >>>>>>> call KSPSolve(ksp,b,x,ierr) >>>>>>> >>>>>>> call MatDestroy(A, ierr) >>>>>>> call KSPDestroy(ksp, ierr) >>>>>>> call PetscFinalize(ierr) >>>>>>> >>>>>>> This code is failing in multiple points. At call PCSetUp(pc, ierr) >>>>>>> it produces: >>>>>>> >>>>>>> *[0]PETSC ERROR: Argument out of range* >>>>>>> *[0]PETSC ERROR: Scatter indices in ix are out of range* >>>>>>> *...* >>>>>>> *[0]PETSC ERROR: #1 VecScatterCreate() at >>>>>>> ***\src\vec\is\sf\INTERF~1\vscat.c:736* >>>>>>> *[0]PETSC ERROR: #2 PCSetUp_GASM() at >>>>>>> ***\src\ksp\pc\impls\gasm\gasm.c:433* >>>>>>> *[0]PETSC ERROR: #3 PCSetUp() at >>>>>>> ***\src\ksp\pc\INTERF~1\precon.c:994* >>>>>>> >>>>>>> And at call KSPSolve(ksp,b,x,ierr) it produces: >>>>>>> >>>>>>> *forrtl: severe (157): Program Exception - access violation* >>>>>>> >>>>>>> >>>>>>> The index sets are setup coherently with the outputs of e.g. >>>>>>> https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/ksp/tests/output/ex71f_1.out: >>>>>>> in particular each element of the matrix A corresponds to a number from 0 >>>>>>> to 63. >>>>>>> >>>>>> >>>>>> This is not correct, I believe. The indices are row/col indices, not >>>>>> indices into dense blocks, so for >>>>>> your example, they are all in [0, 8]. >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>>> Note that each submatrix does not represent some physical subdomain, >>>>>>> the subdivision is just at the algebraic level. 
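(To make Barry's and Matt's corrections above concrete, a minimal sketch -- my own, not code taken from the thread or from ex71f.F90 -- of how the two index sets would look once the entries are understood as row/column indices of the 8x8 matrix A itself, i.e. numbers in 0..7, rather than offsets into a dense 8x8 layout: diag(1,1) is rows/columns 0,1 and diag(2,...,2) is rows/columns 2,...,7.

      ! hedged sketch: entries are row/column numbers of A, so they all stay below 8
      PetscInt,DIMENSION(2) :: first_rows
      PetscInt,DIMENSION(6) :: second_rows

      first_rows  = [0, 1]               ! diag(1,1)
      second_rows = [2, 3, 4, 5, 6, 7]   ! diag(2,2,...,2)

      call ISCreateGeneral(PETSC_COMM_WORLD, 2, first_rows,
     &     PETSC_COPY_VALUES, subdomains_IS(1), ierr)
      call ISCreateGeneral(PETSC_COMM_WORLD, 6, second_rows,
     &     PETSC_COPY_VALUES, subdomains_IS(2), ierr)

With zero overlap the inflated_IS(1:2) would be built from the same two arrays.)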
>>>>>>> I thus have the following questions: >>>>>>> >>>>>>> - is this the correct way of creating the IS objects, given my >>>>>>> objective at the beginning of the email? Is the ordering correct? >>>>>>> - what am I doing wrong that is generating the above errors? >>>>>>> >>>>>>> Thanks for the patience and the time. >>>>>>> Best, >>>>>>> Leonardo >>>>>>> >>>>>>> Il giorno ven 5 mag 2023 alle ore 18:43 Barry Smith < >>>>>>> bsmith at petsc.dev> ha scritto: >>>>>>> >>>>>>>> >>>>>>>> Added in *barry/2023-05-04/add-pcgasm-set-subdomains *see also >>>>>>>> https://gitlab.com/petsc/petsc/-/merge_requests/6419 >>>>>>>> >>>>>>>> Barry >>>>>>>> >>>>>>>> >>>>>>>> On May 4, 2023, at 11:23 AM, LEONARDO MUTTI < >>>>>>>> leonardo.mutti01 at universitadipavia.it> wrote: >>>>>>>> >>>>>>>> Thank you for the help. >>>>>>>> Adding to my example: >>>>>>>> >>>>>>>> >>>>>>>> * call PCGASMSetSubdomains(pc,NSub, subdomains_IS, >>>>>>>> inflated_IS,ierr) call >>>>>>>> PCGASMDestroySubdomains(NSub,subdomains_IS,inflated_IS,ierr)* >>>>>>>> results in: >>>>>>>> >>>>>>>> * Error LNK2019 unresolved external symbol >>>>>>>> PCGASMDESTROYSUBDOMAINS referenced in function ... * >>>>>>>> >>>>>>>> * Error LNK2019 unresolved external symbol PCGASMSETSUBDOMAINS >>>>>>>> referenced in function ... * >>>>>>>> I'm not sure if the interfaces are missing or if I have a >>>>>>>> compilation problem. >>>>>>>> Thank you again. >>>>>>>> Best, >>>>>>>> Leonardo >>>>>>>> >>>>>>>> Il giorno sab 29 apr 2023 alle ore 20:30 Barry Smith < >>>>>>>> bsmith at petsc.dev> ha scritto: >>>>>>>> >>>>>>>>> >>>>>>>>> Thank you for the test code. I have a fix in the branch >>>>>>>>> barry/2023-04-29/fix-pcasmcreatesubdomains2d >>>>>>>>> with >>>>>>>>> merge request https://gitlab.com/petsc/petsc/-/merge_requests/6394 >>>>>>>>> >>>>>>>>> The functions did not have proper Fortran stubs and interfaces >>>>>>>>> so I had to provide them manually in the new branch. >>>>>>>>> >>>>>>>>> Use >>>>>>>>> >>>>>>>>> git fetch >>>>>>>>> git checkout barry/2023-04-29/fix-pcasmcreatesubdomains2d >>>>>>>>> >>>>>>>>> ./configure etc >>>>>>>>> >>>>>>>>> Your now working test code is in src/ksp/ksp/tests/ex71f.F90 I >>>>>>>>> had to change things slightly and I updated the error handling for the >>>>>>>>> latest version. >>>>>>>>> >>>>>>>>> Please let us know if you have any later questions. >>>>>>>>> >>>>>>>>> Barry >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Apr 28, 2023, at 12:07 PM, LEONARDO MUTTI < >>>>>>>>> leonardo.mutti01 at universitadipavia.it> wrote: >>>>>>>>> >>>>>>>>> Hello. I am having a hard time understanding the index sets to >>>>>>>>> feed PCGASMSetSubdomains, and I am working in Fortran (as a PETSc novice). 
>>>>>>>>> To get more intuition on how the IS objects behave I tried the following >>>>>>>>> minimal (non) working example, which should tile a 16x16 matrix into 16 >>>>>>>>> square, non-overlapping submatrices: >>>>>>>>> >>>>>>>>> #include >>>>>>>>> #include >>>>>>>>> #include >>>>>>>>> USE petscmat >>>>>>>>> USE petscksp >>>>>>>>> USE petscpc >>>>>>>>> >>>>>>>>> Mat :: A >>>>>>>>> PetscInt :: M, NSubx, dof, overlap, NSub >>>>>>>>> INTEGER :: I,J >>>>>>>>> PetscErrorCode :: ierr >>>>>>>>> PetscScalar :: v >>>>>>>>> KSP :: ksp >>>>>>>>> PC :: pc >>>>>>>>> IS :: subdomains_IS, inflated_IS >>>>>>>>> >>>>>>>>> call PetscInitialize(PETSC_NULL_CHARACTER , ierr) >>>>>>>>> >>>>>>>>> !-----Create a dummy matrix >>>>>>>>> M = 16 >>>>>>>>> call MatCreateAIJ(MPI_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, >>>>>>>>> & M, M, >>>>>>>>> & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, >>>>>>>>> & PETSC_DEFAULT_INTEGER, PETSC_NULL_INTEGER, >>>>>>>>> & A, ierr) >>>>>>>>> >>>>>>>>> DO I=1,M >>>>>>>>> DO J=1,M >>>>>>>>> v = I*J >>>>>>>>> CALL MatSetValue (A,I-1,J-1,v, >>>>>>>>> & INSERT_VALUES , ierr) >>>>>>>>> END DO >>>>>>>>> END DO >>>>>>>>> >>>>>>>>> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY , ierr) >>>>>>>>> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY , ierr) >>>>>>>>> >>>>>>>>> !-----Create KSP and PC >>>>>>>>> call KSPCreate(PETSC_COMM_WORLD,ksp, ierr) >>>>>>>>> call KSPSetOperators(ksp,A,A, ierr) >>>>>>>>> call KSPSetType(ksp,"bcgs",ierr) >>>>>>>>> call KSPGetPC(ksp,pc,ierr) >>>>>>>>> call KSPSetUp(ksp, ierr) >>>>>>>>> call PCSetType(pc,PCGASM, ierr) >>>>>>>>> call PCSetUp(pc , ierr) >>>>>>>>> >>>>>>>>> !-----GASM setup >>>>>>>>> NSubx = 4 >>>>>>>>> dof = 1 >>>>>>>>> overlap = 0 >>>>>>>>> >>>>>>>>> call PCGASMCreateSubdomains2D(pc, >>>>>>>>> & M, M, >>>>>>>>> & NSubx, NSubx, >>>>>>>>> & dof, overlap, >>>>>>>>> & NSub, subdomains_IS, inflated_IS, ierr) >>>>>>>>> >>>>>>>>> call ISView(subdomains_IS, PETSC_VIEWER_STDOUT_WORLD, ierr) >>>>>>>>> >>>>>>>>> call KSPDestroy(ksp, ierr) >>>>>>>>> call PetscFinalize(ierr) >>>>>>>>> >>>>>>>>> Running this on one processor, I get NSub = 4. >>>>>>>>> If PCASM and PCASMCreateSubdomains2D are used instead, I get NSub >>>>>>>>> = 16 as expected. >>>>>>>>> Moreover, I get in the end "forrtl: severe (157): Program >>>>>>>>> Exception - access violation". So: >>>>>>>>> 1) why do I get two different results with ASM, and GASM? >>>>>>>>> 2) why do I get access violation and how can I solve this? >>>>>>>>> In fact, in C, subdomains_IS, inflated_IS should pointers to IS >>>>>>>>> objects. As I see on the Fortran interface, the arguments to >>>>>>>>> PCGASMCreateSubdomains2D are IS objects: >>>>>>>>> >>>>>>>>> subroutine PCGASMCreateSubdomains2D(a,b,c,d,e,f,g,h,i,j,z) >>>>>>>>> import tPC,tIS >>>>>>>>> PC a ! PC >>>>>>>>> PetscInt b ! PetscInt >>>>>>>>> PetscInt c ! PetscInt >>>>>>>>> PetscInt d ! PetscInt >>>>>>>>> PetscInt e ! PetscInt >>>>>>>>> PetscInt f ! PetscInt >>>>>>>>> PetscInt g ! PetscInt >>>>>>>>> PetscInt h ! PetscInt >>>>>>>>> IS i ! IS >>>>>>>>> IS j ! IS >>>>>>>>> PetscErrorCode z >>>>>>>>> end subroutine PCGASMCreateSubdomains2D >>>>>>>>> Thus: >>>>>>>>> 3) what should be inside e.g., subdomains_IS? I expect it to >>>>>>>>> contain, for every created subdomain, the list of rows and columns defining >>>>>>>>> the subblock in the matrix, am I right? 
>>>>>>>>> >>>>>>>>> Context: I have a block-tridiagonal system arising from space-time >>>>>>>>> finite elements, and I want to solve it with GMRES+PCGASM preconditioner, >>>>>>>>> where each overlapping submatrix is on the diagonal and of size 3x3 blocks >>>>>>>>> (and spanning multiple processes). This is PETSc 3.17.1 on Windows. >>>>>>>>> >>>>>>>>> Thanks in advance, >>>>>>>>> Leonardo >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>> >>>>>> >>>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu May 18 21:01:57 2023 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 18 May 2023 22:01:57 -0400 Subject: [petsc-users] Using dmplexdistribute do parallel FEM code. In-Reply-To: References: Message-ID: On Thu, May 18, 2023 at 8:47?PM neil liu wrote: > Thanks, Matt. I am using the following steps to build a local to global > mapping. > Step 1) PetscSectionCreate (); > PetscSectionSetNumFields(); > PetscSectionSetChart (); > //Set dof for each node > PetscSectionSetup (s); > Step 2) > PetscCall(DMGetLocalToGlobalMapping(dm, <ogm)); > PetscCall(ISLocalToGlobalMappingGetIndices(ltogm, &g_idx)); > > For 2D mesh from the example, > https://wg-beginners.readthedocs.io/en/latest/tutorials/dm/plex/introductory_tutorial_plex.html > the map from rank 0 (MPI) and rank 1(MPI) matches well with the global > node ordering. > > But When I tried a 3D gmsh, (2 hexahedra (12 nodes ) were split into 12 > tetrahedra). ( 2 processors in total ) Each processor handles 9 nodes > separately. When I checked the local to global mapping, It seems the > mapping is not right. > For example, the mapping array is > Local Global (left: local node number; right: global node number). > provided. > 0 0 > 1 3 > 2 5 > 3 1 > 4 6 > 5 8 > 6 2 > 7 9 > 8 11 > But the coordinates between the local (left) and global nodes (right) > are not the same, meaning they are not matching. Did I miss something? > Plex reorders the nodes when it partitions the mesh. Add DMViewFromOptions(dm, NULL, "-dm_view"); to your code and -dm_view ::ascii_info_detail to the command line. Then it will print out a mesh description that shows you the coordinates of the vertices, and you can match them up. Thanks, Matt > In addition, the simple 3D gmsh file has been attached if you need. 
> *Gmsh file:* > $MeshFormat > 4.1 0 8 > $EndMeshFormat > $PhysicalNames > 3 > 2 27 "Excit" > 2 28 "PEC" > 3 29 "volume" > $EndPhysicalNames > $Entities > 8 12 6 1 > 1 0 0 0 0 > 2 0 0.25 0 0 > 3 0 0.25 0.25 0 > 4 0 0 0.25 0 > 5 1 0 0 0 > 6 1 0.25 0 0 > 10 1 0.25 0.25 0 > 14 1 0 0.25 0 > 1 0 0 0 0 0.25 0 0 2 1 -2 > 2 0 0.25 0 0 0.25 0.25 0 2 2 -3 > 3 0 0 0.25 0 0.25 0.25 0 2 3 -4 > 4 0 0 0 0 0 0.25 0 2 4 -1 > 6 1 0 0 1 0.25 0 0 2 5 -6 > 7 1 0.25 0 1 0.25 0.25 0 2 6 -10 > 8 1 0 0.25 1 0.25 0.25 0 2 10 -14 > 9 1 0 0 1 0 0.25 0 2 14 -5 > 11 0 0 0 1 0 0 0 2 1 -5 > 12 0 0.25 0 1 0.25 0 0 2 2 -6 > 16 0 0.25 0.25 1 0.25 0.25 0 2 3 -10 > 20 0 0 0.25 1 0 0.25 0 2 4 -14 > 1 0 0 0 0 0.25 0.25 1 27 4 1 2 3 4 > 13 0 0 0 1 0.25 0 1 28 4 1 12 -6 -11 > 17 0 0.25 0 1 0.25 0.25 0 4 2 16 -7 -12 > 21 0 0 0.25 1 0.25 0.25 1 28 4 3 20 -8 -16 > 25 0 0 0 1 0 0.25 0 4 4 11 -9 -20 > 26 1 0 0 1 0.25 0.25 1 28 4 6 7 8 9 > 1 0 0 0 1 0.25 0.25 1 29 6 -1 26 13 17 21 25 > $EndEntities > $Nodes > 17 12 1 12 > 0 1 0 1 > 1 > 0 0 0 > 0 2 0 1 > 2 > 0 0.25 0 > 0 3 0 1 > 3 > 0 0.25 0.25 > 0 4 0 1 > 4 > 0 0 0.25 > 0 5 0 1 > 5 > 1 0 0 > 0 6 0 1 > 6 > 1 0.25 0 > 0 10 0 1 > 7 > 1 0.25 0.25 > 0 14 0 1 > 8 > 1 0 0.25 > 1 11 0 1 > 9 > 0.5 0 0 > 1 12 0 1 > 10 > 0.5 0.25 0 > 1 16 0 1 > 11 > 0.5 0.25 0.25 > 1 20 0 1 > 12 > 0.5 0 0.25 > 2 1 0 0 > 2 13 0 0 > 2 21 0 0 > 2 26 0 0 > 3 1 0 0 > $EndNodes > $Elements > 5 24 1 24 > 2 1 2 2 > 1 1 2 4 > 2 4 2 3 > 2 13 2 4 > 3 9 1 2 > 4 9 2 10 > 5 5 9 10 > 6 5 10 6 > 2 21 2 4 > 7 11 3 4 > 8 11 4 12 > 9 7 11 12 > 10 7 12 8 > 2 26 2 2 > 11 5 6 8 > 12 8 6 7 > 3 1 4 12 > 13 1 2 4 12 > 14 10 9 12 2 > 15 2 9 12 1 > 16 9 10 12 8 > 17 6 5 8 10 > 18 10 5 8 9 > 19 4 2 3 11 > 20 10 12 11 2 > 21 2 12 11 4 > 22 12 10 11 7 > 23 6 8 7 10 > 24 10 8 7 12 > $EndElements > > On Wed, May 17, 2023 at 7:30?PM Matthew Knepley wrote: > >> On Wed, May 17, 2023 at 6:58?PM neil liu wrote: >> >>> Dear Petsc developers, >>> >>> I am writing my own code to calculate the FEM matrix. The following is >>> my general framework, >>> >>> DMPlexCreateGmsh(); >>> MPI_Comm_rank (Petsc_comm_world, &rank); >>> DMPlexDistribute (.., .., &dmDist); >>> >>> dm = dmDist; >>> //This can create separate dm s for different processors. (reordering.) >>> >>> MatCreate (Petsc_comm_world, &A) >>> // Loop over every tetrahedral element to calculate the local matrix for >>> each processor. Then we can get a local matrix A for each processor. >>> >>> *My question is : it seems we should build a global matrix B (assemble >>> all the As for each partition) and then transfer B to KSP. KSP will do the >>> parallelization correctly, right? * >>> >> >> I would not suggest this. The more common strategy is to assemble each >> element matrix directly into the >> global matrix B, by mapping the cell indices directly to global indices >> (rather than to local indices in the matrix A). You can do this in two >> stages. You can create a LocalToGlobalMapping in PETSc that maps >> every local index to a global index. Then you can assemble into B exactly >> as you would assemble into A by calling MatSetValuesLocal(). >> >> DMPlex handles these mappings for you automatically, but I realize that >> it is a large number of things to buy into. >> >> Thanks, >> >> Matt >> >> >>> If that is right, I should define a whole domain matrix B before the >>> partitioning (MatCreate (Petsc_comm_world, &B); ), and then use >>> localtoglobal (which petsc function should I use? Do you have any >>> examples.) map to add A to B at the right positions (MatSetValues) ? 
>>> >>> Does that make sense? >>> >>> Thanks, >>> >>> Xiaodong >>> >>> >>> >>> >>> >>> >>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From liufield at gmail.com Fri May 19 09:11:38 2023 From: liufield at gmail.com (neil liu) Date: Fri, 19 May 2023 10:11:38 -0400 Subject: [petsc-users] Using dmplexdistribute do parallel FEM code. In-Reply-To: References: Message-ID: Thanks, Matt. Following your explanations, my understanding is this "If we use multiple MPI processors, the global numbering of the vertices (global domain) will be different from that with only one processor, right? ". If this is the case, will it be easy for us to check the assembled matrix from multiple processors by comparing against that of a single processor ? I checked the DM details for a simple 2D mesh. How could I find (from the DM view below) the information about the global numbering of a vertex when I use 2 processors ? _______ | | | | __ |__ | | | | |___ |___| DM Object: Parallel Mesh 2 MPI processes type: plex Parallel Mesh in 2 dimensions: Supports: [0] Max support size: 3 [0]: 2 ----> 8 [0]: 2 ----> 12 [0]: 3 ----> 8 [0]: 3 ----> 9 [0]: 3 ----> 13 [0]: 4 ----> 9 [0]: 4 ----> 14 [0]: 5 ----> 10 [0]: 5 ----> 12 [0]: 6 ----> 10 [0]: 6 ----> 11 [0]: 6 ----> 13 [0]: 7 ----> 11 [0]: 7 ----> 14 [0]: 8 ----> 0 [0]: 9 ----> 1 [0]: 10 ----> 0 [0]: 11 ----> 1 [0]: 12 ----> 0 [0]: 13 ----> 0 [0]: 13 ----> 1 [0]: 14 ----> 1 [1] Max support size: 3 [1]: 2 ----> 8 [1]: 2 ----> 12 [1]: 3 ----> 8 [1]: 3 ----> 9 [1]: 3 ----> 13 [1]: 4 ----> 9 [1]: 4 ----> 14 [1]: 5 ----> 10 [1]: 5 ----> 12 [1]: 6 ----> 10 [1]: 6 ----> 11 [1]: 6 ----> 13 [1]: 7 ----> 11 [1]: 7 ----> 14 [1]: 8 ----> 0 [1]: 9 ----> 1 [1]: 10 ----> 0 [1]: 11 ----> 1 [1]: 12 ----> 0 [1]: 13 ----> 0 [1]: 13 ----> 1 [1]: 14 ----> 1 Cones: [0] Max cone size: 4 [0]: 0 <---- 8 (0) [0]: 0 <---- 13 (0) [0]: 0 <---- 10 (-1) [0]: 0 <---- 12 (-1) [0]: 1 <---- 9 (0) [0]: 1 <---- 14 (0) [0]: 1 <---- 11 (-1) [0]: 1 <---- 13 (-1) [0]: 8 <---- 2 (0) [0]: 8 <---- 3 (0) [0]: 9 <---- 3 (0) [0]: 9 <---- 4 (0) [0]: 10 <---- 5 (0) [0]: 10 <---- 6 (0) [0]: 11 <---- 6 (0) [0]: 11 <---- 7 (0) [0]: 12 <---- 2 (0) [0]: 12 <---- 5 (0) [0]: 13 <---- 3 (0) [0]: 13 <---- 6 (0) [0]: 14 <---- 4 (0) [0]: 14 <---- 7 (0) [1] Max cone size: 4 [1]: 0 <---- 8 (0) [1]: 0 <---- 13 (0) [1]: 0 <---- 10 (-1) [1]: 0 <---- 12 (-1) [1]: 1 <---- 9 (0) [1]: 1 <---- 14 (0) [1]: 1 <---- 11 (-1) [1]: 1 <---- 13 (-1) [1]: 8 <---- 2 (0) [1]: 8 <---- 3 (0) [1]: 9 <---- 3 (0) [1]: 9 <---- 4 (0) [1]: 10 <---- 5 (0) [1]: 10 <---- 6 (0) [1]: 11 <---- 6 (0) [1]: 11 <---- 7 (0) [1]: 12 <---- 2 (0) [1]: 12 <---- 5 (0) [1]: 13 <---- 3 (0) [1]: 13 <---- 6 (0) [1]: 14 <---- 4 (0) [1]: 14 <---- 7 (0) coordinates with 1 fields field 0 with 2 components Process 0: ( 2) dim 2 offset 0 0. 0. ( 3) dim 2 offset 2 0.5 0. ( 4) dim 2 offset 4 1. 0. ( 5) dim 2 offset 6 0. 0.5 ( 6) dim 2 offset 8 0.5 0.5 ( 7) dim 2 offset 10 1. 0.5 Process 1: ( 2) dim 2 offset 0 0. 0.5 ( 3) dim 2 offset 2 0.5 0.5 ( 4) dim 2 offset 4 1. 0.5 ( 5) dim 2 offset 6 0. 
1. ( 6) dim 2 offset 8 0.5 1. ( 7) dim 2 offset 10 1. 1. Labels: Label 'marker': [0]: 2 (1) [0]: 3 (1) [0]: 4 (1) [0]: 5 (1) [0]: 7 (1) [0]: 8 (1) [0]: 9 (1) [0]: 12 (1) [0]: 14 (1) [1]: 2 (1) [1]: 4 (1) [1]: 5 (1) [1]: 6 (1) [1]: 7 (1) [1]: 10 (1) [1]: 11 (1) [1]: 12 (1) [1]: 14 (1) Label 'Face Sets': [0]: 8 (1) [0]: 9 (1) [0]: 14 (2) [0]: 12 (4) [1]: 14 (2) [1]: 10 (3) [1]: 11 (3) [1]: 12 (4) Label 'celltype': [0]: 2 (0) [0]: 3 (0) [0]: 4 (0) [0]: 5 (0) [0]: 6 (0) [0]: 7 (0) [0]: 8 (1) [0]: 9 (1) [0]: 10 (1) [0]: 11 (1) [0]: 12 (1) [0]: 13 (1) [0]: 14 (1) [0]: 0 (4) [0]: 1 (4) [1]: 2 (0) [1]: 3 (0) [1]: 4 (0) [1]: 5 (0) [1]: 6 (0) [1]: 7 (0) [1]: 8 (1) [1]: 9 (1) [1]: 10 (1) [1]: 11 (1) [1]: 12 (1) [1]: 13 (1) [1]: 14 (1) [1]: 0 (4) [1]: 1 (4) PetscSF Object: 2 MPI processes type: basic [0] Number of roots=15, leaves=5, remote ranks=1 [0] 5 <- (1,2) [0] 6 <- (1,3) [0] 7 <- (1,4) [0] 10 <- (1,8) [0] 11 <- (1,9) [1] Number of roots=15, leaves=0, remote ranks=0 [0] Roots referenced by my leaves, by rank [0] 1: 5 edges [0] 5 <- 2 [0] 6 <- 3 [0] 7 <- 4 [0] 10 <- 8 [0] 11 <- 9 [1] Roots referenced by my leaves, by rank MultiSF sort=rank-order Vec Object: 2 MPI processes On Thu, May 18, 2023 at 10:02?PM Matthew Knepley wrote: > On Thu, May 18, 2023 at 8:47?PM neil liu wrote: > >> Thanks, Matt. I am using the following steps to build a local to global >> mapping. >> Step 1) PetscSectionCreate (); >> PetscSectionSetNumFields(); >> PetscSectionSetChart (); >> //Set dof for each node >> PetscSectionSetup (s); >> Step 2) >> PetscCall(DMGetLocalToGlobalMapping(dm, <ogm)); >> PetscCall(ISLocalToGlobalMappingGetIndices(ltogm, &g_idx)); >> >> For 2D mesh from the example, >> https://wg-beginners.readthedocs.io/en/latest/tutorials/dm/plex/introductory_tutorial_plex.html >> the map from rank 0 (MPI) and rank 1(MPI) matches well with the global >> node ordering. >> >> But When I tried a 3D gmsh, (2 hexahedra (12 nodes ) were split into 12 >> tetrahedra). ( 2 processors in total ) Each processor handles 9 nodes >> separately. When I checked the local to global mapping, It seems the >> mapping is not right. >> For example, the mapping array is >> Local Global (left: local node number; right: global node number). >> provided. >> 0 0 >> 1 3 >> 2 5 >> 3 1 >> 4 6 >> 5 8 >> 6 2 >> 7 9 >> 8 11 >> But the coordinates between the local (left) and global nodes (right) >> are not the same, meaning they are not matching. Did I miss something? >> > > Plex reorders the nodes when it partitions the mesh. Add > > DMViewFromOptions(dm, NULL, "-dm_view"); > > to your code and > > -dm_view ::ascii_info_detail > > to the command line. Then it will print out a mesh description that shows > you the coordinates > of the vertices, and you can match them up. > > Thanks, > > Matt > > >> In addition, the simple 3D gmsh file has been attached if you need. 
>> *Gmsh file:* >> $MeshFormat >> 4.1 0 8 >> $EndMeshFormat >> $PhysicalNames >> 3 >> 2 27 "Excit" >> 2 28 "PEC" >> 3 29 "volume" >> $EndPhysicalNames >> $Entities >> 8 12 6 1 >> 1 0 0 0 0 >> 2 0 0.25 0 0 >> 3 0 0.25 0.25 0 >> 4 0 0 0.25 0 >> 5 1 0 0 0 >> 6 1 0.25 0 0 >> 10 1 0.25 0.25 0 >> 14 1 0 0.25 0 >> 1 0 0 0 0 0.25 0 0 2 1 -2 >> 2 0 0.25 0 0 0.25 0.25 0 2 2 -3 >> 3 0 0 0.25 0 0.25 0.25 0 2 3 -4 >> 4 0 0 0 0 0 0.25 0 2 4 -1 >> 6 1 0 0 1 0.25 0 0 2 5 -6 >> 7 1 0.25 0 1 0.25 0.25 0 2 6 -10 >> 8 1 0 0.25 1 0.25 0.25 0 2 10 -14 >> 9 1 0 0 1 0 0.25 0 2 14 -5 >> 11 0 0 0 1 0 0 0 2 1 -5 >> 12 0 0.25 0 1 0.25 0 0 2 2 -6 >> 16 0 0.25 0.25 1 0.25 0.25 0 2 3 -10 >> 20 0 0 0.25 1 0 0.25 0 2 4 -14 >> 1 0 0 0 0 0.25 0.25 1 27 4 1 2 3 4 >> 13 0 0 0 1 0.25 0 1 28 4 1 12 -6 -11 >> 17 0 0.25 0 1 0.25 0.25 0 4 2 16 -7 -12 >> 21 0 0 0.25 1 0.25 0.25 1 28 4 3 20 -8 -16 >> 25 0 0 0 1 0 0.25 0 4 4 11 -9 -20 >> 26 1 0 0 1 0.25 0.25 1 28 4 6 7 8 9 >> 1 0 0 0 1 0.25 0.25 1 29 6 -1 26 13 17 21 25 >> $EndEntities >> $Nodes >> 17 12 1 12 >> 0 1 0 1 >> 1 >> 0 0 0 >> 0 2 0 1 >> 2 >> 0 0.25 0 >> 0 3 0 1 >> 3 >> 0 0.25 0.25 >> 0 4 0 1 >> 4 >> 0 0 0.25 >> 0 5 0 1 >> 5 >> 1 0 0 >> 0 6 0 1 >> 6 >> 1 0.25 0 >> 0 10 0 1 >> 7 >> 1 0.25 0.25 >> 0 14 0 1 >> 8 >> 1 0 0.25 >> 1 11 0 1 >> 9 >> 0.5 0 0 >> 1 12 0 1 >> 10 >> 0.5 0.25 0 >> 1 16 0 1 >> 11 >> 0.5 0.25 0.25 >> 1 20 0 1 >> 12 >> 0.5 0 0.25 >> 2 1 0 0 >> 2 13 0 0 >> 2 21 0 0 >> 2 26 0 0 >> 3 1 0 0 >> $EndNodes >> $Elements >> 5 24 1 24 >> 2 1 2 2 >> 1 1 2 4 >> 2 4 2 3 >> 2 13 2 4 >> 3 9 1 2 >> 4 9 2 10 >> 5 5 9 10 >> 6 5 10 6 >> 2 21 2 4 >> 7 11 3 4 >> 8 11 4 12 >> 9 7 11 12 >> 10 7 12 8 >> 2 26 2 2 >> 11 5 6 8 >> 12 8 6 7 >> 3 1 4 12 >> 13 1 2 4 12 >> 14 10 9 12 2 >> 15 2 9 12 1 >> 16 9 10 12 8 >> 17 6 5 8 10 >> 18 10 5 8 9 >> 19 4 2 3 11 >> 20 10 12 11 2 >> 21 2 12 11 4 >> 22 12 10 11 7 >> 23 6 8 7 10 >> 24 10 8 7 12 >> $EndElements >> >> On Wed, May 17, 2023 at 7:30?PM Matthew Knepley >> wrote: >> >>> On Wed, May 17, 2023 at 6:58?PM neil liu wrote: >>> >>>> Dear Petsc developers, >>>> >>>> I am writing my own code to calculate the FEM matrix. The following is >>>> my general framework, >>>> >>>> DMPlexCreateGmsh(); >>>> MPI_Comm_rank (Petsc_comm_world, &rank); >>>> DMPlexDistribute (.., .., &dmDist); >>>> >>>> dm = dmDist; >>>> //This can create separate dm s for different processors. (reordering.) >>>> >>>> MatCreate (Petsc_comm_world, &A) >>>> // Loop over every tetrahedral element to calculate the local matrix >>>> for each processor. Then we can get a local matrix A for each processor. >>>> >>>> *My question is : it seems we should build a global matrix B (assemble >>>> all the As for each partition) and then transfer B to KSP. KSP will do the >>>> parallelization correctly, right? * >>>> >>> >>> I would not suggest this. The more common strategy is to assemble each >>> element matrix directly into the >>> global matrix B, by mapping the cell indices directly to global indices >>> (rather than to local indices in the matrix A). You can do this in two >>> stages. You can create a LocalToGlobalMapping in PETSc that maps >>> every local index to a global index. Then you can assemble into B >>> exactly as you would assemble into A by calling MatSetValuesLocal(). >>> >>> DMPlex handles these mappings for you automatically, but I realize that >>> it is a large number of things to buy into. 
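(Since an example was asked for above: a rough sketch of the MatSetValuesLocal() route Matt describes, in which each element matrix is added straight into the global matrix B. The names dm, nLocalRows, nen, elemRows and Ke are placeholders of mine, not taken from the original code.

      Mat                    B;
      ISLocalToGlobalMapping ltog;

      PetscCall(MatCreate(PETSC_COMM_WORLD, &B));
      PetscCall(MatSetSizes(B, nLocalRows, nLocalRows, PETSC_DETERMINE, PETSC_DETERMINE));
      PetscCall(MatSetFromOptions(B));
      PetscCall(MatSetUp(B));

      /* local-to-global map provided by the DM */
      PetscCall(DMGetLocalToGlobalMapping(dm, &ltog));
      PetscCall(MatSetLocalToGlobalMapping(B, ltog, ltog));

      /* per element: elemRows[] holds local indices, Ke is the nen x nen element matrix */
      PetscCall(MatSetValuesLocal(B, nen, elemRows, nen, elemRows, Ke, ADD_VALUES));

      PetscCall(MatAssemblyBegin(B, MAT_FINAL_ASSEMBLY));
      PetscCall(MatAssemblyEnd(B, MAT_FINAL_ASSEMBLY));

The MatSetValuesLocal() call goes inside the element loop; everything else is one-time setup.)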
>>> >>> Thanks, >>> >>> Matt >>> >>> >>>> If that is right, I should define a whole domain matrix B before the >>>> partitioning (MatCreate (Petsc_comm_world, &B); ), and then use >>>> localtoglobal (which petsc function should I use? Do you have any >>>> examples.) map to add A to B at the right positions (MatSetValues) ? >>>> >>>> Does that make sense? >>>> >>>> Thanks, >>>> >>>> Xiaodong >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chenju at utexas.edu Fri May 19 13:55:25 2023 From: chenju at utexas.edu (Jau-Uei Chen) Date: Fri, 19 May 2023 13:55:25 -0500 Subject: [petsc-users] Error in building PETSc Message-ID: To whom it may concern, Currently, I am trying to build PETSc-3.17.4 on my own laptop (MacPro Late 2019) but encounter an error when performing "make all". Please see the attachment for my configuration and make.log. Any comments or suggestions on how to resolve this error are greatly appreciated. Best Regard, Jau-Uei Chen Graduate student Department of Aerospace Engineering and Engineering Mechanics The University of Texas at Austin -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: make.log Type: application/octet-stream Size: 14117 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: application/octet-stream Size: 6871011 bytes Desc: not available URL: From balay at mcs.anl.gov Fri May 19 14:02:09 2023 From: balay at mcs.anl.gov (Satish Balay) Date: Sat, 20 May 2023 00:32:09 +0530 (IST) Subject: [petsc-users] Error in building PETSc In-Reply-To: References: Message-ID: Use "make OMAKE_PRINTDIR=gmake all" instead of "make all" or use latest release Satish On Fri, 19 May 2023, Jau-Uei Chen wrote: > To whom it may concern, > > Currently, I am trying to build PETSc-3.17.4 on my own laptop (MacPro Late > 2019) but encounter an error when performing "make all". Please see the > attachment for my configuration and make.log. > > Any comments or suggestions on how to resolve this error are greatly > appreciated. > > Best Regard, > Jau-Uei Chen > Graduate student > Department of Aerospace Engineering and Engineering Mechanics > The University of Texas at Austin > From chenju at utexas.edu Fri May 19 14:26:59 2023 From: chenju at utexas.edu (Jau-Uei Chen) Date: Fri, 19 May 2023 14:26:59 -0500 Subject: [petsc-users] Error in building PETSc In-Reply-To: References: Message-ID: Thanks for your prompt reply! It works perfectly. 
Best Regards, Jau-Uei Chen Graduate student Department of Aerospace Engineering and Engineering Mechanics The University of Texas at Austin On Fri, May 19, 2023 at 2:02?PM Satish Balay wrote: > Use "make OMAKE_PRINTDIR=gmake all" instead of "make all" > > or use latest release > > Satish > > On Fri, 19 May 2023, Jau-Uei Chen wrote: > > > To whom it may concern, > > > > Currently, I am trying to build PETSc-3.17.4 on my own laptop (MacPro > Late > > 2019) but encounter an error when performing "make all". Please see the > > attachment for my configuration and make.log. > > > > Any comments or suggestions on how to resolve this error are greatly > > appreciated. > > > > Best Regard, > > Jau-Uei Chen > > Graduate student > > Department of Aerospace Engineering and Engineering Mechanics > > The University of Texas at Austin > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From berend.vanwachem at ovgu.de Mon May 22 03:39:09 2023 From: berend.vanwachem at ovgu.de (Berend van Wachem) Date: Mon, 22 May 2023 10:39:09 +0200 Subject: [petsc-users] DMGetCoordinatesLocal and DMPlexGetCellCoordinates in PETSc > 3.18 In-Reply-To: References: <2f3c494f-3d63-a2f4-d8cc-fba6893c0ebb@ovgu.de> <5181359d-c8d6-da6e-8b0a-3fb1c6183026@ovgu.de> <0241f3b5-16b4-786b-61a3-5f8ac64a3f5a@ovgu.de> Message-ID: Dear Matt, I'm really sorry for this stupid bug! I can confirm that setting the coordinates with both CellCoordinatesLocal and CoordinatesLocal works. Best regards, Berend. Many thanks and best regards, Berend. On 5/17/23 23:04, Matthew Knepley wrote: > On Wed, May 17, 2023 at 2:01?PM Berend van Wachem > > wrote: > > Dear Matt, > > I tried it, but it doesn't seem to work. > Attached is a very small working example illustrating the problem. > I create a DMPlexBox Mesh, periodic in the Y direction. I then scale > the Y coordinates with a factor 10, and add 1.0 to it. Both > DMGetCoordinatesLocal and DMGetCellCoordinatesLocal. > Then I evaluate the coordinates with DMPlexGetCellCoordinates. Most > of the Y coordinates are correct, but not all of them - for > instance, the minimum Y coordinate is 0.0, and this should be 1.0. > > Am I doing something wrong? > > > Quickly, I see that > > ? a *= 10.0?+ 1.0; > > is the same as > > ? a *= 11.0; > > not multiply by 10 and add 1. I will send it back when I get everything > the way I want. > > ? Thanks, > > ? ? Matt > > Thanks and best regards, > > Berend. > > On 5/17/23 17:58, Matthew Knepley wrote: > > On Wed, May 17, 2023 at 11:20?AM Berend van Wachem > > >> > wrote: > > > >? ? ?Dear Matt, > > > >? ? ?Is there a way to 'redo' the DMLocalizeCoordinates() ? Or to > undo it? > >? ? ?Alternatively, can we make the calling of > DMLocalizeCoordinates() in the? DMPlexCreate...() routines optional? > > > >? ? ?Otherwise, we would have to copy all arrays of coordinates > from DMGetCoordinatesLocal() and DMGetCellCoordinatesLocal() before > >? ? ?scaling them. > > > > > > I am likely not being clear. I think all you have to do is the > following: > > > >? ? DMGetCoordinatesLocal(dm, &xl); > >? ? VecScale(xl, scale); > >? ? DMSetCoordinatesLocal(dm, xl); > >? ? DMGetCellCoordinatesLocal(dm, &xl); > >? ? VecScale(xl, scale); > >? ? DMSetCellCoordinatesLocal(dm, xl); > > > > Does this not work? > > > >? ? Thanks, > > > >? ? ? ?Matt > > > >? ? ?Best regards, Berend. > > > >? ? ?On 5/17/23 16:35, Matthew Knepley wrote: > >? ? ? > On Wed, May 17, 2023 at 10:21?AM Berend van Wachem > > > > >? ? ? >>> wrote: > >? ? ? > > >? ? ? >? ? ?Dear Matt, > >? ? ? 
> > >? ? ? >? ? ?Thanks for getting back to me so quickly. > >? ? ? > > >? ? ? >? ? ?If I scale each of the coordinates of the mesh (say, I > want to cube each > >? ? ? >? ? ?co-ordinate), and I do this for both: > >? ? ? > > >? ? ? >? ? ?DMGetCoordinatesLocal(); > >? ? ? >? ? ?DMGetCellCoordinatesLocal(); > >? ? ? > > >? ? ? >? ? ?How do I know I am not cubing one coordinate multiple > times? > >? ? ? > > >? ? ? > > >? ? ? > Good question. Right now, the only connection between the > two sets of coordinates is DMLocalizeCoordinates(). Since > >? ? ?sometimes > >? ? ? > people want to do non-trivial things to > >? ? ? > coordinates, I prefer not to push in an API for "just" > scaling, but I could be convinced > >? ? ? > the other way. > >? ? ? > > >? ? ? >? ? Thanks, > >? ? ? > > >? ? ? >? ? ? ?Matt > >? ? ? > > >? ? ? >? ? ?Thanks, Berend. > >? ? ? > > >? ? ? >? ? ?On 5/17/23 16:10, Matthew Knepley wrote: > >? ? ? >? ? ? > On Wed, May 17, 2023 at 10:02?AM Berend van Wachem > >? ? ? >? ? ? > > > >? ? ? >> > > >? ? ? >? ? ? >>>> wrote: > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ?Dear PETSc Team, > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ?We are using DMPlex, and we create a mesh using > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ?DMPlexCreateBoxMesh (.... ); > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ?and get a uniform mesh. The mesh is periodic. > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ?We typically want to "scale" the coordinates > (vertices) of the mesh, > >? ? ? >? ? ? >? ? ?and > >? ? ? >? ? ? >? ? ?to achieve this, we call > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ?DMGetCoordinatesLocal(dm, &coordinates); > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ?and scale the entries in the Vector coordinates > appropriately. > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ?and then > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ?DMSetCoordinatesLocal(dm, coordinates); > >? ? ? >? ? ? > > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ?After this, we localise the coordinates by calling > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ?DMLocalizeCoordinates(dm); > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ?This worked fine up to PETSc 3.18, but with > versions after this, the > >? ? ? >? ? ? >? ? ?coordinates we get from the call > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ?DMPlexGetCellCoordinates(dm, CellID, &isDG, > &CoordSize, > >? ? ? >? ? ? >? ? ?&ArrayCoordinates, &Coordinates); > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ?are no longer correct if the mesh is periodic. > A number of the > >? ? ? >? ? ? >? ? ?coordinates returned from calling > DMPlexGetCellCoordinates are wrong. > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ?I think, this is because DMLocalizeCoordinates > is now automatically > >? ? ? >? ? ? >? ? ?called within the routine DMPlexCreateBoxMesh. > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ?So, my question is: How should we scale the > coordinates from a periodic > >? ? ? >? ? ? >? ? ?DMPlex mesh so that they are reflected > correctly when calling both > >? ? ? >? ? ? >? ? ?DMGetCoordinatesLocal and > DMPlexGetCellCoordinates, with PETSc versions > >? ? ? >? ? ? >? ? ? ?>= 3.18? > >? ? ? >? ? ? > > >? ? ? >? ? ? > > >? ? ? >? ? ? > I think we might have to add an API function. For > now, when you scale > >? ? ? >? ? ? > the coordinates, > >? ? ? >? ? ? > can you scale both copies? > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? DMGetCoordinatesLocal() > >? ? ? >? ? ? >? ? DMGetCellCoordinatesLocal(); > >? ? ? >? ? ? > > >? ? ? >? ? ? > and then set them back. > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? Thanks, > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ? ?Matt > >? ? ? >? ? ? > > >? 
? ? >? ? ? >? ? ?Many thanks, Berend. > >? ? ? >? ? ? > > >? ? ? >? ? ? > -- > >? ? ? >? ? ? > What most experimenters take for granted before > they begin their > >? ? ? >? ? ? > experiments is infinitely more interesting than any > results to which > >? ? ? >? ? ? > their experiments lead. > >? ? ? >? ? ? > -- Norbert Wiener > >? ? ? >? ? ? > > >? ? ? >? ? ? > https://www.cse.buffalo.edu/~knepley/ > > > > >? ? ? > >> > > >? ? ? > > >? ? ? >? ? ? > >>> > >? ? ? > > >? ? ? > > >? ? ? > > >? ? ? > -- > >? ? ? > What most experimenters take for granted before they begin > their experiments is infinitely more interesting than any > >? ? ?results to > >? ? ? > which their experiments lead. > >? ? ? > -- Norbert Wiener > >? ? ? > > >? ? ? > https://www.cse.buffalo.edu/~knepley/ > > > > > >? ? ? >> > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to > > which their experiments lead. > > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ From knepley at gmail.com Mon May 22 05:10:08 2023 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 22 May 2023 06:10:08 -0400 Subject: [petsc-users] DMGetCoordinatesLocal and DMPlexGetCellCoordinates in PETSc > 3.18 In-Reply-To: References: <2f3c494f-3d63-a2f4-d8cc-fba6893c0ebb@ovgu.de> <5181359d-c8d6-da6e-8b0a-3fb1c6183026@ovgu.de> <0241f3b5-16b4-786b-61a3-5f8ac64a3f5a@ovgu.de> Message-ID: On Mon, May 22, 2023 at 4:41?AM Berend van Wachem wrote: > Dear Matt, > > I'm really sorry for this stupid bug! > No problem. You have really helped me get the bugs out of Plex. Thanks, Matt > I can confirm that setting the coordinates with both > CellCoordinatesLocal and CoordinatesLocal works. > > Best regards, Berend. > > Many thanks and best regards, Berend. > > On 5/17/23 23:04, Matthew Knepley wrote: > > On Wed, May 17, 2023 at 2:01?PM Berend van Wachem > > > wrote: > > > > Dear Matt, > > > > I tried it, but it doesn't seem to work. > > Attached is a very small working example illustrating the problem. > > I create a DMPlexBox Mesh, periodic in the Y direction. I then scale > > the Y coordinates with a factor 10, and add 1.0 to it. Both > > DMGetCoordinatesLocal and DMGetCellCoordinatesLocal. > > Then I evaluate the coordinates with DMPlexGetCellCoordinates. Most > > of the Y coordinates are correct, but not all of them - for > > instance, the minimum Y coordinate is 0.0, and this should be 1.0. > > > > Am I doing something wrong? > > > > > > Quickly, I see that > > > > a *= 10.0 + 1.0; > > > > is the same as > > > > a *= 11.0; > > > > not multiply by 10 and add 1. I will send it back when I get everything > > the way I want. > > > > Thanks, > > > > Matt > > > > Thanks and best regards, > > > > Berend. > > > > On 5/17/23 17:58, Matthew Knepley wrote: > > > On Wed, May 17, 2023 at 11:20?AM Berend van Wachem > > > > >> > > wrote: > > > > > > Dear Matt, > > > > > > Is there a way to 'redo' the DMLocalizeCoordinates() ? Or to > > undo it? > > > Alternatively, can we make the calling of > > DMLocalizeCoordinates() in the DMPlexCreate...() routines optional? > > > > > > Otherwise, we would have to copy all arrays of coordinates > > from DMGetCoordinatesLocal() and DMGetCellCoordinatesLocal() before > > > scaling them. 
> > > > > > > > > I am likely not being clear. I think all you have to do is the > > following: > > > > > > DMGetCoordinatesLocal(dm, &xl); > > > VecScale(xl, scale); > > > DMSetCoordinatesLocal(dm, xl); > > > DMGetCellCoordinatesLocal(dm, &xl); > > > VecScale(xl, scale); > > > DMSetCellCoordinatesLocal(dm, xl); > > > > > > Does this not work? > > > > > > Thanks, > > > > > > Matt > > > > > > Best regards, Berend. > > > > > > On 5/17/23 16:35, Matthew Knepley wrote: > > > > On Wed, May 17, 2023 at 10:21?AM Berend van Wachem > > > > > > > > > > >>> wrote: > > > > > > > > Dear Matt, > > > > > > > > Thanks for getting back to me so quickly. > > > > > > > > If I scale each of the coordinates of the mesh (say, I > > want to cube each > > > > co-ordinate), and I do this for both: > > > > > > > > DMGetCoordinatesLocal(); > > > > DMGetCellCoordinatesLocal(); > > > > > > > > How do I know I am not cubing one coordinate multiple > > times? > > > > > > > > > > > > Good question. Right now, the only connection between the > > two sets of coordinates is DMLocalizeCoordinates(). Since > > > sometimes > > > > people want to do non-trivial things to > > > > coordinates, I prefer not to push in an API for "just" > > scaling, but I could be convinced > > > > the other way. > > > > > > > > Thanks, > > > > > > > > Matt > > > > > > > > Thanks, Berend. > > > > > > > > On 5/17/23 16:10, Matthew Knepley wrote: > > > > > On Wed, May 17, 2023 at 10:02?AM Berend van Wachem > > > > > > > > > > > > > >> > > > > > > > > > >>>> wrote: > > > > > > > > > > Dear PETSc Team, > > > > > > > > > > We are using DMPlex, and we create a mesh using > > > > > > > > > > DMPlexCreateBoxMesh (.... ); > > > > > > > > > > and get a uniform mesh. The mesh is periodic. > > > > > > > > > > We typically want to "scale" the coordinates > > (vertices) of the mesh, > > > > > and > > > > > to achieve this, we call > > > > > > > > > > DMGetCoordinatesLocal(dm, &coordinates); > > > > > > > > > > and scale the entries in the Vector coordinates > > appropriately. > > > > > > > > > > and then > > > > > > > > > > DMSetCoordinatesLocal(dm, coordinates); > > > > > > > > > > > > > > > After this, we localise the coordinates by > calling > > > > > > > > > > DMLocalizeCoordinates(dm); > > > > > > > > > > This worked fine up to PETSc 3.18, but with > > versions after this, the > > > > > coordinates we get from the call > > > > > > > > > > DMPlexGetCellCoordinates(dm, CellID, &isDG, > > &CoordSize, > > > > > &ArrayCoordinates, &Coordinates); > > > > > > > > > > are no longer correct if the mesh is periodic. > > A number of the > > > > > coordinates returned from calling > > DMPlexGetCellCoordinates are wrong. > > > > > > > > > > I think, this is because DMLocalizeCoordinates > > is now automatically > > > > > called within the routine DMPlexCreateBoxMesh. > > > > > > > > > > So, my question is: How should we scale the > > coordinates from a periodic > > > > > DMPlex mesh so that they are reflected > > correctly when calling both > > > > > DMGetCoordinatesLocal and > > DMPlexGetCellCoordinates, with PETSc versions > > > > > >= 3.18? > > > > > > > > > > > > > > > I think we might have to add an API function. For > > now, when you scale > > > > > the coordinates, > > > > > can you scale both copies? > > > > > > > > > > DMGetCoordinatesLocal() > > > > > DMGetCellCoordinatesLocal(); > > > > > > > > > > and then set them back. > > > > > > > > > > Thanks, > > > > > > > > > > Matt > > > > > > > > > > Many thanks, Berend. 
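(For the concrete transformation discussed in this thread -- scale by 10 and then add 1 -- a condensed version of the recipe quoted above, applied to both coordinate copies, might look like the sketch below. It assumes every coordinate component gets the same transformation; for the y-only change in the test code one would instead loop over the arrays obtained with VecGetArray() on both vectors.

      Vec xl;

      PetscCall(DMGetCoordinatesLocal(dm, &xl));
      PetscCall(VecScale(xl, 10.0));
      PetscCall(VecShift(xl, 1.0));
      PetscCall(DMSetCoordinatesLocal(dm, xl));

      PetscCall(DMGetCellCoordinatesLocal(dm, &xl));
      PetscCall(VecScale(xl, 10.0));
      PetscCall(VecShift(xl, 1.0));
      PetscCall(DMSetCellCoordinatesLocal(dm, xl));

This is only a sketch of the sequence Matt outlines, not code from the working example.)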
> > > > > > > > > > -- > > > > > What most experimenters take for granted before > > they begin their > > > > > experiments is infinitely more interesting than any > > results to which > > > > > their experiments lead. > > > > > -- Norbert Wiener > > > > > > > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > >>> > > > > > > > > > > > > > > > > -- > > > > What most experimenters take for granted before they begin > > their experiments is infinitely more interesting than any > > > results to > > > > which their experiments lead. > > > > -- Norbert Wiener > > > > > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > > > > > > > > >> > > > > > > > > > > > > -- > > > What most experimenters take for granted before they begin their > > experiments is infinitely more interesting than any results to > > > which their experiments lead. > > > -- Norbert Wiener > > > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > > experiments is infinitely more interesting than any results to which > > their experiments lead. > > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ < > http://www.cse.buffalo.edu/~knepley/> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From yangzongze at gmail.com Mon May 22 10:06:47 2023 From: yangzongze at gmail.com (Zongze Yang) Date: Mon, 22 May 2023 23:06:47 +0800 Subject: [petsc-users] MPI_Iprobe Error with MUMPS Solver on Multi-Nodes Message-ID: Hi, I hope this letter finds you well. I am writing to seek guidance regarding an error I encountered while solving a matrix using MUMPS on multiple nodes: ```bash Abort(1681039) on node 60 (rank 60 in comm 240): Fatal error in PMPI_Iprobe: Other MPI error, error stack: PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, tag=MPI_ANY_TAG, comm=0xc4000026, flag=0x7ffc130f9c4c, status=0x7ffc130f9e80) failed MPID_Iprobe(240)..............: MPIDI_iprobe_safe(108)........: MPIDI_iprobe_unsafe(35).......: MPIDI_OFI_do_iprobe(69).......: MPIDI_OFI_handle_cq_error(949): OFI poll failed (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) Assertion failed in file src/mpid/ch4/netmod/ofi/ofi_events.c at line 125: 0 ``` The matrix in question has a degree of freedom (dof) of 3.86e+06. Interestingly, when solving smaller-scale problems, everything functions perfectly without any issues. However, when attempting to solve the larger matrix on multiple nodes, I encounter the aforementioned error. 
The complete error message I received is as follows: ```bash Abort(1681039) on node 60 (rank 60 in comm 240): Fatal error in PMPI_Iprobe: Other MPI error, error stack: PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, tag=MPI_ANY_TAG, comm=0xc4000026, flag=0x7ffc130f9c4c, status=0x7ffc130f9e80) failed MPID_Iprobe(240)..............: MPIDI_iprobe_safe(108)........: MPIDI_iprobe_unsafe(35).......: MPIDI_OFI_do_iprobe(69).......: MPIDI_OFI_handle_cq_error(949): OFI poll failed (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) Assertion failed in file src/mpid/ch4/netmod/ofi/ofi_events.c at line 125: 0 /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(MPL_backtrace_show+0x26) [0x7f6076063f2c] /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x41dc24) [0x7f6075fc5c24] /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x49cc51) [0x7f6076044c51] /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x49f799) [0x7f6076047799] /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x451e18) [0x7f6075ff9e18] /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x452272) [0x7f6075ffa272] /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x2ce836) [0x7f6075e76836] /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x2ce90d) [0x7f6075e7690d] /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x48137b) [0x7f607602937b] /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x44d471) [0x7f6075ff5471] /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x407acd) [0x7f6075fafacd] /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(MPIR_Err_return_comm+0x10a) [0x7f6075fafbea] /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(MPI_Iprobe+0x312) [0x7f6075ddd542] /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpifort.so.12(pmpi_iprobe+0x2f) [0x7f606e08f19f] /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(__zmumps_load_MOD_zmumps_load_recv_msgs+0x142) [0x7f60737b194d] /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_try_recvtreat_+0x34) [0x7f60738ab735] /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(__zmumps_fac_par_m_MOD_zmumps_fac_par+0x991) [0x7f607378bcc8] /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_fac_par_i_+0x240) [0x7f6073881d36] Abort(805938831) on node 51 (rank 51 in comm 240): Fatal error in PMPI_Iprobe: Other MPI 
error, error stack: PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, tag=MPI_ANY_TAG, comm=0xc4000017, flag=0x7ffe20e1402c, status=0x7ffe20e14260) failed MPID_Iprobe(244)..............: progress_test(100)............: MPIDI_OFI_handle_cq_error(949): OFI poll failed (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_fac_b_+0x1463) [0x7f60738831a1] /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_fac_driver_+0x6969) [0x7f60738446c9] /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_+0x2d83) [0x7f60738bf9cf] /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_f77_+0x178c) [0x7f60738c33bc] /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_c+0x8f8) [0x7f60738baacb] /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(+0x894560) [0x7f6077297560] /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(MatLUFactorNumeric+0x32e) [0x7f60773bb1e6] /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(+0xf51665) [0x7f6077954665] /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(PCSetUp+0x64b) [0x7f60779c77e0] /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(KSPSetUp+0xfb6) [0x7f6077ac2d53] /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(+0x10c1c28) [0x7f6077ac4c28] /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(KSPSolve+0x13) [0x7f6077ac8070] /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(+0x11249df) [0x7f6077b279df] /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(SNESSolve+0x10df) [0x7f6077b676c6] Abort(1) on node 60: Internal error Abort(1007265423) on node 65 (rank 65 in comm 240): Fatal error in PMPI_Iprobe: Other MPI error, error stack: PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, tag=MPI_ANY_TAG, comm=0xc4000017, flag=0x7fff4d82827c, status=0x7fff4d8284b0) failed MPID_Iprobe(244)..............: progress_test(100)............: MPIDI_OFI_handle_cq_error(949): OFI poll failed (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) Abort(941205135) on node 32 (rank 32 in comm 240): Fatal error in PMPI_Iprobe: Other MPI error, error stack: PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, tag=MPI_ANY_TAG, comm=0xc4000017, flag=0x7fff715ba3fc, status=0x7fff715ba630) failed MPID_Iprobe(240)..............: MPIDI_iprobe_safe(108)........: MPIDI_iprobe_unsafe(35).......: MPIDI_OFI_do_iprobe(69).......: 
MPIDI_OFI_handle_cq_error(949): OFI poll failed (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) Abort(470941839) on node 75 (rank 75 in comm 0): Fatal error in PMPI_Test: Other MPI error, error stack: PMPI_Test(188)................: MPI_Test(request=0x7efe31e03014, flag=0x7ffea65d673c, status=0x7ffea65d6760) failed MPIR_Test(73).................: MPIR_Test_state(33)...........: progress_test(100)............: MPIDI_OFI_handle_cq_error(949): OFI poll failed (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) Abort(805946511) on node 31 (rank 31 in comm 256): Fatal error in PMPI_Probe: Other MPI error, error stack: PMPI_Probe(118)...............: MPI_Probe(src=MPI_ANY_SOURCE, tag=7, comm=0xc4000015, status=0x7fff9538b7a0) failed MPID_Probe(159)...............: progress_test(100)............: MPIDI_OFI_handle_cq_error(949): OFI poll failed (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) Abort(1179791) on node 73 (rank 73 in comm 0): Fatal error in PMPI_Test: Other MPI error, error stack: PMPI_Test(188)................: MPI_Test(request=0x5b638d4, flag=0x7ffd755119cc, status=0x7ffd755121b0) failed MPIR_Test(73).................: MPIR_Test_state(33)...........: progress_test(100)............: MPIDI_OFI_handle_cq_error(949): OFI poll failed (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) ``` Thank you very much for your time and consideration. Best wishes, Zongze -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon May 22 11:09:01 2023 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 22 May 2023 12:09:01 -0400 Subject: [petsc-users] MPI_Iprobe Error with MUMPS Solver on Multi-Nodes In-Reply-To: References: Message-ID: On Mon, May 22, 2023 at 11:07?AM Zongze Yang wrote: > Hi, > > I hope this letter finds you well. I am writing to seek guidance regarding > an error I encountered while solving a matrix using MUMPS on multiple nodes: > Iprobe is buggy on several MPI implementations. PETSc has an option for shutting it off for this reason. I do not know how to shut it off inside MUMPS however. I would mail their mailing list to see. Thanks, Matt > ```bash > Abort(1681039) on node 60 (rank 60 in comm 240): Fatal error in > PMPI_Iprobe: Other MPI error, error stack: > PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, > tag=MPI_ANY_TAG, comm=0xc4000026, flag=0x7ffc130f9c4c, > status=0x7ffc130f9e80) failed > MPID_Iprobe(240)..............: > MPIDI_iprobe_safe(108)........: > MPIDI_iprobe_unsafe(35).......: > MPIDI_OFI_do_iprobe(69).......: > MPIDI_OFI_handle_cq_error(949): OFI poll failed > (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) > Assertion failed in file src/mpid/ch4/netmod/ofi/ofi_events.c at line 125: > 0 > ``` > > The matrix in question has a degree of freedom (dof) of 3.86e+06. > Interestingly, when solving smaller-scale problems, everything functions > perfectly without any issues. However, when attempting to solve the larger > matrix on multiple nodes, I encounter the aforementioned error. 
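(A guess at the PETSc-side knob, not something confirmed in this thread: the option Matt refers to is presumably the one selecting how PetscCommBuildTwoSided() exchanges messages, e.g.

      # assumption: steers PETSc's own communication away from the Iprobe/Ibarrier path;
      # it does not change what MUMPS does internally
      mpiexec -n <np> ./myapp -build_twosided allreduce

Since the traceback above points into libzmumps (zmumps_load_recv_msgs), the failing MPI_Iprobe here is issued by MUMPS itself, which is why the MUMPS mailing list is the right place to ask about disabling it on that side.)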
> > The complete error message I received is as follows: > ```bash > Abort(1681039) on node 60 (rank 60 in comm 240): Fatal error in > PMPI_Iprobe: Other MPI error, error stack: > PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, > tag=MPI_ANY_TAG, comm=0xc4000026, flag=0x7ffc130f9c4c, > status=0x7ffc130f9e80) failed > MPID_Iprobe(240)..............: > MPIDI_iprobe_safe(108)........: > MPIDI_iprobe_unsafe(35).......: > MPIDI_OFI_do_iprobe(69).......: > MPIDI_OFI_handle_cq_error(949): OFI poll failed > (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) > Assertion failed in file src/mpid/ch4/netmod/ofi/ofi_events.c at line 125: > 0 > /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(MPL_backtrace_show+0x26) > [0x7f6076063f2c] > /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x41dc24) > [0x7f6075fc5c24] > /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x49cc51) > [0x7f6076044c51] > /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x49f799) > [0x7f6076047799] > /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x451e18) > [0x7f6075ff9e18] > /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x452272) > [0x7f6075ffa272] > /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x2ce836) > [0x7f6075e76836] > /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x2ce90d) > [0x7f6075e7690d] > /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x48137b) > [0x7f607602937b] > /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x44d471) > [0x7f6075ff5471] > /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x407acd) > [0x7f6075fafacd] > /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(MPIR_Err_return_comm+0x10a) > [0x7f6075fafbea] > /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(MPI_Iprobe+0x312) > [0x7f6075ddd542] > /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpifort.so.12(pmpi_iprobe+0x2f) > [0x7f606e08f19f] > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(__zmumps_load_MOD_zmumps_load_recv_msgs+0x142) > [0x7f60737b194d] > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_try_recvtreat_+0x34) > [0x7f60738ab735] > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(__zmumps_fac_par_m_MOD_zmumps_fac_par+0x991) > [0x7f607378bcc8] > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_fac_par_i_+0x240) > 
[0x7f6073881d36] > Abort(805938831) on node 51 (rank 51 in comm 240): Fatal error in > PMPI_Iprobe: Other MPI error, error stack: > PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, > tag=MPI_ANY_TAG, comm=0xc4000017, flag=0x7ffe20e1402c, > status=0x7ffe20e14260) failed > MPID_Iprobe(244)..............: > progress_test(100)............: > MPIDI_OFI_handle_cq_error(949): OFI poll failed > (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_fac_b_+0x1463) > [0x7f60738831a1] > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_fac_driver_+0x6969) > [0x7f60738446c9] > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_+0x2d83) > [0x7f60738bf9cf] > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_f77_+0x178c) > [0x7f60738c33bc] > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_c+0x8f8) > [0x7f60738baacb] > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(+0x894560) > [0x7f6077297560] > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(MatLUFactorNumeric+0x32e) > [0x7f60773bb1e6] > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(+0xf51665) > [0x7f6077954665] > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(PCSetUp+0x64b) > [0x7f60779c77e0] > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(KSPSetUp+0xfb6) > [0x7f6077ac2d53] > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(+0x10c1c28) > [0x7f6077ac4c28] > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(KSPSolve+0x13) > [0x7f6077ac8070] > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(+0x11249df) > [0x7f6077b279df] > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(SNESSolve+0x10df) > [0x7f6077b676c6] > Abort(1) on node 60: Internal error > Abort(1007265423) on node 65 (rank 65 in comm 240): Fatal error in > PMPI_Iprobe: Other MPI error, error stack: > PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, > tag=MPI_ANY_TAG, comm=0xc4000017, flag=0x7fff4d82827c, > status=0x7fff4d8284b0) failed > MPID_Iprobe(244)..............: > progress_test(100)............: > MPIDI_OFI_handle_cq_error(949): OFI poll failed > (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) > Abort(941205135) on node 32 (rank 32 in comm 240): Fatal error in > PMPI_Iprobe: Other MPI error, error stack: > PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, > tag=MPI_ANY_TAG, 
comm=0xc4000017, flag=0x7fff715ba3fc, > status=0x7fff715ba630) failed > MPID_Iprobe(240)..............: > MPIDI_iprobe_safe(108)........: > MPIDI_iprobe_unsafe(35).......: > MPIDI_OFI_do_iprobe(69).......: > MPIDI_OFI_handle_cq_error(949): OFI poll failed > (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) > Abort(470941839) on node 75 (rank 75 in comm 0): Fatal error in PMPI_Test: > Other MPI error, error stack: > PMPI_Test(188)................: MPI_Test(request=0x7efe31e03014, > flag=0x7ffea65d673c, status=0x7ffea65d6760) failed > MPIR_Test(73).................: > MPIR_Test_state(33)...........: > progress_test(100)............: > MPIDI_OFI_handle_cq_error(949): OFI poll failed > (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) > Abort(805946511) on node 31 (rank 31 in comm 256): Fatal error in > PMPI_Probe: Other MPI error, error stack: > PMPI_Probe(118)...............: MPI_Probe(src=MPI_ANY_SOURCE, tag=7, > comm=0xc4000015, status=0x7fff9538b7a0) failed > MPID_Probe(159)...............: > progress_test(100)............: > MPIDI_OFI_handle_cq_error(949): OFI poll failed > (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) > Abort(1179791) on node 73 (rank 73 in comm 0): Fatal error in PMPI_Test: > Other MPI error, error stack: > PMPI_Test(188)................: MPI_Test(request=0x5b638d4, > flag=0x7ffd755119cc, status=0x7ffd755121b0) failed > MPIR_Test(73).................: > MPIR_Test_state(33)...........: > progress_test(100)............: > MPIDI_OFI_handle_cq_error(949): OFI poll failed > (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) > ``` > > Thank you very much for your time and consideration. > > Best wishes, > Zongze > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From yangzongze at gmail.com Mon May 22 12:02:35 2023 From: yangzongze at gmail.com (Zongze Yang) Date: Tue, 23 May 2023 01:02:35 +0800 Subject: [petsc-users] MPI_Iprobe Error with MUMPS Solver on Multi-Nodes In-Reply-To: References: Message-ID: Thanks! Zongze Matthew Knepley ?2023?5?23? ??00:09??? > On Mon, May 22, 2023 at 11:07?AM Zongze Yang wrote: > >> Hi, >> >> I hope this letter finds you well. I am writing to seek guidance >> regarding an error I encountered while solving a matrix using MUMPS on >> multiple nodes: >> > > Iprobe is buggy on several MPI implementations. PETSc has an option for > shutting it off for this reason. > I do not know how to shut it off inside MUMPS however. I would mail their > mailing list to see. > > Thanks, > > Matt > > >> ```bash >> Abort(1681039) on node 60 (rank 60 in comm 240): Fatal error in >> PMPI_Iprobe: Other MPI error, error stack: >> PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, >> tag=MPI_ANY_TAG, comm=0xc4000026, flag=0x7ffc130f9c4c, >> status=0x7ffc130f9e80) failed >> MPID_Iprobe(240)..............: >> MPIDI_iprobe_safe(108)........: >> MPIDI_iprobe_unsafe(35).......: >> MPIDI_OFI_do_iprobe(69).......: >> MPIDI_OFI_handle_cq_error(949): OFI poll failed >> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >> Assertion failed in file src/mpid/ch4/netmod/ofi/ofi_events.c at line >> 125: 0 >> ``` >> >> The matrix in question has a degree of freedom (dof) of 3.86e+06. 
>> Interestingly, when solving smaller-scale problems, everything functions >> perfectly without any issues. However, when attempting to solve the larger >> matrix on multiple nodes, I encounter the aforementioned error. >> >> The complete error message I received is as follows: >> ```bash >> Abort(1681039) on node 60 (rank 60 in comm 240): Fatal error in >> PMPI_Iprobe: Other MPI error, error stack: >> PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, >> tag=MPI_ANY_TAG, comm=0xc4000026, flag=0x7ffc130f9c4c, >> status=0x7ffc130f9e80) failed >> MPID_Iprobe(240)..............: >> MPIDI_iprobe_safe(108)........: >> MPIDI_iprobe_unsafe(35).......: >> MPIDI_OFI_do_iprobe(69).......: >> MPIDI_OFI_handle_cq_error(949): OFI poll failed >> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >> Assertion failed in file src/mpid/ch4/netmod/ofi/ofi_events.c at line >> 125: 0 >> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(MPL_backtrace_show+0x26) >> [0x7f6076063f2c] >> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x41dc24) >> [0x7f6075fc5c24] >> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x49cc51) >> [0x7f6076044c51] >> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x49f799) >> [0x7f6076047799] >> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x451e18) >> [0x7f6075ff9e18] >> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x452272) >> [0x7f6075ffa272] >> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x2ce836) >> [0x7f6075e76836] >> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x2ce90d) >> [0x7f6075e7690d] >> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x48137b) >> [0x7f607602937b] >> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x44d471) >> [0x7f6075ff5471] >> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x407acd) >> [0x7f6075fafacd] >> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(MPIR_Err_return_comm+0x10a) >> [0x7f6075fafbea] >> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(MPI_Iprobe+0x312) >> [0x7f6075ddd542] >> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpifort.so.12(pmpi_iprobe+0x2f) >> [0x7f606e08f19f] >> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(__zmumps_load_MOD_zmumps_load_recv_msgs+0x142) >> [0x7f60737b194d] >> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_try_recvtreat_+0x34) >> [0x7f60738ab735] >> 
/nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(__zmumps_fac_par_m_MOD_zmumps_fac_par+0x991) >> [0x7f607378bcc8] >> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_fac_par_i_+0x240) >> [0x7f6073881d36] >> Abort(805938831) on node 51 (rank 51 in comm 240): Fatal error in >> PMPI_Iprobe: Other MPI error, error stack: >> PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, >> tag=MPI_ANY_TAG, comm=0xc4000017, flag=0x7ffe20e1402c, >> status=0x7ffe20e14260) failed >> MPID_Iprobe(244)..............: >> progress_test(100)............: >> MPIDI_OFI_handle_cq_error(949): OFI poll failed >> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_fac_b_+0x1463) >> [0x7f60738831a1] >> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_fac_driver_+0x6969) >> [0x7f60738446c9] >> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_+0x2d83) >> [0x7f60738bf9cf] >> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_f77_+0x178c) >> [0x7f60738c33bc] >> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_c+0x8f8) >> [0x7f60738baacb] >> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(+0x894560) >> [0x7f6077297560] >> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(MatLUFactorNumeric+0x32e) >> [0x7f60773bb1e6] >> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(+0xf51665) >> [0x7f6077954665] >> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(PCSetUp+0x64b) >> [0x7f60779c77e0] >> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(KSPSetUp+0xfb6) >> [0x7f6077ac2d53] >> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(+0x10c1c28) >> [0x7f6077ac4c28] >> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(KSPSolve+0x13) >> [0x7f6077ac8070] >> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(+0x11249df) >> [0x7f6077b279df] >> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(SNESSolve+0x10df) >> [0x7f6077b676c6] >> Abort(1) on node 60: Internal error >> Abort(1007265423) on node 65 (rank 65 in comm 240): Fatal error in >> PMPI_Iprobe: Other MPI error, error stack: >> PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, >> tag=MPI_ANY_TAG, comm=0xc4000017, flag=0x7fff4d82827c, >> status=0x7fff4d8284b0) failed 
>> MPID_Iprobe(244)..............: >> progress_test(100)............: >> MPIDI_OFI_handle_cq_error(949): OFI poll failed >> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >> Abort(941205135) on node 32 (rank 32 in comm 240): Fatal error in >> PMPI_Iprobe: Other MPI error, error stack: >> PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, >> tag=MPI_ANY_TAG, comm=0xc4000017, flag=0x7fff715ba3fc, >> status=0x7fff715ba630) failed >> MPID_Iprobe(240)..............: >> MPIDI_iprobe_safe(108)........: >> MPIDI_iprobe_unsafe(35).......: >> MPIDI_OFI_do_iprobe(69).......: >> MPIDI_OFI_handle_cq_error(949): OFI poll failed >> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >> Abort(470941839) on node 75 (rank 75 in comm 0): Fatal error in >> PMPI_Test: Other MPI error, error stack: >> PMPI_Test(188)................: MPI_Test(request=0x7efe31e03014, >> flag=0x7ffea65d673c, status=0x7ffea65d6760) failed >> MPIR_Test(73).................: >> MPIR_Test_state(33)...........: >> progress_test(100)............: >> MPIDI_OFI_handle_cq_error(949): OFI poll failed >> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >> Abort(805946511) on node 31 (rank 31 in comm 256): Fatal error in >> PMPI_Probe: Other MPI error, error stack: >> PMPI_Probe(118)...............: MPI_Probe(src=MPI_ANY_SOURCE, tag=7, >> comm=0xc4000015, status=0x7fff9538b7a0) failed >> MPID_Probe(159)...............: >> progress_test(100)............: >> MPIDI_OFI_handle_cq_error(949): OFI poll failed >> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >> Abort(1179791) on node 73 (rank 73 in comm 0): Fatal error in PMPI_Test: >> Other MPI error, error stack: >> PMPI_Test(188)................: MPI_Test(request=0x5b638d4, >> flag=0x7ffd755119cc, status=0x7ffd755121b0) failed >> MPIR_Test(73).................: >> MPIR_Test_state(33)...........: >> progress_test(100)............: >> MPIDI_OFI_handle_cq_error(949): OFI poll failed >> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >> ``` >> >> Thank you very much for your time and consideration. >> >> Best wishes, >> Zongze >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- Best wishes, Zongze -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefano.zampini at gmail.com Mon May 22 16:31:32 2023 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Tue, 23 May 2023 00:31:32 +0300 Subject: [petsc-users] MPI_Iprobe Error with MUMPS Solver on Multi-Nodes In-Reply-To: References: Message-ID: If I may add to the discussion, it may be that you are going OOM since you are trying to factorize a 3 million dofs problem, this problem goes undetected and then fails at a later stage Il giorno lun 22 mag 2023 alle ore 20:03 Zongze Yang ha scritto: > Thanks! > > Zongze > > Matthew Knepley ?2023?5?23? ??00:09??? > >> On Mon, May 22, 2023 at 11:07?AM Zongze Yang >> wrote: >> >>> Hi, >>> >>> I hope this letter finds you well. I am writing to seek guidance >>> regarding an error I encountered while solving a matrix using MUMPS on >>> multiple nodes: >>> >> >> Iprobe is buggy on several MPI implementations. PETSc has an option for >> shutting it off for this reason. >> I do not know how to shut it off inside MUMPS however. I would mail their >> mailing list to see. 
>> >> Thanks, >> >> Matt >> >> >>> ```bash >>> Abort(1681039) on node 60 (rank 60 in comm 240): Fatal error in >>> PMPI_Iprobe: Other MPI error, error stack: >>> PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, >>> tag=MPI_ANY_TAG, comm=0xc4000026, flag=0x7ffc130f9c4c, >>> status=0x7ffc130f9e80) failed >>> MPID_Iprobe(240)..............: >>> MPIDI_iprobe_safe(108)........: >>> MPIDI_iprobe_unsafe(35).......: >>> MPIDI_OFI_do_iprobe(69).......: >>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>> Assertion failed in file src/mpid/ch4/netmod/ofi/ofi_events.c at line >>> 125: 0 >>> ``` >>> >>> The matrix in question has a degree of freedom (dof) of 3.86e+06. >>> Interestingly, when solving smaller-scale problems, everything functions >>> perfectly without any issues. However, when attempting to solve the larger >>> matrix on multiple nodes, I encounter the aforementioned error. >>> >>> The complete error message I received is as follows: >>> ```bash >>> Abort(1681039) on node 60 (rank 60 in comm 240): Fatal error in >>> PMPI_Iprobe: Other MPI error, error stack: >>> PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, >>> tag=MPI_ANY_TAG, comm=0xc4000026, flag=0x7ffc130f9c4c, >>> status=0x7ffc130f9e80) failed >>> MPID_Iprobe(240)..............: >>> MPIDI_iprobe_safe(108)........: >>> MPIDI_iprobe_unsafe(35).......: >>> MPIDI_OFI_do_iprobe(69).......: >>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>> Assertion failed in file src/mpid/ch4/netmod/ofi/ofi_events.c at line >>> 125: 0 >>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(MPL_backtrace_show+0x26) >>> [0x7f6076063f2c] >>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x41dc24) >>> [0x7f6075fc5c24] >>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x49cc51) >>> [0x7f6076044c51] >>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x49f799) >>> [0x7f6076047799] >>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x451e18) >>> [0x7f6075ff9e18] >>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x452272) >>> [0x7f6075ffa272] >>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x2ce836) >>> [0x7f6075e76836] >>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x2ce90d) >>> [0x7f6075e7690d] >>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x48137b) >>> [0x7f607602937b] >>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x44d471) >>> [0x7f6075ff5471] >>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x407acd) >>> [0x7f6075fafacd] >>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(MPIR_Err_return_comm+0x10a) >>> 
[0x7f6075fafbea] >>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(MPI_Iprobe+0x312) >>> [0x7f6075ddd542] >>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpifort.so.12(pmpi_iprobe+0x2f) >>> [0x7f606e08f19f] >>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(__zmumps_load_MOD_zmumps_load_recv_msgs+0x142) >>> [0x7f60737b194d] >>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_try_recvtreat_+0x34) >>> [0x7f60738ab735] >>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(__zmumps_fac_par_m_MOD_zmumps_fac_par+0x991) >>> [0x7f607378bcc8] >>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_fac_par_i_+0x240) >>> [0x7f6073881d36] >>> Abort(805938831) on node 51 (rank 51 in comm 240): Fatal error in >>> PMPI_Iprobe: Other MPI error, error stack: >>> PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, >>> tag=MPI_ANY_TAG, comm=0xc4000017, flag=0x7ffe20e1402c, >>> status=0x7ffe20e14260) failed >>> MPID_Iprobe(244)..............: >>> progress_test(100)............: >>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_fac_b_+0x1463) >>> [0x7f60738831a1] >>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_fac_driver_+0x6969) >>> [0x7f60738446c9] >>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_+0x2d83) >>> [0x7f60738bf9cf] >>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_f77_+0x178c) >>> [0x7f60738c33bc] >>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_c+0x8f8) >>> [0x7f60738baacb] >>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(+0x894560) >>> [0x7f6077297560] >>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(MatLUFactorNumeric+0x32e) >>> [0x7f60773bb1e6] >>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(+0xf51665) >>> [0x7f6077954665] >>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(PCSetUp+0x64b) >>> [0x7f60779c77e0] >>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(KSPSetUp+0xfb6) >>> [0x7f6077ac2d53] >>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(+0x10c1c28) >>> [0x7f6077ac4c28] >>> 
/nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(KSPSolve+0x13) >>> [0x7f6077ac8070] >>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(+0x11249df) >>> [0x7f6077b279df] >>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(SNESSolve+0x10df) >>> [0x7f6077b676c6] >>> Abort(1) on node 60: Internal error >>> Abort(1007265423) on node 65 (rank 65 in comm 240): Fatal error in >>> PMPI_Iprobe: Other MPI error, error stack: >>> PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, >>> tag=MPI_ANY_TAG, comm=0xc4000017, flag=0x7fff4d82827c, >>> status=0x7fff4d8284b0) failed >>> MPID_Iprobe(244)..............: >>> progress_test(100)............: >>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>> Abort(941205135) on node 32 (rank 32 in comm 240): Fatal error in >>> PMPI_Iprobe: Other MPI error, error stack: >>> PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, >>> tag=MPI_ANY_TAG, comm=0xc4000017, flag=0x7fff715ba3fc, >>> status=0x7fff715ba630) failed >>> MPID_Iprobe(240)..............: >>> MPIDI_iprobe_safe(108)........: >>> MPIDI_iprobe_unsafe(35).......: >>> MPIDI_OFI_do_iprobe(69).......: >>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>> Abort(470941839) on node 75 (rank 75 in comm 0): Fatal error in >>> PMPI_Test: Other MPI error, error stack: >>> PMPI_Test(188)................: MPI_Test(request=0x7efe31e03014, >>> flag=0x7ffea65d673c, status=0x7ffea65d6760) failed >>> MPIR_Test(73).................: >>> MPIR_Test_state(33)...........: >>> progress_test(100)............: >>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>> Abort(805946511) on node 31 (rank 31 in comm 256): Fatal error in >>> PMPI_Probe: Other MPI error, error stack: >>> PMPI_Probe(118)...............: MPI_Probe(src=MPI_ANY_SOURCE, tag=7, >>> comm=0xc4000015, status=0x7fff9538b7a0) failed >>> MPID_Probe(159)...............: >>> progress_test(100)............: >>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>> Abort(1179791) on node 73 (rank 73 in comm 0): Fatal error in PMPI_Test: >>> Other MPI error, error stack: >>> PMPI_Test(188)................: MPI_Test(request=0x5b638d4, >>> flag=0x7ffd755119cc, status=0x7ffd755121b0) failed >>> MPIR_Test(73).................: >>> MPIR_Test_state(33)...........: >>> progress_test(100)............: >>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>> ``` >>> >>> Thank you very much for your time and consideration. >>> >>> Best wishes, >>> Zongze >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- > Best wishes, > Zongze > -- Stefano -------------- next part -------------- An HTML attachment was scrubbed... 
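Stefano's out-of-memory hypothesis can usually be checked by making MUMPS report its own memory estimates before the factorization dies. Below is a hedged sketch using PETSc's generic MUMPS option interface, where -mat_mumps_icntl_<n> maps to MUMPS ICNTL(n); the executable name and the chosen values are placeholders, and the precise meaning of each ICNTL should be confirmed against the MUMPS manual:

```bash
# Hedged sketch (executable name and values are illustrative, not from the thread):
#   -mat_mumps_icntl_4 2    raise the MUMPS print level so it reports its
#                           estimated and effective working memory (INFOG data)
#   -mat_mumps_icntl_14 50  allow 50% more working space than the analysis estimate
#   -mat_mumps_icntl_23 <MB> (optional) cap the working memory per process in MB
mpiexec -n 90 ./app -pc_type lu -pc_factor_mat_solver_type mumps \
  -mat_mumps_icntl_4 2 -mat_mumps_icntl_14 50
```

If the reported estimates already approach the memory available per node, the MPI failures are more plausibly a symptom of memory pressure than of an Iprobe bug.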
URL: From yangzongze at gmail.com Mon May 22 21:41:51 2023 From: yangzongze at gmail.com (Zongze Yang) Date: Tue, 23 May 2023 10:41:51 +0800 Subject: [petsc-users] MPI_Iprobe Error with MUMPS Solver on Multi-Nodes In-Reply-To: References: Message-ID: On Tue, 23 May 2023 at 05:31, Stefano Zampini wrote: > If I may add to the discussion, it may be that you are going OOM since you > are trying to factorize a 3 million dofs problem, this problem goes > undetected and then fails at a later stage > Thank you for your comment. I ran the problem with 90 processes distributed across three nodes, each equipped with 500G of memory. If this amount of memory is sufficient for solving the matrix with approximately 3 million degrees of freedom? Thanks! Zongze Il giorno lun 22 mag 2023 alle ore 20:03 Zongze Yang > ha scritto: > >> Thanks! >> >> Zongze >> >> Matthew Knepley ?2023?5?23? ??00:09??? >> >>> On Mon, May 22, 2023 at 11:07?AM Zongze Yang >>> wrote: >>> >>>> Hi, >>>> >>>> I hope this letter finds you well. I am writing to seek guidance >>>> regarding an error I encountered while solving a matrix using MUMPS on >>>> multiple nodes: >>>> >>> >>> Iprobe is buggy on several MPI implementations. PETSc has an option for >>> shutting it off for this reason. >>> I do not know how to shut it off inside MUMPS however. I would mail >>> their mailing list to see. >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> ```bash >>>> Abort(1681039) on node 60 (rank 60 in comm 240): Fatal error in >>>> PMPI_Iprobe: Other MPI error, error stack: >>>> PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, >>>> tag=MPI_ANY_TAG, comm=0xc4000026, flag=0x7ffc130f9c4c, >>>> status=0x7ffc130f9e80) failed >>>> MPID_Iprobe(240)..............: >>>> MPIDI_iprobe_safe(108)........: >>>> MPIDI_iprobe_unsafe(35).......: >>>> MPIDI_OFI_do_iprobe(69).......: >>>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>>> Assertion failed in file src/mpid/ch4/netmod/ofi/ofi_events.c at line >>>> 125: 0 >>>> ``` >>>> >>>> The matrix in question has a degree of freedom (dof) of 3.86e+06. >>>> Interestingly, when solving smaller-scale problems, everything functions >>>> perfectly without any issues. However, when attempting to solve the larger >>>> matrix on multiple nodes, I encounter the aforementioned error. 
>>>> >>>> The complete error message I received is as follows: >>>> ```bash >>>> Abort(1681039) on node 60 (rank 60 in comm 240): Fatal error in >>>> PMPI_Iprobe: Other MPI error, error stack: >>>> PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, >>>> tag=MPI_ANY_TAG, comm=0xc4000026, flag=0x7ffc130f9c4c, >>>> status=0x7ffc130f9e80) failed >>>> MPID_Iprobe(240)..............: >>>> MPIDI_iprobe_safe(108)........: >>>> MPIDI_iprobe_unsafe(35).......: >>>> MPIDI_OFI_do_iprobe(69).......: >>>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>>> Assertion failed in file src/mpid/ch4/netmod/ofi/ofi_events.c at line >>>> 125: 0 >>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(MPL_backtrace_show+0x26) >>>> [0x7f6076063f2c] >>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x41dc24) >>>> [0x7f6075fc5c24] >>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x49cc51) >>>> [0x7f6076044c51] >>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x49f799) >>>> [0x7f6076047799] >>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x451e18) >>>> [0x7f6075ff9e18] >>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x452272) >>>> [0x7f6075ffa272] >>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x2ce836) >>>> [0x7f6075e76836] >>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x2ce90d) >>>> [0x7f6075e7690d] >>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x48137b) >>>> [0x7f607602937b] >>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x44d471) >>>> [0x7f6075ff5471] >>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x407acd) >>>> [0x7f6075fafacd] >>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(MPIR_Err_return_comm+0x10a) >>>> [0x7f6075fafbea] >>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(MPI_Iprobe+0x312) >>>> [0x7f6075ddd542] >>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpifort.so.12(pmpi_iprobe+0x2f) >>>> [0x7f606e08f19f] >>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(__zmumps_load_MOD_zmumps_load_recv_msgs+0x142) >>>> [0x7f60737b194d] >>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_try_recvtreat_+0x34) >>>> [0x7f60738ab735] >>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(__zmumps_fac_par_m_MOD_zmumps_fac_par+0x991) >>>> [0x7f607378bcc8] >>>> 
/nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_fac_par_i_+0x240) >>>> [0x7f6073881d36] >>>> Abort(805938831) on node 51 (rank 51 in comm 240): Fatal error in >>>> PMPI_Iprobe: Other MPI error, error stack: >>>> PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, >>>> tag=MPI_ANY_TAG, comm=0xc4000017, flag=0x7ffe20e1402c, >>>> status=0x7ffe20e14260) failed >>>> MPID_Iprobe(244)..............: >>>> progress_test(100)............: >>>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_fac_b_+0x1463) >>>> [0x7f60738831a1] >>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_fac_driver_+0x6969) >>>> [0x7f60738446c9] >>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_+0x2d83) >>>> [0x7f60738bf9cf] >>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_f77_+0x178c) >>>> [0x7f60738c33bc] >>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_c+0x8f8) >>>> [0x7f60738baacb] >>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(+0x894560) >>>> [0x7f6077297560] >>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(MatLUFactorNumeric+0x32e) >>>> [0x7f60773bb1e6] >>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(+0xf51665) >>>> [0x7f6077954665] >>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(PCSetUp+0x64b) >>>> [0x7f60779c77e0] >>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(KSPSetUp+0xfb6) >>>> [0x7f6077ac2d53] >>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(+0x10c1c28) >>>> [0x7f6077ac4c28] >>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(KSPSolve+0x13) >>>> [0x7f6077ac8070] >>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(+0x11249df) >>>> [0x7f6077b279df] >>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(SNESSolve+0x10df) >>>> [0x7f6077b676c6] >>>> Abort(1) on node 60: Internal error >>>> Abort(1007265423) on node 65 (rank 65 in comm 240): Fatal error in >>>> PMPI_Iprobe: Other MPI error, error stack: >>>> PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, >>>> tag=MPI_ANY_TAG, comm=0xc4000017, flag=0x7fff4d82827c, >>>> status=0x7fff4d8284b0) failed >>>> MPID_Iprobe(244)..............: >>>> progress_test(100)............: >>>> 
MPIDI_OFI_handle_cq_error(949): OFI poll failed >>>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>>> Abort(941205135) on node 32 (rank 32 in comm 240): Fatal error in >>>> PMPI_Iprobe: Other MPI error, error stack: >>>> PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, >>>> tag=MPI_ANY_TAG, comm=0xc4000017, flag=0x7fff715ba3fc, >>>> status=0x7fff715ba630) failed >>>> MPID_Iprobe(240)..............: >>>> MPIDI_iprobe_safe(108)........: >>>> MPIDI_iprobe_unsafe(35).......: >>>> MPIDI_OFI_do_iprobe(69).......: >>>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>>> Abort(470941839) on node 75 (rank 75 in comm 0): Fatal error in >>>> PMPI_Test: Other MPI error, error stack: >>>> PMPI_Test(188)................: MPI_Test(request=0x7efe31e03014, >>>> flag=0x7ffea65d673c, status=0x7ffea65d6760) failed >>>> MPIR_Test(73).................: >>>> MPIR_Test_state(33)...........: >>>> progress_test(100)............: >>>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>>> Abort(805946511) on node 31 (rank 31 in comm 256): Fatal error in >>>> PMPI_Probe: Other MPI error, error stack: >>>> PMPI_Probe(118)...............: MPI_Probe(src=MPI_ANY_SOURCE, tag=7, >>>> comm=0xc4000015, status=0x7fff9538b7a0) failed >>>> MPID_Probe(159)...............: >>>> progress_test(100)............: >>>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>>> Abort(1179791) on node 73 (rank 73 in comm 0): Fatal error in >>>> PMPI_Test: Other MPI error, error stack: >>>> PMPI_Test(188)................: MPI_Test(request=0x5b638d4, >>>> flag=0x7ffd755119cc, status=0x7ffd755121b0) failed >>>> MPIR_Test(73).................: >>>> MPIR_Test_state(33)...........: >>>> progress_test(100)............: >>>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>>> ``` >>>> >>>> Thank you very much for your time and consideration. >>>> >>>> Best wishes, >>>> Zongze >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> -- >> Best wishes, >> Zongze >> > > > -- > Stefano > -------------- next part -------------- An HTML attachment was scrubbed... URL: From yangzongze at gmail.com Mon May 22 21:45:55 2023 From: yangzongze at gmail.com (Zongze Yang) Date: Tue, 23 May 2023 10:45:55 +0800 Subject: [petsc-users] MPI_Iprobe Error with MUMPS Solver on Multi-Nodes In-Reply-To: References: Message-ID: I have an additional question to ask: Is it possible for the SuperLU_DIST library to encounter the same MPI problem (PMPI_Iprobe failed) as MUMPS? Best wishes, Zongze On Tue, 23 May 2023 at 10:41, Zongze Yang wrote: > On Tue, 23 May 2023 at 05:31, Stefano Zampini > wrote: > >> If I may add to the discussion, it may be that you are going OOM since >> you are trying to factorize a 3 million dofs problem, this problem goes >> undetected and then fails at a later stage >> > > Thank you for your comment. I ran the problem with 90 processes > distributed across three nodes, each equipped with 500G of memory. If this > amount of memory is sufficient for solving the matrix with approximately 3 > million degrees of freedom? > > Thanks! 
> Zongze > > Il giorno lun 22 mag 2023 alle ore 20:03 Zongze Yang >> ha scritto: >> >>> Thanks! >>> >>> Zongze >>> >>> Matthew Knepley ?2023?5?23? ??00:09??? >>> >>>> On Mon, May 22, 2023 at 11:07?AM Zongze Yang >>>> wrote: >>>> >>>>> Hi, >>>>> >>>>> I hope this letter finds you well. I am writing to seek guidance >>>>> regarding an error I encountered while solving a matrix using MUMPS on >>>>> multiple nodes: >>>>> >>>> >>>> Iprobe is buggy on several MPI implementations. PETSc has an option for >>>> shutting it off for this reason. >>>> I do not know how to shut it off inside MUMPS however. I would mail >>>> their mailing list to see. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> ```bash >>>>> Abort(1681039) on node 60 (rank 60 in comm 240): Fatal error in >>>>> PMPI_Iprobe: Other MPI error, error stack: >>>>> PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, >>>>> tag=MPI_ANY_TAG, comm=0xc4000026, flag=0x7ffc130f9c4c, >>>>> status=0x7ffc130f9e80) failed >>>>> MPID_Iprobe(240)..............: >>>>> MPIDI_iprobe_safe(108)........: >>>>> MPIDI_iprobe_unsafe(35).......: >>>>> MPIDI_OFI_do_iprobe(69).......: >>>>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>>>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>>>> Assertion failed in file src/mpid/ch4/netmod/ofi/ofi_events.c at line >>>>> 125: 0 >>>>> ``` >>>>> >>>>> The matrix in question has a degree of freedom (dof) of 3.86e+06. >>>>> Interestingly, when solving smaller-scale problems, everything functions >>>>> perfectly without any issues. However, when attempting to solve the larger >>>>> matrix on multiple nodes, I encounter the aforementioned error. >>>>> >>>>> The complete error message I received is as follows: >>>>> ```bash >>>>> Abort(1681039) on node 60 (rank 60 in comm 240): Fatal error in >>>>> PMPI_Iprobe: Other MPI error, error stack: >>>>> PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, >>>>> tag=MPI_ANY_TAG, comm=0xc4000026, flag=0x7ffc130f9c4c, >>>>> status=0x7ffc130f9e80) failed >>>>> MPID_Iprobe(240)..............: >>>>> MPIDI_iprobe_safe(108)........: >>>>> MPIDI_iprobe_unsafe(35).......: >>>>> MPIDI_OFI_do_iprobe(69).......: >>>>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>>>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>>>> Assertion failed in file src/mpid/ch4/netmod/ofi/ofi_events.c at line >>>>> 125: 0 >>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(MPL_backtrace_show+0x26) >>>>> [0x7f6076063f2c] >>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x41dc24) >>>>> [0x7f6075fc5c24] >>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x49cc51) >>>>> [0x7f6076044c51] >>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x49f799) >>>>> [0x7f6076047799] >>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x451e18) >>>>> [0x7f6075ff9e18] >>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x452272) >>>>> [0x7f6075ffa272] >>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x2ce836) >>>>> [0x7f6075e76836] >>>>> 
/nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x2ce90d) >>>>> [0x7f6075e7690d] >>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x48137b) >>>>> [0x7f607602937b] >>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x44d471) >>>>> [0x7f6075ff5471] >>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x407acd) >>>>> [0x7f6075fafacd] >>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(MPIR_Err_return_comm+0x10a) >>>>> [0x7f6075fafbea] >>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(MPI_Iprobe+0x312) >>>>> [0x7f6075ddd542] >>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpifort.so.12(pmpi_iprobe+0x2f) >>>>> [0x7f606e08f19f] >>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(__zmumps_load_MOD_zmumps_load_recv_msgs+0x142) >>>>> [0x7f60737b194d] >>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_try_recvtreat_+0x34) >>>>> [0x7f60738ab735] >>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(__zmumps_fac_par_m_MOD_zmumps_fac_par+0x991) >>>>> [0x7f607378bcc8] >>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_fac_par_i_+0x240) >>>>> [0x7f6073881d36] >>>>> Abort(805938831) on node 51 (rank 51 in comm 240): Fatal error in >>>>> PMPI_Iprobe: Other MPI error, error stack: >>>>> PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, >>>>> tag=MPI_ANY_TAG, comm=0xc4000017, flag=0x7ffe20e1402c, >>>>> status=0x7ffe20e14260) failed >>>>> MPID_Iprobe(244)..............: >>>>> progress_test(100)............: >>>>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>>>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_fac_b_+0x1463) >>>>> [0x7f60738831a1] >>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_fac_driver_+0x6969) >>>>> [0x7f60738446c9] >>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_+0x2d83) >>>>> [0x7f60738bf9cf] >>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_f77_+0x178c) >>>>> [0x7f60738c33bc] >>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_c+0x8f8) >>>>> [0x7f60738baacb] >>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(+0x894560) >>>>> [0x7f6077297560] >>>>> 
/nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(MatLUFactorNumeric+0x32e) >>>>> [0x7f60773bb1e6] >>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(+0xf51665) >>>>> [0x7f6077954665] >>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(PCSetUp+0x64b) >>>>> [0x7f60779c77e0] >>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(KSPSetUp+0xfb6) >>>>> [0x7f6077ac2d53] >>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(+0x10c1c28) >>>>> [0x7f6077ac4c28] >>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(KSPSolve+0x13) >>>>> [0x7f6077ac8070] >>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(+0x11249df) >>>>> [0x7f6077b279df] >>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(SNESSolve+0x10df) >>>>> [0x7f6077b676c6] >>>>> Abort(1) on node 60: Internal error >>>>> Abort(1007265423) on node 65 (rank 65 in comm 240): Fatal error in >>>>> PMPI_Iprobe: Other MPI error, error stack: >>>>> PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, >>>>> tag=MPI_ANY_TAG, comm=0xc4000017, flag=0x7fff4d82827c, >>>>> status=0x7fff4d8284b0) failed >>>>> MPID_Iprobe(244)..............: >>>>> progress_test(100)............: >>>>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>>>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>>>> Abort(941205135) on node 32 (rank 32 in comm 240): Fatal error in >>>>> PMPI_Iprobe: Other MPI error, error stack: >>>>> PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, >>>>> tag=MPI_ANY_TAG, comm=0xc4000017, flag=0x7fff715ba3fc, >>>>> status=0x7fff715ba630) failed >>>>> MPID_Iprobe(240)..............: >>>>> MPIDI_iprobe_safe(108)........: >>>>> MPIDI_iprobe_unsafe(35).......: >>>>> MPIDI_OFI_do_iprobe(69).......: >>>>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>>>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>>>> Abort(470941839) on node 75 (rank 75 in comm 0): Fatal error in >>>>> PMPI_Test: Other MPI error, error stack: >>>>> PMPI_Test(188)................: MPI_Test(request=0x7efe31e03014, >>>>> flag=0x7ffea65d673c, status=0x7ffea65d6760) failed >>>>> MPIR_Test(73).................: >>>>> MPIR_Test_state(33)...........: >>>>> progress_test(100)............: >>>>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>>>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>>>> Abort(805946511) on node 31 (rank 31 in comm 256): Fatal error in >>>>> PMPI_Probe: Other MPI error, error stack: >>>>> PMPI_Probe(118)...............: MPI_Probe(src=MPI_ANY_SOURCE, tag=7, >>>>> comm=0xc4000015, status=0x7fff9538b7a0) failed >>>>> MPID_Probe(159)...............: >>>>> progress_test(100)............: >>>>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>>>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>>>> Abort(1179791) on node 73 (rank 73 in comm 0): Fatal error in >>>>> PMPI_Test: Other MPI 
error, error stack: >>>>> PMPI_Test(188)................: MPI_Test(request=0x5b638d4, >>>>> flag=0x7ffd755119cc, status=0x7ffd755121b0) failed >>>>> MPIR_Test(73).................: >>>>> MPIR_Test_state(33)...........: >>>>> progress_test(100)............: >>>>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>>>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>>>> ``` >>>>> >>>>> Thank you very much for your time and consideration. >>>>> >>>>> Best wishes, >>>>> Zongze >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> -- >>> Best wishes, >>> Zongze >>> >> >> >> -- >> Stefano >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue May 23 04:59:36 2023 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 23 May 2023 05:59:36 -0400 Subject: [petsc-users] MPI_Iprobe Error with MUMPS Solver on Multi-Nodes In-Reply-To: References: Message-ID: On Mon, May 22, 2023 at 10:42?PM Zongze Yang wrote: > On Tue, 23 May 2023 at 05:31, Stefano Zampini > wrote: > >> If I may add to the discussion, it may be that you are going OOM since >> you are trying to factorize a 3 million dofs problem, this problem goes >> undetected and then fails at a later stage >> > > Thank you for your comment. I ran the problem with 90 processes > distributed across three nodes, each equipped with 500G of memory. If this > amount of memory is sufficient for solving the matrix with approximately 3 > million degrees of freedom? > It really depends on the fill. Suppose that you get 1% fill, then (3e6)^2 * 0.01 * 8 = 1e12 B and you have 1.5e12 B, so I could easily see running out of memory. Thanks, Matt > Thanks! > Zongze > > Il giorno lun 22 mag 2023 alle ore 20:03 Zongze Yang >> ha scritto: >> >>> Thanks! >>> >>> Zongze >>> >>> Matthew Knepley ?2023?5?23? ??00:09??? >>> >>>> On Mon, May 22, 2023 at 11:07?AM Zongze Yang >>>> wrote: >>>> >>>>> Hi, >>>>> >>>>> I hope this letter finds you well. I am writing to seek guidance >>>>> regarding an error I encountered while solving a matrix using MUMPS on >>>>> multiple nodes: >>>>> >>>> >>>> Iprobe is buggy on several MPI implementations. PETSc has an option for >>>> shutting it off for this reason. >>>> I do not know how to shut it off inside MUMPS however. I would mail >>>> their mailing list to see. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> ```bash >>>>> Abort(1681039) on node 60 (rank 60 in comm 240): Fatal error in >>>>> PMPI_Iprobe: Other MPI error, error stack: >>>>> PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, >>>>> tag=MPI_ANY_TAG, comm=0xc4000026, flag=0x7ffc130f9c4c, >>>>> status=0x7ffc130f9e80) failed >>>>> MPID_Iprobe(240)..............: >>>>> MPIDI_iprobe_safe(108)........: >>>>> MPIDI_iprobe_unsafe(35).......: >>>>> MPIDI_OFI_do_iprobe(69).......: >>>>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>>>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>>>> Assertion failed in file src/mpid/ch4/netmod/ofi/ofi_events.c at line >>>>> 125: 0 >>>>> ``` >>>>> >>>>> The matrix in question has a degree of freedom (dof) of 3.86e+06. >>>>> Interestingly, when solving smaller-scale problems, everything functions >>>>> perfectly without any issues. 
However, when attempting to solve the larger >>>>> matrix on multiple nodes, I encounter the aforementioned error. >>>>> >>>>> The complete error message I received is as follows: >>>>> ```bash >>>>> Abort(1681039) on node 60 (rank 60 in comm 240): Fatal error in >>>>> PMPI_Iprobe: Other MPI error, error stack: >>>>> PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, >>>>> tag=MPI_ANY_TAG, comm=0xc4000026, flag=0x7ffc130f9c4c, >>>>> status=0x7ffc130f9e80) failed >>>>> MPID_Iprobe(240)..............: >>>>> MPIDI_iprobe_safe(108)........: >>>>> MPIDI_iprobe_unsafe(35).......: >>>>> MPIDI_OFI_do_iprobe(69).......: >>>>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>>>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>>>> Assertion failed in file src/mpid/ch4/netmod/ofi/ofi_events.c at line >>>>> 125: 0 >>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(MPL_backtrace_show+0x26) >>>>> [0x7f6076063f2c] >>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x41dc24) >>>>> [0x7f6075fc5c24] >>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x49cc51) >>>>> [0x7f6076044c51] >>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x49f799) >>>>> [0x7f6076047799] >>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x451e18) >>>>> [0x7f6075ff9e18] >>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x452272) >>>>> [0x7f6075ffa272] >>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x2ce836) >>>>> [0x7f6075e76836] >>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x2ce90d) >>>>> [0x7f6075e7690d] >>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x48137b) >>>>> [0x7f607602937b] >>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x44d471) >>>>> [0x7f6075ff5471] >>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x407acd) >>>>> [0x7f6075fafacd] >>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(MPIR_Err_return_comm+0x10a) >>>>> [0x7f6075fafbea] >>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(MPI_Iprobe+0x312) >>>>> [0x7f6075ddd542] >>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpifort.so.12(pmpi_iprobe+0x2f) >>>>> [0x7f606e08f19f] >>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(__zmumps_load_MOD_zmumps_load_recv_msgs+0x142) >>>>> [0x7f60737b194d] >>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_try_recvtreat_+0x34) >>>>> [0x7f60738ab735] >>>>> 
/nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(__zmumps_fac_par_m_MOD_zmumps_fac_par+0x991) >>>>> [0x7f607378bcc8] >>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_fac_par_i_+0x240) >>>>> [0x7f6073881d36] >>>>> Abort(805938831) on node 51 (rank 51 in comm 240): Fatal error in >>>>> PMPI_Iprobe: Other MPI error, error stack: >>>>> PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, >>>>> tag=MPI_ANY_TAG, comm=0xc4000017, flag=0x7ffe20e1402c, >>>>> status=0x7ffe20e14260) failed >>>>> MPID_Iprobe(244)..............: >>>>> progress_test(100)............: >>>>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>>>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_fac_b_+0x1463) >>>>> [0x7f60738831a1] >>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_fac_driver_+0x6969) >>>>> [0x7f60738446c9] >>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_+0x2d83) >>>>> [0x7f60738bf9cf] >>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_f77_+0x178c) >>>>> [0x7f60738c33bc] >>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_c+0x8f8) >>>>> [0x7f60738baacb] >>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(+0x894560) >>>>> [0x7f6077297560] >>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(MatLUFactorNumeric+0x32e) >>>>> [0x7f60773bb1e6] >>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(+0xf51665) >>>>> [0x7f6077954665] >>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(PCSetUp+0x64b) >>>>> [0x7f60779c77e0] >>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(KSPSetUp+0xfb6) >>>>> [0x7f6077ac2d53] >>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(+0x10c1c28) >>>>> [0x7f6077ac4c28] >>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(KSPSolve+0x13) >>>>> [0x7f6077ac8070] >>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(+0x11249df) >>>>> [0x7f6077b279df] >>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(SNESSolve+0x10df) >>>>> [0x7f6077b676c6] >>>>> Abort(1) on node 60: Internal error >>>>> Abort(1007265423) on node 65 (rank 65 in comm 240): Fatal error in >>>>> PMPI_Iprobe: Other MPI error, error stack: >>>>> 
PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, >>>>> tag=MPI_ANY_TAG, comm=0xc4000017, flag=0x7fff4d82827c, >>>>> status=0x7fff4d8284b0) failed >>>>> MPID_Iprobe(244)..............: >>>>> progress_test(100)............: >>>>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>>>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>>>> Abort(941205135) on node 32 (rank 32 in comm 240): Fatal error in >>>>> PMPI_Iprobe: Other MPI error, error stack: >>>>> PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, >>>>> tag=MPI_ANY_TAG, comm=0xc4000017, flag=0x7fff715ba3fc, >>>>> status=0x7fff715ba630) failed >>>>> MPID_Iprobe(240)..............: >>>>> MPIDI_iprobe_safe(108)........: >>>>> MPIDI_iprobe_unsafe(35).......: >>>>> MPIDI_OFI_do_iprobe(69).......: >>>>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>>>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>>>> Abort(470941839) on node 75 (rank 75 in comm 0): Fatal error in >>>>> PMPI_Test: Other MPI error, error stack: >>>>> PMPI_Test(188)................: MPI_Test(request=0x7efe31e03014, >>>>> flag=0x7ffea65d673c, status=0x7ffea65d6760) failed >>>>> MPIR_Test(73).................: >>>>> MPIR_Test_state(33)...........: >>>>> progress_test(100)............: >>>>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>>>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>>>> Abort(805946511) on node 31 (rank 31 in comm 256): Fatal error in >>>>> PMPI_Probe: Other MPI error, error stack: >>>>> PMPI_Probe(118)...............: MPI_Probe(src=MPI_ANY_SOURCE, tag=7, >>>>> comm=0xc4000015, status=0x7fff9538b7a0) failed >>>>> MPID_Probe(159)...............: >>>>> progress_test(100)............: >>>>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>>>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>>>> Abort(1179791) on node 73 (rank 73 in comm 0): Fatal error in >>>>> PMPI_Test: Other MPI error, error stack: >>>>> PMPI_Test(188)................: MPI_Test(request=0x5b638d4, >>>>> flag=0x7ffd755119cc, status=0x7ffd755121b0) failed >>>>> MPIR_Test(73).................: >>>>> MPIR_Test_state(33)...........: >>>>> progress_test(100)............: >>>>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>>>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>>>> ``` >>>>> >>>>> Thank you very much for your time and consideration. >>>>> >>>>> Best wishes, >>>>> Zongze >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> -- >>> Best wishes, >>> Zongze >>> >> >> >> -- >> Stefano >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue May 23 05:00:07 2023 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 23 May 2023 06:00:07 -0400 Subject: [petsc-users] MPI_Iprobe Error with MUMPS Solver on Multi-Nodes In-Reply-To: References: Message-ID: On Mon, May 22, 2023 at 10:46?PM Zongze Yang wrote: > I have an additional question to ask: Is it possible for the SuperLU_DIST > library to encounter the same MPI problem (PMPI_Iprobe failed) as MUMPS? 
> I do not know if they use that function. But it is easy to try it out, so I would. Thanks, Matt > Best wishes, > Zongze > > > On Tue, 23 May 2023 at 10:41, Zongze Yang wrote: > >> On Tue, 23 May 2023 at 05:31, Stefano Zampini >> wrote: >> >>> If I may add to the discussion, it may be that you are going OOM since >>> you are trying to factorize a 3 million dofs problem, this problem goes >>> undetected and then fails at a later stage >>> >> >> Thank you for your comment. I ran the problem with 90 processes >> distributed across three nodes, each equipped with 500G of memory. If this >> amount of memory is sufficient for solving the matrix with approximately 3 >> million degrees of freedom? >> >> Thanks! >> Zongze >> >> Il giorno lun 22 mag 2023 alle ore 20:03 Zongze Yang < >>> yangzongze at gmail.com> ha scritto: >>> >>>> Thanks! >>>> >>>> Zongze >>>> >>>> Matthew Knepley ?2023?5?23? ??00:09??? >>>> >>>>> On Mon, May 22, 2023 at 11:07?AM Zongze Yang >>>>> wrote: >>>>> >>>>>> Hi, >>>>>> >>>>>> I hope this letter finds you well. I am writing to seek guidance >>>>>> regarding an error I encountered while solving a matrix using MUMPS on >>>>>> multiple nodes: >>>>>> >>>>> >>>>> Iprobe is buggy on several MPI implementations. PETSc has an option >>>>> for shutting it off for this reason. >>>>> I do not know how to shut it off inside MUMPS however. I would mail >>>>> their mailing list to see. >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> ```bash >>>>>> Abort(1681039) on node 60 (rank 60 in comm 240): Fatal error in >>>>>> PMPI_Iprobe: Other MPI error, error stack: >>>>>> PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, >>>>>> tag=MPI_ANY_TAG, comm=0xc4000026, flag=0x7ffc130f9c4c, >>>>>> status=0x7ffc130f9e80) failed >>>>>> MPID_Iprobe(240)..............: >>>>>> MPIDI_iprobe_safe(108)........: >>>>>> MPIDI_iprobe_unsafe(35).......: >>>>>> MPIDI_OFI_do_iprobe(69).......: >>>>>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>>>>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>>>>> Assertion failed in file src/mpid/ch4/netmod/ofi/ofi_events.c at line >>>>>> 125: 0 >>>>>> ``` >>>>>> >>>>>> The matrix in question has a degree of freedom (dof) of 3.86e+06. >>>>>> Interestingly, when solving smaller-scale problems, everything functions >>>>>> perfectly without any issues. However, when attempting to solve the larger >>>>>> matrix on multiple nodes, I encounter the aforementioned error. 
>>>>>> >>>>>> The complete error message I received is as follows: >>>>>> ```bash >>>>>> Abort(1681039) on node 60 (rank 60 in comm 240): Fatal error in >>>>>> PMPI_Iprobe: Other MPI error, error stack: >>>>>> PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, >>>>>> tag=MPI_ANY_TAG, comm=0xc4000026, flag=0x7ffc130f9c4c, >>>>>> status=0x7ffc130f9e80) failed >>>>>> MPID_Iprobe(240)..............: >>>>>> MPIDI_iprobe_safe(108)........: >>>>>> MPIDI_iprobe_unsafe(35).......: >>>>>> MPIDI_OFI_do_iprobe(69).......: >>>>>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>>>>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>>>>> Assertion failed in file src/mpid/ch4/netmod/ofi/ofi_events.c at line >>>>>> 125: 0 >>>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(MPL_backtrace_show+0x26) >>>>>> [0x7f6076063f2c] >>>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x41dc24) >>>>>> [0x7f6075fc5c24] >>>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x49cc51) >>>>>> [0x7f6076044c51] >>>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x49f799) >>>>>> [0x7f6076047799] >>>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x451e18) >>>>>> [0x7f6075ff9e18] >>>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x452272) >>>>>> [0x7f6075ffa272] >>>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x2ce836) >>>>>> [0x7f6075e76836] >>>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x2ce90d) >>>>>> [0x7f6075e7690d] >>>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x48137b) >>>>>> [0x7f607602937b] >>>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x44d471) >>>>>> [0x7f6075ff5471] >>>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x407acd) >>>>>> [0x7f6075fafacd] >>>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(MPIR_Err_return_comm+0x10a) >>>>>> [0x7f6075fafbea] >>>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(MPI_Iprobe+0x312) >>>>>> [0x7f6075ddd542] >>>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpifort.so.12(pmpi_iprobe+0x2f) >>>>>> [0x7f606e08f19f] >>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(__zmumps_load_MOD_zmumps_load_recv_msgs+0x142) >>>>>> [0x7f60737b194d] >>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_try_recvtreat_+0x34) >>>>>> [0x7f60738ab735] >>>>>> 
/nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(__zmumps_fac_par_m_MOD_zmumps_fac_par+0x991) >>>>>> [0x7f607378bcc8] >>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_fac_par_i_+0x240) >>>>>> [0x7f6073881d36] >>>>>> Abort(805938831) on node 51 (rank 51 in comm 240): Fatal error in >>>>>> PMPI_Iprobe: Other MPI error, error stack: >>>>>> PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, >>>>>> tag=MPI_ANY_TAG, comm=0xc4000017, flag=0x7ffe20e1402c, >>>>>> status=0x7ffe20e14260) failed >>>>>> MPID_Iprobe(244)..............: >>>>>> progress_test(100)............: >>>>>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>>>>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_fac_b_+0x1463) >>>>>> [0x7f60738831a1] >>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_fac_driver_+0x6969) >>>>>> [0x7f60738446c9] >>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_+0x2d83) >>>>>> [0x7f60738bf9cf] >>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_f77_+0x178c) >>>>>> [0x7f60738c33bc] >>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_c+0x8f8) >>>>>> [0x7f60738baacb] >>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(+0x894560) >>>>>> [0x7f6077297560] >>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(MatLUFactorNumeric+0x32e) >>>>>> [0x7f60773bb1e6] >>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(+0xf51665) >>>>>> [0x7f6077954665] >>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(PCSetUp+0x64b) >>>>>> [0x7f60779c77e0] >>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(KSPSetUp+0xfb6) >>>>>> [0x7f6077ac2d53] >>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(+0x10c1c28) >>>>>> [0x7f6077ac4c28] >>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(KSPSolve+0x13) >>>>>> [0x7f6077ac8070] >>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(+0x11249df) >>>>>> [0x7f6077b279df] >>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(SNESSolve+0x10df) >>>>>> [0x7f6077b676c6] >>>>>> Abort(1) on node 60: Internal error >>>>>> Abort(1007265423) on node 65 (rank 65 in comm 240): Fatal error in >>>>>> PMPI_Iprobe: Other MPI 
error, error stack: >>>>>> PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, >>>>>> tag=MPI_ANY_TAG, comm=0xc4000017, flag=0x7fff4d82827c, >>>>>> status=0x7fff4d8284b0) failed >>>>>> MPID_Iprobe(244)..............: >>>>>> progress_test(100)............: >>>>>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>>>>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>>>>> Abort(941205135) on node 32 (rank 32 in comm 240): Fatal error in >>>>>> PMPI_Iprobe: Other MPI error, error stack: >>>>>> PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, >>>>>> tag=MPI_ANY_TAG, comm=0xc4000017, flag=0x7fff715ba3fc, >>>>>> status=0x7fff715ba630) failed >>>>>> MPID_Iprobe(240)..............: >>>>>> MPIDI_iprobe_safe(108)........: >>>>>> MPIDI_iprobe_unsafe(35).......: >>>>>> MPIDI_OFI_do_iprobe(69).......: >>>>>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>>>>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>>>>> Abort(470941839) on node 75 (rank 75 in comm 0): Fatal error in >>>>>> PMPI_Test: Other MPI error, error stack: >>>>>> PMPI_Test(188)................: MPI_Test(request=0x7efe31e03014, >>>>>> flag=0x7ffea65d673c, status=0x7ffea65d6760) failed >>>>>> MPIR_Test(73).................: >>>>>> MPIR_Test_state(33)...........: >>>>>> progress_test(100)............: >>>>>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>>>>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>>>>> Abort(805946511) on node 31 (rank 31 in comm 256): Fatal error in >>>>>> PMPI_Probe: Other MPI error, error stack: >>>>>> PMPI_Probe(118)...............: MPI_Probe(src=MPI_ANY_SOURCE, tag=7, >>>>>> comm=0xc4000015, status=0x7fff9538b7a0) failed >>>>>> MPID_Probe(159)...............: >>>>>> progress_test(100)............: >>>>>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>>>>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>>>>> Abort(1179791) on node 73 (rank 73 in comm 0): Fatal error in >>>>>> PMPI_Test: Other MPI error, error stack: >>>>>> PMPI_Test(188)................: MPI_Test(request=0x5b638d4, >>>>>> flag=0x7ffd755119cc, status=0x7ffd755121b0) failed >>>>>> MPIR_Test(73).................: >>>>>> MPIR_Test_state(33)...........: >>>>>> progress_test(100)............: >>>>>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>>>>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>>>>> ``` >>>>>> >>>>>> Thank you very much for your time and consideration. >>>>>> >>>>>> Best wishes, >>>>>> Zongze >>>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>> -- >>>> Best wishes, >>>> Zongze >>>> >>> >>> >>> -- >>> Stefano >>> >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From yangzongze at gmail.com Tue May 23 06:51:10 2023 From: yangzongze at gmail.com (Zongze Yang) Date: Tue, 23 May 2023 19:51:10 +0800 Subject: [petsc-users] MPI_Iprobe Error with MUMPS Solver on Multi-Nodes In-Reply-To: References: Message-ID: Thank you for your suggestion. I solved the problem with SuperLU_DIST, and it works well. 
Best wishes, Zongze On Tue, 23 May 2023 at 18:00, Matthew Knepley wrote: > On Mon, May 22, 2023 at 10:46?PM Zongze Yang wrote: > >> I have an additional question to ask: Is it possible for the SuperLU_DIST >> library to encounter the same MPI problem (PMPI_Iprobe failed) as MUMPS? >> > > I do not know if they use that function. But it is easy to try it out, so > I would. > > Thanks, > > Matt > > >> Best wishes, >> Zongze >> >> >> On Tue, 23 May 2023 at 10:41, Zongze Yang wrote: >> >>> On Tue, 23 May 2023 at 05:31, Stefano Zampini >>> wrote: >>> >>>> If I may add to the discussion, it may be that you are going OOM since >>>> you are trying to factorize a 3 million dofs problem, this problem goes >>>> undetected and then fails at a later stage >>>> >>> >>> Thank you for your comment. I ran the problem with 90 processes >>> distributed across three nodes, each equipped with 500G of memory. If this >>> amount of memory is sufficient for solving the matrix with approximately 3 >>> million degrees of freedom? >>> >>> Thanks! >>> Zongze >>> >>> Il giorno lun 22 mag 2023 alle ore 20:03 Zongze Yang < >>>> yangzongze at gmail.com> ha scritto: >>>> >>>>> Thanks! >>>>> >>>>> Zongze >>>>> >>>>> Matthew Knepley ?2023?5?23? ??00:09??? >>>>> >>>>>> On Mon, May 22, 2023 at 11:07?AM Zongze Yang >>>>>> wrote: >>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> I hope this letter finds you well. I am writing to seek guidance >>>>>>> regarding an error I encountered while solving a matrix using MUMPS on >>>>>>> multiple nodes: >>>>>>> >>>>>> >>>>>> Iprobe is buggy on several MPI implementations. PETSc has an option >>>>>> for shutting it off for this reason. >>>>>> I do not know how to shut it off inside MUMPS however. I would mail >>>>>> their mailing list to see. >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>>> ```bash >>>>>>> Abort(1681039) on node 60 (rank 60 in comm 240): Fatal error in >>>>>>> PMPI_Iprobe: Other MPI error, error stack: >>>>>>> PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, >>>>>>> tag=MPI_ANY_TAG, comm=0xc4000026, flag=0x7ffc130f9c4c, >>>>>>> status=0x7ffc130f9e80) failed >>>>>>> MPID_Iprobe(240)..............: >>>>>>> MPIDI_iprobe_safe(108)........: >>>>>>> MPIDI_iprobe_unsafe(35).......: >>>>>>> MPIDI_OFI_do_iprobe(69).......: >>>>>>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>>>>>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>>>>>> Assertion failed in file src/mpid/ch4/netmod/ofi/ofi_events.c at >>>>>>> line 125: 0 >>>>>>> ``` >>>>>>> >>>>>>> The matrix in question has a degree of freedom (dof) of 3.86e+06. >>>>>>> Interestingly, when solving smaller-scale problems, everything functions >>>>>>> perfectly without any issues. However, when attempting to solve the larger >>>>>>> matrix on multiple nodes, I encounter the aforementioned error. 
>>>>>>> >>>>>>> The complete error message I received is as follows: >>>>>>> ```bash >>>>>>> Abort(1681039) on node 60 (rank 60 in comm 240): Fatal error in >>>>>>> PMPI_Iprobe: Other MPI error, error stack: >>>>>>> PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, >>>>>>> tag=MPI_ANY_TAG, comm=0xc4000026, flag=0x7ffc130f9c4c, >>>>>>> status=0x7ffc130f9e80) failed >>>>>>> MPID_Iprobe(240)..............: >>>>>>> MPIDI_iprobe_safe(108)........: >>>>>>> MPIDI_iprobe_unsafe(35).......: >>>>>>> MPIDI_OFI_do_iprobe(69).......: >>>>>>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>>>>>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>>>>>> Assertion failed in file src/mpid/ch4/netmod/ofi/ofi_events.c at >>>>>>> line 125: 0 >>>>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(MPL_backtrace_show+0x26) >>>>>>> [0x7f6076063f2c] >>>>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x41dc24) >>>>>>> [0x7f6075fc5c24] >>>>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x49cc51) >>>>>>> [0x7f6076044c51] >>>>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x49f799) >>>>>>> [0x7f6076047799] >>>>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x451e18) >>>>>>> [0x7f6075ff9e18] >>>>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x452272) >>>>>>> [0x7f6075ffa272] >>>>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x2ce836) >>>>>>> [0x7f6075e76836] >>>>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x2ce90d) >>>>>>> [0x7f6075e7690d] >>>>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x48137b) >>>>>>> [0x7f607602937b] >>>>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x44d471) >>>>>>> [0x7f6075ff5471] >>>>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x407acd) >>>>>>> [0x7f6075fafacd] >>>>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(MPIR_Err_return_comm+0x10a) >>>>>>> [0x7f6075fafbea] >>>>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(MPI_Iprobe+0x312) >>>>>>> [0x7f6075ddd542] >>>>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpifort.so.12(pmpi_iprobe+0x2f) >>>>>>> [0x7f606e08f19f] >>>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(__zmumps_load_MOD_zmumps_load_recv_msgs+0x142) >>>>>>> [0x7f60737b194d] >>>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_try_recvtreat_+0x34) >>>>>>> [0x7f60738ab735] >>>>>>> 
/nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(__zmumps_fac_par_m_MOD_zmumps_fac_par+0x991) >>>>>>> [0x7f607378bcc8] >>>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_fac_par_i_+0x240) >>>>>>> [0x7f6073881d36] >>>>>>> Abort(805938831) on node 51 (rank 51 in comm 240): Fatal error in >>>>>>> PMPI_Iprobe: Other MPI error, error stack: >>>>>>> PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, >>>>>>> tag=MPI_ANY_TAG, comm=0xc4000017, flag=0x7ffe20e1402c, >>>>>>> status=0x7ffe20e14260) failed >>>>>>> MPID_Iprobe(244)..............: >>>>>>> progress_test(100)............: >>>>>>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>>>>>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_fac_b_+0x1463) >>>>>>> [0x7f60738831a1] >>>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_fac_driver_+0x6969) >>>>>>> [0x7f60738446c9] >>>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_+0x2d83) >>>>>>> [0x7f60738bf9cf] >>>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_f77_+0x178c) >>>>>>> [0x7f60738c33bc] >>>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_c+0x8f8) >>>>>>> [0x7f60738baacb] >>>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(+0x894560) >>>>>>> [0x7f6077297560] >>>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(MatLUFactorNumeric+0x32e) >>>>>>> [0x7f60773bb1e6] >>>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(+0xf51665) >>>>>>> [0x7f6077954665] >>>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(PCSetUp+0x64b) >>>>>>> [0x7f60779c77e0] >>>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(KSPSetUp+0xfb6) >>>>>>> [0x7f6077ac2d53] >>>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(+0x10c1c28) >>>>>>> [0x7f6077ac4c28] >>>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(KSPSolve+0x13) >>>>>>> [0x7f6077ac8070] >>>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(+0x11249df) >>>>>>> [0x7f6077b279df] >>>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(SNESSolve+0x10df) >>>>>>> [0x7f6077b676c6] >>>>>>> Abort(1) on node 60: Internal error >>>>>>> Abort(1007265423) on node 65 (rank 65 in comm 240): Fatal 
error in >>>>>>> PMPI_Iprobe: Other MPI error, error stack: >>>>>>> PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, >>>>>>> tag=MPI_ANY_TAG, comm=0xc4000017, flag=0x7fff4d82827c, >>>>>>> status=0x7fff4d8284b0) failed >>>>>>> MPID_Iprobe(244)..............: >>>>>>> progress_test(100)............: >>>>>>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>>>>>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>>>>>> Abort(941205135) on node 32 (rank 32 in comm 240): Fatal error in >>>>>>> PMPI_Iprobe: Other MPI error, error stack: >>>>>>> PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, >>>>>>> tag=MPI_ANY_TAG, comm=0xc4000017, flag=0x7fff715ba3fc, >>>>>>> status=0x7fff715ba630) failed >>>>>>> MPID_Iprobe(240)..............: >>>>>>> MPIDI_iprobe_safe(108)........: >>>>>>> MPIDI_iprobe_unsafe(35).......: >>>>>>> MPIDI_OFI_do_iprobe(69).......: >>>>>>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>>>>>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>>>>>> Abort(470941839) on node 75 (rank 75 in comm 0): Fatal error in >>>>>>> PMPI_Test: Other MPI error, error stack: >>>>>>> PMPI_Test(188)................: MPI_Test(request=0x7efe31e03014, >>>>>>> flag=0x7ffea65d673c, status=0x7ffea65d6760) failed >>>>>>> MPIR_Test(73).................: >>>>>>> MPIR_Test_state(33)...........: >>>>>>> progress_test(100)............: >>>>>>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>>>>>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>>>>>> Abort(805946511) on node 31 (rank 31 in comm 256): Fatal error in >>>>>>> PMPI_Probe: Other MPI error, error stack: >>>>>>> PMPI_Probe(118)...............: MPI_Probe(src=MPI_ANY_SOURCE, tag=7, >>>>>>> comm=0xc4000015, status=0x7fff9538b7a0) failed >>>>>>> MPID_Probe(159)...............: >>>>>>> progress_test(100)............: >>>>>>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>>>>>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>>>>>> Abort(1179791) on node 73 (rank 73 in comm 0): Fatal error in >>>>>>> PMPI_Test: Other MPI error, error stack: >>>>>>> PMPI_Test(188)................: MPI_Test(request=0x5b638d4, >>>>>>> flag=0x7ffd755119cc, status=0x7ffd755121b0) failed >>>>>>> MPIR_Test(73).................: >>>>>>> MPIR_Test_state(33)...........: >>>>>>> progress_test(100)............: >>>>>>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>>>>>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>>>>>> ``` >>>>>>> >>>>>>> Thank you very much for your time and consideration. >>>>>>> >>>>>>> Best wishes, >>>>>>> Zongze >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>> >>>>>> >>>>> -- >>>>> Best wishes, >>>>> Zongze >>>>> >>>> >>>> >>>> -- >>>> Stefano >>>> >>> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... 
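A quick way to confirm which package actually performed the factorization, and to get coarse timing and memory numbers for the run, is PETSc's built-in reporting (a sketch; the report layout varies with the PETSc version, and the executable name and process count are placeholders):

```bash
# -ksp_view prints the solver configuration, including the package used for the LU factorization;
# -log_view appends a performance and memory summary at the end of the run.
mpiexec -n 90 ./my_app -ksp_view -log_view
```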
URL: From yangzongze at gmail.com Tue May 23 06:59:40 2023 From: yangzongze at gmail.com (Zongze Yang) Date: Tue, 23 May 2023 19:59:40 +0800 Subject: [petsc-users] MPI_Iprobe Error with MUMPS Solver on Multi-Nodes In-Reply-To: References: Message-ID: On Tue, 23 May 2023 at 19:51, Zongze Yang wrote: > Thank you for your suggestion. I solved the problem with SuperLU_DIST, and > it works well. > This is solved with four nodes, each equipped with 500G of memory. Best wishes, Zongze Best wishes, > Zongze > > > On Tue, 23 May 2023 at 18:00, Matthew Knepley wrote: > >> On Mon, May 22, 2023 at 10:46?PM Zongze Yang >> wrote: >> >>> I have an additional question to ask: Is it possible for the >>> SuperLU_DIST library to encounter the same MPI problem (PMPI_Iprobe failed) >>> as MUMPS? >>> >> >> I do not know if they use that function. But it is easy to try it out, so >> I would. >> >> Thanks, >> >> Matt >> >> >>> Best wishes, >>> Zongze >>> >>> >>> On Tue, 23 May 2023 at 10:41, Zongze Yang wrote: >>> >>>> On Tue, 23 May 2023 at 05:31, Stefano Zampini < >>>> stefano.zampini at gmail.com> wrote: >>>> >>>>> If I may add to the discussion, it may be that you are going OOM since >>>>> you are trying to factorize a 3 million dofs problem, this problem goes >>>>> undetected and then fails at a later stage >>>>> >>>> >>>> Thank you for your comment. I ran the problem with 90 processes >>>> distributed across three nodes, each equipped with 500G of memory. If this >>>> amount of memory is sufficient for solving the matrix with approximately 3 >>>> million degrees of freedom? >>>> >>>> Thanks! >>>> Zongze >>>> >>>> Il giorno lun 22 mag 2023 alle ore 20:03 Zongze Yang < >>>>> yangzongze at gmail.com> ha scritto: >>>>> >>>>>> Thanks! >>>>>> >>>>>> Zongze >>>>>> >>>>>> Matthew Knepley ?2023?5?23? ??00:09??? >>>>>> >>>>>>> On Mon, May 22, 2023 at 11:07?AM Zongze Yang >>>>>>> wrote: >>>>>>> >>>>>>>> Hi, >>>>>>>> >>>>>>>> I hope this letter finds you well. I am writing to seek guidance >>>>>>>> regarding an error I encountered while solving a matrix using MUMPS on >>>>>>>> multiple nodes: >>>>>>>> >>>>>>> >>>>>>> Iprobe is buggy on several MPI implementations. PETSc has an option >>>>>>> for shutting it off for this reason. >>>>>>> I do not know how to shut it off inside MUMPS however. I would mail >>>>>>> their mailing list to see. >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Matt >>>>>>> >>>>>>> >>>>>>>> ```bash >>>>>>>> Abort(1681039) on node 60 (rank 60 in comm 240): Fatal error in >>>>>>>> PMPI_Iprobe: Other MPI error, error stack: >>>>>>>> PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, >>>>>>>> tag=MPI_ANY_TAG, comm=0xc4000026, flag=0x7ffc130f9c4c, >>>>>>>> status=0x7ffc130f9e80) failed >>>>>>>> MPID_Iprobe(240)..............: >>>>>>>> MPIDI_iprobe_safe(108)........: >>>>>>>> MPIDI_iprobe_unsafe(35).......: >>>>>>>> MPIDI_OFI_do_iprobe(69).......: >>>>>>>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>>>>>>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>>>>>>> Assertion failed in file src/mpid/ch4/netmod/ofi/ofi_events.c at >>>>>>>> line 125: 0 >>>>>>>> ``` >>>>>>>> >>>>>>>> The matrix in question has a degree of freedom (dof) of 3.86e+06. >>>>>>>> Interestingly, when solving smaller-scale problems, everything functions >>>>>>>> perfectly without any issues. However, when attempting to solve the larger >>>>>>>> matrix on multiple nodes, I encounter the aforementioned error. 
>>>>>>>> >>>>>>>> The complete error message I received is as follows: >>>>>>>> ```bash >>>>>>>> Abort(1681039) on node 60 (rank 60 in comm 240): Fatal error in >>>>>>>> PMPI_Iprobe: Other MPI error, error stack: >>>>>>>> PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, >>>>>>>> tag=MPI_ANY_TAG, comm=0xc4000026, flag=0x7ffc130f9c4c, >>>>>>>> status=0x7ffc130f9e80) failed >>>>>>>> MPID_Iprobe(240)..............: >>>>>>>> MPIDI_iprobe_safe(108)........: >>>>>>>> MPIDI_iprobe_unsafe(35).......: >>>>>>>> MPIDI_OFI_do_iprobe(69).......: >>>>>>>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>>>>>>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>>>>>>> Assertion failed in file src/mpid/ch4/netmod/ofi/ofi_events.c at >>>>>>>> line 125: 0 >>>>>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(MPL_backtrace_show+0x26) >>>>>>>> [0x7f6076063f2c] >>>>>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x41dc24) >>>>>>>> [0x7f6075fc5c24] >>>>>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x49cc51) >>>>>>>> [0x7f6076044c51] >>>>>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x49f799) >>>>>>>> [0x7f6076047799] >>>>>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x451e18) >>>>>>>> [0x7f6075ff9e18] >>>>>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x452272) >>>>>>>> [0x7f6075ffa272] >>>>>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x2ce836) >>>>>>>> [0x7f6075e76836] >>>>>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x2ce90d) >>>>>>>> [0x7f6075e7690d] >>>>>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x48137b) >>>>>>>> [0x7f607602937b] >>>>>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x44d471) >>>>>>>> [0x7f6075ff5471] >>>>>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x407acd) >>>>>>>> [0x7f6075fafacd] >>>>>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(MPIR_Err_return_comm+0x10a) >>>>>>>> [0x7f6075fafbea] >>>>>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(MPI_Iprobe+0x312) >>>>>>>> [0x7f6075ddd542] >>>>>>>> /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpifort.so.12(pmpi_iprobe+0x2f) >>>>>>>> [0x7f606e08f19f] >>>>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(__zmumps_load_MOD_zmumps_load_recv_msgs+0x142) >>>>>>>> [0x7f60737b194d] >>>>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_try_recvtreat_+0x34) >>>>>>>> [0x7f60738ab735] >>>>>>>> 
/nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(__zmumps_fac_par_m_MOD_zmumps_fac_par+0x991) >>>>>>>> [0x7f607378bcc8] >>>>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_fac_par_i_+0x240) >>>>>>>> [0x7f6073881d36] >>>>>>>> Abort(805938831) on node 51 (rank 51 in comm 240): Fatal error in >>>>>>>> PMPI_Iprobe: Other MPI error, error stack: >>>>>>>> PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, >>>>>>>> tag=MPI_ANY_TAG, comm=0xc4000017, flag=0x7ffe20e1402c, >>>>>>>> status=0x7ffe20e14260) failed >>>>>>>> MPID_Iprobe(244)..............: >>>>>>>> progress_test(100)............: >>>>>>>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>>>>>>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>>>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_fac_b_+0x1463) >>>>>>>> [0x7f60738831a1] >>>>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_fac_driver_+0x6969) >>>>>>>> [0x7f60738446c9] >>>>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_+0x2d83) >>>>>>>> [0x7f60738bf9cf] >>>>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_f77_+0x178c) >>>>>>>> [0x7f60738c33bc] >>>>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_c+0x8f8) >>>>>>>> [0x7f60738baacb] >>>>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(+0x894560) >>>>>>>> [0x7f6077297560] >>>>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(MatLUFactorNumeric+0x32e) >>>>>>>> [0x7f60773bb1e6] >>>>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(+0xf51665) >>>>>>>> [0x7f6077954665] >>>>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(PCSetUp+0x64b) >>>>>>>> [0x7f60779c77e0] >>>>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(KSPSetUp+0xfb6) >>>>>>>> [0x7f6077ac2d53] >>>>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(+0x10c1c28) >>>>>>>> [0x7f6077ac4c28] >>>>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(KSPSolve+0x13) >>>>>>>> [0x7f6077ac8070] >>>>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(+0x11249df) >>>>>>>> [0x7f6077b279df] >>>>>>>> /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(SNESSolve+0x10df) >>>>>>>> [0x7f6077b676c6] >>>>>>>> Abort(1) on node 60: Internal error >>>>>>>> Abort(1007265423) 
on node 65 (rank 65 in comm 240): Fatal error in >>>>>>>> PMPI_Iprobe: Other MPI error, error stack: >>>>>>>> PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, >>>>>>>> tag=MPI_ANY_TAG, comm=0xc4000017, flag=0x7fff4d82827c, >>>>>>>> status=0x7fff4d8284b0) failed >>>>>>>> MPID_Iprobe(244)..............: >>>>>>>> progress_test(100)............: >>>>>>>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>>>>>>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>>>>>>> Abort(941205135) on node 32 (rank 32 in comm 240): Fatal error in >>>>>>>> PMPI_Iprobe: Other MPI error, error stack: >>>>>>>> PMPI_Iprobe(124)..............: MPI_Iprobe(src=MPI_ANY_SOURCE, >>>>>>>> tag=MPI_ANY_TAG, comm=0xc4000017, flag=0x7fff715ba3fc, >>>>>>>> status=0x7fff715ba630) failed >>>>>>>> MPID_Iprobe(240)..............: >>>>>>>> MPIDI_iprobe_safe(108)........: >>>>>>>> MPIDI_iprobe_unsafe(35).......: >>>>>>>> MPIDI_OFI_do_iprobe(69).......: >>>>>>>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>>>>>>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>>>>>>> Abort(470941839) on node 75 (rank 75 in comm 0): Fatal error in >>>>>>>> PMPI_Test: Other MPI error, error stack: >>>>>>>> PMPI_Test(188)................: MPI_Test(request=0x7efe31e03014, >>>>>>>> flag=0x7ffea65d673c, status=0x7ffea65d6760) failed >>>>>>>> MPIR_Test(73).................: >>>>>>>> MPIR_Test_state(33)...........: >>>>>>>> progress_test(100)............: >>>>>>>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>>>>>>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>>>>>>> Abort(805946511) on node 31 (rank 31 in comm 256): Fatal error in >>>>>>>> PMPI_Probe: Other MPI error, error stack: >>>>>>>> PMPI_Probe(118)...............: MPI_Probe(src=MPI_ANY_SOURCE, >>>>>>>> tag=7, comm=0xc4000015, status=0x7fff9538b7a0) failed >>>>>>>> MPID_Probe(159)...............: >>>>>>>> progress_test(100)............: >>>>>>>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>>>>>>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>>>>>>> Abort(1179791) on node 73 (rank 73 in comm 0): Fatal error in >>>>>>>> PMPI_Test: Other MPI error, error stack: >>>>>>>> PMPI_Test(188)................: MPI_Test(request=0x5b638d4, >>>>>>>> flag=0x7ffd755119cc, status=0x7ffd755121b0) failed >>>>>>>> MPIR_Test(73).................: >>>>>>>> MPIR_Test_state(33)...........: >>>>>>>> progress_test(100)............: >>>>>>>> MPIDI_OFI_handle_cq_error(949): OFI poll failed >>>>>>>> (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) >>>>>>>> ``` >>>>>>>> >>>>>>>> Thank you very much for your time and consideration. >>>>>>>> >>>>>>>> Best wishes, >>>>>>>> Zongze >>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their >>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>> experiments lead. >>>>>>> -- Norbert Wiener >>>>>>> >>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>> >>>>>>> >>>>>> -- >>>>>> Best wishes, >>>>>> Zongze >>>>>> >>>>> >>>>> >>>>> -- >>>>> Stefano >>>>> >>>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
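For anyone who wants to stay with MUMPS despite the memory pressure suggested earlier in the thread, MUMPS's memory controls are reachable through PETSc's generic -mat_mumps_icntl_<n> options. A sketch with illustrative values only: ICNTL(14) relaxes the predicted workspace by a percentage, ICNTL(23) caps the working memory per process in MB, and ICNTL(22)=1 switches the factorization to out-of-core (the executable name, process count, and the numeric values are assumptions, not taken from the thread):

```bash
# Illustrative values only -- tune to the machine (nodes with 500G of memory were mentioned above).
mpiexec -n 90 ./my_app \
    -pc_type lu -pc_factor_mat_solver_type mumps \
    -mat_mumps_icntl_14 40 \
    -mat_mumps_icntl_23 16000 \
    -mat_mumps_icntl_22 1
```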
URL: From yann.jobic at univ-amu.fr Tue May 23 07:08:54 2023 From: yann.jobic at univ-amu.fr (Yann Jobic) Date: Tue, 23 May 2023 14:08:54 +0200 Subject: [petsc-users] MPI_Iprobe Error with MUMPS Solver on Multi-Nodes In-Reply-To: References: Message-ID: <9baaef75-0572-4491-da2c-979c10789b56@univ-amu.fr> If i may, you can use the command line option "-mat_mumps_icntl_4 2" MUMPS then gives infomations about the factorization step, such as the estimated needed memory. Best regards, Yann Le 5/23/2023 ? 11:59 AM, Matthew Knepley a ?crit?: > On Mon, May 22, 2023 at 10:42?PM Zongze Yang > wrote: > > On Tue, 23 May 2023 at 05:31, Stefano Zampini > > wrote: > > If I may add to the discussion, it may be that you are going OOM > since you are trying to factorize a 3 million dofs?problem, this > problem goes undetected and then fails at a later stage > > Thank you for your comment. I ran the problem with 90 processes > distributed across three nodes, each equipped with 500G of memory. > If this amount of memory is sufficient for solving the matrix with > approximately 3 million degrees of freedom? > > > It really depends on the fill. Suppose that you get 1% fill, then > > ? (3e6)^2 * 0.01 * 8 = 1e12 B > > and you have 1.5e12 B, so I could easily see running out of memory. > > ? Thanks, > > ? ? ?Matt > > Thanks! > Zongze > > Il giorno lun 22 mag 2023 alle ore 20:03 Zongze Yang > > ha scritto: > > Thanks! > > Zongze > > Matthew Knepley >?2023?5?23? ??00:09??? > > On Mon, May 22, 2023 at 11:07?AM Zongze Yang > > wrote: > > Hi, > > I hope this letter finds you well. I am writing to > seek guidance regarding an error I encountered while > solving a matrix using MUMPS on multiple nodes: > > > Iprobe is buggy on several MPI implementations. PETSc > has an option for shutting it off for this reason. > I do not know how to shut it off inside MUMPS however. I > would mail their mailing list to see. > > ? Thanks, > > ? ? ?Matt > > ```bash > Abort(1681039) on node 60 (rank 60 in comm 240): > Fatal error in PMPI_Iprobe: Other MPI error, error > stack: > PMPI_Iprobe(124)..............: > MPI_Iprobe(src=MPI_ANY_SOURCE, tag=MPI_ANY_TAG, > comm=0xc4000026, flag=0x7ffc130f9c4c, > status=0x7ffc130f9e80) failed > MPID_Iprobe(240)..............: > MPIDI_iprobe_safe(108)........: > MPIDI_iprobe_unsafe(35).......: > MPIDI_OFI_do_iprobe(69).......: > MPIDI_OFI_handle_cq_error(949): OFI poll failed > (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) > Assertion failed in file > src/mpid/ch4/netmod/ofi/ofi_events.c at line 125: 0 > ``` > > The matrix in question has a degree of freedom (dof) > of 3.86e+06. Interestingly, when solving > smaller-scale problems, everything functions > perfectly without any issues. However, when > attempting to solve the larger matrix on multiple > nodes, I encounter the aforementioned error. 
> > The complete error message I received is as follows: > ```bash > Abort(1681039) on node 60 (rank 60 in comm 240): > Fatal error in PMPI_Iprobe: Other MPI error, error > stack: > PMPI_Iprobe(124)..............: > MPI_Iprobe(src=MPI_ANY_SOURCE, tag=MPI_ANY_TAG, > comm=0xc4000026, flag=0x7ffc130f9c4c, > status=0x7ffc130f9e80) failed > MPID_Iprobe(240)..............: > MPIDI_iprobe_safe(108)........: > MPIDI_iprobe_unsafe(35).......: > MPIDI_OFI_do_iprobe(69).......: > MPIDI_OFI_handle_cq_error(949): OFI poll failed > (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) > Assertion failed in file > src/mpid/ch4/netmod/ofi/ofi_events.c at line 125: 0 > /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(MPL_backtrace_show+0x26) [0x7f6076063f2c] > /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x41dc24) [0x7f6075fc5c24] > /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x49cc51) [0x7f6076044c51] > /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x49f799) [0x7f6076047799] > /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x451e18) [0x7f6075ff9e18] > /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x452272) [0x7f6075ffa272] > /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x2ce836) [0x7f6075e76836] > /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x2ce90d) [0x7f6075e7690d] > /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x48137b) [0x7f607602937b] > /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x44d471) [0x7f6075ff5471] > /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x407acd) [0x7f6075fafacd] > /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(MPIR_Err_return_comm+0x10a) [0x7f6075fafbea] > /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(MPI_Iprobe+0x312) [0x7f6075ddd542] > /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpifort.so.12(pmpi_iprobe+0x2f) [0x7f606e08f19f] > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(__zmumps_load_MOD_zmumps_load_recv_msgs+0x142) [0x7f60737b194d] > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_try_recvtreat_+0x34) [0x7f60738ab735] > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(__zmumps_fac_par_m_MOD_zmumps_fac_par+0x991) [0x7f607378bcc8] > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_fac_par_i_+0x240) [0x7f6073881d36] > 
Abort(805938831) on node 51 (rank 51 in comm 240): > Fatal error in PMPI_Iprobe: Other MPI error, error > stack: > PMPI_Iprobe(124)..............: > MPI_Iprobe(src=MPI_ANY_SOURCE, tag=MPI_ANY_TAG, > comm=0xc4000017, flag=0x7ffe20e1402c, > status=0x7ffe20e14260) failed > MPID_Iprobe(244)..............: > progress_test(100)............: > MPIDI_OFI_handle_cq_error(949): OFI poll failed > (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_fac_b_+0x1463) [0x7f60738831a1] > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_fac_driver_+0x6969) [0x7f60738446c9] > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_+0x2d83) [0x7f60738bf9cf] > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_f77_+0x178c) [0x7f60738c33bc] > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_c+0x8f8) [0x7f60738baacb] > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(+0x894560) [0x7f6077297560] > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(MatLUFactorNumeric+0x32e) [0x7f60773bb1e6] > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(+0xf51665) [0x7f6077954665] > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(PCSetUp+0x64b) [0x7f60779c77e0] > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(KSPSetUp+0xfb6) [0x7f6077ac2d53] > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(+0x10c1c28) [0x7f6077ac4c28] > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(KSPSolve+0x13) [0x7f6077ac8070] > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(+0x11249df) [0x7f6077b279df] > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(SNESSolve+0x10df) [0x7f6077b676c6] > Abort(1) on node 60: Internal error > Abort(1007265423) on node 65 (rank 65 in comm 240): > Fatal error in PMPI_Iprobe: Other MPI error, error > stack: > PMPI_Iprobe(124)..............: > MPI_Iprobe(src=MPI_ANY_SOURCE, tag=MPI_ANY_TAG, > comm=0xc4000017, flag=0x7fff4d82827c, > status=0x7fff4d8284b0) failed > MPID_Iprobe(244)..............: > progress_test(100)............: > MPIDI_OFI_handle_cq_error(949): OFI poll failed > (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) > Abort(941205135) on node 32 (rank 32 in comm 240): > Fatal error in PMPI_Iprobe: Other MPI error, error > stack: > PMPI_Iprobe(124)..............: > MPI_Iprobe(src=MPI_ANY_SOURCE, tag=MPI_ANY_TAG, > comm=0xc4000017, flag=0x7fff715ba3fc, > 
status=0x7fff715ba630) failed > MPID_Iprobe(240)..............: > MPIDI_iprobe_safe(108)........: > MPIDI_iprobe_unsafe(35).......: > MPIDI_OFI_do_iprobe(69).......: > MPIDI_OFI_handle_cq_error(949): OFI poll failed > (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) > Abort(470941839) on node 75 (rank 75 in comm 0): > Fatal error in PMPI_Test: Other MPI error, error stack: > PMPI_Test(188)................: > MPI_Test(request=0x7efe31e03014, > flag=0x7ffea65d673c, status=0x7ffea65d6760) failed > MPIR_Test(73).................: > MPIR_Test_state(33)...........: > progress_test(100)............: > MPIDI_OFI_handle_cq_error(949): OFI poll failed > (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) > Abort(805946511) on node 31 (rank 31 in comm 256): > Fatal error in PMPI_Probe: Other MPI error, error stack: > PMPI_Probe(118)...............: > MPI_Probe(src=MPI_ANY_SOURCE, tag=7, > comm=0xc4000015, status=0x7fff9538b7a0) failed > MPID_Probe(159)...............: > progress_test(100)............: > MPIDI_OFI_handle_cq_error(949): OFI poll failed > (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) > Abort(1179791) on node 73 (rank 73 in comm 0): Fatal > error in PMPI_Test: Other MPI error, error stack: > PMPI_Test(188)................: > MPI_Test(request=0x5b638d4, flag=0x7ffd755119cc, > status=0x7ffd755121b0) failed > MPIR_Test(73).................: > MPIR_Test_state(33)...........: > progress_test(100)............: > MPIDI_OFI_handle_cq_error(949): OFI poll failed > (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) > ``` > > Thank you very much for your time and consideration. > > Best wishes, > Zongze > > > > -- > What most experimenters take for granted before they > begin their experiments is infinitely more interesting > than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > -- > Best wishes, > Zongze > > > > -- > Stefano > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ From yangzongze at gmail.com Tue May 23 07:23:05 2023 From: yangzongze at gmail.com (Zongze Yang) Date: Tue, 23 May 2023 20:23:05 +0800 Subject: [petsc-users] MPI_Iprobe Error with MUMPS Solver on Multi-Nodes In-Reply-To: <9baaef75-0572-4491-da2c-979c10789b56@univ-amu.fr> References: <9baaef75-0572-4491-da2c-979c10789b56@univ-amu.fr> Message-ID: On Tue, 23 May 2023 at 20:09, Yann Jobic wrote: > If i may, you can use the command line option "-mat_mumps_icntl_4 2" > MUMPS then gives infomations about the factorization step, such as the > estimated needed memory. > > Thank you for your suggestion! Best wishes, Zongze Best regards, > > Yann > > Le 5/23/2023 ? 11:59 AM, Matthew Knepley a ?crit : > > On Mon, May 22, 2023 at 10:42?PM Zongze Yang > > wrote: > > > > On Tue, 23 May 2023 at 05:31, Stefano Zampini > > > > wrote: > > > > If I may add to the discussion, it may be that you are going OOM > > since you are trying to factorize a 3 million dofs problem, this > > problem goes undetected and then fails at a later stage > > > > Thank you for your comment. I ran the problem with 90 processes > > distributed across three nodes, each equipped with 500G of memory. > > If this amount of memory is sufficient for solving the matrix with > > approximately 3 million degrees of freedom? > > > > > > It really depends on the fill. 
Suppose that you get 1% fill, then > > > > (3e6)^2 * 0.01 * 8 = 1e12 B > > > > and you have 1.5e12 B, so I could easily see running out of memory. > > > > Thanks, > > > > Matt > > > > Thanks! > > Zongze > > > > Il giorno lun 22 mag 2023 alle ore 20:03 Zongze Yang > > > ha scritto: > > > > Thanks! > > > > Zongze > > > > Matthew Knepley > >?2023?5?23? ??00:09??? > > > > On Mon, May 22, 2023 at 11:07?AM Zongze Yang > > > > wrote: > > > > Hi, > > > > I hope this letter finds you well. I am writing to > > seek guidance regarding an error I encountered while > > solving a matrix using MUMPS on multiple nodes: > > > > > > Iprobe is buggy on several MPI implementations. PETSc > > has an option for shutting it off for this reason. > > I do not know how to shut it off inside MUMPS however. I > > would mail their mailing list to see. > > > > Thanks, > > > > Matt > > > > ```bash > > Abort(1681039) on node 60 (rank 60 in comm 240): > > Fatal error in PMPI_Iprobe: Other MPI error, error > > stack: > > PMPI_Iprobe(124)..............: > > MPI_Iprobe(src=MPI_ANY_SOURCE, tag=MPI_ANY_TAG, > > comm=0xc4000026, flag=0x7ffc130f9c4c, > > status=0x7ffc130f9e80) failed > > MPID_Iprobe(240)..............: > > MPIDI_iprobe_safe(108)........: > > MPIDI_iprobe_unsafe(35).......: > > MPIDI_OFI_do_iprobe(69).......: > > MPIDI_OFI_handle_cq_error(949): OFI poll failed > > > (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) > > Assertion failed in file > > src/mpid/ch4/netmod/ofi/ofi_events.c at line 125: 0 > > ``` > > > > The matrix in question has a degree of freedom (dof) > > of 3.86e+06. Interestingly, when solving > > smaller-scale problems, everything functions > > perfectly without any issues. However, when > > attempting to solve the larger matrix on multiple > > nodes, I encounter the aforementioned error. 
> > > > The complete error message I received is as follows: > > ```bash > > Abort(1681039) on node 60 (rank 60 in comm 240): > > Fatal error in PMPI_Iprobe: Other MPI error, error > > stack: > > PMPI_Iprobe(124)..............: > > MPI_Iprobe(src=MPI_ANY_SOURCE, tag=MPI_ANY_TAG, > > comm=0xc4000026, flag=0x7ffc130f9c4c, > > status=0x7ffc130f9e80) failed > > MPID_Iprobe(240)..............: > > MPIDI_iprobe_safe(108)........: > > MPIDI_iprobe_unsafe(35).......: > > MPIDI_OFI_do_iprobe(69).......: > > MPIDI_OFI_handle_cq_error(949): OFI poll failed > > > (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) > > Assertion failed in file > > src/mpid/ch4/netmod/ofi/ofi_events.c at line 125: 0 > > > /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(MPL_backtrace_show+0x26) > [0x7f6076063f2c] > > > /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x41dc24) > [0x7f6075fc5c24] > > > /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x49cc51) > [0x7f6076044c51] > > > /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x49f799) > [0x7f6076047799] > > > /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x451e18) > [0x7f6075ff9e18] > > > /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x452272) > [0x7f6075ffa272] > > > /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x2ce836) > [0x7f6075e76836] > > > /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x2ce90d) > [0x7f6075e7690d] > > > /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x48137b) > [0x7f607602937b] > > > /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x44d471) > [0x7f6075ff5471] > > > /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(+0x407acd) > [0x7f6075fafacd] > > > /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(MPIR_Err_return_comm+0x10a) > [0x7f6075fafbea] > > > /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpi.so.12(MPI_Iprobe+0x312) > [0x7f6075ddd542] > > > /nfs/opt/cascadelake/linux-centos7-cascadelake/gcc-9.4.0/mpich-3.4.2-qgtz76gekvjzuacy7wq5a26rqlewoxfc/lib/libmpifort.so.12(pmpi_iprobe+0x2f) > [0x7f606e08f19f] > > > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(__zmumps_load_MOD_zmumps_load_recv_msgs+0x142) > [0x7f60737b194d] > > > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_try_recvtreat_+0x34) > [0x7f60738ab735] > > > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(__zmumps_fac_par_m_MOD_zmumps_fac_par+0x991) > [0x7f607378bcc8] > > > 
/nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_fac_par_i_+0x240) > [0x7f6073881d36] > > Abort(805938831) on node 51 (rank 51 in comm 240): > > Fatal error in PMPI_Iprobe: Other MPI error, error > > stack: > > PMPI_Iprobe(124)..............: > > MPI_Iprobe(src=MPI_ANY_SOURCE, tag=MPI_ANY_TAG, > > comm=0xc4000017, flag=0x7ffe20e1402c, > > status=0x7ffe20e14260) failed > > MPID_Iprobe(244)..............: > > progress_test(100)............: > > MPIDI_OFI_handle_cq_error(949): OFI poll failed > > > (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) > > > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_fac_b_+0x1463) > [0x7f60738831a1] > > > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_fac_driver_+0x6969) > [0x7f60738446c9] > > > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_+0x2d83) > [0x7f60738bf9cf] > > > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_f77_+0x178c) > [0x7f60738c33bc] > > > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/mumps-5.5.1-gb7wlwxwbalf5rw5vkp6gtkhfkdqpntz/lib/libzmumps.so(zmumps_c+0x8f8) > [0x7f60738baacb] > > > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(+0x894560) > [0x7f6077297560] > > > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(MatLUFactorNumeric+0x32e) > [0x7f60773bb1e6] > > > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(+0xf51665) > [0x7f6077954665] > > > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(PCSetUp+0x64b) > [0x7f60779c77e0] > > > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(KSPSetUp+0xfb6) > [0x7f6077ac2d53] > > > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(+0x10c1c28) > [0x7f6077ac4c28] > > > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(KSPSolve+0x13) > [0x7f6077ac8070] > > > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(+0x11249df) > [0x7f6077b279df] > > > /nfs/home/zzyang/opt/software/linux-centos7-cascadelake/gcc-9.4.0/petsc-develop-5wrc3y6lyelr3iyrlm3sr2jlh2wxif3k/lib/libpetsc.so.3.019(SNESSolve+0x10df) > [0x7f6077b676c6] > > Abort(1) on node 60: Internal error > > Abort(1007265423) on node 65 (rank 65 in comm 240): > > Fatal error in PMPI_Iprobe: Other MPI error, error > > stack: > > PMPI_Iprobe(124)..............: > > MPI_Iprobe(src=MPI_ANY_SOURCE, tag=MPI_ANY_TAG, > > comm=0xc4000017, flag=0x7fff4d82827c, > > status=0x7fff4d8284b0) failed > > MPID_Iprobe(244)..............: > > progress_test(100)............: > > MPIDI_OFI_handle_cq_error(949): OFI poll failed > > > 
(ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) > > Abort(941205135) on node 32 (rank 32 in comm 240): > > Fatal error in PMPI_Iprobe: Other MPI error, error > > stack: > > PMPI_Iprobe(124)..............: > > MPI_Iprobe(src=MPI_ANY_SOURCE, tag=MPI_ANY_TAG, > > comm=0xc4000017, flag=0x7fff715ba3fc, > > status=0x7fff715ba630) failed > > MPID_Iprobe(240)..............: > > MPIDI_iprobe_safe(108)........: > > MPIDI_iprobe_unsafe(35).......: > > MPIDI_OFI_do_iprobe(69).......: > > MPIDI_OFI_handle_cq_error(949): OFI poll failed > > > (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) > > Abort(470941839) on node 75 (rank 75 in comm 0): > > Fatal error in PMPI_Test: Other MPI error, error > stack: > > PMPI_Test(188)................: > > MPI_Test(request=0x7efe31e03014, > > flag=0x7ffea65d673c, status=0x7ffea65d6760) failed > > MPIR_Test(73).................: > > MPIR_Test_state(33)...........: > > progress_test(100)............: > > MPIDI_OFI_handle_cq_error(949): OFI poll failed > > > (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) > > Abort(805946511) on node 31 (rank 31 in comm 256): > > Fatal error in PMPI_Probe: Other MPI error, error > stack: > > PMPI_Probe(118)...............: > > MPI_Probe(src=MPI_ANY_SOURCE, tag=7, > > comm=0xc4000015, status=0x7fff9538b7a0) failed > > MPID_Probe(159)...............: > > progress_test(100)............: > > MPIDI_OFI_handle_cq_error(949): OFI poll failed > > > (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) > > Abort(1179791) on node 73 (rank 73 in comm 0): Fatal > > error in PMPI_Test: Other MPI error, error stack: > > PMPI_Test(188)................: > > MPI_Test(request=0x5b638d4, flag=0x7ffd755119cc, > > status=0x7ffd755121b0) failed > > MPIR_Test(73).................: > > MPIR_Test_state(33)...........: > > progress_test(100)............: > > MPIDI_OFI_handle_cq_error(949): OFI poll failed > > > (ofi_events.c:951:MPIDI_OFI_handle_cq_error:Input/output error) > > ``` > > > > Thank you very much for your time and consideration. > > > > Best wishes, > > Zongze > > > > > > > > -- > > What most experimenters take for granted before they > > begin their experiments is infinitely more interesting > > than any results to which their experiments lead. > > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > -- > > Best wishes, > > Zongze > > > > > > > > -- > > Stefano > > > > > > > > -- > > What most experimenters take for granted before they begin their > > experiments is infinitely more interesting than any results to which > > their experiments lead. > > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ < > http://www.cse.buffalo.edu/~knepley/> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From FERRANJ2 at my.erau.edu Tue May 23 19:43:43 2023 From: FERRANJ2 at my.erau.edu (Ferrand, Jesus A.) Date: Wed, 24 May 2023 00:43:43 +0000 Subject: [petsc-users] DMLabel to extract height-0 points by their DMPolytope value Message-ID: Dear PETSc team: I am trying to use DMPlex and DMLabel to develop an API to write plexes to .cgns format in parallel. To that end, I need a way to extract the height-0 points and sort them by topological type (i.e., chunk of tetrahedra, followed by chunk of pyramids, etc.). 
I figured I could use the DMLabel produced by DMPlexComputeCellTypes() as follows:

** I get the "celltype" DMLabel **

PetscBool has_tetrahedra, has_hexahedra, has_pyramids, has_tri_prisms;
PetscCall(DMLabelHasStratum(ctype_label, DM_POLYTOPE_TETRAHEDRON, &has_tetrahedra));
PetscCall(DMLabelHasStratum(ctype_label, DM_POLYTOPE_HEXAHEDRON, &has_hexahedra));
PetscCall(DMLabelHasStratum(ctype_label, DM_POLYTOPE_PYRAMID, &has_pyramids));
PetscCall(DMLabelHasStratum(ctype_label, DM_POLYTOPE_TRI_PRISM, &has_tri_prisms));

PetscInt nTopology = (PetscInt)has_tetrahedra + (PetscInt)has_hexahedra + (PetscInt)has_pyramids + (PetscInt)has_tri_prisms;
PetscInt *pType, *vType, *nType;
PetscMalloc3(nTopology, &pType, nTopology, &vType, nTopology, &nType);
PetscInt counter = -1;
if (has_tetrahedra) {
  counter++;
  pType[counter] = DM_POLYTOPE_TETRAHEDRON;
  vType[counter] = 4;
  PetscCall(DMLabelGetStratumSize(ctype_label, DM_POLYTOPE_TETRAHEDRON, &nType[counter]));
}

** Repeat this pattern of if-statement for the rest. **

if (has_tri_prisms) {
  counter++;
  pType[counter] = DM_POLYTOPE_TRI_PRISM;
  vType[counter] = 6;
  PetscCall(DMLabelGetStratumSize(ctype_label, DM_POLYTOPE_TRI_PRISM, &nType[counter]));
}

IS pIS;
PetscInt StratumIdx;
const PetscInt *pPoints;
for (PetscInt ii = 0; ii < nTopology; ii++) {
  PetscCall(DMLabelGetValueIndex(ctype_label, pType[ii], &StratumIdx));
  PetscCall(DMLabelGetStratumIS(ctype_label, StratumIdx, &pIS));
  PetscCall(ISGetIndices(pIS, &pPoints));
  for (PetscInt jj = 0; jj < nType[ii]; jj++) {
    PetscCall(DMPlexGetTransitiveClosure(dm, pPoints[ii], PETSC_TRUE, &ClosureSize, &pClosure));

    ** Assemble connectivity array for each chunk of height-0 topology **

    PetscCall(DMPlexRestoreTransitiveClosure(dm, pPoints[ii], PETSC_TRUE, &ClosureSize, &pClosure));
  }
  PetscCall(ISRestoreIndices(pIS, &pPoints));
  PetscCall(ISDestroy(&pIS));
}
PetscFree3(pType, vType, nType);

I think that, in principle, this is the correct approach for my immediate objective.
However, my problem is that the DAG points returned by pIS through pPoints are outside the height-0 stratum.
I can tell because I'm printing the contents of pPoints and comparing against my DAG's height/depth strata.

It is as if the DMLabel has a different numbering from the DAG.
Also, for DMLabel's APIs, is the "stratum value" the same thing as the "label value"?
My gut feeling is that the former is 0 <= StratumValue < nStrata, and the latter could be potentially disjoint (e.g., LabelValue in [-1, 0, 3, 6, 7, 8]).
Is that what DMLabelGetValueIndex() is for?

Sincerely:

J.A. Ferrand

Embry-Riddle Aeronautical University - Daytona Beach - FL
Ph.D. Candidate, Aerospace Engineering
M.Sc. Aerospace Engineering
B.Sc. Aerospace Engineering
B.Sc. Computational Mathematics

Phone: (386)-843-1829
Email(s): ferranj2 at my.erau.edu
jesus.ferrand at gmail.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From vilmer.dahlberg at solid.lth.se Wed May 24 05:45:12 2023
From: vilmer.dahlberg at solid.lth.se (Vilmer Dahlberg)
Date: Wed, 24 May 2023 10:45:12 +0000
Subject: [petsc-users] Issues creating DMPlex from higher order mesh generated by gmsh
In-Reply-To:
References: <3d6762c6d99c4d319e8e985c91bc739e@solid.lth.se> <87o7mln2mg.fsf@jedbrown.org>,
Message-ID:

Hi Matt and Jed,

Thanks for your replies. I did not consider that distinction, Matt; that makes it clearer. Maybe it's time to give the DAG-stuff in the documentation another attempt...
Jed, could you point me in the direction of a possible solution to this, if there exists one? In my (FEM) application I'm not using the weak-form stuff, but do I use PetscFE to describe my data layout and access it with DMPlexVec(Mat)Set(Get)Closure and friends. Thanks! Vilmer ________________________________ Fr?n: Matthew Knepley Skickat: den 15 maj 2023 15:42:20 Till: Jed Brown Kopia: Vilmer Dahlberg; petsc-users at mcs.anl.gov ?mne: Re: [petsc-users] Issues creating DMPlex from higher order mesh generated by gmsh On Mon, May 15, 2023 at 9:30?AM Jed Brown > wrote: Matthew Knepley > writes: > On Fri, May 5, 2023 at 10:55?AM Vilmer Dahlberg via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> Hi. >> >> >> I'm trying to read a mesh of higher element order, in this example a mesh >> consisting of 10-node tetrahedral elements, from gmsh, into PETSC. But It >> looks like the mesh is not properly being loaded and converted into a >> DMPlex. gmsh tells me it has generated a mesh with 7087 nodes, but when I >> view my dm object it tells me it has 1081 0-cells. This is the printout I >> get >> > > Hi Vilmer, > > Plex makes a distinction between topological entities, like vertices, edges > and cells, and the function spaces used to represent fields, like velocity > or coordinates. When formats use "nodes", they mix the two concepts > together. > > You see that if you add the number of vertices and edges, you get 7087, > since for P2 there is a "node" on every edge. Is anything else wrong? Note that quadratic (and higher order) tets are broken with the Gmsh reader. It's been on my todo list for a while. As an example, this works when using linear elements (the projection makes them quadratic and visualization is correct), but is tangled when holes.msh is quadratic. $ $PETSC_ARCH/tests/dm/impls/plex/tutorials/ex1 -dm_plex_filename ~/meshes/holes.msh -dm_view cgns:s.cgns -dm_coord_petscspace_degree 2 Projection to the continuous space is broken because we do not have the lexicographic order on simplicies done. Are you sure you are projecting into the broken space? Thanks, Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed May 24 05:47:58 2023 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 24 May 2023 06:47:58 -0400 Subject: [petsc-users] DMLabel to extract height-0 points by their DMPolytope value In-Reply-To: References: Message-ID: On Tue, May 23, 2023 at 8:44?PM Ferrand, Jesus A. wrote: > Dear PETSc team: > > I am trying to use DMPlex and DMLabel to develop an API to write plexes to > .cgns format in parallel. > To that end, I need a way to extract the height-0 points and sort them by > topological type (i.e., chunk of tetrahedra, followed by chunk of pyramids, > etc.). > I thought I already sorted them this way. Are you adding them in a different order? 
> I figured I could use the DMLabel produced by DMPlexComputeCellTypes() as > follows: > > ** I get the "celltype" DMLabel ** > > PetscBool has_tetrahedra, has_hexahedra, has_pyramids, has_tri_prisms; > PetscCall(DMLabelHasStratum(ctype_label, DM_POLYTOPE_TETRAHEDRON, > &has_tetrahedra)); > PetscCall(DMLabelHasStratum(ctype_label, DM_POLYTOPE_HEXAHEDRON, > &has_hexahedra)); > PetscCall(DMLabelHasStratum(ctype_label, DM_POLYTOPE_PYRAMID, > &has_pyramids)); > PetscCall(DMLabelHasStratum(ctype_label, DM_POLYTOPE_TRI_PRISM, > &has_tri_prisms)); > > PetscInt nTopology = (PetscInt)has_tetrahedra + (PetscInt)has_hexahedra + > (PetscInt)has_pyramids + (PetscInt)has_tri_prisms; > PetscInt *pType, *vType, *nType; > PetscMalloc3(nTopology, &pType, nTopology, &vType, nTopology, &nType); > PetscInt counter = -1; > if(has_tetrahedra){ > counter++; > pType[counter] = DM_POLYTOPE_TETRAHEDRON; > vType[counter] = 4; > PetscCall(DMLabelGetStratumSize(ctype_label, > DM_POLYTOPE_TETRAHEDRON, &nType[counter])); > } > > ** Repeat this pattern of if-statement for the rest. ** > > if(has_tri_prisms) { > counter++; > pType[counter] = DM_POLYTOPE_TRI_PRISM; > vType[counter] = 6; > PetscCall(DMLabelGetStratumSize(ctype_label, DM_POLYTOPE_TRI_PRISM, > &nType[counter])); > } > > IS pIS; > PetscInt StratumIdx; > const PetscInt* pPoints; > for(PetscInt ii = 0; ii < nTopology; ii++){ > PetscCall(DMLabelGetValueIndex(ctype_label, pType[ii], &StratumIdx)); > I do not understand why you need the index of this value. > PetscCall(DMLabelGetStratumIS(ctype_label, StratumIdx, &pIS)); > This does not look right. You lookup by value, not by value index. Thanks, Matt > > PetscCall(ISGetIndices(pIS, &pPoints)); > for(PetscInt jj = 0; jj < nType[ii]; jj++){ > PetscCall(DMPlexGetTransitiveClosure(dm, pPoints[ii], PETSC_TRUE, > &ClosureSize, &pClosure)); > > ** Assemble connectivity array for each chunk of height-0 topology ** > > PetscCall(DMPlexRestoreTransitiveClosure(dm, pPoints[ii], PETSC_TRUE, > &ClosureSize, &pClosure)); > } > PetscCall(ISRestoreIndices(pS, &pPoints)); > PetscCall(ISDestroy(&pIS)); > } > PetscFree3(pType, vType, nType); > > I think, that in principle, this is the correct approach for my immediate > objective. > However, my problem is that the DAG points returned by pIS through pPoints > are outside the height-0 stratum. > I can tell because I'm printing the contents of pPoints and comparing > againts my DAG's height/depth strata. > > It is as if the DMLabel has a different numbering from the DAG. > Also, for DMLabel's APIs, is the "stratum value" the same thing as the > "label value"? > My gutt feeling is that the former is 0 <= StratumValue < nStrata, and the > latter could be potentially disjoint (e.g., LabelValue in [-1, 0, 3 , 6, 7, > 8]). > Is that what DMLabelGetValueIndex() is for? > > Sincerely: > > *J.A. Ferrand* > > Embry-Riddle Aeronautical University - Daytona Beach - FL > Ph.D. Candidate, Aerospace Engineering > > M.Sc. Aerospace Engineering > > B.Sc. Aerospace Engineering > > B.Sc. Computational Mathematics > > > *Phone:* (386)-843-1829 > > *Email(s):* ferranj2 at my.erau.edu > > jesus.ferrand at gmail.com > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
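[A minimal editorial sketch, not code from the thread, of the lookup-by-value pattern suggested in the reply above: the cell-type label is queried directly with the DMPolytopeType value, so no stratum/value index is needed. The helper name CollectCellsOfType and the assumption that the label comes from DMPlexGetCellTypeLabel() are illustrative only.]

```c
#include <petscdmplex.h>

/* Sketch: walk all cells of one polytope type using the cell-type label.
   The label value IS the DMPolytopeType, so it is used directly as the stratum value. */
static PetscErrorCode CollectCellsOfType(DM dm, DMPolytopeType ct)
{
  DMLabel         ctLabel;
  IS              cellIS = NULL;
  const PetscInt *cells;
  PetscInt        nCells, clSize, *closure = NULL;

  PetscFunctionBeginUser;
  PetscCall(DMPlexGetCellTypeLabel(dm, &ctLabel));                  /* the "celltype" label filled by DMPlexComputeCellTypes() */
  PetscCall(DMLabelGetStratumSize(ctLabel, (PetscInt)ct, &nCells)); /* lookup by value, not by value index */
  if (nCells > 0) {
    PetscCall(DMLabelGetStratumIS(ctLabel, (PetscInt)ct, &cellIS)); /* mesh points carrying this value */
    PetscCall(ISGetIndices(cellIS, &cells));
    for (PetscInt c = 0; c < nCells; ++c) {
      PetscCall(DMPlexGetTransitiveClosure(dm, cells[c], PETSC_TRUE, &clSize, &closure));
      /* closure[] holds (point, orientation) pairs; keep the vertex points here
         to build the connectivity chunk for this cell type */
      PetscCall(DMPlexRestoreTransitiveClosure(dm, cells[c], PETSC_TRUE, &clSize, &closure));
    }
    PetscCall(ISRestoreIndices(cellIS, &cells));
    PetscCall(ISDestroy(&cellIS));
  }
  PetscFunctionReturn(PETSC_SUCCESS);
}
```

[Because the IS is built from the label value itself, the returned entries are ordinary DMPlex point numbers and land in the height-0 stratum whenever the value passed in is a cell type.]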
URL: From knepley at gmail.com Wed May 24 16:38:11 2023 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 24 May 2023 17:38:11 -0400 Subject: [petsc-users] reading and writing periodic DMPlex to file In-Reply-To: References: <83e2b092-5440-e009-ef84-dfde3ff6804d@ovgu.de> <9b91727e-fa6d-09b6-fd42-93e00947cc38@ovgu.de> Message-ID: Checking back. What does not work? Thanks, Matt On Tue, Jan 24, 2023 at 11:26?AM Matthew Knepley wrote: > On Tue, Jan 24, 2023 at 10:39 AM Berend van Wachem < > berend.vanwachem at ovgu.de> wrote: > >> Dear Matt, >> >> I have been working on this now with Petsc-3.18.3 >> >> 1) I can confirm that enforcing periodicity works for a single core >> simulation. >> >> 2) However, when using multiple cores, the code still hangs. Is there >> something I should do to fix this? Or should this be fixed in the next >> Petsc version? >> > > Dang dang dang. I forgot to merge this fix. Thanks for reminding me. It is > now here: > > https://gitlab.com/petsc/petsc/-/merge_requests/6001 > > >> 3) This is strange, as it works fine for me. >> > > Will try again with current main. > > Thanks > > Matt > > >> Thanks, best, Berend. >> >> >> On 12/15/22 18:56, Matthew Knepley wrote: >> > On Wed, Dec 14, 2022 at 3:58 AM Berend van Wachem >> > > wrote: >> > >> > >> > Dear PETSc team and users, >> > >> > I have asked a few times about this before, but we haven't really >> > gotten >> > this to work yet. >> > >> > In our code, we use the DMPlex framework and are also interested in >> > periodic geometries. >> > >> > As our simulations typically require many time-steps, we would like >> to >> > be able to save the DM to file and to read it again to resume the >> > simulation (a restart). >> > >> > Although this works for a non-periodic DM, we haven't been able to >> get >> > this to work for a periodic one. To illustrate this, I have made a >> > working example, consisting of 2 files, createandwrite.c and >> > readandcreate.c. I have attached these 2 working examples. We are >> using >> > Petsc-3.18.2. >> > >> > In the first file (createandwrite.c) a DMPlex is created and >> written to >> > a file. Periodicity is activated on lines 52-55 of the code. >> > >> > In the second file (readandcreate.c) a DMPlex is read from the file. >> > When a periodic DM is read, this does not work. Also, trying to >> > 'enforce' periodicity, lines 55 - 66, does not work if the number of >> > processes is larger than 1 - the code "hangs" without producing an >> > error. >> > >> > Could you indicate what I am missing? I have really tried many >> > different >> > options, without finding a solution. >> > >> > >> > Hi Berend, >> > >> > There are several problems. I will eventually fix all of them, but I >> > think we can get this working quickly. >> > >> > 1) Periodicity information is not saved. I will fix this, but forcing >> it >> > should work. >> > >> > 2) You were getting a hang because the blocksize on the local >> > coordinates was not set correctly after loading >> > since the vector had zero length. This does not happen in any >> test >> > because HDF5 loads a global vector, but >> > most other things create local coordinates. I have a fix for >> this, >> > which I will get in an MR, Also, I moved DMLocalizeCoordinates() >> > after distribution, since this is where it belongs. 
>> > >> > knepley/fix-plex-periodic-faces *$:/PETSc3/petsc/petsc-dev$ git diff >> > diff --git a/src/dm/interface/dmcoordinates.c >> > b/src/dm/interface/dmcoordinates.c >> > index a922348f95b..6437e9f7259 100644 >> > --- a/src/dm/interface/dmcoordinates.c >> > +++ b/src/dm/interface/dmcoordinates.c >> > @@ -551,10 +551,14 @@ PetscErrorCode DMGetCoordinatesLocalSetUp(DM dm) >> > PetscFunctionBegin; >> > PetscValidHeaderSpecific(dm, DM_CLASSID, 1); >> > if (!dm->coordinates[0].xl && dm->coordinates[0].x) { >> > - DM cdm = NULL; >> > + DM cdm = NULL; >> > + PetscInt bs; >> > >> > PetscCall(DMGetCoordinateDM(dm, &cdm)); >> > PetscCall(DMCreateLocalVector(cdm, &dm->coordinates[0].xl)); >> > + // If the size of the vector is 0, it will not get the right block >> size >> > + PetscCall(VecGetBlockSize(dm->coordinates[0].x, &bs)); >> > + PetscCall(VecSetBlockSize(dm->coordinates[0].xl, bs)); >> > PetscCall(PetscObjectSetName((PetscObject)dm->coordinates[0].xl, >> > "coordinates")); >> > PetscCall(DMGlobalToLocalBegin(cdm, dm->coordinates[0].x, >> > INSERT_VALUES, dm->coordinates[0].xl)); >> > PetscCall(DMGlobalToLocalEnd(cdm, dm->coordinates[0].x, >> > INSERT_VALUES, dm->coordinates[0].xl)); >> > >> > 3) If I comment out forcing the periodicity, your example does not >> run >> > for me. I will try to figure it out >> > >> > [0]PETSC ERROR: --------------------- Error Message >> > -------------------------------------------------------------- >> > [0]PETSC ERROR: Nonconforming object sizes >> > [0]PETSC ERROR: SF roots 4400 < pEnd 6000 >> > [1]PETSC ERROR: --------------------- Error Message >> > -------------------------------------------------------------- >> > [0]PETSC ERROR: WARNING! There are option(s) set that were not used! >> > Could be the program crashed before they were used or a spelling >> > mistake, etc! >> > [1]PETSC ERROR: Nonconforming object sizes >> > [0]PETSC ERROR: Option left: name:-start_in_debugger_no (no value) >> > source: command line >> > [1]PETSC ERROR: SF roots 4400 < pEnd 6000 >> > [0]PETSC ERROR: See https://petsc.org/release/faq/ >> > for trouble shooting. >> > [0]PETSC ERROR: Petsc Development GIT revision: >> v3.18.1-494-g16200351da0 >> > GIT Date: 2022-12-12 23:42:20 +0000 >> > [1]PETSC ERROR: WARNING! There are option(s) set that were not used! >> > Could be the program crashed before they were used or a spelling >> > mistake, etc! >> > [1]PETSC ERROR: Option left: name:-start_in_debugger_no (no value) >> > source: command line >> > [0]PETSC ERROR: ./readandcreate on a arch-master-debug named >> > MacBook-Pro.cable.rcn.com by >> knepley >> > Thu Dec 15 12:50:26 2022 >> > [1]PETSC ERROR: See https://petsc.org/release/faq/ >> > for trouble shooting. 
>> > [0]PETSC ERROR: Configure options --PETSC_ARCH=arch-master-debug >> > --download-bamg --download-bison --download-chaco --download-ctetgen >> > --download-egads --download-eigen --download-exodusii --download-fftw >> > --download-hpddm --download-ks --download-libceed --download-libpng >> > --download-metis --download-ml --download-mumps --download-muparser >> > --download-netcdf --download-opencascade --download-p4est >> > --download-parmetis --download-pnetcdf --download-pragmatic >> > --download-ptscotch --download-scalapack --download-slepc >> > --download-suitesparse --download-superlu_dist --download-tetgen >> > --download-triangle --with-cmake-exec=/PETSc3/petsc/apple/bin/cmake >> > --with-ctest-exec=/PETSc3/petsc/apple/bin/ctest >> > --with-hdf5-dir=/PETSc3/petsc/apple --with-mpi-dir=/PETSc3/petsc/apple >> > --with-petsc4py=1 --with-shared-libraries --with-slepc --with-zlib >> > [1]PETSC ERROR: Petsc Development GIT revision: >> v3.18.1-494-g16200351da0 >> > GIT Date: 2022-12-12 23:42:20 +0000 >> > [0]PETSC ERROR: #1 PetscSectionCreateGlobalSection() at >> > /PETSc3/petsc/petsc-dev/src/vec/is/section/interface/section.c:1322 >> > [1]PETSC ERROR: ./readandcreate on a arch-master-debug named >> > MacBook-Pro.cable.rcn.com by >> knepley >> > Thu Dec 15 12:50:26 2022 >> > [0]PETSC ERROR: #2 DMGetGlobalSection() at >> > /PETSc3/petsc/petsc-dev/src/dm/interface/dm.c:4527 >> > [1]PETSC ERROR: Configure options --PETSC_ARCH=arch-master-debug >> > --download-bamg --download-bison --download-chaco --download-ctetgen >> > --download-egads --download-eigen --download-exodusii --download-fftw >> > --download-hpddm --download-ks --download-libceed --download-libpng >> > --download-metis --download-ml --download-mumps --download-muparser >> > --download-netcdf --download-opencascade --download-p4est >> > --download-parmetis --download-pnetcdf --download-pragmatic >> > --download-ptscotch --download-scalapack --download-slepc >> > --download-suitesparse --download-superlu_dist --download-tetgen >> > --download-triangle --with-cmake-exec=/PETSc3/petsc/apple/bin/cmake >> > --with-ctest-exec=/PETSc3/petsc/apple/bin/ctest >> > --with-hdf5-dir=/PETSc3/petsc/apple --with-mpi-dir=/PETSc3/petsc/apple >> > --with-petsc4py=1 --with-shared-libraries --with-slepc --with-zlib >> > [0]PETSC ERROR: #3 DMPlexSectionLoad_HDF5_Internal() at >> > /PETSc3/petsc/petsc-dev/src/dm/impls/plex/plexhdf5.c:2750 >> > [1]PETSC ERROR: #1 PetscSectionCreateGlobalSection() at >> > /PETSc3/petsc/petsc-dev/src/vec/is/section/interface/section.c:1322 >> > [0]PETSC ERROR: #4 DMPlexSectionLoad() at >> > /PETSc3/petsc/petsc-dev/src/dm/impls/plex/plex.c:2364 >> > [1]PETSC ERROR: #2 DMGetGlobalSection() at >> > /PETSc3/petsc/petsc-dev/src/dm/interface/dm.c:4527 >> > [0]PETSC ERROR: #5 main() at >> > /Users/knepley/Downloads/tmp/Berend/readandcreate.c:85 >> > [1]PETSC ERROR: #3 DMPlexSectionLoad_HDF5_Internal() at >> > /PETSc3/petsc/petsc-dev/src/dm/impls/plex/plexhdf5.c:2750 >> > [0]PETSC ERROR: PETSc Option Table entries: >> > [0]PETSC ERROR: -malloc_debug (source: environment) >> > [1]PETSC ERROR: #4 DMPlexSectionLoad() at >> > /PETSc3/petsc/petsc-dev/src/dm/impls/plex/plex.c:2364 >> > [1]PETSC ERROR: #5 main() at >> > /Users/knepley/Downloads/tmp/Berend/readandcreate.c:85 >> > [0]PETSC ERROR: -start_in_debugger_no (source: command line) >> > [1]PETSC ERROR: PETSc Option Table entries: >> > [0]PETSC ERROR: ----------------End of Error Message -------send entire >> > error message to petsc-maint at mcs.anl.gov---------- >> > 
[1]PETSC ERROR: -malloc_debug (source: environment) >> > application called MPI_Abort(MPI_COMM_SELF, 60) - process 0 >> > [1]PETSC ERROR: -start_in_debugger_no (source: command line) >> > [1]PETSC ERROR: ----------------End of Error Message -------send entire >> > error message to petsc-maint at mcs.anl.gov---------- >> > application called MPI_Abort(MPI_COMM_SELF, 60) - process 0 >> > 4) We now have parallel HDF5 loading, so you should not have to >> manually >> > distribute. I will change your example to use it >> > and send it back when I am done. >> > >> > Thanks! >> > >> > Matt >> > >> > Many thanks and kind regards, >> > Berend. >> > >> > >> > >> > -- >> > What most experimenters take for granted before they begin their >> > experiments is infinitely more interesting than any results to which >> > their experiments lead. >> > -- Norbert Wiener >> > >> > https://www.cse.buffalo.edu/~knepley/ < >> http://www.cse.buffalo.edu/~knepley/> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mseneca at anl.gov Thu May 25 12:35:13 2023 From: mseneca at anl.gov (SENECA, MICHAEL) Date: Thu, 25 May 2023 17:35:13 +0000 Subject: [petsc-users] Build Error Message-ID: Hi all, I have been attempting to install cardinal on my new MacBook M2 Pro chip but I have run into some errors when attempting to build petsc, the script attempts to access /usr/bin/libtoolize which does not exist on my MacBook. I have libtoolize installed via homebrew and have made a link from the homebrew installation to usr/local/bin/libtoolize But the script does not look in the local directory. From what I have gathered online, the /usr/bin/ should not be edited as it is managed by macOS and its system software which can lead to system instability. Do any of you know of a way around to get the petsc script to just look for libtoolize from my path and not /usr/bin/libtoolize? Best regards, Michael Seneca -------------- next part -------------- An HTML attachment was scrubbed... URL: From xiongziming2010 at gmail.com Fri May 26 09:05:29 2023 From: xiongziming2010 at gmail.com (ziming xiong) Date: Fri, 26 May 2023 16:05:29 +0200 Subject: [petsc-users] Questions about the DMDAGetElements command Message-ID: Hello? I am using the command DMDAGetElements to get the number of elements in the grid and the index of the points within each element, but I found a problem that the accuracy of this command depends on the number of x, y, z direction cels I then create. Take \src\ksp\ksp\tutorials\71 as an example, when the values of cells are 4,3,2. using two processes to complete -dim 3 -cells 4,3,2 -pde_type Poisson -use_global, here the index of the points in the elements in each process will have problems, the given local index corresponds to the coordinates of the points in the local to see that the cell will have problems. Take my results as an example, process 1 gives the local indices of the first element's vertices (0 1 5 4 16 17 21 20). 
But the corresponding coordinates are

      0  1  5  4  16 17 21 20
x     0  2  0  8  2  4  2  0
y     0  0  2  0  6  6  0  0
z     0  0  0  0  0  0  2  2

You can see that this is not a correct element.

Ziming
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From bsmith at petsc.dev Fri May 26 09:39:57 2023 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 26 May 2023 10:39:57 -0400 Subject: [petsc-users] Build Error In-Reply-To: References: Message-ID: <5DF1A47F-1466-4F02-B77F-1E154B4C2609@petsc.dev> PETSc configure is supposed to handle this cleanly. Please send configure.log to petsc-maint at mcs.anl.gov as we need more context to understand why it is not working. PETSc configure looks for libgtoolize (which is what brew names it) and uses it for libtoolize. You can use --with-libtoolize-exec=pathtolibtoolize (or --with-libtoolize=pathtolibtoolize for older versions of PETSc) to select the executable PETSc uses. > On May 25, 2023, at 1:35 PM, SENECA, MICHAEL via petsc-users wrote: > > Hi all, > > I have been attempting to install cardinal on my new MacBook M2 Pro chip but I have run into some errors when attempting to build petsc; the script attempts to access > /usr/bin/libtoolize > which does not exist on my MacBook.
I have libtoolize installed via homebrew and have made a link from the homebrew installation to > usr/local/bin/libtoolize > But the script does not look in the local directory. From what I have gathered online, the /usr/bin/ should not be edited as it is managed by macOS and its system software which can lead to system instability. Do any of you know of a way around to get the petsc script to just look for libtoolize from my path and not /usr/bin/libtoolize? > > Best regards, > Michael Seneca -------------- next part -------------- An HTML attachment was scrubbed... URL: From spradeepmahadeek at gmail.com Sun May 28 19:34:34 2023 From: spradeepmahadeek at gmail.com (s.pradeep kumar) Date: Sun, 28 May 2023 19:34:34 -0500 Subject: [petsc-users] Regarding Issue with Metis Double Precision Message-ID: Dear All, * I recently tried to install Metis along with PETSC 3.19.1, using following command,* ./configure --with-scalar-type=real --with-precision=double --download-metis --download-parmetis --with-cmake=1 --with-mpi-dir=/opt/cray/pe/mpich/8.1.9/ofi/gnu/9.1 --download-fblaslapack=1 --with-debugging=0 COPTFLAGS=-O2 FOPTFLAGS=-O2 *While compiling my code, I get this warning sign, * 34: warning: passing argument 10 of 'ParMETIS_V3_PartKway' from incompatible pointer type [-Wincompatible-pointer-types] 79 | nparts,tpwgts,ubvec,options, | ^~~~~~ | | | PetscScalar * {aka double *} In file included from /v2_petsc_check/ParmetisWrappers.c:5: NumericalLibraries/petsc-3.19.1/gnu-opt/include/parmetis.h:70:22: note: expected 'real_t *' {aka 'float *'} but argument is of type 'PetscScalar *' {aka 'double *'} 70 | real_t *tpwgts, real_t *ubvec, idx_t *options, idx_t *edgecut, idx_t *part, | ~~~~~~~~^~~~~~ /v2_petsc_check/ParmetisWrappers.c:79:41: warning: passing argument 11 of 'ParMETIS_V3_PartKway' from incompatible pointer type [-Wincompatible-pointer-types] 79 | nparts,tpwgts,ubvec,options, | ^~~~~ | | | PetscScalar * {aka double *} *Ignoring the warning sign leads to code crashing due to Parmetis Error,* parmetis error: the sum of tpwgts for constraint #0 is not 1.0 *How should I proceed further?* Regards, Pradeep -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Sun May 28 20:12:18 2023 From: bsmith at petsc.dev (Barry Smith) Date: Sun, 28 May 2023 21:12:18 -0400 Subject: [petsc-users] Regarding Issue with Metis Double Precision In-Reply-To: References: Message-ID: <9BC6A91A-F199-4BAD-81C1-CD7523A83590@petsc.dev> It looks you are calling Metis routines directly. In that case, you can do one of two things 1) use the additional PETSc ./configure option -download-metis-use-doubleprecision=1 or 2) ensure that the arguments such as tpwgts that you pass in are not PetscScalar but instead float. This would require changing your code slightly. 
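[As an editorial illustration of option 2) above, not code from the thread: the weight arrays handed to ParMETIS are declared in ParMETIS's own real_t/idx_t types, so the call compiles cleanly whether METIS was built with its default single-precision real_t or with --download-metis-use-doubleprecision=1. The function and variable names are placeholders.]

```c
#include <stdlib.h>
#include <parmetis.h> /* defines idx_t and real_t to match the installed (Par)METIS build */

/* Sketch: build tpwgts/ubvec in ParMETIS's real_t type instead of PetscScalar. */
static void build_partition_weights(idx_t ncon, idx_t nparts, real_t **tpwgts_out, real_t **ubvec_out)
{
  real_t *tpwgts = (real_t *)malloc((size_t)ncon * (size_t)nparts * sizeof(*tpwgts));
  real_t *ubvec  = (real_t *)malloc((size_t)ncon * sizeof(*ubvec));

  /* For each constraint the target partition weights must sum to 1.0; the
     "sum of tpwgts for constraint #0 is not 1.0" error quoted above is
     ParMETIS's check of exactly this condition. */
  for (idx_t i = 0; i < ncon * nparts; i++) tpwgts[i] = (real_t)1.0 / (real_t)nparts;
  /* A commonly used load-imbalance tolerance. */
  for (idx_t i = 0; i < ncon; i++) ubvec[i] = (real_t)1.05;

  *tpwgts_out = tpwgts;
  *ubvec_out  = ubvec;
}
```

[Arrays built this way can be passed straight to ParMETIS_V3_PartKway() without the incompatible-pointer warnings shown earlier in the thread.]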
Barry > On May 28, 2023, at 8:34 PM, s.pradeep kumar wrote: > > Dear All, > > I recently tried to install Metis along with PETSC 3.19.1, using following command, > > ./configure --with-scalar-type=real --with-precision=double --download-metis --download-parmetis --with-cmake=1 --with-mpi-dir=/opt/cray/pe/mpich/8.1.9/ofi/gnu/9.1 --download-fblaslapack=1 --with-debugging=0 COPTFLAGS=-O2 FOPTFLAGS=-O2 > > While compiling my code, I get this warning sign, > > 34: warning: passing argument 10 of 'ParMETIS_V3_PartKway' from incompatible pointer type [-Wincompatible-pointer-types] > 79 | nparts,tpwgts,ubvec,options, > | ^~~~~~ > | | > | PetscScalar * {aka double *} > In file included from /v2_petsc_check/ParmetisWrappers.c:5: > NumericalLibraries/petsc-3.19.1/gnu-opt/include/parmetis.h:70:22: note: expected 'real_t *' {aka 'float *'} but argument is of type 'PetscScalar *' {aka 'double *'} > 70 | real_t *tpwgts, real_t *ubvec, idx_t *options, idx_t *edgecut, idx_t *part, > | ~~~~~~~~^~~~~~ > /v2_petsc_check/ParmetisWrappers.c:79:41: warning: passing argument 11 of 'ParMETIS_V3_PartKway' from incompatible pointer type [-Wincompatible-pointer-types] > 79 | nparts,tpwgts,ubvec,options, > | ^~~~~ > | | > | PetscScalar * {aka double *} > > Ignoring the warning sign leads to code crashing due to Parmetis Error, > > parmetis error: the sum of tpwgts for constraint #0 is not 1.0 > > How should I proceed further? > > Regards, > Pradeep > -------------- next part -------------- An HTML attachment was scrubbed... URL: From spradeepmahadeek at gmail.com Sun May 28 20:26:00 2023 From: spradeepmahadeek at gmail.com (s.pradeep kumar) Date: Sun, 28 May 2023 20:26:00 -0500 Subject: [petsc-users] Regarding Issue with Metis Double Precision In-Reply-To: <9BC6A91A-F199-4BAD-81C1-CD7523A83590@petsc.dev> References: <9BC6A91A-F199-4BAD-81C1-CD7523A83590@petsc.dev> Message-ID: Thanks! I tried adding the flag but it throws in the following error: ============================================================================================= Configuring PETSc to compile on your system ============================================================================================= TESTING: configureExternalPackagesDir from config.framework(config/BuildSystem/config/framework.py:1070) ********************************************************************************************* UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): --------------------------------------------------------------------------------------------- Package parmetis requested but dependency metis not requested. Perhaps you want --download-metis or --with-metis-dir=directory or --with-metis-lib=libraries and --with-metis-include=directory ********************************************************************************************* On Sun, May 28, 2023 at 8:12?PM Barry Smith wrote: > > It looks you are calling Metis routines directly. In that case, you can > do one of two things > > 1) use the additional PETSc ./configure option > -download-metis-use-doubleprecision=1 > > or > > 2) ensure that the arguments such as tpwgts that you pass in are not > PetscScalar but instead float. This would require changing your code > slightly. 
> > Barry > > > On May 28, 2023, at 8:34 PM, s.pradeep kumar > wrote: > > Dear All, > > * I recently tried to install Metis along with PETSC 3.19.1, using > following command,* > > ./configure --with-scalar-type=real --with-precision=double > --download-metis --download-parmetis --with-cmake=1 > --with-mpi-dir=/opt/cray/pe/mpich/8.1.9/ofi/gnu/9.1 > --download-fblaslapack=1 --with-debugging=0 COPTFLAGS=-O2 FOPTFLAGS=-O2 > > *While compiling my code, I get this warning sign, * > > 34: warning: passing argument 10 of 'ParMETIS_V3_PartKway' from > incompatible pointer type [-Wincompatible-pointer-types] > 79 | nparts,tpwgts,ubvec,options, > | ^~~~~~ > | | > | PetscScalar * {aka double *} > In file included from /v2_petsc_check/ParmetisWrappers.c:5: > NumericalLibraries/petsc-3.19.1/gnu-opt/include/parmetis.h:70:22: note: > expected 'real_t *' {aka 'float *'} but argument is of type 'PetscScalar *' > {aka 'double *'} > 70 | real_t *tpwgts, real_t *ubvec, idx_t *options, idx_t > *edgecut, idx_t *part, > | ~~~~~~~~^~~~~~ > /v2_petsc_check/ParmetisWrappers.c:79:41: warning: passing argument 11 of > 'ParMETIS_V3_PartKway' from incompatible pointer type > [-Wincompatible-pointer-types] > 79 | nparts,tpwgts,ubvec,options, > | ^~~~~ > | | > | PetscScalar * {aka double > *} > > *Ignoring the warning sign leads to code crashing due to Parmetis Error,* > > parmetis error: the sum of tpwgts for constraint #0 is not 1.0 > > *How should I proceed further?* > > Regards, > Pradeep > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Sun May 28 20:57:29 2023 From: bsmith at petsc.dev (Barry Smith) Date: Sun, 28 May 2023 21:57:29 -0400 Subject: [petsc-users] Regarding Issue with Metis Double Precision In-Reply-To: References: <9BC6A91A-F199-4BAD-81C1-CD7523A83590@petsc.dev> Message-ID: <196F83A0-7E7B-4684-A3F5-8C86834D7750@petsc.dev> You must have forgotten the --download-metis in your latest configure command line. > On May 28, 2023, at 9:26 PM, s.pradeep kumar wrote: > > Thanks! I tried adding the flag but it throws in the following error: > > ============================================================================================= > Configuring PETSc to compile on your system > ============================================================================================= > TESTING: configureExternalPackagesDir from config.framework(config/BuildSystem/config/framework.py:1070) > ********************************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > --------------------------------------------------------------------------------------------- > Package parmetis requested but dependency metis not requested. > Perhaps you want --download-metis or --with-metis-dir=directory or > --with-metis-lib=libraries and --with-metis-include=directory > ********************************************************************************************* > > On Sun, May 28, 2023 at 8:12?PM Barry Smith > wrote: >> >> It looks you are calling Metis routines directly. In that case, you can do one of two things >> >> 1) use the additional PETSc ./configure option -download-metis-use-doubleprecision=1 >> >> or >> >> 2) ensure that the arguments such as tpwgts that you pass in are not PetscScalar but instead float. This would require changing your code slightly. 
>> >> Barry >> >> >>> On May 28, 2023, at 8:34 PM, s.pradeep kumar > wrote: >>> >>> Dear All, >>> >>> I recently tried to install Metis along with PETSC 3.19.1, using following command, >>> >>> ./configure --with-scalar-type=real --with-precision=double --download-metis --download-parmetis --with-cmake=1 --with-mpi-dir=/opt/cray/pe/mpich/8.1.9/ofi/gnu/9.1 --download-fblaslapack=1 --with-debugging=0 COPTFLAGS=-O2 FOPTFLAGS=-O2 >>> >>> While compiling my code, I get this warning sign, >>> >>> 34: warning: passing argument 10 of 'ParMETIS_V3_PartKway' from incompatible pointer type [-Wincompatible-pointer-types] >>> 79 | nparts,tpwgts,ubvec,options, >>> | ^~~~~~ >>> | | >>> | PetscScalar * {aka double *} >>> In file included from /v2_petsc_check/ParmetisWrappers.c:5: >>> NumericalLibraries/petsc-3.19.1/gnu-opt/include/parmetis.h:70:22: note: expected 'real_t *' {aka 'float *'} but argument is of type 'PetscScalar *' {aka 'double *'} >>> 70 | real_t *tpwgts, real_t *ubvec, idx_t *options, idx_t *edgecut, idx_t *part, >>> | ~~~~~~~~^~~~~~ >>> /v2_petsc_check/ParmetisWrappers.c:79:41: warning: passing argument 11 of 'ParMETIS_V3_PartKway' from incompatible pointer type [-Wincompatible-pointer-types] >>> 79 | nparts,tpwgts,ubvec,options, >>> | ^~~~~ >>> | | >>> | PetscScalar * {aka double *} >>> >>> Ignoring the warning sign leads to code crashing due to Parmetis Error, >>> >>> parmetis error: the sum of tpwgts for constraint #0 is not 1.0 >>> >>> How should I proceed further? >>> >>> Regards, >>> Pradeep >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From spradeepmahadeek at gmail.com Sun May 28 21:12:49 2023 From: spradeepmahadeek at gmail.com (s.pradeep kumar) Date: Sun, 28 May 2023 21:12:49 -0500 Subject: [petsc-users] Regarding Issue with Metis Double Precision In-Reply-To: <196F83A0-7E7B-4684-A3F5-8C86834D7750@petsc.dev> References: <9BC6A91A-F199-4BAD-81C1-CD7523A83590@petsc.dev> <196F83A0-7E7B-4684-A3F5-8C86834D7750@petsc.dev> Message-ID: Thanks! Resolved my issue. Parmetis works just fine now! On Sun, May 28, 2023 at 8:57?PM Barry Smith wrote: > > You must have forgotten the --download-metis in your latest configure > command line. > > > On May 28, 2023, at 9:26 PM, s.pradeep kumar > wrote: > > Thanks! I tried adding the flag but it throws in the following error: > > > ============================================================================================= > Configuring PETSc to compile on your system > > ============================================================================================= > TESTING: configureExternalPackagesDir from > config.framework(config/BuildSystem/config/framework.py:1070) > > ********************************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for > details): > > --------------------------------------------------------------------------------------------- > Package parmetis requested but dependency metis not requested. > Perhaps you want --download-metis or --with-metis-dir=directory or > --with-metis-lib=libraries and --with-metis-include=directory > > ********************************************************************************************* > > On Sun, May 28, 2023 at 8:12?PM Barry Smith wrote: > >> >> It looks you are calling Metis routines directly. 
In that case, you can >> do one of two things >> >> 1) use the additional PETSc ./configure option >> -download-metis-use-doubleprecision=1 >> >> or >> >> 2) ensure that the arguments such as tpwgts that you pass in are not >> PetscScalar but instead float. This would require changing your code >> slightly. >> >> Barry >> >> >> On May 28, 2023, at 8:34 PM, s.pradeep kumar >> wrote: >> >> Dear All, >> >> * I recently tried to install Metis along with PETSC 3.19.1, >> using following command,* >> >> ./configure --with-scalar-type=real --with-precision=double >> --download-metis --download-parmetis --with-cmake=1 >> --with-mpi-dir=/opt/cray/pe/mpich/8.1.9/ofi/gnu/9.1 >> --download-fblaslapack=1 --with-debugging=0 COPTFLAGS=-O2 FOPTFLAGS=-O2 >> >> *While compiling my code, I get this warning sign, * >> >> 34: warning: passing argument 10 of 'ParMETIS_V3_PartKway' from >> incompatible pointer type [-Wincompatible-pointer-types] >> 79 | nparts,tpwgts,ubvec,options, >> | ^~~~~~ >> | | >> | PetscScalar * {aka double *} >> In file included from /v2_petsc_check/ParmetisWrappers.c:5: >> NumericalLibraries/petsc-3.19.1/gnu-opt/include/parmetis.h:70:22: note: >> expected 'real_t *' {aka 'float *'} but argument is of type 'PetscScalar *' >> {aka 'double *'} >> 70 | real_t *tpwgts, real_t *ubvec, idx_t *options, idx_t >> *edgecut, idx_t *part, >> | ~~~~~~~~^~~~~~ >> /v2_petsc_check/ParmetisWrappers.c:79:41: warning: passing argument 11 of >> 'ParMETIS_V3_PartKway' from incompatible pointer type >> [-Wincompatible-pointer-types] >> 79 | nparts,tpwgts,ubvec,options, >> | ^~~~~ >> | | >> | PetscScalar * {aka double >> *} >> >> *Ignoring the warning sign leads to code crashing due to Parmetis Error,* >> >> parmetis error: the sum of tpwgts for constraint #0 is not 1.0 >> >> *How should I proceed further?* >> >> Regards, >> Pradeep >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jl2862237661 at gmail.com Mon May 29 03:49:40 2023 From: jl2862237661 at gmail.com (Waltz Jan) Date: Mon, 29 May 2023 16:49:40 +0800 Subject: [petsc-users] MatGetValues() can't return the correct values Message-ID: /* Solve J Y = F, where J is Jacobian matrix */ ierr = SNESComputeJacobian(snes, X, snes->jacobian, snes->jacobian_pre); CHKERRQ(ierr); PetscInt rstart, rend; MatGetOwnershipRange(snes->jacobian, &rstart, &rend); PetscInt row=1000, col=1000; PetscScalar v; if (row>=rstart && rowjacobian, 1, &row, 1, &col, &v); PetscPrintf(PETSC_COMM_WORLD, "rstart: %d, rend: %d, row: %d, col: %d, v: %e\n", rstart, rend, row, col, v); } It was supposed to return the value of the matrix at row 1001 and column 1001, but it returned the value at row 2001 and column 2001 instead. There is a two-fold relationship between these coordinates, and I'm not sure if it's related to the fact that I set the number of processes to 2. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Mon May 29 10:01:32 2023 From: bsmith at petsc.dev (Barry Smith) Date: Mon, 29 May 2023 11:01:32 -0400 Subject: [petsc-users] MatGetValues() can't return the correct values In-Reply-To: References: Message-ID: <99A3D1B8-6CA1-41C3-8B19-E68F90A67102@petsc.dev> Perhaps run a very small problem with 2 ranks and use MatView() to see the matrix before getting the values. Maybe use the debugger (-start_in_debugger) and step through the code as it gets values. 
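For instance, a small check along those lines (the matrix J and the indices are illustrative; note that MatGetValues() only reads rows owned by the calling rank, and PetscPrintf(PETSC_COMM_WORLD, ...) prints from rank 0 only, so a synchronized print is used here so whichever rank owns the row reports it):

    Mat         J;                               /* the assembled Jacobian */
    PetscInt    rstart, rend, row = 3, col = 3;  /* small indices for a tiny 2-rank test */
    PetscScalar v;
    PetscMPIInt rank;

    PetscCallMPI(MPI_Comm_rank(PETSC_COMM_WORLD, &rank));
    PetscCall(MatView(J, PETSC_VIEWER_STDOUT_WORLD));   /* look at the whole matrix first */
    PetscCall(MatGetOwnershipRange(J, &rstart, &rend));
    if (row >= rstart && row < rend) {                  /* only the owning rank queries the entry */
      PetscCall(MatGetValues(J, 1, &row, 1, &col, &v));
      PetscCall(PetscSynchronizedPrintf(PETSC_COMM_WORLD,
                "[%d] owns rows %" PetscInt_FMT " to %" PetscInt_FMT ", J(%" PetscInt_FMT ",%" PetscInt_FMT ") = %g\n",
                rank, rstart, rend, row, col, (double)PetscRealPart(v)));
    }
    PetscCall(PetscSynchronizedFlush(PETSC_COMM_WORLD, PETSC_STDOUT));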
I would say there is very little chance it is getting the "wrong values" and is more likely due to a misunderstanding of the matrix usage. Barry > On May 29, 2023, at 4:49 AM, Waltz Jan wrote: > > /* Solve J Y = F, where J is Jacobian matrix */ > ierr = SNESComputeJacobian(snes, X, snes->jacobian, snes->jacobian_pre); > CHKERRQ(ierr); > > PetscInt rstart, rend; > MatGetOwnershipRange(snes->jacobian, &rstart, &rend); > PetscInt row=1000, col=1000; > PetscScalar v; > if (row>=rstart && row { > MatGetValues(snes->jacobian, 1, &row, 1, &col, &v); > PetscPrintf(PETSC_COMM_WORLD, "rstart: %d, rend: %d, row: %d, col: %d, v: %e\n", rstart, rend, row, col, v); > } > > It was supposed to return the value of the matrix at row 1001 and column 1001, but it returned the value at row 2001 and column 2001 instead. There is a two-fold relationship between these coordinates, and I'm not sure if it's related to the fact that I set the number of processes to 2. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ysjosh.lo at gmail.com Mon May 29 16:46:43 2023 From: ysjosh.lo at gmail.com (YuSh Lo) Date: Mon, 29 May 2023 16:46:43 -0500 Subject: [petsc-users] Get global offset in global vector for the points not owned by this processor. Message-ID: Hi, How to get the offset in global vector for the points not owned by this processor? I have a parallel DMPlex and a section assigned to it. GetSectionGetOffset with a global section returns -1 for the points not owned by this processor. Thanks, Josh -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon May 29 19:40:05 2023 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 29 May 2023 20:40:05 -0400 Subject: [petsc-users] Get global offset in global vector for the points not owned by this processor. In-Reply-To: References: Message-ID: On Mon, May 29, 2023 at 5:47?PM YuSh Lo wrote: > Hi, > > How to get the offset in global vector for the points not owned by this > processor? > I have a parallel DMPlex and a section assigned to it. > GetSectionGetOffset with a global section returns -1 for the points not > owned by this processor. > The easiest way to get it is to use DMGetLocalToGlobalMapping() to map the local offset. This is constructed from the global section, because it stores the global offset as -(offset+1). Thanks, Matt > Thanks, > Josh > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From jl2862237661 at gmail.com Mon May 29 20:54:56 2023 From: jl2862237661 at gmail.com (Waltz Jan) Date: Tue, 30 May 2023 09:54:56 +0800 Subject: [petsc-users] petsc-users Digest, Vol 173, Issue 131 In-Reply-To: References: Message-ID: Thank you for your reply and help. I followed your advice to try it out, but found that the results were not quite right. It may be due to my misunderstanding. Could you please help me? 
The specific information is as follows: Codes: /* Solve J Y = F, where J is Jacobian matrix */ ierr = SNESComputeJacobian(snes, X, snes->jacobian, snes->jacobian_pre); CHKERRQ(ierr); PetscViewer viewer; /* View the matrix in a text file */ ierr = PetscViewerASCIIOpen(PETSC_COMM_WORLD, "/share/userfile/jianglei/Muti-layers-well-working/petsc/jacobianmatrix.m", &viewer);CHKERRQ(ierr); ierr = PetscViewerPushFormat(viewer, PETSC_VIEWER_ASCII_MATLAB);CHKERRQ( ierr); /* Optional: use MATLAB format */ ierr = MatView(snes->jacobian, viewer);CHKERRQ(ierr); ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr); MatAssemblyBegin(snes->jacobian,MAT_FINAL_ASSEMBLY); MatAssemblyEnd(snes->jacobian,MAT_FINAL_ASSEMBLY); PetscInt rstart, rend; MatGetOwnershipRange(snes->jacobian, &rstart, &rend); PetscInt row=1000, col=1000; PetscScalar v; if (row>=rstart && rowjacobian, 1, &row, 1, &col, &v); PetscPrintf(PETSC_COMM_WORLD, "rstart: %d, rend: %d, row: %d, col: %d, v: %e\n", rstart, rend, row, col, PetscRealPart(v)); } Results: [image: image.png] [image: image.png] [image: image.png] ====================== Step: 1, time: 0. days==================== 0 SNES Function norm 1.772062116708e-01 rstart: 0, rend: 4000, row: 1000, col: 1000, v: 7.443232e-06 That's all. Could you help me? On Tue, May 30, 2023 at 1:00?AM wrote: > Send petsc-users mailing list submissions to > petsc-users at mcs.anl.gov > > To subscribe or unsubscribe via the World Wide Web, visit > https://lists.mcs.anl.gov/mailman/listinfo/petsc-users > or, via email, send a message with subject or body 'help' to > petsc-users-request at mcs.anl.gov > > You can reach the person managing the list at > petsc-users-owner at mcs.anl.gov > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of petsc-users digest..." > > > Today's Topics: > > 1. MatGetValues() can't return the correct values (Waltz Jan) > 2. Re: MatGetValues() can't return the correct values (Barry Smith) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Mon, 29 May 2023 16:49:40 +0800 > From: Waltz Jan > To: petsc-users at mcs.anl.gov > Subject: [petsc-users] MatGetValues() can't return the correct values > Message-ID: > QF_C-vA at mail.gmail.com> > Content-Type: text/plain; charset="utf-8" > > /* Solve J Y = F, where J is Jacobian matrix */ > ierr = SNESComputeJacobian(snes, X, snes->jacobian, snes->jacobian_pre); > CHKERRQ(ierr); > > PetscInt rstart, rend; > MatGetOwnershipRange(snes->jacobian, &rstart, &rend); > PetscInt row=1000, col=1000; > PetscScalar v; > if (row>=rstart && row { > MatGetValues(snes->jacobian, 1, &row, 1, &col, &v); > PetscPrintf(PETSC_COMM_WORLD, "rstart: %d, rend: %d, row: %d, col: %d, > v: %e\n", rstart, rend, row, col, v); > } > > It was supposed to return the value of the matrix at row 1001 and column > 1001, but it returned the value at row 2001 and column 2001 instead. There > is a two-fold relationship between these coordinates, and I'm not sure if > it's related to the fact that I set the number of processes to 2. > -------------- next part -------------- > An HTML attachment was scrubbed... 
> URL: < > http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20230529/e6c3eb3b/attachment-0001.html > > > > ------------------------------ > > Message: 2 > Date: Mon, 29 May 2023 11:01:32 -0400 > From: Barry Smith > To: Waltz Jan > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] MatGetValues() can't return the correct > values > Message-ID: <99A3D1B8-6CA1-41C3-8B19-E68F90A67102 at petsc.dev> > Content-Type: text/plain; charset="us-ascii" > > > Perhaps run a very small problem with 2 ranks and use MatView() to see > the matrix before getting the values. Maybe use the debugger > (-start_in_debugger) and step through the code as it gets values. I would > say there is very little chance it is getting the "wrong values" and is > more likely due to a misunderstanding of the matrix usage. > > Barry > > > > > > On May 29, 2023, at 4:49 AM, Waltz Jan wrote: > > > > /* Solve J Y = F, where J is Jacobian matrix */ > > ierr = SNESComputeJacobian(snes, X, snes->jacobian, snes->jacobian_pre); > > CHKERRQ(ierr); > > > > PetscInt rstart, rend; > > MatGetOwnershipRange(snes->jacobian, &rstart, &rend); > > PetscInt row=1000, col=1000; > > PetscScalar v; > > if (row>=rstart && row > { > > MatGetValues(snes->jacobian, 1, &row, 1, &col, &v); > > PetscPrintf(PETSC_COMM_WORLD, "rstart: %d, rend: %d, row: %d, col: > %d, v: %e\n", rstart, rend, row, col, v); > > } > > > > It was supposed to return the value of the matrix at row 1001 and column > 1001, but it returned the value at row 2001 and column 2001 instead. There > is a two-fold relationship between these coordinates, and I'm not sure if > it's related to the fact that I set the number of processes to 2. > > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: < > http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20230529/345ddb55/attachment-0001.html > > > > ------------------------------ > > Subject: Digest Footer > > _______________________________________________ > petsc-users mailing list > petsc-users at mcs.anl.gov > https://lists.mcs.anl.gov/mailman/listinfo/petsc-users > > > ------------------------------ > > End of petsc-users Digest, Vol 173, Issue 131 > ********************************************* > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 56591 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 86713 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 58910 bytes Desc: not available URL: From bsmith at petsc.dev Mon May 29 22:06:01 2023 From: bsmith at petsc.dev (Barry Smith) Date: Mon, 29 May 2023 23:06:01 -0400 Subject: [petsc-users] petsc-users Digest, Vol 173, Issue 131 In-Reply-To: References: Message-ID: <32B91C9F-2E5D-4D1F-8F53-D045B570E6BA@petsc.dev> What version of PETSc are you using? The output looks a little funny. Would you be willing to send your code to petsc-maint at mcs.anl.gov so I can debug it quickly? > On May 29, 2023, at 9:54 PM, Waltz Jan wrote: > > Thank you for your reply and help. I followed your advice to try it out, but found that the results were not quite right. It may be due to my misunderstanding. Could you please help me? 
The specific information is as follows: > > Codes: > /* Solve J Y = F, where J is Jacobian matrix */ > ierr = SNESComputeJacobian(snes, X, snes->jacobian, snes->jacobian_pre); > CHKERRQ(ierr); > > PetscViewer viewer; > /* View the matrix in a text file */ > ierr = PetscViewerASCIIOpen(PETSC_COMM_WORLD, "/share/userfile/jianglei/Muti-layers-well-working/petsc/jacobianmatrix.m", &viewer);CHKERRQ(ierr); > ierr = PetscViewerPushFormat(viewer, PETSC_VIEWER_ASCII_MATLAB);CHKERRQ(ierr); /* Optional: use MATLAB format */ > ierr = MatView(snes->jacobian, viewer);CHKERRQ(ierr); > ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr); > > MatAssemblyBegin(snes->jacobian,MAT_FINAL_ASSEMBLY); > MatAssemblyEnd(snes->jacobian,MAT_FINAL_ASSEMBLY); > PetscInt rstart, rend; > MatGetOwnershipRange(snes->jacobian, &rstart, &rend); > PetscInt row=1000, col=1000; > PetscScalar v; > > if (row>=rstart && row { > MatGetValues(snes->jacobian, 1, &row, 1, &col, &v); > PetscPrintf(PETSC_COMM_WORLD, "rstart: %d, rend: %d, row: %d, col: %d, v: %e\n", rstart, rend, row, col, PetscRealPart(v)); > } > > Results: > > > > ====================== Step: 1, time: 0. days==================== > 0 SNES Function norm 1.772062116708e-01 > rstart: 0, rend: 4000, row: 1000, col: 1000, v: 7.443232e-06 > > That's all. Could you help me? > > > On Tue, May 30, 2023 at 1:00?AM > wrote: >> Send petsc-users mailing list submissions to >> petsc-users at mcs.anl.gov >> >> To subscribe or unsubscribe via the World Wide Web, visit >> https://lists.mcs.anl.gov/mailman/listinfo/petsc-users >> or, via email, send a message with subject or body 'help' to >> petsc-users-request at mcs.anl.gov >> >> You can reach the person managing the list at >> petsc-users-owner at mcs.anl.gov >> >> When replying, please edit your Subject line so it is more specific >> than "Re: Contents of petsc-users digest..." >> >> >> Today's Topics: >> >> 1. MatGetValues() can't return the correct values (Waltz Jan) >> 2. Re: MatGetValues() can't return the correct values (Barry Smith) >> >> >> ---------------------------------------------------------------------- >> >> Message: 1 >> Date: Mon, 29 May 2023 16:49:40 +0800 >> From: Waltz Jan > >> To: petsc-users at mcs.anl.gov >> Subject: [petsc-users] MatGetValues() can't return the correct values >> Message-ID: >> > >> Content-Type: text/plain; charset="utf-8" >> >> /* Solve J Y = F, where J is Jacobian matrix */ >> ierr = SNESComputeJacobian(snes, X, snes->jacobian, snes->jacobian_pre); >> CHKERRQ(ierr); >> >> PetscInt rstart, rend; >> MatGetOwnershipRange(snes->jacobian, &rstart, &rend); >> PetscInt row=1000, col=1000; >> PetscScalar v; >> if (row>=rstart && row> { >> MatGetValues(snes->jacobian, 1, &row, 1, &col, &v); >> PetscPrintf(PETSC_COMM_WORLD, "rstart: %d, rend: %d, row: %d, col: %d, >> v: %e\n", rstart, rend, row, col, v); >> } >> >> It was supposed to return the value of the matrix at row 1001 and column >> 1001, but it returned the value at row 2001 and column 2001 instead. There >> is a two-fold relationship between these coordinates, and I'm not sure if >> it's related to the fact that I set the number of processes to 2. >> -------------- next part -------------- >> An HTML attachment was scrubbed... 
>> URL: >> >> ------------------------------ >> >> Message: 2 >> Date: Mon, 29 May 2023 11:01:32 -0400 >> From: Barry Smith > >> To: Waltz Jan > >> Cc: petsc-users at mcs.anl.gov >> Subject: Re: [petsc-users] MatGetValues() can't return the correct >> values >> Message-ID: <99A3D1B8-6CA1-41C3-8B19-E68F90A67102 at petsc.dev > >> Content-Type: text/plain; charset="us-ascii" >> >> >> Perhaps run a very small problem with 2 ranks and use MatView() to see the matrix before getting the values. Maybe use the debugger (-start_in_debugger) and step through the code as it gets values. I would say there is very little chance it is getting the "wrong values" and is more likely due to a misunderstanding of the matrix usage. >> >> Barry >> >> >> >> >> > On May 29, 2023, at 4:49 AM, Waltz Jan > wrote: >> > >> > /* Solve J Y = F, where J is Jacobian matrix */ >> > ierr = SNESComputeJacobian(snes, X, snes->jacobian, snes->jacobian_pre); >> > CHKERRQ(ierr); >> > >> > PetscInt rstart, rend; >> > MatGetOwnershipRange(snes->jacobian, &rstart, &rend); >> > PetscInt row=1000, col=1000; >> > PetscScalar v; >> > if (row>=rstart && row> > { >> > MatGetValues(snes->jacobian, 1, &row, 1, &col, &v); >> > PetscPrintf(PETSC_COMM_WORLD, "rstart: %d, rend: %d, row: %d, col: %d, v: %e\n", rstart, rend, row, col, v); >> > } >> > >> > It was supposed to return the value of the matrix at row 1001 and column 1001, but it returned the value at row 2001 and column 2001 instead. There is a two-fold relationship between these coordinates, and I'm not sure if it's related to the fact that I set the number of processes to 2. >> >> -------------- next part -------------- >> An HTML attachment was scrubbed... >> URL: >> >> ------------------------------ >> >> Subject: Digest Footer >> >> _______________________________________________ >> petsc-users mailing list >> petsc-users at mcs.anl.gov >> https://lists.mcs.anl.gov/mailman/listinfo/petsc-users >> >> >> ------------------------------ >> >> End of petsc-users Digest, Vol 173, Issue 131 >> ********************************************* -------------- next part -------------- An HTML attachment was scrubbed... URL: From ysjosh.lo at gmail.com Wed May 31 00:24:55 2023 From: ysjosh.lo at gmail.com (YuSh Lo) Date: Wed, 31 May 2023 00:24:55 -0500 Subject: [petsc-users] Multiple points constraint in parallel Message-ID: Hi, I have some multiple points constraint input as follows, A_1 a_4 B_2 b_5 C_3 c_6 each columns are stored in different IS. After dmplex distribute, they will be renumbered and distribution to certain processors. I have two questions: (1) I need both complete ISs are all the processors. Can I just do ISALLGather()? (2) Although renumbered, will the original order remain(ABC and abc)? If the number is the node number, after distribution and I do an ISALLGather() will I have the following on each processor? A_3 a_1 B_4 b_2 C_6 c_5 (I randomly renumber them) This is what I can come up with now. Is there any better way to do it? Thanks, Josh -------------- next part -------------- An HTML attachment was scrubbed... URL: From joauma.marichal at uclouvain.be Wed May 31 04:24:57 2023 From: joauma.marichal at uclouvain.be (Joauma Marichal) Date: Wed, 31 May 2023 09:24:57 +0000 Subject: [petsc-users] Error during compile Message-ID: Hello, I am writing to you as I am trying to compile petsc on my mac. 
I used: $ export PATH=/opt/homebrew/bin:/opt/homebrew/sbin:/usr/local/bin:/System/Cryptexes/App/usr/bin:/usr/bin:/bin:/usr/sbin:/sbin:/Library/TeX/texbin:/usr/local/munki $ ./configure --download-fblaslapack --download-hypre --prefix=../marha/lib_petsc Which has worked previously but today, I get: **************************ERROR************************************* Error during compile, check arch-darwin-c-debug/lib/petsc/conf/make.log Send it and arch-darwin-c-debug/lib/petsc/conf/configure.log to petsc-maint at mcs.anl.gov ******************************************************************** I attach the files. Thanks a lot for your help. Best regards, Joauma -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: application/octet-stream Size: 29779 bytes Desc: configure.log URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: make.log Type: application/octet-stream Size: 12940 bytes Desc: make.log URL: From knepley at gmail.com Wed May 31 05:02:46 2023 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 31 May 2023 06:02:46 -0400 Subject: [petsc-users] [petsc-maint] Error during compile In-Reply-To: References: Message-ID: On Wed, May 31, 2023 at 5:25?AM Joauma Marichal < joauma.marichal at uclouvain.be> wrote: > Hello, > > > > I am writing to you as I am trying to compile petsc on my mac. > > > > I used: > > $ export > PATH=/opt/homebrew/bin:/opt/homebrew/sbin:/usr/local/bin:/System/Cryptexes/App/usr/bin:/usr/bin:/bin:/usr/sbin:/sbin:/Library/TeX/texbin:/usr/local/munki > > $ ./configure --download-fblaslapack --download-hypre > --prefix=../marha/lib_petsc > This is a bug in make 4.4.1. Either upgrade to the latest PETSc release or make OMAKE_PRINTDIR=make all Thanks, Matt > Which has worked previously but today, I get: > > > > ***************************ERROR************************************** > > *Error during compile, check > arch-darwin-c-debug/lib/petsc/conf/make.log* > > *Send it and arch-darwin-c-debug/lib/petsc/conf/configure.log to > petsc-maint at mcs.anl.gov * > > ********************************************************************** > > > > > > I attach the files. > > Thanks a lot for your help. > > > > Best regards, > > > > Joauma > > > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed May 31 05:08:12 2023 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 31 May 2023 06:08:12 -0400 Subject: [petsc-users] Multiple points constraint in parallel In-Reply-To: References: Message-ID: On Wed, May 31, 2023 at 1:25?AM YuSh Lo wrote: > Hi, > > I have some multiple points constraint input as follows, > > A_1 a_4 > B_2 b_5 > C_3 c_6 > > each columns are stored in different IS. > So one IS lists the capital letter and one lists the lowercase? > After dmplex distribute, they will be renumbered and distribution to > certain processors. > Plex does normally renumber and send ISes, so you are doing this yourself? > I have two questions: > > (1) I need both complete ISs are all the processors. Can I just do > ISALLGather()? > I think so. I cannot tell what you want to do. 
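As an illustration of that first point, gathering an IS onto every rank can look like this (isLocal stands for whichever constraint IS is being gathered; ISAllGather() concatenates the local pieces in rank order):

    IS isLocal, isAll;   /* isLocal: the distributed IS; isAll: the same full list on every rank */

    PetscCall(ISAllGather(isLocal, &isAll));             /* collective on the IS's communicator */
    PetscCall(ISView(isAll, PETSC_VIEWER_STDOUT_WORLD));
    PetscCall(ISDestroy(&isAll));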
> (2) Although renumbered, will the original order remain(ABC and abc)? > If the number is the node number, after distribution and I do an > ISALLGather() > will I have the following on each processor? > > A_3 a_1 > B_4 b_2 > C_6 c_5 > (I randomly renumber them) > > > This is what I can come up with now. Is there any better way to do it? > Can you tell me what you want to use this for? Maybe there is an easier way. For example, if we want to impose a constraint on a mesh point, usually I mark that point with a DMLabel. These are propagated during distribution so you do not have to think about it. Thanks, Matt > Thanks, > Josh > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From joauma.marichal at uclouvain.be Wed May 31 05:34:09 2023 From: joauma.marichal at uclouvain.be (Joauma Marichal) Date: Wed, 31 May 2023 10:34:09 +0000 Subject: [petsc-users] [petsc-maint] Error during compile In-Reply-To: References: Message-ID: Hello, Thanks a lot, it worked. However my code does not compile anymore (I haven?t changed anything since yesterday and it worked then). I get the following error: -L/opt/homebrew/Cellar/gcc/12.2.0/lib/gcc/current -lpetsc -lHYPRE -lflapack -lfblas -lc++ -lmpifort -lmpi -lpmpi -lgfortran -lemutls_w -lquadmath -lc++ ld: warning: directory not found for option '-L/opt/homebrew/Cellar/mpich/4.1/lib' ld: warning: directory not found for option '-L/opt/homebrew/Cellar/gcc/12.2.0/lib/gcc/current/gcc/aarch64-apple-darwin22/12' ld: warning: directory not found for option '-L/opt/homebrew/Cellar/gcc/12.2.0/lib/gcc/current/gcc' ld: warning: directory not found for option '-L/opt/homebrew/Cellar/gcc/12.2.0/lib/gcc/current' ld: library not found for -lgfortran clang: error: linker command failed with exit code 1 (use -v to see invocation) I attach the complete log file. Thanks a lot for your help. Best regards, Joauma De : Matthew Knepley Date : mercredi, 31 mai 2023 ? 12:03 ? : Joauma Marichal Cc : PETSc , petsc-users at mcs.anl.gov Objet : Re: [petsc-maint] Error during compile On Wed, May 31, 2023 at 5:25?AM Joauma Marichal > wrote: Hello, I am writing to you as I am trying to compile petsc on my mac. I used: $ export PATH=/opt/homebrew/bin:/opt/homebrew/sbin:/usr/local/bin:/System/Cryptexes/App/usr/bin:/usr/bin:/bin:/usr/sbin:/sbin:/Library/TeX/texbin:/usr/local/munki $ ./configure --download-fblaslapack --download-hypre --prefix=../marha/lib_petsc This is a bug in make 4.4.1. Either upgrade to the latest PETSc release or make OMAKE_PRINTDIR=make all Thanks, Matt Which has worked previously but today, I get: **************************ERROR************************************* Error during compile, check arch-darwin-c-debug/lib/petsc/conf/make.log Send it and arch-darwin-c-debug/lib/petsc/conf/configure.log to petsc-maint at mcs.anl.gov ******************************************************************** I attach the files. Thanks a lot for your help. Best regards, Joauma -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: log_make Type: application/octet-stream Size: 9090 bytes Desc: log_make URL: From knepley at gmail.com Wed May 31 05:55:05 2023 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 31 May 2023 06:55:05 -0400 Subject: [petsc-users] [petsc-maint] Error during compile In-Reply-To: References: Message-ID: On Wed, May 31, 2023 at 6:34?AM Joauma Marichal < joauma.marichal at uclouvain.be> wrote: > Hello, > > > > Thanks a lot, it worked. However my code does not compile anymore (I > haven?t changed anything since yesterday and it worked then). I get the > following error: > It looks like your Homebrew has changed. In particular, the gcc 12.2.0 is missing, and presumably that is where libgfortran was. Thanks, Matt > -L/opt/homebrew/Cellar/gcc/12.2.0/lib/gcc/current -lpetsc -lHYPRE > -lflapack -lfblas -lc++ -lmpifort -lmpi -lpmpi -lgfortran -lemutls_w > -lquadmath -lc++ > > ld: warning: directory not found for option > '-L/opt/homebrew/Cellar/mpich/4.1/lib' > > ld: warning: directory not found for option > '-L/opt/homebrew/Cellar/gcc/12.2.0/lib/gcc/current/gcc/aarch64-apple-darwin22/12' > > ld: warning: directory not found for option > '-L/opt/homebrew/Cellar/gcc/12.2.0/lib/gcc/current/gcc' > > ld: warning: directory not found for option > '-L/opt/homebrew/Cellar/gcc/12.2.0/lib/gcc/current' > > ld: library not found for -lgfortran > > clang: *error: **linker command failed with exit code 1 (use -v to see > invocation)* > > > > > > I attach the complete log file. > > > > Thanks a lot for your help. > > > > Best regards, > > > > Joauma > > > > *De : *Matthew Knepley > *Date : *mercredi, 31 mai 2023 ? 12:03 > *? : *Joauma Marichal > *Cc : *PETSc , petsc-users at mcs.anl.gov < > petsc-users at mcs.anl.gov> > *Objet : *Re: [petsc-maint] Error during compile > > On Wed, May 31, 2023 at 5:25?AM Joauma Marichal < > joauma.marichal at uclouvain.be> wrote: > > Hello, > > > > I am writing to you as I am trying to compile petsc on my mac. > > > > I used: > > $ export > PATH=/opt/homebrew/bin:/opt/homebrew/sbin:/usr/local/bin:/System/Cryptexes/App/usr/bin:/usr/bin:/bin:/usr/sbin:/sbin:/Library/TeX/texbin:/usr/local/munki > > $ ./configure --download-fblaslapack --download-hypre > --prefix=../marha/lib_petsc > > > > This is a bug in make 4.4.1. Either upgrade to the latest PETSc release or > > > > make OMAKE_PRINTDIR=make all > > > > Thanks, > > > > Matt > > > > Which has worked previously but today, I get: > > > > ***************************ERROR************************************** > > *Error during compile, check > arch-darwin-c-debug/lib/petsc/conf/make.log* > > *Send it and arch-darwin-c-debug/lib/petsc/conf/configure.log to > petsc-maint at mcs.anl.gov * > > ********************************************************************** > > > > > > I attach the files. > > Thanks a lot for your help. > > > > Best regards, > > > > Joauma > > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From joauma.marichal at uclouvain.be Wed May 31 05:58:37 2023 From: joauma.marichal at uclouvain.be (Joauma Marichal) Date: Wed, 31 May 2023 10:58:37 +0000 Subject: [petsc-users] [petsc-maint] Error during compile In-Reply-To: References: Message-ID: Hello, Thanks a lot for your help. What do I have to change in order to make it work again? Best regards, Joauma De : Matthew Knepley Date : mercredi, 31 mai 2023 ? 12:55 ? : Joauma Marichal Cc : PETSc , petsc-users at mcs.anl.gov Objet : Re: [petsc-maint] Error during compile On Wed, May 31, 2023 at 6:34?AM Joauma Marichal > wrote: Hello, Thanks a lot, it worked. However my code does not compile anymore (I haven?t changed anything since yesterday and it worked then). I get the following error: It looks like your Homebrew has changed. In particular, the gcc 12.2.0 is missing, and presumably that is where libgfortran was. Thanks, Matt -L/opt/homebrew/Cellar/gcc/12.2.0/lib/gcc/current -lpetsc -lHYPRE -lflapack -lfblas -lc++ -lmpifort -lmpi -lpmpi -lgfortran -lemutls_w -lquadmath -lc++ ld: warning: directory not found for option '-L/opt/homebrew/Cellar/mpich/4.1/lib' ld: warning: directory not found for option '-L/opt/homebrew/Cellar/gcc/12.2.0/lib/gcc/current/gcc/aarch64-apple-darwin22/12' ld: warning: directory not found for option '-L/opt/homebrew/Cellar/gcc/12.2.0/lib/gcc/current/gcc' ld: warning: directory not found for option '-L/opt/homebrew/Cellar/gcc/12.2.0/lib/gcc/current' ld: library not found for -lgfortran clang: error: linker command failed with exit code 1 (use -v to see invocation) I attach the complete log file. Thanks a lot for your help. Best regards, Joauma De : Matthew Knepley > Date : mercredi, 31 mai 2023 ? 12:03 ? : Joauma Marichal > Cc : PETSc >, petsc-users at mcs.anl.gov > Objet : Re: [petsc-maint] Error during compile On Wed, May 31, 2023 at 5:25?AM Joauma Marichal > wrote: Hello, I am writing to you as I am trying to compile petsc on my mac. I used: $ export PATH=/opt/homebrew/bin:/opt/homebrew/sbin:/usr/local/bin:/System/Cryptexes/App/usr/bin:/usr/bin:/bin:/usr/sbin:/sbin:/Library/TeX/texbin:/usr/local/munki $ ./configure --download-fblaslapack --download-hypre --prefix=../marha/lib_petsc This is a bug in make 4.4.1. Either upgrade to the latest PETSc release or make OMAKE_PRINTDIR=make all Thanks, Matt Which has worked previously but today, I get: **************************ERROR************************************* Error during compile, check arch-darwin-c-debug/lib/petsc/conf/make.log Send it and arch-darwin-c-debug/lib/petsc/conf/configure.log to petsc-maint at mcs.anl.gov ******************************************************************** I attach the files. Thanks a lot for your help. Best regards, Joauma -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From joauma.marichal at uclouvain.be Wed May 31 06:05:50 2023 From: joauma.marichal at uclouvain.be (Joauma Marichal) Date: Wed, 31 May 2023 11:05:50 +0000 Subject: [petsc-users] [petsc-maint] Error during compile In-Reply-To: References: Message-ID: Hello, I was able to fix it. Thanks again. 
Best regards, Joauma De : Matthew Knepley Date : mercredi, 31 mai 2023 ? 12:55 ? : Joauma Marichal Cc : PETSc , petsc-users at mcs.anl.gov Objet : Re: [petsc-maint] Error during compile On Wed, May 31, 2023 at 6:34?AM Joauma Marichal > wrote: Hello, Thanks a lot, it worked. However my code does not compile anymore (I haven?t changed anything since yesterday and it worked then). I get the following error: It looks like your Homebrew has changed. In particular, the gcc 12.2.0 is missing, and presumably that is where libgfortran was. Thanks, Matt -L/opt/homebrew/Cellar/gcc/12.2.0/lib/gcc/current -lpetsc -lHYPRE -lflapack -lfblas -lc++ -lmpifort -lmpi -lpmpi -lgfortran -lemutls_w -lquadmath -lc++ ld: warning: directory not found for option '-L/opt/homebrew/Cellar/mpich/4.1/lib' ld: warning: directory not found for option '-L/opt/homebrew/Cellar/gcc/12.2.0/lib/gcc/current/gcc/aarch64-apple-darwin22/12' ld: warning: directory not found for option '-L/opt/homebrew/Cellar/gcc/12.2.0/lib/gcc/current/gcc' ld: warning: directory not found for option '-L/opt/homebrew/Cellar/gcc/12.2.0/lib/gcc/current' ld: library not found for -lgfortran clang: error: linker command failed with exit code 1 (use -v to see invocation) I attach the complete log file. Thanks a lot for your help. Best regards, Joauma De : Matthew Knepley > Date : mercredi, 31 mai 2023 ? 12:03 ? : Joauma Marichal > Cc : PETSc >, petsc-users at mcs.anl.gov > Objet : Re: [petsc-maint] Error during compile On Wed, May 31, 2023 at 5:25?AM Joauma Marichal > wrote: Hello, I am writing to you as I am trying to compile petsc on my mac. I used: $ export PATH=/opt/homebrew/bin:/opt/homebrew/sbin:/usr/local/bin:/System/Cryptexes/App/usr/bin:/usr/bin:/bin:/usr/sbin:/sbin:/Library/TeX/texbin:/usr/local/munki $ ./configure --download-fblaslapack --download-hypre --prefix=../marha/lib_petsc This is a bug in make 4.4.1. Either upgrade to the latest PETSc release or make OMAKE_PRINTDIR=make all Thanks, Matt Which has worked previously but today, I get: **************************ERROR************************************* Error during compile, check arch-darwin-c-debug/lib/petsc/conf/make.log Send it and arch-darwin-c-debug/lib/petsc/conf/configure.log to petsc-maint at mcs.anl.gov ******************************************************************** I attach the files. Thanks a lot for your help. Best regards, Joauma -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From xiongziming2010 at gmail.com Wed May 31 09:53:02 2023 From: xiongziming2010 at gmail.com (ziming xiong) Date: Wed, 31 May 2023 16:53:02 +0200 Subject: [petsc-users] ask for the ERROR MESSAGE Message-ID: Hello, I created a code for solving the finite element method using the PCBDDC method according to ex71, but I got the following error at the end, I can't find out the cause of the error, can you tell me what the error is based on the error message? (PS. I have tried: Mat AA. PetscCall(MatConvert(A, MATAIJ, MAT_INITIAL_MATRIX, &AA)). 
to change the matrix type and then solve for AA, it works fine) [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: [1]PETSC ERROR: Error in external library Error in external library [0]PETSC ERROR: [1]PETSC ERROR: Error in query to SYEV Lapack routine 0 Error in query to SYEV Lapack routine 0 [0]PETSC ERROR: [1]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. See https://petsc.org/release/faq/ for trouble shooting. [0]PETSC ERROR: [1]PETSC ERROR: Petsc Development GIT revision: unknown GIT Date: unknown Petsc Development GIT revision: unknown GIT Date: unknown [0]PETSC ERROR: [1]PETSC ERROR: test_petsc_fem.exe on a arch-mswin-c-debug named lmeep-329 by XiongZiming Wed May 31 16:52:36 2023 test_petsc_fem.exe on a arch-mswin-c-debug named lmeep-329 by XiongZiming Wed May 31 16:52:36 2023 [0]PETSC ERROR: [1]PETSC ERROR: Configure options --with-cc=/cygdrive/f/post_doc_C++/petsc_new/petsc-main/lib/petsc/bin/win32fe/win_cl --with-cxx=/cygdrive/f/post_doc_C++/petsc_new/petsc-main/lib/petsc/bin/win32fe/win_cl --with-fc=0 --download-metis --with-shared-libraries=0 --with-mpi-include="[/cygdrive/c/PROGRA~2/Intel/MPI/Include,/cygdrive/c/PROGRA~2/Intel/MPI/Include/x64]" --with-mpi-lib="-L/cygdrive/c/PROGRA~2/Intel/MPI/lib/x64 msmpifec.lib msmpi.lib" --with-mpiexec=/cygdrive/c/PROGRA~1/Microsoft_MPI/Bin/mpiexec --with-blaslapack-lib="-L/cygdrive/c/PROGRA~2/Intel/oneAPI/mkl/latest/lib/intel64 mkl_intel_lp64_dll.lib mkl_sequential_dll.lib mkl_core_dll.lib" --with-64-bit-indices --with-mkl_pardiso-dir=/cygdrive/c/PROGRA~2/Intel/oneAPI/mkl/latest Configure options --with-cc=/cygdrive/f/post_doc_C++/petsc_new/petsc-main/lib/petsc/bin/win32fe/win_cl --with-cxx=/cygdrive/f/post_doc_C++/petsc_new/petsc-main/lib/petsc/bin/win32fe/win_cl --with-fc=0 --download-metis --with-shared-libraries=0 --with-mpi-include="[/cygdrive/c/PROGRA~2/Intel/MPI/Include,/cygdrive/c/PROGRA~2/Intel/MPI/Include/x64]" --with-mpi-lib="-L/cygdrive/c/PROGRA~2/Intel/MPI/lib/x64 msmpifec.lib msmpi.lib" --with-mpiexec=/cygdrive/c/PROGRA~1/Microsoft_MPI/Bin/mpiexec --with-blaslapack-lib="-L/cygdrive/c/PROGRA~2/Intel/oneAPI/mkl/latest/lib/intel64 mkl_intel_lp64_dll.lib mkl_sequential_dll.lib mkl_core_dll.lib" --with-64-bit-indices --with-mkl_pardiso-dir=/cygdrive/c/PROGRA~2/Intel/oneAPI/mkl/latest [0]PETSC ERROR: [1]PETSC ERROR: #1 PCBDDCConstraintsSetUp() at F:\post_doc_C++\petsc_new\petsc-main\src\ksp\pc\impls\bddc\bddcprivate.c:5975 #1 PCBDDCConstraintsSetUp() at F:\post_doc_C++\petsc_new\petsc-main\src\ksp\pc\impls\bddc\bddcprivate.c:5975 [0]PETSC ERROR: [1]PETSC ERROR: #2 PCSetUp_BDDC() at F:\post_doc_C++\petsc_new\petsc-main\src\ksp\pc\impls\bddc\bddc.c:1659 #2 PCSetUp_BDDC() at F:\post_doc_C++\petsc_new\petsc-main\src\ksp\pc\impls\bddc\bddc.c:1659 [0]PETSC ERROR: [1]PETSC ERROR: #3 PCSetUp() at F:\post_doc_C++\petsc_new\petsc-main\src\ksp\pc\interface\precon.c:994 #3 PCSetUp() at F:\post_doc_C++\petsc_new\petsc-main\src\ksp\pc\interface\precon.c:994 [1]PETSC ERROR: [0]PETSC ERROR: #4 KSPSetUp() at F:\post_doc_C++\petsc_new\petsc-main\src\ksp\ksp\interface\itfunc.c:406 #4 KSPSetUp() at F:\post_doc_C++\petsc_new\petsc-main\src\ksp\ksp\interface\itfunc.c:406 [1]PETSC ERROR: [0]PETSC ERROR: #5 petsc_calcul_part_DMDA() at petsc_DMDA.cpp:285 #5 petsc_calcul_part_DMDA() at petsc_DMDA.cpp:285 [1]PETSC ERROR: [0]PETSC ERROR: #6 
main() at ..\Main.cpp:281 #6 main() at ..\Main.cpp:281 [1]PETSC ERROR: [0]PETSC ERROR: No PETSc Option Table entries No PETSc Option Table entries [1]PETSC ERROR: [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- job aborted: [ranks] message [0-1] application aborted aborting MPI_COMM_SELF (comm=0x44000001), error 76, comm rank 0 ---- error analysis ----- [0-1] on lmeep-329 test_petsc_fem.exe aborted the job. abort code 76 ---- error analysis ----- Ziming XIONG -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed May 31 11:57:22 2023 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 31 May 2023 12:57:22 -0400 Subject: [petsc-users] ask for the ERROR MESSAGE In-Reply-To: References: Message-ID: <12944ACA-6E65-4F26-A282-9F9605379BFD@petsc.dev> This is an inconsistent error message. It prints > Error in query to SYEV Lapack routine 0 But the error check is on lierr != 0 so the error check should not have been triggered. I inspected the code and there is no obvious problem with mixing up different-size integers; the prototype for the function expects a PetscBlasInt, and that is what is passed in. Strong compiler optimizations are not turned on so it seems unlikely it is due to a compiler bug. Are you familiar with debugging? You can use the command line option -start_in_debugger type cont in the debugger windows, and when the debugger stops with the error, you can type bt and look around at the variables to see if any look off. Barry > On May 31, 2023, at 10:53 AM, ziming xiong wrote: > > Hello, > I created a code for solving the finite element method using the PCBDDC method according to ex71, but I got the following error at the end, I can't find out the cause of the error, can you tell me what the error is based on the error message? > (PS. I have tried: > Mat AA. > PetscCall(MatConvert(A, MATAIJ, MAT_INITIAL_MATRIX, &AA)). > to change the matrix type and then solve for AA, it works fine) > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: > [1]PETSC ERROR: > Error in external library > Error in external library > [0]PETSC ERROR: > [1]PETSC ERROR: > Error in query to SYEV Lapack routine 0 > Error in query to SYEV Lapack routine 0 > [0]PETSC ERROR: > [1]PETSC ERROR: > See https://petsc.org/release/faq/ for trouble shooting. > See https://petsc.org/release/faq/ for trouble shooting. 
> [0]PETSC ERROR: > [1]PETSC ERROR: > Petsc Development GIT revision: unknown GIT Date: unknown > Petsc Development GIT revision: unknown GIT Date: unknown > [0]PETSC ERROR: > [1]PETSC ERROR: > test_petsc_fem.exe on a arch-mswin-c-debug named lmeep-329 by XiongZiming Wed May 31 16:52:36 2023 > test_petsc_fem.exe on a arch-mswin-c-debug named lmeep-329 by XiongZiming Wed May 31 16:52:36 2023 > [0]PETSC ERROR: > [1]PETSC ERROR: > Configure options --with-cc=/cygdrive/f/post_doc_C++/petsc_new/petsc-main/lib/petsc/bin/win32fe/win_cl --with-cxx=/cygdrive/f/post_doc_C++/petsc_new/petsc-main/lib/petsc/bin/win32fe/win_cl --with-fc=0 --download-metis --with-shared-libraries=0 --with-mpi-include="[/cygdrive/c/PROGRA~2/Intel/MPI/Include,/cygdrive/c/PROGRA~2/Intel/MPI/Include/x64]" --with-mpi-lib="-L/cygdrive/c/PROGRA~2/Intel/MPI/lib/x64 msmpifec.lib msmpi.lib" --with-mpiexec=/cygdrive/c/PROGRA~1/Microsoft_MPI/Bin/mpiexec --with-blaslapack-lib="-L/cygdrive/c/PROGRA~2/Intel/oneAPI/mkl/latest/lib/intel64 mkl_intel_lp64_dll.lib mkl_sequential_dll.lib mkl_core_dll.lib" --with-64-bit-indices --with-mkl_pardiso-dir=/cygdrive/c/PROGRA~2/Intel/oneAPI/mkl/latest > Configure options --with-cc=/cygdrive/f/post_doc_C++/petsc_new/petsc-main/lib/petsc/bin/win32fe/win_cl --with-cxx=/cygdrive/f/post_doc_C++/petsc_new/petsc-main/lib/petsc/bin/win32fe/win_cl --with-fc=0 --download-metis --with-shared-libraries=0 --with-mpi-include="[/cygdrive/c/PROGRA~2/Intel/MPI/Include,/cygdrive/c/PROGRA~2/Intel/MPI/Include/x64]" --with-mpi-lib="-L/cygdrive/c/PROGRA~2/Intel/MPI/lib/x64 msmpifec.lib msmpi.lib" --with-mpiexec=/cygdrive/c/PROGRA~1/Microsoft_MPI/Bin/mpiexec --with-blaslapack-lib="-L/cygdrive/c/PROGRA~2/Intel/oneAPI/mkl/latest/lib/intel64 mkl_intel_lp64_dll.lib mkl_sequential_dll.lib mkl_core_dll.lib" --with-64-bit-indices --with-mkl_pardiso-dir=/cygdrive/c/PROGRA~2/Intel/oneAPI/mkl/latest > [0]PETSC ERROR: > [1]PETSC ERROR: > #1 PCBDDCConstraintsSetUp() at F:\post_doc_C++\petsc_new\petsc-main\src\ksp\pc\impls\bddc\bddcprivate.c:5975 > #1 PCBDDCConstraintsSetUp() at F:\post_doc_C++\petsc_new\petsc-main\src\ksp\pc\impls\bddc\bddcprivate.c:5975 > [0]PETSC ERROR: > [1]PETSC ERROR: > #2 PCSetUp_BDDC() at F:\post_doc_C++\petsc_new\petsc-main\src\ksp\pc\impls\bddc\bddc.c:1659 > #2 PCSetUp_BDDC() at F:\post_doc_C++\petsc_new\petsc-main\src\ksp\pc\impls\bddc\bddc.c:1659 > [0]PETSC ERROR: > [1]PETSC ERROR: > #3 PCSetUp() at F:\post_doc_C++\petsc_new\petsc-main\src\ksp\pc\interface\precon.c:994 > #3 PCSetUp() at F:\post_doc_C++\petsc_new\petsc-main\src\ksp\pc\interface\precon.c:994 > [1]PETSC ERROR: > [0]PETSC ERROR: > #4 KSPSetUp() at F:\post_doc_C++\petsc_new\petsc-main\src\ksp\ksp\interface\itfunc.c:406 > #4 KSPSetUp() at F:\post_doc_C++\petsc_new\petsc-main\src\ksp\ksp\interface\itfunc.c:406 > [1]PETSC ERROR: > [0]PETSC ERROR: > #5 petsc_calcul_part_DMDA() at petsc_DMDA.cpp:285 > #5 petsc_calcul_part_DMDA() at petsc_DMDA.cpp:285 > [1]PETSC ERROR: > [0]PETSC ERROR: > #6 main() at ..\Main.cpp:281 > #6 main() at ..\Main.cpp:281 > [1]PETSC ERROR: > [0]PETSC ERROR: > No PETSc Option Table entries > No PETSc Option Table entries > [1]PETSC ERROR: > [0]PETSC ERROR: > ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- > ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- > > job aborted: > [ranks] message > > [0-1] application aborted > aborting MPI_COMM_SELF (comm=0x44000001), error 76, comm rank 0 > > ---- error 
analysis ----- > > [0-1] on lmeep-329 > test_petsc_fem.exe aborted the job. abort code 76 > > ---- error analysis ----- > > > > Ziming XIONG -------------- next part -------------- An HTML attachment was scrubbed... URL: From ysjosh.lo at gmail.com Wed May 31 12:53:45 2023 From: ysjosh.lo at gmail.com (YuSh Lo) Date: Wed, 31 May 2023 12:53:45 -0500 Subject: [petsc-users] Multiple points constraint in parallel In-Reply-To: References: Message-ID: Hi Matthew, Matthew Knepley ? 2023?5?31? ?? ??5:08??? > On Wed, May 31, 2023 at 1:25?AM YuSh Lo wrote: > >> Hi, >> >> I have some multiple points constraint input as follows, >> >> A_1 a_4 >> B_2 b_5 >> C_3 c_6 >> >> each columns are stored in different IS. >> > > So one IS lists the capital letter and one lists the lowercase? > Yes. > > >> After dmplex distribute, they will be renumbered and distribution to >> certain processors. >> > > Plex does normally renumber and send ISes, so you are doing this yourself? > No, Plex does the distribution. > > >> I have two questions: >> >> (1) I need both complete ISs are all the processors. Can I just do >> ISALLGather()? >> > > I think so. I cannot tell what you want to do. > > >> (2) Although renumbered, will the original order remain(ABC and abc)? >> If the number is the node number, after distribution and I do an >> ISALLGather() >> will I have the following on each processor? >> >> A_3 a_1 >> B_4 b_2 >> C_6 c_5 >> (I randomly renumber them) >> >> >> This is what I can come up with now. Is there any better way to do it? >> > > Can you tell me what you want to use this for? Maybe there is an easier > way. For example, > if we want to impose a constraint on a mesh point, usually I mark that > point with a DMLabel. > These are propagated during distribution so you do not have to think about > it. > So I have to impose constraint between two nodes. The upper case node is controlled by the lower case node, and there can be many pairs of constraint. When looping over elements if one node is controlled by the other then the corresponding entry has to be added to a different location e.x. [an index][A_3] to [an index][a_1]. I know how to use DMLabel, but I have used it on one node at a time only. I used DMLabel to mark those nodes with BCs. Now I must know the info of two nodes at the same time. The operations only have to done in assembling and calculating the element stiffness matrix of the elements that contain the upper case node, but I must know the info of the corresponding lower case node so I know where to assemble the entry. Thanks, Josh > > Thanks, > > Matt > > >> Thanks, >> Josh >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed May 31 13:02:41 2023 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 31 May 2023 14:02:41 -0400 Subject: [petsc-users] Multiple points constraint in parallel In-Reply-To: References: Message-ID: On Wed, May 31, 2023 at 1:53?PM YuSh Lo wrote: > Hi Matthew, > > Matthew Knepley ? 2023?5?31? ?? ??5:08??? > >> On Wed, May 31, 2023 at 1:25?AM YuSh Lo wrote: >> >>> Hi, >>> >>> I have some multiple points constraint input as follows, >>> >>> A_1 a_4 >>> B_2 b_5 >>> C_3 c_6 >>> >>> each columns are stored in different IS. 
>>> >> >> So one IS lists the capital letter and one lists the lowercase? >> > Yes. > >> >> >>> After dmplex distribute, they will be renumbered and distribution to >>> certain processors. >>> >> >> Plex does normally renumber and send ISes, so you are doing this yourself? >> > No, Plex does the distribution. > I do not understand this. How does the DM know about this IS? Are you calling DMDistributeFieldIS()? > >> >>> I have two questions: >>> >>> (1) I need both complete ISs are all the processors. Can I just do >>> ISALLGather()? >>> >> >> I think so. I cannot tell what you want to do. >> >> >>> (2) Although renumbered, will the original order remain(ABC and abc)? >>> If the number is the node number, after distribution and I do an >>> ISALLGather() >>> will I have the following on each processor? >>> >>> A_3 a_1 >>> B_4 b_2 >>> C_6 c_5 >>> (I randomly renumber them) >>> >>> >>> This is what I can come up with now. Is there any better way to do it? >>> >> >> Can you tell me what you want to use this for? Maybe there is an easier >> way. For example, >> if we want to impose a constraint on a mesh point, usually I mark that >> point with a DMLabel. >> These are propagated during distribution so you do not have to think >> about it. >> > > So I have to impose constraint between two nodes. The upper case node > is controlled by the lower case node, and there can be many pairs of > constraint. When looping over elements if one node is controlled by the > other then the corresponding entry has to be added to a different location > e.x. [an index][A_3] to [an index][a_1]. I know how to use DMLabel, but I > have used it on one node at a time only. I used DMLabel to mark those nodes > with BCs. Now I must know the info of two nodes at the same time. The > operations only have to done in assembling and calculating the element > stiffness matrix of the elements that contain the upper case node, but I > must know the info of the corresponding lower case node so I know where to > assemble the entry. > Oh, if you want to associate two mesh points, use a PetscSF. These can also be remapped. Thanks, Matt > Thanks, > Josh > > >> >> Thanks, >> >> Matt >> >> >>> Thanks, >>> Josh >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From ysjosh.lo at gmail.com Wed May 31 13:42:51 2023 From: ysjosh.lo at gmail.com (YuSh Lo) Date: Wed, 31 May 2023 13:42:51 -0500 Subject: [petsc-users] Multiple points constraint in parallel In-Reply-To: References: Message-ID: Matthew Knepley ? 2023?5?31? ?? ??1:02??? > On Wed, May 31, 2023 at 1:53?PM YuSh Lo wrote: > >> Hi Matthew, >> >> Matthew Knepley ? 2023?5?31? ?? ??5:08??? >> >>> On Wed, May 31, 2023 at 1:25?AM YuSh Lo wrote: >>> >>>> Hi, >>>> >>>> I have some multiple points constraint input as follows, >>>> >>>> A_1 a_4 >>>> B_2 b_5 >>>> C_3 c_6 >>>> >>>> each columns are stored in different IS. >>>> >>> >>> So one IS lists the capital letter and one lists the lowercase? >>> >> Yes. 
>> >>> >>> >>>> After dmplex distribute, they will be renumbered and distribution to >>>> certain processors. >>>> >>> >>> Plex does normally renumber and send ISes, so you are doing this >>> yourself? >>> >> No, Plex does the distribution. >> > > I do not understand this. How does the DM know about this IS? Are you > calling DMDistributeFieldIS()? > I first create a serial DM and use DMLabel to mark those nodes, and store them in IS. Then I call DMPlexDistribute. > > >> >>> >>>> I have two questions: >>>> >>>> (1) I need both complete ISs are all the processors. Can I just do >>>> ISALLGather()? >>>> >>> >>> I think so. I cannot tell what you want to do. >>> >>> >>>> (2) Although renumbered, will the original order remain(ABC and abc)? >>>> If the number is the node number, after distribution and I do an >>>> ISALLGather() >>>> will I have the following on each processor? >>>> >>>> A_3 a_1 >>>> B_4 b_2 >>>> C_6 c_5 >>>> (I randomly renumber them) >>>> >>>> >>>> This is what I can come up with now. Is there any better way to do it? >>>> >>> >>> Can you tell me what you want to use this for? Maybe there is an easier >>> way. For example, >>> if we want to impose a constraint on a mesh point, usually I mark that >>> point with a DMLabel. >>> These are propagated during distribution so you do not have to think >>> about it. >>> >> >> So I have to impose constraint between two nodes. The upper case node >> is controlled by the lower case node, and there can be many pairs of >> constraint. When looping over elements if one node is controlled by the >> other then the corresponding entry has to be added to a different location >> e.x. [an index][A_3] to [an index][a_1]. I know how to use DMLabel, but I >> have used it on one node at a time only. I used DMLabel to mark those nodes >> with BCs. Now I must know the info of two nodes at the same time. The >> operations only have to done in assembling and calculating the element >> stiffness matrix of the elements that contain the upper case node, but I >> must know the info of the corresponding lower case node so I know where to >> assemble the entry. >> > > Oh, if you want to associate two mesh points, use a PetscSF. These can > also be remapped. > > Thanks, > > Matt > > >> Thanks, >> Josh >> >> >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Thanks, >>>> Josh >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed May 31 14:04:16 2023 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 31 May 2023 15:04:16 -0400 Subject: [petsc-users] Multiple points constraint in parallel In-Reply-To: References: Message-ID: On Wed, May 31, 2023 at 2:43?PM YuSh Lo wrote: > Matthew Knepley ? 2023?5?31? ?? ??1:02??? > >> On Wed, May 31, 2023 at 1:53?PM YuSh Lo wrote: >> >>> Hi Matthew, >>> >>> Matthew Knepley ? 2023?5?31? ?? ??5:08??? 
>>> >>>> On Wed, May 31, 2023 at 1:25?AM YuSh Lo wrote: >>>> >>>>> Hi, >>>>> >>>>> I have some multiple points constraint input as follows, >>>>> >>>>> A_1 a_4 >>>>> B_2 b_5 >>>>> C_3 c_6 >>>>> >>>>> each columns are stored in different IS. >>>>> >>>> >>>> So one IS lists the capital letter and one lists the lowercase? >>>> >>> Yes. >>> >>>> >>>> >>>>> After dmplex distribute, they will be renumbered and distribution to >>>>> certain processors. >>>>> >>>> >>>> Plex does normally renumber and send ISes, so you are doing this >>>> yourself? >>>> >>> No, Plex does the distribution. >>> >> >> I do not understand this. How does the DM know about this IS? Are you >> calling DMDistributeFieldIS()? >> > I first create a serial DM and use DMLabel to mark those nodes, and > store them in IS. Then I call DMPlexDistribute. > Right, so DM distributes the Label, not the IS. Ah, so the answer to "will the ISes you get from a label remain in the same order" is no since points may be renumbered. I think you want SF here. Thanks, Matt > >> >>> >>>> >>>>> I have two questions: >>>>> >>>>> (1) I need both complete ISs are all the processors. Can I just do >>>>> ISALLGather()? >>>>> >>>> >>>> I think so. I cannot tell what you want to do. >>>> >>>> >>>>> (2) Although renumbered, will the original order remain(ABC and abc)? >>>>> If the number is the node number, after distribution and I do an >>>>> ISALLGather() >>>>> will I have the following on each processor? >>>>> >>>>> A_3 a_1 >>>>> B_4 b_2 >>>>> C_6 c_5 >>>>> (I randomly renumber them) >>>>> >>>>> >>>>> This is what I can come up with now. Is there any better way to do it? >>>>> >>>> >>>> Can you tell me what you want to use this for? Maybe there is an easier >>>> way. For example, >>>> if we want to impose a constraint on a mesh point, usually I mark that >>>> point with a DMLabel. >>>> These are propagated during distribution so you do not have to think >>>> about it. >>>> >>> >>> So I have to impose constraint between two nodes. The upper case >>> node is controlled by the lower case node, and there can be many pairs of >>> constraint. When looping over elements if one node is controlled by the >>> other then the corresponding entry has to be added to a different location >>> e.x. [an index][A_3] to [an index][a_1]. I know how to use DMLabel, but I >>> have used it on one node at a time only. I used DMLabel to mark those nodes >>> with BCs. Now I must know the info of two nodes at the same time. The >>> operations only have to done in assembling and calculating the element >>> stiffness matrix of the elements that contain the upper case node, but I >>> must know the info of the corresponding lower case node so I know where to >>> assemble the entry. >>> >> >> Oh, if you want to associate two mesh points, use a PetscSF. These can >> also be remapped. >> >> Thanks, >> >> Matt >> >> >>> Thanks, >>> Josh >>> >>> >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Thanks, >>>>> Josh >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. 
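For concreteness, a minimal Fortran sketch (untested) of the label-based bookkeeping discussed in this thread: the pair number is stored as a label value on both points of each constraint pair, so the association survives DMPlexDistribute even though the points are renumbered. The label names "mpc_controlled"/"mpc_controlling", the arrays cpoint/lpoint, and npairs are invented for illustration; the PetscSF route recommended above remains the more general mechanism when the partner point ends up on another rank.

#include <petsc/finclude/petscdmplex.h>
      subroutine MarkConstraintPairs(dm, npairs, cpoint, lpoint, dmDist)
      use petscdmplex
      implicit none
      DM              :: dm, dmDist
      PetscInt        :: npairs
      PetscInt        :: cpoint(npairs), lpoint(npairs)  ! serial point numbers of each pair
      PetscSF         :: migrationSF
      PetscInt        :: ipair, overlap
      PetscErrorCode  :: ierr

      ! Mark both members of every pair with the pair index as the label value.
      PetscCall(DMCreateLabel(dm, "mpc_controlled", ierr))
      PetscCall(DMCreateLabel(dm, "mpc_controlling", ierr))
      do ipair = 1, npairs
         PetscCall(DMSetLabelValue(dm, "mpc_controlled",  cpoint(ipair), ipair, ierr))
         PetscCall(DMSetLabelValue(dm, "mpc_controlling", lpoint(ipair), ipair, ierr))
      end do

      ! Labels travel with the mesh during distribution; the migration SF is kept in
      ! case other per-point data must be pushed forward the same way.
      overlap = 0
      PetscCall(DMPlexDistribute(dm, overlap, migrationSF, dmDist, ierr))

      ! Later, on the distributed mesh, the renumbered point of pair ipair (if owned
      ! by this rank) can be recovered with, e.g.,
      !    PetscCall(DMGetStratumIS(dmDist, "mpc_controlled", ipair, isC, ierr))
      ! An empty result means the point lives on another rank, which is exactly the
      ! case the PetscSF suggestion above is meant to handle.
      end subroutine MarkConstraintPairs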
>> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From kenneth.c.hall at duke.edu Wed May 31 14:21:11 2023 From: kenneth.c.hall at duke.edu (Kenneth C Hall) Date: Wed, 31 May 2023 19:21:11 +0000 Subject: [petsc-users] Using SNESSHELL as a wrapper for a CFD solver. Message-ID: Hi, I am doing a number of problems using PETSc/SLEPc, but I also work on some non-PETSc/SLEPc flow solvers. I would like to use PETSc as a wrapper for this non-PETSc flow solver for compatibility, so I can use the tolerance monitoring, options, viewers, and for direct comparison to PETSc methods I am using. Here is what I am trying to do? I have a CFD solver that iterates with a nonlinear iterator of the form x := N(x). This can be expressed in a fortran routine of the form, SUBROUTINE MySolver(x) or SUBROUTINE MySolver(x,y) In the first case, x is over written. In the second, y = N(x). In any event, I want to do something like what is shown in the subroutine at the bottom of this email. The code below ?works? in the sense that MySolver is called, but it is called exactly *once*. But MyMonitor and MyConverged are *not* called. Again, I want to iterate so MySolver should be called many times, as should MyMonitor and MyConverged. The SNESView before and after SNESSolve looks like this: SNES Object: 1 MPI process type: shell SNES has not been set up so information may be incomplete maximum iterations=50, maximum function evaluations=10000 tolerances: relative=1e-50, absolute=1e-10, solution=1e+06 total number of function evaluations=0 norm schedule ALWAYS SNES Object: 1 MPI process type: shell maximum iterations=50, maximum function evaluations=10000 tolerances: relative=1e-50, absolute=1e-10, solution=1e+06 total number of function evaluations=0 norm schedule ALWAYS Any suggestions on how to do what I am trying to accomplish? Thanks. Kenneth Hall #include #include "macros.h" MODULE SolveWithSNESShell_module USE MyPetscModule CONTAINS ! !==================================================================================================== SUBROUTINE MySolver(snes, x, ierr) !==================================================================================================== !! !! !==================================================================================================== ! USE MyPetscModule IMPLICIT NONE ! !.... declared passed variables SNES :: snes Vec :: x PetscErrorCode :: ierr ! !.... code to find residual x := N(x) !.... (or alternatively y := N(x)) END SUBROUTINE MySolver ! !==================================================================================================== SUBROUTINE MyMonitor(snes, its, rnorm, ierr) !==================================================================================================== !! !! !==================================================================================================== ! USE MyPetscModule IMPLICIT NONE ! !.... Declare passed variables SNES :: snes PetscInt :: its PetscReal :: rnorm PetscErrorCode :: ierr ! !.... Code to print out convergence history !.... 
Code to print out convergence history END SUBROUTINE MyMonitor !==================================================================================================== SUBROUTINE MyConverged(snes, it, xnorm, ynorm, znorm, reason, ierr) USE MyPetscModule IMPLICIT NONE SNES :: snes PetscInt :: it,ctx PetscReal :: xnorm, ynorm, znorm KSPConvergedReason :: reason PetscErrorCode :: ierr ! ... add convergence test here ... ! set reason to a positive value if convergence has been achieved END SUBROUTINE MyConverged END MODULE SolveWithSNESShell_module ! !==================================================================================================== SUBROUTINE SolveWithSNESShell !==================================================================================================== !! !! !==================================================================================================== ! USE SolveWithSNESShell_module IMPLICIT NONE ! !.... Declare passed variables INTEGER :: level_tmp ! !.... Declare local variables INTEGER :: iz INTEGER :: imax INTEGER :: jmax INTEGER :: kmax SNES :: snes KSP :: ksp Vec :: x Vec :: y PetscViewer :: viewer PetscErrorCode :: ierr PetscReal :: rtol = 1.0D-10 !! relative tolerance PetscReal :: atol = 1.0D-50 !! absolute tolerance PetscReal :: dtol = 1.0D+06 !! divergence tolerance PetscInt :: maxits = 50 PetscInt :: maxf = 10000 character(len=1000):: args ! !.... count the number of degrees of freedom. level = level_tmp n = 0 DO iz = 1, hb(level)%nzone imax = hb(level)%zone(iz)%imax - 1 jmax = hb(level)%zone(iz)%jmax - 1 kmax = hb(level)%zone(iz)%kmax - 1 n = n + imax * jmax * kmax END DO n = n * neqn ! !.... Initialize PETSc PetscCall(PetscInitialize(PETSC_NULL_CHARACTER, ierr)) ! !.... Log PetscCall(PetscLogDefaultBegin(ierr)) ! !.... Hard-wired options. ! PetscCall(PetscOptionsInsertString(PETSC_NULL_OPTIONS, "command line style option here" , ierr)) ! !.... Command line options. call GET_COMMAND(args) PetscCall(PetscOptionsInsertString(PETSC_NULL_OPTIONS, args, ierr)) ! !.... view command line table PetscCall(PetscViewerASCIIOpen(PETSC_COMM_SELF, PETSC_VIEWER_STDOUT_SELF, viewer, ierr)) PetscCall(PetscOptionsView(PETSC_NULL_OPTIONS, viewer, ierr)) PetscCall(PetscViewerDestroy(viewer, ierr)) ! !.... Create PETSc vectors PetscCall(VecCreateSeq(PETSC_COMM_SELF, n, x, ierr)) PetscCall(VecCreateSeq(PETSC_COMM_SELF, n, y, ierr)) PetscCall(VecSet(x, 0.0d0, ierr)) PetscCall(VecSet(y, 0.0d0, ierr)) !.... SNES context PetscCall(SNESCreate(PETSC_COMM_SELF, snes, ierr)) PetscCall(SNESSetType(snes, SNESSHELL, ierr)) PetscCall(SNESShellSetSolve(snes, MySolver, ierr)) !!!! PetscCall(SNESSetFunction(snes, x, MySolver, PETSC_NULL_INTEGER, ierr)) !!!! this line causes a segmentation error if uncommented. PetscCall(SNESSetConvergenceTest(snes, MyConverged, 0, PETSC_NULL_FUNCTION, ierr)) ! !.... Set SNES options PetscCall(SNESSetFromOptions(snes, ierr)) !.... Set tolerances PetscCall(SNESSetTolerances(snes, rtol, atol, dtol, maxits, maxf, ierr)) ! !.... SNES montior PetscCall(SNESMonitorSet(snes, MyMonitor, PETSC_NULL_INTEGER, PETSC_NULL_FUNCTION,ierr)) ! !.... Set the initial solution CALL HBToVecX(x) ! !.... View snes context PetscCall(SNESView(snes, viewer, ierr)) ! !.... Solve SNES problem PetscCall(SNESSolve(snes, PETSC_NULL_VEC, x, ierr)) ! !.... View snes context PetscCall(SNESView(snes, viewer, ierr)) ! !.... dump the logs ! call PetscLogDump(ierr) ! Why does this cause error ! !.... 
Destroy PETSc objects PetscCall(SNESDestroy(snes, ierr)) PetscCall(VecDestroy(x, ierr)) PetscCall(VecDestroy(y, ierr)) PetscCall(PetscViewerDestroy(viewer, ierr)) ! !.... Finish PetscCall(PetscFinalize(ierr)) END SUBROUTINE SolveWithSNESShell -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed May 31 14:48:00 2023 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 31 May 2023 15:48:00 -0400 Subject: [petsc-users] Using SNESSHELL as a wrapper for a CFD solver. In-Reply-To: References: Message-ID: On Wed, May 31, 2023 at 3:21?PM Kenneth C Hall wrote: > Hi, > > > > I am doing a number of problems using PETSc/SLEPc, but I also work on some > non-PETSc/SLEPc flow solvers. I would like to use PETSc as a wrapper for > this non-PETSc flow solver for compatibility, so I can use the tolerance > monitoring, options, viewers, and for direct comparison to PETSc methods I > am using. > > > > Here is what I am trying to do? I have a CFD solver that iterates with a > nonlinear iterator of the form x := N(x). This can be expressed in a > fortran routine of the form, > > > > SUBROUTINE MySolver(x) > > or > > SUBROUTINE MySolver(x,y) > > > > In the first case, x is over written. In the second, y = N(x). In any > event, I want to do something like what is shown in the subroutine at the > bottom of this email. > > > > The code below ?works? in the sense that MySolver is called, but it is > called exactly **once**. But MyMonitor and MyConverged are **not** > called. Again, I want to iterate so MySolver should be called many times, > as should MyMonitor and MyConverged. > The SNESolve() method is called once per nonlinear solve, just as KSPSolve() is called once per linear solve. There may be iteration inside the method, but that is handled inside the particular implementation. For example, both Newton's method and Nonlinear Conjugate Gradient iterate, but the iteration is internal to both, and they both call the monitor and convergence test at each internal iterate. So, if your nonlinear solver should iterate, it should happen inside the SNESSolve call for the SNESSHELL object. Does this make sense? What does your solver do? Thanks, Matt > The SNESView before and after SNESSolve looks like this: > > > > SNES Object: 1 MPI process > > type: shell > > SNES has not been set up so information may be incomplete > > maximum iterations=50, maximum function evaluations=10000 > > tolerances: relative=1e-50, absolute=1e-10, solution=1e+06 > > total number of function evaluations=0 > > norm schedule ALWAYS > > > > SNES Object: 1 MPI process > > type: shell > > maximum iterations=50, maximum function evaluations=10000 > > tolerances: relative=1e-50, absolute=1e-10, solution=1e+06 > > total number of function evaluations=0 > > norm schedule ALWAYS > > > > Any suggestions on how to do what I am trying to accomplish? > > > > Thanks. > > Kenneth Hall > > > > > > > > *#include * > > *#include "macros.h"* > > > > *MODULE* SolveWithSNESShell_module > > *USE* MyPetscModule > > *CONTAINS* > > ! > > > !==================================================================================================== > > *SUBROUTINE* MySolver(snes, x, ierr) > > > !==================================================================================================== > > !! > > !! > > > !==================================================================================================== > > ! > > *USE* MyPetscModule > > *IMPLICIT* *NONE* > > ! > > !.... 
declared passed variables > > SNES :: snes > > Vec :: x > > PetscErrorCode :: ierr > > ! > > !.... code to find residual x := N(x) > > !.... (or alternatively y := N(x)) > > > > *END* *SUBROUTINE* MySolver > > ! > > > !==================================================================================================== > > *SUBROUTINE* MyMonitor(snes, its, rnorm, ierr) > > > !==================================================================================================== > > !! > > !! > > > !==================================================================================================== > > ! > > *USE* MyPetscModule > > *IMPLICIT* *NONE* > > ! > > !.... Declare passed variables > > SNES :: snes > > PetscInt :: its > > PetscReal :: rnorm > > PetscErrorCode :: ierr > > ! > > !.... Code to print out convergence history > > !.... Code to print out convergence history > > > > *END* *SUBROUTINE* MyMonitor > > > > > !==================================================================================================== > > *SUBROUTINE* MyConverged(snes, it, xnorm, ynorm, znorm, reason, ierr) > > *USE* MyPetscModule > > *IMPLICIT* *NONE* > > > > SNES :: snes > > PetscInt :: it,ctx > > PetscReal :: xnorm, ynorm, znorm > > KSPConvergedReason :: reason > > PetscErrorCode :: ierr > > > > ! ... add convergence test here ... > > ! set reason to a positive value if convergence has been achieved > > > > > > *END* *SUBROUTINE* MyConverged > > *END* *MODULE* SolveWithSNESShell_module > > ! > > > !==================================================================================================== > > *SUBROUTINE* SolveWithSNESShell > > > !==================================================================================================== > > !! > > !! > > > !==================================================================================================== > > ! > > *USE* SolveWithSNESShell_module > > *IMPLICIT* *NONE* > > ! > > !.... Declare passed variables > > *INTEGER* :: level_tmp > > ! > > !.... Declare local variables > > *INTEGER* :: iz > > *INTEGER* :: imax > > *INTEGER* :: jmax > > *INTEGER* :: kmax > > SNES :: snes > > KSP :: ksp > > Vec :: x > > Vec :: y > > PetscViewer :: viewer > > PetscErrorCode :: ierr > > PetscReal :: rtol = 1.0D-10 !! relative tolerance > > PetscReal :: atol = 1.0D-50 !! absolute tolerance > > PetscReal :: dtol = 1.0D+06 !! divergence tolerance > > PetscInt :: maxits = 50 > > PetscInt :: maxf = 10000 > > *character*(*len*=1000):: args > > ! > > !.... count the number of degrees of freedom. > > level = level_tmp > > n = 0 > > *DO* iz = 1, hb(level)%nzone > > imax = hb(level)%zone(iz)%imax - 1 > > jmax = hb(level)%zone(iz)%jmax - 1 > > kmax = hb(level)%zone(iz)%kmax - 1 > > n = n + imax * jmax * kmax > > *END* *DO* > > n = n * neqn > > ! > > !.... Initialize PETSc > > PetscCall(PetscInitialize(PETSC_NULL_CHARACTER, ierr)) > > ! > > !.... Log > > PetscCall(PetscLogDefaultBegin(ierr)) > > ! > > !.... Hard-wired options. > > ! PetscCall(PetscOptionsInsertString(PETSC_NULL_OPTIONS, "command line > style option here" , ierr)) > > ! > > !.... Command line options. > > *call* *GET_COMMAND*(args) > > PetscCall(PetscOptionsInsertString(PETSC_NULL_OPTIONS, args, ierr)) > > ! > > !.... view command line table > > PetscCall(PetscViewerASCIIOpen(PETSC_COMM_SELF, > PETSC_VIEWER_STDOUT_SELF, viewer, ierr)) > > PetscCall(PetscOptionsView(PETSC_NULL_OPTIONS, viewer, ierr)) > > PetscCall(PetscViewerDestroy(viewer, ierr)) > > ! > > !.... 
Create PETSc vectors > > PetscCall(VecCreateSeq(PETSC_COMM_SELF, n, x, ierr)) > > PetscCall(VecCreateSeq(PETSC_COMM_SELF, n, y, ierr)) > > PetscCall(VecSet(x, 0.0d0, ierr)) > > PetscCall(VecSet(y, 0.0d0, ierr)) > > > > !.... SNES context > > PetscCall(SNESCreate(PETSC_COMM_SELF, snes, ierr)) > > PetscCall(SNESSetType(snes, SNESSHELL, ierr)) > > PetscCall(SNESShellSetSolve(snes, MySolver, ierr)) > > > > !!!! PetscCall(SNESSetFunction(snes, x, MySolver, PETSC_NULL_INTEGER, > ierr)) > > !!!! this line causes a segmentation error if uncommented. > > > > PetscCall(SNESSetConvergenceTest(snes, MyConverged, 0, > PETSC_NULL_FUNCTION, ierr)) > > ! > > !.... Set SNES options > > PetscCall(SNESSetFromOptions(snes, ierr)) > > > > !.... Set tolerances > > PetscCall(SNESSetTolerances(snes, rtol, atol, dtol, maxits, maxf, > ierr)) > > ! > > !.... SNES montior > > PetscCall(SNESMonitorSet(snes, MyMonitor, PETSC_NULL_INTEGER, > PETSC_NULL_FUNCTION,ierr)) > > ! > > !.... Set the initial solution > > *CALL* HBToVecX(x) > > ! > > !.... View snes context > > PetscCall(SNESView(snes, viewer, ierr)) > > ! > > !.... Solve SNES problem > > PetscCall(SNESSolve(snes, PETSC_NULL_VEC, x, ierr)) > > ! > > !.... View snes context > > PetscCall(SNESView(snes, viewer, ierr)) > > ! > > !.... dump the logs > > ! call PetscLogDump(ierr) ! Why does this cause error > > ! > > !.... Destroy PETSc objects > > PetscCall(SNESDestroy(snes, ierr)) > > PetscCall(VecDestroy(x, ierr)) > > PetscCall(VecDestroy(y, ierr)) > > PetscCall(PetscViewerDestroy(viewer, ierr)) > > ! > > !.... Finish > > PetscCall(PetscFinalize(ierr)) > > > > *END* *SUBROUTINE* SolveWithSNESShell > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From kenneth.c.hall at duke.edu Wed May 31 15:16:00 2023 From: kenneth.c.hall at duke.edu (Kenneth C Hall) Date: Wed, 31 May 2023 20:16:00 +0000 Subject: [petsc-users] Using SNESSHELL as a wrapper for a CFD solver. In-Reply-To: References: Message-ID: Matt, Thanks for your quick reply. I think what you say makes sense. You asked what my code does. The MySolver program performs one iteration of a CFD iteration. The CFD scheme is an explicit scheme that uses multigrid, Mach number preconditioning, and residual smoothing. Typically, I have to call MySolver on the order of 40 to 100 times to get acceptable convergence. And in fact, I have another version of this Petsc code that uses SNESNGMRES to solve the problem with MySolver providing the residuals as R = N(x) - x. But I would like a version where I am using just MySolver, without any other operations applied to it. So I am trying to plug MySolver into the PETSc system to provide monitoring and other features, and for consistency and comparison to these other (more appropriate!) uses of PETSc. Thanks. Kenneth From: Matthew Knepley Date: Wednesday, May 31, 2023 at 3:48 PM To: Kenneth C Hall Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Using SNESSHELL as a wrapper for a CFD solver. On Wed, May 31, 2023 at 3:21?PM Kenneth C Hall > wrote: Hi, I am doing a number of problems using PETSc/SLEPc, but I also work on some non-PETSc/SLEPc flow solvers. 
I would like to use PETSc as a wrapper for this non-PETSc flow solver for compatibility, so I can use the tolerance monitoring, options, viewers, and for direct comparison to PETSc methods I am using. Here is what I am trying to do? I have a CFD solver that iterates with a nonlinear iterator of the form x := N(x). This can be expressed in a fortran routine of the form, SUBROUTINE MySolver(x) or SUBROUTINE MySolver(x,y) In the first case, x is over written. In the second, y = N(x). In any event, I want to do something like what is shown in the subroutine at the bottom of this email. The code below ?works? in the sense that MySolver is called, but it is called exactly *once*. But MyMonitor and MyConverged are *not* called. Again, I want to iterate so MySolver should be called many times, as should MyMonitor and MyConverged. The SNESolve() method is called once per nonlinear solve, just as KSPSolve() is called once per linear solve. There may be iteration inside the method, but that is handled inside the particular implementation. For example, both Newton's method and Nonlinear Conjugate Gradient iterate, but the iteration is internal to both, and they both call the monitor and convergence test at each internal iterate. So, if your nonlinear solver should iterate, it should happen inside the SNESSolve call for the SNESSHELL object. Does this make sense? What does your solver do? Thanks, Matt The SNESView before and after SNESSolve looks like this: SNES Object: 1 MPI process type: shell SNES has not been set up so information may be incomplete maximum iterations=50, maximum function evaluations=10000 tolerances: relative=1e-50, absolute=1e-10, solution=1e+06 total number of function evaluations=0 norm schedule ALWAYS SNES Object: 1 MPI process type: shell maximum iterations=50, maximum function evaluations=10000 tolerances: relative=1e-50, absolute=1e-10, solution=1e+06 total number of function evaluations=0 norm schedule ALWAYS Any suggestions on how to do what I am trying to accomplish? Thanks. Kenneth Hall #include #include "macros.h" MODULE SolveWithSNESShell_module USE MyPetscModule CONTAINS ! !==================================================================================================== SUBROUTINE MySolver(snes, x, ierr) !==================================================================================================== !! !! !==================================================================================================== ! USE MyPetscModule IMPLICIT NONE ! !.... declared passed variables SNES :: snes Vec :: x PetscErrorCode :: ierr ! !.... code to find residual x := N(x) !.... (or alternatively y := N(x)) END SUBROUTINE MySolver ! !==================================================================================================== SUBROUTINE MyMonitor(snes, its, rnorm, ierr) !==================================================================================================== !! !! !==================================================================================================== ! USE MyPetscModule IMPLICIT NONE ! !.... Declare passed variables SNES :: snes PetscInt :: its PetscReal :: rnorm PetscErrorCode :: ierr ! !.... Code to print out convergence history !.... 
Code to print out convergence history END SUBROUTINE MyMonitor !==================================================================================================== SUBROUTINE MyConverged(snes, it, xnorm, ynorm, znorm, reason, ierr) USE MyPetscModule IMPLICIT NONE SNES :: snes PetscInt :: it,ctx PetscReal :: xnorm, ynorm, znorm KSPConvergedReason :: reason PetscErrorCode :: ierr ! ... add convergence test here ... ! set reason to a positive value if convergence has been achieved END SUBROUTINE MyConverged END MODULE SolveWithSNESShell_module ! !==================================================================================================== SUBROUTINE SolveWithSNESShell !==================================================================================================== !! !! !==================================================================================================== ! USE SolveWithSNESShell_module IMPLICIT NONE ! !.... Declare passed variables INTEGER :: level_tmp ! !.... Declare local variables INTEGER :: iz INTEGER :: imax INTEGER :: jmax INTEGER :: kmax SNES :: snes KSP :: ksp Vec :: x Vec :: y PetscViewer :: viewer PetscErrorCode :: ierr PetscReal :: rtol = 1.0D-10 !! relative tolerance PetscReal :: atol = 1.0D-50 !! absolute tolerance PetscReal :: dtol = 1.0D+06 !! divergence tolerance PetscInt :: maxits = 50 PetscInt :: maxf = 10000 character(len=1000):: args ! !.... count the number of degrees of freedom. level = level_tmp n = 0 DO iz = 1, hb(level)%nzone imax = hb(level)%zone(iz)%imax - 1 jmax = hb(level)%zone(iz)%jmax - 1 kmax = hb(level)%zone(iz)%kmax - 1 n = n + imax * jmax * kmax END DO n = n * neqn ! !.... Initialize PETSc PetscCall(PetscInitialize(PETSC_NULL_CHARACTER, ierr)) ! !.... Log PetscCall(PetscLogDefaultBegin(ierr)) ! !.... Hard-wired options. ! PetscCall(PetscOptionsInsertString(PETSC_NULL_OPTIONS, "command line style option here" , ierr)) ! !.... Command line options. call GET_COMMAND(args) PetscCall(PetscOptionsInsertString(PETSC_NULL_OPTIONS, args, ierr)) ! !.... view command line table PetscCall(PetscViewerASCIIOpen(PETSC_COMM_SELF, PETSC_VIEWER_STDOUT_SELF, viewer, ierr)) PetscCall(PetscOptionsView(PETSC_NULL_OPTIONS, viewer, ierr)) PetscCall(PetscViewerDestroy(viewer, ierr)) ! !.... Create PETSc vectors PetscCall(VecCreateSeq(PETSC_COMM_SELF, n, x, ierr)) PetscCall(VecCreateSeq(PETSC_COMM_SELF, n, y, ierr)) PetscCall(VecSet(x, 0.0d0, ierr)) PetscCall(VecSet(y, 0.0d0, ierr)) !.... SNES context PetscCall(SNESCreate(PETSC_COMM_SELF, snes, ierr)) PetscCall(SNESSetType(snes, SNESSHELL, ierr)) PetscCall(SNESShellSetSolve(snes, MySolver, ierr)) !!!! PetscCall(SNESSetFunction(snes, x, MySolver, PETSC_NULL_INTEGER, ierr)) !!!! this line causes a segmentation error if uncommented. PetscCall(SNESSetConvergenceTest(snes, MyConverged, 0, PETSC_NULL_FUNCTION, ierr)) ! !.... Set SNES options PetscCall(SNESSetFromOptions(snes, ierr)) !.... Set tolerances PetscCall(SNESSetTolerances(snes, rtol, atol, dtol, maxits, maxf, ierr)) ! !.... SNES montior PetscCall(SNESMonitorSet(snes, MyMonitor, PETSC_NULL_INTEGER, PETSC_NULL_FUNCTION,ierr)) ! !.... Set the initial solution CALL HBToVecX(x) ! !.... View snes context PetscCall(SNESView(snes, viewer, ierr)) ! !.... Solve SNES problem PetscCall(SNESSolve(snes, PETSC_NULL_VEC, x, ierr)) ! !.... View snes context PetscCall(SNESView(snes, viewer, ierr)) ! !.... dump the logs ! call PetscLogDump(ierr) ! Why does this cause error ! !.... 
Destroy PETSc objects PetscCall(SNESDestroy(snes, ierr)) PetscCall(VecDestroy(x, ierr)) PetscCall(VecDestroy(y, ierr)) PetscCall(PetscViewerDestroy(viewer, ierr)) ! !.... Finish PetscCall(PetscFinalize(ierr)) END SUBROUTINE SolveWithSNESShell -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed May 31 15:56:35 2023 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 31 May 2023 16:56:35 -0400 Subject: [petsc-users] Using SNESSHELL as a wrapper for a CFD solver. In-Reply-To: References: Message-ID: <750004D2-B119-4F10-9F0F-E619F803BF28@petsc.dev> > On May 31, 2023, at 4:16 PM, Kenneth C Hall wrote: > > Matt, > > Thanks for your quick reply. I think what you say makes sense. > > You asked what my code does. The MySolver program performs one iteration of a CFD iteration. The CFD scheme is an explicit scheme that uses multigrid, Mach number preconditioning, and residual smoothing. Typically, I have to call MySolver on the order of 40 to 100 times to get acceptable convergence. > > And in fact, I have another version of this Petsc code that uses SNESNGMRES to solve the problem with MySolver providing the residuals as R = N(x) - x. So you use SNESSetFunction( a function you wrote that computes R = N(x) - x )? Have you considered using the SNESSetNPC() mechanism? That is, use your solver as a nonlinear preconditioner for any of the PETSc solvers? In particular, you could use it with SNESNRICHARDSON which provides a lineseach, monitor, and convergence testing in a loop. I think this is exactly the loop structure you want, so no need for SNESSHELL. > But I would like a version where I am using just MySolver, without any other operations applied to it. So I am trying to plug MySolver into the PETSc system to provide monitoring and other features, and for consistency and comparison to these other (more appropriate!) uses of PETSc. > > Thanks. > Kenneth > > > From: Matthew Knepley > > Date: Wednesday, May 31, 2023 at 3:48 PM > To: Kenneth C Hall > > Cc: petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] Using SNESSHELL as a wrapper for a CFD solver. > > On Wed, May 31, 2023 at 3:21?PM Kenneth C Hall > wrote: > Hi, > > I am doing a number of problems using PETSc/SLEPc, but I also work on some non-PETSc/SLEPc flow solvers. I would like to use PETSc as a wrapper for this non-PETSc flow solver for compatibility, so I can use the tolerance monitoring, options, viewers, and for direct comparison to PETSc methods I am using. > > Here is what I am trying to do? I have a CFD solver that iterates with a nonlinear iterator of the form x := N(x). This can be expressed in a fortran routine of the form, > > SUBROUTINE MySolver(x) > or > SUBROUTINE MySolver(x,y) > > In the first case, x is over written. In the second, y = N(x). In any event, I want to do something like what is shown in the subroutine at the bottom of this email. > > The code below ?works? in the sense that MySolver is called, but it is called exactly *once*. But MyMonitor and MyConverged are *not* called. Again, I want to iterate so MySolver should be called many times, as should MyMonitor and MyConverged. > > The SNESolve() method is called once per nonlinear solve, just as KSPSolve() is called once per linear solve. 
There may be iteration inside the method, but that is handled inside the particular implementation. For example, both Newton's method and Nonlinear Conjugate Gradient iterate, but the iteration is internal to both, and they both call the monitor and convergence test at each internal iterate. > > So, if your nonlinear solver should iterate, it should happen inside the SNESSolve call for the SNESSHELL object. Does this make sense? What does your solver do? > > Thanks, > > Matt > > The SNESView before and after SNESSolve looks like this: > > SNES Object: 1 MPI process > type: shell > SNES has not been set up so information may be incomplete > maximum iterations=50, maximum function evaluations=10000 > tolerances: relative=1e-50, absolute=1e-10, solution=1e+06 > total number of function evaluations=0 > norm schedule ALWAYS > > SNES Object: 1 MPI process > type: shell > maximum iterations=50, maximum function evaluations=10000 > tolerances: relative=1e-50, absolute=1e-10, solution=1e+06 > total number of function evaluations=0 > norm schedule ALWAYS > > Any suggestions on how to do what I am trying to accomplish? > > Thanks. > Kenneth Hall > > > > #include > #include "macros.h" > > MODULE SolveWithSNESShell_module > USE MyPetscModule > CONTAINS > ! > !==================================================================================================== > SUBROUTINE MySolver(snes, x, ierr) > !==================================================================================================== > !! > !! > !==================================================================================================== > ! > USE MyPetscModule > IMPLICIT NONE > ! > !.... declared passed variables > SNES :: snes > Vec :: x > PetscErrorCode :: ierr > ! > !.... code to find residual x := N(x) > !.... (or alternatively y := N(x)) > > END SUBROUTINE MySolver > ! > !==================================================================================================== > SUBROUTINE MyMonitor(snes, its, rnorm, ierr) > !==================================================================================================== > !! > !! > !==================================================================================================== > ! > USE MyPetscModule > IMPLICIT NONE > ! > !.... Declare passed variables > SNES :: snes > PetscInt :: its > PetscReal :: rnorm > PetscErrorCode :: ierr > ! > !.... Code to print out convergence history > !.... Code to print out convergence history > > END SUBROUTINE MyMonitor > > !==================================================================================================== > SUBROUTINE MyConverged(snes, it, xnorm, ynorm, znorm, reason, ierr) > USE MyPetscModule > IMPLICIT NONE > > SNES :: snes > PetscInt :: it,ctx > PetscReal :: xnorm, ynorm, znorm > KSPConvergedReason :: reason > PetscErrorCode :: ierr > > ! ... add convergence test here ... > ! set reason to a positive value if convergence has been achieved > > > END SUBROUTINE MyConverged > END MODULE SolveWithSNESShell_module > ! > !==================================================================================================== > SUBROUTINE SolveWithSNESShell > !==================================================================================================== > !! > !! > !==================================================================================================== > ! > USE SolveWithSNESShell_module > IMPLICIT NONE > ! > !.... Declare passed variables > INTEGER :: level_tmp > ! > !.... 
Declare local variables > INTEGER :: iz > INTEGER :: imax > INTEGER :: jmax > INTEGER :: kmax > SNES :: snes > KSP :: ksp > Vec :: x > Vec :: y > PetscViewer :: viewer > PetscErrorCode :: ierr > PetscReal :: rtol = 1.0D-10 !! relative tolerance > PetscReal :: atol = 1.0D-50 !! absolute tolerance > PetscReal :: dtol = 1.0D+06 !! divergence tolerance > PetscInt :: maxits = 50 > PetscInt :: maxf = 10000 > character(len=1000):: args > ! > !.... count the number of degrees of freedom. > level = level_tmp > n = 0 > DO iz = 1, hb(level)%nzone > imax = hb(level)%zone(iz)%imax - 1 > jmax = hb(level)%zone(iz)%jmax - 1 > kmax = hb(level)%zone(iz)%kmax - 1 > n = n + imax * jmax * kmax > END DO > n = n * neqn > ! > !.... Initialize PETSc > PetscCall(PetscInitialize(PETSC_NULL_CHARACTER, ierr)) > ! > !.... Log > PetscCall(PetscLogDefaultBegin(ierr)) > ! > !.... Hard-wired options. > ! PetscCall(PetscOptionsInsertString(PETSC_NULL_OPTIONS, "command line style option here" , ierr)) > ! > !.... Command line options. > call GET_COMMAND(args) > PetscCall(PetscOptionsInsertString(PETSC_NULL_OPTIONS, args, ierr)) > ! > !.... view command line table > PetscCall(PetscViewerASCIIOpen(PETSC_COMM_SELF, PETSC_VIEWER_STDOUT_SELF, viewer, ierr)) > PetscCall(PetscOptionsView(PETSC_NULL_OPTIONS, viewer, ierr)) > PetscCall(PetscViewerDestroy(viewer, ierr)) > ! > !.... Create PETSc vectors > PetscCall(VecCreateSeq(PETSC_COMM_SELF, n, x, ierr)) > PetscCall(VecCreateSeq(PETSC_COMM_SELF, n, y, ierr)) > PetscCall(VecSet(x, 0.0d0, ierr)) > PetscCall(VecSet(y, 0.0d0, ierr)) > > !.... SNES context > PetscCall(SNESCreate(PETSC_COMM_SELF, snes, ierr)) > PetscCall(SNESSetType(snes, SNESSHELL, ierr)) > PetscCall(SNESShellSetSolve(snes, MySolver, ierr)) > > !!!! PetscCall(SNESSetFunction(snes, x, MySolver, PETSC_NULL_INTEGER, ierr)) > !!!! this line causes a segmentation error if uncommented. > > PetscCall(SNESSetConvergenceTest(snes, MyConverged, 0, PETSC_NULL_FUNCTION, ierr)) > ! > !.... Set SNES options > PetscCall(SNESSetFromOptions(snes, ierr)) > > !.... Set tolerances > PetscCall(SNESSetTolerances(snes, rtol, atol, dtol, maxits, maxf, ierr)) > ! > !.... SNES montior > PetscCall(SNESMonitorSet(snes, MyMonitor, PETSC_NULL_INTEGER, PETSC_NULL_FUNCTION,ierr)) > ! > !.... Set the initial solution > CALL HBToVecX(x) > ! > !.... View snes context > PetscCall(SNESView(snes, viewer, ierr)) > ! > !.... Solve SNES problem > PetscCall(SNESSolve(snes, PETSC_NULL_VEC, x, ierr)) > ! > !.... View snes context > PetscCall(SNESView(snes, viewer, ierr)) > ! > !.... dump the logs > ! call PetscLogDump(ierr) ! Why does this cause error > ! > !.... Destroy PETSc objects > PetscCall(SNESDestroy(snes, ierr)) > PetscCall(VecDestroy(x, ierr)) > PetscCall(VecDestroy(y, ierr)) > PetscCall(PetscViewerDestroy(viewer, ierr)) > ! > !.... Finish > PetscCall(PetscFinalize(ierr)) > > END SUBROUTINE SolveWithSNESShell > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed May 31 16:50:04 2023 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 31 May 2023 17:50:04 -0400 Subject: [petsc-users] Using SNESSHELL as a wrapper for a CFD solver. In-Reply-To: References: Message-ID: <7AE73466-5DF0-4493-88EC-DBAEB4D60FA2@petsc.dev> Sorry, I wrote to quickly in my last email. 
You will need to create a SNESSHELL its solve simply calls your solver (for its one iteration) SNESNRICHARSON handles the rest. > On May 31, 2023, at 4:16 PM, Kenneth C Hall wrote: > > Matt, > > Thanks for your quick reply. I think what you say makes sense. > > You asked what my code does. The MySolver program performs one iteration of a CFD iteration. The CFD scheme is an explicit scheme that uses multigrid, Mach number preconditioning, and residual smoothing. Typically, I have to call MySolver on the order of 40 to 100 times to get acceptable convergence. > > And in fact, I have another version of this Petsc code that uses SNESNGMRES to solve the problem with MySolver providing the residuals as R = N(x) - x. But I would like a version where I am using just MySolver, without any other operations applied to it. So I am trying to plug MySolver into the PETSc system to provide monitoring and other features, and for consistency and comparison to these other (more appropriate!) uses of PETSc. > > Thanks. > Kenneth > > > From: Matthew Knepley > > Date: Wednesday, May 31, 2023 at 3:48 PM > To: Kenneth C Hall > > Cc: petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] Using SNESSHELL as a wrapper for a CFD solver. > > On Wed, May 31, 2023 at 3:21?PM Kenneth C Hall > wrote: > Hi, > > I am doing a number of problems using PETSc/SLEPc, but I also work on some non-PETSc/SLEPc flow solvers. I would like to use PETSc as a wrapper for this non-PETSc flow solver for compatibility, so I can use the tolerance monitoring, options, viewers, and for direct comparison to PETSc methods I am using. > > Here is what I am trying to do? I have a CFD solver that iterates with a nonlinear iterator of the form x := N(x). This can be expressed in a fortran routine of the form, > > SUBROUTINE MySolver(x) > or > SUBROUTINE MySolver(x,y) > > In the first case, x is over written. In the second, y = N(x). In any event, I want to do something like what is shown in the subroutine at the bottom of this email. > > The code below ?works? in the sense that MySolver is called, but it is called exactly *once*. But MyMonitor and MyConverged are *not* called. Again, I want to iterate so MySolver should be called many times, as should MyMonitor and MyConverged. > > The SNESolve() method is called once per nonlinear solve, just as KSPSolve() is called once per linear solve. There may be iteration inside the method, but that is handled inside the particular implementation. For example, both Newton's method and Nonlinear Conjugate Gradient iterate, but the iteration is internal to both, and they both call the monitor and convergence test at each internal iterate. > > So, if your nonlinear solver should iterate, it should happen inside the SNESSolve call for the SNESSHELL object. Does this make sense? What does your solver do? > > Thanks, > > Matt > > The SNESView before and after SNESSolve looks like this: > > SNES Object: 1 MPI process > type: shell > SNES has not been set up so information may be incomplete > maximum iterations=50, maximum function evaluations=10000 > tolerances: relative=1e-50, absolute=1e-10, solution=1e+06 > total number of function evaluations=0 > norm schedule ALWAYS > > SNES Object: 1 MPI process > type: shell > maximum iterations=50, maximum function evaluations=10000 > tolerances: relative=1e-50, absolute=1e-10, solution=1e+06 > total number of function evaluations=0 > norm schedule ALWAYS > > Any suggestions on how to do what I am trying to accomplish? > > Thanks. 
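For reference, a rough and untested sketch of the combination described here: an outer SNESNRICHARDSON that owns the iteration loop, line search, monitoring, and convergence testing, with a SNESSHELL nonlinear preconditioner whose "solve" applies one sweep of the CFD scheme. MyResidual (computing R = N(x) - x, as in the SNESNGMRES version mentioned earlier) and MyOneCFDIteration (one pass of x := N(x), with the same (snes, x, ierr) interface as MySolver) are placeholders assumed to live in MyPetscModule.

#include <petsc/finclude/petscsnes.h>
      subroutine SolveWithRichardsonNPC(x)
      use petscsnes
      use MyPetscModule    ! assumed to provide MyResidual and MyOneCFDIteration
      implicit none
      Vec            :: x
      SNES           :: snes, npc
      Vec            :: r
      PetscErrorCode :: ierr

      PetscCall(VecDuplicate(x, r, ierr))

      ! Outer solver: nonlinear Richardson supplies the loop, the line search,
      ! the monitors, and the convergence test.
      PetscCall(SNESCreate(PETSC_COMM_SELF, snes, ierr))
      PetscCall(SNESSetType(snes, SNESNRICHARDSON, ierr))
      PetscCall(SNESSetFunction(snes, r, MyResidual, PETSC_NULL_INTEGER, ierr))

      ! Inner solver: a shell whose solve applies one CFD iteration to x in place.
      PetscCall(SNESGetNPC(snes, npc, ierr))
      PetscCall(SNESSetType(npc, SNESSHELL, ierr))
      PetscCall(SNESShellSetSolve(npc, MyOneCFDIteration, ierr))

      PetscCall(SNESSetFromOptions(snes, ierr))
      PetscCall(SNESSolve(snes, PETSC_NULL_VEC, x, ierr))

      PetscCall(SNESDestroy(snes, ierr))
      PetscCall(VecDestroy(r, ierr))
      end subroutine SolveWithRichardsonNPC

      ! Inside MyOneCFDIteration(npc, x, ierr), after the sweep, something like
      !    PetscCall(SNESSetConvergedReason(npc, SNES_CONVERGED_ITS, ierr))
      ! marks the single inner sweep as complete.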
> Kenneth Hall > > > > #include > #include "macros.h" > > MODULE SolveWithSNESShell_module > USE MyPetscModule > CONTAINS > ! > !==================================================================================================== > SUBROUTINE MySolver(snes, x, ierr) > !==================================================================================================== > !! > !! > !==================================================================================================== > ! > USE MyPetscModule > IMPLICIT NONE > ! > !.... declared passed variables > SNES :: snes > Vec :: x > PetscErrorCode :: ierr > ! > !.... code to find residual x := N(x) > !.... (or alternatively y := N(x)) > > END SUBROUTINE MySolver > ! > !==================================================================================================== > SUBROUTINE MyMonitor(snes, its, rnorm, ierr) > !==================================================================================================== > !! > !! > !==================================================================================================== > ! > USE MyPetscModule > IMPLICIT NONE > ! > !.... Declare passed variables > SNES :: snes > PetscInt :: its > PetscReal :: rnorm > PetscErrorCode :: ierr > ! > !.... Code to print out convergence history > !.... Code to print out convergence history > > END SUBROUTINE MyMonitor > > !==================================================================================================== > SUBROUTINE MyConverged(snes, it, xnorm, ynorm, znorm, reason, ierr) > USE MyPetscModule > IMPLICIT NONE > > SNES :: snes > PetscInt :: it,ctx > PetscReal :: xnorm, ynorm, znorm > KSPConvergedReason :: reason > PetscErrorCode :: ierr > > ! ... add convergence test here ... > ! set reason to a positive value if convergence has been achieved > > > END SUBROUTINE MyConverged > END MODULE SolveWithSNESShell_module > ! > !==================================================================================================== > SUBROUTINE SolveWithSNESShell > !==================================================================================================== > !! > !! > !==================================================================================================== > ! > USE SolveWithSNESShell_module > IMPLICIT NONE > ! > !.... Declare passed variables > INTEGER :: level_tmp > ! > !.... Declare local variables > INTEGER :: iz > INTEGER :: imax > INTEGER :: jmax > INTEGER :: kmax > SNES :: snes > KSP :: ksp > Vec :: x > Vec :: y > PetscViewer :: viewer > PetscErrorCode :: ierr > PetscReal :: rtol = 1.0D-10 !! relative tolerance > PetscReal :: atol = 1.0D-50 !! absolute tolerance > PetscReal :: dtol = 1.0D+06 !! divergence tolerance > PetscInt :: maxits = 50 > PetscInt :: maxf = 10000 > character(len=1000):: args > ! > !.... count the number of degrees of freedom. > level = level_tmp > n = 0 > DO iz = 1, hb(level)%nzone > imax = hb(level)%zone(iz)%imax - 1 > jmax = hb(level)%zone(iz)%jmax - 1 > kmax = hb(level)%zone(iz)%kmax - 1 > n = n + imax * jmax * kmax > END DO > n = n * neqn > ! > !.... Initialize PETSc > PetscCall(PetscInitialize(PETSC_NULL_CHARACTER, ierr)) > ! > !.... Log > PetscCall(PetscLogDefaultBegin(ierr)) > ! > !.... Hard-wired options. > ! PetscCall(PetscOptionsInsertString(PETSC_NULL_OPTIONS, "command line style option here" , ierr)) > ! > !.... Command line options. > call GET_COMMAND(args) > PetscCall(PetscOptionsInsertString(PETSC_NULL_OPTIONS, args, ierr)) > ! > !.... 
view command line table > PetscCall(PetscViewerASCIIOpen(PETSC_COMM_SELF, PETSC_VIEWER_STDOUT_SELF, viewer, ierr)) > PetscCall(PetscOptionsView(PETSC_NULL_OPTIONS, viewer, ierr)) > PetscCall(PetscViewerDestroy(viewer, ierr)) > ! > !.... Create PETSc vectors > PetscCall(VecCreateSeq(PETSC_COMM_SELF, n, x, ierr)) > PetscCall(VecCreateSeq(PETSC_COMM_SELF, n, y, ierr)) > PetscCall(VecSet(x, 0.0d0, ierr)) > PetscCall(VecSet(y, 0.0d0, ierr)) > > !.... SNES context > PetscCall(SNESCreate(PETSC_COMM_SELF, snes, ierr)) > PetscCall(SNESSetType(snes, SNESSHELL, ierr)) > PetscCall(SNESShellSetSolve(snes, MySolver, ierr)) > > !!!! PetscCall(SNESSetFunction(snes, x, MySolver, PETSC_NULL_INTEGER, ierr)) > !!!! this line causes a segmentation error if uncommented. > > PetscCall(SNESSetConvergenceTest(snes, MyConverged, 0, PETSC_NULL_FUNCTION, ierr)) > ! > !.... Set SNES options > PetscCall(SNESSetFromOptions(snes, ierr)) > > !.... Set tolerances > PetscCall(SNESSetTolerances(snes, rtol, atol, dtol, maxits, maxf, ierr)) > ! > !.... SNES montior > PetscCall(SNESMonitorSet(snes, MyMonitor, PETSC_NULL_INTEGER, PETSC_NULL_FUNCTION,ierr)) > ! > !.... Set the initial solution > CALL HBToVecX(x) > ! > !.... View snes context > PetscCall(SNESView(snes, viewer, ierr)) > ! > !.... Solve SNES problem > PetscCall(SNESSolve(snes, PETSC_NULL_VEC, x, ierr)) > ! > !.... View snes context > PetscCall(SNESView(snes, viewer, ierr)) > ! > !.... dump the logs > ! call PetscLogDump(ierr) ! Why does this cause error > ! > !.... Destroy PETSc objects > PetscCall(SNESDestroy(snes, ierr)) > PetscCall(VecDestroy(x, ierr)) > PetscCall(VecDestroy(y, ierr)) > PetscCall(PetscViewerDestroy(viewer, ierr)) > ! > !.... Finish > PetscCall(PetscFinalize(ierr)) > > END SUBROUTINE SolveWithSNESShell > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed May 31 17:21:14 2023 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 31 May 2023 18:21:14 -0400 Subject: [petsc-users] Using SNESSHELL as a wrapper for a CFD solver. In-Reply-To: <7AE73466-5DF0-4493-88EC-DBAEB4D60FA2@petsc.dev> References: <7AE73466-5DF0-4493-88EC-DBAEB4D60FA2@petsc.dev> Message-ID: On Wed, May 31, 2023 at 5:51?PM Barry Smith wrote: > > Sorry, I wrote to quickly in my last email. You will need to create a > SNESSHELL its solve simply calls your solver (for its one iteration) > SNESNRICHARSON handles the rest. > Yes, this is a great suggestion. It should work with your current SNESSHELL. I would also point out that we have the SNESMS, which is the multistage solver from Jameson. You can use this as a residual smooth with FAS to do something similar to what you have, although I do not know what Mach number preconditioning is (Jed probably knows). Thanks, Matt > On May 31, 2023, at 4:16 PM, Kenneth C Hall > wrote: > > Matt, > > Thanks for your quick reply. I think what you say makes sense. > > You asked what my code does. The MySolver program performs one iteration > of a CFD iteration. The CFD scheme is an explicit scheme that uses > multigrid, Mach number preconditioning, and residual smoothing. Typically, > I have to call MySolver on the order of 40 to 100 times to get acceptable > convergence. 
> > And in fact, I have another version of this Petsc code that uses > SNESNGMRES to solve the problem with MySolver providing the residuals as R > = N(x) - x. But I would like a version where I am using just MySolver, > without any other operations applied to it. So I am trying to plug > MySolver into the PETSc system to provide monitoring and other features, > and for consistency and comparison to these other (more appropriate!) uses > of PETSc. > > Thanks. > Kenneth > > > > *From: *Matthew Knepley > *Date: *Wednesday, May 31, 2023 at 3:48 PM > *To: *Kenneth C Hall > *Cc: *petsc-users at mcs.anl.gov > *Subject: *Re: [petsc-users] Using SNESSHELL as a wrapper for a CFD > solver. > On Wed, May 31, 2023 at 3:21?PM Kenneth C Hall > wrote: > > Hi, > > I am doing a number of problems using PETSc/SLEPc, but I also work on some > non-PETSc/SLEPc flow solvers. I would like to use PETSc as a wrapper for > this non-PETSc flow solver for compatibility, so I can use the tolerance > monitoring, options, viewers, and for direct comparison to PETSc methods I > am using. > > Here is what I am trying to do? I have a CFD solver that iterates with a > nonlinear iterator of the form x := N(x). This can be expressed in a > fortran routine of the form, > > SUBROUTINE MySolver(x) > or > SUBROUTINE MySolver(x,y) > > In the first case, x is over written. In the second, y = N(x). In any > event, I want to do something like what is shown in the subroutine at the > bottom of this email. > > The code below ?works? in the sense that MySolver is called, but it is > called exactly **once**. But MyMonitor and MyConverged are **not** > called. Again, I want to iterate so MySolver should be called many times, > as should MyMonitor and MyConverged. > > > The SNESolve() method is called once per nonlinear solve, just as > KSPSolve() is called once per linear solve. There may be iteration inside > the method, but that is handled inside the particular implementation. For > example, both Newton's method and Nonlinear Conjugate Gradient iterate, but > the iteration is internal to both, and they both call the monitor and > convergence test at each internal iterate. > > So, if your nonlinear solver should iterate, it should happen inside the > SNESSolve call for the SNESSHELL object. Does this make sense? What does > your solver do? > > Thanks, > > Matt > > > The SNESView before and after SNESSolve looks like this: > > > SNES Object: 1 MPI process > > type: shell > > SNES has not been set up so information may be incomplete > > maximum iterations=50, maximum function evaluations=10000 > > tolerances: relative=1e-50, absolute=1e-10, solution=1e+06 > > total number of function evaluations=0 > > norm schedule ALWAYS > > > SNES Object: 1 MPI process > > type: shell > > maximum iterations=50, maximum function evaluations=10000 > > tolerances: relative=1e-50, absolute=1e-10, solution=1e+06 > > total number of function evaluations=0 > > norm schedule ALWAYS > > Any suggestions on how to do what I am trying to accomplish? > > Thanks. > Kenneth Hall > > > > *#include * > *#include "macros.h"* > > > *MODULE* SolveWithSNESShell_module > *USE* MyPetscModule > *CONTAINS* > ! > > !==================================================================================================== > *SUBROUTINE* MySolver(snes, x, ierr) > > !==================================================================================================== > !! > !! > > !==================================================================================================== > ! 
> *USE* MyPetscModule > *IMPLICIT* *NONE* > ! > !.... declared passed variables > SNES :: snes > Vec :: x > PetscErrorCode :: ierr > ! > !.... code to find residual x := N(x) > !.... (or alternatively y := N(x)) > > > *END* *SUBROUTINE* MySolver > ! > > !==================================================================================================== > *SUBROUTINE* MyMonitor(snes, its, rnorm, ierr) > > !==================================================================================================== > !! > !! > > !==================================================================================================== > ! > *USE* MyPetscModule > *IMPLICIT* *NONE* > ! > !.... Declare passed variables > SNES :: snes > PetscInt :: its > PetscReal :: rnorm > PetscErrorCode :: ierr > ! > !.... Code to print out convergence history > !.... Code to print out convergence history > > > *END* *SUBROUTINE* MyMonitor > > > > !==================================================================================================== > *SUBROUTINE* MyConverged(snes, it, xnorm, ynorm, znorm, reason, ierr) > *USE* MyPetscModule > *IMPLICIT* *NONE* > > > SNES :: snes > PetscInt :: it,ctx > PetscReal :: xnorm, ynorm, znorm > KSPConvergedReason :: reason > PetscErrorCode :: ierr > > > ! ... add convergence test here ... > ! set reason to a positive value if convergence has been achieved > > > > > *END* *SUBROUTINE* MyConverged > *END* *MODULE* SolveWithSNESShell_module > ! > > !==================================================================================================== > *SUBROUTINE* SolveWithSNESShell > > !==================================================================================================== > !! > !! > > !==================================================================================================== > ! > *USE* SolveWithSNESShell_module > *IMPLICIT* *NONE* > ! > !.... Declare passed variables > *INTEGER* :: level_tmp > ! > !.... Declare local variables > *INTEGER* :: iz > *INTEGER* :: imax > *INTEGER* :: jmax > *INTEGER* :: kmax > SNES :: snes > KSP :: ksp > Vec :: x > Vec :: y > PetscViewer :: viewer > PetscErrorCode :: ierr > PetscReal :: rtol = 1.0D-10 !! relative tolerance > PetscReal :: atol = 1.0D-50 !! absolute tolerance > PetscReal :: dtol = 1.0D+06 !! divergence tolerance > PetscInt :: maxits = 50 > PetscInt :: maxf = 10000 > *character*(*len*=1000):: args > ! > !.... count the number of degrees of freedom. > level = level_tmp > n = 0 > *DO* iz = 1, hb(level)%nzone > imax = hb(level)%zone(iz)%imax - 1 > jmax = hb(level)%zone(iz)%jmax - 1 > kmax = hb(level)%zone(iz)%kmax - 1 > n = n + imax * jmax * kmax > *END* *DO* > n = n * neqn > ! > !.... Initialize PETSc > PetscCall(PetscInitialize(PETSC_NULL_CHARACTER, ierr)) > ! > !.... Log > PetscCall(PetscLogDefaultBegin(ierr)) > ! > !.... Hard-wired options. > ! PetscCall(PetscOptionsInsertString(PETSC_NULL_OPTIONS, "command line > style option here" , ierr)) > ! > !.... Command line options. > *call* *GET_COMMAND*(args) > PetscCall(PetscOptionsInsertString(PETSC_NULL_OPTIONS, args, ierr)) > ! > !.... view command line table > PetscCall(PetscViewerASCIIOpen(PETSC_COMM_SELF, > PETSC_VIEWER_STDOUT_SELF, viewer, ierr)) > PetscCall(PetscOptionsView(PETSC_NULL_OPTIONS, viewer, ierr)) > PetscCall(PetscViewerDestroy(viewer, ierr)) > ! > !.... 
Create PETSc vectors > PetscCall(VecCreateSeq(PETSC_COMM_SELF, n, x, ierr)) > PetscCall(VecCreateSeq(PETSC_COMM_SELF, n, y, ierr)) > PetscCall(VecSet(x, 0.0d0, ierr)) > PetscCall(VecSet(y, 0.0d0, ierr)) > > > !.... SNES context > PetscCall(SNESCreate(PETSC_COMM_SELF, snes, ierr)) > PetscCall(SNESSetType(snes, SNESSHELL, ierr)) > PetscCall(SNESShellSetSolve(snes, MySolver, ierr)) > > > !!!! PetscCall(SNESSetFunction(snes, x, MySolver, PETSC_NULL_INTEGER, > ierr)) > !!!! this line causes a segmentation error if uncommented. > > > PetscCall(SNESSetConvergenceTest(snes, MyConverged, 0, > PETSC_NULL_FUNCTION, ierr)) > ! > !.... Set SNES options > PetscCall(SNESSetFromOptions(snes, ierr)) > > > !.... Set tolerances > PetscCall(SNESSetTolerances(snes, rtol, atol, dtol, maxits, maxf, > ierr)) > ! > !.... SNES montior > PetscCall(SNESMonitorSet(snes, MyMonitor, PETSC_NULL_INTEGER, > PETSC_NULL_FUNCTION,ierr)) > ! > !.... Set the initial solution > *CALL* HBToVecX(x) > ! > !.... View snes context > PetscCall(SNESView(snes, viewer, ierr)) > ! > !.... Solve SNES problem > PetscCall(SNESSolve(snes, PETSC_NULL_VEC, x, ierr)) > ! > !.... View snes context > PetscCall(SNESView(snes, viewer, ierr)) > ! > !.... dump the logs > ! call PetscLogDump(ierr) ! Why does this cause error > ! > !.... Destroy PETSc objects > PetscCall(SNESDestroy(snes, ierr)) > PetscCall(VecDestroy(x, ierr)) > PetscCall(VecDestroy(y, ierr)) > PetscCall(PetscViewerDestroy(viewer, ierr)) > ! > !.... Finish > PetscCall(PetscFinalize(ierr)) > > > *END* *SUBROUTINE* SolveWithSNESShell > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From kenneth.c.hall at duke.edu Wed May 31 17:33:59 2023 From: kenneth.c.hall at duke.edu (Kenneth C Hall) Date: Wed, 31 May 2023 22:33:59 +0000 Subject: [petsc-users] Using SNESSHELL as a wrapper for a CFD solver. In-Reply-To: <7AE73466-5DF0-4493-88EC-DBAEB4D60FA2@petsc.dev> References: <7AE73466-5DF0-4493-88EC-DBAEB4D60FA2@petsc.dev> Message-ID: Barry, Thanks for your reply. Yes, for SNESNGMRES, I use N(x) ? x as the residuals, with basic line search. This converges in half the ?iterations? as the original CFD solver, but each SNESNGMRES iteration requires two residual calls, so ends up converging in almost exactly the same amount of time as the CFD solver alone. I will try the SNESSetNPC/SNESNRICHARSON/SNESSHELL combination for the ?bare CFD? version. Thanks for your helpful comments and reply. Kenneth From: Barry Smith Date: Wednesday, May 31, 2023 at 5:50 PM To: Kenneth C Hall Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Using SNESSHELL as a wrapper for a CFD solver. Sorry, I wrote to quickly in my last email. You will need to create a SNESSHELL its solve simply calls your solver (for its one iteration) SNESNRICHARSON handles the rest. On May 31, 2023, at 4:16 PM, Kenneth C Hall wrote: Matt, Thanks for your quick reply. I think what you say makes sense. You asked what my code does. The MySolver program performs one iteration of a CFD iteration. 
The CFD scheme is an explicit scheme that uses multigrid, Mach number preconditioning, and residual smoothing. Typically, I have to call MySolver on the order of 40 to 100 times to get acceptable convergence. And in fact, I have another version of this Petsc code that uses SNESNGMRES to solve the problem with MySolver providing the residuals as R = N(x) - x. But I would like a version where I am using just MySolver, without any other operations applied to it. So I am trying to plug MySolver into the PETSc system to provide monitoring and other features, and for consistency and comparison to these other (more appropriate!) uses of PETSc. Thanks. Kenneth From: Matthew Knepley > Date: Wednesday, May 31, 2023 at 3:48 PM To: Kenneth C Hall > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Using SNESSHELL as a wrapper for a CFD solver. On Wed, May 31, 2023 at 3:21?PM Kenneth C Hall > wrote: Hi, I am doing a number of problems using PETSc/SLEPc, but I also work on some non-PETSc/SLEPc flow solvers. I would like to use PETSc as a wrapper for this non-PETSc flow solver for compatibility, so I can use the tolerance monitoring, options, viewers, and for direct comparison to PETSc methods I am using. Here is what I am trying to do? I have a CFD solver that iterates with a nonlinear iterator of the form x := N(x). This can be expressed in a fortran routine of the form, SUBROUTINE MySolver(x) or SUBROUTINE MySolver(x,y) In the first case, x is over written. In the second, y = N(x). In any event, I want to do something like what is shown in the subroutine at the bottom of this email. The code below ?works? in the sense that MySolver is called, but it is called exactly *once*. But MyMonitor and MyConverged are *not* called. Again, I want to iterate so MySolver should be called many times, as should MyMonitor and MyConverged. The SNESolve() method is called once per nonlinear solve, just as KSPSolve() is called once per linear solve. There may be iteration inside the method, but that is handled inside the particular implementation. For example, both Newton's method and Nonlinear Conjugate Gradient iterate, but the iteration is internal to both, and they both call the monitor and convergence test at each internal iterate. So, if your nonlinear solver should iterate, it should happen inside the SNESSolve call for the SNESSHELL object. Does this make sense? What does your solver do? Thanks, Matt The SNESView before and after SNESSolve looks like this: SNES Object: 1 MPI process type: shell SNES has not been set up so information may be incomplete maximum iterations=50, maximum function evaluations=10000 tolerances: relative=1e-50, absolute=1e-10, solution=1e+06 total number of function evaluations=0 norm schedule ALWAYS SNES Object: 1 MPI process type: shell maximum iterations=50, maximum function evaluations=10000 tolerances: relative=1e-50, absolute=1e-10, solution=1e+06 total number of function evaluations=0 norm schedule ALWAYS Any suggestions on how to do what I am trying to accomplish? Thanks. Kenneth Hall #include #include "macros.h" MODULE SolveWithSNESShell_module USE MyPetscModule CONTAINS ! !==================================================================================================== SUBROUTINE MySolver(snes, x, ierr) !==================================================================================================== !! !! !==================================================================================================== ! USE MyPetscModule IMPLICIT NONE ! !.... 
declared passed variables SNES :: snes Vec :: x PetscErrorCode :: ierr ! !.... code to find residual x := N(x) !.... (or alternatively y := N(x)) END SUBROUTINE MySolver ! !==================================================================================================== SUBROUTINE MyMonitor(snes, its, rnorm, ierr) !==================================================================================================== !! !! !==================================================================================================== ! USE MyPetscModule IMPLICIT NONE ! !.... Declare passed variables SNES :: snes PetscInt :: its PetscReal :: rnorm PetscErrorCode :: ierr ! !.... Code to print out convergence history !.... Code to print out convergence history END SUBROUTINE MyMonitor !==================================================================================================== SUBROUTINE MyConverged(snes, it, xnorm, ynorm, znorm, reason, ierr) USE MyPetscModule IMPLICIT NONE SNES :: snes PetscInt :: it,ctx PetscReal :: xnorm, ynorm, znorm KSPConvergedReason :: reason PetscErrorCode :: ierr ! ... add convergence test here ... ! set reason to a positive value if convergence has been achieved END SUBROUTINE MyConverged END MODULE SolveWithSNESShell_module ! !==================================================================================================== SUBROUTINE SolveWithSNESShell !==================================================================================================== !! !! !==================================================================================================== ! USE SolveWithSNESShell_module IMPLICIT NONE ! !.... Declare passed variables INTEGER :: level_tmp ! !.... Declare local variables INTEGER :: iz INTEGER :: imax INTEGER :: jmax INTEGER :: kmax SNES :: snes KSP :: ksp Vec :: x Vec :: y PetscViewer :: viewer PetscErrorCode :: ierr PetscReal :: rtol = 1.0D-10 !! relative tolerance PetscReal :: atol = 1.0D-50 !! absolute tolerance PetscReal :: dtol = 1.0D+06 !! divergence tolerance PetscInt :: maxits = 50 PetscInt :: maxf = 10000 character(len=1000):: args ! !.... count the number of degrees of freedom. level = level_tmp n = 0 DO iz = 1, hb(level)%nzone imax = hb(level)%zone(iz)%imax - 1 jmax = hb(level)%zone(iz)%jmax - 1 kmax = hb(level)%zone(iz)%kmax - 1 n = n + imax * jmax * kmax END DO n = n * neqn ! !.... Initialize PETSc PetscCall(PetscInitialize(PETSC_NULL_CHARACTER, ierr)) ! !.... Log PetscCall(PetscLogDefaultBegin(ierr)) ! !.... Hard-wired options. ! PetscCall(PetscOptionsInsertString(PETSC_NULL_OPTIONS, "command line style option here" , ierr)) ! !.... Command line options. call GET_COMMAND(args) PetscCall(PetscOptionsInsertString(PETSC_NULL_OPTIONS, args, ierr)) ! !.... view command line table PetscCall(PetscViewerASCIIOpen(PETSC_COMM_SELF, PETSC_VIEWER_STDOUT_SELF, viewer, ierr)) PetscCall(PetscOptionsView(PETSC_NULL_OPTIONS, viewer, ierr)) PetscCall(PetscViewerDestroy(viewer, ierr)) ! !.... Create PETSc vectors PetscCall(VecCreateSeq(PETSC_COMM_SELF, n, x, ierr)) PetscCall(VecCreateSeq(PETSC_COMM_SELF, n, y, ierr)) PetscCall(VecSet(x, 0.0d0, ierr)) PetscCall(VecSet(y, 0.0d0, ierr)) !.... SNES context PetscCall(SNESCreate(PETSC_COMM_SELF, snes, ierr)) PetscCall(SNESSetType(snes, SNESSHELL, ierr)) PetscCall(SNESShellSetSolve(snes, MySolver, ierr)) !!!! PetscCall(SNESSetFunction(snes, x, MySolver, PETSC_NULL_INTEGER, ierr)) !!!! this line causes a segmentation error if uncommented. 
PetscCall(SNESSetConvergenceTest(snes, MyConverged, 0, PETSC_NULL_FUNCTION, ierr))
!
!.... Set SNES options
PetscCall(SNESSetFromOptions(snes, ierr))

!.... Set tolerances
PetscCall(SNESSetTolerances(snes, rtol, atol, dtol, maxits, maxf, ierr))
!
!.... SNES monitor
PetscCall(SNESMonitorSet(snes, MyMonitor, PETSC_NULL_INTEGER, PETSC_NULL_FUNCTION, ierr))
!
!.... Set the initial solution
CALL HBToVecX(x)
!
!.... View snes context
PetscCall(SNESView(snes, viewer, ierr))
!
!.... Solve SNES problem
PetscCall(SNESSolve(snes, PETSC_NULL_VEC, x, ierr))
!
!.... View snes context
PetscCall(SNESView(snes, viewer, ierr))
!
!.... dump the logs
! call PetscLogDump(ierr) ! Why does this cause error
!
!.... Destroy PETSc objects
PetscCall(SNESDestroy(snes, ierr))
PetscCall(VecDestroy(x, ierr))
PetscCall(VecDestroy(y, ierr))
PetscCall(PetscViewerDestroy(viewer, ierr))
!
!.... Finish
PetscCall(PetscFinalize(ierr))

END SUBROUTINE SolveWithSNESShell

-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/

From FERRANJ2 at my.erau.edu Wed May 31 17:37:37 2023
From: FERRANJ2 at my.erau.edu (Ferrand, Jesus A.)
Date: Wed, 31 May 2023 22:37:37 +0000
Subject: [petsc-users] PetscSF "One-Sided" vs. "Two-Sided"
Message-ID: 

Dear PETSc team:

For one of my applications, I need to know which owned DAG points in a DMPlex are other ranks' ghosts. Say, rank-0 owns some point "x", and it shows up in, say, rank-1 as a ghost numbered "y". Then, if I inspect the "Point" PetscSF, in rank-1 there should be a leaf for "y" that records rank-0 and the value of "x", right? However, back in rank-0, the same PetscSF has no record that "x" is "y" in rank-1 (and potentially "z", "a", "b", ... in additional ranks).

Is this what, in PetscSF terminology, is referred to as "One-Sided"?

If so, I think what I am really asking is: how can I obtain a "Two-Sided" version of the PointSF?

Sincerely:

J.A. Ferrand
Embry-Riddle Aeronautical University - Daytona Beach - FL
Ph.D. Candidate, Aerospace Engineering
M.Sc. Aerospace Engineering
B.Sc. Aerospace Engineering
B.Sc. Computational Mathematics

Phone: (386)-843-1829
Email(s): ferranj2 at my.erau.edu
          jesus.ferrand at gmail.com

From kenneth.c.hall at duke.edu Wed May 31 17:42:39 2023
From: kenneth.c.hall at duke.edu (Kenneth C Hall)
Date: Wed, 31 May 2023 22:42:39 +0000
Subject: [petsc-users] Using SNESSHELL as a wrapper for a CFD solver.
In-Reply-To: 
References: <7AE73466-5DF0-4493-88EC-DBAEB4D60FA2@petsc.dev>
Message-ID: 

Matt,

Thanks for this. Mach number preconditioning is as follows. The Euler or Navier-Stokes equations are written as

    du/dt + M.(dF(u)/dx + dG(u)/dy) = 0

The matrix M is a preconditioning matrix which changes the wave speeds of the convection and pressure characteristics so that they are close to one another. The equations are no longer time accurate, but they converge to a steady state faster. This is especially useful for very low speed flows, where the (unpreconditioned) pressure waves travel at very high speeds compared to the convection waves.

Kenneth

From: Matthew Knepley
Date: Wednesday, May 31, 2023 at 6:21 PM
To: Barry Smith
Cc: Kenneth C Hall , petsc-users at mcs.anl.gov
Subject: Re: [petsc-users] Using SNESSHELL as a wrapper for a CFD solver.
On Wed, May 31, 2023 at 5:51?PM Barry Smith > wrote: Sorry, I wrote to quickly in my last email. You will need to create a SNESSHELL its solve simply calls your solver (for its one iteration) SNESNRICHARSON handles the rest. Yes, this is a great suggestion. It should work with your current SNESSHELL. I would also point out that we have the SNESMS, which is the multistage solver from Jameson. You can use this as a residual smooth with FAS to do something similar to what you have, although I do not know what Mach number preconditioning is (Jed probably knows). Thanks, Matt On May 31, 2023, at 4:16 PM, Kenneth C Hall > wrote: Matt, Thanks for your quick reply. I think what you say makes sense. You asked what my code does. The MySolver program performs one iteration of a CFD iteration. The CFD scheme is an explicit scheme that uses multigrid, Mach number preconditioning, and residual smoothing. Typically, I have to call MySolver on the order of 40 to 100 times to get acceptable convergence. And in fact, I have another version of this Petsc code that uses SNESNGMRES to solve the problem with MySolver providing the residuals as R = N(x) - x. But I would like a version where I am using just MySolver, without any other operations applied to it. So I am trying to plug MySolver into the PETSc system to provide monitoring and other features, and for consistency and comparison to these other (more appropriate!) uses of PETSc. Thanks. Kenneth From: Matthew Knepley > Date: Wednesday, May 31, 2023 at 3:48 PM To: Kenneth C Hall > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Using SNESSHELL as a wrapper for a CFD solver. On Wed, May 31, 2023 at 3:21?PM Kenneth C Hall > wrote: Hi, I am doing a number of problems using PETSc/SLEPc, but I also work on some non-PETSc/SLEPc flow solvers. I would like to use PETSc as a wrapper for this non-PETSc flow solver for compatibility, so I can use the tolerance monitoring, options, viewers, and for direct comparison to PETSc methods I am using. Here is what I am trying to do? I have a CFD solver that iterates with a nonlinear iterator of the form x := N(x). This can be expressed in a fortran routine of the form, SUBROUTINE MySolver(x) or SUBROUTINE MySolver(x,y) In the first case, x is over written. In the second, y = N(x). In any event, I want to do something like what is shown in the subroutine at the bottom of this email. The code below ?works? in the sense that MySolver is called, but it is called exactly *once*. But MyMonitor and MyConverged are *not* called. Again, I want to iterate so MySolver should be called many times, as should MyMonitor and MyConverged. The SNESolve() method is called once per nonlinear solve, just as KSPSolve() is called once per linear solve. There may be iteration inside the method, but that is handled inside the particular implementation. For example, both Newton's method and Nonlinear Conjugate Gradient iterate, but the iteration is internal to both, and they both call the monitor and convergence test at each internal iterate. So, if your nonlinear solver should iterate, it should happen inside the SNESSolve call for the SNESSHELL object. Does this make sense? What does your solver do? 
Thanks, Matt The SNESView before and after SNESSolve looks like this: SNES Object: 1 MPI process type: shell SNES has not been set up so information may be incomplete maximum iterations=50, maximum function evaluations=10000 tolerances: relative=1e-50, absolute=1e-10, solution=1e+06 total number of function evaluations=0 norm schedule ALWAYS SNES Object: 1 MPI process type: shell maximum iterations=50, maximum function evaluations=10000 tolerances: relative=1e-50, absolute=1e-10, solution=1e+06 total number of function evaluations=0 norm schedule ALWAYS Any suggestions on how to do what I am trying to accomplish? Thanks. Kenneth Hall #include #include "macros.h" MODULE SolveWithSNESShell_module USE MyPetscModule CONTAINS ! !==================================================================================================== SUBROUTINE MySolver(snes, x, ierr) !==================================================================================================== !! !! !==================================================================================================== ! USE MyPetscModule IMPLICIT NONE ! !.... declared passed variables SNES :: snes Vec :: x PetscErrorCode :: ierr ! !.... code to find residual x := N(x) !.... (or alternatively y := N(x)) END SUBROUTINE MySolver ! !==================================================================================================== SUBROUTINE MyMonitor(snes, its, rnorm, ierr) !==================================================================================================== !! !! !==================================================================================================== ! USE MyPetscModule IMPLICIT NONE ! !.... Declare passed variables SNES :: snes PetscInt :: its PetscReal :: rnorm PetscErrorCode :: ierr ! !.... Code to print out convergence history !.... Code to print out convergence history END SUBROUTINE MyMonitor !==================================================================================================== SUBROUTINE MyConverged(snes, it, xnorm, ynorm, znorm, reason, ierr) USE MyPetscModule IMPLICIT NONE SNES :: snes PetscInt :: it,ctx PetscReal :: xnorm, ynorm, znorm KSPConvergedReason :: reason PetscErrorCode :: ierr ! ... add convergence test here ... ! set reason to a positive value if convergence has been achieved END SUBROUTINE MyConverged END MODULE SolveWithSNESShell_module ! !==================================================================================================== SUBROUTINE SolveWithSNESShell !==================================================================================================== !! !! !==================================================================================================== ! USE SolveWithSNESShell_module IMPLICIT NONE ! !.... Declare passed variables INTEGER :: level_tmp ! !.... Declare local variables INTEGER :: iz INTEGER :: imax INTEGER :: jmax INTEGER :: kmax SNES :: snes KSP :: ksp Vec :: x Vec :: y PetscViewer :: viewer PetscErrorCode :: ierr PetscReal :: rtol = 1.0D-10 !! relative tolerance PetscReal :: atol = 1.0D-50 !! absolute tolerance PetscReal :: dtol = 1.0D+06 !! divergence tolerance PetscInt :: maxits = 50 PetscInt :: maxf = 10000 character(len=1000):: args ! !.... count the number of degrees of freedom. level = level_tmp n = 0 DO iz = 1, hb(level)%nzone imax = hb(level)%zone(iz)%imax - 1 jmax = hb(level)%zone(iz)%jmax - 1 kmax = hb(level)%zone(iz)%kmax - 1 n = n + imax * jmax * kmax END DO n = n * neqn ! !.... 
Initialize PETSc PetscCall(PetscInitialize(PETSC_NULL_CHARACTER, ierr)) ! !.... Log PetscCall(PetscLogDefaultBegin(ierr)) ! !.... Hard-wired options. ! PetscCall(PetscOptionsInsertString(PETSC_NULL_OPTIONS, "command line style option here" , ierr)) ! !.... Command line options. call GET_COMMAND(args) PetscCall(PetscOptionsInsertString(PETSC_NULL_OPTIONS, args, ierr)) ! !.... view command line table PetscCall(PetscViewerASCIIOpen(PETSC_COMM_SELF, PETSC_VIEWER_STDOUT_SELF, viewer, ierr)) PetscCall(PetscOptionsView(PETSC_NULL_OPTIONS, viewer, ierr)) PetscCall(PetscViewerDestroy(viewer, ierr)) ! !.... Create PETSc vectors PetscCall(VecCreateSeq(PETSC_COMM_SELF, n, x, ierr)) PetscCall(VecCreateSeq(PETSC_COMM_SELF, n, y, ierr)) PetscCall(VecSet(x, 0.0d0, ierr)) PetscCall(VecSet(y, 0.0d0, ierr)) !.... SNES context PetscCall(SNESCreate(PETSC_COMM_SELF, snes, ierr)) PetscCall(SNESSetType(snes, SNESSHELL, ierr)) PetscCall(SNESShellSetSolve(snes, MySolver, ierr)) !!!! PetscCall(SNESSetFunction(snes, x, MySolver, PETSC_NULL_INTEGER, ierr)) !!!! this line causes a segmentation error if uncommented. PetscCall(SNESSetConvergenceTest(snes, MyConverged, 0, PETSC_NULL_FUNCTION, ierr)) ! !.... Set SNES options PetscCall(SNESSetFromOptions(snes, ierr)) !.... Set tolerances PetscCall(SNESSetTolerances(snes, rtol, atol, dtol, maxits, maxf, ierr)) ! !.... SNES montior PetscCall(SNESMonitorSet(snes, MyMonitor, PETSC_NULL_INTEGER, PETSC_NULL_FUNCTION,ierr)) ! !.... Set the initial solution CALL HBToVecX(x) ! !.... View snes context PetscCall(SNESView(snes, viewer, ierr)) ! !.... Solve SNES problem PetscCall(SNESSolve(snes, PETSC_NULL_VEC, x, ierr)) ! !.... View snes context PetscCall(SNESView(snes, viewer, ierr)) ! !.... dump the logs ! call PetscLogDump(ierr) ! Why does this cause error ! !.... Destroy PETSc objects PetscCall(SNESDestroy(snes, ierr)) PetscCall(VecDestroy(x, ierr)) PetscCall(VecDestroy(y, ierr)) PetscCall(PetscViewerDestroy(viewer, ierr)) ! !.... Finish PetscCall(PetscFinalize(ierr)) END SUBROUTINE SolveWithSNESShell -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed May 31 18:07:46 2023 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 31 May 2023 19:07:46 -0400 Subject: [petsc-users] PetscSF "One-Sided" vs. "Two-Sided" In-Reply-To: References: Message-ID: On Wed, May 31, 2023 at 6:37?PM Ferrand, Jesus A. wrote: > Dear PETSc team: > > For one of my applications, I need to know which owned DAG points in a > (DMPlex) are other ranks' ghosts. > Say, rank-0 has some point "x" (which it owns) and it shows up in, say, > rank-1 as a ghost numbered "y". > Then, if I inspect the "Point" PestcSF, in rank-1 there should be a leaf > for "y" that records rank-0 and the value of "x", right? > However, back in rank-0, the same PetscSF has no record that "x" is "y" in > rank-1 (and potentially "z","a","b",... in additional ranks). > > Is this what, in PetscSF terminology, is referred to as "One-Sided"? > > If so, I think what I am really asking is how can I obtain a "Two-Sided" > version of the PointSF? 
You need https://petsc.org/main/manualpages/PetscSF/PetscSFComputeDegreeBegin/

If the point is not in the leaf list (owned) and has rootdegree > 1 (shared), it is your point.

  Thanks,

    Matt

> Sincerely:
>
> J.A. Ferrand
> Embry-Riddle Aeronautical University - Daytona Beach - FL
> Ph.D. Candidate, Aerospace Engineering
> M.Sc. Aerospace Engineering
> B.Sc. Aerospace Engineering
> B.Sc. Computational Mathematics
>
> Phone: (386)-843-1829
> Email(s): ferranj2 at my.erau.edu
>           jesus.ferrand at gmail.com

-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/

From bsmith at petsc.dev Wed May 31 19:33:05 2023
From: bsmith at petsc.dev (Barry Smith)
Date: Wed, 31 May 2023 20:33:05 -0400
Subject: [petsc-users] Using SNESSHELL as a wrapper for a CFD solver.
In-Reply-To: 
References: <7AE73466-5DF0-4493-88EC-DBAEB4D60FA2@petsc.dev>
Message-ID: <529A7844-2B03-455E-90CC-C05C7A9928A1@petsc.dev>

  You should also try SNESNGMRES with your SNESSHELL set as the nonlinear preconditioner via SNESSetNPC().

> On May 31, 2023, at 6:33 PM, Kenneth C Hall wrote:
>
> Barry,
>
> Thanks for your reply. Yes, for SNESNGMRES, I use N(x) - x as the residuals, with basic line search. This converges in half the "iterations" of the original CFD solver, but each SNESNGMRES iteration requires two residual calls, so it ends up converging in almost exactly the same amount of time as the CFD solver alone.
>
> I will try the SNESSetNPC/SNESNRICHARDSON/SNESSHELL combination for the "bare CFD" version.
>
> Thanks for your helpful comments and reply.
>
> Kenneth
>
>
> From: Barry Smith
> Date: Wednesday, May 31, 2023 at 5:50 PM
> To: Kenneth C Hall
> Cc: petsc-users at mcs.anl.gov
> Subject: Re: [petsc-users] Using SNESSHELL as a wrapper for a CFD solver.
>
>   Sorry, I wrote too quickly in my last email. You will need to create a SNESSHELL whose solve simply calls your solver (for its one iteration); SNESNRICHARDSON handles the rest.
>
> On May 31, 2023, at 4:16 PM, Kenneth C Hall wrote:
>
> Matt,
>
> Thanks for your quick reply. I think what you say makes sense.
>
> You asked what my code does. The MySolver program performs one iteration of the CFD scheme. The CFD scheme is an explicit scheme that uses multigrid, Mach number preconditioning, and residual smoothing. Typically, I have to call MySolver on the order of 40 to 100 times to get acceptable convergence.
>
> And in fact, I have another version of this PETSc code that uses SNESNGMRES to solve the problem with MySolver providing the residuals as R = N(x) - x. But I would like a version where I am using just MySolver, without any other operations applied to it. So I am trying to plug MySolver into the PETSc system to provide monitoring and other features, and for consistency and comparison to these other (more appropriate!) uses of PETSc.
>
> Thanks.
> Kenneth
>
>
> From: Matthew Knepley
> Date: Wednesday, May 31, 2023 at 3:48 PM
> To: Kenneth C Hall
> Cc: petsc-users at mcs.anl.gov
> Subject: Re: [petsc-users] Using SNESSHELL as a wrapper for a CFD solver.
>
> On Wed, May 31, 2023 at 3:21 PM Kenneth C Hall wrote:
> Hi,
>
> I am doing a number of problems using PETSc/SLEPc, but I also work on some non-PETSc/SLEPc flow solvers.
I would like to use PETSc as a wrapper for this non-PETSc flow solver for compatibility, so I can use the tolerance monitoring, options, viewers, and for direct comparison to PETSc methods I am using. > > Here is what I am trying to do? I have a CFD solver that iterates with a nonlinear iterator of the form x := N(x). This can be expressed in a fortran routine of the form, > > SUBROUTINE MySolver(x) > or > SUBROUTINE MySolver(x,y) > > In the first case, x is over written. In the second, y = N(x). In any event, I want to do something like what is shown in the subroutine at the bottom of this email. > > The code below ?works? in the sense that MySolver is called, but it is called exactly *once*. But MyMonitor and MyConverged are *not* called. Again, I want to iterate so MySolver should be called many times, as should MyMonitor and MyConverged. > > The SNESolve() method is called once per nonlinear solve, just as KSPSolve() is called once per linear solve. There may be iteration inside the method, but that is handled inside the particular implementation. For example, both Newton's method and Nonlinear Conjugate Gradient iterate, but the iteration is internal to both, and they both call the monitor and convergence test at each internal iterate. > > So, if your nonlinear solver should iterate, it should happen inside the SNESSolve call for the SNESSHELL object. Does this make sense? What does your solver do? > > Thanks, > > Matt > > The SNESView before and after SNESSolve looks like this: > > SNES Object: 1 MPI process > type: shell > SNES has not been set up so information may be incomplete > maximum iterations=50, maximum function evaluations=10000 > tolerances: relative=1e-50, absolute=1e-10, solution=1e+06 > total number of function evaluations=0 > norm schedule ALWAYS > > SNES Object: 1 MPI process > type: shell > maximum iterations=50, maximum function evaluations=10000 > tolerances: relative=1e-50, absolute=1e-10, solution=1e+06 > total number of function evaluations=0 > norm schedule ALWAYS > > Any suggestions on how to do what I am trying to accomplish? > > Thanks. > Kenneth Hall > > > > #include > #include "macros.h" > > MODULE SolveWithSNESShell_module > USE MyPetscModule > CONTAINS > ! > !==================================================================================================== > SUBROUTINE MySolver(snes, x, ierr) > !==================================================================================================== > !! > !! > !==================================================================================================== > ! > USE MyPetscModule > IMPLICIT NONE > ! > !.... declared passed variables > SNES :: snes > Vec :: x > PetscErrorCode :: ierr > ! > !.... code to find residual x := N(x) > !.... (or alternatively y := N(x)) > > END SUBROUTINE MySolver > ! > !==================================================================================================== > SUBROUTINE MyMonitor(snes, its, rnorm, ierr) > !==================================================================================================== > !! > !! > !==================================================================================================== > ! > USE MyPetscModule > IMPLICIT NONE > ! > !.... Declare passed variables > SNES :: snes > PetscInt :: its > PetscReal :: rnorm > PetscErrorCode :: ierr > ! > !.... Code to print out convergence history > !.... 
Code to print out convergence history > > END SUBROUTINE MyMonitor > > !==================================================================================================== > SUBROUTINE MyConverged(snes, it, xnorm, ynorm, znorm, reason, ierr) > USE MyPetscModule > IMPLICIT NONE > > SNES :: snes > PetscInt :: it,ctx > PetscReal :: xnorm, ynorm, znorm > KSPConvergedReason :: reason > PetscErrorCode :: ierr > > ! ... add convergence test here ... > ! set reason to a positive value if convergence has been achieved > > > END SUBROUTINE MyConverged > END MODULE SolveWithSNESShell_module > ! > !==================================================================================================== > SUBROUTINE SolveWithSNESShell > !==================================================================================================== > !! > !! > !==================================================================================================== > ! > USE SolveWithSNESShell_module > IMPLICIT NONE > ! > !.... Declare passed variables > INTEGER :: level_tmp > ! > !.... Declare local variables > INTEGER :: iz > INTEGER :: imax > INTEGER :: jmax > INTEGER :: kmax > SNES :: snes > KSP :: ksp > Vec :: x > Vec :: y > PetscViewer :: viewer > PetscErrorCode :: ierr > PetscReal :: rtol = 1.0D-10 !! relative tolerance > PetscReal :: atol = 1.0D-50 !! absolute tolerance > PetscReal :: dtol = 1.0D+06 !! divergence tolerance > PetscInt :: maxits = 50 > PetscInt :: maxf = 10000 > character(len=1000):: args > ! > !.... count the number of degrees of freedom. > level = level_tmp > n = 0 > DO iz = 1, hb(level)%nzone > imax = hb(level)%zone(iz)%imax - 1 > jmax = hb(level)%zone(iz)%jmax - 1 > kmax = hb(level)%zone(iz)%kmax - 1 > n = n + imax * jmax * kmax > END DO > n = n * neqn > ! > !.... Initialize PETSc > PetscCall(PetscInitialize(PETSC_NULL_CHARACTER, ierr)) > ! > !.... Log > PetscCall(PetscLogDefaultBegin(ierr)) > ! > !.... Hard-wired options. > ! PetscCall(PetscOptionsInsertString(PETSC_NULL_OPTIONS, "command line style option here" , ierr)) > ! > !.... Command line options. > call GET_COMMAND(args) > PetscCall(PetscOptionsInsertString(PETSC_NULL_OPTIONS, args, ierr)) > ! > !.... view command line table > PetscCall(PetscViewerASCIIOpen(PETSC_COMM_SELF, PETSC_VIEWER_STDOUT_SELF, viewer, ierr)) > PetscCall(PetscOptionsView(PETSC_NULL_OPTIONS, viewer, ierr)) > PetscCall(PetscViewerDestroy(viewer, ierr)) > ! > !.... Create PETSc vectors > PetscCall(VecCreateSeq(PETSC_COMM_SELF, n, x, ierr)) > PetscCall(VecCreateSeq(PETSC_COMM_SELF, n, y, ierr)) > PetscCall(VecSet(x, 0.0d0, ierr)) > PetscCall(VecSet(y, 0.0d0, ierr)) > > !.... SNES context > PetscCall(SNESCreate(PETSC_COMM_SELF, snes, ierr)) > PetscCall(SNESSetType(snes, SNESSHELL, ierr)) > PetscCall(SNESShellSetSolve(snes, MySolver, ierr)) > > !!!! PetscCall(SNESSetFunction(snes, x, MySolver, PETSC_NULL_INTEGER, ierr)) > !!!! this line causes a segmentation error if uncommented. > > PetscCall(SNESSetConvergenceTest(snes, MyConverged, 0, PETSC_NULL_FUNCTION, ierr)) > ! > !.... Set SNES options > PetscCall(SNESSetFromOptions(snes, ierr)) > > !.... Set tolerances > PetscCall(SNESSetTolerances(snes, rtol, atol, dtol, maxits, maxf, ierr)) > ! > !.... SNES montior > PetscCall(SNESMonitorSet(snes, MyMonitor, PETSC_NULL_INTEGER, PETSC_NULL_FUNCTION,ierr)) > ! > !.... Set the initial solution > CALL HBToVecX(x) > ! > !.... View snes context > PetscCall(SNESView(snes, viewer, ierr)) > ! > !.... Solve SNES problem > PetscCall(SNESSolve(snes, PETSC_NULL_VEC, x, ierr)) > ! > !.... 
View snes context
> PetscCall(SNESView(snes, viewer, ierr))
> !
> !.... dump the logs
> ! call PetscLogDump(ierr) ! Why does this cause error
> !
> !.... Destroy PETSc objects
> PetscCall(SNESDestroy(snes, ierr))
> PetscCall(VecDestroy(x, ierr))
> PetscCall(VecDestroy(y, ierr))
> PetscCall(PetscViewerDestroy(viewer, ierr))
> !
> !.... Finish
> PetscCall(PetscFinalize(ierr))
>
> END SUBROUTINE SolveWithSNESShell
>
> --
> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
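A minimal sketch of the wiring suggested in this thread, for reference: an outer SNESNRICHARDSON (or SNESNGMRES, selectable with -snes_type ngmres) drives the iteration, and the one-sweep CFD update lives in a SNESSHELL attached as the nonlinear preconditioner through SNESGetNPC. The callback MyResidual is assumed here and is not part of the code above; it would fill r = N(x) - x, the same residual already used with SNESNGMRES earlier in the thread. Argument conventions can differ slightly between PETSc versions, so treat this as an illustration of the SNESSetNPC idea rather than drop-in code.

    SNES           :: snes, npc
    Vec            :: x, r
    PetscErrorCode :: ierr

    !.... outer solver: it owns the iteration, so the monitor and the
    !.... convergence test are called once per sweep
    PetscCall(SNESCreate(PETSC_COMM_SELF, snes, ierr))
    PetscCall(SNESSetType(snes, SNESNRICHARDSON, ierr))

    !.... residual used for norms and the line search; MyResidual is an
    !.... assumed helper of the form MyResidual(snes, x, r, ctx, ierr)
    !.... that computes r = N(x) - x
    PetscCall(VecCreateSeq(PETSC_COMM_SELF, n, x, ierr))
    PetscCall(VecDuplicate(x, r, ierr))
    PetscCall(SNESSetFunction(snes, r, MyResidual, PETSC_NULL_INTEGER, ierr))

    !.... inner "preconditioner": a SNESSHELL whose solve applies one CFD sweep
    PetscCall(SNESGetNPC(snes, npc, ierr))
    PetscCall(SNESSetType(npc, SNESSHELL, ierr))
    PetscCall(SNESShellSetSolve(npc, MySolver, ierr))

    !.... monitoring, convergence test and options all go on the outer solver
    PetscCall(SNESMonitorSet(snes, MyMonitor, PETSC_NULL_INTEGER, PETSC_NULL_FUNCTION, ierr))
    PetscCall(SNESSetConvergenceTest(snes, MyConverged, 0, PETSC_NULL_FUNCTION, ierr))
    PetscCall(SNESSetFromOptions(snes, ierr))

    PetscCall(SNESSolve(snes, PETSC_NULL_VEC, x, ierr))

With this layout SNESSolve keeps applying MySolver once per outer iteration until MyConverged (or the tolerances) stops it, which is the behavior requested above, and switching the outer type on the command line allows a direct comparison between plain Richardson sweeps and the NGMRES-accelerated variant.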