From gourav.kumbhojkar at gmail.com Thu Jan 4 12:08:07 2024 From: gourav.kumbhojkar at gmail.com (Gourav Kumbhojkar) Date: Thu, 4 Jan 2024 18:08:07 +0000 Subject: [petsc-users] Neumann Boundary Condition with DMDACreate3D Message-ID: Hi, I am trying to implement a No-flux boundary condition for a 3D domain. I previously modeled a no flux boundary in 2D domain using DMDACreate2D and "PETSC_BOUNDARY_MIRROR" which worked well. However, the manual pages say that the Mirror Boundary is not supported for 3D. Could you please point me to the right resources to implement no flux boundary condition in 3D domains? Regards, Gourav K. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Thu Jan 4 12:23:51 2024 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 4 Jan 2024 13:23:51 -0500 Subject: [petsc-users] Neumann Boundary Condition with DMDACreate3D In-Reply-To: References: Message-ID: <0A4EB78C-997F-4978-8945-771B351B08CE@petsc.dev> Are you referring to the text? . `DM_BOUNDARY_MIRROR` - the ghost value is the same as the value 1 grid point in; that is, the 0th grid point in the real mesh acts like a mirror to define the ghost point value; not yet implemented for 3d Looking at the code for DMSetUp_DA_3D() I see PetscCheck(stencil_type != DMDA_STENCIL_BOX || (bx != DM_BOUNDARY_MIRROR && by != DM_BOUNDARY_MIRROR && bz != DM_BOUNDARY_MIRROR), PetscObjectComm((PetscObject)da), PETSC_ERR_SUP, "Mirror boundary and box stencil"); which seems (to me) to indicate the mirroring is not done for box stencils but should work for star stencils. Are you using a star stencil or a box stencil? I believe the code is not complete for box stencil because the code to determine the location of the "mirrored point" for extra "box points" is messy in 3d and no one wrote it. You can compare DMSetUp_DA_2D() and DMSetUp_DA_3D() to see what is missing and see if you can determine how to add it for 3d. Barry > On Jan 4, 2024, at 1:08 PM, Gourav Kumbhojkar wrote: > > Hi, > > I am trying to implement a No-flux boundary condition for a 3D domain. I previously modeled a no flux boundary in 2D domain using DMDACreate2D and "PETSC_BOUNDARY_MIRROR" which worked well. > However, the manual pages say that the Mirror Boundary is not supported for 3D. > Could you please point me to the right resources to implement no flux boundary condition in 3D domains? > > Regards, > Gourav K. -------------- next part -------------- An HTML attachment was scrubbed... 
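A minimal, untested sketch (not from this thread) of how the 3d mirror behaviour can be checked directly, assuming the DMDACreate3d()/DMDAVecGetArray() calling sequence of recent PETSc releases: fill a global vector with the x index, scatter to a local vector, and look at the ghost entry at i = -1, which should equal the interior value at i = 1 if mirroring is applied for the star stencil.

#include <petscdmda.h>

int main(int argc, char **argv)
{
  DM             da;
  Vec            g, l;
  PetscScalar ***a;
  PetscInt       i, j, k, xs, ys, zs, xm, ym, zm, gxs, gys, gzs, gxm, gym, gzm;

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  /* 4x4x4 grid, star stencil, stencil width 1, mirror boundaries in all three directions */
  PetscCall(DMDACreate3d(PETSC_COMM_WORLD, DM_BOUNDARY_MIRROR, DM_BOUNDARY_MIRROR, DM_BOUNDARY_MIRROR, DMDA_STENCIL_STAR, 4, 4, 4, PETSC_DECIDE, PETSC_DECIDE, PETSC_DECIDE, 1, 1, NULL, NULL, NULL, &da));
  PetscCall(DMSetFromOptions(da));
  PetscCall(DMSetUp(da));

  /* fill the global vector with the global x index so mirrored ghost values are easy to recognize */
  PetscCall(DMCreateGlobalVector(da, &g));
  PetscCall(DMDAGetCorners(da, &xs, &ys, &zs, &xm, &ym, &zm));
  PetscCall(DMDAVecGetArray(da, g, &a));
  for (k = zs; k < zs + zm; k++)
    for (j = ys; j < ys + ym; j++)
      for (i = xs; i < xs + xm; i++) a[k][j][i] = (PetscScalar)i;
  PetscCall(DMDAVecRestoreArray(da, g, &a));

  /* scatter to a local (ghosted) vector; with mirroring the ghost at i = -1 should hold the value from i = 1 */
  PetscCall(DMGetLocalVector(da, &l));
  PetscCall(DMGlobalToLocalBegin(da, g, INSERT_VALUES, l));
  PetscCall(DMGlobalToLocalEnd(da, g, INSERT_VALUES, l));
  PetscCall(DMDAGetGhostCorners(da, &gxs, &gys, &gzs, &gxm, &gym, &gzm));
  PetscCall(DMDAVecGetArray(da, l, &a));
  if (gxs < 0) PetscCall(PetscPrintf(PETSC_COMM_SELF, "ghost value at i=-1 (first owned j,k): %g ; expect 1.0 if mirrored\n", (double)PetscRealPart(a[zs][ys][-1])));
  PetscCall(DMDAVecRestoreArray(da, l, &a));
  PetscCall(DMRestoreLocalVector(da, &l));

  PetscCall(VecDestroy(&g));
  PetscCall(DMDestroy(&da));
  PetscCall(PetscFinalize());
  return 0;
}
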
URL: From edoardo.centofanti01 at universitadipavia.it Fri Jan 5 04:40:19 2024 From: edoardo.centofanti01 at universitadipavia.it (Edoardo Centofanti) Date: Fri, 5 Jan 2024 11:40:19 +0100 Subject: [petsc-users] Hypre freezing with Mat type mpiaij and Vec type mpi Message-ID: Dear all, I have a code running both on GPU and CPU. This code has both cuda kernels and calls to PETSc KSP and related PC. In particular, I am trying to perform tests with Hypre BoomerAMG both on CPU and GPU. In order to do that, on CPU I am running the code with -mat_type mpiaij and -vec_type mpi, while on GPU I am using respectively aijcusparse and cuda. The configuration ran for PETSc (version is 3.20) is ./configure PETSC_ARCH=arch-linux-cuda --with-cuda --download-mumps --download-hypre --with-debugging=0 --download-scalapack --download-parmetis --download-metis --download-fblaslapack=1 --download-mpich --download-make --download-cmake My problem is that when I try to run my code on GPU it works well, while on CPU with mat_type mpiaij and -vec_type mpi it works regularly until the call to Hypre, then freezes (I have to kill it myself), while with GAMG it works on CPU with the same configuration (and the same code, just PC is changed). On another machine running PETSc version 3.17 everything worked smoothly with the same code and the same configuration, also on Hypre. Do you have any insights on what is happening? Best regards, Edoardo -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Fri Jan 5 06:46:47 2024 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 5 Jan 2024 07:46:47 -0500 Subject: [petsc-users] Hypre freezing with Mat type mpiaij and Vec type mpi In-Reply-To: References: Message-ID: On Fri, Jan 5, 2024 at 5:40?AM Edoardo Centofanti < edoardo.centofanti01 at universitadipavia.it> wrote: > Dear all, > > I have a code running both on GPU and CPU. This code has both cuda kernels > and calls to PETSc KSP and related PC. In particular, I am trying to > perform tests with Hypre BoomerAMG both on CPU and GPU. In order to do > that, on CPU I am running the code with -mat_type mpiaij and -vec_type mpi, > while on GPU I am using respectively aijcusparse and cuda. > > The configuration ran for PETSc (version is 3.20) is > ./configure PETSC_ARCH=arch-linux-cuda --with-cuda --download-mumps > --download-hypre --with-debugging=0 --download-scalapack > --download-parmetis --download-metis --download-fblaslapack=1 > --download-mpich --download-make --download-cmake > > My problem is that when I try to run my code on GPU it works well, while > on CPU with mat_type mpiaij and -vec_type mpi it works regularly until the > call to Hypre, then freezes (I have to kill it myself), while with GAMG it > works on CPU with the same configuration (and the same code, just PC is > changed). > On another machine running PETSc version 3.17 everything worked smoothly > with the same code and the same configuration, also on Hypre. > Can you reproduce this error on this machine with 3.20? If yes you do a git bisect to find the commit that causes this. That would be a good start. A stack trace would be helpful. You can run this in a debugger and see where you are hung. A GUI debugger is best for this (DDT or Totalview) but a command line debugger is fine if you can do this in a serial run (Control-C will stop the code and give you a prompt and you can then see the stack). Mark > Do you have any insights on what is happening? 
> > Best regards, > Edoardo > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.khurana22 at imperial.ac.uk Fri Jan 5 06:52:10 2024 From: p.khurana22 at imperial.ac.uk (Khurana, Parv) Date: Fri, 5 Jan 2024 12:52:10 +0000 Subject: [petsc-users] Hypre BoomerAMG settings options database Message-ID: Hello PETSc users community, Happy new year! Thank you for the community support as always. I am using BoomerAMG for my research, and it is interfaced to my software via PETSc. I can only use options database keys as of now to tweak the settings I want for the AMG solve. I want to control the number of smoothener iterations at pre/post step for a given AMG cycle. I am looking for an options database key which helps me control this. I am not sure whether this is possible directly via the keys (Line 365: https://www.mcs.anl.gov/petsc/petsc-3.5.4/src/ksp/pc/impls/hypre/hypre.c.html). My comprehension of the current setup is that I have 1 smoothener iteration at every coarsening step. My aim is to do two pre and 2 post smoothening steps using the SSOR smoothener. BoomerAMG SOLVER PARAMETERS: Maximum number of cycles: 1 Stopping Tolerance: 0.000000e+00 Cycle type (1 = V, 2 = W, etc.): 1 Relaxation Parameters: Visiting Grid: down up coarse Number of sweeps: 1 1 1 Type 0=Jac, 3=hGS, 6=hSGS, 9=GE: 6 6 9 Point types, partial sweeps (1=C, -1=F): Pre-CG relaxation (down): 1 -1 Post-CG relaxation (up): -1 1 Coarsest grid: 0 PETSC settings I am using currently: -ksp_type preonly -pc_type hypre -pc_hypre_type boomeramg -pc_hypre_boomeramg_coarsen_type hmis -pc_hypre_boomeramg_relax_type_all symmetric-sor/jacobi -pc_hypre_boomeramg_strong_threshold 0.7 -pc_hypre_boomeramg_interp_type ext+i -pc_hypre_boomeramg_P_max 2 -pc_hypre_boomeramg_truncfactor 0.3 Thanks and Best Parv Khurana -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Fri Jan 5 06:53:03 2024 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 5 Jan 2024 07:53:03 -0500 Subject: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code In-Reply-To: References: <17B6144E-4F3A-4173-9818-4B03669736C4@petsc.dev> Message-ID: This is a segv. As Matt said, you need to use a debugger for this or add print statements to narrow down the place where this happens. You will need to learn how to use debuggers to do your project so you might as well start now. If you have a machine with a GUI debugger that is easier but command line debuggers are good to learn anyway. I tend to run debuggers directly (eg, lldb ./a.out -- program-args ...) and use a GUI debugger (eg, Totalview or DDT) if available. Mark On Wed, Dec 20, 2023 at 10:02?PM Shatanawi, Sawsan Muhammad via petsc-users wrote: > Hello Matthew, > > Thank you for your help. I am sorry that I keep coming back with my error > messages, but I reached a point that I don't know how to fix them, and I > don't understand them easily. 
> The list of errors is getting shorter, now I am getting the attached error > messages > > Thank you again, > > Sawsan > ------------------------------ > *From:* Matthew Knepley > *Sent:* Wednesday, December 20, 2023 6:54 PM > *To:* Shatanawi, Sawsan Muhammad > *Cc:* Barry Smith ; petsc-users at mcs.anl.gov < > petsc-users at mcs.anl.gov> > *Subject:* Re: [petsc-users] Help with Integrating PETSc into Fortran > Groundwater Flow Simulation Code > > > *[EXTERNAL EMAIL]* > On Wed, Dec 20, 2023 at 9:49?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: > > Hello Barry, > > Thank you a lot for your help, Now I am getting the attached error message. > > > Do not destroy the PC from KSPGetPC() > > THanks, > > Matt > > > Bests, > Sawsan > ------------------------------ > *From:* Barry Smith > *Sent:* Wednesday, December 20, 2023 6:32 PM > *To:* Shatanawi, Sawsan Muhammad > *Cc:* Mark Adams ; petsc-users at mcs.anl.gov < > petsc-users at mcs.anl.gov> > *Subject:* Re: [petsc-users] Help with Integrating PETSc into Fortran > Groundwater Flow Simulation Code > > > *[EXTERNAL EMAIL]* > > Instead of > > call PCCreate(PETSC_COMM_WORLD, pc, ierr) > call PCSetType(pc, PCILU,ierr) ! Choose a preconditioner type (ILU) > call KSPSetPC(ksp, pc,ierr) ! Associate the preconditioner with the > KSP solver > > do > > call KSPGetPC(ksp,pc,ierr) > call PCSetType(pc, PCILU,ierr) > > Do not call KSPSetUp(). It will be taken care of automatically during the > solve > > > > On Dec 20, 2023, at 8:52?PM, Shatanawi, Sawsan Muhammad via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Hello, > I don't think that I set preallocation values when I created the matrix, > would you please have look at my code. It is just the petsc related part > from my code. > I was able to fix some of the error messages. Now I have a new set of > error messages related to the KSP solver (attached) > > I appreciate your help > > Sawsan > ------------------------------ > *From:* Mark Adams > *Sent:* Wednesday, December 20, 2023 6:44 AM > *To:* Shatanawi, Sawsan Muhammad > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] Help with Integrating PETSc into Fortran > Groundwater Flow Simulation Code > > *[EXTERNAL EMAIL]* > Did you set preallocation values when you created the matrix? > Don't do that. > > On Wed, Dec 20, 2023 at 9:36?AM Shatanawi, Sawsan Muhammad < > sawsan.shatanawi at wsu.edu> wrote: > > Hello, > > I am trying to create a sparse matrix( which is as I believe a zero > matrix) then adding some nonzero elements to it over a loop, then > assembling it > > Get Outlook for iOS > > ------------------------------ > *From:* Mark Adams > *Sent:* Wednesday, December 20, 2023 2:48 AM > *To:* Shatanawi, Sawsan Muhammad > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] Help with Integrating PETSc into Fortran > Groundwater Flow Simulation Code > > *[EXTERNAL EMAIL]* > I am guessing that you are creating a matrix, adding to it, finalizing it > ("assembly"), and then adding to it again, which is fine, but you are > adding new non-zeros to the sparsity pattern. > If this is what you want then you can tell the matrix to let you do that. > Otherwise you have a bug. > > Mark > > On Tue, Dec 19, 2023 at 9:50?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: > > Hello everyone, > > I hope this email finds you well. > > My Name is Sawsan Shatanawi, and I am currently working on developing a > Fortran code for simulating groundwater flow in a 3D system. 
The code > involves solving a nonlinear system, and I have created the matrix to be > solved using the PCG solver and Picard iteration. However, when I tried > to assign it as a PETSc matrix I started getting a lot of error messages. > > I am kindly asking if someone can help me, I would be happy to share my > code with him/her. > > Please find the attached file contains a list of errors I have gotten > > Thank you in advance for your time and assistance. > > Best regards, > > Sawsan > > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Fri Jan 5 09:20:47 2024 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 5 Jan 2024 10:20:47 -0500 Subject: [petsc-users] Hypre BoomerAMG settings options database In-Reply-To: References: Message-ID: <30697B53-8E8E-47D3-B3B7-15CF9B9F0D57@petsc.dev> Yes, the handling of BoomerAMG options starts at line 365. If we don't support what you want but hypre has a function call that allows one to set the values then the option could easily be added to the PETSc options database here either by you (with a merge request) or us. So I would say check the hypre docs. Just let us know what BoomerAMG function is missing from the code. Barry > On Jan 5, 2024, at 7:52?AM, Khurana, Parv wrote: > > Hello PETSc users community, > > Happy new year! Thank you for the community support as always. > > I am using BoomerAMG for my research, and it is interfaced to my software via PETSc. I can only use options database keys as of now to tweak the settings I want for the AMG solve. > > I want to control the number of smoothener iterations at pre/post step for a given AMG cycle. I am looking for an options database key which helps me control this. I am not sure whether this is possible directly via the keys (Line 365: https://www.mcs.anl.gov/petsc/petsc-3.5.4/src/ksp/pc/impls/hypre/hypre.c.html). My comprehension of the current setup is that I have 1 smoothener iteration at every coarsening step. My aim is to do two pre and 2 post smoothening steps using the SSOR smoothener. > > BoomerAMG SOLVER PARAMETERS: > > Maximum number of cycles: 1 > Stopping Tolerance: 0.000000e+00 > Cycle type (1 = V, 2 = W, etc.): 1 > > Relaxation Parameters: > Visiting Grid: down up coarse > Number of sweeps: 1 1 1 > Type 0=Jac, 3=hGS, 6=hSGS, 9=GE: 6 6 9 > Point types, partial sweeps (1=C, -1=F): > Pre-CG relaxation (down): 1 -1 > Post-CG relaxation (up): -1 1 > Coarsest grid: 0 > > PETSC settings I am using currently: > > -ksp_type preonly > -pc_type hypre > -pc_hypre_type boomeramg > -pc_hypre_boomeramg_coarsen_type hmis > -pc_hypre_boomeramg_relax_type_all symmetric-sor/jacobi > -pc_hypre_boomeramg_strong_threshold 0.7 > -pc_hypre_boomeramg_interp_type ext+i > -pc_hypre_boomeramg_P_max 2 > -pc_hypre_boomeramg_truncfactor 0.3 > > Thanks and Best > Parv Khurana -------------- next part -------------- An HTML attachment was scrubbed... 
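For reference, the sweep counts asked about above look reachable purely from the options database. The option names below are a sketch based on the PETSc BoomerAMG interface (one of them, -pc_hypre_boomeramg_grid_sweeps_all, also comes up later in this thread) and should be confirmed against the -help output of the PETSc version actually in use:

-pc_hypre_boomeramg_grid_sweeps_all 2                      (two sweeps on every part of the cycle)

or, to control the down and up parts of the cycle separately:

-pc_hypre_boomeramg_grid_sweeps_down 2
-pc_hypre_boomeramg_grid_sweeps_up 2
-pc_hypre_boomeramg_relax_type_down symmetric-sor/jacobi
-pc_hypre_boomeramg_relax_type_up symmetric-sor/jacobi

Combined with the -pc_hypre_boomeramg_relax_type_all symmetric-sor/jacobi already in use, changing only the sweep counts should give the two pre- and two post-smoothing steps described above.
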
URL: From junchao.zhang at gmail.com Fri Jan 5 09:21:40 2024 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Fri, 5 Jan 2024 09:21:40 -0600 Subject: [petsc-users] Hypre freezing with Mat type mpiaij and Vec type mpi In-Reply-To: References: Message-ID: On Fri, Jan 5, 2024 at 4:40?AM Edoardo Centofanti < edoardo.centofanti01 at universitadipavia.it> wrote: > Dear all, > > I have a code running both on GPU and CPU. This code has both cuda kernels > and calls to PETSc KSP and related PC. In particular, I am trying to > perform tests with Hypre BoomerAMG both on CPU and GPU. In order to do > that, on CPU I am running the code with -mat_type mpiaij and -vec_type mpi, > while on GPU I am using respectively aijcusparse and cuda. > > The configuration ran for PETSc (version is 3.20) is > ./configure PETSC_ARCH=arch-linux-cuda --with-cuda --download-mumps > --download-hypre --with-debugging=0 --download-scalapack > --download-parmetis --download-metis --download-fblaslapack=1 > --download-mpich --download-make --download-cmake > > My problem is that when I try to run my code on GPU it works well, while > on CPU with mat_type mpiaij and -vec_type mpi it works regularly until the > call to Hypre, then freezes (I have to kill it myself), while with GAMG it > works on CPU with the same configuration (and the same code, just PC is > changed). > Probably because if hypre is configured with GPU support, even petsc is running on CPU (with -mat_type mpiaij and -vec_type mpi), hypre is still trying to run on GPU, causing the hanging problem. The petsc/hypre interface is not able to support both CPU/GPU with the same build. > On another machine running PETSc version 3.17 everything worked smoothly > with the same code and the same configuration, also on Hypre. > I guess with petsc-3.17, you also used a different hypre version (downloaded by petsc). > Do you have any insights on what is happening? > > Best regards, > Edoardo > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From edoardo.centofanti01 at universitadipavia.it Fri Jan 5 09:32:54 2024 From: edoardo.centofanti01 at universitadipavia.it (Edoardo Centofanti) Date: Fri, 5 Jan 2024 16:32:54 +0100 Subject: [petsc-users] Hypre freezing with Mat type mpiaij and Vec type mpi In-Reply-To: References: Message-ID: Il giorno ven 5 gen 2024 alle ore 16:21 Junchao Zhang < junchao.zhang at gmail.com> ha scritto: > > > > On Fri, Jan 5, 2024 at 4:40?AM Edoardo Centofanti < > edoardo.centofanti01 at universitadipavia.it> wrote: > >> Dear all, >> >> I have a code running both on GPU and CPU. This code has both cuda >> kernels and calls to PETSc KSP and related PC. In particular, I am trying >> to perform tests with Hypre BoomerAMG both on CPU and GPU. In order to do >> that, on CPU I am running the code with -mat_type mpiaij and -vec_type mpi, >> while on GPU I am using respectively aijcusparse and cuda. >> >> The configuration ran for PETSc (version is 3.20) is >> ./configure PETSC_ARCH=arch-linux-cuda --with-cuda --download-mumps >> --download-hypre --with-debugging=0 --download-scalapack >> --download-parmetis --download-metis --download-fblaslapack=1 >> --download-mpich --download-make --download-cmake >> >> My problem is that when I try to run my code on GPU it works well, while >> on CPU with mat_type mpiaij and -vec_type mpi it works regularly until the >> call to Hypre, then freezes (I have to kill it myself), while with GAMG it >> works on CPU with the same configuration (and the same code, just PC is >> changed). 
>> > Probably because if hypre is configured with GPU support, even petsc is > running on CPU (with -mat_type mpiaij and -vec_type mpi), hypre is still > trying to run on GPU, causing the hanging problem. The petsc/hypre > interface is not able to support both CPU/GPU with the same build. > > >> On another machine running PETSc version 3.17 everything worked smoothly >> with the same code and the same configuration, also on Hypre. >> > I guess with petsc-3.17, you also used a different hypre version > (downloaded by petsc). > Yes, this is exactly what I concluded after a bit of debugging and recompiling. In the new version it seems that CPU/GPU with the same build is not supported anymore. Since the CUDA kernels I have in my code do not interfere with Hypre, the only workaround I found was removing kernels from my code and performing an only-CPU run. It slows down a bit overall, but nothing dramatic. Anyway, in another test with version 3.18 I encountered the same problem, but the program crashes and throws "Invalid options" error, while in 3.20 it just hangs. Thanks, Edoardo > > > >> Do you have any insights on what is happening? >> >> Best regards, >> Edoardo >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Fri Jan 5 09:45:04 2024 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Fri, 5 Jan 2024 09:45:04 -0600 Subject: [petsc-users] Hypre freezing with Mat type mpiaij and Vec type mpi In-Reply-To: References: Message-ID: On Fri, Jan 5, 2024 at 9:33?AM Edoardo Centofanti < edoardo.centofanti01 at universitadipavia.it> wrote: > > > Il giorno ven 5 gen 2024 alle ore 16:21 Junchao Zhang < > junchao.zhang at gmail.com> ha scritto: > >> >> >> >> On Fri, Jan 5, 2024 at 4:40?AM Edoardo Centofanti < >> edoardo.centofanti01 at universitadipavia.it> wrote: >> >>> Dear all, >>> >>> I have a code running both on GPU and CPU. This code has both cuda >>> kernels and calls to PETSc KSP and related PC. In particular, I am trying >>> to perform tests with Hypre BoomerAMG both on CPU and GPU. In order to do >>> that, on CPU I am running the code with -mat_type mpiaij and -vec_type mpi, >>> while on GPU I am using respectively aijcusparse and cuda. >>> >>> The configuration ran for PETSc (version is 3.20) is >>> ./configure PETSC_ARCH=arch-linux-cuda --with-cuda --download-mumps >>> --download-hypre --with-debugging=0 --download-scalapack >>> --download-parmetis --download-metis --download-fblaslapack=1 >>> --download-mpich --download-make --download-cmake >>> >>> My problem is that when I try to run my code on GPU it works well, while >>> on CPU with mat_type mpiaij and -vec_type mpi it works regularly until the >>> call to Hypre, then freezes (I have to kill it myself), while with GAMG it >>> works on CPU with the same configuration (and the same code, just PC is >>> changed). >>> >> Probably because if hypre is configured with GPU support, even petsc is >> running on CPU (with -mat_type mpiaij and -vec_type mpi), hypre is still >> trying to run on GPU, causing the hanging problem. The petsc/hypre >> interface is not able to support both CPU/GPU with the same build. >> >> >>> On another machine running PETSc version 3.17 everything worked smoothly >>> with the same code and the same configuration, also on Hypre. >>> >> I guess with petsc-3.17, you also used a different hypre version >> (downloaded by petsc). >> > Yes, this is exactly what I concluded after a bit of debugging and > recompiling. 
In the new version it seems that CPU/GPU with the same build > is not supported anymore. Since the CUDA kernels I have in my code do not > interfere with Hypre, the only workaround I found was removing kernels from > my code and performing an only-CPU run. It slows down a bit overall, but > nothing dramatic. Anyway, in another test with version 3.18 I encountered > the same problem, but the program crashes and throws "Invalid options" > error, while in 3.20 it just hangs. > Improving the petsc/hypre interface to support both CPU/GPU is WIP. > Thanks, > Edoardo > >> >> >> >>> Do you have any insights on what is happening? >>> >>> Best regards, >>> Edoardo >>> >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From bldenton at buffalo.edu Fri Jan 5 10:04:50 2024 From: bldenton at buffalo.edu (Brandon Denton) Date: Fri, 5 Jan 2024 16:04:50 +0000 Subject: [petsc-users] Saving Multiple DMPlexes to a single HDF5 File Message-ID: Happy New Year! I'm attempting to save multiple DMPlexes to a single HDF5 file. During my solution, I make modification to the DMPlex and would like to track and store these changes as the solution progresses. Currently, I the program writes the DMPlex to the HDF5 file the first time I call DMView() but throws an error the 2nd time I call DMView() to save the modified DMPlex. In my program, I would like to save more than 2 modfied DMPlexes to the same file. My current attempt at this process is outlined below with the error message copied at the end. Any help you could give to get me over this hump is appreciated. Thank you. Brandon ?--- Relevant Current Code Segments ---- /* Command Line Variable */ -dm_view_test hdf5:mesh_test_all.h5 /* Create initial DMPlex from file and Name it */ PetscCall(DMPlexCreateFromFile(comm, ctx.filename, "EGADS", PETSC_TRUE, &dmNozzle)); PetscCall(PetscObjectSetName((PetscObject)dmNozzle, "nozzle_mesh")); /* Get viewer from command line options. If found, then get and check ViewerType and ViewerFormat */ /* Change ViewerFormat to PETSC_VIEWER_HDF5_PETSC per DMView() documation for saving multiple DMPlexes */ /* to a single file. Note: ViewerFormat Defaults to PETSC_VIEWER_DEFAULT when PetscOPtionsGetViewer() called */ PetscCall(PetscOptionsGetViewer(PETSC_COMM_WORLD, NULL, NULL, "-dm_view_test", &viewer, &format, &flg)); if (flg) { PetscCall(PetscPrintf(PETSC_COMM_SELF, " flg = TRUE \n")); PetscCall(PetscViewerGetType(viewer, &viewType)); PetscCall(PetscViewerPushFormat(viewer, PETSC_VIEWER_HDF5_PETSC)); // PetscOptionsGetViewer returns &format as PETSC_VIEWER_DEFAULT need PETSC_VIEWER_HDF5_PETSC to save multiple DMPlexes in a single .h5 file. 
PetscCall(PetscViewerGetFormat(viewer, &viewFormat)); PetscCall(PetscPrintf(PETSC_COMM_SELF, " viewer type = %s \n", viewType)); PetscCall(PetscPrintf(PETSC_COMM_SELF, " viewer format = %d \n", viewFormat)); } /* Save Initial DMPlex to HDF5 file */ PetscCall(DMView(dmNozzle, viewer)); /* Code makes modifications to DMPlex not shown here */ /* Set new name for modified DMPlex and attempt to write to HDF5 file - Fails */ PetscCall(PetscObjectSetName((PetscObject)dmNozzle, "nozzle_meshes_1")); PetscCall(DMView(dmNozzle, viewer)); // <-- Fails here ---- End of Relevant Code Segments ---- ----- ERROR MESSAGE ----- HDF5-DIAG: Error detected in HDF5 (1.12.1) thread 0: #000: H5D.c line 779 in H5Dset_extent(): unable to set dataset extent major: Dataset minor: Can't set value #001: H5VLcallback.c line 2326 in H5VL_dataset_specific(): unable to execute dataset specific callback major: Virtual Object Layer minor: Can't operate on object #002: H5VLcallback.c line 2289 in H5VL__dataset_specific(): unable to execute dataset specific callback major: Virtual Object Layer minor: Can't operate on object #003: H5VLnative_dataset.c line 325 in H5VL__native_dataset_specific(): unable to set extent of dataset major: Dataset minor: Unable to initialize object #004: H5Dint.c line 3045 in H5D__set_extent(): unable to modify size of dataspace major: Dataset minor: Unable to initialize object #005: H5S.c line 1847 in H5S_set_extent(): dimension cannot exceed the existing maximal size (new: 386 max: 98) major: Dataspace minor: Bad value [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Error in external library [0]PETSC ERROR: Error in HDF5 call H5Dset_extent() Status -1 [0]PETSC ERROR: WARNING! There are unused option(s) set! Could be the program crashed before usage or a spelling mistake, etc! 
[0]PETSC ERROR: Option left: name:-dm_plex_geom_without_snap_to_geom value: 0 source: command line [0]PETSC ERROR: Option left: name:-dm_view1 value: hdf5:mesh_minSA_vol_abstract.h5 source: command line [0]PETSC ERROR: Option left: name:-dm_view10 value: hdf5:mesh_minSA_itr5.h5 source: command line [0]PETSC ERROR: Option left: name:-dm_view11 value: hdf5:mesh_minSA_itr20.h5 source: command line [0]PETSC ERROR: Option left: name:-dm_view12 value: hdf5:mesh_minSA_itr50.h5 source: command line [0]PETSC ERROR: Option left: name:-dm_view13 value: hdf5:mesh_minSA_itr100.h5 source: command line [0]PETSC ERROR: Option left: name:-dm_view14 value: hdf5:mesh_minSA_itr150.h5 source: command line [0]PETSC ERROR: Option left: name:-dm_view15 value: hdf5:mesh_minSA_itr200.h5 source: command line [0]PETSC ERROR: Option left: name:-dm_view2 value: hdf5:mesh_minSA_vol_abstract_inflated.h5 source: command line [0]PETSC ERROR: Option left: name:-dm_view22 value: hdf5:mesh_minSA_itr200r1.h5 source: command line [0]PETSC ERROR: Option left: name:-dm_view23 value: hdf5:mesh_minSA_itr200r2.h5 source: command line [0]PETSC ERROR: Option left: name:-dm_view3 value: hdf5:mesh_minSA_vol_abstract_Refine.h5 source: command line [0]PETSC ERROR: Option left: name:-dm_view4 value: hdf5:mesh_minSA_vol_abstract_Refine2.h5 source: command line [0]PETSC ERROR: Option left: name:-dm_view5 value: hdf5:mesh_minSA_vol_abstract_Refine3.h5 source: command line [0]PETSC ERROR: Option left: name:-dm_view6 value: hdf5:mesh_minSA_vol_abstract_Refine4.h5 source: command line [0]PETSC ERROR: Option left: name:-dm_view8 value: hdf5:mesh_minSA_itr2.h5 source: command line [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. [0]PETSC ERROR: Petsc Development GIT revision: v3.18.5-3061-g8635db7762a GIT Date: 2023-10-22 20:49:54 -0400 [0]PETSC ERROR: ./ex17 on a named XPS. 
by bdenton Fri Jan 5 10:02:19 2024 [0]PETSC ERROR: Configure options --with-make-np=16 --prefix=/mnt/c/Users/Brandon/software/libs/petsc/3.20.0-gitlab/gcc/11.2.0/mpich/3.4.2/openblas/0.3.17/opt --with-debugging=false --COPTFLAGS="-O3 -mavx" --CXXOPTFLAGS="-O3 -mavx" --FOPTFLAGS=-O3 --with-shared-libraries=1 --with-mpi-dir=/mnt/c/Users/Brandon/software/libs/mpich/3.4.2/gcc/11.2.0 --with-mumps=true --download-mumps=1 --with-metis=true --download-metis=1 --with-parmetis=true --download-parmetis=1 --with-superlu=true --download-superlu=1 --with-superludir=true --download-superlu_dist=1 --with-blacs=true --download-blacs=1 --with-scalapack=true --download-scalapack=1 --with-hypre=true --download-hypre=1 --with-hdf5-dir=/mnt/c/Users/Brandon/software/libs/hdf5/1.12.1/gcc/11.2.0 --with-valgrind-dir=/mnt/c/Users/Brandon/software/apps/valgrind/3.14.0 --with-blas-lib="[/mnt/c/Users/Brandon/software/libs/openblas/0.3.17/gcc/11.2.0/lib/libopenblas.so]" --with-lapack-lib="[/mnt/c/Users/Brandon/software/libs/openblas/0.3.17/gcc/11.2.0/lib/libopenblas.so]" --LDFLAGS= --with-tetgen=true --download-tetgen=1 --download-ctetgen=1 --download-opencascade=1 --download-egads [0]PETSC ERROR: #1 VecView_MPI_HDF5() at /mnt/c/Users/Brandon/software/builddir/petsc-3.20.0-gitlab/src/vec/vec/impls/mpi/pdvec.c:664 [0]PETSC ERROR: #2 VecView_Seq() at /mnt/c/Users/Brandon/software/builddir/petsc-3.20.0-gitlab/src/vec/vec/impls/seq/bvec2.c:559 [0]PETSC ERROR: #3 VecView() at /mnt/c/Users/Brandon/software/builddir/petsc-3.20.0-gitlab/src/vec/vec/interface/vector.c:803 [0]PETSC ERROR: #4 DMPlexCoordinatesView_HDF5_Legacy_Private() at /mnt/c/Users/Brandon/software/builddir/petsc-3.20.0-gitlab/src/dm/impls/plex/plexhdf5.c:1022 [0]PETSC ERROR: #5 DMPlexCoordinatesView_HDF5_Internal() at /mnt/c/Users/Brandon/software/builddir/petsc-3.20.0-gitlab/src/dm/impls/plex/plexhdf5.c:1045 [0]PETSC ERROR: #6 DMPlexView_HDF5_Internal() at /mnt/c/Users/Brandon/software/builddir/petsc-3.20.0-gitlab/src/dm/impls/plex/plexhdf5.c:1301 [0]PETSC ERROR: #7 DMView_Plex() at /mnt/c/Users/Brandon/software/builddir/petsc-3.20.0-gitlab/src/dm/impls/plex/plex.c:1878 [0]PETSC ERROR: #8 DMView() at /mnt/c/Users/Brandon/software/builddir/petsc-3.20.0-gitlab/src/dm/interface/dm.c:982 [0]PETSC ERROR: #9 main() at /mnt/c/Users/Brandon/Documents/School/Dissertation/Software/EGADS-dev/ex17/ex17.c:374 [0]PETSC ERROR: PETSc Option Table entries: [0]PETSC ERROR: -dm_plex_geom_print_model 1 (source: command line) [0]PETSC ERROR: -dm_plex_geom_shape_opt 1 (source: command line) [0]PETSC ERROR: -dm_plex_geom_tess_model 0 (source: command line) [0]PETSC ERROR: -dm_plex_geom_without_snap_to_geom 0 (source: command line) [0]PETSC ERROR: -dm_refine 1 (source: command line) [0]PETSC ERROR: -dm_view hdf5:mesh_minSA_abstract.h5 (source: command line) [0]PETSC ERROR: -dm_view1 hdf5:mesh_minSA_vol_abstract.h5 (source: command line) [0]PETSC ERROR: -dm_view10 hdf5:mesh_minSA_itr5.h5 (source: command line) [0]PETSC ERROR: -dm_view11 hdf5:mesh_minSA_itr20.h5 (source: command line) [0]PETSC ERROR: -dm_view12 hdf5:mesh_minSA_itr50.h5 (source: command line) [0]PETSC ERROR: -dm_view13 hdf5:mesh_minSA_itr100.h5 (source: command line) [0]PETSC ERROR: -dm_view14 hdf5:mesh_minSA_itr150.h5 (source: command line) [0]PETSC ERROR: -dm_view15 hdf5:mesh_minSA_itr200.h5 (source: command line) [0]PETSC ERROR: -dm_view2 hdf5:mesh_minSA_vol_abstract_inflated.h5 (source: command line) [0]PETSC ERROR: -dm_view22 hdf5:mesh_minSA_itr200r1.h5 (source: command line) [0]PETSC ERROR: -dm_view23 
hdf5:mesh_minSA_itr200r2.h5 (source: command line) [0]PETSC ERROR: -dm_view3 hdf5:mesh_minSA_vol_abstract_Refine.h5 (source: command line) [0]PETSC ERROR: -dm_view4 hdf5:mesh_minSA_vol_abstract_Refine2.h5 (source: command line) [0]PETSC ERROR: -dm_view5 hdf5:mesh_minSA_vol_abstract_Refine3.h5 (source: command line) [0]PETSC ERROR: -dm_view6 hdf5:mesh_minSA_vol_abstract_Refine4.h5 (source: command line) [0]PETSC ERROR: -dm_view7 hdf5:mesh_minSA_itr1.h5 (source: command line) [0]PETSC ERROR: -dm_view8 hdf5:mesh_minSA_itr2.h5 (source: command line) [0]PETSC ERROR: -dm_view_test hdf5:mesh_test_all.h5 (source: command line) [0]PETSC ERROR: -filename ../examples/abstract_minSA.stp (source: command line) [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- application called MPI_Abort(MPI_COMM_SELF, 76) - process 0 [unset]: write_line error; fd=-1 buf=:cmd=abort exitcode=76 : system msg for write_line failure : Bad file descriptor --- END OF ERROR MESSAGE --- -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Sat Jan 6 08:15:04 2024 From: mfadams at lbl.gov (Mark Adams) Date: Sat, 6 Jan 2024 09:15:04 -0500 Subject: [petsc-users] Hypre BoomerAMG settings options database In-Reply-To: <30697B53-8E8E-47D3-B3B7-15CF9B9F0D57@petsc.dev> References: <30697B53-8E8E-47D3-B3B7-15CF9B9F0D57@petsc.dev> Message-ID: Does this work for you? -pc_hypre_boomeramg_grid_sweeps_all 2 The comment in our code says SSOR is the default but it looks like it is really "hSGS" I thought it was an L1 Jacobi, but you would want to ask Hypre about this. Mark On Fri, Jan 5, 2024 at 10:21?AM Barry Smith wrote: > > Yes, the handling of BoomerAMG options starts at line 365. If we don't > support what you want but hypre has a function call that allows one to set > the values then the option could easily be added to the PETSc options > database here either by you (with a merge request) or us. So I would say > check the hypre docs. > > Just let us know what BoomerAMG function is missing from the code. > > Barry > > > On Jan 5, 2024, at 7:52?AM, Khurana, Parv > wrote: > > Hello PETSc users community, > > Happy new year! Thank you for the community support as always. > > I am using BoomerAMG for my research, and it is interfaced to my software > via PETSc. I can only use options database keys as of now to tweak the > settings I want for the AMG solve. > > I want to control the number of smoothener iterations at pre/post step for > a given AMG cycle. I am looking for an options database key which helps me > control this. I am not sure whether this is possible directly via the keys > (Line 365: > https://www.mcs.anl.gov/petsc/petsc-3.5.4/src/ksp/pc/impls/hypre/hypre.c.html). > My comprehension of the current setup is that I have 1 smoothener iteration > at every coarsening step. My aim is to do two pre and 2 post smoothening > steps using the SSOR smoothener. 
> > BoomerAMG SOLVER PARAMETERS: > > Maximum number of cycles: 1 > Stopping Tolerance: 0.000000e+00 > Cycle type (1 = V, 2 = W, etc.): 1 > > Relaxation Parameters: > Visiting Grid: down up coarse > Number of sweeps: 1 1 1 > Type 0=Jac, 3=hGS, 6=hSGS, 9=GE: 6 6 9 > Point types, partial sweeps (1=C, -1=F): > Pre-CG relaxation (down): 1 -1 > Post-CG relaxation (up): -1 1 > Coarsest grid: 0 > > PETSC settings I am using currently: > > -ksp_type preonly > -pc_type hypre > -pc_hypre_type boomeramg > -pc_hypre_boomeramg_coarsen_type hmis > -pc_hypre_boomeramg_relax_type_all symmetric-sor/jacobi > -pc_hypre_boomeramg_strong_threshold 0.7 > -pc_hypre_boomeramg_interp_type ext+i > -pc_hypre_boomeramg_P_max 2 > -pc_hypre_boomeramg_truncfactor 0.3 > > Thanks and Best > Parv Khurana > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre at joliv.et Sat Jan 6 11:50:31 2024 From: pierre at joliv.et (Pierre Jolivet) Date: Sat, 6 Jan 2024 18:50:31 +0100 Subject: [petsc-users] Hypre BoomerAMG settings options database In-Reply-To: References: <30697B53-8E8E-47D3-B3B7-15CF9B9F0D57@petsc.dev> Message-ID: > On 6 Jan 2024, at 3:15?PM, Mark Adams wrote: > > Does this work for you? > -pc_hypre_boomeramg_grid_sweeps_all 2 > The comment in our code says SSOR is the default but it looks like it is really "hSGS" > I thought it was an L1 Jacobi, but you would want to ask Hypre about this. HYPRE?s default settings are not the same as the ones we set in PETSc as default, so do not ask HYPRE people (about this particular issue). Thanks, Pierre > Mark > > > On Fri, Jan 5, 2024 at 10:21?AM Barry Smith > wrote: >> >> Yes, the handling of BoomerAMG options starts at line 365. If we don't support what you want but hypre has a function call that allows one to set the values then the option could easily be added to the PETSc options database here either by you (with a merge request) or us. So I would say check the hypre docs. >> >> Just let us know what BoomerAMG function is missing from the code. >> >> Barry >> >> >>> On Jan 5, 2024, at 7:52?AM, Khurana, Parv > wrote: >>> >>> Hello PETSc users community, >>> >>> Happy new year! Thank you for the community support as always. >>> >>> I am using BoomerAMG for my research, and it is interfaced to my software via PETSc. I can only use options database keys as of now to tweak the settings I want for the AMG solve. >>> >>> I want to control the number of smoothener iterations at pre/post step for a given AMG cycle. I am looking for an options database key which helps me control this. I am not sure whether this is possible directly via the keys (Line 365: https://www.mcs.anl.gov/petsc/petsc-3.5.4/src/ksp/pc/impls/hypre/hypre.c.html). My comprehension of the current setup is that I have 1 smoothener iteration at every coarsening step. My aim is to do two pre and 2 post smoothening steps using the SSOR smoothener. 
>>> >>> BoomerAMG SOLVER PARAMETERS: >>> >>> Maximum number of cycles: 1 >>> Stopping Tolerance: 0.000000e+00 >>> Cycle type (1 = V, 2 = W, etc.): 1 >>> >>> Relaxation Parameters: >>> Visiting Grid: down up coarse >>> Number of sweeps: 1 1 1 >>> Type 0=Jac, 3=hGS, 6=hSGS, 9=GE: 6 6 9 >>> Point types, partial sweeps (1=C, -1=F): >>> Pre-CG relaxation (down): 1 -1 >>> Post-CG relaxation (up): -1 1 >>> Coarsest grid: 0 >>> >>> PETSC settings I am using currently: >>> >>> -ksp_type preonly >>> -pc_type hypre >>> -pc_hypre_type boomeramg >>> -pc_hypre_boomeramg_coarsen_type hmis >>> -pc_hypre_boomeramg_relax_type_all symmetric-sor/jacobi >>> -pc_hypre_boomeramg_strong_threshold 0.7 >>> -pc_hypre_boomeramg_interp_type ext+i >>> -pc_hypre_boomeramg_P_max 2 >>> -pc_hypre_boomeramg_truncfactor 0.3 >>> >>> Thanks and Best >>> Parv Khurana >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From gourav.kumbhojkar at gmail.com Sat Jan 6 18:30:21 2024 From: gourav.kumbhojkar at gmail.com (Gourav Kumbhojkar) Date: Sun, 7 Jan 2024 00:30:21 +0000 Subject: [petsc-users] Neumann Boundary Condition with DMDACreate3D In-Reply-To: <0A4EB78C-997F-4978-8945-771B351B08CE@petsc.dev> References: <0A4EB78C-997F-4978-8945-771B351B08CE@petsc.dev> Message-ID: Thank you, Barry. Sorry for the late response. Yes, I was referring to the same text. I am using a star stencil. However, I don?t think the mirror condition is implemented for star stencil either. TLDR version of the whole message typed below ? I think DM_BOUNDARY_GHOSTED is not implemented correctly in 3D. It appears that ghost nodes are mirrored with boundary nodes themselves. They should mirror with the nodes next to boundary. Long version - Here?s what I?m trying to do ? Step 1 - Create a 3D DM ierr = DMDACreate3d(PETSC_COMM_WORLD, DM_BOUNDARY_MIRROR, DM_BOUNDARY_MIRROR, DM_BOUNDARY_MIRROR, DMDA_STENCIL_STAR, num_pts, num_pts, num_pts, PETSC_DECIDE, PETSC_DECIDE, PETSC_DECIDE, 1, 1, NULL, NULL, NULL, &da); CHKERRQ(ierr); Note - num_pts = 4 in my code. Step 2 ? Create a Matrix from DM ( a FDM stiffness matrix) DMCreateMatrix(da, &K); globalKMat(K, info); ?globalKMat? is a user-defined function. Here?s a snippet from this function: for (int i = info.xs; i < (info.xs + info.xm); i++){ for(int j = info.ys; j < (info.ys + info.ym); j++){ for (int k = info.zs; k < (info.zs + info.zm); k++){ ncols = 0; row.i = i; row.j = j; row.k = k; col[0].i = i; col[0].j = j; col[0].k = k; vals[ncols++] = -6.; //ncols=1 col[ncols].i = i-1; col[ncols].j = j; col[ncols].k = k; vals[ncols++] = 1.;//ncols=2 There are total 7 ?ncols?. Other than the first one all ncols have value 1 (first one is set to -6). As you can see, this step is to only build the FDM stiffness matrix. I use ?ADD_VALUES? at the end in the above function. Step 3 ? View the stiffness matrix to check the values. I use MatView for this. Here are the results ? 1. 3D DM (showing first few rows of the stiffness matrix here, the original matrix is 64x64)- Mat Object: 1 MPI processes type: seqaij row 0: (0, -3.) (1, 1.) (4, 1.) (16, 1.) row 1: (0, 1.) (1, -4.) (2, 1.) (5, 1.) (17, 1.) row 2: (1, 1.) (2, -4.) (3, 1.) (6, 1.) (18, 1.) row 3: (2, 1.) (3, -3.) (7, 1.) (19, 1.) row 4: (0, 1.) (4, -4.) (5, 1.) (8, 1.) (20, 1.) row 5: (1, 1.) (4, 1.) (5, -5.) (6, 1.) (9, 1.) (21, 1.) 1. Repeat the same steps for a 2D DM to show the difference (the entire matrix is now 16x16) Mat Object: 1 MPI processes type: seqaij row 0: (0, -4.) (1, 2.) (4, 2.) row 1: (0, 1.) (1, -4.) (2, 1.) (5, 2.) 
row 2: (1, 1.) (2, -4.) (3, 1.) (6, 2.) row 3: (2, 2.) (3, -4.) (7, 2.) row 4: (0, 1.) (4, -4.) (5, 2.) (8, 1.) row 5: (1, 1.) (4, 1.) (5, -4.) (6, 1.) (9, 1.) I suspect that when using ?DM_BOUNDARY_MIRROR? in 3D, the ghost node value is added to the boundary node itself, which would explain why row 0 of the stiffness matrix has -3 instead of -6. In principle the ghost node value should be mirrored with the node next to boundary. Clearly, there?s no issue with the 2D implementation of the mirror boundary. The row 0 values are -4, 2, and 2 as expected. Let me know if I should give any other information about this. I also thought about using DM_BOUNDARY_GHOSTED and implement the mirror boundary in 3D from scratch but I would really appreciate some resources on how to do that. Thank you. Gourav From: Barry Smith Date: Thursday, January 4, 2024 at 12:24?PM To: Gourav Kumbhojkar Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Neumann Boundary Condition with DMDACreate3D Are you referring to the text? . `DM_BOUNDARY_MIRROR` - the ghost value is the same as the value 1 grid point in; that is, the 0th grid point in the real mesh acts like a mirror to define the ghost point value; not yet implemented for 3d Looking at the code for DMSetUp_DA_3D() I see PetscCheck(stencil_type != DMDA_STENCIL_BOX || (bx != DM_BOUNDARY_MIRROR && by != DM_BOUNDARY_MIRROR && bz != DM_BOUNDARY_MIRROR), PetscObjectComm((PetscObject)da), PETSC_ERR_SUP, "Mirror boundary and box stencil"); which seems (to me) to indicate the mirroring is not done for box stencils but should work for star stencils. Are you using a star stencil or a box stencil? I believe the code is not complete for box stencil because the code to determine the location of the "mirrored point" for extra "box points" is messy in 3d and no one wrote it. You can compare DMSetUp_DA_2D() and DMSetUp_DA_3D() to see what is missing and see if you can determine how to add it for 3d. Barry On Jan 4, 2024, at 1:08?PM, Gourav Kumbhojkar wrote: Hi, I am trying to implement a No-flux boundary condition for a 3D domain. I previously modeled a no flux boundary in 2D domain using DMDACreate2D and ?PETSC_BOUNDARY_MIRROR? which worked well. However, the manual pages say that the Mirror Boundary is not supported for 3D. Could you please point me to the right resources to implement no flux boundary condition in 3D domains? Regards, Gourav K. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Sat Jan 6 18:57:34 2024 From: mfadams at lbl.gov (Mark Adams) Date: Sat, 6 Jan 2024 19:57:34 -0500 Subject: [petsc-users] Hypre BoomerAMG settings options database In-Reply-To: References: <30697B53-8E8E-47D3-B3B7-15CF9B9F0D57@petsc.dev> Message-ID: I was thinking about interpreting the "BoomerAMG SOLVER PARAMETERS:" stuff (eg, what is the closest thing to SSOR that Parv wants). That does not look like our code, but maybe it is. On Sat, Jan 6, 2024 at 12:50?PM Pierre Jolivet wrote: > > > On 6 Jan 2024, at 3:15?PM, Mark Adams wrote: > > Does this work for you? > > -pc_hypre_boomeramg_grid_sweeps_all 2 > > The comment in our code says SSOR is the default but it looks like it is really "hSGS" > > I thought it was an L1 Jacobi, but you would want to ask Hypre about this. > > HYPRE?s default settings are not the same as the ones we set in PETSc as > default, so do not ask HYPRE people (about this particular issue). 
> > Thanks, > Pierre > > Mark > > > > On Fri, Jan 5, 2024 at 10:21?AM Barry Smith wrote: > >> >> Yes, the handling of BoomerAMG options starts at line 365. If we don't >> support what you want but hypre has a function call that allows one to set >> the values then the option could easily be added to the PETSc options >> database here either by you (with a merge request) or us. So I would say >> check the hypre docs. >> >> Just let us know what BoomerAMG function is missing from the code. >> >> Barry >> >> >> On Jan 5, 2024, at 7:52?AM, Khurana, Parv >> wrote: >> >> Hello PETSc users community, >> >> Happy new year! Thank you for the community support as always. >> >> I am using BoomerAMG for my research, and it is interfaced to my software >> via PETSc. I can only use options database keys as of now to tweak the >> settings I want for the AMG solve. >> >> I want to control the number of smoothener iterations at pre/post step >> for a given AMG cycle. I am looking for an options database key which helps >> me control this. I am not sure whether this is possible directly via the >> keys (Line 365: >> https://www.mcs.anl.gov/petsc/petsc-3.5.4/src/ksp/pc/impls/hypre/hypre.c.html). >> My comprehension of the current setup is that I have 1 smoothener iteration >> at every coarsening step. My aim is to do two pre and 2 post smoothening >> steps using the SSOR smoothener. >> >> BoomerAMG SOLVER PARAMETERS: >> >> Maximum number of cycles: 1 >> Stopping Tolerance: 0.000000e+00 >> Cycle type (1 = V, 2 = W, etc.): 1 >> >> Relaxation Parameters: >> Visiting Grid: down up coarse >> Number of sweeps: 1 1 1 >> Type 0=Jac, 3=hGS, 6=hSGS, 9=GE: 6 6 9 >> Point types, partial sweeps (1=C, -1=F): >> Pre-CG relaxation (down): 1 -1 >> Post-CG relaxation (up): -1 1 >> Coarsest grid: 0 >> >> PETSC settings I am using currently: >> >> -ksp_type preonly >> -pc_type hypre >> -pc_hypre_type boomeramg >> -pc_hypre_boomeramg_coarsen_type hmis >> -pc_hypre_boomeramg_relax_type_all symmetric-sor/jacobi >> -pc_hypre_boomeramg_strong_threshold 0.7 >> -pc_hypre_boomeramg_interp_type ext+i >> -pc_hypre_boomeramg_P_max 2 >> -pc_hypre_boomeramg_truncfactor 0.3 >> >> Thanks and Best >> Parv Khurana >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Sat Jan 6 19:08:19 2024 From: bsmith at petsc.dev (Barry Smith) Date: Sat, 6 Jan 2024 20:08:19 -0500 Subject: [petsc-users] Neumann Boundary Condition with DMDACreate3D In-Reply-To: References: <0A4EB78C-997F-4978-8945-771B351B08CE@petsc.dev> Message-ID: <1130EB7D-5472-434C-A700-D424381940B5@petsc.dev> If the mirror code for star stencil is just wrong in 3d we should simply fix it. Not use some other approach. Can you attach code that tries to do what you need for both 2d (that results in a matrix you are happy with) and 3d (that results in a matrix that you are not happy with). Barry > On Jan 6, 2024, at 7:30?PM, Gourav Kumbhojkar wrote: > > Thank you, Barry. Sorry for the late response. > > Yes, I was referring to the same text. I am using a star stencil. However, I don?t think the mirror condition is implemented for star stencil either. > > TLDR version of the whole message typed below ? > I think DM_BOUNDARY_GHOSTED is not implemented correctly in 3D. It appears that ghost nodes are mirrored with boundary nodes themselves. They should mirror with the nodes next to boundary. > > Long version - > Here?s what I?m trying to do ? 
> > Step 1 - Create a 3D DM > ierr = DMDACreate3d(PETSC_COMM_WORLD, DM_BOUNDARY_MIRROR, DM_BOUNDARY_MIRROR, DM_BOUNDARY_MIRROR, DMDA_STENCIL_STAR, num_pts, num_pts, num_pts, PETSC_DECIDE, PETSC_DECIDE, PETSC_DECIDE, 1, 1, NULL, NULL, NULL, &da); CHKERRQ(ierr); > Note - num_pts = 4 in my code. > > Step 2 ? Create a Matrix from DM ( a FDM stiffness matrix) > DMCreateMatrix(da, &K); > globalKMat(K, info); > > ?globalKMat? is a user-defined function. Here?s a snippet from this function: > for (int i = info.xs; i < (info.xs + info.xm); i++){ > for(int j = info.ys; j < (info.ys + info.ym); j++){ > for (int k = info.zs; k < (info.zs + info.zm); k++){ > ncols = 0; > row.i = i; row.j = j; row.k = k; > > col[0].i = i; col[0].j = j; col[0].k = k; > vals[ncols++] = -6.; //ncols=1 > > col[ncols].i = i-1; col[ncols].j = j; col[ncols].k = k; > vals[ncols++] = 1.;//ncols=2 > > There are total 7 ?ncols?. Other than the first one all ncols have value 1 (first one is set to -6). As you can see, this step is to only build the FDM stiffness matrix. I use ?ADD_VALUES? at the end in the above function. > > Step 3 ? View the stiffness matrix to check the values. I use MatView for this. > > Here are the results ? > 3D DM (showing first few rows of the stiffness matrix here, the original matrix is 64x64)- > Mat Object: 1 MPI processes > type: seqaij > row 0: (0, -3.) (1, 1.) (4, 1.) (16, 1.) > row 1: (0, 1.) (1, -4.) (2, 1.) (5, 1.) (17, 1.) > row 2: (1, 1.) (2, -4.) (3, 1.) (6, 1.) (18, 1.) > row 3: (2, 1.) (3, -3.) (7, 1.) (19, 1.) > row 4: (0, 1.) (4, -4.) (5, 1.) (8, 1.) (20, 1.) > row 5: (1, 1.) (4, 1.) (5, -5.) (6, 1.) (9, 1.) (21, 1.) > > Repeat the same steps for a 2D DM to show the difference (the entire matrix is now 16x16) > Mat Object: 1 MPI processes > type: seqaij > row 0: (0, -4.) (1, 2.) (4, 2.) > row 1: (0, 1.) (1, -4.) (2, 1.) (5, 2.) > row 2: (1, 1.) (2, -4.) (3, 1.) (6, 2.) > row 3: (2, 2.) (3, -4.) (7, 2.) > row 4: (0, 1.) (4, -4.) (5, 2.) (8, 1.) > row 5: (1, 1.) (4, 1.) (5, -4.) (6, 1.) (9, 1.) > > I suspect that when using ?DM_BOUNDARY_MIRROR? in 3D, the ghost node value is added to the boundary node itself, which would explain why row 0 of the stiffness matrix has -3 instead of -6. In principle the ghost node value should be mirrored with the node next to boundary. > Clearly, there?s no issue with the 2D implementation of the mirror boundary. The row 0 values are -4, 2, and 2 as expected. > > Let me know if I should give any other information about this. I also thought about using DM_BOUNDARY_GHOSTED and implement the mirror boundary in 3D from scratch but I would really appreciate some resources on how to do that. > > Thank you. > > Gourav > > > From: Barry Smith > > Date: Thursday, January 4, 2024 at 12:24?PM > To: Gourav Kumbhojkar > > Cc: petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] Neumann Boundary Condition with DMDACreate3D > > > Are you referring to the text? > > . `DM_BOUNDARY_MIRROR` - the ghost value is the same as the value 1 grid point in; that is, the 0th grid point in the real mesh acts like a mirror to define > the ghost point value; not yet implemented for 3d > > > Looking at the code for DMSetUp_DA_3D() I see > > PetscCheck(stencil_type != DMDA_STENCIL_BOX || (bx != DM_BOUNDARY_MIRROR && by != DM_BOUNDARY_MIRROR && bz != DM_BOUNDARY_MIRROR), PetscObjectComm((PetscObject)da), PETSC_ERR_SUP, "Mirror boundary and box stencil"); > > which seems (to me) to indicate the mirroring is not done for box stencils but should work for star stencils. 
> > Are you using a star stencil or a box stencil? > > I believe the code is not complete for box stencil because the code to determine the location of the "mirrored point" for extra "box points" is messy in 3d and no one wrote it. You can compare DMSetUp_DA_2D() and DMSetUp_DA_3D() to see what is missing and see if you can determine how to add it for 3d. > > Barry > > > > On Jan 4, 2024, at 1:08?PM, Gourav Kumbhojkar > wrote: > > Hi, > > I am trying to implement a No-flux boundary condition for a 3D domain. I previously modeled a no flux boundary in 2D domain using DMDACreate2D and ?PETSC_BOUNDARY_MIRROR? which worked well. > However, the manual pages say that the Mirror Boundary is not supported for 3D. > Could you please point me to the right resources to implement no flux boundary condition in 3D domains? > > Regards, > Gourav K. -------------- next part -------------- An HTML attachment was scrubbed... URL: From kabdelaz at purdue.edu Sun Jan 7 14:51:33 2024 From: kabdelaz at purdue.edu (Khaled Nabil Shar Abdelaziz) Date: Sun, 7 Jan 2024 20:51:33 +0000 Subject: [petsc-users] SNES solve residual vector - Fortran Message-ID: Hello everyone, I am running into an issue with SNES solver where the residual in the first iteration gets calculated fine, but on the 2nd iteration, even though it is getting calculated locally, it doesn't seem to be "assigned" to the residual vector correctly. The residual vector here is referred to as petsc_PVector which is defined in a type (struct). It seems like the first iteration, it behaves as expected and that petsc_PVector gets updated, 2nd iteration however seems like there are 2 versions of it; one that is being passed into FormFunctionU, and another that exists in the type (struct). This result in the residual on the 2nd iteration to be exactly = 0 and convergence. Any insights, suggestions, or experiences you can share regarding this issue would be immensely appreciated. I'm particularly interested in understanding why the residual vector behaves differently in the second iteration. Best, Khaled The following is a snapshot of how I initialize and use SNES: subroutine set_function_jacobian_to_solver_u(nlU) type(variable_node_u),target,intent(inout)::nlU PetscErrorCode::ierr type(variable_node_part),pointer::var_nd_part integer::ipart ! 
external FormFunctionU,FormJacobianU do ipart=1,nlU%variable_node%npart var_nd_part=>nlU%variable_node%parts(ipart) PetscCallA(SNESSetFunction(var_nd_part%petsc_snes,var_nd_part%petsc_PVector,FormFunctionU,nlU,ierr)) PetscCallA(SNESSetJacobian(var_nd_part%petsc_snes,var_nd_part%petsc_KMatrix,var_nd_part%petsc_KMatrix,FormJacobianU,nlU,ierr)) PetscCallA(SNESSetFromOptions(var_nd_part%petsc_snes,ierr)) end do var_nd_part=>null() end subroutine set_function_jacobian_to_solver_u subroutine FormFunctionU(petsc_snes,petsc_vector,petsc_PVector,nlU,ierr) type(tSNES)::petsc_snes type(tVec)::petsc_vector type(tVec)::petsc_PVector type(PetscScalar),pointer::vec_ptr_a(:),vec_ptr_b(:) PetscErrorCode::ierr type(variable_node_u),target::nlU type(PetscScalar),pointer::val_petsc_PVector(:) type(parameter),pointer::para_data type(mesh),pointer::mesh_data type(element),pointer::eles(:) integer::k,i,flag,pid PetscInt::its para_data=>nlU%para_data mesh_data=>nlU%mesh_data eles=>nlU%eles call MPI_Comm_rank(MPI_COMM_WORLD, pid, ierr) call update_data_by_petsc_vector(nlU%variable_node,petsc_vector) call set_petsc_PVector_to_zero(nlU%variable_node) call update_petsc_PVector(nlU%variable_node) flag = 0 do k=1,mesh_data%nel call get_element_value(nlU%data_e, eles(k)%nnd_e, mesh_data%elements(:,k), nlU%nvar_total, nlU%data) call cal_element_ResidualU(nlU%PVector_e, eles(k), nlU%data_e, para_data) call set_petsc_PVector_values_by_element(nlU%variable_node, nlU%PVector_e, eles(k)%nnd_e, mesh_data%elements(:,k)) end do call update_petsc_PVector(nlU%variable_node) para_data=>null() mesh_data=>null() eles=>null() return end subroutine FormFunctionU call SNESSolve(nlU%variable_node%petsc_snes,PETSC_NULL_VEC, nlU%variable_node%petsc_vector,error_petsc) -------------- next part -------------- An HTML attachment was scrubbed... URL: From kabdelaz at purdue.edu Sun Jan 7 16:06:15 2024 From: kabdelaz at purdue.edu (Khaled Nabil Shar Abdelaziz) Date: Sun, 7 Jan 2024 22:06:15 +0000 Subject: [petsc-users] petsc-users Digest, Vol 181, Issue 9 In-Reply-To: References: Message-ID: Issue resolved, stopped using the vector in the struct and used the one in the arguments. Same issue was taking place in the Jacobian function which resulted in error (-8) SNES_DIVERGED_LOCAL_MIN. Adjusting the jacobian function resolved that issue as well. -----Original Message----- From: petsc-users On Behalf Of petsc-users-request at mcs.anl.gov Sent: Sunday, January 7, 2024 3:52 PM To: petsc-users at mcs.anl.gov Subject: petsc-users Digest, Vol 181, Issue 9 ---- External Email: Use caution with attachments, links, or sharing data ---- Send petsc-users mailing list submissions to petsc-users at mcs.anl.gov To subscribe or unsubscribe via the World Wide Web, visit https://lists.mcs.anl.gov/mailman/listinfo/petsc-users or, via email, send a message with subject or body 'help' to petsc-users-request at mcs.anl.gov You can reach the person managing the list at petsc-users-owner at mcs.anl.gov When replying, please edit your Subject line so it is more specific than "Re: Contents of petsc-users digest..." Today's Topics: 1. 
SNES solve residual vector - Fortran (Khaled Nabil Shar Abdelaziz)

----------------------------------------------------------------------

-------------- next part --------------
An HTML attachment was scrubbed...
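For anyone hitting the same symptom, here is a minimal C sketch of the pattern behind the fix described above (the residual formula, sizes and names are made up for illustration): the callback should write into the Vec F that SNES passes in, because during a solve (for example inside a line search) SNES may hand the callback a different vector than the one registered with SNESSetFunction, so filling a separately stored vector can leave the actual residual untouched.

#include <petscsnes.h>

/* hypothetical residual callback: fill the F argument, not a vector kept in a user struct */
static PetscErrorCode FormFunction(SNES snes, Vec X, Vec F, void *ctx)
{
  const PetscScalar *x;
  PetscScalar       *f;
  PetscInt           i, n;

  PetscFunctionBeginUser;
  PetscCall(VecGetLocalSize(X, &n));
  PetscCall(VecGetArrayRead(X, &x));
  PetscCall(VecGetArray(F, &f));                    /* F is the argument SNES passed in */
  for (i = 0; i < n; i++) f[i] = x[i]*x[i] - 2.0;   /* placeholder residual */
  PetscCall(VecRestoreArrayRead(X, &x));
  PetscCall(VecRestoreArray(F, &f));
  PetscFunctionReturn(0);
}

The same rule applies to the Jacobian callback: compute into the Mat arguments that are passed in, as noted in the resolution above.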
URL: ------------------------------ Subject: Digest Footer _______________________________________________ petsc-users mailing list petsc-users at mcs.anl.gov https://lists.mcs.anl.gov/mailman/listinfo/petsc-users ------------------------------ End of petsc-users Digest, Vol 181, Issue 9 ******************************************* From mathieu.deuse at siemens.com Mon Jan 8 03:31:45 2024 From: mathieu.deuse at siemens.com (Deuse, Mathieu) Date: Mon, 8 Jan 2024 09:31:45 +0000 Subject: [petsc-users] Question about MATMPIAIJ and column indexes ordering Message-ID: Hello, I have a piece of code which generates a matrix in CSR format, but the without sorting the column indexes in increasing order within each row. This seems not to be 100% compatible with the MATMPIAIJ format: the documentation of MatCreateMPIAIJWithArrays indeed mentions 'row-major ordering'. For example, consider the 2x2 matrix (1 2; 3 4), which in my code could be stored as i=[0, 2, 4], j=[1, 0, 0, 1], v=[2, 1, 3, 4]. I can generate the matrix as follows (on 1 proc): MatCreateMPIAIJWithArrays(PETSC_COMM_SELF, 2, 2, 2, 2, i, j, v, &matrix). This appears to work fine, and I can then use the matrix in a KSP for example. However, if I try to update the entry values (same order and values v=[2, 1, 3, 4]) with MatUpdateMPIAIJWithArray(matrix, v), it seems that PETSc does not memorize the order of the column indexes and the matrix that I get now is (2 1; 3 4). I get the same result with MatUpdateMPIAIJWithArrays(matrix, 2, 2, 2, 2, i, j, v). On the other hand, if the column indexes are sorted within each row (i=[0, 2, 4], j=[0, 1, 0, 1], v=[1, 2, 3, 4]), then it works fine. I have attached a minimal working example (C++). Can I safely rely on MatCreateMPIAIJWithArrays working fine with unsorted column indexes as long as I do not use MatUpdateMPIAIJWithArray(s)? Or should I do the sorting myself before calling MatCreateMPIAIJWithArrays? (or alternatively use another matrix format). Thanks in advance for the help. Kind regards, Mathieu Deuse -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: main.cpp URL: From bsmith at petsc.dev Mon Jan 8 09:49:39 2024 From: bsmith at petsc.dev (Barry Smith) Date: Mon, 8 Jan 2024 10:49:39 -0500 Subject: [petsc-users] Question about MATMPIAIJ and column indexes ordering In-Reply-To: References: Message-ID: <8F3E81C9-5494-4B21-8297-47CD5391E603@petsc.dev> > On Jan 8, 2024, at 4:31?AM, Deuse, Mathieu via petsc-users wrote: > > Hello, > > I have a piece of code which generates a matrix in CSR format, but the without sorting the column indexes in increasing order within each row. This seems not to be 100% compatible with the MATMPIAIJ format: the documentation of MatCreateMPIAIJWithArrays indeed mentions 'row-major ordering'. > > For example, consider the 2x2 matrix (1 2; 3 4), which in my code could be stored as i=[0, 2, 4], j=[1, 0, 0, 1], v=[2, 1, 3, 4]. I can generate the matrix as follows (on 1 proc): MatCreateMPIAIJWithArrays(PETSC_COMM_SELF, 2, 2, 2, 2, i, j, v, &matrix). This appears to work fine, and I can then use the matrix in a KSP for example. However, if I try to update the entry values (same order and values v=[2, 1, 3, 4]) with MatUpdateMPIAIJWithArray(matrix, v), it seems that PETSc does not memorize the order of the column indexes and the matrix that I get now is (2 1; 3 4). I get the same result with MatUpdateMPIAIJWithArrays(matrix, 2, 2, 2, 2, i, j, v). 
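As an illustration of the ordering requirement just described, here is a hedged C sketch (same hypothetical 2x2 example) of the "sort it yourself" route: the (column, value) pairs of each row are sorted by column index before the matrix is created, so that a later MatUpdateMPIAIJWithArray() sees the values in the order the matrix actually stores them.

PetscInt    i[] = {0, 2, 4};
PetscInt    j[] = {1, 0, 0, 1};            /* row 0 is unsorted */
PetscScalar v[] = {2.0, 1.0, 3.0, 4.0};
Mat         A;

/* sort each row's (j, v) pairs by column index; a tiny insertion sort is
   enough here since the rows are short */
for (PetscInt r = 0; r < 2; r++) {
  for (PetscInt p = i[r] + 1; p < i[r + 1]; p++) {
    PetscInt    cj = j[p];
    PetscScalar cv = v[p];
    PetscInt    q  = p - 1;
    while (q >= i[r] && j[q] > cj) { j[q + 1] = j[q]; v[q + 1] = v[q]; q--; }
    j[q + 1] = cj;
    v[q + 1] = cv;
  }
}
PetscCall(MatCreateMPIAIJWithArrays(PETSC_COMM_SELF, 2, 2, 2, 2, i, j, v, &A));
/* v is now ordered the same way the matrix stores row 0, so updates line up */
PetscCall(MatUpdateMPIAIJWithArray(A, v));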
On the other hand, if the column indexes are sorted within each row (i=[0, 2, 4], j=[0, 1, 0, 1], v=[1, 2, 3, 4]), then it works fine. I have attached a minimal working example (C++). > > Can I safely rely on MatCreateMPIAIJWithArrays working fine with unsorted column indexes as long as I do not use MatUpdateMPIAIJWithArray(s)? Yes, this is correct. The column indices do not need to be sorted if you never call MatUpdateMPIAIJWithArray(). > Or should I do the sorting myself before calling MatCreateMPIAIJWithArrays? (or alternatively use another matrix format). > > Thanks in advance for the help. > > Kind regards, > > Mathieu Deuse > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Mon Jan 8 10:07:40 2024 From: bsmith at petsc.dev (Barry Smith) Date: Mon, 8 Jan 2024 11:07:40 -0500 Subject: [petsc-users] Question about MATMPIAIJ and column indexes ordering In-Reply-To: References: Message-ID: <9DC92361-E29D-4B00-BEF6-643AE8C7D50D@petsc.dev> Added clarification to the man pages in https://gitlab.com/petsc/petsc/-/merge_requests/7170 > On Jan 8, 2024, at 4:31?AM, Deuse, Mathieu via petsc-users wrote: > > Hello, > > I have a piece of code which generates a matrix in CSR format, but the without sorting the column indexes in increasing order within each row. This seems not to be 100% compatible with the MATMPIAIJ format: the documentation of MatCreateMPIAIJWithArrays indeed mentions 'row-major ordering'. > > For example, consider the 2x2 matrix (1 2; 3 4), which in my code could be stored as i=[0, 2, 4], j=[1, 0, 0, 1], v=[2, 1, 3, 4]. I can generate the matrix as follows (on 1 proc): MatCreateMPIAIJWithArrays(PETSC_COMM_SELF, 2, 2, 2, 2, i, j, v, &matrix). This appears to work fine, and I can then use the matrix in a KSP for example. However, if I try to update the entry values (same order and values v=[2, 1, 3, 4]) with MatUpdateMPIAIJWithArray(matrix, v), it seems that PETSc does not memorize the order of the column indexes and the matrix that I get now is (2 1; 3 4). I get the same result with MatUpdateMPIAIJWithArrays(matrix, 2, 2, 2, 2, i, j, v). On the other hand, if the column indexes are sorted within each row (i=[0, 2, 4], j=[0, 1, 0, 1], v=[1, 2, 3, 4]), then it works fine. I have attached a minimal working example (C++). > > Can I safely rely on MatCreateMPIAIJWithArrays working fine with unsorted column indexes as long as I do not use MatUpdateMPIAIJWithArray(s)? Or should I do the sorting myself before calling MatCreateMPIAIJWithArrays? (or alternatively use another matrix format). > > Thanks in advance for the help. > > Kind regards, > > Mathieu Deuse > -------------- next part -------------- An HTML attachment was scrubbed... URL: From y.hu at mpie.de Mon Jan 8 11:24:00 2024 From: y.hu at mpie.de (Yi Hu) Date: Mon, 8 Jan 2024 18:24:00 +0100 Subject: [petsc-users] SNES seems not use my matrix-free operation Message-ID: Dear PETSc Experts, I am implementing a matrix-free jacobian for my SNES solver in Fortran. (command line option -snes_type newtonls -ksp_type gmres) In the main program, I define my residual and jacobian and matrix-free jacobian like the following, ? call DMDASNESSetFunctionLocal(DM_mech, INSERT_VALUES, formResidual, PETSC_NULL_SNES, err_PETSc) call DMSNESSetJacobianLocal(DM_mech, formJacobian, PETSC_NULL_SNES, err_PETSc) ? subroutine formJacobian(residual_subdomain,F,Jac_pre,Jac,dummy,err_PETSc) ? #include ? use petscmat ? implicit None ? DMDALocalInfo, dimension(DMDA_LOCAL_INFO_SIZE) :: & ??? 
residual_subdomain????????????????????????????????????????????????????????????????????????????? !< DMDA info (needs to be named "in" for macros like XRANGE to work) ? real(pREAL), dimension(3,3,cells(1),cells(2),cells3), intent(in) :: & ??? F?????????????????????????????????????????????????????????????????????????????????????????????? !< deformation gradient field ? Mat????????????????????????????????? :: Jac, Jac_pre ? PetscObject????????????????????????? :: dummy ? PetscErrorCode?????????????????????? :: err_PETSc ? PetscInt???????????????????????????? :: N_dof ! global number of DoF, maybe only a placeholder ? N_dof = 9*product(cells(1:2))*cells3 ? print*, 'in my jac' ? ??call MatCreateShell(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,N_dof,N_dof,0,Jac,err_PETSc) ? CHKERRQ(err_PETSc) ? call MatShellSetOperation(Jac,MATOP_MULT,GK_op,err_PETSc) ? CHKERRQ(err_PETSc) ? ??print*, 'in my jac' ? ! for jac preconditioner ? call MatCreateShell(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,N_dof,N_dof,0,Jac_pre,err_PETSc) ? CHKERRQ(err_PETSc) ? call MatShellSetOperation(Jac_pre,MATOP_MULT,GK_op,err_PETSc) ? CHKERRQ(err_PETSc) ? print*, 'in my jac' end subroutine formJacobian subroutine GK_op(Jac,dF,output,err_PETSc)? real(pREAL), dimension(3,3,cells(1),cells(2),cells3), intent(in) :: &??? dF????????????????????????????????????????????????????????????????????????????????????????????? ???real(pREAL), dimension(3,3,cells(1),cells(2),cells3), intent(out) :: &??? output???? ????????????????????????????????????????????????????????????????????????????????????????????real(pREAL),? dimension(3,3) :: &??? deltaF_aim = 0.0_pREAL ? Mat????????????????????????????????? :: Jac? PetscErrorCode?????????????????????? :: err_PETSc ? integer :: i, j, k, e ?? a lot of calculations ? ? print*, 'in GK op' ?end subroutine GK_op The first question is that: it seems I still need to explicitly define the interface of MatCreateShell() and MatShellSetOperation() to properly use them, even though I include them via ?use petscmat?. It is a little bit strange to me, since some examples do not perform this step. Then the main issue is that I can build my own Jacobian from my call back function formJacobian, and confirm my Jacobian is a shell matrix (by MatView). However, my customized operator GK_op is not called when solving the nonlinear system (not print my ?in GK op?). When I try to monitor my SNES, it gives me some conventional output not mentioning my matrix-free operations. So I guess my customized MATOP_MULT may be not associated with Jacobian. Or my configuration is somehow wrong. Could you help me solve this issue? Thanks, Yi ------------------------------------------------- Stay up to date and follow us on LinkedIn, Twitter and YouTube. Max-Planck-Institut f?r Eisenforschung GmbH Max-Planck-Stra?e 1 D-40237 D?sseldorf Handelsregister B 2533 Amtsgericht D?sseldorf Gesch?ftsf?hrung Prof. Dr. Gerhard Dehm Prof. Dr. J?rg Neugebauer Prof. Dr. Dierk Raabe Dr. Kai de Weldige Ust.-Id.-Nr.: DE 11 93 58 514 Steuernummer: 105 5891 1000 Please consider that invitations and e-mails of our institute are only valid if they end with ?@mpie.de. If you are not sure of the validity please contact rco at mpie.de Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de ------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at petsc.dev Mon Jan 8 11:41:01 2024 From: bsmith at petsc.dev (Barry Smith) Date: Mon, 8 Jan 2024 12:41:01 -0500 Subject: [petsc-users] SNES seems not use my matrix-free operation In-Reply-To: References: Message-ID: <7B0A8642-74DC-44FF-906A-E11FDB95C331@petsc.dev> "formJacobian" should not be __creating__ the matrices. Here "form" means computing the numerical values in the matrix (or when using a shell matrix it means keeping a copy of X so that your custom matrix-free multiply knows the base location where the matrix free Jacobian-vector products are computed.) You create the shell matrices up in your main program and pass them in with SNESSetJacobian(). Try first calling SNESSetJacobian() to provide the matrices (provide a dummy function argument) and then call DMSNESSetJacobianLocal() to provide your "formjacobian" function (that does not create the matrices). Barry Yes, "form" is a bad word that should not have been used in our code. > On Jan 8, 2024, at 12:24?PM, Yi Hu wrote: > > Dear PETSc Experts, > > I am implementing a matrix-free jacobian for my SNES solver in Fortran. (command line option -snes_type newtonls -ksp_type gmres) > > In the main program, I define my residual and jacobian and matrix-free jacobian like the following, > > ? > call DMDASNESSetFunctionLocal(DM_mech, INSERT_VALUES, formResidual, PETSC_NULL_SNES, err_PETSc) > call DMSNESSetJacobianLocal(DM_mech, formJacobian, PETSC_NULL_SNES, err_PETSc) > ? > > subroutine formJacobian(residual_subdomain,F,Jac_pre,Jac,dummy,err_PETSc) > > #include > use petscmat > implicit None > DMDALocalInfo, dimension(DMDA_LOCAL_INFO_SIZE) :: & > residual_subdomain !< DMDA info (needs to be named "in" for macros like XRANGE to work) > real(pREAL), dimension(3,3,cells(1),cells(2),cells3), intent(in) :: & > F !< deformation gradient field > Mat :: Jac, Jac_pre > PetscObject :: dummy > PetscErrorCode :: err_PETSc > PetscInt :: N_dof ! global number of DoF, maybe only a placeholder > > N_dof = 9*product(cells(1:2))*cells3 > > print*, 'in my jac' > > call MatCreateShell(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,N_dof,N_dof,0,Jac,err_PETSc) > CHKERRQ(err_PETSc) > call MatShellSetOperation(Jac,MATOP_MULT,GK_op,err_PETSc) > CHKERRQ(err_PETSc) > > print*, 'in my jac' > > ! for jac preconditioner > call MatCreateShell(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,N_dof,N_dof,0,Jac_pre,err_PETSc) > CHKERRQ(err_PETSc) > call MatShellSetOperation(Jac_pre,MATOP_MULT,GK_op,err_PETSc) > CHKERRQ(err_PETSc) > > print*, 'in my jac' > > end subroutine formJacobian > > subroutine GK_op(Jac,dF,output,err_PETSc) > real(pREAL), dimension(3,3,cells(1),cells(2),cells3), intent(in) :: & > dF > real(pREAL), dimension(3,3,cells(1),cells(2),cells3), intent(out) :: & > output > real(pREAL), dimension(3,3) :: & > deltaF_aim = 0.0_pREAL > > Mat :: Jac > PetscErrorCode :: err_PETSc > > integer :: i, j, k, e > > ? a lot of calculations ? > > print*, 'in GK op' > > end subroutine GK_op > > The first question is that: it seems I still need to explicitly define the interface of MatCreateShell() and MatShellSetOperation() to properly use them, even though I include them via ?use petscmat?. It is a little bit strange to me, since some examples do not perform this step. > > Then the main issue is that I can build my own Jacobian from my call back function formJacobian, and confirm my Jacobian is a shell matrix (by MatView). However, my customized operator GK_op is not called when solving the nonlinear system (not print my ?in GK op?). 
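To make the setup order Barry suggests concrete, here is a hedged C sketch (the context type, sizes and callback names are invented for illustration, and snes, ctx, nlocal, N and ShellMult are assumed to exist in the surrounding code): the shell matrix is created once during setup and handed to SNESSetJacobian(), and the Jacobian callback only records the current linearization point for the matrix-free multiply instead of creating anything.

typedef struct { DM da; Vec Xbase; } AppCtx;   /* hypothetical user context */

/* "form" callback: no matrix creation here, just remember where to linearize */
static PetscErrorCode FormJacobianShell(SNES snes, Vec X, Mat J, Mat Jpre, void *ptr)
{
  AppCtx *user = (AppCtx *)ptr;

  PetscFunctionBeginUser;
  PetscCall(VecCopy(X, user->Xbase));
  PetscFunctionReturn(0);
}

/* setup, done once in the main program; ShellMult is the user's MATOP_MULT routine */
PetscCall(MatCreateShell(PETSC_COMM_WORLD, nlocal, nlocal, N, N, &ctx, &Jshell));
PetscCall(MatShellSetOperation(Jshell, MATOP_MULT, (void (*)(void))ShellMult));
PetscCall(SNESSetJacobian(snes, Jshell, Jshell, FormJacobianShell, &ctx));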
When I try to monitor my SNES, it gives me some conventional output not mentioning my matrix-free operations. So I guess my customized MATOP_MULT may be not associated with Jacobian. Or my configuration is somehow wrong. Could you help me solve this issue? > > Thanks, > Yi > > > > ------------------------------------------------- > Stay up to date and follow us on LinkedIn, Twitter and YouTube. > > Max-Planck-Institut f?r Eisenforschung GmbH > Max-Planck-Stra?e 1 > D-40237 D?sseldorf > > Handelsregister B 2533 > Amtsgericht D?sseldorf > > Gesch?ftsf?hrung > Prof. Dr. Gerhard Dehm > Prof. Dr. J?rg Neugebauer > Prof. Dr. Dierk Raabe > Dr. Kai de Weldige > > Ust.-Id.-Nr.: DE 11 93 58 514 > Steuernummer: 105 5891 1000 > > > Please consider that invitations and e-mails of our institute are > only valid if they end with ?@mpie.de. > If you are not sure of the validity please contact rco at mpie.de > > Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails > aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. > In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de > ------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From gourav.kumbhojkar at gmail.com Mon Jan 8 14:44:06 2024 From: gourav.kumbhojkar at gmail.com (Gourav Kumbhojkar) Date: Mon, 8 Jan 2024 20:44:06 +0000 Subject: [petsc-users] Neumann Boundary Condition with DMDACreate3D In-Reply-To: <1130EB7D-5472-434C-A700-D424381940B5@petsc.dev> References: <0A4EB78C-997F-4978-8945-771B351B08CE@petsc.dev> <1130EB7D-5472-434C-A700-D424381940B5@petsc.dev> Message-ID: You are right. Attaching the code that I?m using to test this. The output matrix is saved in separate ascii files. You can use ?make noflux? to compile the code. Gourav From: Barry Smith Date: Saturday, January 6, 2024 at 7:08?PM To: Gourav Kumbhojkar Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Neumann Boundary Condition with DMDACreate3D If the mirror code for star stencil is just wrong in 3d we should simply fix it. Not use some other approach. Can you attach code that tries to do what you need for both 2d (that results in a matrix you are happy with) and 3d (that results in a matrix that you are not happy with). Barry On Jan 6, 2024, at 7:30?PM, Gourav Kumbhojkar wrote: Thank you, Barry. Sorry for the late response. Yes, I was referring to the same text. I am using a star stencil. However, I don?t think the mirror condition is implemented for star stencil either. TLDR version of the whole message typed below ? I think DM_BOUNDARY_GHOSTED is not implemented correctly in 3D. It appears that ghost nodes are mirrored with boundary nodes themselves. They should mirror with the nodes next to boundary. Long version - Here?s what I?m trying to do ? Step 1 - Create a 3D DM ierr = DMDACreate3d(PETSC_COMM_WORLD, DM_BOUNDARY_MIRROR, DM_BOUNDARY_MIRROR, DM_BOUNDARY_MIRROR, DMDA_STENCIL_STAR, num_pts, num_pts, num_pts, PETSC_DECIDE, PETSC_DECIDE, PETSC_DECIDE, 1, 1, NULL, NULL, NULL, &da); CHKERRQ(ierr); Note - num_pts = 4 in my code. Step 2 ? Create a Matrix from DM ( a FDM stiffness matrix) DMCreateMatrix(da, &K); globalKMat(K, info); ?globalKMat? is a user-defined function. 
Here?s a snippet from this function: for (int i = info.xs; i < (info.xs + info.xm); i++){ for(int j = info.ys; j < (info.ys + info.ym); j++){ for (int k = info.zs; k < (info.zs + info.zm); k++){ ncols = 0; row.i = i; row.j = j; row.k = k; col[0].i = i; col[0].j = j; col[0].k = k; vals[ncols++] = -6.; //ncols=1 col[ncols].i = i-1; col[ncols].j = j; col[ncols].k = k; vals[ncols++] = 1.;//ncols=2 There are total 7 ?ncols?. Other than the first one all ncols have value 1 (first one is set to -6). As you can see, this step is to only build the FDM stiffness matrix. I use ?ADD_VALUES? at the end in the above function. Step 3 ? View the stiffness matrix to check the values. I use MatView for this. Here are the results ? 1. 3D DM (showing first few rows of the stiffness matrix here, the original matrix is 64x64)- Mat Object: 1 MPI processes type: seqaij row 0: (0, -3.) (1, 1.) (4, 1.) (16, 1.) row 1: (0, 1.) (1, -4.) (2, 1.) (5, 1.) (17, 1.) row 2: (1, 1.) (2, -4.) (3, 1.) (6, 1.) (18, 1.) row 3: (2, 1.) (3, -3.) (7, 1.) (19, 1.) row 4: (0, 1.) (4, -4.) (5, 1.) (8, 1.) (20, 1.) row 5: (1, 1.) (4, 1.) (5, -5.) (6, 1.) (9, 1.) (21, 1.) 1. Repeat the same steps for a 2D DM to show the difference (the entire matrix is now 16x16) Mat Object: 1 MPI processes type: seqaij row 0: (0, -4.) (1, 2.) (4, 2.) row 1: (0, 1.) (1, -4.) (2, 1.) (5, 2.) row 2: (1, 1.) (2, -4.) (3, 1.) (6, 2.) row 3: (2, 2.) (3, -4.) (7, 2.) row 4: (0, 1.) (4, -4.) (5, 2.) (8, 1.) row 5: (1, 1.) (4, 1.) (5, -4.) (6, 1.) (9, 1.) I suspect that when using ?DM_BOUNDARY_MIRROR? in 3D, the ghost node value is added to the boundary node itself, which would explain why row 0 of the stiffness matrix has -3 instead of -6. In principle the ghost node value should be mirrored with the node next to boundary. Clearly, there?s no issue with the 2D implementation of the mirror boundary. The row 0 values are -4, 2, and 2 as expected. Let me know if I should give any other information about this. I also thought about using DM_BOUNDARY_GHOSTED and implement the mirror boundary in 3D from scratch but I would really appreciate some resources on how to do that. Thank you. Gourav From: Barry Smith > Date: Thursday, January 4, 2024 at 12:24?PM To: Gourav Kumbhojkar > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Neumann Boundary Condition with DMDACreate3D Are you referring to the text? . `DM_BOUNDARY_MIRROR` - the ghost value is the same as the value 1 grid point in; that is, the 0th grid point in the real mesh acts like a mirror to define the ghost point value; not yet implemented for 3d Looking at the code for DMSetUp_DA_3D() I see PetscCheck(stencil_type != DMDA_STENCIL_BOX || (bx != DM_BOUNDARY_MIRROR && by != DM_BOUNDARY_MIRROR && bz != DM_BOUNDARY_MIRROR), PetscObjectComm((PetscObject)da), PETSC_ERR_SUP, "Mirror boundary and box stencil"); which seems (to me) to indicate the mirroring is not done for box stencils but should work for star stencils. Are you using a star stencil or a box stencil? I believe the code is not complete for box stencil because the code to determine the location of the "mirrored point" for extra "box points" is messy in 3d and no one wrote it. You can compare DMSetUp_DA_2D() and DMSetUp_DA_3D() to see what is missing and see if you can determine how to add it for 3d. Barry On Jan 4, 2024, at 1:08?PM, Gourav Kumbhojkar > wrote: Hi, I am trying to implement a No-flux boundary condition for a 3D domain. I previously modeled a no flux boundary in 2D domain using DMDACreate2D and ?PETSC_BOUNDARY_MIRROR? 
which worked well. However, the manual pages say that the Mirror Boundary is not supported for 3D. Could you please point me to the right resources to implement no flux boundary condition in 3D domains? Regards, Gourav K. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: noflux_check.c URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: makefile Type: application/octet-stream Size: 2312 bytes Desc: makefile URL: From bsmith at petsc.dev Mon Jan 8 22:00:25 2024 From: bsmith at petsc.dev (Barry Smith) Date: Mon, 8 Jan 2024 23:00:25 -0500 Subject: [petsc-users] Neumann Boundary Condition with DMDACreate3D In-Reply-To: References: <0A4EB78C-997F-4978-8945-771B351B08CE@petsc.dev> <1130EB7D-5472-434C-A700-D424381940B5@petsc.dev> Message-ID: Ok, I now understand what you are reporting and am working on a fix. From the manual pages I see DMBoundaryType - Describes the choice for the filling of ghost cells on physical domain boundaries. Values: + `DM_BOUNDARY_NONE` - no ghost nodes . `DM_BOUNDARY_GHOSTED` - ghost vertices/cells exist but aren't filled; you can put values into them and then apply a stencil that uses those ghost locations . `DM_BOUNDARY_MIRROR` - the ghost value is the same as the value 1 grid point in; that is, the 0th grid point in the real mesh acts like a mirror to define the ghost point value; not yet implemented for 3d .... Developer Note: Should `DM_BOUNDARY_MIRROR` have the same meaning with DMDA_Q0, that is a staggered grid? In that case should the ghost point have the same value as the 0th grid point where the physical boundary serves as the mirror? I assume you are working with a "vertex valued" stencil as opposed to the cell-centered values where mirror possibly has a different interpretation? > On Jan 8, 2024, at 3:44?PM, Gourav Kumbhojkar wrote: > > You are right. Attaching the code that I?m using to test this. > The output matrix is saved in separate ascii files. > You can use ?make noflux? to compile the code. > > Gourav > > From: Barry Smith > > Date: Saturday, January 6, 2024 at 7:08?PM > To: Gourav Kumbhojkar > > Cc: petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] Neumann Boundary Condition with DMDACreate3D > > > If the mirror code for star stencil is just wrong in 3d we should simply fix it. Not use some other approach. Can you attach code that tries to do what you need for both 2d (that results in a matrix you are happy with) and 3d (that results in a matrix that you are not happy with). > > Barry > > > > > > On Jan 6, 2024, at 7:30?PM, Gourav Kumbhojkar > wrote: > > Thank you, Barry. Sorry for the late response. > > Yes, I was referring to the same text. I am using a star stencil. However, I don?t think the mirror condition is implemented for star stencil either. > > TLDR version of the whole message typed below ? > I think DM_BOUNDARY_GHOSTED is not implemented correctly in 3D. It appears that ghost nodes are mirrored with boundary nodes themselves. They should mirror with the nodes next to boundary. > > Long version - > Here?s what I?m trying to do ? > > Step 1 - Create a 3D DM > ierr = DMDACreate3d(PETSC_COMM_WORLD, DM_BOUNDARY_MIRROR, DM_BOUNDARY_MIRROR, DM_BOUNDARY_MIRROR, DMDA_STENCIL_STAR, num_pts, num_pts, num_pts, PETSC_DECIDE, PETSC_DECIDE, PETSC_DECIDE, 1, 1, NULL, NULL, NULL, &da); CHKERRQ(ierr); > Note - num_pts = 4 in my code. > > Step 2 ? 
Create a Matrix from DM ( a FDM stiffness matrix) > DMCreateMatrix(da, &K); > globalKMat(K, info); > > ?globalKMat? is a user-defined function. Here?s a snippet from this function: > for (int i = info.xs; i < (info.xs + info.xm); i++){ > for(int j = info.ys; j < (info.ys + info.ym); j++){ > for (int k = info.zs; k < (info.zs + info.zm); k++){ > ncols = 0; > row.i = i; row.j = j; row.k = k; > > col[0].i = i; col[0].j = j; col[0].k = k; > vals[ncols++] = -6.; //ncols=1 > > col[ncols].i = i-1; col[ncols].j = j; col[ncols].k = k; > vals[ncols++] = 1.;//ncols=2 > > There are total 7 ?ncols?. Other than the first one all ncols have value 1 (first one is set to -6). As you can see, this step is to only build the FDM stiffness matrix. I use ?ADD_VALUES? at the end in the above function. > > Step 3 ? View the stiffness matrix to check the values. I use MatView for this. > > Here are the results ? > 3D DM (showing first few rows of the stiffness matrix here, the original matrix is 64x64)- > Mat Object: 1 MPI processes > type: seqaij > row 0: (0, -3.) (1, 1.) (4, 1.) (16, 1.) > row 1: (0, 1.) (1, -4.) (2, 1.) (5, 1.) (17, 1.) > row 2: (1, 1.) (2, -4.) (3, 1.) (6, 1.) (18, 1.) > row 3: (2, 1.) (3, -3.) (7, 1.) (19, 1.) > row 4: (0, 1.) (4, -4.) (5, 1.) (8, 1.) (20, 1.) > row 5: (1, 1.) (4, 1.) (5, -5.) (6, 1.) (9, 1.) (21, 1.) > > Repeat the same steps for a 2D DM to show the difference (the entire matrix is now 16x16) > Mat Object: 1 MPI processes > type: seqaij > row 0: (0, -4.) (1, 2.) (4, 2.) > row 1: (0, 1.) (1, -4.) (2, 1.) (5, 2.) > row 2: (1, 1.) (2, -4.) (3, 1.) (6, 2.) > row 3: (2, 2.) (3, -4.) (7, 2.) > row 4: (0, 1.) (4, -4.) (5, 2.) (8, 1.) > row 5: (1, 1.) (4, 1.) (5, -4.) (6, 1.) (9, 1.) > > I suspect that when using ?DM_BOUNDARY_MIRROR? in 3D, the ghost node value is added to the boundary node itself, which would explain why row 0 of the stiffness matrix has -3 instead of -6. In principle the ghost node value should be mirrored with the node next to boundary. > Clearly, there?s no issue with the 2D implementation of the mirror boundary. The row 0 values are -4, 2, and 2 as expected. > > Let me know if I should give any other information about this. I also thought about using DM_BOUNDARY_GHOSTED and implement the mirror boundary in 3D from scratch but I would really appreciate some resources on how to do that. > > Thank you. > > Gourav > > > From: Barry Smith > > Date: Thursday, January 4, 2024 at 12:24?PM > To: Gourav Kumbhojkar > > Cc: petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] Neumann Boundary Condition with DMDACreate3D > > > Are you referring to the text? > > . `DM_BOUNDARY_MIRROR` - the ghost value is the same as the value 1 grid point in; that is, the 0th grid point in the real mesh acts like a mirror to define > the ghost point value; not yet implemented for 3d > > > Looking at the code for DMSetUp_DA_3D() I see > > PetscCheck(stencil_type != DMDA_STENCIL_BOX || (bx != DM_BOUNDARY_MIRROR && by != DM_BOUNDARY_MIRROR && bz != DM_BOUNDARY_MIRROR), PetscObjectComm((PetscObject)da), PETSC_ERR_SUP, "Mirror boundary and box stencil"); > > which seems (to me) to indicate the mirroring is not done for box stencils but should work for star stencils. > > Are you using a star stencil or a box stencil? > > I believe the code is not complete for box stencil because the code to determine the location of the "mirrored point" for extra "box points" is messy in 3d and no one wrote it. 
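For reference, here is a hedged C sketch of the kind of 7-point star-stencil assembly being discussed (K and info are the matrix and DMDALocalInfo from the snippet above; the usual -6 diagonal and six off-diagonal 1's are inserted with ADD_VALUES so that contributions aimed at ghost indices get folded back by the boundary handling):

MatStencil  row, col[7];
PetscScalar vals[7];
PetscInt    ncols;

for (PetscInt k = info.zs; k < info.zs + info.zm; k++)
  for (PetscInt j = info.ys; j < info.ys + info.ym; j++)
    for (PetscInt i = info.xs; i < info.xs + info.xm; i++) {
      ncols = 0;
      row.i = i; row.j = j; row.k = k;
      col[ncols].i = i;     col[ncols].j = j;     col[ncols].k = k;     vals[ncols++] = -6.0;
      col[ncols].i = i - 1; col[ncols].j = j;     col[ncols].k = k;     vals[ncols++] = 1.0;
      col[ncols].i = i + 1; col[ncols].j = j;     col[ncols].k = k;     vals[ncols++] = 1.0;
      col[ncols].i = i;     col[ncols].j = j - 1; col[ncols].k = k;     vals[ncols++] = 1.0;
      col[ncols].i = i;     col[ncols].j = j + 1; col[ncols].k = k;     vals[ncols++] = 1.0;
      col[ncols].i = i;     col[ncols].j = j;     col[ncols].k = k - 1; vals[ncols++] = 1.0;
      col[ncols].i = i;     col[ncols].j = j;     col[ncols].k = k + 1; vals[ncols++] = 1.0;
      PetscCall(MatSetValuesStencil(K, 1, &row, ncols, col, vals, ADD_VALUES));
    }
PetscCall(MatAssemblyBegin(K, MAT_FINAL_ASSEMBLY));
PetscCall(MatAssemblyEnd(K, MAT_FINAL_ASSEMBLY));

With a working 3-D mirror boundary one would expect the corner row to come out as (-6, 2, 2, 2), by analogy with the 2-D rows shown above; the (-3, 1, 1, 1) row is the symptom of the ghost contribution landing on the boundary node itself.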
You can compare DMSetUp_DA_2D() and DMSetUp_DA_3D() to see what is missing and see if you can determine how to add it for 3d. > > Barry > > > > On Jan 4, 2024, at 1:08?PM, Gourav Kumbhojkar > wrote: > > Hi, > > I am trying to implement a No-flux boundary condition for a 3D domain. I previously modeled a no flux boundary in 2D domain using DMDACreate2D and ?PETSC_BOUNDARY_MIRROR? which worked well. > However, the manual pages say that the Mirror Boundary is not supported for 3D. > Could you please point me to the right resources to implement no flux boundary condition in 3D domains? > > Regards, > Gourav K. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Tue Jan 9 09:49:07 2024 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 9 Jan 2024 10:49:07 -0500 Subject: [petsc-users] SNES seems not use my matrix-free operation In-Reply-To: <49913a4e-b55d-46b2-9141-a6c1d1ca7cf7@mpie.de> References: <7B0A8642-74DC-44FF-906A-E11FDB95C331@petsc.dev> <49913a4e-b55d-46b2-9141-a6c1d1ca7cf7@mpie.de> Message-ID: <5EA5452B-7EDE-4D47-8A6C-1D03AA57AD58@petsc.dev> > However, my GK_op (which is the reloaded MATOP_MULT) gives me some problem. It is entered but then crashed with Segmentation Violation error. So I guess my indices may be wrong. I wonder do I need to use the local Vec (of dF), and should my output Vec also be in the correct shape (i.e. after calculation I need to transform back into a Vec)? As you can see here, my dF is a tensor defined on every grid point. > The input for the matrix-vector product is a global vector, as is the result. (Not like the arguments to DMSNESSetJacobianLocal). This means that your MATOP_MULT function needs to do the DMGlobalToLocal() vector operation first then the "unwrapping" from the vector to the array format at the beginning of the routine. Similarly it needs to "unwrap" the result vector as an array. See src/snes/tutorials/ex14f.F90 and in particular the code block PetscCall(DMGlobalToLocalBegin(da,X,INSERT_VALUES,localX,ierr)) PetscCall(DMGlobalToLocalEnd(da,X,INSERT_VALUES,localX,ierr)) ! Get pointers to vector data PetscCall(VecGetArrayReadF90(localX,xx,ierr)) PetscCall(VecGetArrayF90(F,ff,ierr)) Barry You really shouldn't be using DMSNESSetJacobianLocal() for your code. Basically all the DMSNESSetJacobianLocal() gives you is that it automatically handles the global to local mapping and unwrapping of the vector to an array, but it doesn't work for shell matrices. > On Jan 9, 2024, at 6:30?AM, Yi Hu wrote: > > Dear Barry, > > Thanks for your help. > > It works when doing first SNESSetJacobian() with my created shell matrix Jac in the main (or module) and then DMSNESSetJacobianLocal() to associate with my DM and an dummy formJacobian callback (which is doing nothing). My SNES can now recognize my shell matrix and do my customized operation. > > However, my GK_op (which is the reloaded MATOP_MULT) gives me some problem. It is entered but then crashed with Segmentation Violation error. So I guess my indices may be wrong. I wonder do I need to use the local Vec (of dF), and should my output Vec also be in the correct shape (i.e. after calculation I need to transform back into a Vec)? As you can see here, my dF is a tensor defined on every grid point. 
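Here is a hedged C sketch of the global-to-local handling described above for a shell MATOP_MULT (the context type and names are invented; a Fortran version would use the F90 array routines quoted from ex14f): the input and the result of the multiply are global vectors, so the routine scatters the input to a ghosted local vector itself before accessing the data, and writes the result through the global output vector.

static PetscErrorCode ShellMult(Mat J, Vec dF, Vec y)
{
  AppCtx            *user;   /* hypothetical context holding the DMDA, as in the earlier sketch */
  Vec                dFloc;
  const PetscScalar *in;
  PetscScalar       *out;

  PetscFunctionBeginUser;
  PetscCall(MatShellGetContext(J, &user));
  PetscCall(DMGetLocalVector(user->da, &dFloc));
  PetscCall(DMGlobalToLocalBegin(user->da, dF, INSERT_VALUES, dFloc));
  PetscCall(DMGlobalToLocalEnd(user->da, dF, INSERT_VALUES, dFloc));
  PetscCall(VecGetArrayRead(dFloc, &in));   /* ghosted input in local ordering */
  PetscCall(VecGetArray(y, &out));          /* global output, owned entries only */
  /* ... apply the linearized operator to "in", writing the result into "out" ... */
  PetscCall(VecRestoreArrayRead(dFloc, &in));
  PetscCall(VecRestoreArray(y, &out));
  PetscCall(DMRestoreLocalVector(user->da, &dFloc));
  PetscFunctionReturn(0);
}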
> > Best wishes, > Yi > > From: Barry Smith > > Sent: Monday, January 8, 2024 6:41 PM > To: Yi Hu > > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] SNES seems not use my matrix-free operation > > > "formJacobian" should not be __creating__ the matrices. Here "form" means computing the numerical values in the matrix (or when using a shell matrix it means keeping a copy of X so that your custom matrix-free multiply knows the base location where the matrix free Jacobian-vector products are computed.) > > You create the shell matrices up in your main program and pass them in with SNESSetJacobian(). > > Try first calling SNESSetJacobian() to provide the matrices (provide a dummy function argument) and then call DMSNESSetJacobianLocal() to provide your "formjacobian" function (that does not create the matrices). > > Barry > > > Yes, "form" is a bad word that should not have been used in our code. > > > > > On Jan 8, 2024, at 12:24?PM, Yi Hu > wrote: > > Dear PETSc Experts, > > I am implementing a matrix-free jacobian for my SNES solver in Fortran. (command line option -snes_type newtonls -ksp_type gmres) > > In the main program, I define my residual and jacobian and matrix-free jacobian like the following, > > ? > call DMDASNESSetFunctionLocal(DM_mech, INSERT_VALUES, formResidual, PETSC_NULL_SNES, err_PETSc) > call DMSNESSetJacobianLocal(DM_mech, formJacobian, PETSC_NULL_SNES, err_PETSc) > ? > > subroutine formJacobian(residual_subdomain,F,Jac_pre,Jac,dummy,err_PETSc) > > #include > use petscmat > implicit None > DMDALocalInfo, dimension(DMDA_LOCAL_INFO_SIZE) :: & > residual_subdomain !< DMDA info (needs to be named "in" for macros like XRANGE to work) > real(pREAL), dimension(3,3,cells(1),cells(2),cells3), intent(in) :: & > F !< deformation gradient field > Mat :: Jac, Jac_pre > PetscObject :: dummy > PetscErrorCode :: err_PETSc > PetscInt :: N_dof ! global number of DoF, maybe only a placeholder > > N_dof = 9*product(cells(1:2))*cells3 > > print*, 'in my jac' > > call MatCreateShell(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,N_dof,N_dof,0,Jac,err_PETSc) > CHKERRQ(err_PETSc) > call MatShellSetOperation(Jac,MATOP_MULT,GK_op,err_PETSc) > CHKERRQ(err_PETSc) > > print*, 'in my jac' > > ! for jac preconditioner > call MatCreateShell(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,N_dof,N_dof,0,Jac_pre,err_PETSc) > CHKERRQ(err_PETSc) > call MatShellSetOperation(Jac_pre,MATOP_MULT,GK_op,err_PETSc) > CHKERRQ(err_PETSc) > > print*, 'in my jac' > > end subroutine formJacobian > > subroutine GK_op(Jac,dF,output,err_PETSc) > real(pREAL), dimension(3,3,cells(1),cells(2),cells3), intent(in) :: & > dF > real(pREAL), dimension(3,3,cells(1),cells(2),cells3), intent(out) :: & > output > real(pREAL), dimension(3,3) :: & > deltaF_aim = 0.0_pREAL > > Mat :: Jac > PetscErrorCode :: err_PETSc > > integer :: i, j, k, e > > ? a lot of calculations ? > > print*, 'in GK op' > > end subroutine GK_op > > The first question is that: it seems I still need to explicitly define the interface of MatCreateShell() and MatShellSetOperation() to properly use them, even though I include them via ?use petscmat?. It is a little bit strange to me, since some examples do not perform this step. > > Then the main issue is that I can build my own Jacobian from my call back function formJacobian, and confirm my Jacobian is a shell matrix (by MatView). However, my customized operator GK_op is not called when solving the nonlinear system (not print my ?in GK op?). 
When I try to monitor my SNES, it gives me some conventional output not mentioning my matrix-free operations. So I guess my customized MATOP_MULT may be not associated with Jacobian. Or my configuration is somehow wrong. Could you help me solve this issue? > > Thanks, > Yi > > > > ------------------------------------------------- > Stay up to date and follow us on LinkedIn, Twitter and YouTube. > > Max-Planck-Institut f?r Eisenforschung GmbH > Max-Planck-Stra?e 1 > D-40237 D?sseldorf > > Handelsregister B 2533 > Amtsgericht D?sseldorf > > Gesch?ftsf?hrung > Prof. Dr. Gerhard Dehm > Prof. Dr. J?rg Neugebauer > Prof. Dr. Dierk Raabe > Dr. Kai de Weldige > > Ust.-Id.-Nr.: DE 11 93 58 514 > Steuernummer: 105 5891 1000 > > > Please consider that invitations and e-mails of our institute are > only valid if they end with ?@mpie.de. > If you are not sure of the validity please contact rco at mpie.de > > Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails > aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. > In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de > ------------------------------------------------- > > > > ------------------------------------------------- > Stay up to date and follow us on LinkedIn, Twitter and YouTube. > > Max-Planck-Institut f?r Eisenforschung GmbH > Max-Planck-Stra?e 1 > D-40237 D?sseldorf > > Handelsregister B 2533 > Amtsgericht D?sseldorf > > Gesch?ftsf?hrung > Prof. Dr. Gerhard Dehm > Prof. Dr. J?rg Neugebauer > Prof. Dr. Dierk Raabe > Dr. Kai de Weldige > > Ust.-Id.-Nr.: DE 11 93 58 514 > Steuernummer: 105 5891 1000 > > > Please consider that invitations and e-mails of our institute are > only valid if they end with ?@mpie.de. > If you are not sure of the validity please contact rco at mpie.de > > Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails > aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. > In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de > ------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From s_g at berkeley.edu Tue Jan 9 15:46:03 2024 From: s_g at berkeley.edu (Sanjay Govindjee) Date: Tue, 9 Jan 2024 15:46:03 -0600 Subject: [petsc-users] M2 macs Message-ID: I was wondering if anyone has build experience with PETSc + FORTRAN on an M2-based MAC?? In particular, I am looking for compiler recommendations. -sanjay From balay at mcs.anl.gov Tue Jan 9 15:54:59 2024 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 9 Jan 2024 15:54:59 -0600 (CST) Subject: [petsc-users] M2 macs In-Reply-To: References: Message-ID: The usual xcode/clang + brew/gfortran should work. https://gitlab.com/petsc/petsc/-/jobs/5895519334 https://gitlab.com/petsc/petsc/-/jobs/5895519414 There can be issues - not all CI builds work in M2 - with latest xcode [when I tried this previously] - so some CI jobs are still on Intel/Mac [with older xcode] Satish On Tue, 9 Jan 2024, Sanjay Govindjee via petsc-users wrote: > I was wondering if anyone has build experience with PETSc + FORTRAN on an > M2-based MAC?? In particular, I am looking for compiler recommendations. > > -sanjay > > From samar.khatiwala at earth.ox.ac.uk Tue Jan 9 15:55:37 2024 From: samar.khatiwala at earth.ox.ac.uk (Samar Khatiwala) Date: Tue, 9 Jan 2024 21:55:37 +0000 Subject: [petsc-users] M2 macs In-Reply-To: References: Message-ID: Hi Sanjay, I?ve done this with ifort on my M2 MacBook Air and reported on my experience in this list some months ago. I attach the message below. 
Samar Begin forwarded message: From: Samar Khatiwala Subject: Re: [petsc-users] PETSc build asks for network connections Date: April 28, 2023 at 5:43:44 PM GMT+1 To: Pierre Jolivet Cc: petsc-users Hi, I realize this is an old thread but I have some recent experience based on setting up an M2 Mac that might be relevant. I was dreading moving to Apple Silicon Macs because of issues like these but I actually did not run into this particular problem. While I can?t be certain I think it is because in the process of installing another piece of software I had to modify Apple?s security restrictions to make them more permissive. Details of how to do this are in the following and it takes only a minute to implement: https://rogueamoeba.com/support/knowledgebase/?showArticle=ACE-StepByStep&product=Audio+Hijack Incidentally, I built mpich from source followed by PETSc in the usual way. Something else that might be helpful for others is my experience getting ifort to work. (My needs were somewhat specific: mixed fortran/C code, preferably ifort, and avoid package managers.) The intel OneAPI installer ran smoothly (via rosetta) but when building mpich (or PETSc) I ran into an obvious problem: clang produces arm64 object files while ifort produces x86 ones. I couldn?t manage to set the correct CFLAGS to tell clang to target x86. Instead, the (simpler) solution turned out to be (1) the fact that all the executables in Apple?s toolchain are universal binaries, and (2) the ?arch? command can let you run programs for any of the two architectures. Specifically, executing in the terminal: arch -x86_64 bash starts a bash shell and *every* program that is then run from that shell is automatically the x86 version. So I could then do: FC=ifort ./configure --prefix=/usr/local/mpichx86 --enable-two-level-namespace make sudo make install and get an x86 build of mpich which I could then use (from the same shell or a new one started as above) to build [x86] PETSc. Except for some annoying warnings from MKL (I think because it is confused what architecture it is running on) everything runs smoothly and - even in emulation - surprisingly fast. Sorry if this is all well know and already documented on PETSc?s install page. Samar On Jan 9, 2024, at 9:46 PM, Sanjay Govindjee via petsc-users wrote: I was wondering if anyone has build experience with PETSc + FORTRAN on an M2-based MAC? In particular, I am looking for compiler recommendations. -sanjay -------------- next part -------------- An HTML attachment was scrubbed... URL: From thatismybc at protonmail.com Tue Jan 9 16:22:29 2024 From: thatismybc at protonmail.com (johnny) Date: Tue, 09 Jan 2024 22:22:29 +0000 Subject: [petsc-users] Solving reduced system of equations with global preconditioner Message-ID: Hi Mike, I hope this message finds you well. I wanted to reach out regarding the topic of solving reduced systems of equations with a global preconditioner. This approach proves to be highly effective in improving the efficiency of numerical methods for solving complex systems. To delve into this further, consider implementing a global preconditioner to enhance the convergence of your iterative solver when dealing with reduced systems. By doing so, you can significantly accelerate the overall solution process, especially in scenarios with large-scale and interconnected equations. 
Feel free to explore various preconditioning strategies and adapt them to your specific problem[.](https://artcoolz.com/good-morning-texts.html) Experimentation is key to finding the most suitable approach for your application. If you have any questions or need further assistance on this topic, please don't hesitate to reach out. Best regards, Johnny. -------------- next part -------------- An HTML attachment was scrubbed... URL: From thatismybc at protonmail.com Tue Jan 9 16:37:27 2024 From: thatismybc at protonmail.com (johnny) Date: Tue, 09 Jan 2024 22:37:27 +0000 Subject: [petsc-users] Solving reduced system of equations with global preconditioner Message-ID: Dear Mr. Johnny, I trust this email finds you in good health. I am writing to provide information on the efficient resolution of reduced systems of equations through the incorporation of a global preconditioner. To optimize the convergence of iterative solvers, consider the utilization of a global preconditioner, particularly when dealing with intricate and interconnected equation systems. This strategy can significantly enhance the computational efficiency and accelerate the solution process, especially in scenarios involving large-scale equations. I encourage you to explore various preconditioning techniques and tailor them to the specifics of your problem. Experimentation is essential in identifying the most suitable approach for your application. Should you have any queries or require further clarification on this subject, please do not hesitate to contact me. https://howisdom.com/inside-of-a-blueberry/ Sincerely. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Tue Jan 9 20:59:00 2024 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 9 Jan 2024 21:59:00 -0500 Subject: [petsc-users] Neumann Boundary Condition with DMDACreate3D In-Reply-To: References: <0A4EB78C-997F-4978-8945-771B351B08CE@petsc.dev> <1130EB7D-5472-434C-A700-D424381940B5@petsc.dev> Message-ID: <2E18FABC-B59F-4F02-9BD6-0DFBBD4FB878@petsc.dev> Sorry for the delay. The fix is in the git branch barry/2024-01-09/fix-mirror-dmda-3d/release see also https://gitlab.com/petsc/petsc/-/merge_requests/7175 Barry > On Jan 8, 2024, at 3:44?PM, Gourav Kumbhojkar wrote: > > You are right. Attaching the code that I?m using to test this. > The output matrix is saved in separate ascii files. > You can use ?make noflux? to compile the code. > > Gourav > > From: Barry Smith > > Date: Saturday, January 6, 2024 at 7:08?PM > To: Gourav Kumbhojkar > > Cc: petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] Neumann Boundary Condition with DMDACreate3D > > > If the mirror code for star stencil is just wrong in 3d we should simply fix it. Not use some other approach. Can you attach code that tries to do what you need for both 2d (that results in a matrix you are happy with) and 3d (that results in a matrix that you are not happy with). > > Barry > > > > > > On Jan 6, 2024, at 7:30?PM, Gourav Kumbhojkar > wrote: > > Thank you, Barry. Sorry for the late response. > > Yes, I was referring to the same text. I am using a star stencil. However, I don?t think the mirror condition is implemented for star stencil either. > > TLDR version of the whole message typed below ? > I think DM_BOUNDARY_GHOSTED is not implemented correctly in 3D. It appears that ghost nodes are mirrored with boundary nodes themselves. They should mirror with the nodes next to boundary. > > Long version - > Here?s what I?m trying to do ? 
> > Step 1 - Create a 3D DM > ierr = DMDACreate3d(PETSC_COMM_WORLD, DM_BOUNDARY_MIRROR, DM_BOUNDARY_MIRROR, DM_BOUNDARY_MIRROR, DMDA_STENCIL_STAR, num_pts, num_pts, num_pts, PETSC_DECIDE, PETSC_DECIDE, PETSC_DECIDE, 1, 1, NULL, NULL, NULL, &da); CHKERRQ(ierr); > Note - num_pts = 4 in my code. > > Step 2 ? Create a Matrix from DM ( a FDM stiffness matrix) > DMCreateMatrix(da, &K); > globalKMat(K, info); > > ?globalKMat? is a user-defined function. Here?s a snippet from this function: > for (int i = info.xs; i < (info.xs + info.xm); i++){ > for(int j = info.ys; j < (info.ys + info.ym); j++){ > for (int k = info.zs; k < (info.zs + info.zm); k++){ > ncols = 0; > row.i = i; row.j = j; row.k = k; > > col[0].i = i; col[0].j = j; col[0].k = k; > vals[ncols++] = -6.; //ncols=1 > > col[ncols].i = i-1; col[ncols].j = j; col[ncols].k = k; > vals[ncols++] = 1.;//ncols=2 > > There are total 7 ?ncols?. Other than the first one all ncols have value 1 (first one is set to -6). As you can see, this step is to only build the FDM stiffness matrix. I use ?ADD_VALUES? at the end in the above function. > > Step 3 ? View the stiffness matrix to check the values. I use MatView for this. > > Here are the results ? > 3D DM (showing first few rows of the stiffness matrix here, the original matrix is 64x64)- > Mat Object: 1 MPI processes > type: seqaij > row 0: (0, -3.) (1, 1.) (4, 1.) (16, 1.) > row 1: (0, 1.) (1, -4.) (2, 1.) (5, 1.) (17, 1.) > row 2: (1, 1.) (2, -4.) (3, 1.) (6, 1.) (18, 1.) > row 3: (2, 1.) (3, -3.) (7, 1.) (19, 1.) > row 4: (0, 1.) (4, -4.) (5, 1.) (8, 1.) (20, 1.) > row 5: (1, 1.) (4, 1.) (5, -5.) (6, 1.) (9, 1.) (21, 1.) > > Repeat the same steps for a 2D DM to show the difference (the entire matrix is now 16x16) > Mat Object: 1 MPI processes > type: seqaij > row 0: (0, -4.) (1, 2.) (4, 2.) > row 1: (0, 1.) (1, -4.) (2, 1.) (5, 2.) > row 2: (1, 1.) (2, -4.) (3, 1.) (6, 2.) > row 3: (2, 2.) (3, -4.) (7, 2.) > row 4: (0, 1.) (4, -4.) (5, 2.) (8, 1.) > row 5: (1, 1.) (4, 1.) (5, -4.) (6, 1.) (9, 1.) > > I suspect that when using ?DM_BOUNDARY_MIRROR? in 3D, the ghost node value is added to the boundary node itself, which would explain why row 0 of the stiffness matrix has -3 instead of -6. In principle the ghost node value should be mirrored with the node next to boundary. > Clearly, there?s no issue with the 2D implementation of the mirror boundary. The row 0 values are -4, 2, and 2 as expected. > > Let me know if I should give any other information about this. I also thought about using DM_BOUNDARY_GHOSTED and implement the mirror boundary in 3D from scratch but I would really appreciate some resources on how to do that. > > Thank you. > > Gourav > > > From: Barry Smith > > Date: Thursday, January 4, 2024 at 12:24?PM > To: Gourav Kumbhojkar > > Cc: petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] Neumann Boundary Condition with DMDACreate3D > > > Are you referring to the text? > > . `DM_BOUNDARY_MIRROR` - the ghost value is the same as the value 1 grid point in; that is, the 0th grid point in the real mesh acts like a mirror to define > the ghost point value; not yet implemented for 3d > > > Looking at the code for DMSetUp_DA_3D() I see > > PetscCheck(stencil_type != DMDA_STENCIL_BOX || (bx != DM_BOUNDARY_MIRROR && by != DM_BOUNDARY_MIRROR && bz != DM_BOUNDARY_MIRROR), PetscObjectComm((PetscObject)da), PETSC_ERR_SUP, "Mirror boundary and box stencil"); > > which seems (to me) to indicate the mirroring is not done for box stencils but should work for star stencils. 
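If it helps to confirm the behaviour programmatically, here is a small hedged C check (assuming the 4x4x4 example above, so global row 0 is a corner and is owned by rank 0) that prints the assembled corner row after MatAssemblyEnd:

const PetscInt    *cols;
const PetscScalar *rowvals;
PetscInt           nc;

/* MatGetRow is only valid on the rank that owns the row */
PetscCall(MatGetRow(K, 0, &nc, &cols, &rowvals));
for (PetscInt c = 0; c < nc; c++)
  PetscCall(PetscPrintf(PETSC_COMM_SELF, "(%" PetscInt_FMT ", %g) ", cols[c], (double)PetscRealPart(rowvals[c])));
PetscCall(PetscPrintf(PETSC_COMM_SELF, "\n"));
PetscCall(MatRestoreRow(K, 0, &nc, &cols, &rowvals));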
> > Are you using a star stencil or a box stencil? > > I believe the code is not complete for box stencil because the code to determine the location of the "mirrored point" for extra "box points" is messy in 3d and no one wrote it. You can compare DMSetUp_DA_2D() and DMSetUp_DA_3D() to see what is missing and see if you can determine how to add it for 3d. > > Barry > > > > On Jan 4, 2024, at 1:08?PM, Gourav Kumbhojkar > wrote: > > Hi, > > I am trying to implement a No-flux boundary condition for a 3D domain. I previously modeled a no flux boundary in 2D domain using DMDACreate2D and ?PETSC_BOUNDARY_MIRROR? which worked well. > However, the manual pages say that the Mirror Boundary is not supported for 3D. > Could you please point me to the right resources to implement no flux boundary condition in 3D domains? > > Regards, > Gourav K. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gourav.kumbhojkar at gmail.com Tue Jan 9 23:50:49 2024 From: gourav.kumbhojkar at gmail.com (Gourav Kumbhojkar) Date: Wed, 10 Jan 2024 05:50:49 +0000 Subject: [petsc-users] Neumann Boundary Condition with DMDACreate3D In-Reply-To: <2E18FABC-B59F-4F02-9BD6-0DFBBD4FB878@petsc.dev> References: <0A4EB78C-997F-4978-8945-771B351B08CE@petsc.dev> <1130EB7D-5472-434C-A700-D424381940B5@petsc.dev> <2E18FABC-B59F-4F02-9BD6-0DFBBD4FB878@petsc.dev> Message-ID: Thank you very much for the fix. I?ll also update here as soon as I test it on my application code. Many thanks. Gourav From: Barry Smith Date: Tuesday, January 9, 2024 at 8:59?PM To: Gourav Kumbhojkar Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Neumann Boundary Condition with DMDACreate3D Sorry for the delay. The fix is in the git branch barry/2024-01-09/fix-mirror-dmda-3d/release see also https://gitlab.com/petsc/petsc/-/merge_requests/7175 Barry On Jan 8, 2024, at 3:44?PM, Gourav Kumbhojkar wrote: You are right. Attaching the code that I?m using to test this. The output matrix is saved in separate ascii files. You can use ?make noflux? to compile the code. Gourav From: Barry Smith > Date: Saturday, January 6, 2024 at 7:08?PM To: Gourav Kumbhojkar > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Neumann Boundary Condition with DMDACreate3D If the mirror code for star stencil is just wrong in 3d we should simply fix it. Not use some other approach. Can you attach code that tries to do what you need for both 2d (that results in a matrix you are happy with) and 3d (that results in a matrix that you are not happy with). Barry On Jan 6, 2024, at 7:30?PM, Gourav Kumbhojkar > wrote: Thank you, Barry. Sorry for the late response. Yes, I was referring to the same text. I am using a star stencil. However, I don?t think the mirror condition is implemented for star stencil either. TLDR version of the whole message typed below ? I think DM_BOUNDARY_GHOSTED is not implemented correctly in 3D. It appears that ghost nodes are mirrored with boundary nodes themselves. They should mirror with the nodes next to boundary. Long version - Here?s what I?m trying to do ? Step 1 - Create a 3D DM ierr = DMDACreate3d(PETSC_COMM_WORLD, DM_BOUNDARY_MIRROR, DM_BOUNDARY_MIRROR, DM_BOUNDARY_MIRROR, DMDA_STENCIL_STAR, num_pts, num_pts, num_pts, PETSC_DECIDE, PETSC_DECIDE, PETSC_DECIDE, 1, 1, NULL, NULL, NULL, &da); CHKERRQ(ierr); Note - num_pts = 4 in my code. Step 2 ? Create a Matrix from DM ( a FDM stiffness matrix) DMCreateMatrix(da, &K); globalKMat(K, info); ?globalKMat? is a user-defined function. 
Here?s a snippet from this function: for (int i = info.xs; i < (info.xs + info.xm); i++){ for(int j = info.ys; j < (info.ys + info.ym); j++){ for (int k = info.zs; k < (info.zs + info.zm); k++){ ncols = 0; row.i = i; row.j = j; row.k = k; col[0].i = i; col[0].j = j; col[0].k = k; vals[ncols++] = -6.; //ncols=1 col[ncols].i = i-1; col[ncols].j = j; col[ncols].k = k; vals[ncols++] = 1.;//ncols=2 There are total 7 ?ncols?. Other than the first one all ncols have value 1 (first one is set to -6). As you can see, this step is to only build the FDM stiffness matrix. I use ?ADD_VALUES? at the end in the above function. Step 3 ? View the stiffness matrix to check the values. I use MatView for this. Here are the results ? 1. 3D DM (showing first few rows of the stiffness matrix here, the original matrix is 64x64)- Mat Object: 1 MPI processes type: seqaij row 0: (0, -3.) (1, 1.) (4, 1.) (16, 1.) row 1: (0, 1.) (1, -4.) (2, 1.) (5, 1.) (17, 1.) row 2: (1, 1.) (2, -4.) (3, 1.) (6, 1.) (18, 1.) row 3: (2, 1.) (3, -3.) (7, 1.) (19, 1.) row 4: (0, 1.) (4, -4.) (5, 1.) (8, 1.) (20, 1.) row 5: (1, 1.) (4, 1.) (5, -5.) (6, 1.) (9, 1.) (21, 1.) 1. Repeat the same steps for a 2D DM to show the difference (the entire matrix is now 16x16) Mat Object: 1 MPI processes type: seqaij row 0: (0, -4.) (1, 2.) (4, 2.) row 1: (0, 1.) (1, -4.) (2, 1.) (5, 2.) row 2: (1, 1.) (2, -4.) (3, 1.) (6, 2.) row 3: (2, 2.) (3, -4.) (7, 2.) row 4: (0, 1.) (4, -4.) (5, 2.) (8, 1.) row 5: (1, 1.) (4, 1.) (5, -4.) (6, 1.) (9, 1.) I suspect that when using ?DM_BOUNDARY_MIRROR? in 3D, the ghost node value is added to the boundary node itself, which would explain why row 0 of the stiffness matrix has -3 instead of -6. In principle the ghost node value should be mirrored with the node next to boundary. Clearly, there?s no issue with the 2D implementation of the mirror boundary. The row 0 values are -4, 2, and 2 as expected. Let me know if I should give any other information about this. I also thought about using DM_BOUNDARY_GHOSTED and implement the mirror boundary in 3D from scratch but I would really appreciate some resources on how to do that. Thank you. Gourav From: Barry Smith > Date: Thursday, January 4, 2024 at 12:24?PM To: Gourav Kumbhojkar > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Neumann Boundary Condition with DMDACreate3D Are you referring to the text? . `DM_BOUNDARY_MIRROR` - the ghost value is the same as the value 1 grid point in; that is, the 0th grid point in the real mesh acts like a mirror to define the ghost point value; not yet implemented for 3d Looking at the code for DMSetUp_DA_3D() I see PetscCheck(stencil_type != DMDA_STENCIL_BOX || (bx != DM_BOUNDARY_MIRROR && by != DM_BOUNDARY_MIRROR && bz != DM_BOUNDARY_MIRROR), PetscObjectComm((PetscObject)da), PETSC_ERR_SUP, "Mirror boundary and box stencil"); which seems (to me) to indicate the mirroring is not done for box stencils but should work for star stencils. Are you using a star stencil or a box stencil? I believe the code is not complete for box stencil because the code to determine the location of the "mirrored point" for extra "box points" is messy in 3d and no one wrote it. You can compare DMSetUp_DA_2D() and DMSetUp_DA_3D() to see what is missing and see if you can determine how to add it for 3d. Barry On Jan 4, 2024, at 1:08?PM, Gourav Kumbhojkar > wrote: Hi, I am trying to implement a No-flux boundary condition for a 3D domain. I previously modeled a no flux boundary in 2D domain using DMDACreate2D and ?PETSC_BOUNDARY_MIRROR? 
which worked well. However, the manual pages say that the Mirror Boundary is not supported for 3D. Could you please point me to the right resources to implement no flux boundary condition in 3D domains? Regards, Gourav K. -------------- next part -------------- An HTML attachment was scrubbed... URL: From y.hu at mpie.de Wed Jan 10 05:20:19 2024 From: y.hu at mpie.de (Yi Hu) Date: Wed, 10 Jan 2024 12:20:19 +0100 Subject: [petsc-users] SNES seems not use my matrix-free operation In-Reply-To: <5EA5452B-7EDE-4D47-8A6C-1D03AA57AD58@petsc.dev> References: <7B0A8642-74DC-44FF-906A-E11FDB95C331@petsc.dev> <49913a4e-b55d-46b2-9141-a6c1d1ca7cf7@mpie.de> <5EA5452B-7EDE-4D47-8A6C-1D03AA57AD58@petsc.dev> Message-ID: Thanks for the clarification. It is more clear to me now about the global to local processes after checking the examples, e.g. ksp/ksp/tutorials/ex14f.F90. And for using Vec locally, I followed your advice of VecGet.. and VecRestore? In fact I used DMDAVecGetArrayReadF90() and some other relevant subroutines. For your comment on DMSNESSetJacobianLocal(). It seems that I need to use both SNESSetJacobian() and then DMSNESSetJacobianLocal() to get things working. When I do only SNESSetJacobian(), it does not work, meaning the following does not work ?? ??call DMDASNESsetFunctionLocal(DM_mech,INSERT_VALUES,formResidual,PETSC_NULL_SNES,err_PETSc)?????? ??CHKERRQ(err_PETSc) ? call MatCreateShell(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,& ????????????????????? 9*product(cells(1:2))*cells3,9*product(cells(1:2))*cells3,& ????????????????????? 0,Jac_PETSc,err_PETSc) ? CHKERRQ(err_PETSc) ? call MatShellSetOperation(Jac_PETSc,MATOP_MULT,GK_op,err_PETSc) ? CHKERRQ(err_PETSc) ? call SNESSetJacobian(SNES_mech,Jac_PETSc,Jac_PETSc,PETSC_NULL_FUNCTION,0,err_PETSc) ? CHKERRQ(err_PETSc) ? !call DMSNESsetJacobianLocal(DM_mech,formJacobian,PETSC_NULL_SNES,err_PETSc) ? !CHKERRQ(err_PETSc) ? call SNESsetConvergenceTest(SNES_mech,converged,PETSC_NULL_SNES,PETSC_NULL_FUNCTION,err_PETSc)??? ??CHKERRQ(err_PETSc) ? call SNESSetDM(SNES_mech,DM_mech,err_PETSc) ? CHKERRQ(err_PETSc) ?? It gives me the message [0]PETSC ERROR: No support for this operation for this object type????????????????????????????????? ???? [0]PETSC ERROR: Code not yet written for matrix type shell????????????????????????????????????????????? [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.??????????????????????????????? [0]PETSC ERROR: Petsc Release Version 3.16.4, Feb 02, 2022 [0]PETSC ERROR: Configure options PETSC_ARCH=linux-gnu --with-fortran-bindings --with-mpi-f90module-visibility=0 --download-fftw --download-hdf5 --download-hdf5-fortran-bindings --download-fblaslapack --download-ml --download-zlib?????????? ??????????????????????????????????????????????????????????????????????? [0]PETSC ERROR: #1 MatFDColoringCreate() at /home/yi/app/petsc-3.16.4/src/mat/matfd/fdmatrix.c:471????? [0]PETSC ERROR: #2 SNESComputeJacobian_DMDA() at /home/yi/app/petsc-3.16.4/src/snes/utils/dmdasnes.c:173[0]PETSC ERROR: #3 SNESComputeJacobian() at /home/yi/app/petsc-3.16.4/src/snes/interface/snes.c:2864??? [0]PETSC ERROR: #4 SNESSolve_NEWTONLS() at /home/yi/app/petsc-3.16.4/src/snes/impls/ls/ls.c:222???????? [0]PETSC ERROR: #5 SNESSolve() at /home/yi/app/petsc-3.16.4/src/snes/interface/snes.c:4809????????????? [0]PETSC ERROR: #6 User provided function() at User file:0????????????????????????????????????????????? [0]PETSC ERROR: #7 VecSetErrorIfLocked() at /home/yi/app/petsc-3.16.4/include/petscvec.h:623??????????? 
[0]PETSC ERROR: #8 VecGetArray() at /home/yi/app/petsc-3.16.4/src/vec/vec/interface/rvector.c:1769????? [0]PETSC ERROR: #9 User provided function() at User file:0????????????????????????????????????????????? [0]PETSC ERROR: #10 MatFDColoringCreate() at /home/yi/app/petsc-3.16.4/src/mat/matfd/fdmatrix.c:471???? [0]PETSC ERROR: #11 SNESComputeJacobian_DMDA() at /home/yi/app/petsc-3.16.4/src/snes/utils/dmdasnes.c:173??????????????????????????????????????????????????????????????? ??????????????????????????????????????? [0]PETSC ERROR: #12 SNESComputeJacobian() at /home/yi/app/petsc-3.16.4/src/snes/interface/snes.c:2864?? [0]PETSC ERROR: #13 SNESSolve_NEWTONLS() at /home/yi/app/petsc-3.16.4/src/snes/impls/ls/ls.c:222??????? [0]PETSC ERROR: #14 SNESSolve() at /home/yi/app/petsc-3.16.4/src/snes/interface/snes.c:4809 It seems that I have to use a DMSNESSetJacobianLocal() to ?activate? the use of my shell matrix, although the formJacobian() in the DMSNESSetJacobianLocal() is doing nothing. Best wishes, Yi From: Barry Smith Sent: Tuesday, January 9, 2024 4:49 PM To: Yi Hu Cc: petsc-users Subject: Re: [petsc-users] SNES seems not use my matrix-free operation However, my GK_op (which is the reloaded MATOP_MULT) gives me some problem. It is entered but then crashed with Segmentation Violation error. So I guess my indices may be wrong. I wonder do I need to use the local Vec (of dF), and should my output Vec also be in the correct shape (i.e. after calculation I need to transform back into a Vec)? As you can see here, my dF is a tensor defined on every grid point. The input for the matrix-vector product is a global vector, as is the result. (Not like the arguments to DMSNESSetJacobianLocal). This means that your MATOP_MULT function needs to do the DMGlobalToLocal() vector operation first then the "unwrapping" from the vector to the array format at the beginning of the routine. Similarly it needs to "unwrap" the result vector as an array. See src/snes/tutorials/ex14f.F90 and in particular the code block PetscCall(DMGlobalToLocalBegin(da,X,INSERT_VALUES,localX,ierr)) PetscCall(DMGlobalToLocalEnd(da,X,INSERT_VALUES,localX,ierr)) ! Get pointers to vector data PetscCall(VecGetArrayReadF90(localX,xx,ierr)) PetscCall(VecGetArrayF90(F,ff,ierr)) Barry You really shouldn't be using DMSNESSetJacobianLocal() for your code. Basically all the DMSNESSetJacobianLocal() gives you is that it automatically handles the global to local mapping and unwrapping of the vector to an array, but it doesn't work for shell matrices. On Jan 9, 2024, at 6:30?AM, Yi Hu wrote: Dear Barry, Thanks for your help. It works when doing first SNESSetJacobian() with my created shell matrix Jac in the main (or module) and then DMSNESSetJacobianLocal() to associate with my DM and an dummy formJacobian callback (which is doing nothing). My SNES can now recognize my shell matrix and do my customized operation. However, my GK_op (which is the reloaded MATOP_MULT) gives me some problem. It is entered but then crashed with Segmentation Violation error. So I guess my indices may be wrong. I wonder do I need to use the local Vec (of dF), and should my output Vec also be in the correct shape (i.e. after calculation I need to transform back into a Vec)? As you can see here, my dF is a tensor defined on every grid point. Best wishes, Yi From: Barry Smith Sent: Monday, January 8, 2024 6:41 PM To: Yi Hu Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] SNES seems not use my matrix-free operation "formJacobian" should not be __creating__ the matrices. 
Here "form" means computing the numerical values in the matrix (or when using a shell matrix it means keeping a copy of X so that your custom matrix-free multiply knows the base location where the matrix free Jacobian-vector products are computed.) You create the shell matrices up in your main program and pass them in with SNESSetJacobian(). Try first calling SNESSetJacobian() to provide the matrices (provide a dummy function argument) and then call DMSNESSetJacobianLocal() to provide your "formjacobian" function (that does not create the matrices). Barry Yes, "form" is a bad word that should not have been used in our code. On Jan 8, 2024, at 12:24?PM, Yi Hu wrote: Dear PETSc Experts, I am implementing a matrix-free jacobian for my SNES solver in Fortran. (command line option -snes_type newtonls -ksp_type gmres) In the main program, I define my residual and jacobian and matrix-free jacobian like the following, ? call DMDASNESSetFunctionLocal(DM_mech, INSERT_VALUES, formResidual, PETSC_NULL_SNES, err_PETSc) call DMSNESSetJacobianLocal(DM_mech, formJacobian, PETSC_NULL_SNES, err_PETSc) ? subroutine formJacobian(residual_subdomain,F,Jac_pre,Jac,dummy,err_PETSc) #include use petscmat implicit None DMDALocalInfo, dimension(DMDA_LOCAL_INFO_SIZE) :: & residual_subdomain !< DMDA info (needs to be named "in" for macros like XRANGE to work) real(pREAL), dimension(3,3,cells(1),cells(2),cells3), intent(in) :: & F !< deformation gradient field Mat :: Jac, Jac_pre PetscObject :: dummy PetscErrorCode :: err_PETSc PetscInt :: N_dof ! global number of DoF, maybe only a placeholder N_dof = 9*product(cells(1:2))*cells3 print*, 'in my jac' call MatCreateShell(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,N_dof,N_dof,0,Jac,err_PETSc) CHKERRQ(err_PETSc) call MatShellSetOperation(Jac,MATOP_MULT,GK_op,err_PETSc) CHKERRQ(err_PETSc) print*, 'in my jac' ! for jac preconditioner call MatCreateShell(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,N_dof,N_dof,0,Jac_pre,err_PETSc) CHKERRQ(err_PETSc) call MatShellSetOperation(Jac_pre,MATOP_MULT,GK_op,err_PETSc) CHKERRQ(err_PETSc) print*, 'in my jac' end subroutine formJacobian subroutine GK_op(Jac,dF,output,err_PETSc) real(pREAL), dimension(3,3,cells(1),cells(2),cells3), intent(in) :: & dF real(pREAL), dimension(3,3,cells(1),cells(2),cells3), intent(out) :: & output real(pREAL), dimension(3,3) :: & deltaF_aim = 0.0_pREAL Mat :: Jac PetscErrorCode :: err_PETSc integer :: i, j, k, e ? a lot of calculations ? print*, 'in GK op' end subroutine GK_op The first question is that: it seems I still need to explicitly define the interface of MatCreateShell() and MatShellSetOperation() to properly use them, even though I include them via ?use petscmat?. It is a little bit strange to me, since some examples do not perform this step. Then the main issue is that I can build my own Jacobian from my call back function formJacobian, and confirm my Jacobian is a shell matrix (by MatView). However, my customized operator GK_op is not called when solving the nonlinear system (not print my ?in GK op?). When I try to monitor my SNES, it gives me some conventional output not mentioning my matrix-free operations. So I guess my customized MATOP_MULT may be not associated with Jacobian. Or my configuration is somehow wrong. Could you help me solve this issue? Thanks, Yi ------------------------------------------------- Stay up to date and follow us on LinkedIn, Twitter and YouTube. 
Max-Planck-Institut für Eisenforschung GmbH
Max-Planck-Straße 1
D-40237 Düsseldorf

Handelsregister B 2533
Amtsgericht Düsseldorf

Geschäftsführung
Prof. Dr. Gerhard Dehm
Prof. Dr. Jörg Neugebauer
Prof. Dr. Dierk Raabe
Dr. Kai de Weldige

Ust.-Id.-Nr.: DE 11 93 58 514
Steuernummer: 105 5891 1000

Please consider that invitations and e-mails of our institute are only valid if they end with "@mpie.de". If you are not sure of the validity please contact rco at mpie.de

Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails aus unserem Haus nur mit der Endung "@mpie.de" gültig sind. In Zweifelsfällen wenden Sie sich bitte an rco at mpie.de
-------------------------------------------------

From bsmith at petsc.dev Wed Jan 10 09:27:21 2024
From: bsmith at petsc.dev (Barry Smith)
Date: Wed, 10 Jan 2024 10:27:21 -0500
Subject: Re: [petsc-users] SNES seems not use my matrix-free operation
In-Reply-To:
References: <7B0A8642-74DC-44FF-906A-E11FDB95C331@petsc.dev> <49913a4e-b55d-46b2-9141-a6c1d1ca7cf7@mpie.de> <5EA5452B-7EDE-4D47-8A6C-1D03AA57AD58@petsc.dev>
Message-ID: <10A79A13-67A7-4C2E-B6F6-6E73B58856A2@petsc.dev>

   By default, if SNESSetJacobian() is not called with a function pointer, PETSc attempts to compute the Jacobian matrix explicitly with finite differences and coloring. This doesn't make sense with a shell matrix, hence the error message below regarding MatFDColoringCreate(). DMSNESSetJacobianLocal() calls SNESSetJacobian() with a function pointer of SNESComputeJacobian_DMLocal(), so preventing the error from triggering in your code. You can provide your own function to SNESSetJacobian() and thus not need to call DMSNESSetJacobianLocal().
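   For example, a minimal sketch of such a callback could look like the following (a sketch only: the module name, formJacobianShell and base_X are illustrative and not from your code, and base_X is assumed to have been created beforehand with VecDuplicate() from the solution vector). It assembles nothing; it just records the current linearization point so the shell matrix's MATOP_MULT routine can use it:

module shell_jacobian
#include <petsc/finclude/petscsnes.h>
  use petscsnes
  implicit none
  Vec :: base_X   ! linearization point u in J(u)*x, duplicated from the solution vector elsewhere
contains
  subroutine formJacobianShell(snes_arg, X, Jac, Jac_pre, dummy, err_PETSc)
    SNES           :: snes_arg
    Vec            :: X
    Mat            :: Jac, Jac_pre   ! the MATSHELL matrices; nothing to assemble here
    PetscObject    :: dummy
    PetscErrorCode :: err_PETSc

    ! keep a copy of the point at which the Jacobian is being "formed"
    call VecCopy(X, base_X, err_PETSc)
    CHKERRQ(err_PETSc)
  end subroutine formJacobianShell
end module shell_jacobian

   It would then be registered the same way as in your snippet, for example call SNESSetJacobian(SNES_mech, Jac_PETSc, Jac_PETSc, formJacobianShell, 0, err_PETSc), and GK_op can read base_X whenever it needs the current state.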
What you do depends on how you want to record the "base" vector that tells your matrix-free multiply routine where the Jacobian matrix vector product is being applied, that is J(u)*x. u is the "base" vector which is passed to the function provided with SNESSetJacobian(). Barry > On Jan 10, 2024, at 6:20?AM, Yi Hu wrote: > > Thanks for the clarification. It is more clear to me now about the global to local processes after checking the examples, e.g. ksp/ksp/tutorials/ex14f.F90. > > And for using Vec locally, I followed your advice of VecGet.. and VecRestore? In fact I used DMDAVecGetArrayReadF90() and some other relevant subroutines. > > For your comment on DMSNESSetJacobianLocal(). It seems that I need to use both SNESSetJacobian() and then DMSNESSetJacobianLocal() to get things working. When I do only SNESSetJacobian(), it does not work, meaning the following does not work > > ?? > call DMDASNESsetFunctionLocal(DM_mech,INSERT_VALUES,formResidual,PETSC_NULL_SNES,err_PETSc) > CHKERRQ(err_PETSc) > call MatCreateShell(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,& > 9*product(cells(1:2))*cells3,9*product(cells(1:2))*cells3,& > 0,Jac_PETSc,err_PETSc) > CHKERRQ(err_PETSc) > call MatShellSetOperation(Jac_PETSc,MATOP_MULT,GK_op,err_PETSc) > CHKERRQ(err_PETSc) > call SNESSetJacobian(SNES_mech,Jac_PETSc,Jac_PETSc,PETSC_NULL_FUNCTION,0,err_PETSc) > CHKERRQ(err_PETSc) > !call DMSNESsetJacobianLocal(DM_mech,formJacobian,PETSC_NULL_SNES,err_PETSc) > !CHKERRQ(err_PETSc) > call SNESsetConvergenceTest(SNES_mech,converged,PETSC_NULL_SNES,PETSC_NULL_FUNCTION,err_PETSc) > CHKERRQ(err_PETSc) > call SNESSetDM(SNES_mech,DM_mech,err_PETSc) > CHKERRQ(err_PETSc) > ?? > > It gives me the message > [0]PETSC ERROR: No support for this operation for this object type > [0]PETSC ERROR: Code not yet written for matrix type shell > [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
> [0]PETSC ERROR: Petsc Release Version 3.16.4, Feb 02, 2022 > [0]PETSC ERROR: Configure options PETSC_ARCH=linux-gnu --with-fortran-bindings --with-mpi-f90module-visibility=0 --download-fftw --download-hdf5 --download-hdf5-fortran-bindings --download-fblaslapack --download-ml --download-zlib > [0]PETSC ERROR: #1 MatFDColoringCreate() at /home/yi/app/petsc-3.16.4/src/mat/matfd/fdmatrix.c:471 > [0]PETSC ERROR: #2 SNESComputeJacobian_DMDA() at /home/yi/app/petsc-3.16.4/src/snes/utils/dmdasnes.c:173[0]PETSC ERROR: #3 SNESComputeJacobian() at /home/yi/app/petsc-3.16.4/src/snes/interface/snes.c:2864 > [0]PETSC ERROR: #4 SNESSolve_NEWTONLS() at /home/yi/app/petsc-3.16.4/src/snes/impls/ls/ls.c:222 > [0]PETSC ERROR: #5 SNESSolve() at /home/yi/app/petsc-3.16.4/src/snes/interface/snes.c:4809 > [0]PETSC ERROR: #6 User provided function() at User file:0 > [0]PETSC ERROR: #7 VecSetErrorIfLocked() at /home/yi/app/petsc-3.16.4/include/petscvec.h:623 > [0]PETSC ERROR: #8 VecGetArray() at /home/yi/app/petsc-3.16.4/src/vec/vec/interface/rvector.c:1769 > [0]PETSC ERROR: #9 User provided function() at User file:0 > [0]PETSC ERROR: #10 MatFDColoringCreate() at /home/yi/app/petsc-3.16.4/src/mat/matfd/fdmatrix.c:471 > [0]PETSC ERROR: #11 SNESComputeJacobian_DMDA() at /home/yi/app/petsc-3.16.4/src/snes/utils/dmdasnes.c:173 > [0]PETSC ERROR: #12 SNESComputeJacobian() at /home/yi/app/petsc-3.16.4/src/snes/interface/snes.c:2864 > [0]PETSC ERROR: #13 SNESSolve_NEWTONLS() at /home/yi/app/petsc-3.16.4/src/snes/impls/ls/ls.c:222 > [0]PETSC ERROR: #14 SNESSolve() at /home/yi/app/petsc-3.16.4/src/snes/interface/snes.c:4809 > > It seems that I have to use a DMSNESSetJacobianLocal() to ?activate? the use of my shell matrix, although the formJacobian() in the DMSNESSetJacobianLocal() is doing nothing. > > Best wishes, > Yi > > > > From: Barry Smith > > Sent: Tuesday, January 9, 2024 4:49 PM > To: Yi Hu > > Cc: petsc-users > > Subject: Re: [petsc-users] SNES seems not use my matrix-free operation > > However, my GK_op (which is the reloaded MATOP_MULT) gives me some problem. It is entered but then crashed with Segmentation Violation error. So I guess my indices may be wrong. I wonder do I need to use the local Vec (of dF), and should my output Vec also be in the correct shape (i.e. after calculation I need to transform back into a Vec)? As you can see here, my dF is a tensor defined on every grid point. > > > The input for the matrix-vector product is a global vector, as is the result. (Not like the arguments to DMSNESSetJacobianLocal). > > This means that your MATOP_MULT function needs to do the DMGlobalToLocal() vector operation first then the "unwrapping" from the vector to the array format at the beginning of the routine. Similarly it needs to "unwrap" the result vector as an array. See src/snes/tutorials/ex14f.F90 and in particular the code block > > PetscCall(DMGlobalToLocalBegin(da,X,INSERT_VALUES,localX,ierr)) > PetscCall(DMGlobalToLocalEnd(da,X,INSERT_VALUES,localX,ierr)) > > ! Get pointers to vector data > > PetscCall(VecGetArrayReadF90(localX,xx,ierr)) > PetscCall(VecGetArrayF90(F,ff,ierr)) > > Barry > > You really shouldn't be using DMSNESSetJacobianLocal() for your code. Basically all the DMSNESSetJacobianLocal() gives you is that it automatically handles the global to local mapping and unwrapping of the vector to an array, but it doesn't work for shell matrices. > > > > On Jan 9, 2024, at 6:30?AM, Yi Hu > wrote: > > Dear Barry, > > Thanks for your help. 
> > It works when doing first SNESSetJacobian() with my created shell matrix Jac in the main (or module) and then DMSNESSetJacobianLocal() to associate with my DM and an dummy formJacobian callback (which is doing nothing). My SNES can now recognize my shell matrix and do my customized operation. > > However, my GK_op (which is the reloaded MATOP_MULT) gives me some problem. It is entered but then crashed with Segmentation Violation error. So I guess my indices may be wrong. I wonder do I need to use the local Vec (of dF), and should my output Vec also be in the correct shape (i.e. after calculation I need to transform back into a Vec)? As you can see here, my dF is a tensor defined on every grid point. > > Best wishes, > Yi > > From: Barry Smith > > Sent: Monday, January 8, 2024 6:41 PM > To: Yi Hu > > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] SNES seems not use my matrix-free operation > > > "formJacobian" should not be __creating__ the matrices. Here "form" means computing the numerical values in the matrix (or when using a shell matrix it means keeping a copy of X so that your custom matrix-free multiply knows the base location where the matrix free Jacobian-vector products are computed.) > > You create the shell matrices up in your main program and pass them in with SNESSetJacobian(). > > Try first calling SNESSetJacobian() to provide the matrices (provide a dummy function argument) and then call DMSNESSetJacobianLocal() to provide your "formjacobian" function (that does not create the matrices). > > Barry > > > Yes, "form" is a bad word that should not have been used in our code. > > > > > > On Jan 8, 2024, at 12:24?PM, Yi Hu > wrote: > > Dear PETSc Experts, > > I am implementing a matrix-free jacobian for my SNES solver in Fortran. (command line option -snes_type newtonls -ksp_type gmres) > > In the main program, I define my residual and jacobian and matrix-free jacobian like the following, > > ? > call DMDASNESSetFunctionLocal(DM_mech, INSERT_VALUES, formResidual, PETSC_NULL_SNES, err_PETSc) > call DMSNESSetJacobianLocal(DM_mech, formJacobian, PETSC_NULL_SNES, err_PETSc) > ? > > subroutine formJacobian(residual_subdomain,F,Jac_pre,Jac,dummy,err_PETSc) > > #include > use petscmat > implicit None > DMDALocalInfo, dimension(DMDA_LOCAL_INFO_SIZE) :: & > residual_subdomain !< DMDA info (needs to be named "in" for macros like XRANGE to work) > real(pREAL), dimension(3,3,cells(1),cells(2),cells3), intent(in) :: & > F !< deformation gradient field > Mat :: Jac, Jac_pre > PetscObject :: dummy > PetscErrorCode :: err_PETSc > PetscInt :: N_dof ! global number of DoF, maybe only a placeholder > > N_dof = 9*product(cells(1:2))*cells3 > > print*, 'in my jac' > > call MatCreateShell(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,N_dof,N_dof,0,Jac,err_PETSc) > CHKERRQ(err_PETSc) > call MatShellSetOperation(Jac,MATOP_MULT,GK_op,err_PETSc) > CHKERRQ(err_PETSc) > > print*, 'in my jac' > > ! 
for jac preconditioner > call MatCreateShell(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,N_dof,N_dof,0,Jac_pre,err_PETSc) > CHKERRQ(err_PETSc) > call MatShellSetOperation(Jac_pre,MATOP_MULT,GK_op,err_PETSc) > CHKERRQ(err_PETSc) > > print*, 'in my jac' > > end subroutine formJacobian > > subroutine GK_op(Jac,dF,output,err_PETSc) > real(pREAL), dimension(3,3,cells(1),cells(2),cells3), intent(in) :: & > dF > real(pREAL), dimension(3,3,cells(1),cells(2),cells3), intent(out) :: & > output > real(pREAL), dimension(3,3) :: & > deltaF_aim = 0.0_pREAL > > Mat :: Jac > PetscErrorCode :: err_PETSc > > integer :: i, j, k, e > > ? a lot of calculations ? > > print*, 'in GK op' > > end subroutine GK_op > > The first question is that: it seems I still need to explicitly define the interface of MatCreateShell() and MatShellSetOperation() to properly use them, even though I include them via ?use petscmat?. It is a little bit strange to me, since some examples do not perform this step. > > Then the main issue is that I can build my own Jacobian from my call back function formJacobian, and confirm my Jacobian is a shell matrix (by MatView). However, my customized operator GK_op is not called when solving the nonlinear system (not print my ?in GK op?). When I try to monitor my SNES, it gives me some conventional output not mentioning my matrix-free operations. So I guess my customized MATOP_MULT may be not associated with Jacobian. Or my configuration is somehow wrong. Could you help me solve this issue? > > Thanks, > Yi > > > > > ------------------------------------------------- > Stay up to date and follow us on LinkedIn, Twitter and YouTube. > > Max-Planck-Institut f?r Eisenforschung GmbH > Max-Planck-Stra?e 1 > D-40237 D?sseldorf > > Handelsregister B 2533 > Amtsgericht D?sseldorf > > Gesch?ftsf?hrung > Prof. Dr. Gerhard Dehm > Prof. Dr. J?rg Neugebauer > Prof. Dr. Dierk Raabe > Dr. Kai de Weldige > > Ust.-Id.-Nr.: DE 11 93 58 514 > Steuernummer: 105 5891 1000 > > > Please consider that invitations and e-mails of our institute are > only valid if they end with ?@mpie.de. > If you are not sure of the validity please contact rco at mpie.de > > Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails > aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. > In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de > ------------------------------------------------- > > > > ------------------------------------------------- > Stay up to date and follow us on LinkedIn, Twitter and YouTube. > > Max-Planck-Institut f?r Eisenforschung GmbH > Max-Planck-Stra?e 1 > D-40237 D?sseldorf > > Handelsregister B 2533 > Amtsgericht D?sseldorf > > Gesch?ftsf?hrung > Prof. Dr. Gerhard Dehm > Prof. Dr. J?rg Neugebauer > Prof. Dr. Dierk Raabe > Dr. Kai de Weldige > > Ust.-Id.-Nr.: DE 11 93 58 514 > Steuernummer: 105 5891 1000 > > > Please consider that invitations and e-mails of our institute are > only valid if they end with ?@mpie.de. > If you are not sure of the validity please contact rco at mpie.de > > Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails > aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. > In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de > ------------------------------------------------- > > > > ------------------------------------------------- > Stay up to date and follow us on LinkedIn, Twitter and YouTube. 
> > Max-Planck-Institut f?r Eisenforschung GmbH > Max-Planck-Stra?e 1 > D-40237 D?sseldorf > > Handelsregister B 2533 > Amtsgericht D?sseldorf > > Gesch?ftsf?hrung > Prof. Dr. Gerhard Dehm > Prof. Dr. J?rg Neugebauer > Prof. Dr. Dierk Raabe > Dr. Kai de Weldige > > Ust.-Id.-Nr.: DE 11 93 58 514 > Steuernummer: 105 5891 1000 > > > Please consider that invitations and e-mails of our institute are > only valid if they end with ?@mpie.de. > If you are not sure of the validity please contact rco at mpie.de > > Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails > aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. > In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de > ------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From sawsan.shatanawi at wsu.edu Wed Jan 10 14:38:00 2024 From: sawsan.shatanawi at wsu.edu (Shatanawi, Sawsan Muhammad) Date: Wed, 10 Jan 2024 20:38:00 +0000 Subject: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code In-Reply-To: References: <17B6144E-4F3A-4173-9818-4B03669736C4@petsc.dev> Message-ID: Hello all, I hope you are doing well. Generally, I use gdb to debug the code. I got the attached error message. I have tried to add the flag -start_in_debugger in the make file, but it didn't work, so it seems I was doing it in the wrong way This is the link for the whole code: sshatanawi/SS_GWM (github.com) [https://opengraph.githubassets.com/9eb6cd14baf12f04848ed209b6f502415eb531bdd7b3a5f9696af68663b870c0/sshatanawi/SS_GWM] GitHub - sshatanawi/SS_GWM Contribute to sshatanawi/SS_GWM development by creating an account on GitHub. github.com ? You can read the description of the code in " Model Desprciption.pdf" the compiling file is makefile_f90 where you can find the linked code files I really appreciate your help Bests, Sawsan ________________________________ From: Mark Adams Sent: Friday, January 5, 2024 4:53 AM To: Shatanawi, Sawsan Muhammad Cc: Matthew Knepley ; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code [EXTERNAL EMAIL] This is a segv. As Matt said, you need to use a debugger for this or add print statements to narrow down the place where this happens. You will need to learn how to use debuggers to do your project so you might as well start now. If you have a machine with a GUI debugger that is easier but command line debuggers are good to learn anyway. I tend to run debuggers directly (eg, lldb ./a.out -- program-args ...) and use a GUI debugger (eg, Totalview or DDT) if available. Mark On Wed, Dec 20, 2023 at 10:02?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: Hello Matthew, Thank you for your help. I am sorry that I keep coming back with my error messages, but I reached a point that I don't know how to fix them, and I don't understand them easily. The list of errors is getting shorter, now I am getting the attached error messages Thank you again, Sawsan ________________________________ From: Matthew Knepley > Sent: Wednesday, December 20, 2023 6:54 PM To: Shatanawi, Sawsan Muhammad > Cc: Barry Smith >; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code [EXTERNAL EMAIL] On Wed, Dec 20, 2023 at 9:49?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: Hello Barry, Thank you a lot for your help, Now I am getting the attached error message. 
Do not destroy the PC from KSPGetPC() THanks, Matt Bests, Sawsan ________________________________ From: Barry Smith > Sent: Wednesday, December 20, 2023 6:32 PM To: Shatanawi, Sawsan Muhammad > Cc: Mark Adams >; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code [EXTERNAL EMAIL] Instead of call PCCreate(PETSC_COMM_WORLD, pc, ierr) call PCSetType(pc, PCILU,ierr) ! Choose a preconditioner type (ILU) call KSPSetPC(ksp, pc,ierr) ! Associate the preconditioner with the KSP solver do call KSPGetPC(ksp,pc,ierr) call PCSetType(pc, PCILU,ierr) Do not call KSPSetUp(). It will be taken care of automatically during the solve On Dec 20, 2023, at 8:52?PM, Shatanawi, Sawsan Muhammad via petsc-users > wrote: Hello, I don't think that I set preallocation values when I created the matrix, would you please have look at my code. It is just the petsc related part from my code. I was able to fix some of the error messages. Now I have a new set of error messages related to the KSP solver (attached) I appreciate your help Sawsan ________________________________ From: Mark Adams > Sent: Wednesday, December 20, 2023 6:44 AM To: Shatanawi, Sawsan Muhammad > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code [EXTERNAL EMAIL] Did you set preallocation values when you created the matrix? Don't do that. On Wed, Dec 20, 2023 at 9:36?AM Shatanawi, Sawsan Muhammad > wrote: Hello, I am trying to create a sparse matrix( which is as I believe a zero matrix) then adding some nonzero elements to it over a loop, then assembling it Get Outlook for iOS ________________________________ From: Mark Adams > Sent: Wednesday, December 20, 2023 2:48 AM To: Shatanawi, Sawsan Muhammad > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code [EXTERNAL EMAIL] I am guessing that you are creating a matrix, adding to it, finalizing it ("assembly"), and then adding to it again, which is fine, but you are adding new non-zeros to the sparsity pattern. If this is what you want then you can tell the matrix to let you do that. Otherwise you have a bug. Mark On Tue, Dec 19, 2023 at 9:50?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: Hello everyone, I hope this email finds you well. My Name is Sawsan Shatanawi, and I am currently working on developing a Fortran code for simulating groundwater flow in a 3D system. The code involves solving a nonlinear system, and I have created the matrix to be solved using the PCG solver and Picard iteration. However, when I tried to assign it as a PETSc matrix I started getting a lot of error messages. I am kindly asking if someone can help me, I would be happy to share my code with him/her. Please find the attached file contains a list of errors I have gotten Thank you in advance for your time and assistance. Best regards, Sawsan -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
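Pulling together the KSPGetPC() advice quoted above, a minimal sketch of the intended setup could look like this (a sketch only, assuming a matrix A and vectors b and x that have already been created and assembled; the subroutine name is illustrative):

subroutine solve_system(A, b, x, ierr)
#include <petsc/finclude/petscksp.h>
  use petscksp
  implicit none
  Mat            :: A      ! assembled coefficient matrix
  Vec            :: b, x   ! right-hand side and solution
  KSP            :: ksp
  PC             :: pc
  PetscErrorCode :: ierr

  call KSPCreate(PETSC_COMM_WORLD, ksp, ierr)
  CHKERRQ(ierr)
  call KSPSetOperators(ksp, A, A, ierr)
  CHKERRQ(ierr)
  ! configure the PC the KSP already owns; no PCCreate()/KSPSetPC() needed
  call KSPGetPC(ksp, pc, ierr)
  CHKERRQ(ierr)
  call PCSetType(pc, PCILU, ierr)   ! ILU, as suggested in the thread
  CHKERRQ(ierr)
  call KSPSetFromOptions(ksp, ierr)
  CHKERRQ(ierr)
  ! no KSPSetUp() needed; KSPSolve() takes care of the setup
  call KSPSolve(ksp, b, x, ierr)
  CHKERRQ(ierr)
  call KSPDestroy(ksp, ierr)   ! the PC obtained from KSPGetPC() is never destroyed separately
  CHKERRQ(ierr)
end subroutine solve_system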
Name: out4.txt URL: From bsmith at petsc.dev Wed Jan 10 16:50:12 2024 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 10 Jan 2024 17:50:12 -0500 Subject: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code In-Reply-To: References: <17B6144E-4F3A-4173-9818-4B03669736C4@petsc.dev> Message-ID: When running with just one mpi process it is often possible (on non-batch systems where one does not need to use mpiexec to start up a single rank run) to continue to just use > gdb when running with more than 1 MPI process (or if the execuable requires mpiexec to start) you can use mpiexec -n 2 ./executable options -start_in_debugger and it will open two terminals with the debugger for each rank. See https://petsc.org/release/manual/other/#sec-debugging for more details. > On Jan 10, 2024, at 3:38?PM, Shatanawi, Sawsan Muhammad via petsc-users wrote: > > Hello all, > > I hope you are doing well. > > Generally, I use gdb to debug the code. > I got the attached error message. > > I have tried to add the flag -start_in_debugger in the make file, but it didn't work, so it seems I was doing it in the wrong way > > This is the link for the whole code: sshatanawi/SS_GWM (github.com) > > GitHub - sshatanawi/SS_GWM > Contribute to sshatanawi/SS_GWM development by creating an account on GitHub. > github.com > ? > > You can read the description of the code in " Model Desprciption.pdf" > the compiling file is makefile_f90 where you can find the linked code files > > I really appreciate your help > > Bests, > Sawsan > From: Mark Adams > > Sent: Friday, January 5, 2024 4:53 AM > To: Shatanawi, Sawsan Muhammad > > Cc: Matthew Knepley >; petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code > > [EXTERNAL EMAIL] > This is a segv. As Matt said, you need to use a debugger for this or add print statements to narrow down the place where this happens. > > You will need to learn how to use debuggers to do your project so you might as well start now. > > If you have a machine with a GUI debugger that is easier but command line debuggers are good to learn anyway. > > I tend to run debuggers directly (eg, lldb ./a.out -- program-args ...) and use a GUI debugger (eg, Totalview or DDT) if available. > > Mark > > > On Wed, Dec 20, 2023 at 10:02?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: > Hello Matthew, > > Thank you for your help. I am sorry that I keep coming back with my error messages, but I reached a point that I don't know how to fix them, and I don't understand them easily. > The list of errors is getting shorter, now I am getting the attached error messages > > Thank you again, > > Sawsan > From: Matthew Knepley > > Sent: Wednesday, December 20, 2023 6:54 PM > To: Shatanawi, Sawsan Muhammad > > Cc: Barry Smith >; petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code > > [EXTERNAL EMAIL] > On Wed, Dec 20, 2023 at 9:49?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: > Hello Barry, > > Thank you a lot for your help, Now I am getting the attached error message. 
> > Do not destroy the PC from KSPGetPC() > > THanks, > > Matt > > Bests, > Sawsan > From: Barry Smith > > Sent: Wednesday, December 20, 2023 6:32 PM > To: Shatanawi, Sawsan Muhammad > > Cc: Mark Adams >; petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code > > [EXTERNAL EMAIL] > > Instead of > > call PCCreate(PETSC_COMM_WORLD, pc, ierr) > call PCSetType(pc, PCILU,ierr) ! Choose a preconditioner type (ILU) > call KSPSetPC(ksp, pc,ierr) ! Associate the preconditioner with the KSP solver > > do > > call KSPGetPC(ksp,pc,ierr) > call PCSetType(pc, PCILU,ierr) > > Do not call KSPSetUp(). It will be taken care of automatically during the solve > > > >> On Dec 20, 2023, at 8:52?PM, Shatanawi, Sawsan Muhammad via petsc-users > wrote: >> >> Hello, >> I don't think that I set preallocation values when I created the matrix, would you please have look at my code. It is just the petsc related part from my code. >> I was able to fix some of the error messages. Now I have a new set of error messages related to the KSP solver (attached) >> >> I appreciate your help >> >> Sawsan >> From: Mark Adams > >> Sent: Wednesday, December 20, 2023 6:44 AM >> To: Shatanawi, Sawsan Muhammad > >> Cc: petsc-users at mcs.anl.gov > >> Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code >> >> [EXTERNAL EMAIL] >> Did you set preallocation values when you created the matrix? >> Don't do that. >> >> On Wed, Dec 20, 2023 at 9:36?AM Shatanawi, Sawsan Muhammad > wrote: >> Hello, >> >> I am trying to create a sparse matrix( which is as I believe a zero matrix) then adding some nonzero elements to it over a loop, then assembling it >> >> Get Outlook for iOS >> From: Mark Adams > >> Sent: Wednesday, December 20, 2023 2:48 AM >> To: Shatanawi, Sawsan Muhammad > >> Cc: petsc-users at mcs.anl.gov > >> Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code >> >> [EXTERNAL EMAIL] >> I am guessing that you are creating a matrix, adding to it, finalizing it ("assembly"), and then adding to it again, which is fine, but you are adding new non-zeros to the sparsity pattern. >> If this is what you want then you can tell the matrix to let you do that. >> Otherwise you have a bug. >> >> Mark >> >> On Tue, Dec 19, 2023 at 9:50?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: >> Hello everyone, >> >> I hope this email finds you well. >> >> My Name is Sawsan Shatanawi, and I am currently working on developing a Fortran code for simulating groundwater flow in a 3D system. The code involves solving a nonlinear system, and I have created the matrix to be solved using the PCG solver and Picard iteration. However, when I tried to assign it as a PETSc matrix I started getting a lot of error messages. >> >> I am kindly asking if someone can help me, I would be happy to share my code with him/her. >> >> Please find the attached file contains a list of errors I have gotten >> >> Thank you in advance for your time and assistance. >> Best regards, >> >> Sawsan >> >> >> > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From junchao.zhang at gmail.com Wed Jan 10 17:49:00 2024 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Wed, 10 Jan 2024 17:49:00 -0600 Subject: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code In-Reply-To: References: <17B6144E-4F3A-4173-9818-4B03669736C4@petsc.dev> Message-ID: Hi, Sawsan, I could build your code and I also could gdb it. $ gdb ./GW.exe ... $ Thread 1 "GW.exe" received signal SIGSEGV, Segmentation fault. 0x00007ffff1e6d44f in vecgetarray_ (x=0x7fffffffa718, fa=0x0, ia=0x7fffffffa75c, ierr=0x0) at /scratch/jczhang/petsc/src/vec/vec/interface/ftn-custom/zvectorf.c:257 257 *ierr = VecGetArray(*x, &lx); (gdb) bt #0 0x00007ffff1e6d44f in vecgetarray_ (x=0x7fffffffa718, fa=0x0, ia=0x7fffffffa75c, ierr=0x0) at /scratch/jczhang/petsc/src/vec/vec/interface/ftn-custom/zvectorf.c:257 #1 0x000000000040b6e3 in gw_solver (t_s=1.40129846e-45, n=300) at GW_solver_try.F90:169 #2 0x000000000040c6a8 in test_gw () at test_main.F90:35 ierr=0x0 caused the segfault. See https://petsc.org/release/manualpages/Vec/VecGetArray/#vecgetarray, you should use VecGetArrayF90 instead. BTW, Barry, the code https://github.com/sshatanawi/SS_GWM/blob/main/GW_solver_try.F90#L169 has "call VecGetArray(temp_solution, H_vector, ierr)". I don't find petsc Fortran examples doing VecGetArray. Do we still support it? --Junchao Zhang On Wed, Jan 10, 2024 at 2:38?PM Shatanawi, Sawsan Muhammad via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hello all, > > I hope you are doing well. > > Generally, I use gdb to debug the code. > I got the attached error message. > > I have tried to add the flag -start_in_debugger in the make file, but it > didn't work, so it seems I was doing it in the wrong way > > This is the link for the whole code: sshatanawi/SS_GWM (github.com) > > > GitHub - sshatanawi/SS_GWM > Contribute to sshatanawi/SS_GWM development by creating an account on > GitHub. > github.com > *?* > > You can read the description of the code in " Model Desprciption.pdf" > the compiling file is makefile_f90 where you can find the linked code > files > > I really appreciate your help > > Bests, > Sawsan > ------------------------------ > *From:* Mark Adams > *Sent:* Friday, January 5, 2024 4:53 AM > *To:* Shatanawi, Sawsan Muhammad > *Cc:* Matthew Knepley ; petsc-users at mcs.anl.gov < > petsc-users at mcs.anl.gov> > *Subject:* Re: [petsc-users] Help with Integrating PETSc into Fortran > Groundwater Flow Simulation Code > > > *[EXTERNAL EMAIL]* > This is a segv. As Matt said, you need to use a debugger for this or add > print statements to narrow down the place where this happens. > > You will need to learn how to use debuggers to do your project so you > might as well start now. > > If you have a machine with a GUI debugger that is easier but command line > debuggers are good to learn anyway. > > I tend to run debuggers directly (eg, lldb ./a.out -- program-args ...) > and use a GUI debugger (eg, Totalview or DDT) if available. > > Mark > > > On Wed, Dec 20, 2023 at 10:02?PM Shatanawi, Sawsan Muhammad via > petsc-users wrote: > > Hello Matthew, > > Thank you for your help. I am sorry that I keep coming back with my error > messages, but I reached a point that I don't know how to fix them, and I > don't understand them easily. 
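To spell out the VecGetArrayF90() suggestion above, a minimal sketch of the access pattern could be (the subroutine and pointer names are illustrative, not taken from the repository):

subroutine read_solution(temp_solution, ierr)
#include <petsc/finclude/petscvec.h>
  use petscvec
  implicit none
  Vec                  :: temp_solution   ! solution vector from KSPSolve()
  PetscScalar, pointer :: h_array(:)
  PetscErrorCode       :: ierr

  call VecGetArrayF90(temp_solution, h_array, ierr)
  CHKERRQ(ierr)
  ! ... read h_array(:) into the model's head array here ...
  call VecRestoreArrayF90(temp_solution, h_array, ierr)
  CHKERRQ(ierr)
  ! destroy the vector only after it has been restored
end subroutine read_solution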
> The list of errors is getting shorter, now I am getting the attached error > messages > > Thank you again, > > Sawsan > ------------------------------ > *From:* Matthew Knepley > *Sent:* Wednesday, December 20, 2023 6:54 PM > *To:* Shatanawi, Sawsan Muhammad > *Cc:* Barry Smith ; petsc-users at mcs.anl.gov < > petsc-users at mcs.anl.gov> > *Subject:* Re: [petsc-users] Help with Integrating PETSc into Fortran > Groundwater Flow Simulation Code > > > *[EXTERNAL EMAIL]* > On Wed, Dec 20, 2023 at 9:49?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: > > Hello Barry, > > Thank you a lot for your help, Now I am getting the attached error message. > > > Do not destroy the PC from KSPGetPC() > > THanks, > > Matt > > > Bests, > Sawsan > ------------------------------ > *From:* Barry Smith > *Sent:* Wednesday, December 20, 2023 6:32 PM > *To:* Shatanawi, Sawsan Muhammad > *Cc:* Mark Adams ; petsc-users at mcs.anl.gov < > petsc-users at mcs.anl.gov> > *Subject:* Re: [petsc-users] Help with Integrating PETSc into Fortran > Groundwater Flow Simulation Code > > > *[EXTERNAL EMAIL]* > > Instead of > > call PCCreate(PETSC_COMM_WORLD, pc, ierr) > call PCSetType(pc, PCILU,ierr) ! Choose a preconditioner type (ILU) > call KSPSetPC(ksp, pc,ierr) ! Associate the preconditioner with the > KSP solver > > do > > call KSPGetPC(ksp,pc,ierr) > call PCSetType(pc, PCILU,ierr) > > Do not call KSPSetUp(). It will be taken care of automatically during the > solve > > > > On Dec 20, 2023, at 8:52?PM, Shatanawi, Sawsan Muhammad via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Hello, > I don't think that I set preallocation values when I created the matrix, > would you please have look at my code. It is just the petsc related part > from my code. > I was able to fix some of the error messages. Now I have a new set of > error messages related to the KSP solver (attached) > > I appreciate your help > > Sawsan > ------------------------------ > *From:* Mark Adams > *Sent:* Wednesday, December 20, 2023 6:44 AM > *To:* Shatanawi, Sawsan Muhammad > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] Help with Integrating PETSc into Fortran > Groundwater Flow Simulation Code > > *[EXTERNAL EMAIL]* > Did you set preallocation values when you created the matrix? > Don't do that. > > On Wed, Dec 20, 2023 at 9:36?AM Shatanawi, Sawsan Muhammad < > sawsan.shatanawi at wsu.edu> wrote: > > Hello, > > I am trying to create a sparse matrix( which is as I believe a zero > matrix) then adding some nonzero elements to it over a loop, then > assembling it > > Get Outlook for iOS > > ------------------------------ > *From:* Mark Adams > *Sent:* Wednesday, December 20, 2023 2:48 AM > *To:* Shatanawi, Sawsan Muhammad > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] Help with Integrating PETSc into Fortran > Groundwater Flow Simulation Code > > *[EXTERNAL EMAIL]* > I am guessing that you are creating a matrix, adding to it, finalizing it > ("assembly"), and then adding to it again, which is fine, but you are > adding new non-zeros to the sparsity pattern. > If this is what you want then you can tell the matrix to let you do that. > Otherwise you have a bug. > > Mark > > On Tue, Dec 19, 2023 at 9:50?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: > > Hello everyone, > > I hope this email finds you well. > > My Name is Sawsan Shatanawi, and I am currently working on developing a > Fortran code for simulating groundwater flow in a 3D system. 
The code > involves solving a nonlinear system, and I have created the matrix to be > solved using the PCG solver and Picard iteration. However, when I tried > to assign it as a PETSc matrix I started getting a lot of error messages. > > I am kindly asking if someone can help me, I would be happy to share my > code with him/her. > > Please find the attached file contains a list of errors I have gotten > > Thank you in advance for your time and assistance. > > Best regards, > > Sawsan > > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Wed Jan 10 17:52:05 2024 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Wed, 10 Jan 2024 17:52:05 -0600 Subject: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code In-Reply-To: References: <17B6144E-4F3A-4173-9818-4B03669736C4@petsc.dev> Message-ID: Sawsan, Also, another error at https://github.com/sshatanawi/SS_GWM/blob/main/GW_solver_try.F90#L170. You can not destroy the vector BEFORE calling VecRestoreArray(), --Junchao Zhang On Wed, Jan 10, 2024 at 5:49?PM Junchao Zhang wrote: > Hi, Sawsan, > I could build your code and I also could gdb it. > > $ gdb ./GW.exe > ... > $ Thread 1 "GW.exe" received signal SIGSEGV, Segmentation fault. > 0x00007ffff1e6d44f in vecgetarray_ (x=0x7fffffffa718, fa=0x0, > ia=0x7fffffffa75c, ierr=0x0) at > /scratch/jczhang/petsc/src/vec/vec/interface/ftn-custom/zvectorf.c:257 > 257 *ierr = VecGetArray(*x, &lx); > (gdb) bt > #0 0x00007ffff1e6d44f in vecgetarray_ (x=0x7fffffffa718, fa=0x0, > ia=0x7fffffffa75c, ierr=0x0) at > /scratch/jczhang/petsc/src/vec/vec/interface/ftn-custom/zvectorf.c:257 > #1 0x000000000040b6e3 in gw_solver (t_s=1.40129846e-45, n=300) at > GW_solver_try.F90:169 > #2 0x000000000040c6a8 in test_gw () at test_main.F90:35 > > ierr=0x0 caused the segfault. See > https://petsc.org/release/manualpages/Vec/VecGetArray/#vecgetarray, you > should use VecGetArrayF90 instead. > > BTW, Barry, the code > https://github.com/sshatanawi/SS_GWM/blob/main/GW_solver_try.F90#L169 has > "call VecGetArray(temp_solution, H_vector, ierr)". I don't find petsc > Fortran examples doing VecGetArray. Do we still support it? > > --Junchao Zhang > > > On Wed, Jan 10, 2024 at 2:38?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: > >> Hello all, >> >> I hope you are doing well. >> >> Generally, I use gdb to debug the code. >> I got the attached error message. >> >> I have tried to add the flag -start_in_debugger in the make file, but it >> didn't work, so it seems I was doing it in the wrong way >> >> This is the link for the whole code: sshatanawi/SS_GWM (github.com) >> >> >> GitHub - sshatanawi/SS_GWM >> Contribute to sshatanawi/SS_GWM development by creating an account on >> GitHub. 
>> github.com >> *?* >> >> You can read the description of the code in " Model Desprciption.pdf" >> the compiling file is makefile_f90 where you can find the linked code >> files >> >> I really appreciate your help >> >> Bests, >> Sawsan >> ------------------------------ >> *From:* Mark Adams >> *Sent:* Friday, January 5, 2024 4:53 AM >> *To:* Shatanawi, Sawsan Muhammad >> *Cc:* Matthew Knepley ; petsc-users at mcs.anl.gov < >> petsc-users at mcs.anl.gov> >> *Subject:* Re: [petsc-users] Help with Integrating PETSc into Fortran >> Groundwater Flow Simulation Code >> >> >> *[EXTERNAL EMAIL]* >> This is a segv. As Matt said, you need to use a debugger for this or add >> print statements to narrow down the place where this happens. >> >> You will need to learn how to use debuggers to do your project so you >> might as well start now. >> >> If you have a machine with a GUI debugger that is easier but command line >> debuggers are good to learn anyway. >> >> I tend to run debuggers directly (eg, lldb ./a.out -- program-args ...) >> and use a GUI debugger (eg, Totalview or DDT) if available. >> >> Mark >> >> >> On Wed, Dec 20, 2023 at 10:02?PM Shatanawi, Sawsan Muhammad via >> petsc-users wrote: >> >> Hello Matthew, >> >> Thank you for your help. I am sorry that I keep coming back with my error >> messages, but I reached a point that I don't know how to fix them, and I >> don't understand them easily. >> The list of errors is getting shorter, now I am getting the attached >> error messages >> >> Thank you again, >> >> Sawsan >> ------------------------------ >> *From:* Matthew Knepley >> *Sent:* Wednesday, December 20, 2023 6:54 PM >> *To:* Shatanawi, Sawsan Muhammad >> *Cc:* Barry Smith ; petsc-users at mcs.anl.gov < >> petsc-users at mcs.anl.gov> >> *Subject:* Re: [petsc-users] Help with Integrating PETSc into Fortran >> Groundwater Flow Simulation Code >> >> >> *[EXTERNAL EMAIL]* >> On Wed, Dec 20, 2023 at 9:49?PM Shatanawi, Sawsan Muhammad via >> petsc-users wrote: >> >> Hello Barry, >> >> Thank you a lot for your help, Now I am getting the attached >> error message. >> >> >> Do not destroy the PC from KSPGetPC() >> >> THanks, >> >> Matt >> >> >> Bests, >> Sawsan >> ------------------------------ >> *From:* Barry Smith >> *Sent:* Wednesday, December 20, 2023 6:32 PM >> *To:* Shatanawi, Sawsan Muhammad >> *Cc:* Mark Adams ; petsc-users at mcs.anl.gov < >> petsc-users at mcs.anl.gov> >> *Subject:* Re: [petsc-users] Help with Integrating PETSc into Fortran >> Groundwater Flow Simulation Code >> >> >> *[EXTERNAL EMAIL]* >> >> Instead of >> >> call PCCreate(PETSC_COMM_WORLD, pc, ierr) >> call PCSetType(pc, PCILU,ierr) ! Choose a preconditioner type (ILU) >> call KSPSetPC(ksp, pc,ierr) ! Associate the preconditioner with the >> KSP solver >> >> do >> >> call KSPGetPC(ksp,pc,ierr) >> call PCSetType(pc, PCILU,ierr) >> >> Do not call KSPSetUp(). It will be taken care of automatically during the >> solve >> >> >> >> On Dec 20, 2023, at 8:52?PM, Shatanawi, Sawsan Muhammad via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> >> Hello, >> I don't think that I set preallocation values when I created the matrix, >> would you please have look at my code. It is just the petsc related part >> from my code. >> I was able to fix some of the error messages. 
Now I have a new set of >> error messages related to the KSP solver (attached) >> >> I appreciate your help >> >> Sawsan >> ------------------------------ >> *From:* Mark Adams >> *Sent:* Wednesday, December 20, 2023 6:44 AM >> *To:* Shatanawi, Sawsan Muhammad >> *Cc:* petsc-users at mcs.anl.gov >> *Subject:* Re: [petsc-users] Help with Integrating PETSc into Fortran >> Groundwater Flow Simulation Code >> >> *[EXTERNAL EMAIL]* >> Did you set preallocation values when you created the matrix? >> Don't do that. >> >> On Wed, Dec 20, 2023 at 9:36?AM Shatanawi, Sawsan Muhammad < >> sawsan.shatanawi at wsu.edu> wrote: >> >> Hello, >> >> I am trying to create a sparse matrix( which is as I believe a zero >> matrix) then adding some nonzero elements to it over a loop, then >> assembling it >> >> Get Outlook for iOS >> >> ------------------------------ >> *From:* Mark Adams >> *Sent:* Wednesday, December 20, 2023 2:48 AM >> *To:* Shatanawi, Sawsan Muhammad >> *Cc:* petsc-users at mcs.anl.gov >> *Subject:* Re: [petsc-users] Help with Integrating PETSc into Fortran >> Groundwater Flow Simulation Code >> >> *[EXTERNAL EMAIL]* >> I am guessing that you are creating a matrix, adding to it, finalizing it >> ("assembly"), and then adding to it again, which is fine, but you are >> adding new non-zeros to the sparsity pattern. >> If this is what you want then you can tell the matrix to let you do that. >> Otherwise you have a bug. >> >> Mark >> >> On Tue, Dec 19, 2023 at 9:50?PM Shatanawi, Sawsan Muhammad via >> petsc-users wrote: >> >> Hello everyone, >> >> I hope this email finds you well. >> >> My Name is Sawsan Shatanawi, and I am currently working on developing a >> Fortran code for simulating groundwater flow in a 3D system. The code >> involves solving a nonlinear system, and I have created the matrix to be >> solved using the PCG solver and Picard iteration. However, when I tried >> to assign it as a PETSc matrix I started getting a lot of error messages. >> >> I am kindly asking if someone can help me, I would be happy to share my >> code with him/her. >> >> Please find the attached file contains a list of errors I have gotten >> >> Thank you in advance for your time and assistance. >> >> Best regards, >> >> Sawsan >> >> >> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From kevinw3 at vt.edu Wed Jan 10 18:09:38 2024 From: kevinw3 at vt.edu (Kevin G. Wang) Date: Wed, 10 Jan 2024 19:09:38 -0500 Subject: [petsc-users] KSP number of iterations different from size of residual array Message-ID: Hello everyone! I am writing a code that uses PETSc/KSP to solve linear systems. I just realized that after running "KSPSolve(...)", the number of iterations given by KSPGetIterationNumber(ksp, &numIts) is *different* from the size of the residual history given by KSPGetResidualHistory(ksp, NULL, &nEntries); That is, "numIts" is not equal to "nEntries". Is this expected, or a bug in my code? (I thought they should be the same...) I have tried several pairs of solvers and preconditioners (e.g., fgmres & bjacobi, ibcgs & bjacobi). This issue happens to all of them. Thanks! Kevin -- Kevin G. Wang, Ph.D. Associate Professor Kevin T. 
Crofton Department of Aerospace and Ocean Engineering Virginia Tech 1600 Innovation Dr., VTSS Rm 224H, Blacksburg, VA 24061 Office: (540) 231-7547 | Mobile: (650) 862-2663 URL: https://www.aoe.vt.edu/people/faculty/wang.html Codes: https://github.com/kevinwgy -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Jan 10 19:30:12 2024 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 10 Jan 2024 20:30:12 -0500 Subject: [petsc-users] KSP number of iterations different from size of residual array In-Reply-To: References: Message-ID: Kevin, The manual page is misleading. It actually returns the total length of the history as set by KSPSetResidualHistory(), not the number of iterations. Barry > On Jan 10, 2024, at 7:09?PM, Kevin G. Wang wrote: > > Hello everyone! > > I am writing a code that uses PETSc/KSP to solve linear systems. I just realized that after running "KSPSolve(...)", the number of iterations given by > > KSPGetIterationNumber(ksp, &numIts) > > is *different* from the size of the residual history given by > > KSPGetResidualHistory(ksp, NULL, &nEntries); > > That is, "numIts" is not equal to "nEntries". Is this expected, or a bug in my code? (I thought they should be the same...) > > I have tried several pairs of solvers and preconditioners (e.g., fgmres & bjacobi, ibcgs & bjacobi). This issue happens to all of them. > > Thanks! > Kevin > > -- > Kevin G. Wang, Ph.D. > Associate Professor > Kevin T. Crofton Department of Aerospace and Ocean Engineering > Virginia Tech > 1600 Innovation Dr., VTSS Rm 224H, Blacksburg, VA 24061 > Office: (540) 231-7547 | Mobile: (650) 862-2663 > URL: https://www.aoe.vt.edu/people/faculty/wang.html > Codes: https://github.com/kevinwgy -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Jan 10 19:32:27 2024 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 10 Jan 2024 20:32:27 -0500 Subject: [petsc-users] KSP number of iterations different from size of residual array In-Reply-To: References: Message-ID: <9E554AD8-CC98-4219-A138-EB9ECEDFC628@petsc.dev> Sorry, sent the message too quickly. What I said below is incorrect. Barry > On Jan 10, 2024, at 8:30?PM, Barry Smith wrote: > > > Kevin, > > The manual page is misleading. It actually returns the total length of the history as set by KSPSetResidualHistory(), not the number of iterations. > > Barry > > >> On Jan 10, 2024, at 7:09?PM, Kevin G. Wang wrote: >> >> Hello everyone! >> >> I am writing a code that uses PETSc/KSP to solve linear systems. I just realized that after running "KSPSolve(...)", the number of iterations given by >> >> KSPGetIterationNumber(ksp, &numIts) >> >> is *different* from the size of the residual history given by >> >> KSPGetResidualHistory(ksp, NULL, &nEntries); >> >> That is, "numIts" is not equal to "nEntries". Is this expected, or a bug in my code? (I thought they should be the same...) >> >> I have tried several pairs of solvers and preconditioners (e.g., fgmres & bjacobi, ibcgs & bjacobi). This issue happens to all of them. >> >> Thanks! >> Kevin >> >> -- >> Kevin G. Wang, Ph.D. >> Associate Professor >> Kevin T. Crofton Department of Aerospace and Ocean Engineering >> Virginia Tech >> 1600 Innovation Dr., VTSS Rm 224H, Blacksburg, VA 24061 >> Office: (540) 231-7547 | Mobile: (650) 862-2663 >> URL: https://www.aoe.vt.edu/people/faculty/wang.html >> Codes: https://github.com/kevinwgy > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at petsc.dev Wed Jan 10 19:35:37 2024 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 10 Jan 2024 20:35:37 -0500 Subject: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code In-Reply-To: References: <17B6144E-4F3A-4173-9818-4B03669736C4@petsc.dev> Message-ID: <9BD74F88-20FD-4021-B02A-195A58B72282@petsc.dev> > On Jan 10, 2024, at 6:49?PM, Junchao Zhang wrote: > > Hi, Sawsan, > I could build your code and I also could gdb it. > > $ gdb ./GW.exe > ... > $ Thread 1 "GW.exe" received signal SIGSEGV, Segmentation fault. > 0x00007ffff1e6d44f in vecgetarray_ (x=0x7fffffffa718, fa=0x0, ia=0x7fffffffa75c, ierr=0x0) at /scratch/jczhang/petsc/src/vec/vec/interface/ftn-custom/zvectorf.c:257 > 257 *ierr = VecGetArray(*x, &lx); > (gdb) bt > #0 0x00007ffff1e6d44f in vecgetarray_ (x=0x7fffffffa718, fa=0x0, ia=0x7fffffffa75c, ierr=0x0) at /scratch/jczhang/petsc/src/vec/vec/interface/ftn-custom/zvectorf.c:257 > #1 0x000000000040b6e3 in gw_solver (t_s=1.40129846e-45, n=300) at GW_solver_try.F90:169 > #2 0x000000000040c6a8 in test_gw () at test_main.F90:35 > > ierr=0x0 caused the segfault. See https://petsc.org/release/manualpages/Vec/VecGetArray/#vecgetarray, you should use VecGetArrayF90 instead. > > BTW, Barry, the code https://github.com/sshatanawi/SS_GWM/blob/main/GW_solver_try.F90#L169 has "call VecGetArray(temp_solution, H_vector, ierr)". I don't find petsc Fortran examples doing VecGetArray. Do we still support it? This is not the correct calling sequence for VecGetArray() from Fortran. Regardless, definitely should not be writing any new code that uses VecGetArray() from Fortran. Should use VecGetArrayF90(). > > --Junchao Zhang > > > On Wed, Jan 10, 2024 at 2:38?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: >> Hello all, >> >> I hope you are doing well. >> >> Generally, I use gdb to debug the code. >> I got the attached error message. >> >> I have tried to add the flag -start_in_debugger in the make file, but it didn't work, so it seems I was doing it in the wrong way >> >> This is the link for the whole code: sshatanawi/SS_GWM (github.com) >> >> GitHub - sshatanawi/SS_GWM >> Contribute to sshatanawi/SS_GWM development by creating an account on GitHub. >> github.com >> ? >> >> You can read the description of the code in " Model Desprciption.pdf" >> the compiling file is makefile_f90 where you can find the linked code files >> >> I really appreciate your help >> >> Bests, >> Sawsan >> From: Mark Adams > >> Sent: Friday, January 5, 2024 4:53 AM >> To: Shatanawi, Sawsan Muhammad > >> Cc: Matthew Knepley >; petsc-users at mcs.anl.gov > >> Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code >> >> [EXTERNAL EMAIL] >> >> This is a segv. As Matt said, you need to use a debugger for this or add print statements to narrow down the place where this happens. >> >> You will need to learn how to use debuggers to do your project so you might as well start now. >> >> If you have a machine with a GUI debugger that is easier but command line debuggers are good to learn anyway. >> >> I tend to run debuggers directly (eg, lldb ./a.out -- program-args ...) and use a GUI debugger (eg, Totalview or DDT) if available. >> >> Mark >> >> >> On Wed, Dec 20, 2023 at 10:02?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: >> Hello Matthew, >> >> Thank you for your help. 
I am sorry that I keep coming back with my error messages, but I reached a point that I don't know how to fix them, and I don't understand them easily. >> The list of errors is getting shorter, now I am getting the attached error messages >> >> Thank you again, >> >> Sawsan >> From: Matthew Knepley > >> Sent: Wednesday, December 20, 2023 6:54 PM >> To: Shatanawi, Sawsan Muhammad > >> Cc: Barry Smith >; petsc-users at mcs.anl.gov > >> Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code >> >> [EXTERNAL EMAIL] >> >> On Wed, Dec 20, 2023 at 9:49?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: >> Hello Barry, >> >> Thank you a lot for your help, Now I am getting the attached error message. >> >> Do not destroy the PC from KSPGetPC() >> >> THanks, >> >> Matt >> >> Bests, >> Sawsan >> From: Barry Smith > >> Sent: Wednesday, December 20, 2023 6:32 PM >> To: Shatanawi, Sawsan Muhammad > >> Cc: Mark Adams >; petsc-users at mcs.anl.gov > >> Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code >> >> [EXTERNAL EMAIL] >> >> >> Instead of >> >> call PCCreate(PETSC_COMM_WORLD, pc, ierr) >> call PCSetType(pc, PCILU,ierr) ! Choose a preconditioner type (ILU) >> call KSPSetPC(ksp, pc,ierr) ! Associate the preconditioner with the KSP solver >> >> do >> >> call KSPGetPC(ksp,pc,ierr) >> call PCSetType(pc, PCILU,ierr) >> >> Do not call KSPSetUp(). It will be taken care of automatically during the solve >> >> >> >>> On Dec 20, 2023, at 8:52?PM, Shatanawi, Sawsan Muhammad via petsc-users > wrote: >>> >>> Hello, >>> I don't think that I set preallocation values when I created the matrix, would you please have look at my code. It is just the petsc related part from my code. >>> I was able to fix some of the error messages. Now I have a new set of error messages related to the KSP solver (attached) >>> >>> I appreciate your help >>> >>> Sawsan >>> From: Mark Adams > >>> Sent: Wednesday, December 20, 2023 6:44 AM >>> To: Shatanawi, Sawsan Muhammad > >>> Cc: petsc-users at mcs.anl.gov > >>> Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code >>> >>> [EXTERNAL EMAIL] >>> Did you set preallocation values when you created the matrix? >>> Don't do that. >>> >>> On Wed, Dec 20, 2023 at 9:36?AM Shatanawi, Sawsan Muhammad > wrote: >>> Hello, >>> >>> I am trying to create a sparse matrix( which is as I believe a zero matrix) then adding some nonzero elements to it over a loop, then assembling it >>> >>> Get Outlook for iOS >>> From: Mark Adams > >>> Sent: Wednesday, December 20, 2023 2:48 AM >>> To: Shatanawi, Sawsan Muhammad > >>> Cc: petsc-users at mcs.anl.gov > >>> Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code >>> >>> [EXTERNAL EMAIL] >>> I am guessing that you are creating a matrix, adding to it, finalizing it ("assembly"), and then adding to it again, which is fine, but you are adding new non-zeros to the sparsity pattern. >>> If this is what you want then you can tell the matrix to let you do that. >>> Otherwise you have a bug. >>> >>> Mark >>> >>> On Tue, Dec 19, 2023 at 9:50?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: >>> Hello everyone, >>> >>> I hope this email finds you well. >>> >>> My Name is Sawsan Shatanawi, and I am currently working on developing a Fortran code for simulating groundwater flow in a 3D system. 
The code involves solving a nonlinear system, and I have created the matrix to be solved using the PCG solver and Picard iteration. However, when I tried to assign it as a PETSc matrix I started getting a lot of error messages. >>> >>> I am kindly asking if someone can help me, I would be happy to share my code with him/her. >>> >>> Please find the attached file contains a list of errors I have gotten >>> >>> Thank you in advance for your time and assistance. >>> Best regards, >>> >>> Sawsan >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From y.hu at mpie.de Thu Jan 11 04:55:37 2024 From: y.hu at mpie.de (Yi Hu) Date: Thu, 11 Jan 2024 11:55:37 +0100 Subject: [petsc-users] SNES seems not use my matrix-free operation In-Reply-To: <10A79A13-67A7-4C2E-B6F6-6E73B58856A2@petsc.dev> References: <7B0A8642-74DC-44FF-906A-E11FDB95C331@petsc.dev> <49913a4e-b55d-46b2-9141-a6c1d1ca7cf7@mpie.de> <5EA5452B-7EDE-4D47-8A6C-1D03AA57AD58@petsc.dev> <10A79A13-67A7-4C2E-B6F6-6E73B58856A2@petsc.dev> Message-ID: Now I understand a bit more about the workflow of setting the Jacobian. It seems that the SNES can be really fine-grained. As you point out, J is built via the formJacobian() callback, and can be based on the previous solution (or the base vector u, as you mentioned). And then KSP can use a customized MATOP_MULT to solve the linear equations J(u)*x=rhs. So I followed your idea about removing DMSNESSetJacobianLocal() and did the following.

  ...
  call MatCreateShell(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,&
                      9*product(cells(1:2))*cells3,9*product(cells(1:2))*cells3,&
                      0,Jac_PETSc,err_PETSc)
  CHKERRQ(err_PETSc)
  call MatShellSetOperation(Jac_PETSc,MATOP_MULT,GK_op,err_PETSc)
  CHKERRQ(err_PETSc)
  call SNESSetJacobian(SNES_mech,Jac_PETSc,Jac_PETSc,formJacobian,0,err_PETSc)
  CHKERRQ(err_PETSc)
  ...

And my formJacobian() is

subroutine formJacobian(snes,F_global,Jac,Jac_pre,dummy,err_PETSc)

  SNES           :: snes
  Vec            :: F_global
  ! real(pREAL), dimension(3,3,cells(1),cells(2),cells3), intent(in) :: &
  !   F
  Mat            :: Jac, Jac_pre
  PetscObject    :: dummy
  PetscErrorCode :: err_PETSc

  print*, '@@ start build my jac'

  call MatCopy(Jac_PETSc,Jac,SAME_NONZERO_PATTERN,err_PETSc)
  CHKERRQ(err_PETSc)
  call MatCopy(Jac_PETSc,Jac_pre,SAME_NONZERO_PATTERN,err_PETSc)
  CHKERRQ(err_PETSc)
  ! Jac = Jac_PETSc
  ! Jac_pre = Jac_PETSc

  print*, '@@ end build my jac'

end subroutine formJacobian

It turns out that no matter whether I use a simple assignment or MatCopy(), the compiled program gives me the same error as before. So I guess the real Jacobian is still not set. I wonder how to get around this and let this built Jac in formJacobian() be the same as my shell matrix. Yi From: Barry Smith Sent: Wednesday, January 10, 2024 4:27 PM To: Yi Hu Cc: petsc-users Subject: Re: [petsc-users] SNES seems not use my matrix-free operation By default, if SNESSetJacobian() is not called with a function pointer, PETSc attempts to compute the Jacobian matrix explicitly with finite differences and coloring. This doesn't make sense with a shell matrix. Hence the error message below regarding MatFDColoringCreate().
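For reference, the overall pattern being described in this reply, creating the shell matrix once, handing it to SNESSetJacobian(), and using the Jacobian callback only to record the current solution, can be sketched in C roughly as follows. This is a hedged illustration rather than anyone's actual code: MyJacCtx, MyJacMult and MyFormJacobian are made-up names, and the Fortran interface used in this thread mirrors the same calls.

#include <petscsnes.h>

/* Hypothetical context for the shell Jacobian: it only needs to remember
   the current solution ("base") vector u at which J(u) is applied. */
typedef struct {
  Vec base;
} MyJacCtx;

/* Matrix-free product y = J(u) x, called by the KSP inside SNES. */
static PetscErrorCode MyJacMult(Mat J, Vec x, Vec y)
{
  MyJacCtx *ctx;

  PetscFunctionBeginUser;
  PetscCall(MatShellGetContext(J, &ctx));
  /* ... compute y from ctx->base (the linearization point) and x ... */
  PetscCall(VecCopy(x, y)); /* placeholder action only */
  PetscFunctionReturn(0);
}

/* The callback given to SNESSetJacobian(): it builds nothing, it only
   records the current solution u for later use by MyJacMult(). */
static PetscErrorCode MyFormJacobian(SNES snes, Vec u, Mat J, Mat P, void *dummy)
{
  MyJacCtx *ctx;

  PetscFunctionBeginUser;
  PetscCall(MatShellGetContext(J, &ctx));
  PetscCall(VecCopy(u, ctx->base));
  PetscFunctionReturn(0);
}

/* Setup, done once before SNESSolve():

   MyJacCtx ctx;
   Mat      J;
   PetscCall(VecDuplicate(solution, &ctx.base));
   PetscCall(MatCreateShell(PETSC_COMM_WORLD, nlocal, nlocal, N, N, &ctx, &J));
   PetscCall(MatShellSetOperation(J, MATOP_MULT, (void (*)(void))MyJacMult));
   PetscCall(SNESSetJacobian(snes, J, J, MyFormJacobian, NULL));
*/

Storing the base vector in the shell context is the same idea as the MatShellSetContext()/MatShellGetContext() pairing sketched in Fortran later in this thread.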
DMSNESSetJacobianLocal() calls SNESSetJacobian() with a function pointer of SNESComputeJacobian_DMLocal() so preventing the error from triggering in your code. You can provide your own function to SNESSetJacobian() and thus not need to call DMSNESSetJacobianLocal(). What you do depends on how you want to record the "base" vector that tells your matrix-free multiply routine where the Jacobian matrix vector product is being applied, that is J(u)*x. u is the "base" vector which is passed to the function provided with SNESSetJacobian(). Barry On Jan 10, 2024, at 6:20?AM, Yi Hu wrote: Thanks for the clarification. It is more clear to me now about the global to local processes after checking the examples, e.g. ksp/ksp/tutorials/ex14f.F90. And for using Vec locally, I followed your advice of VecGet.. and VecRestore? In fact I used DMDAVecGetArrayReadF90() and some other relevant subroutines. For your comment on DMSNESSetJacobianLocal(). It seems that I need to use both SNESSetJacobian() and then DMSNESSetJacobianLocal() to get things working. When I do only SNESSetJacobian(), it does not work, meaning the following does not work ?? call DMDASNESsetFunctionLocal(DM_mech,INSERT_VALUES,formResidual,PETSC_NULL_SNES,err_PETSc) CHKERRQ(err_PETSc) call MatCreateShell(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,& 9*product(cells(1:2))*cells3,9*product(cells(1:2))*cells3,& 0,Jac_PETSc,err_PETSc) CHKERRQ(err_PETSc) call MatShellSetOperation(Jac_PETSc,MATOP_MULT,GK_op,err_PETSc) CHKERRQ(err_PETSc) call SNESSetJacobian(SNES_mech,Jac_PETSc,Jac_PETSc,PETSC_NULL_FUNCTION,0,err_PETSc) CHKERRQ(err_PETSc) !call DMSNESsetJacobianLocal(DM_mech,formJacobian,PETSC_NULL_SNES,err_PETSc) !CHKERRQ(err_PETSc) call SNESsetConvergenceTest(SNES_mech,converged,PETSC_NULL_SNES,PETSC_NULL_FUNCTION,err_PETSc) CHKERRQ(err_PETSc) call SNESSetDM(SNES_mech,DM_mech,err_PETSc) CHKERRQ(err_PETSc) ?? It gives me the message [0]PETSC ERROR: No support for this operation for this object type [0]PETSC ERROR: Code not yet written for matrix type shell [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.16.4, Feb 02, 2022 [0]PETSC ERROR: Configure options PETSC_ARCH=linux-gnu --with-fortran-bindings --with-mpi-f90module-visibility=0 --download-fftw --download-hdf5 --download-hdf5-fortran-bindings --download-fblaslapack --download-ml --download-zlib [0]PETSC ERROR: #1 MatFDColoringCreate() at /home/yi/app/petsc-3.16.4/src/mat/matfd/fdmatrix.c:471 [0]PETSC ERROR: #2 SNESComputeJacobian_DMDA() at /home/yi/app/petsc-3.16.4/src/snes/utils/dmdasnes.c:173[0]PETSC ERROR: #3 SNESComputeJacobian() at /home/yi/app/petsc-3.16.4/src/snes/interface/snes.c:2864 [0]PETSC ERROR: #4 SNESSolve_NEWTONLS() at /home/yi/app/petsc-3.16.4/src/snes/impls/ls/ls.c:222 [0]PETSC ERROR: #5 SNESSolve() at /home/yi/app/petsc-3.16.4/src/snes/interface/snes.c:4809 [0]PETSC ERROR: #6 User provided function() at User file:0 [0]PETSC ERROR: #7 VecSetErrorIfLocked() at /home/yi/app/petsc-3.16.4/include/petscvec.h:623 [0]PETSC ERROR: #8 VecGetArray() at /home/yi/app/petsc-3.16.4/src/vec/vec/interface/rvector.c:1769 [0]PETSC ERROR: #9 User provided function() at User file:0 [0]PETSC ERROR: #10 MatFDColoringCreate() at /home/yi/app/petsc-3.16.4/src/mat/matfd/fdmatrix.c:471 [0]PETSC ERROR: #11 SNESComputeJacobian_DMDA() at /home/yi/app/petsc-3.16.4/src/snes/utils/dmdasnes.c:173 [0]PETSC ERROR: #12 SNESComputeJacobian() at /home/yi/app/petsc-3.16.4/src/snes/interface/snes.c:2864 [0]PETSC ERROR: #13 SNESSolve_NEWTONLS() at /home/yi/app/petsc-3.16.4/src/snes/impls/ls/ls.c:222 [0]PETSC ERROR: #14 SNESSolve() at /home/yi/app/petsc-3.16.4/src/snes/interface/snes.c:4809 It seems that I have to use a DMSNESSetJacobianLocal() to ?activate? the use of my shell matrix, although the formJacobian() in the DMSNESSetJacobianLocal() is doing nothing. Best wishes, Yi From: Barry Smith Sent: Tuesday, January 9, 2024 4:49 PM To: Yi Hu Cc: petsc-users Subject: Re: [petsc-users] SNES seems not use my matrix-free operation However, my GK_op (which is the reloaded MATOP_MULT) gives me some problem. It is entered but then crashed with Segmentation Violation error. So I guess my indices may be wrong. I wonder do I need to use the local Vec (of dF), and should my output Vec also be in the correct shape (i.e. after calculation I need to transform back into a Vec)? As you can see here, my dF is a tensor defined on every grid point. The input for the matrix-vector product is a global vector, as is the result. (Not like the arguments to DMSNESSetJacobianLocal). This means that your MATOP_MULT function needs to do the DMGlobalToLocal() vector operation first then the "unwrapping" from the vector to the array format at the beginning of the routine. Similarly it needs to "unwrap" the result vector as an array. See src/snes/tutorials/ex14f.F90 and in particular the code block PetscCall(DMGlobalToLocalBegin(da,X,INSERT_VALUES,localX,ierr)) PetscCall(DMGlobalToLocalEnd(da,X,INSERT_VALUES,localX,ierr)) ! Get pointers to vector data PetscCall(VecGetArrayReadF90(localX,xx,ierr)) PetscCall(VecGetArrayF90(F,ff,ierr)) Barry You really shouldn't be using DMSNESSetJacobianLocal() for your code. Basically all the DMSNESSetJacobianLocal() gives you is that it automatically handles the global to local mapping and unwrapping of the vector to an array, but it doesn't work for shell matrices. On Jan 9, 2024, at 6:30?AM, Yi Hu wrote: Dear Barry, Thanks for your help. 
It works when doing first SNESSetJacobian() with my created shell matrix Jac in the main (or module) and then DMSNESSetJacobianLocal() to associate with my DM and an dummy formJacobian callback (which is doing nothing). My SNES can now recognize my shell matrix and do my customized operation. However, my GK_op (which is the reloaded MATOP_MULT) gives me some problem. It is entered but then crashed with Segmentation Violation error. So I guess my indices may be wrong. I wonder do I need to use the local Vec (of dF), and should my output Vec also be in the correct shape (i.e. after calculation I need to transform back into a Vec)? As you can see here, my dF is a tensor defined on every grid point. Best wishes, Yi From: Barry Smith Sent: Monday, January 8, 2024 6:41 PM To: Yi Hu Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] SNES seems not use my matrix-free operation "formJacobian" should not be __creating__ the matrices. Here "form" means computing the numerical values in the matrix (or when using a shell matrix it means keeping a copy of X so that your custom matrix-free multiply knows the base location where the matrix free Jacobian-vector products are computed.) You create the shell matrices up in your main program and pass them in with SNESSetJacobian(). Try first calling SNESSetJacobian() to provide the matrices (provide a dummy function argument) and then call DMSNESSetJacobianLocal() to provide your "formjacobian" function (that does not create the matrices). Barry Yes, "form" is a bad word that should not have been used in our code. On Jan 8, 2024, at 12:24?PM, Yi Hu wrote: Dear PETSc Experts, I am implementing a matrix-free jacobian for my SNES solver in Fortran. (command line option -snes_type newtonls -ksp_type gmres) In the main program, I define my residual and jacobian and matrix-free jacobian like the following, ? call DMDASNESSetFunctionLocal(DM_mech, INSERT_VALUES, formResidual, PETSC_NULL_SNES, err_PETSc) call DMSNESSetJacobianLocal(DM_mech, formJacobian, PETSC_NULL_SNES, err_PETSc) ? subroutine formJacobian(residual_subdomain,F,Jac_pre,Jac,dummy,err_PETSc) #include use petscmat implicit None DMDALocalInfo, dimension(DMDA_LOCAL_INFO_SIZE) :: & residual_subdomain !< DMDA info (needs to be named "in" for macros like XRANGE to work) real(pREAL), dimension(3,3,cells(1),cells(2),cells3), intent(in) :: & F !< deformation gradient field Mat :: Jac, Jac_pre PetscObject :: dummy PetscErrorCode :: err_PETSc PetscInt :: N_dof ! global number of DoF, maybe only a placeholder N_dof = 9*product(cells(1:2))*cells3 print*, 'in my jac' call MatCreateShell(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,N_dof,N_dof,0,Jac,err_PETSc) CHKERRQ(err_PETSc) call MatShellSetOperation(Jac,MATOP_MULT,GK_op,err_PETSc) CHKERRQ(err_PETSc) print*, 'in my jac' ! for jac preconditioner call MatCreateShell(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,N_dof,N_dof,0,Jac_pre,err_PETSc) CHKERRQ(err_PETSc) call MatShellSetOperation(Jac_pre,MATOP_MULT,GK_op,err_PETSc) CHKERRQ(err_PETSc) print*, 'in my jac' end subroutine formJacobian subroutine GK_op(Jac,dF,output,err_PETSc) real(pREAL), dimension(3,3,cells(1),cells(2),cells3), intent(in) :: & dF real(pREAL), dimension(3,3,cells(1),cells(2),cells3), intent(out) :: & output real(pREAL), dimension(3,3) :: & deltaF_aim = 0.0_pREAL Mat :: Jac PetscErrorCode :: err_PETSc integer :: i, j, k, e ? a lot of calculations ? 
print*, 'in GK op' end subroutine GK_op The first question is that it seems I still need to explicitly define the interface of MatCreateShell() and MatShellSetOperation() to properly use them, even though I include them via "use petscmat". It is a little bit strange to me, since some examples do not perform this step. Then the main issue is that I can build my own Jacobian from my callback function formJacobian, and confirm my Jacobian is a shell matrix (by MatView). However, my customized operator GK_op is not called when solving the nonlinear system (it does not print my 'in GK op'). When I try to monitor my SNES, it gives me some conventional output not mentioning my matrix-free operations. So I guess my customized MATOP_MULT may not be associated with the Jacobian. Or my configuration is somehow wrong. Could you help me solve this issue? Thanks, Yi ------------------------------------------------- Stay up to date and follow us on LinkedIn, Twitter and YouTube.
Max-Planck-Institut f?r Eisenforschung GmbH Max-Planck-Stra?e 1 D-40237 D?sseldorf Handelsregister B 2533 Amtsgericht D?sseldorf Gesch?ftsf?hrung Prof. Dr. Gerhard Dehm Prof. Dr. J?rg Neugebauer Prof. Dr. Dierk Raabe Dr. Kai de Weldige Ust.-Id.-Nr.: DE 11 93 58 514 Steuernummer: 105 5891 1000 Please consider that invitations and e-mails of our institute are only valid if they end with ?@mpie.de. If you are not sure of the validity please contact rco at mpie.de Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de ------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Thu Jan 11 09:46:49 2024 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 11 Jan 2024 10:46:49 -0500 Subject: [petsc-users] SNES seems not use my matrix-free operation In-Reply-To: References: <7B0A8642-74DC-44FF-906A-E11FDB95C331@petsc.dev> <49913a4e-b55d-46b2-9141-a6c1d1ca7cf7@mpie.de> <5EA5452B-7EDE-4D47-8A6C-1D03AA57AD58@petsc.dev> <10A79A13-67A7-4C2E-B6F6-6E73B58856A2@petsc.dev> Message-ID: The following assumes you are not using the shell matrix context for some other purpose > subroutine formJacobian(snes,F_global,Jac,Jac_pre,dummy,err_PETSc) > > SNES :: snes > Vec :: F_global > > ! real(pREAL), dimension(3,3,cells(1),cells(2),cells3), intent(in) :: & > ! F > Mat :: Jac, Jac_pre > PetscObject :: dummy > PetscErrorCode :: err_PETSc > > print*, '@@ start build my jac' > > PetscCall(MatShellSetContext(Jac,F_global,ierr)) ! record the current base vector where the Jacobian is to be applied > print*, '@@ end build my jac' > > end subroutine formJacobian subroutine Gk_op ... Vec base PetscCall(MatShellGetContext(Jac,base,ierr)) ! use base in the computation of your matrix-free Jacobian vector product .... > On Jan 11, 2024, at 5:55?AM, Yi Hu wrote: > > Now I understand a bit more about the workflow of set jacobian. It seems that the SNES can be really fine-grained. As you point out, J is built via formJacobian() callback, and can be based on previous solution (or the base vector u, as you mentioned). And then KSP can use a customized MATOP_MULT to solve the linear equations J(u)*x=rhs. > > So I followed your idea about removing DMSNESSetJacobianLocal() and did the following. > > ?? > call MatCreateShell(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,& > 9*product(cells(1:2))*cells3,9*product(cells(1:2))*cells3,& > 0,Jac_PETSc,err_PETSc) > CHKERRQ(err_PETSc) > call MatShellSetOperation(Jac_PETSc,MATOP_MULT,GK_op,err_PETSc) > CHKERRQ(err_PETSc) > call SNESSetJacobian(SNES_mech,Jac_PETSc,Jac_PETSc,formJacobian,0,err_PETSc) > CHKERRQ(err_PETSc) > ?? > > And my formJacobian() is > > subroutine formJacobian(snes,F_global,Jac,Jac_pre,dummy,err_PETSc) > > SNES :: snes > Vec :: F_global > > ! real(pREAL), dimension(3,3,cells(1),cells(2),cells3), intent(in) :: & > ! F > Mat :: Jac, Jac_pre > PetscObject :: dummy > PetscErrorCode :: err_PETSc > > print*, '@@ start build my jac' > > call MatCopy(Jac_PETSc,Jac,SAME_NONZERO_PATTERN,err_PETSc) > CHKERRQ(err_PETSc) > call MatCopy(Jac_PETSc,Jac_pre,SAME_NONZERO_PATTERN,err_PETSc) > CHKERRQ(err_PETSc) > ! Jac = Jac_PETSc > ! Jac_pre = Jac_PETSc > > print*, '@@ end build my jac' > > end subroutine formJacobian > > it turns out that no matter by a simple assignment or MatCopy(), the compiled program gives me the same error as before. 
So I guess the real jacobian is still not set. I wonder how to get around this and let this built jac in formJacobian() to be the same as my shell matrix. > > Yi > > From: Barry Smith > > Sent: Wednesday, January 10, 2024 4:27 PM > To: Yi Hu > > Cc: petsc-users > > Subject: Re: [petsc-users] SNES seems not use my matrix-free operation > > > By default if SNESSetJacobian() is not called with a function pointer PETSc attempts to compute the Jacobian matrix explicitly with finite differences and coloring. This doesn't makes sense with a shell matrix. Hence the error message below regarding MatFDColoringCreate(). > > DMSNESSetJacobianLocal() calls SNESSetJacobian() with a function pointer of SNESComputeJacobian_DMLocal() so preventing the error from triggering in your code. > > You can provide your own function to SNESSetJacobian() and thus not need to call DMSNESSetJacobianLocal(). What you do depends on how you want to record the "base" vector that tells your matrix-free multiply routine where the Jacobian matrix vector product is being applied, that is J(u)*x. u is the "base" vector which is passed to the function provided with SNESSetJacobian(). > > Barry > > > > On Jan 10, 2024, at 6:20?AM, Yi Hu > wrote: > > Thanks for the clarification. It is more clear to me now about the global to local processes after checking the examples, e.g. ksp/ksp/tutorials/ex14f.F90. > > And for using Vec locally, I followed your advice of VecGet.. and VecRestore? In fact I used DMDAVecGetArrayReadF90() and some other relevant subroutines. > > For your comment on DMSNESSetJacobianLocal(). It seems that I need to use both SNESSetJacobian() and then DMSNESSetJacobianLocal() to get things working. When I do only SNESSetJacobian(), it does not work, meaning the following does not work > > ?? > call DMDASNESsetFunctionLocal(DM_mech,INSERT_VALUES,formResidual,PETSC_NULL_SNES,err_PETSc) > CHKERRQ(err_PETSc) > call MatCreateShell(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,& > 9*product(cells(1:2))*cells3,9*product(cells(1:2))*cells3,& > 0,Jac_PETSc,err_PETSc) > CHKERRQ(err_PETSc) > call MatShellSetOperation(Jac_PETSc,MATOP_MULT,GK_op,err_PETSc) > CHKERRQ(err_PETSc) > call SNESSetJacobian(SNES_mech,Jac_PETSc,Jac_PETSc,PETSC_NULL_FUNCTION,0,err_PETSc) > CHKERRQ(err_PETSc) > !call DMSNESsetJacobianLocal(DM_mech,formJacobian,PETSC_NULL_SNES,err_PETSc) > !CHKERRQ(err_PETSc) > call SNESsetConvergenceTest(SNES_mech,converged,PETSC_NULL_SNES,PETSC_NULL_FUNCTION,err_PETSc) > CHKERRQ(err_PETSc) > call SNESSetDM(SNES_mech,DM_mech,err_PETSc) > CHKERRQ(err_PETSc) > ?? > > It gives me the message > [0]PETSC ERROR: No support for this operation for this object type > [0]PETSC ERROR: Code not yet written for matrix type shell > [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
> [0]PETSC ERROR: Petsc Release Version 3.16.4, Feb 02, 2022 > [0]PETSC ERROR: Configure options PETSC_ARCH=linux-gnu --with-fortran-bindings --with-mpi-f90module-visibility=0 --download-fftw --download-hdf5 --download-hdf5-fortran-bindings --download-fblaslapack --download-ml --download-zlib > [0]PETSC ERROR: #1 MatFDColoringCreate() at /home/yi/app/petsc-3.16.4/src/mat/matfd/fdmatrix.c:471 > [0]PETSC ERROR: #2 SNESComputeJacobian_DMDA() at /home/yi/app/petsc-3.16.4/src/snes/utils/dmdasnes.c:173[0]PETSC ERROR: #3 SNESComputeJacobian() at /home/yi/app/petsc-3.16.4/src/snes/interface/snes.c:2864 > [0]PETSC ERROR: #4 SNESSolve_NEWTONLS() at /home/yi/app/petsc-3.16.4/src/snes/impls/ls/ls.c:222 > [0]PETSC ERROR: #5 SNESSolve() at /home/yi/app/petsc-3.16.4/src/snes/interface/snes.c:4809 > [0]PETSC ERROR: #6 User provided function() at User file:0 > [0]PETSC ERROR: #7 VecSetErrorIfLocked() at /home/yi/app/petsc-3.16.4/include/petscvec.h:623 > [0]PETSC ERROR: #8 VecGetArray() at /home/yi/app/petsc-3.16.4/src/vec/vec/interface/rvector.c:1769 > [0]PETSC ERROR: #9 User provided function() at User file:0 > [0]PETSC ERROR: #10 MatFDColoringCreate() at /home/yi/app/petsc-3.16.4/src/mat/matfd/fdmatrix.c:471 > [0]PETSC ERROR: #11 SNESComputeJacobian_DMDA() at /home/yi/app/petsc-3.16.4/src/snes/utils/dmdasnes.c:173 > [0]PETSC ERROR: #12 SNESComputeJacobian() at /home/yi/app/petsc-3.16.4/src/snes/interface/snes.c:2864 > [0]PETSC ERROR: #13 SNESSolve_NEWTONLS() at /home/yi/app/petsc-3.16.4/src/snes/impls/ls/ls.c:222 > [0]PETSC ERROR: #14 SNESSolve() at /home/yi/app/petsc-3.16.4/src/snes/interface/snes.c:4809 > > It seems that I have to use a DMSNESSetJacobianLocal() to ?activate? the use of my shell matrix, although the formJacobian() in the DMSNESSetJacobianLocal() is doing nothing. > > Best wishes, > Yi > > > > From: Barry Smith > > Sent: Tuesday, January 9, 2024 4:49 PM > To: Yi Hu > > Cc: petsc-users > > Subject: Re: [petsc-users] SNES seems not use my matrix-free operation > > However, my GK_op (which is the reloaded MATOP_MULT) gives me some problem. It is entered but then crashed with Segmentation Violation error. So I guess my indices may be wrong. I wonder do I need to use the local Vec (of dF), and should my output Vec also be in the correct shape (i.e. after calculation I need to transform back into a Vec)? As you can see here, my dF is a tensor defined on every grid point. > > > The input for the matrix-vector product is a global vector, as is the result. (Not like the arguments to DMSNESSetJacobianLocal). > > This means that your MATOP_MULT function needs to do the DMGlobalToLocal() vector operation first then the "unwrapping" from the vector to the array format at the beginning of the routine. Similarly it needs to "unwrap" the result vector as an array. See src/snes/tutorials/ex14f.F90 and in particular the code block > > PetscCall(DMGlobalToLocalBegin(da,X,INSERT_VALUES,localX,ierr)) > PetscCall(DMGlobalToLocalEnd(da,X,INSERT_VALUES,localX,ierr)) > > ! Get pointers to vector data > > PetscCall(VecGetArrayReadF90(localX,xx,ierr)) > PetscCall(VecGetArrayF90(F,ff,ierr)) > > Barry > > You really shouldn't be using DMSNESSetJacobianLocal() for your code. Basically all the DMSNESSetJacobianLocal() gives you is that it automatically handles the global to local mapping and unwrapping of the vector to an array, but it doesn't work for shell matrices. > > > > > On Jan 9, 2024, at 6:30?AM, Yi Hu > wrote: > > Dear Barry, > > Thanks for your help. 
> > It works when doing first SNESSetJacobian() with my created shell matrix Jac in the main (or module) and then DMSNESSetJacobianLocal() to associate with my DM and an dummy formJacobian callback (which is doing nothing). My SNES can now recognize my shell matrix and do my customized operation. > > However, my GK_op (which is the reloaded MATOP_MULT) gives me some problem. It is entered but then crashed with Segmentation Violation error. So I guess my indices may be wrong. I wonder do I need to use the local Vec (of dF), and should my output Vec also be in the correct shape (i.e. after calculation I need to transform back into a Vec)? As you can see here, my dF is a tensor defined on every grid point. > > Best wishes, > Yi > > From: Barry Smith > > Sent: Monday, January 8, 2024 6:41 PM > To: Yi Hu > > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] SNES seems not use my matrix-free operation > > > "formJacobian" should not be __creating__ the matrices. Here "form" means computing the numerical values in the matrix (or when using a shell matrix it means keeping a copy of X so that your custom matrix-free multiply knows the base location where the matrix free Jacobian-vector products are computed.) > > You create the shell matrices up in your main program and pass them in with SNESSetJacobian(). > > Try first calling SNESSetJacobian() to provide the matrices (provide a dummy function argument) and then call DMSNESSetJacobianLocal() to provide your "formjacobian" function (that does not create the matrices). > > Barry > > > Yes, "form" is a bad word that should not have been used in our code. > > > > > > > On Jan 8, 2024, at 12:24?PM, Yi Hu > wrote: > > Dear PETSc Experts, > > I am implementing a matrix-free jacobian for my SNES solver in Fortran. (command line option -snes_type newtonls -ksp_type gmres) > > In the main program, I define my residual and jacobian and matrix-free jacobian like the following, > > ? > call DMDASNESSetFunctionLocal(DM_mech, INSERT_VALUES, formResidual, PETSC_NULL_SNES, err_PETSc) > call DMSNESSetJacobianLocal(DM_mech, formJacobian, PETSC_NULL_SNES, err_PETSc) > ? > > subroutine formJacobian(residual_subdomain,F,Jac_pre,Jac,dummy,err_PETSc) > > #include > use petscmat > implicit None > DMDALocalInfo, dimension(DMDA_LOCAL_INFO_SIZE) :: & > residual_subdomain !< DMDA info (needs to be named "in" for macros like XRANGE to work) > real(pREAL), dimension(3,3,cells(1),cells(2),cells3), intent(in) :: & > F !< deformation gradient field > Mat :: Jac, Jac_pre > PetscObject :: dummy > PetscErrorCode :: err_PETSc > PetscInt :: N_dof ! global number of DoF, maybe only a placeholder > > N_dof = 9*product(cells(1:2))*cells3 > > print*, 'in my jac' > > call MatCreateShell(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,N_dof,N_dof,0,Jac,err_PETSc) > CHKERRQ(err_PETSc) > call MatShellSetOperation(Jac,MATOP_MULT,GK_op,err_PETSc) > CHKERRQ(err_PETSc) > > print*, 'in my jac' > > ! 
for jac preconditioner > call MatCreateShell(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,N_dof,N_dof,0,Jac_pre,err_PETSc) > CHKERRQ(err_PETSc) > call MatShellSetOperation(Jac_pre,MATOP_MULT,GK_op,err_PETSc) > CHKERRQ(err_PETSc) > > print*, 'in my jac' > > end subroutine formJacobian > > subroutine GK_op(Jac,dF,output,err_PETSc) > real(pREAL), dimension(3,3,cells(1),cells(2),cells3), intent(in) :: & > dF > real(pREAL), dimension(3,3,cells(1),cells(2),cells3), intent(out) :: & > output > real(pREAL), dimension(3,3) :: & > deltaF_aim = 0.0_pREAL > > Mat :: Jac > PetscErrorCode :: err_PETSc > > integer :: i, j, k, e > > ? a lot of calculations ? > > print*, 'in GK op' > > end subroutine GK_op > > The first question is that: it seems I still need to explicitly define the interface of MatCreateShell() and MatShellSetOperation() to properly use them, even though I include them via ?use petscmat?. It is a little bit strange to me, since some examples do not perform this step. > > Then the main issue is that I can build my own Jacobian from my call back function formJacobian, and confirm my Jacobian is a shell matrix (by MatView). However, my customized operator GK_op is not called when solving the nonlinear system (not print my ?in GK op?). When I try to monitor my SNES, it gives me some conventional output not mentioning my matrix-free operations. So I guess my customized MATOP_MULT may be not associated with Jacobian. Or my configuration is somehow wrong. Could you help me solve this issue? > > Thanks, > Yi > > > > > > ------------------------------------------------- > Stay up to date and follow us on LinkedIn, Twitter and YouTube. > > Max-Planck-Institut f?r Eisenforschung GmbH > Max-Planck-Stra?e 1 > D-40237 D?sseldorf > > Handelsregister B 2533 > Amtsgericht D?sseldorf > > Gesch?ftsf?hrung > Prof. Dr. Gerhard Dehm > Prof. Dr. J?rg Neugebauer > Prof. Dr. Dierk Raabe > Dr. Kai de Weldige > > Ust.-Id.-Nr.: DE 11 93 58 514 > Steuernummer: 105 5891 1000 > > > Please consider that invitations and e-mails of our institute are > only valid if they end with ?@mpie.de. > If you are not sure of the validity please contact rco at mpie.de > > Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails > aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. > In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de > ------------------------------------------------- > > > > > ------------------------------------------------- > Stay up to date and follow us on LinkedIn, Twitter and YouTube. > > Max-Planck-Institut f?r Eisenforschung GmbH > Max-Planck-Stra?e 1 > D-40237 D?sseldorf > > Handelsregister B 2533 > Amtsgericht D?sseldorf > > Gesch?ftsf?hrung > Prof. Dr. Gerhard Dehm > Prof. Dr. J?rg Neugebauer > Prof. Dr. Dierk Raabe > Dr. Kai de Weldige > > Ust.-Id.-Nr.: DE 11 93 58 514 > Steuernummer: 105 5891 1000 > > > Please consider that invitations and e-mails of our institute are > only valid if they end with ?@mpie.de. > If you are not sure of the validity please contact rco at mpie.de > > Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails > aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. > In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de > ------------------------------------------------- > > > > ------------------------------------------------- > Stay up to date and follow us on LinkedIn, Twitter and YouTube. 
> > Max-Planck-Institut für Eisenforschung GmbH > Max-Planck-Straße 1 > D-40237 Düsseldorf > > Handelsregister B 2533 > Amtsgericht Düsseldorf > > Geschäftsführung > Prof. Dr. Gerhard Dehm > Prof. Dr. Jörg Neugebauer > Prof. Dr. Dierk Raabe > Dr. Kai de Weldige > > Ust.-Id.-Nr.: DE 11 93 58 514 > Steuernummer: 105 5891 1000 > > > Please consider that invitations and e-mails of our institute are > only valid if they end with "@mpie.de". > If you are not sure of the validity please contact rco at mpie.de > > Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails > aus unserem Haus nur mit der Endung "@mpie.de" gültig sind. > In Zweifelsfällen wenden Sie sich bitte an rco at mpie.de > ------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From dontbugthedevs at proton.me Thu Jan 11 10:10:32 2024 From: dontbugthedevs at proton.me (Noam T.) Date: Thu, 11 Jan 2024 16:10:32 +0000 Subject: [petsc-users] [Gmsh] Access both default sets and region names Message-ID: Would it be feasible to have an option (e.g. a new flag along the lines of -dm_plex_gmsh_...) that allows the user to access both the default sets (Cell / Face / Vertex) together with user-defined groups (those under $PhysicalNames, available when using -dm_plex_gmsh_use_regions)? That is, with a *.msh file containing

$PhysicalNames
2
2 100 "my_surface"
3 200 "my_vol"

the return of DMGetLabelName(dm, n, name) would be (order may differ)

n = 0, name = "celltype"
n = 1, name = "depth"
n = 2, name = "Cell Sets"
n = 3, name = "my_vol"
n = 4, name = "Face Sets"
n = 5, name = "my_surface"
...

I poked into src/dm/impls/plex/plexgmsh.c and have managed to print all the labels after changing a couple of variable values, so perhaps it is doable. The changes made are not a solution, simply naively setting some variables to skip checking for the use_regions flag, so it understandably crashes soon after. Thanks, Noam -------------- next part -------------- An HTML attachment was scrubbed... URL: From swilksen at itp.uni-bremen.de Thu Jan 11 10:18:10 2024 From: swilksen at itp.uni-bremen.de (Steffen Wilksen | Universitaet Bremen) Date: Thu, 11 Jan 2024 17:18:10 +0100 Subject: [petsc-users] Parallel processes run significantly slower Message-ID: <20240111171810.Horde.KBkZFRx_rd4KbBh9zYpql9G@webmail.uni-bremen.de> Hi all, I'm trying to do repeated matrix-vector multiplication of large sparse matrices in Python using petsc4py. Even the most simple method of parallelization, dividing up the calculation to run on multiple processes independently, does not seem to give a significant speed up for large matrices.
I constructed a minimal working example, which I run using mpiexec -n N python parallel_example.py, where N is the number of processes. Instead of taking approximately the same time irrespective of the number of processes used, the calculation is much slower when starting more MPI processes. This translates to little to no speed up when splitting up a fixed number of calculations over N processes. As an example, running with N=1 takes 9s, while running with N=4 takes 34s. When running with smaller matrices, the problem is not as severe (only slower by a factor of 1.5 when setting MATSIZE=1e+5 instead of MATSIZE=1e+6). I get the same problems when just starting the script four times manually without using MPI. I attached both the script and the log file for running the script with N=4. Any help would be greatly appreciated. Calculations are done on my laptop, arch linux version 6.6.8 and PETSc version 3.20.2. Kind Regards Steffen -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: parallel_example.py Type: text/x-python Size: 417 bytes Desc: not available URL: From knepley at gmail.com Thu Jan 11 10:28:35 2024 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 11 Jan 2024 11:28:35 -0500 Subject: [petsc-users] [Gmsh] Access both default sets and region names In-Reply-To: References: Message-ID: On Thu, Jan 11, 2024 at 11:18?AM Noam T. via petsc-users < petsc-users at mcs.anl.gov> wrote: > Would it be feasible to have an option (e.g. new flag along the lines of > -dm_plex_gmsh_...) that allows the user to access both the default sets > (Cell / Face / Vertex) together with user-defined gorups (those under > $PhysicalNames, available when using -dm_plex_gmsh_use_regions)? > I am not sure I understand the question. When you turn on regions, it makes extra labels, but the generic labels still exist. Thanks, Matt > That is, with a *.msh file containing > > $PhysicalNames > 2 > 2 100 "my_surface" > 3 200 "my_vol" > > the return of DMGetLabelName(dm, n, name) would be (order may differ) > > n = 0, name = "celltype" > n = 1, name = "depth" > n = 2, name = "Cell Sets" > n = 3, name = "my_vol" > n = 4, name = "Face Sets" > n = 5, name = "my_surface" > ... > > I poked into src/dm/impls/plex/plexgmsh.c and have managed to print all > the labels after changing a couple of variable values, so perhaps it is > doable. > The changes made are not a solution, simply naively set some variables to > skip checking for the use_regions flag, so it understandably crashes soon > after. > > Thanks, Noam > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefano.zampini at gmail.com Thu Jan 11 10:37:32 2024 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Thu, 11 Jan 2024 19:37:32 +0300 Subject: [petsc-users] Parallel processes run significantly slower In-Reply-To: <20240111171810.Horde.KBkZFRx_rd4KbBh9zYpql9G@webmail.uni-bremen.de> References: <20240111171810.Horde.KBkZFRx_rd4KbBh9zYpql9G@webmail.uni-bremen.de> Message-ID: You are creating the matrix on the wrong communicator if you want it parallel. 
You are using PETSc.COMM_SELF On Thu, Jan 11, 2024, 19:28 Steffen Wilksen | Universitaet Bremen < swilksen at itp.uni-bremen.de> wrote: > Hi all, > > I'm trying to do repeated matrix-vector-multiplication of large sparse > matrices in python using petsc4py. Even the most simple method of > parallelization, dividing up the calculation to run on multiple processes > indenpendtly, does not seem to give a singnificant speed up for large > matrices. I constructed a minimal working example, which I run using > > mpiexec -n N python parallel_example.py, > > where N is the number of processes. Instead of taking approximately the > same time irrespective of the number of processes used, the calculation is > much slower when starting more MPI processes. This translates to little to > no speed up when splitting up a fixed number of calculations over N > processes. As an example, running with N=1 takes 9s, while running with N=4 > takes 34s. When running with smaller matrices, the problem is not as severe > (only slower by a factor of 1.5 when setting MATSIZE=1e+5 instead of > MATSIZE=1e+6). I get the same problems when just starting the script four > times manually without using MPI. > I attached both the script and the log file for running the script with > N=4. Any help would be greatly appreciated. Calculations are done on my > laptop, arch linux version 6.6.8 and PETSc version 3.20.2. > > Kind Regards > Steffen > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Thu Jan 11 10:56:24 2024 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 11 Jan 2024 11:56:24 -0500 Subject: [petsc-users] Parallel processes run significantly slower In-Reply-To: References: <20240111171810.Horde.KBkZFRx_rd4KbBh9zYpql9G@webmail.uni-bremen.de> Message-ID: Take a look at the discussion in https://petsc.gitlab.io/-/petsc/-/jobs/5814862879/artifacts/public/html/manual/streams.html and I suggest you run the streams benchmark from the branch barry/2023-09-15/fix-log-pcmpi on your machine to get a baseline for what kind of speedup you can expect. Then let us know your thoughts. Barry > On Jan 11, 2024, at 11:37?AM, Stefano Zampini wrote: > > You are creating the matrix on the wrong communicator if you want it parallel. You are using PETSc.COMM_SELF > > On Thu, Jan 11, 2024, 19:28 Steffen Wilksen | Universitaet Bremen > wrote: >> Hi all, >> >> I'm trying to do repeated matrix-vector-multiplication of large sparse matrices in python using petsc4py. Even the most simple method of parallelization, dividing up the calculation to run on multiple processes indenpendtly, does not seem to give a singnificant speed up for large matrices. I constructed a minimal working example, which I run using >> >> mpiexec -n N python parallel_example.py, >> >> where N is the number of processes. Instead of taking approximately the same time irrespective of the number of processes used, the calculation is much slower when starting more MPI processes. This translates to little to no speed up when splitting up a fixed number of calculations over N processes. As an example, running with N=1 takes 9s, while running with N=4 takes 34s. When running with smaller matrices, the problem is not as severe (only slower by a factor of 1.5 when setting MATSIZE=1e+5 instead of MATSIZE=1e+6). I get the same problems when just starting the script four times manually without using MPI. >> I attached both the script and the log file for running the script with N=4. Any help would be greatly appreciated. 
Calculations are done on my laptop, arch linux version 6.6.8 and PETSc version 3.20.2. >> >> Kind Regards >> Steffen >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From dontbugthedevs at proton.me Thu Jan 11 10:58:55 2024 From: dontbugthedevs at proton.me (Noam T.) Date: Thu, 11 Jan 2024 16:58:55 +0000 Subject: [petsc-users] [Gmsh] Access both default sets and region names In-Reply-To: References: Message-ID: Without using the flag -dm_plex_gmsh_use_regions, the DMGetNumLabels says there are 5, named celltype, depth, cell/face/vertex sets. With the flag, the labels are celltype, depth, my_vol, my_surface (using the same example as before). Am I misusing the flag somehow, and I should be able to access those of cell/face/vertex as well? PS: Using PETSc 3.20.3 Thanks, Noam On Thursday, January 11th, 2024 at 5:28 PM, Matthew Knepley wrote: > On Thu, Jan 11, 2024 at 11:18?AM Noam T. via petsc-users wrote: > >> Would it be feasible to have an option (e.g. new flag along the lines of -dm_plex_gmsh_...) that allows the user to access both the default sets (Cell / Face / Vertex) together with user-defined gorups (those under $PhysicalNames, available when using -dm_plex_gmsh_use_regions)? > > I am not sure I understand the question. When you turn on regions, it makes extra labels, but the generic labels still exist. > > Thanks, > > Matt > >> That is, with a *.msh file containing >> >> $PhysicalNames >> 2 >> 2 100 "my_surface" >> 3 200 "my_vol" >> >> the return of DMGetLabelName(dm, n, name) would be (order may differ) >> >> n = 0, name = "celltype" >> n = 1, name = "depth" >> n = 2, name = "Cell Sets" >> n = 3, name = "my_vol" >> n = 4, name = "Face Sets" >> n = 5, name = "my_surface" >> ... >> >> I poked into src/dm/impls/plex/plexgmsh.c and have managed to print all the labels after changing a couple of variable values, so perhaps it is doable. >> The changes made are not a solution, simply naively set some variables to skip checking for the use_regions flag, so it understandably crashes soon after. >> >> Thanks, Noam > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > [https://www.cse.buffalo.edu/~knepley/](http://www.cse.buffalo.edu/~knepley/) -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Thu Jan 11 11:26:24 2024 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 11 Jan 2024 12:26:24 -0500 Subject: [petsc-users] KSP number of iterations different from size of residual array In-Reply-To: References: Message-ID: <3E3E3CCB-4236-42C9-A351-5DCDFFD125B8@petsc.dev> Trying again. Normally, numIts would be one less than nEntries since the initial residual is computed (and stored in the history) before any iterations. Is this what you are seeing or are you seeing other values for the two? I've started a run of the PETSc test suite that compares the two values for inconsistencies for all tests to see if I can find any problems. Barry Also note the importance of the reset value in KSPSetResidualHistory() which means the values will not match when reset is PETSC_FALSE. > On Jan 10, 2024, at 7:09?PM, Kevin G. Wang wrote: > > Hello everyone! > > I am writing a code that uses PETSc/KSP to solve linear systems. 
I just realized that after running "KSPSolve(...)", the number of iterations given by > > KSPGetIterationNumber(ksp, &numIts) > > is *different* from the size of the residual history given by > > KSPGetResidualHistory(ksp, NULL, &nEntries); > > That is, "numIts" is not equal to "nEntries". Is this expected, or a bug in my code? (I thought they should be the same...) > > I have tried several pairs of solvers and preconditioners (e.g., fgmres & bjacobi, ibcgs & bjacobi). This issue happens to all of them. > > Thanks! > Kevin > > -- > Kevin G. Wang, Ph.D. > Associate Professor > Kevin T. Crofton Department of Aerospace and Ocean Engineering > Virginia Tech > 1600 Innovation Dr., VTSS Rm 224H, Blacksburg, VA 24061 > Office: (540) 231-7547 | Mobile: (650) 862-2663 > URL: https://www.aoe.vt.edu/people/faculty/wang.html > Codes: https://github.com/kevinwgy -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Jan 11 11:31:33 2024 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 11 Jan 2024 12:31:33 -0500 Subject: [petsc-users] [Gmsh] Access both default sets and region names In-Reply-To: References: Message-ID: On Thu, Jan 11, 2024 at 11:59?AM Noam T. wrote: > Without using the flag -dm_plex_gmsh_use_regions, the DMGetNumLabels says > there are 5, named celltype, depth, cell/face/vertex sets. > With the flag, the labels are celltype, depth, my_vol, my_surface (using > the same example as before). > Am I misusing the flag somehow, and I should be able to access those of > cell/face/vertex as well? > Shoot, yes this changed after another request. Yes, we can put in a flag for that. Should not take long. Thanks, Matt > PS: Using PETSc 3.20.3 > > Thanks, > Noam > On Thursday, January 11th, 2024 at 5:28 PM, Matthew Knepley < > knepley at gmail.com> wrote: > > On Thu, Jan 11, 2024 at 11:18?AM Noam T. via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> Would it be feasible to have an option (e.g. new flag along the lines of >> -dm_plex_gmsh_...) that allows the user to access both the default sets >> (Cell / Face / Vertex) together with user-defined gorups (those under >> $PhysicalNames, available when using -dm_plex_gmsh_use_regions)? >> > > I am not sure I understand the question. When you turn on regions, it > makes extra labels, but the generic labels still exist. > > Thanks, > > Matt > >> That is, with a *.msh file containing >> >> $PhysicalNames >> 2 >> 2 100 "my_surface" >> 3 200 "my_vol" >> >> the return of DMGetLabelName(dm, n, name) would be (order may differ) >> >> n = 0, name = "celltype" >> n = 1, name = "depth" >> n = 2, name = "Cell Sets" >> n = 3, name = "my_vol" >> n = 4, name = "Face Sets" >> n = 5, name = "my_surface" >> ... >> >> I poked into src/dm/impls/plex/plexgmsh.c and have managed to print all >> the labels after changing a couple of variable values, so perhaps it is >> doable. >> The changes made are not a solution, simply naively set some variables to >> skip checking for the use_regions flag, so it understandably crashes soon >> after. >> >> Thanks, Noam >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From dontbugthedevs at proton.me Thu Jan 11 11:53:07 2024 From: dontbugthedevs at proton.me (Noam T.) Date: Thu, 11 Jan 2024 17:53:07 +0000 Subject: [petsc-users] [Gmsh] Access both default sets and region names In-Reply-To: References: Message-ID: <6SJlhZpUt0lCEjV2_bNQqbqUVNP7wlSyUNdERkJseZyGPel-GMgPFhIPnvwcpsvn0TNmRdhj5yV8th6vgR9MbSFxs1Z9CunqR5j0Fg-tE8w=@proton.me> There could be some overlapping/redundancy between the default and the user-defined groups, so perhaps that was the intended behavior. Glad to hear it's possible to have access to everything. Thanks, Noam On Thursday, January 11th, 2024 at 6:31 PM, Matthew Knepley wrote: > On Thu, Jan 11, 2024 at 11:59?AM Noam T. wrote: > >> Without using the flag -dm_plex_gmsh_use_regions, the DMGetNumLabels says there are 5, named celltype, depth, cell/face/vertex sets. >> With the flag, the labels are celltype, depth, my_vol, my_surface (using the same example as before). >> Am I misusing the flag somehow, and I should be able to access those of cell/face/vertex as well? > > Shoot, yes this changed after another request. Yes, we can put in a flag for that. Should not take long. > > Thanks, > > Matt > >> PS: Using PETSc 3.20.3 >> >> Thanks, >> Noam >> On Thursday, January 11th, 2024 at 5:28 PM, Matthew Knepley wrote: >> >>> On Thu, Jan 11, 2024 at 11:18?AM Noam T. via petsc-users wrote: >>> >>>> Would it be feasible to have an option (e.g. new flag along the lines of -dm_plex_gmsh_...) that allows the user to access both the default sets (Cell / Face / Vertex) together with user-defined gorups (those under $PhysicalNames, available when using -dm_plex_gmsh_use_regions)? >>> >>> I am not sure I understand the question. When you turn on regions, it makes extra labels, but the generic labels still exist. >>> >>> Thanks, >>> >>> Matt >>> >>>> That is, with a *.msh file containing >>>> >>>> $PhysicalNames >>>> 2 >>>> 2 100 "my_surface" >>>> 3 200 "my_vol" >>>> >>>> the return of DMGetLabelName(dm, n, name) would be (order may differ) >>>> >>>> n = 0, name = "celltype" >>>> n = 1, name = "depth" >>>> n = 2, name = "Cell Sets" >>>> n = 3, name = "my_vol" >>>> n = 4, name = "Face Sets" >>>> n = 5, name = "my_surface" >>>> ... >>>> >>>> I poked into src/dm/impls/plex/plexgmsh.c and have managed to print all the labels after changing a couple of variable values, so perhaps it is doable. >>>> The changes made are not a solution, simply naively set some variables to skip checking for the use_regions flag, so it understandably crashes soon after. >>>> >>>> Thanks, Noam >>> >>> -- >>> >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >>> >>> [https://www.cse.buffalo.edu/~knepley/](http://www.cse.buffalo.edu/~knepley/) > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > [https://www.cse.buffalo.edu/~knepley/](http://www.cse.buffalo.edu/~knepley/) -------------- next part -------------- An HTML attachment was scrubbed... 
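A small diagnostic that can help when comparing these flag combinations is to list every label the Gmsh reader attached to the DM. A hedged C sketch follows; the file name is a placeholder, and exactly which labels appear depends on the -dm_plex_gmsh_* options passed at run time.

#include <petscdmplex.h>

int main(int argc, char **argv)
{
  DM       dm;
  PetscInt nlabels;

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  /* Read the Gmsh file; labels such as "Cell Sets" or the $PhysicalNames
     groups are created at this point according to the -dm_plex_gmsh_* options. */
  PetscCall(DMPlexCreateFromFile(PETSC_COMM_WORLD, "mesh.msh", "mesh", PETSC_TRUE, &dm));
  PetscCall(DMGetNumLabels(dm, &nlabels));
  for (PetscInt l = 0; l < nlabels; l++) {
    const char *name;

    PetscCall(DMGetLabelName(dm, l, &name));
    PetscCall(PetscPrintf(PETSC_COMM_WORLD, "label %" PetscInt_FMT ": %s\n", l, name));
  }
  PetscCall(DMDestroy(&dm));
  PetscCall(PetscFinalize());
  return 0;
}

Running it with and without -dm_plex_gmsh_use_regions shows the two label sets described above.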
URL: From knepley at gmail.com Thu Jan 11 14:34:36 2024 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 11 Jan 2024 15:34:36 -0500 Subject: [petsc-users] [Gmsh] Access both default sets and region names In-Reply-To: <6SJlhZpUt0lCEjV2_bNQqbqUVNP7wlSyUNdERkJseZyGPel-GMgPFhIPnvwcpsvn0TNmRdhj5yV8th6vgR9MbSFxs1Z9CunqR5j0Fg-tE8w=@proton.me> References: <6SJlhZpUt0lCEjV2_bNQqbqUVNP7wlSyUNdERkJseZyGPel-GMgPFhIPnvwcpsvn0TNmRdhj5yV8th6vgR9MbSFxs1Z9CunqR5j0Fg-tE8w=@proton.me> Message-ID: On Thu, Jan 11, 2024 at 12:53?PM Noam T. wrote: > There could be some overlapping/redundancy between the default and the > user-defined groups, so perhaps that was the intended behavior. Glad to > hear it's possible to have access to everything. > Here is the MR: https://gitlab.com/petsc/petsc/-/merge_requests/7178 If you build that branch, you can use -dm_plex_gmsh_use_generic to turn on those labels. Thanks, Matt > Thanks, > Noam > On Thursday, January 11th, 2024 at 6:31 PM, Matthew Knepley < > knepley at gmail.com> wrote: > > On Thu, Jan 11, 2024 at 11:59?AM Noam T. wrote: > >> Without using the flag -dm_plex_gmsh_use_regions, the DMGetNumLabels >> says there are 5, named celltype, depth, cell/face/vertex sets. >> With the flag, the labels are celltype, depth, my_vol, my_surface (using >> the same example as before). >> Am I misusing the flag somehow, and I should be able to access those of >> cell/face/vertex as well? >> > > Shoot, yes this changed after another request. Yes, we can put in a flag > for that. Should not take long. > > Thanks, > > Matt > >> PS: Using PETSc 3.20.3 >> >> Thanks, >> Noam >> On Thursday, January 11th, 2024 at 5:28 PM, Matthew Knepley < >> knepley at gmail.com> wrote: >> >> On Thu, Jan 11, 2024 at 11:18?AM Noam T. via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> >>> Would it be feasible to have an option (e.g. new flag along the lines of >>> -dm_plex_gmsh_...) that allows the user to access both the default sets >>> (Cell / Face / Vertex) together with user-defined gorups (those under >>> $PhysicalNames, available when using -dm_plex_gmsh_use_regions)? >>> >> >> I am not sure I understand the question. When you turn on regions, it >> makes extra labels, but the generic labels still exist. >> >> Thanks, >> >> Matt >> >>> That is, with a *.msh file containing >>> >>> $PhysicalNames >>> 2 >>> 2 100 "my_surface" >>> 3 200 "my_vol" >>> >>> the return of DMGetLabelName(dm, n, name) would be (order may differ) >>> >>> n = 0, name = "celltype" >>> n = 1, name = "depth" >>> n = 2, name = "Cell Sets" >>> n = 3, name = "my_vol" >>> n = 4, name = "Face Sets" >>> n = 5, name = "my_surface" >>> ... >>> >>> I poked into src/dm/impls/plex/plexgmsh.c and have managed to print all >>> the labels after changing a couple of variable values, so perhaps it is >>> doable. >>> The changes made are not a solution, simply naively set some variables >>> to skip checking for the use_regions flag, so it understandably crashes >>> soon after. >>> >>> Thanks, Noam >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. 
> -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Thu Jan 11 20:51:04 2024 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 11 Jan 2024 21:51:04 -0500 Subject: [petsc-users] KSP number of iterations different from size of residual array In-Reply-To: References: <3E3E3CCB-4236-42C9-A351-5DCDFFD125B8@petsc.dev> Message-ID: <5EFE8226-BF13-4F19-84B9-8982961C7099@petsc.dev> Kevin A couple of different things are at play here producing the unexpected results. I have created a merge request https://gitlab.com/petsc/petsc/-/merge_requests/7179 clarifying why the results obtained from the KSPGetResidualHistory() and KSPGetIterationNumber() can be different in the docs. I also fixed a couple of locations of KSPLogResidual() (in gmres and fgmres) that resulted in extra incorrect logging of the history. In summary, with the "standard" textbook Krylov methods, one expects numIts = nEntries - 1, but this need not be the case for advanced Krylov methods (like those with inner iterations or pipelining) or under exceptional circumstances like the use of CG in trust region methods. Barry > On Jan 11, 2024, at 1:20?PM, Kevin G. Wang wrote: > > Hi Barry, > > Thanks for your help!! > > I have checked that in KSPSetResidualHistory, "reset" is set to PETSC_TRUE. I did a few quick tests after reading your message. There seems to be some patterns between "numIts" (given by KSPGetIterationNumber) and "nEntries" (given by KSPGetResidualHistory): > > 1. With gmres or fgmres as the solver: > - If the number of iterations (until error tolerance is met) is small, like 20 - 30, indeed as you said, numIts = nEntries - 1. > - if the number of iterations is large, this is no longer true. I have a case where nEntries = 372, numIts = 360. > 2. With bcgsl, it looks like numIts = 2*(nEntries - 1). > 3. With ibcgs, nEntries = 0, while numIts is nonzero. > > In all these tests, I have set the preconditioner to "none". > > My code (where the KSP functions are called) is here: https://github.com/kevinwgy/m2c/blob/main/LinearSystemSolver.cpp > > I am using PETSc 3.12.4. > > Thanks! > Kevin > > > On Thu, Jan 11, 2024 at 12:26?PM Barry Smith > wrote: >> >> Trying again. >> >> Normally, numIts would be one less than nEntries since the initial residual is computed (and stored in the history) before any iterations. >> >> Is this what you are seeing or are you seeing other values for the two? >> >> I've started a run of the PETSc test suite that compares the two values for inconsistencies for all tests to see if I can find any problems. >> >> Barry >> >> Also note the importance of the reset value in KSPSetResidualHistory() which means the values will not match when reset is PETSC_FALSE. >> >>> On Jan 10, 2024, at 7:09?PM, Kevin G. Wang > wrote: >>> >>> Hello everyone! >>> >>> I am writing a code that uses PETSc/KSP to solve linear systems. I just realized that after running "KSPSolve(...)", the number of iterations given by >>> >>> KSPGetIterationNumber(ksp, &numIts) >>> >>> is *different* from the size of the residual history given by >>> >>> KSPGetResidualHistory(ksp, NULL, &nEntries); >>> >>> That is, "numIts" is not equal to "nEntries". Is this expected, or a bug in my code? 
(I thought they should be the same...) >>> >>> I have tried several pairs of solvers and preconditioners (e.g., fgmres & bjacobi, ibcgs & bjacobi). This issue happens to all of them. >>> >>> Thanks! >>> Kevin >>> >>> -- >>> Kevin G. Wang, Ph.D. >>> Associate Professor >>> Kevin T. Crofton Department of Aerospace and Ocean Engineering >>> Virginia Tech >>> 1600 Innovation Dr., VTSS Rm 224H, Blacksburg, VA 24061 >>> Office: (540) 231-7547 | Mobile: (650) 862-2663 >>> URL: https://www.aoe.vt.edu/people/faculty/wang.html >>> Codes: https://github.com/kevinwgy >> > > > -- > Kevin G. Wang, Ph.D. > Associate Professor > Kevin T. Crofton Department of Aerospace and Ocean Engineering > Virginia Tech > 1600 Innovation Dr., VTSS Rm 224H, Blacksburg, VA 24061 > Office: (540) 231-7547 | Mobile: (650) 862-2663 > URL: https://www.aoe.vt.edu/people/faculty/wang.html > Codes: https://github.com/kevinwgy -------------- next part -------------- An HTML attachment was scrubbed... URL: From sawsan.shatanawi at wsu.edu Thu Jan 11 22:49:32 2024 From: sawsan.shatanawi at wsu.edu (Shatanawi, Sawsan Muhammad) Date: Fri, 12 Jan 2024 04:49:32 +0000 Subject: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code In-Reply-To: <9BD74F88-20FD-4021-B02A-195A58B72282@petsc.dev> References: <17B6144E-4F3A-4173-9818-4B03669736C4@petsc.dev> <9BD74F88-20FD-4021-B02A-195A58B72282@petsc.dev> Message-ID: Hello, Thank you all for your help. I have changed VecGetArray to VecGetArrayF90, and the location of destory call. but I want to make sure that VecGet ArrayF90 is to make a new array( vector) that I can use in the rest of my Fortran code? when I run it and debugged it, I got 5.2000000E-03 50.00000 10.00000 0.0000000E+00 PETSC: Attaching gdb to /weka/data/lab/richey/sawsan/GW_CODE/code2024/SS_GWM/./GW.exe of pid 33065 on display :0.0 on machine sn16 Unable to start debugger in xterm: No such file or directory 0.0000000E+00 Attempting to use an MPI routine after finalizing MPICH srun: error: sn16: task 0: Exited with exit code 1 [sawsan.shatanawi at login-p2n02 SS_GWM]$ gdb ./GW/exe GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-100.el7 Copyright (C) 2013 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-redhat-linux-gnu". For bug reporting instructions, please see: ... ./GW/exe: No such file or directory. (gdb) run Starting program: No executable file specified. Use the "file" or "exec-file" command. (gdb) bt No stack. (gdb) If the highlighted line is the error, I don't know why when I write gdb , it does not show me the location of error The code : sshatanawi/SS_GWM (github.com) I really appreciate your helps Sawsan ________________________________ From: Barry Smith Sent: Wednesday, January 10, 2024 5:35 PM To: Junchao Zhang Cc: Shatanawi, Sawsan Muhammad ; Mark Adams ; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code [EXTERNAL EMAIL] On Jan 10, 2024, at 6:49?PM, Junchao Zhang wrote: Hi, Sawsan, I could build your code and I also could gdb it. $ gdb ./GW.exe ... $ Thread 1 "GW.exe" received signal SIGSEGV, Segmentation fault. 
0x00007ffff1e6d44f in vecgetarray_ (x=0x7fffffffa718, fa=0x0, ia=0x7fffffffa75c, ierr=0x0) at /scratch/jczhang/petsc/src/vec/vec/interface/ftn-custom/zvectorf.c:257 257 *ierr = VecGetArray(*x, &lx); (gdb) bt #0 0x00007ffff1e6d44f in vecgetarray_ (x=0x7fffffffa718, fa=0x0, ia=0x7fffffffa75c, ierr=0x0) at /scratch/jczhang/petsc/src/vec/vec/interface/ftn-custom/zvectorf.c:257 #1 0x000000000040b6e3 in gw_solver (t_s=1.40129846e-45, n=300) at GW_solver_try.F90:169 #2 0x000000000040c6a8 in test_gw () at test_main.F90:35 ierr=0x0 caused the segfault. See https://petsc.org/release/manualpages/Vec/VecGetArray/#vecgetarray, you should use VecGetArrayF90 instead. BTW, Barry, the code https://github.com/sshatanawi/SS_GWM/blob/main/GW_solver_try.F90#L169 has "call VecGetArray(temp_solution, H_vector, ierr)". I don't find petsc Fortran examples doing VecGetArray. Do we still support it? This is not the correct calling sequence for VecGetArray() from Fortran. Regardless, definitely should not be writing any new code that uses VecGetArray() from Fortran. Should use VecGetArrayF90(). --Junchao Zhang On Wed, Jan 10, 2024 at 2:38?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: Hello all, I hope you are doing well. Generally, I use gdb to debug the code. I got the attached error message. I have tried to add the flag -start_in_debugger in the make file, but it didn't work, so it seems I was doing it in the wrong way This is the link for the whole code: sshatanawi/SS_GWM (github.com) [https://opengraph.githubassets.com/9eb6cd14baf12f04848ed209b6f502415eb531bdd7b3a5f9696af68663b870c0/sshatanawi/SS_GWM] GitHub - sshatanawi/SS_GWM Contribute to sshatanawi/SS_GWM development by creating an account on GitHub. github.com ? You can read the description of the code in " Model Desprciption.pdf" the compiling file is makefile_f90 where you can find the linked code files I really appreciate your help Bests, Sawsan ________________________________ From: Mark Adams > Sent: Friday, January 5, 2024 4:53 AM To: Shatanawi, Sawsan Muhammad > Cc: Matthew Knepley >; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code [EXTERNAL EMAIL] This is a segv. As Matt said, you need to use a debugger for this or add print statements to narrow down the place where this happens. You will need to learn how to use debuggers to do your project so you might as well start now. If you have a machine with a GUI debugger that is easier but command line debuggers are good to learn anyway. I tend to run debuggers directly (eg, lldb ./a.out -- program-args ...) and use a GUI debugger (eg, Totalview or DDT) if available. Mark On Wed, Dec 20, 2023 at 10:02?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: Hello Matthew, Thank you for your help. I am sorry that I keep coming back with my error messages, but I reached a point that I don't know how to fix them, and I don't understand them easily. The list of errors is getting shorter, now I am getting the attached error messages Thank you again, Sawsan ________________________________ From: Matthew Knepley > Sent: Wednesday, December 20, 2023 6:54 PM To: Shatanawi, Sawsan Muhammad > Cc: Barry Smith >; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code [EXTERNAL EMAIL] On Wed, Dec 20, 2023 at 9:49?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: Hello Barry, Thank you a lot for your help, Now I am getting the attached error message. 
Do not destroy the PC from KSPGetPC() THanks, Matt Bests, Sawsan ________________________________ From: Barry Smith > Sent: Wednesday, December 20, 2023 6:32 PM To: Shatanawi, Sawsan Muhammad > Cc: Mark Adams >; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code [EXTERNAL EMAIL] Instead of call PCCreate(PETSC_COMM_WORLD, pc, ierr) call PCSetType(pc, PCILU,ierr) ! Choose a preconditioner type (ILU) call KSPSetPC(ksp, pc,ierr) ! Associate the preconditioner with the KSP solver do call KSPGetPC(ksp,pc,ierr) call PCSetType(pc, PCILU,ierr) Do not call KSPSetUp(). It will be taken care of automatically during the solve On Dec 20, 2023, at 8:52?PM, Shatanawi, Sawsan Muhammad via petsc-users > wrote: Hello, I don't think that I set preallocation values when I created the matrix, would you please have look at my code. It is just the petsc related part from my code. I was able to fix some of the error messages. Now I have a new set of error messages related to the KSP solver (attached) I appreciate your help Sawsan ________________________________ From: Mark Adams > Sent: Wednesday, December 20, 2023 6:44 AM To: Shatanawi, Sawsan Muhammad > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code [EXTERNAL EMAIL] Did you set preallocation values when you created the matrix? Don't do that. On Wed, Dec 20, 2023 at 9:36?AM Shatanawi, Sawsan Muhammad > wrote: Hello, I am trying to create a sparse matrix( which is as I believe a zero matrix) then adding some nonzero elements to it over a loop, then assembling it Get Outlook for iOS ________________________________ From: Mark Adams > Sent: Wednesday, December 20, 2023 2:48 AM To: Shatanawi, Sawsan Muhammad > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code [EXTERNAL EMAIL] I am guessing that you are creating a matrix, adding to it, finalizing it ("assembly"), and then adding to it again, which is fine, but you are adding new non-zeros to the sparsity pattern. If this is what you want then you can tell the matrix to let you do that. Otherwise you have a bug. Mark On Tue, Dec 19, 2023 at 9:50?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: Hello everyone, I hope this email finds you well. My Name is Sawsan Shatanawi, and I am currently working on developing a Fortran code for simulating groundwater flow in a 3D system. The code involves solving a nonlinear system, and I have created the matrix to be solved using the PCG solver and Picard iteration. However, when I tried to assign it as a PETSc matrix I started getting a lot of error messages. I am kindly asking if someone can help me, I would be happy to share my code with him/her. Please find the attached file contains a list of errors I have gotten Thank you in advance for your time and assistance. Best regards, Sawsan -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From dontbugthedevs at proton.me Fri Jan 12 04:25:55 2024 From: dontbugthedevs at proton.me (Noam T.) 
Date: Fri, 12 Jan 2024 10:25:55 +0000 Subject: [petsc-users] [Gmsh] Access both default sets and region names In-Reply-To: References: <6SJlhZpUt0lCEjV2_bNQqbqUVNP7wlSyUNdERkJseZyGPel-GMgPFhIPnvwcpsvn0TNmRdhj5yV8th6vgR9MbSFxs1Z9CunqR5j0Fg-tE8w=@proton.me> Message-ID: Great. Thank you very much for the quick replies. Noam On Thursday, January 11th, 2024 at 9:34 PM, Matthew Knepley wrote: > On Thu, Jan 11, 2024 at 12:53?PM Noam T. wrote: > >> There could be some overlapping/redundancy between the default and the user-defined groups, so perhaps that was the intended behavior. Glad to hear it's possible to have access to everything. > > Here is the MR: https://gitlab.com/petsc/petsc/-/merge_requests/7178 > > If you build that branch, you can use -dm_plex_gmsh_use_generic to turn on those labels. > > Thanks, > > Matt > >> Thanks, >> Noam >> On Thursday, January 11th, 2024 at 6:31 PM, Matthew Knepley wrote: >> >>> On Thu, Jan 11, 2024 at 11:59?AM Noam T. wrote: >>> >>>> Without using the flag -dm_plex_gmsh_use_regions, the DMGetNumLabels says there are 5, named celltype, depth, cell/face/vertex sets. >>>> With the flag, the labels are celltype, depth, my_vol, my_surface (using the same example as before). >>>> Am I misusing the flag somehow, and I should be able to access those of cell/face/vertex as well? >>> >>> Shoot, yes this changed after another request. Yes, we can put in a flag for that. Should not take long. >>> >>> Thanks, >>> >>> Matt >>> >>>> PS: Using PETSc 3.20.3 >>>> >>>> Thanks, >>>> Noam >>>> On Thursday, January 11th, 2024 at 5:28 PM, Matthew Knepley wrote: >>>> >>>>> On Thu, Jan 11, 2024 at 11:18?AM Noam T. via petsc-users wrote: >>>>> >>>>>> Would it be feasible to have an option (e.g. new flag along the lines of -dm_plex_gmsh_...) that allows the user to access both the default sets (Cell / Face / Vertex) together with user-defined gorups (those under $PhysicalNames, available when using -dm_plex_gmsh_use_regions)? >>>>> >>>>> I am not sure I understand the question. When you turn on regions, it makes extra labels, but the generic labels still exist. >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>>> That is, with a *.msh file containing >>>>>> >>>>>> $PhysicalNames >>>>>> 2 >>>>>> 2 100 "my_surface" >>>>>> 3 200 "my_vol" >>>>>> >>>>>> the return of DMGetLabelName(dm, n, name) would be (order may differ) >>>>>> >>>>>> n = 0, name = "celltype" >>>>>> n = 1, name = "depth" >>>>>> n = 2, name = "Cell Sets" >>>>>> n = 3, name = "my_vol" >>>>>> n = 4, name = "Face Sets" >>>>>> n = 5, name = "my_surface" >>>>>> ... >>>>>> >>>>>> I poked into src/dm/impls/plex/plexgmsh.c and have managed to print all the labels after changing a couple of variable values, so perhaps it is doable. >>>>>> The changes made are not a solution, simply naively set some variables to skip checking for the use_regions flag, so it understandably crashes soon after. >>>>>> >>>>>> Thanks, Noam >>>>> >>>>> -- >>>>> >>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> [https://www.cse.buffalo.edu/~knepley/](http://www.cse.buffalo.edu/~knepley/) >>> >>> -- >>> >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
>>> -- Norbert Wiener >>> >>> [https://www.cse.buffalo.edu/~knepley/](http://www.cse.buffalo.edu/~knepley/) > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > [https://www.cse.buffalo.edu/~knepley/](http://www.cse.buffalo.edu/~knepley/) -------------- next part -------------- An HTML attachment was scrubbed... URL: From swilksen at itp.uni-bremen.de Fri Jan 12 08:43:09 2024 From: swilksen at itp.uni-bremen.de (Steffen Wilksen | Universitaet Bremen) Date: Fri, 12 Jan 2024 15:43:09 +0100 Subject: [petsc-users] Parallel processes run significantly slower In-Reply-To: References: <20240111171810.Horde.KBkZFRx_rd4KbBh9zYpql9G@webmail.uni-bremen.de> Message-ID: <20240112154309.Horde.qQWBnq9gTF7cTnAVnT6a10X@webmail.uni-bremen.de> Thank you for your feedback. @Stefano: the use of my communicator was intentional, since I later intend to distribute M independent calculations to N processes, each process then only needing to do M/N calculations. Of course I don't expect speed up in my example since the number of calculations is constant and not dependent on N, but I would hope that the time each process takes does not increase too drastically with N. @Barry: I tried to do the STREAMS benchmark, these are my results: 1? 23467.9961?? Rate (MB/s) 1 2? 26852.0536?? Rate (MB/s) 1.1442 3? 29715.4762?? Rate (MB/s) 1.26621 4? 34132.2490?? Rate (MB/s) 1.45442 5? 34924.3020?? Rate (MB/s) 1.48817 6? 34315.5290?? Rate (MB/s) 1.46223 7? 33134.9545?? Rate (MB/s) 1.41192 8? 33234.9141?? Rate (MB/s) 1.41618 9? 32584.3349?? Rate (MB/s) 1.38846 10? 32582.3962?? Rate (MB/s) 1.38838 11? 32098.2903?? Rate (MB/s) 1.36775 12? 32064.8779?? Rate (MB/s) 1.36632 13? 31692.0541?? Rate (MB/s) 1.35044 14? 31274.2421?? Rate (MB/s) 1.33263 15? 31574.0196?? Rate (MB/s) 1.34541 16? 30906.7773?? Rate (MB/s) 1.31698 I also attached the resulting plot. As it seems, I get very bad MPI speedup (red curve, right?), even decreasing if I use too many threads. I don't fully understand the reasons given in the discussion you linked since this is all very new to me, but I take that this is a problem with my computer which I can't easily fix, right? ----- Message from Barry Smith --------- ? ?Date: Thu, 11 Jan 2024 11:56:24 -0500 ? ?From: Barry Smith Subject: Re: [petsc-users] Parallel processes run significantly slower ? ? ?To: Steffen Wilksen | Universitaet Bremen ? ? ?Cc: PETSc users list > ? > ? ?Take a look at the discussion > in?https://petsc.gitlab.io/-/petsc/-/jobs/5814862879/artifacts/public/html/manual/streams.html?and I suggest you run the streams benchmark from the branch?barry/2023-09-15/fix-log-pcmpi on your machine to get a baseline for what kind of speedup you can expect. ? > ? > ? ? Then let us know your thoughts. > ? > ? ?Barry > > > > >> On Jan 11, 2024, at 11:37?AM, Stefano Zampini >> wrote: >> >> You are creating the matrix on the wrong communicator >> if you want it parallel. You are using PETSc.COMM_SELF >> >> On Thu, Jan 11, 2024, 19:28 Steffen Wilksen | >> Universitaet Bremen wrote: >> >>> _Hi all, >>> >>> I'm trying to do repeated matrix-vector-multiplication of large >>> sparse matrices in python using petsc4py. Even the most simple >>> method of parallelization, dividing up the calculation to run on >>> multiple processes indenpendtly, does not seem to give a >>> singnificant speed up for large matrices. 
I constructed a minimal >>> working example, which I run using >>> >>> mpiexec -n N python parallel_example.py, >>> >>> where N is the number of processes. Instead of taking >>> approximately the same time irrespective of the number of >>> processes used, the calculation is much slower when starting more >>> MPI processes. This translates to little to no speed up when >>> splitting up a fixed number of calculations over N processes. As >>> an example, running with N=1 takes 9s, while running with N=4 >>> takes 34s. When running with smaller matrices, the problem is not >>> as severe (only slower by a factor of 1.5 when setting >>> MATSIZE=1e+5 instead of MATSIZE=1e+6). I get the same problems >>> when just starting the script four times manually without using MPI. >>> I attached both the script and the log file for running the script >>> with N=4. Any help would be greatly appreciated. Calculations are >>> done on my laptop, arch linux version 6.6.8 and PETSc version >>> 3.20.2. >>> >>> Kind Regards >>> Steffen_ _----- End message from Barry Smith -----_ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: streams.png Type: image/png Size: 52778 bytes Desc: not available URL: From knepley at gmail.com Fri Jan 12 08:59:53 2024 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 12 Jan 2024 09:59:53 -0500 Subject: [petsc-users] [Gmsh] Access both default sets and region names In-Reply-To: References: <6SJlhZpUt0lCEjV2_bNQqbqUVNP7wlSyUNdERkJseZyGPel-GMgPFhIPnvwcpsvn0TNmRdhj5yV8th6vgR9MbSFxs1Z9CunqR5j0Fg-tE8w=@proton.me> Message-ID: On Fri, Jan 12, 2024 at 5:26?AM Noam T. wrote: > Great. > > Thank you very much for the quick replies. > It has now merged to the main branch. Thanks, Matt > Noam > On Thursday, January 11th, 2024 at 9:34 PM, Matthew Knepley < > knepley at gmail.com> wrote: > > On Thu, Jan 11, 2024 at 12:53?PM Noam T. wrote: > >> There could be some overlapping/redundancy between the default and the >> user-defined groups, so perhaps that was the intended behavior. Glad to >> hear it's possible to have access to everything. >> > > Here is the MR: https://gitlab.com/petsc/petsc/-/merge_requests/7178 > > If you build that branch, you can use -dm_plex_gmsh_use_generic to turn on > those labels. > > Thanks, > > Matt > >> Thanks, >> Noam >> On Thursday, January 11th, 2024 at 6:31 PM, Matthew Knepley < >> knepley at gmail.com> wrote: >> >> On Thu, Jan 11, 2024 at 11:59?AM Noam T. >> wrote: >> >>> Without using the flag -dm_plex_gmsh_use_regions, the DMGetNumLabels >>> says there are 5, named celltype, depth, cell/face/vertex sets. >>> With the flag, the labels are celltype, depth, my_vol, my_surface (using >>> the same example as before). >>> Am I misusing the flag somehow, and I should be able to access those of >>> cell/face/vertex as well? >>> >> >> Shoot, yes this changed after another request. Yes, we can put in a flag >> for that. Should not take long. >> >> Thanks, >> >> Matt >> >>> PS: Using PETSc 3.20.3 >>> >>> Thanks, >>> Noam >>> On Thursday, January 11th, 2024 at 5:28 PM, Matthew Knepley < >>> knepley at gmail.com> wrote: >>> >>> On Thu, Jan 11, 2024 at 11:18?AM Noam T. via petsc-users < >>> petsc-users at mcs.anl.gov> wrote: >>> >>>> Would it be feasible to have an option (e.g. new flag along the lines >>>> of -dm_plex_gmsh_...) 
that allows the user to access both the default sets >>>> (Cell / Face / Vertex) together with user-defined gorups (those under >>>> $PhysicalNames, available when using -dm_plex_gmsh_use_regions)? >>>> >>> >>> I am not sure I understand the question. When you turn on regions, it >>> makes extra labels, but the generic labels still exist. >>> >>> Thanks, >>> >>> Matt >>> >>>> That is, with a *.msh file containing >>>> >>>> $PhysicalNames >>>> 2 >>>> 2 100 "my_surface" >>>> 3 200 "my_vol" >>>> >>>> the return of DMGetLabelName(dm, n, name) would be (order may differ) >>>> >>>> n = 0, name = "celltype" >>>> n = 1, name = "depth" >>>> n = 2, name = "Cell Sets" >>>> n = 3, name = "my_vol" >>>> n = 4, name = "Face Sets" >>>> n = 5, name = "my_surface" >>>> ... >>>> >>>> I poked into src/dm/impls/plex/plexgmsh.c and have managed to print all >>>> the labels after changing a couple of variable values, so perhaps it is >>>> doable. >>>> The changes made are not a solution, simply naively set some variables >>>> to skip checking for the use_regions flag, so it understandably crashes >>>> soon after. >>>> >>>> Thanks, Noam >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Fri Jan 12 09:41:39 2024 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Fri, 12 Jan 2024 09:41:39 -0600 Subject: [petsc-users] Parallel processes run significantly slower In-Reply-To: <20240112154309.Horde.qQWBnq9gTF7cTnAVnT6a10X@webmail.uni-bremen.de> References: <20240111171810.Horde.KBkZFRx_rd4KbBh9zYpql9G@webmail.uni-bremen.de> <20240112154309.Horde.qQWBnq9gTF7cTnAVnT6a10X@webmail.uni-bremen.de> Message-ID: Hi, Steffen, Would it be an MPI process binding issue? Could you try running with mpiexec --bind-to core -n N python parallel_example.py --Junchao Zhang On Fri, Jan 12, 2024 at 8:52?AM Steffen Wilksen | Universitaet Bremen < swilksen at itp.uni-bremen.de> wrote: > Thank you for your feedback. > @Stefano: the use of my communicator was intentional, since I later intend > to distribute M independent calculations to N processes, each process then > only needing to do M/N calculations. Of course I don't expect speed up in > my example since the number of calculations is constant and not dependent > on N, but I would hope that the time each process takes does not increase > too drastically with N. 
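A minimal Fortran sketch of the communicator choice Stefano describes, with illustrative names nglobal, nz, A, ierr (written in Fortran only because that is the language of the other code in this digest; in petsc4py the same choice is made through the communicator passed when the matrix is created):

      ! one matrix distributed across all ranks of the run (a truly parallel layout):
      call MatCreateAIJ(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, nglobal, nglobal, &
                        nz, PETSC_NULL_INTEGER, nz, PETSC_NULL_INTEGER, A, ierr)

      ! an independent serial copy on every rank (what COMM_SELF gives; fine for
      ! embarrassingly parallel work, but all ranks then compete for the same
      ! memory bandwidth, which is the effect discussed in this thread):
      call MatCreateAIJ(PETSC_COMM_SELF, nglobal, nglobal, nglobal, nglobal, &
                        nz, PETSC_NULL_INTEGER, 0, PETSC_NULL_INTEGER, A, ierr)
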
> @Barry: I tried to do the STREAMS benchmark, these are my results: > 1 23467.9961 Rate (MB/s) 1 > 2 26852.0536 Rate (MB/s) 1.1442 > 3 29715.4762 Rate (MB/s) 1.26621 > 4 34132.2490 Rate (MB/s) 1.45442 > 5 34924.3020 Rate (MB/s) 1.48817 > 6 34315.5290 Rate (MB/s) 1.46223 > 7 33134.9545 Rate (MB/s) 1.41192 > 8 33234.9141 Rate (MB/s) 1.41618 > 9 32584.3349 Rate (MB/s) 1.38846 > 10 32582.3962 Rate (MB/s) 1.38838 > 11 32098.2903 Rate (MB/s) 1.36775 > 12 32064.8779 Rate (MB/s) 1.36632 > 13 31692.0541 Rate (MB/s) 1.35044 > 14 31274.2421 Rate (MB/s) 1.33263 > 15 31574.0196 Rate (MB/s) 1.34541 > 16 30906.7773 Rate (MB/s) 1.31698 > > I also attached the resulting plot. As it seems, I get very bad MPI > speedup (red curve, right?), even decreasing if I use too many threads. I > don't fully understand the reasons given in the discussion you linked since > this is all very new to me, but I take that this is a problem with my > computer which I can't easily fix, right? > > > ----- Message from Barry Smith --------- > Date: Thu, 11 Jan 2024 11:56:24 -0500 > From: Barry Smith > Subject: Re: [petsc-users] Parallel processes run significantly slower > To: Steffen Wilksen | Universitaet Bremen > > Cc: PETSc users list > > > Take a look at the discussion in > https://petsc.gitlab.io/-/petsc/-/jobs/5814862879/artifacts/public/html/manual/streams.html and > I suggest you run the streams benchmark from the branch barry/2023-09-15/fix-log-pcmpi > on your machine to get a baseline for what kind of speedup you can expect. > > Then let us know your thoughts. > > Barry > > > > On Jan 11, 2024, at 11:37?AM, Stefano Zampini > wrote: > > You are creating the matrix on the wrong communicator if you want it > parallel. You are using PETSc.COMM_SELF > > On Thu, Jan 11, 2024, 19:28 Steffen Wilksen | Universitaet Bremen < > swilksen at itp.uni-bremen.de> wrote: > >> >> >> >> >> >> >> >> >> >> >> *Hi all, I'm trying to do repeated matrix-vector-multiplication of large >> sparse matrices in python using petsc4py. Even the most simple method of >> parallelization, dividing up the calculation to run on multiple processes >> indenpendtly, does not seem to give a singnificant speed up for large >> matrices. I constructed a minimal working example, which I run using >> mpiexec -n N python parallel_example.py, where N is the number of >> processes. Instead of taking approximately the same time irrespective of >> the number of processes used, the calculation is much slower when starting >> more MPI processes. This translates to little to no speed up when splitting >> up a fixed number of calculations over N processes. As an example, running >> with N=1 takes 9s, while running with N=4 takes 34s. When running with >> smaller matrices, the problem is not as severe (only slower by a factor of >> 1.5 when setting MATSIZE=1e+5 instead of MATSIZE=1e+6). I get the same >> problems when just starting the script four times manually without using >> MPI. I attached both the script and the log file for running the script >> with N=4. Any help would be greatly appreciated. Calculations are done on >> my laptop, arch linux version 6.6.8 and PETSc version 3.20.2. Kind Regards >> Steffen* >> > > > > *----- End message from Barry Smith > > -----* > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at petsc.dev Fri Jan 12 09:43:42 2024 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 12 Jan 2024 10:43:42 -0500 Subject: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code In-Reply-To: References: <17B6144E-4F3A-4173-9818-4B03669736C4@petsc.dev> <9BD74F88-20FD-4021-B02A-195A58B72282@petsc.dev> Message-ID: PETSc vectors contain inside themselves an array with the numerical values. VecGetArrayF90() exposes this array to Fortran so you may access the values in that array. So VecGetArrayF90() does not create a new array, it gives you temporary access to an already existing array inside the vector. Barry > On Jan 11, 2024, at 11:49?PM, Shatanawi, Sawsan Muhammad wrote: > > Hello, > > Thank you all for your help. > > I have changed VecGetArray to VecGetArrayF90, and the location of destory call. but I want to make sure that VecGet ArrayF90 is to make a new array( vector) that I can use in the rest of my Fortran code? > > when I run it and debugged it, I got > > 5.2000000E-03 > 50.00000 > 10.00000 > 0.0000000E+00 > PETSC: Attaching gdb to /weka/data/lab/richey/sawsan/GW_CODE/code2024/SS_GWM/./GW.exe of pid 33065 on display :0.0 on machine sn16 > Unable to start debugger in xterm: No such file or directory > 0.0000000E+00 > Attempting to use an MPI routine after finalizing MPICH > srun: error: sn16: task 0: Exited with exit code 1 > [sawsan.shatanawi at login-p2n02 SS_GWM]$ gdb ./GW/exe > GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-100.el7 > Copyright (C) 2013 Free Software Foundation, Inc. > License GPLv3+: GNU GPL version 3 or later > This is free software: you are free to change and redistribute it. > There is NO WARRANTY, to the extent permitted by law. Type "show copying" > and "show warranty" for details. > This GDB was configured as "x86_64-redhat-linux-gnu". > For bug reporting instructions, please see: > ... > ./GW/exe: No such file or directory. > (gdb) run > Starting program: > No executable file specified. > Use the "file" or "exec-file" command. > (gdb) bt > No stack. > (gdb) > > If the highlighted line is the error, I don't know why when I write gdb , it does not show me the location of error > The code : sshatanawi/SS_GWM (github.com) > > I really appreciate your helps > > Sawsan > From: Barry Smith > > Sent: Wednesday, January 10, 2024 5:35 PM > To: Junchao Zhang > > Cc: Shatanawi, Sawsan Muhammad >; Mark Adams >; petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code > > [EXTERNAL EMAIL] > > >> On Jan 10, 2024, at 6:49?PM, Junchao Zhang > wrote: >> >> Hi, Sawsan, >> I could build your code and I also could gdb it. >> >> $ gdb ./GW.exe >> ... >> $ Thread 1 "GW.exe" received signal SIGSEGV, Segmentation fault. >> 0x00007ffff1e6d44f in vecgetarray_ (x=0x7fffffffa718, fa=0x0, ia=0x7fffffffa75c, ierr=0x0) at /scratch/jczhang/petsc/src/vec/vec/interface/ftn-custom/zvectorf.c:257 >> 257 *ierr = VecGetArray(*x, &lx); >> (gdb) bt >> #0 0x00007ffff1e6d44f in vecgetarray_ (x=0x7fffffffa718, fa=0x0, ia=0x7fffffffa75c, ierr=0x0) at /scratch/jczhang/petsc/src/vec/vec/interface/ftn-custom/zvectorf.c:257 >> #1 0x000000000040b6e3 in gw_solver (t_s=1.40129846e-45, n=300) at GW_solver_try.F90:169 >> #2 0x000000000040c6a8 in test_gw () at test_main.F90:35 >> >> ierr=0x0 caused the segfault. See https://petsc.org/release/manualpages/Vec/VecGetArray/#vecgetarray , you should use VecGetArrayF90 instead. 
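A minimal Fortran sketch of the pattern recommended here, using the temp_solution and H_vector names from the thread (it assumes the PETSc Fortran modules/headers are already included, as elsewhere in the code):

      PetscScalar, pointer :: H_vector(:)
      PetscErrorCode       :: ierr

      call VecGetArrayF90(temp_solution, H_vector, ierr)
      ! H_vector now points directly at the local entries of temp_solution;
      ! read them or overwrite them in place here (no copy is made)
      call VecRestoreArrayF90(temp_solution, H_vector, ierr)
      ! after the restore, H_vector must no longer be used

If a private copy is needed for the rest of the code, allocate a separate Fortran array and copy the values out between the get and the restore calls.
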
>> >> BTW, Barry, the code https://github.com/sshatanawi/SS_GWM/blob/main/GW_solver_try.F90#L169 has "call VecGetArray(temp_solution, H_vector, ierr)". I don't find petsc Fortran examples doing VecGetArray. Do we still support it? > > This is not the correct calling sequence for VecGetArray() from Fortran. > > Regardless, definitely should not be writing any new code that uses VecGetArray() from Fortran. Should use VecGetArrayF90(). > >> >> --Junchao Zhang >> >> >> On Wed, Jan 10, 2024 at 2:38?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: >> Hello all, >> >> I hope you are doing well. >> >> Generally, I use gdb to debug the code. >> I got the attached error message. >> >> I have tried to add the flag -start_in_debugger in the make file, but it didn't work, so it seems I was doing it in the wrong way >> >> This is the link for the whole code: sshatanawi/SS_GWM (github.com) >> >> GitHub - sshatanawi/SS_GWM >> Contribute to sshatanawi/SS_GWM development by creating an account on GitHub. >> github.com >> ? >> >> You can read the description of the code in " Model Desprciption.pdf" >> the compiling file is makefile_f90 where you can find the linked code files >> >> I really appreciate your help >> >> Bests, >> Sawsan >> From: Mark Adams > >> Sent: Friday, January 5, 2024 4:53 AM >> To: Shatanawi, Sawsan Muhammad > >> Cc: Matthew Knepley >; petsc-users at mcs.anl.gov > >> Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code >> >> [EXTERNAL EMAIL] >> This is a segv. As Matt said, you need to use a debugger for this or add print statements to narrow down the place where this happens. >> >> You will need to learn how to use debuggers to do your project so you might as well start now. >> >> If you have a machine with a GUI debugger that is easier but command line debuggers are good to learn anyway. >> >> I tend to run debuggers directly (eg, lldb ./a.out -- program-args ...) and use a GUI debugger (eg, Totalview or DDT) if available. >> >> Mark >> >> >> On Wed, Dec 20, 2023 at 10:02?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: >> Hello Matthew, >> >> Thank you for your help. I am sorry that I keep coming back with my error messages, but I reached a point that I don't know how to fix them, and I don't understand them easily. >> The list of errors is getting shorter, now I am getting the attached error messages >> >> Thank you again, >> >> Sawsan >> From: Matthew Knepley > >> Sent: Wednesday, December 20, 2023 6:54 PM >> To: Shatanawi, Sawsan Muhammad > >> Cc: Barry Smith >; petsc-users at mcs.anl.gov > >> Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code >> >> [EXTERNAL EMAIL] >> On Wed, Dec 20, 2023 at 9:49?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: >> Hello Barry, >> >> Thank you a lot for your help, Now I am getting the attached error message. >> >> Do not destroy the PC from KSPGetPC() >> >> THanks, >> >> Matt >> >> Bests, >> Sawsan >> From: Barry Smith > >> Sent: Wednesday, December 20, 2023 6:32 PM >> To: Shatanawi, Sawsan Muhammad > >> Cc: Mark Adams >; petsc-users at mcs.anl.gov > >> Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code >> >> [EXTERNAL EMAIL] >> >> Instead of >> >> call PCCreate(PETSC_COMM_WORLD, pc, ierr) >> call PCSetType(pc, PCILU,ierr) ! Choose a preconditioner type (ILU) >> call KSPSetPC(ksp, pc,ierr) ! 
Associate the preconditioner with the KSP solver >> >> do >> >> call KSPGetPC(ksp,pc,ierr) >> call PCSetType(pc, PCILU,ierr) >> >> Do not call KSPSetUp(). It will be taken care of automatically during the solve >> >> >> >>> On Dec 20, 2023, at 8:52?PM, Shatanawi, Sawsan Muhammad via petsc-users > wrote: >>> >>> Hello, >>> I don't think that I set preallocation values when I created the matrix, would you please have look at my code. It is just the petsc related part from my code. >>> I was able to fix some of the error messages. Now I have a new set of error messages related to the KSP solver (attached) >>> >>> I appreciate your help >>> >>> Sawsan >>> >>> From: Mark Adams > >>> Sent: Wednesday, December 20, 2023 6:44 AM >>> To: Shatanawi, Sawsan Muhammad > >>> Cc: petsc-users at mcs.anl.gov > >>> Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code >>> >>> [EXTERNAL EMAIL] >>> Did you set preallocation values when you created the matrix? >>> Don't do that. >>> >>> On Wed, Dec 20, 2023 at 9:36?AM Shatanawi, Sawsan Muhammad > wrote: >>> Hello, >>> >>> I am trying to create a sparse matrix( which is as I believe a zero matrix) then adding some nonzero elements to it over a loop, then assembling it >>> >>> Get Outlook for iOS >>> >>> From: Mark Adams > >>> Sent: Wednesday, December 20, 2023 2:48 AM >>> To: Shatanawi, Sawsan Muhammad > >>> Cc: petsc-users at mcs.anl.gov > >>> Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code >>> >>> [EXTERNAL EMAIL] >>> I am guessing that you are creating a matrix, adding to it, finalizing it ("assembly"), and then adding to it again, which is fine, but you are adding new non-zeros to the sparsity pattern. >>> If this is what you want then you can tell the matrix to let you do that. >>> Otherwise you have a bug. >>> >>> Mark >>> >>> On Tue, Dec 19, 2023 at 9:50?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: >>> Hello everyone, >>> >>> I hope this email finds you well. >>> >>> My Name is Sawsan Shatanawi, and I am currently working on developing a Fortran code for simulating groundwater flow in a 3D system. The code involves solving a nonlinear system, and I have created the matrix to be solved using the PCG solver and Picard iteration. However, when I tried to assign it as a PETSc matrix I started getting a lot of error messages. >>> >>> I am kindly asking if someone can help me, I would be happy to share my code with him/her. >>> >>> Please find the attached file contains a list of errors I have gotten >>> >>> Thank you in advance for your time and assistance. >>> Best regards, >>> >>> Sawsan >>> >>> >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
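Collecting the advice quoted above into one place, a minimal Fortran sketch of the intended sequence (ksp and pc as in the thread; A, b, x are illustrative names; error checking omitted) might be:

      call KSPCreate(PETSC_COMM_WORLD, ksp, ierr)
      call KSPSetOperators(ksp, A, A, ierr)
      call KSPGetPC(ksp, pc, ierr)       ! pc is owned by the KSP: no PCCreate(), no PCDestroy()
      call PCSetType(pc, PCILU, ierr)
      call KSPSetFromOptions(ksp, ierr)
      call KSPSolve(ksp, b, x, ierr)     ! no explicit KSPSetUp(); the solve sets up what it needs
      ! ... later, destroy only the objects you created yourself:
      call KSPDestroy(ksp, ierr)
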
URL: From junchao.zhang at gmail.com Fri Jan 12 10:46:41 2024 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Fri, 12 Jan 2024 10:46:41 -0600 Subject: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code In-Reply-To: References: <17B6144E-4F3A-4173-9818-4B03669736C4@petsc.dev> <9BD74F88-20FD-4021-B02A-195A58B72282@petsc.dev> Message-ID: Hi, Sawsan, First in test_main.F90, you need to call VecGetArrayF90(temp_solution, H_vector, ierr) and VecRestoreArrayF90 (temp_solution, H_vector, ierr) as Barry mentioned. Secondly, in the loop of test_main.F90, it calls GW_solver(). Within it, it calls PetscInitialize()/PetscFinalize(). But without MPI being initialized, PetscInitialize()/PetscFinalize()* can only be called once. * do timestep =2 , NTSP call GW_boundary_conditions(timestep-1) !print *,HNEW(1,1,1) call GW_elevation() ! print *, GWTOP(2,2,2) call GW_conductance() ! print *, CC(2,2,2) call GW_recharge() ! print *, B_Rech(5,4) call GW_pumping(timestep-1) ! print *, B_pump(2,2,2) call GW_SW(timestep-1) print *,B_RIVER (2,2,2) call GW_solver(timestep-1,N) call GW_deallocate_loop() end do A solution is to delete PetscInitialize()/PetscFinalize() in GW_solver_try.F90 and move it to test_main.F90, outside the do loop. diff --git a/test_main.F90 b/test_main.F90 index b5997c55..107bd3ee 100644 --- a/test_main.F90 +++ b/test_main.F90 @@ -1,5 +1,6 @@ program test_GW +#include use petsc use GW_constants use GW_param_by_user @@ -8,6 +9,9 @@ program test_GW implicit none integer :: N integer :: timestep + PetscErrorCode ierr + + call PetscInitialize(ierr) call GW_domain(N) !print *, "N=",N !print *, DELTAT @@ -37,4 +41,5 @@ program test_GW end do print *, HNEW(NCOL,3,2) call GW_deallocate () + call PetscFinalize(ierr) end program test_GW With that, the MPI error will be fixed. The code could run to gw_deallocate () before abort. There are other memory errors. You can install/use valgrind to fix them. Run it with valgrind ./GW.exe and look through the output Thanks. --Junchao Zhang On Thu, Jan 11, 2024 at 10:49?PM Shatanawi, Sawsan Muhammad < sawsan.shatanawi at wsu.edu> wrote: > Hello, > > Thank you all for your help. > > I have changed VecGetArray to VecGetArrayF90, and the location of destory > call. but I want to make sure that VecGet ArrayF90 is to make a new array( > vector) that I can use in the rest of my Fortran code? > > when I run it and debugged it, I got > > 5.2000000E-03 > 50.00000 > 10.00000 > 0.0000000E+00 > PETSC: Attaching gdb to > /weka/data/lab/richey/sawsan/GW_CODE/code2024/SS_GWM/./GW.exe of pid 33065 > on display :0.0 on machine sn16 > Unable to start debugger in xterm: No such file or directory > 0.0000000E+00 > Attempting to use an MPI routine after finalizing MPICH > srun: error: sn16: task 0: Exited with exit code 1 > [sawsan.shatanawi at login-p2n02 SS_GWM]$ gdb ./GW/exe > GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-100.el7 > Copyright (C) 2013 Free Software Foundation, Inc. > License GPLv3+: GNU GPL version 3 or later < > http://gnu.org/licenses/gpl.html> > This is free software: you are free to change and redistribute it. > There is NO WARRANTY, to the extent permitted by law. Type "show copying" > and "show warranty" for details. > This GDB was configured as "x86_64-redhat-linux-gnu". > For bug reporting instructions, please see: > ... > ./GW/exe: No such file or directory. > (gdb) run > Starting program: > No executable file specified. > Use the "file" or "exec-file" command. > (gdb) bt > No stack. 
> (gdb) > > If the highlighted line is the error, I don't know why when I write gdb , > it does not show me the location of error > The code : sshatanawi/SS_GWM (github.com) > > > I really appreciate your helps > > Sawsan > ------------------------------ > *From:* Barry Smith > *Sent:* Wednesday, January 10, 2024 5:35 PM > *To:* Junchao Zhang > *Cc:* Shatanawi, Sawsan Muhammad ; Mark Adams < > mfadams at lbl.gov>; petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] Help with Integrating PETSc into Fortran > Groundwater Flow Simulation Code > > > *[EXTERNAL EMAIL]* > > > On Jan 10, 2024, at 6:49?PM, Junchao Zhang > wrote: > > Hi, Sawsan, > I could build your code and I also could gdb it. > > $ gdb ./GW.exe > ... > $ Thread 1 "GW.exe" received signal SIGSEGV, Segmentation fault. > 0x00007ffff1e6d44f in vecgetarray_ (x=0x7fffffffa718, fa=0x0, > ia=0x7fffffffa75c, ierr=0x0) at > /scratch/jczhang/petsc/src/vec/vec/interface/ftn-custom/zvectorf.c:257 > 257 *ierr = VecGetArray(*x, &lx); > (gdb) bt > #0 0x00007ffff1e6d44f in vecgetarray_ (x=0x7fffffffa718, fa=0x0, > ia=0x7fffffffa75c, ierr=0x0) at > /scratch/jczhang/petsc/src/vec/vec/interface/ftn-custom/zvectorf.c:257 > #1 0x000000000040b6e3 in gw_solver (t_s=1.40129846e-45, n=300) at > GW_solver_try.F90:169 > #2 0x000000000040c6a8 in test_gw () at test_main.F90:35 > > ierr=0x0 caused the segfault. See > https://petsc.org/release/manualpages/Vec/VecGetArray/#vecgetarray > , > you should use VecGetArrayF90 instead. > > BTW, Barry, the code > https://github.com/sshatanawi/SS_GWM/blob/main/GW_solver_try.F90#L169 > > has "call VecGetArray(temp_solution, H_vector, ierr)". I don't find > petsc Fortran examples doing VecGetArray. Do we still support it? > > > This is not the correct calling sequence for VecGetArray() from > Fortran. > > Regardless, definitely should not be writing any new code that uses > VecGetArray() from Fortran. Should use VecGetArrayF90(). > > > --Junchao Zhang > > > On Wed, Jan 10, 2024 at 2:38?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: > > Hello all, > > I hope you are doing well. > > Generally, I use gdb to debug the code. > I got the attached error message. > > I have tried to add the flag -start_in_debugger in the make file, but it > didn't work, so it seems I was doing it in the wrong way > > This is the link for the whole code: sshatanawi/SS_GWM (github.com) > > > > GitHub - sshatanawi/SS_GWM > > Contribute to sshatanawi/SS_GWM development by creating an account on > GitHub. > github.com > > *?* > > You can read the description of the code in " Model Desprciption.pdf" > the compiling file is makefile_f90 where you can find the linked code > files > > I really appreciate your help > > Bests, > Sawsan > ------------------------------ > *From:* Mark Adams > *Sent:* Friday, January 5, 2024 4:53 AM > *To:* Shatanawi, Sawsan Muhammad > *Cc:* Matthew Knepley ; petsc-users at mcs.anl.gov < > petsc-users at mcs.anl.gov> > *Subject:* Re: [petsc-users] Help with Integrating PETSc into Fortran > Groundwater Flow Simulation Code > > > *[EXTERNAL EMAIL]* > This is a segv. As Matt said, you need to use a debugger for this or add > print statements to narrow down the place where this happens. > > You will need to learn how to use debuggers to do your project so you > might as well start now. > > If you have a machine with a GUI debugger that is easier but command line > debuggers are good to learn anyway. > > I tend to run debuggers directly (eg, lldb ./a.out -- program-args ...) 
> and use a GUI debugger (eg, Totalview or DDT) if available. > > Mark > > > On Wed, Dec 20, 2023 at 10:02?PM Shatanawi, Sawsan Muhammad via > petsc-users wrote: > > Hello Matthew, > > Thank you for your help. I am sorry that I keep coming back with my error > messages, but I reached a point that I don't know how to fix them, and I > don't understand them easily. > The list of errors is getting shorter, now I am getting the attached error > messages > > Thank you again, > > Sawsan > ------------------------------ > *From:* Matthew Knepley > *Sent:* Wednesday, December 20, 2023 6:54 PM > *To:* Shatanawi, Sawsan Muhammad > *Cc:* Barry Smith ; petsc-users at mcs.anl.gov < > petsc-users at mcs.anl.gov> > *Subject:* Re: [petsc-users] Help with Integrating PETSc into Fortran > Groundwater Flow Simulation Code > > > *[EXTERNAL EMAIL]* > On Wed, Dec 20, 2023 at 9:49?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: > > Hello Barry, > > Thank you a lot for your help, Now I am getting the attached error message. > > > Do not destroy the PC from KSPGetPC() > > THanks, > > Matt > > > Bests, > Sawsan > ------------------------------ > *From:* Barry Smith > *Sent:* Wednesday, December 20, 2023 6:32 PM > *To:* Shatanawi, Sawsan Muhammad > *Cc:* Mark Adams ; petsc-users at mcs.anl.gov < > petsc-users at mcs.anl.gov> > *Subject:* Re: [petsc-users] Help with Integrating PETSc into Fortran > Groundwater Flow Simulation Code > > > *[EXTERNAL EMAIL]* > > Instead of > > call PCCreate(PETSC_COMM_WORLD, pc, ierr) > call PCSetType(pc, PCILU,ierr) ! Choose a preconditioner type (ILU) > call KSPSetPC(ksp, pc,ierr) ! Associate the preconditioner with the > KSP solver > > do > > call KSPGetPC(ksp,pc,ierr) > call PCSetType(pc, PCILU,ierr) > > Do not call KSPSetUp(). It will be taken care of automatically during the > solve > > > > On Dec 20, 2023, at 8:52?PM, Shatanawi, Sawsan Muhammad via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Hello, > I don't think that I set preallocation values when I created the matrix, > would you please have look at my code. It is just the petsc related part > from my code. > I was able to fix some of the error messages. Now I have a new set of > error messages related to the KSP solver (attached) > > I appreciate your help > > Sawsan > ------------------------------ > *From:* Mark Adams > *Sent:* Wednesday, December 20, 2023 6:44 AM > *To:* Shatanawi, Sawsan Muhammad > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] Help with Integrating PETSc into Fortran > Groundwater Flow Simulation Code > > *[EXTERNAL EMAIL]* > Did you set preallocation values when you created the matrix? > Don't do that. > > On Wed, Dec 20, 2023 at 9:36?AM Shatanawi, Sawsan Muhammad < > sawsan.shatanawi at wsu.edu> wrote: > > Hello, > > I am trying to create a sparse matrix( which is as I believe a zero > matrix) then adding some nonzero elements to it over a loop, then > assembling it > > Get Outlook for iOS > > ------------------------------ > *From:* Mark Adams > *Sent:* Wednesday, December 20, 2023 2:48 AM > *To:* Shatanawi, Sawsan Muhammad > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] Help with Integrating PETSc into Fortran > Groundwater Flow Simulation Code > > *[EXTERNAL EMAIL]* > I am guessing that you are creating a matrix, adding to it, finalizing it > ("assembly"), and then adding to it again, which is fine, but you are > adding new non-zeros to the sparsity pattern. > If this is what you want then you can tell the matrix to let you do that. 
> Otherwise you have a bug. > > Mark > > On Tue, Dec 19, 2023 at 9:50?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: > > Hello everyone, > > I hope this email finds you well. > > My Name is Sawsan Shatanawi, and I am currently working on developing a > Fortran code for simulating groundwater flow in a 3D system. The code > involves solving a nonlinear system, and I have created the matrix to be > solved using the PCG solver and Picard iteration. However, when I tried > to assign it as a PETSc matrix I started getting a lot of error messages. > > I am kindly asking if someone can help me, I would be happy to share my > code with him/her. > > Please find the attached file contains a list of errors I have gotten > > Thank you in advance for your time and assistance. > > Best regards, > > Sawsan > > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From swilksen at itp.uni-bremen.de Fri Jan 12 11:13:35 2024 From: swilksen at itp.uni-bremen.de (Steffen Wilksen | Universitaet Bremen) Date: Fri, 12 Jan 2024 18:13:35 +0100 Subject: [petsc-users] Parallel processes run significantly slower In-Reply-To: References: <20240111171810.Horde.KBkZFRx_rd4KbBh9zYpql9G@webmail.uni-bremen.de> <20240112154309.Horde.qQWBnq9gTF7cTnAVnT6a10X@webmail.uni-bremen.de> Message-ID: <20240112181335.Horde.h-9yvlCLFpW-PRp3BQjGBzy@webmail.uni-bremen.de> Hi Junchao, I tried it out, but unfortunately, this does not seem to give any imporvements, the code is still much slower when starting more processes. ----- Message from Junchao Zhang --------- ? ?Date: Fri, 12 Jan 2024 09:41:39 -0600 ? ?From: Junchao Zhang Subject: Re: [petsc-users] Parallel processes run significantly slower ? ? ?To: Steffen Wilksen | Universitaet Bremen ? ? ?Cc: Barry Smith , PETSc users list > Hi,? Steffen, ? Would it be an MPI process binding > issue???Could you try running with? > >> mpiexec --bind-to core -n N python parallel_example.py > > > --Junchao Zhang > > On Fri, Jan 12, 2024 at 8:52?AM Steffen Wilksen | Universitaet > Bremen wrote: > >> _Thank you for your feedback. >> @Stefano: the use of my communicator was intentional, since I later >> intend to distribute M independent calculations to N processes, >> each process then only needing to do M/N calculations. Of course I >> don't expect speed up in my example since the number of >> calculations is constant and not dependent on N, but I would hope >> that the time each process takes does not increase too drastically >> with N. >> @Barry: I tried to do the STREAMS benchmark, these are my results: >> 1? 23467.9961?? Rate (MB/s) 1 >> 2? 26852.0536?? Rate (MB/s) 1.1442 >> 3? 29715.4762?? Rate (MB/s) 1.26621 >> 4? 34132.2490?? Rate (MB/s) 1.45442 >> 5? 34924.3020?? Rate (MB/s) 1.48817 >> 6? 34315.5290?? Rate (MB/s) 1.46223 >> 7? 33134.9545?? Rate (MB/s) 1.41192 >> 8? 33234.9141?? Rate (MB/s) 1.41618 >> 9? 32584.3349?? Rate (MB/s) 1.38846 >> 10? 32582.3962?? Rate (MB/s) 1.38838 >> 11? 32098.2903?? Rate (MB/s) 1.36775 >> 12? 32064.8779?? Rate (MB/s) 1.36632 >> 13? 31692.0541?? Rate (MB/s) 1.35044 >> 14? 31274.2421?? Rate (MB/s) 1.33263 >> 15? 31574.0196?? Rate (MB/s) 1.34541 >> 16? 30906.7773?? Rate (MB/s) 1.31698 >> >> I also attached the resulting plot. 
As it seems, I get very bad MPI >> speedup (red curve, right?), even decreasing if I use too many >> threads. I don't fully understand the reasons given in the >> discussion you linked since this is all very new to me, but I take >> that this is a problem with my computer which I can't easily fix, >> right? >> >> ----- Message from Barry Smith --------- >> ? ?Date: Thu, 11 Jan 2024 11:56:24 -0500 >> ? ?From: Barry Smith >> Subject: Re: [petsc-users] Parallel processes run significantly slower >> ? ? ?To: Steffen Wilksen | Universitaet Bremen >> ? ? ?Cc: PETSc users list _ >> >>> _?_ >>> _? ?Take a look at the discussion >>> in?https://petsc.gitlab.io/-/petsc/-/jobs/5814862879/artifacts/public/html/manual/streams.html?and I suggest you run the streams benchmark from the branch?barry/2023-09-15/fix-log-pcmpi on your machine to get a baseline for what kind of speedup you can expect. ? _ >>> _?_ >>> _? ? Then let us know your thoughts._ >>> _?_ >>> _? ?Barry_ >>> >>> >>> >>> >>>> _On Jan 11, 2024, at 11:37?AM, Stefano Zampini >>>> wrote:_ >>>> >>>> _You are creating the matrix on the wrong >>>> communicator if you want it parallel. You are using >>>> PETSc.COMM_SELF_ >>>> >>>> _On Thu, Jan 11, 2024, 19:28 Steffen >>>> Wilksen | Universitaet Bremen wrote:_ >>>> >>>>> __Hi all, >>>>> >>>>> I'm trying to do repeated matrix-vector-multiplication of large >>>>> sparse matrices in python using petsc4py. Even the most simple >>>>> method of parallelization, dividing up the calculation to run on >>>>> multiple processes indenpendtly, does not seem to give a >>>>> singnificant speed up for large matrices. I constructed a >>>>> minimal working example, which I run using >>>>> >>>>> mpiexec -n N python parallel_example.py, >>>>> >>>>> where N is the number of processes. Instead of taking >>>>> approximately the same time irrespective of the number of >>>>> processes used, the calculation is much slower when starting >>>>> more MPI processes. This translates to little to no speed up >>>>> when splitting up a fixed number of calculations over N >>>>> processes. As an example, running with N=1 takes 9s, while >>>>> running with N=4 takes 34s. When running with smaller matrices, >>>>> the problem is not as severe (only slower by a factor of 1.5 >>>>> when setting MATSIZE=1e+5 instead of MATSIZE=1e+6). I get the >>>>> same problems when just starting the script four times manually >>>>> without using MPI. >>>>> I attached both the script and the log file for running the >>>>> script with N=4. Any help would be greatly appreciated. >>>>> Calculations are done on my laptop, arch linux version 6.6.8 and >>>>> PETSc version 3.20.2. >>>>> >>>>> Kind Regards >>>>> Steffen__ >> >> __----- End message from Barry Smith -----__ >> >> ? _----- End message from Junchao Zhang -----_ -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Fri Jan 12 13:35:56 2024 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Fri, 12 Jan 2024 13:35:56 -0600 Subject: [petsc-users] Parallel processes run significantly slower In-Reply-To: <20240112181335.Horde.h-9yvlCLFpW-PRp3BQjGBzy@webmail.uni-bremen.de> References: <20240111171810.Horde.KBkZFRx_rd4KbBh9zYpql9G@webmail.uni-bremen.de> <20240112154309.Horde.qQWBnq9gTF7cTnAVnT6a10X@webmail.uni-bremen.de> <20240112181335.Horde.h-9yvlCLFpW-PRp3BQjGBzy@webmail.uni-bremen.de> Message-ID: Hi, Steffen, It is probably because your laptop CPU is "weak". 
I have a local machine with one Intel Core i7 processor, which has 8 cores (16 hardware threads). I got a similar STREAM speedup. It just means 1~2 MPI ranks can use up all the memory bandwidth. That is why with your (weak scaling) test, more MPI ranks just gave longer time. Because the MPI processes had to share the memory bandwidth. On another machine with two AMD EPYC 7452 32-Core processors, there are 8 NUMA domains. I got $ mpirun -n 1 --bind-to numa --map-by numa ./MPIVersion 1 22594.4873 Rate (MB/s) $ mpirun -n 8 --bind-to numa --map-by numa ./MPIVersion 8 173565.3584 Rate (MB/s) 7.68175 On this kind of machine, you can expect constant time of your test up to 8 MPI ranks. --Junchao Zhang On Fri, Jan 12, 2024 at 11:13?AM Steffen Wilksen | Universitaet Bremen < swilksen at itp.uni-bremen.de> wrote: > Hi Junchao, > > I tried it out, but unfortunately, this does not seem to give any > imporvements, the code is still much slower when starting more processes. > > > ----- Message from Junchao Zhang --------- > Date: Fri, 12 Jan 2024 09:41:39 -0600 > From: Junchao Zhang > Subject: Re: [petsc-users] Parallel processes run significantly slower > To: Steffen Wilksen | Universitaet Bremen > > Cc: Barry Smith , PETSc users list < > petsc-users at mcs.anl.gov> > > Hi, Steffen, > Would it be an MPI process binding issue? Could you try running with > > mpiexec --bind-to core -n N python parallel_example.py > > > --Junchao Zhang > > On Fri, Jan 12, 2024 at 8:52?AM Steffen Wilksen | Universitaet Bremen < > swilksen at itp.uni-bremen.de> wrote: > >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> *Thank you for your feedback. @Stefano: the use of my communicator was >> intentional, since I later intend to distribute M independent calculations >> to N processes, each process then only needing to do M/N calculations. Of >> course I don't expect speed up in my example since the number of >> calculations is constant and not dependent on N, but I would hope that the >> time each process takes does not increase too drastically with N. @Barry: I >> tried to do the STREAMS benchmark, these are my results: 1 23467.9961 >> Rate (MB/s) 1 2 26852.0536 Rate (MB/s) 1.1442 3 29715.4762 Rate >> (MB/s) 1.26621 4 34132.2490 Rate (MB/s) 1.45442 5 34924.3020 Rate >> (MB/s) 1.48817 6 34315.5290 Rate (MB/s) 1.46223 7 33134.9545 Rate >> (MB/s) 1.41192 8 33234.9141 Rate (MB/s) 1.41618 9 32584.3349 Rate >> (MB/s) 1.38846 10 32582.3962 Rate (MB/s) 1.38838 11 32098.2903 Rate >> (MB/s) 1.36775 12 32064.8779 Rate (MB/s) 1.36632 13 31692.0541 Rate >> (MB/s) 1.35044 14 31274.2421 Rate (MB/s) 1.33263 15 31574.0196 Rate >> (MB/s) 1.34541 16 30906.7773 Rate (MB/s) 1.31698 I also attached the >> resulting plot. As it seems, I get very bad MPI speedup (red curve, >> right?), even decreasing if I use too many threads. I don't fully >> understand the reasons given in the discussion you linked since this is all >> very new to me, but I take that this is a problem with my computer which I >> can't easily fix, right? 
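[A rough consistency check using only numbers quoted in this thread, ignoring cache effects and binding details: the STREAMS results above give about 23500 MB/s for one process but only about 34100 MB/s in total for four, a factor of roughly 1.45. If the matrix-vector products are memory-bandwidth bound, four identical jobs running at once each get about 34100/4 = 8500 MB/s instead of 23500 MB/s, so each should take roughly 23500/8500 = 2.8 times longer than when run alone; that is the same order as the reported 9 s with N=1 versus 34 s with N=4.]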
----- Message from Barry Smith > > --------- Date: Thu, 11 Jan 2024 11:56:24 -0500 >> From: Barry Smith > Subject: Re: >> [petsc-users] Parallel processes run significantly slower To: Steffen >> Wilksen | Universitaet Bremen > > Cc: PETSc users list >> >* >> >> >> * Take a look at the discussion >> in https://petsc.gitlab.io/-/petsc/-/jobs/5814862879/artifacts/public/html/manual/streams.html >> and >> I suggest you run the streams benchmark from the >> branch barry/2023-09-15/fix-log-pcmpi on your machine to get a baseline for >> what kind of speedup you can expect. * >> >> * Then let us know your thoughts.* >> >> * Barry* >> >> >> >> *On Jan 11, 2024, at 11:37?AM, Stefano Zampini > > wrote:* >> >> *You are creating the matrix on the wrong communicator if you want it >> parallel. You are using PETSc.COMM_SELF* >> >> *On Thu, Jan 11, 2024, 19:28 Steffen Wilksen | Universitaet Bremen >> > wrote:* >> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> *Hi all, I'm trying to do repeated matrix-vector-multiplication of large >>> sparse matrices in python using petsc4py. Even the most simple method of >>> parallelization, dividing up the calculation to run on multiple processes >>> indenpendtly, does not seem to give a singnificant speed up for large >>> matrices. I constructed a minimal working example, which I run using >>> mpiexec -n N python parallel_example.py, where N is the number of >>> processes. Instead of taking approximately the same time irrespective of >>> the number of processes used, the calculation is much slower when starting >>> more MPI processes. This translates to little to no speed up when splitting >>> up a fixed number of calculations over N processes. As an example, running >>> with N=1 takes 9s, while running with N=4 takes 34s. When running with >>> smaller matrices, the problem is not as severe (only slower by a factor of >>> 1.5 when setting MATSIZE=1e+5 instead of MATSIZE=1e+6). I get the same >>> problems when just starting the script four times manually without using >>> MPI. I attached both the script and the log file for running the script >>> with N=4. Any help would be greatly appreciated. Calculations are done on >>> my laptop, arch linux version 6.6.8 and PETSc version 3.20.2. Kind Regards >>> Steffen* >>> >> >> >> >> *----- End message from Barry Smith > >> -----* >> >> >> > > > > *----- End message from Junchao Zhang > -----* > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gourav.kumbhojkar at gmail.com Sun Jan 14 23:50:31 2024 From: gourav.kumbhojkar at gmail.com (Gourav Kumbhojkar) Date: Mon, 15 Jan 2024 05:50:31 +0000 Subject: [petsc-users] Neumann Boundary Condition with DMDACreate3D In-Reply-To: References: <0A4EB78C-997F-4978-8945-771B351B08CE@petsc.dev> <1130EB7D-5472-434C-A700-D424381940B5@petsc.dev> <2E18FABC-B59F-4F02-9BD6-0DFBBD4FB878@petsc.dev> Message-ID: The fix works well. Sorry for the delayed update. Thanks again. Gourav From: Gourav Kumbhojkar Date: Tuesday, January 9, 2024 at 11:50?PM To: Barry Smith Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Neumann Boundary Condition with DMDACreate3D Thank you very much for the fix. I?ll also update here as soon as I test it on my application code. Many thanks. Gourav From: Barry Smith Date: Tuesday, January 9, 2024 at 8:59?PM To: Gourav Kumbhojkar Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Neumann Boundary Condition with DMDACreate3D Sorry for the delay. 
The fix is in the git branch barry/2024-01-09/fix-mirror-dmda-3d/release see also https://gitlab.com/petsc/petsc/-/merge_requests/7175 Barry On Jan 8, 2024, at 3:44?PM, Gourav Kumbhojkar wrote: You are right. Attaching the code that I?m using to test this. The output matrix is saved in separate ascii files. You can use ?make noflux? to compile the code. Gourav From: Barry Smith > Date: Saturday, January 6, 2024 at 7:08?PM To: Gourav Kumbhojkar > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Neumann Boundary Condition with DMDACreate3D If the mirror code for star stencil is just wrong in 3d we should simply fix it. Not use some other approach. Can you attach code that tries to do what you need for both 2d (that results in a matrix you are happy with) and 3d (that results in a matrix that you are not happy with). Barry On Jan 6, 2024, at 7:30?PM, Gourav Kumbhojkar > wrote: Thank you, Barry. Sorry for the late response. Yes, I was referring to the same text. I am using a star stencil. However, I don?t think the mirror condition is implemented for star stencil either. TLDR version of the whole message typed below ? I think DM_BOUNDARY_GHOSTED is not implemented correctly in 3D. It appears that ghost nodes are mirrored with boundary nodes themselves. They should mirror with the nodes next to boundary. Long version - Here?s what I?m trying to do ? Step 1 - Create a 3D DM ierr = DMDACreate3d(PETSC_COMM_WORLD, DM_BOUNDARY_MIRROR, DM_BOUNDARY_MIRROR, DM_BOUNDARY_MIRROR, DMDA_STENCIL_STAR, num_pts, num_pts, num_pts, PETSC_DECIDE, PETSC_DECIDE, PETSC_DECIDE, 1, 1, NULL, NULL, NULL, &da); CHKERRQ(ierr); Note - num_pts = 4 in my code. Step 2 ? Create a Matrix from DM ( a FDM stiffness matrix) DMCreateMatrix(da, &K); globalKMat(K, info); ?globalKMat? is a user-defined function. Here?s a snippet from this function: for (int i = info.xs; i < (info.xs + info.xm); i++){ for(int j = info.ys; j < (info.ys + info.ym); j++){ for (int k = info.zs; k < (info.zs + info.zm); k++){ ncols = 0; row.i = i; row.j = j; row.k = k; col[0].i = i; col[0].j = j; col[0].k = k; vals[ncols++] = -6.; //ncols=1 col[ncols].i = i-1; col[ncols].j = j; col[ncols].k = k; vals[ncols++] = 1.;//ncols=2 There are total 7 ?ncols?. Other than the first one all ncols have value 1 (first one is set to -6). As you can see, this step is to only build the FDM stiffness matrix. I use ?ADD_VALUES? at the end in the above function. Step 3 ? View the stiffness matrix to check the values. I use MatView for this. Here are the results ? 1. 3D DM (showing first few rows of the stiffness matrix here, the original matrix is 64x64)- Mat Object: 1 MPI processes type: seqaij row 0: (0, -3.) (1, 1.) (4, 1.) (16, 1.) row 1: (0, 1.) (1, -4.) (2, 1.) (5, 1.) (17, 1.) row 2: (1, 1.) (2, -4.) (3, 1.) (6, 1.) (18, 1.) row 3: (2, 1.) (3, -3.) (7, 1.) (19, 1.) row 4: (0, 1.) (4, -4.) (5, 1.) (8, 1.) (20, 1.) row 5: (1, 1.) (4, 1.) (5, -5.) (6, 1.) (9, 1.) (21, 1.) 1. Repeat the same steps for a 2D DM to show the difference (the entire matrix is now 16x16) Mat Object: 1 MPI processes type: seqaij row 0: (0, -4.) (1, 2.) (4, 2.) row 1: (0, 1.) (1, -4.) (2, 1.) (5, 2.) row 2: (1, 1.) (2, -4.) (3, 1.) (6, 2.) row 3: (2, 2.) (3, -4.) (7, 2.) row 4: (0, 1.) (4, -4.) (5, 2.) (8, 1.) row 5: (1, 1.) (4, 1.) (5, -4.) (6, 1.) (9, 1.) I suspect that when using ?DM_BOUNDARY_MIRROR? in 3D, the ghost node value is added to the boundary node itself, which would explain why row 0 of the stiffness matrix has -3 instead of -6. 
In principle the ghost node value should be mirrored with the node next to boundary. Clearly, there?s no issue with the 2D implementation of the mirror boundary. The row 0 values are -4, 2, and 2 as expected. Let me know if I should give any other information about this. I also thought about using DM_BOUNDARY_GHOSTED and implement the mirror boundary in 3D from scratch but I would really appreciate some resources on how to do that. Thank you. Gourav From: Barry Smith > Date: Thursday, January 4, 2024 at 12:24?PM To: Gourav Kumbhojkar > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Neumann Boundary Condition with DMDACreate3D Are you referring to the text? . `DM_BOUNDARY_MIRROR` - the ghost value is the same as the value 1 grid point in; that is, the 0th grid point in the real mesh acts like a mirror to define the ghost point value; not yet implemented for 3d Looking at the code for DMSetUp_DA_3D() I see PetscCheck(stencil_type != DMDA_STENCIL_BOX || (bx != DM_BOUNDARY_MIRROR && by != DM_BOUNDARY_MIRROR && bz != DM_BOUNDARY_MIRROR), PetscObjectComm((PetscObject)da), PETSC_ERR_SUP, "Mirror boundary and box stencil"); which seems (to me) to indicate the mirroring is not done for box stencils but should work for star stencils. Are you using a star stencil or a box stencil? I believe the code is not complete for box stencil because the code to determine the location of the "mirrored point" for extra "box points" is messy in 3d and no one wrote it. You can compare DMSetUp_DA_2D() and DMSetUp_DA_3D() to see what is missing and see if you can determine how to add it for 3d. Barry On Jan 4, 2024, at 1:08?PM, Gourav Kumbhojkar > wrote: Hi, I am trying to implement a No-flux boundary condition for a 3D domain. I previously modeled a no flux boundary in 2D domain using DMDACreate2D and ?PETSC_BOUNDARY_MIRROR? which worked well. However, the manual pages say that the Mirror Boundary is not supported for 3D. Could you please point me to the right resources to implement no flux boundary condition in 3D domains? Regards, Gourav K. -------------- next part -------------- An HTML attachment was scrubbed... URL: From 1143418754 at qq.com Mon Jan 15 03:54:47 2024 From: 1143418754 at qq.com (=?gb18030?B?MTE0MzQxODc1NA==?=) Date: Mon, 15 Jan 2024 17:54:47 +0800 Subject: [petsc-users] Question about petsc4py with cuda Message-ID: 1143418754 1143418754 at qq.com  Hi, I am trying to solve a large linear equation, which needs a GPU solver as comparison. I install a CUDA-enabled PETSc and petsc4py from sources using the release tarball. According to the test results after installation, the PETSc can successfully work with cuda.  All my programs are written in python, so I turn to petsc4py. But I do not find any commands that define variables on coda device or define where the direct solver is executed. 
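[A hedged aside for readers with the same question, based on the GPU how-to linked further down in this thread: petsc4py objects stay on the CPU unless their Mat and Vec types are CUDA-backed. That is usually arranged either by setting the types explicitly when the objects are created, or by calling setFromOptions() on them and running the otherwise unchanged script with options such as -mat_type aijcusparse -vec_type cuda. A direct LU solve, as used in the script below, may additionally require a factorization package that can run on the GPU; the how-to describes which combinations are supported.]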
I check `nvidia-smi` and find my cuda does not work at all when executing my python script: from petsc4py import PETSc import numpy as np n = 1000 nnz = 3 * np.ones(n, dtype=np.int32) nnz[0] = nnz[-1] = 2 A = PETSc.Mat() A.createAIJ([n, n], nnz=nnz) # First set the first row A.setValue(0, 0, 2) A.setValue(0, 1, -1) # Now we fill the last row A.setValue(999, 998, -1) A.setValue(999, 999, 2) # And now everything else for index in range(1, n - 1): A.setValue(index, index - 1, -1) A.setValue(index, index, 2) A.setValue(index, index + 1, -1)  A.assemble() indexptr, indices, data = A.getValuesCSR() b = A.createVecLeft() b.array[:] = 1 for i in range(10): ksp = PETSc.KSP().create() ksp.setOperators(A) ksp.setType('preonly') ksp.setConvergenceHistory() ksp.getPC().setType('lu') x = A.createVecRight() ksp.solve(2*b, x) residual = A * x - 2*b if i % 10 == 0: print(f"The relative residual is: {residual.norm() / b.norm()}.?) What should I do to utlize GPU to execute the KSP task? Are there some settings to be modified? Looking forward to your early reply. Thanks a lot. -------------- next part -------------- An HTML attachment was scrubbed... URL: From zyliu20 at icloud.com Mon Jan 15 03:09:51 2024 From: zyliu20 at icloud.com (MIA) Date: Mon, 15 Jan 2024 17:09:51 +0800 Subject: [petsc-users] Question about petsc4py with cuda Message-ID: <0441E69D-CE6F-41E3-AE67-7BA4C55B044A@icloud.com> Hi, I am trying to solve a large linear equation, which needs a GPU solver as comparison. I install a CUDA-enabled PETSc and petsc4py from sources using the release tarball. According to the test results after installation, the PETSc can successfully work with cuda. All my programs are written in python, so I turn to petsc4py. But I do not find any commands that define variables on coda device or define where the direct solver is executed. I check `nvidia-smi` and find my cuda does not work at all when executing my python script: from petsc4py import PETSc import numpy as np n = 1000 nnz = 3 * np.ones(n, dtype=np.int32) nnz[0] = nnz[-1] = 2 A = PETSc.Mat() A.createAIJ([n, n], nnz=nnz) # First set the first row A.setValue(0, 0, 2) A.setValue(0, 1, -1) # Now we fill the last row A.setValue(999, 998, -1) A.setValue(999, 999, 2) # And now everything else for index in range(1, n - 1): A.setValue(index, index - 1, -1) A.setValue(index, index, 2) A.setValue(index, index + 1, -1) A.assemble() indexptr, indices, data = A.getValuesCSR() b = A.createVecLeft() b.array[:] = 1 for i in range(10): ksp = PETSc.KSP().create() ksp.setOperators(A) ksp.setType('preonly') ksp.setConvergenceHistory() ksp.getPC().setType('lu') x = A.createVecRight() ksp.solve(2*b, x) residual = A * x - 2*b if i % 10 == 0: print(f"The relative residual is: {residual.norm() / b.norm()}.?) What should I do to utlize GPU to execute the KSP task? Are there some settings to be modified? Looking forward to your early reply. Thanks a lot. From knepley at gmail.com Mon Jan 15 11:23:25 2024 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 15 Jan 2024 12:23:25 -0500 Subject: [petsc-users] Question about petsc4py with cuda In-Reply-To: <0441E69D-CE6F-41E3-AE67-7BA4C55B044A@icloud.com> References: <0441E69D-CE6F-41E3-AE67-7BA4C55B044A@icloud.com> Message-ID: On Mon, Jan 15, 2024 at 11:57?AM MIA via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hi, > > I am trying to solve a large linear equation, which needs a GPU solver as > comparison. I install a CUDA-enabled PETSc and petsc4py from sources using > the release tarball. 
According to the test results after installation, the > PETSc can successfully work with cuda. > Here is a How-To for GPUs: https://petsc.org/main/faq/#doc-faq-gpuhowto Thanks, Matt > All my programs are written in python, so I turn to petsc4py. But I do not > find any commands that define variables on coda device or define where the > direct solver is executed. I check `nvidia-smi` and find my cuda does not > work at all when executing my python script: > > from petsc4py import PETSc > import numpy as np > > n = 1000 > > nnz = 3 * np.ones(n, dtype=np.int32) > nnz[0] = nnz[-1] = 2 > > A = PETSc.Mat() > A.createAIJ([n, n], nnz=nnz) > > # First set the first row > A.setValue(0, 0, 2) > A.setValue(0, 1, -1) > # Now we fill the last row > A.setValue(999, 998, -1) > A.setValue(999, 999, 2) > > # And now everything else > for index in range(1, n - 1): > A.setValue(index, index - 1, -1) > A.setValue(index, index, 2) > A.setValue(index, index + 1, -1) > > A.assemble() > > indexptr, indices, data = A.getValuesCSR() > b = A.createVecLeft() > b.array[:] = 1 > for i in range(10): > ksp = PETSc.KSP().create() > ksp.setOperators(A) > ksp.setType('preonly') > ksp.setConvergenceHistory() > ksp.getPC().setType('lu') > x = A.createVecRight() > ksp.solve(2*b, x) > residual = A * x - 2*b > if i % 10 == 0: > print(f"The relative residual is: {residual.norm() / b.norm()}.?) > > What should I do to utlize GPU to execute the KSP task? Are there some > settings to be modified? > > Looking forward to your early reply. Thanks a lot. > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From y.hu at mpie.de Tue Jan 16 03:47:52 2024 From: y.hu at mpie.de (Yi Hu) Date: Tue, 16 Jan 2024 10:47:52 +0100 Subject: [petsc-users] SNES seems not use my matrix-free operation In-Reply-To: References: <7B0A8642-74DC-44FF-906A-E11FDB95C331@petsc.dev> <49913a4e-b55d-46b2-9141-a6c1d1ca7cf7@mpie.de> <5EA5452B-7EDE-4D47-8A6C-1D03AA57AD58@petsc.dev> <10A79A13-67A7-4C2E-B6F6-6E73B58856A2@petsc.dev> Message-ID: Dear Barry, Thanks for your reply. In fact, this still does not work, giving the error ?code not yet written for matrix type shell?. I also tried recording F_global as base in my global shell matrix Jac_PETSc. And fetch it in my GK_op, but the same error occurs. So the question is still, how can I assign the Jac in my formJacobian() to my global defined Jac_PETSc when set up with SNESSetJacobian(snes,Jac_PETSc,Jac_PETSc,formJacobian,0,ierr). As far as I understand, when using SNESSetJacobian() the jac argument is only for reserving memory. And the matrix is not written yet. Only a valid formJacobian() can lead to a write to the matrix. My formJacobian maybe not valid (because the output not gives me print), so the Jac_PETSc is never written for the SNES. Maybe I am wrong about it. ? Best wishes, Yi From: Barry Smith Sent: Thursday, January 11, 2024 4:47 PM To: Yi Hu Cc: petsc-users Subject: Re: [petsc-users] SNES seems not use my matrix-free operation The following assumes you are not using the shell matrix context for some other purpose subroutine formJacobian(snes,F_global,Jac,Jac_pre,dummy,err_PETSc) SNES :: snes Vec :: F_global ! real(pREAL), dimension(3,3,cells(1),cells(2),cells3), intent(in) :: & ! 
F Mat :: Jac, Jac_pre PetscObject :: dummy PetscErrorCode :: err_PETSc print*, '@@ start build my jac' PetscCall(MatShellSetContext(Jac,F_global,ierr)) ! record the current base vector where the Jacobian is to be applied print*, '@@ end build my jac' end subroutine formJacobian subroutine Gk_op ... Vec base PetscCall(MatShellGetContext(Jac,base,ierr)) ! use base in the computation of your matrix-free Jacobian vector product .... On Jan 11, 2024, at 5:55?AM, Yi Hu wrote: Now I understand a bit more about the workflow of set jacobian. It seems that the SNES can be really fine-grained. As you point out, J is built via formJacobian() callback, and can be based on previous solution (or the base vector u, as you mentioned). And then KSP can use a customized MATOP_MULT to solve the linear equations J(u)*x=rhs. So I followed your idea about removing DMSNESSetJacobianLocal() and did the following. ?? call MatCreateShell(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,& 9*product(cells(1:2))*cells3,9*product(cells(1:2))*cells3,& 0,Jac_PETSc,err_PETSc) CHKERRQ(err_PETSc) call MatShellSetOperation(Jac_PETSc,MATOP_MULT,GK_op,err_PETSc) CHKERRQ(err_PETSc) call SNESSetJacobian(SNES_mech,Jac_PETSc,Jac_PETSc,formJacobian,0,err_PETSc) CHKERRQ(err_PETSc) ?? And my formJacobian() is subroutine formJacobian(snes,F_global,Jac,Jac_pre,dummy,err_PETSc) SNES :: snes Vec :: F_global ! real(pREAL), dimension(3,3,cells(1),cells(2),cells3), intent(in) :: & ! F Mat :: Jac, Jac_pre PetscObject :: dummy PetscErrorCode :: err_PETSc print*, '@@ start build my jac' call MatCopy(Jac_PETSc,Jac,SAME_NONZERO_PATTERN,err_PETSc) CHKERRQ(err_PETSc) call MatCopy(Jac_PETSc,Jac_pre,SAME_NONZERO_PATTERN,err_PETSc) CHKERRQ(err_PETSc) ! Jac = Jac_PETSc ! Jac_pre = Jac_PETSc print*, '@@ end build my jac' end subroutine formJacobian it turns out that no matter by a simple assignment or MatCopy(), the compiled program gives me the same error as before. So I guess the real jacobian is still not set. I wonder how to get around this and let this built jac in formJacobian() to be the same as my shell matrix. Yi From: Barry Smith Sent: Wednesday, January 10, 2024 4:27 PM To: Yi Hu Cc: petsc-users Subject: Re: [petsc-users] SNES seems not use my matrix-free operation By default if SNESSetJacobian() is not called with a function pointer PETSc attempts to compute the Jacobian matrix explicitly with finite differences and coloring. This doesn't makes sense with a shell matrix. Hence the error message below regarding MatFDColoringCreate(). DMSNESSetJacobianLocal() calls SNESSetJacobian() with a function pointer of SNESComputeJacobian_DMLocal() so preventing the error from triggering in your code. You can provide your own function to SNESSetJacobian() and thus not need to call DMSNESSetJacobianLocal(). What you do depends on how you want to record the "base" vector that tells your matrix-free multiply routine where the Jacobian matrix vector product is being applied, that is J(u)*x. u is the "base" vector which is passed to the function provided with SNESSetJacobian(). Barry On Jan 10, 2024, at 6:20?AM, Yi Hu wrote: Thanks for the clarification. It is more clear to me now about the global to local processes after checking the examples, e.g. ksp/ksp/tutorials/ex14f.F90. And for using Vec locally, I followed your advice of VecGet.. and VecRestore? In fact I used DMDAVecGetArrayReadF90() and some other relevant subroutines. For your comment on DMSNESSetJacobianLocal(). 
It seems that I need to use both SNESSetJacobian() and then DMSNESSetJacobianLocal() to get things working. When I do only SNESSetJacobian(), it does not work, meaning the following does not work ?? call DMDASNESsetFunctionLocal(DM_mech,INSERT_VALUES,formResidual,PETSC_NULL_SNES,err_PETSc) CHKERRQ(err_PETSc) call MatCreateShell(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,& 9*product(cells(1:2))*cells3,9*product(cells(1:2))*cells3,& 0,Jac_PETSc,err_PETSc) CHKERRQ(err_PETSc) call MatShellSetOperation(Jac_PETSc,MATOP_MULT,GK_op,err_PETSc) CHKERRQ(err_PETSc) call SNESSetJacobian(SNES_mech,Jac_PETSc,Jac_PETSc,PETSC_NULL_FUNCTION,0,err_PETSc) CHKERRQ(err_PETSc) !call DMSNESsetJacobianLocal(DM_mech,formJacobian,PETSC_NULL_SNES,err_PETSc) !CHKERRQ(err_PETSc) call SNESsetConvergenceTest(SNES_mech,converged,PETSC_NULL_SNES,PETSC_NULL_FUNCTION,err_PETSc) CHKERRQ(err_PETSc) call SNESSetDM(SNES_mech,DM_mech,err_PETSc) CHKERRQ(err_PETSc) ?? It gives me the message [0]PETSC ERROR: No support for this operation for this object type [0]PETSC ERROR: Code not yet written for matrix type shell [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.16.4, Feb 02, 2022 [0]PETSC ERROR: Configure options PETSC_ARCH=linux-gnu --with-fortran-bindings --with-mpi-f90module-visibility=0 --download-fftw --download-hdf5 --download-hdf5-fortran-bindings --download-fblaslapack --download-ml --download-zlib [0]PETSC ERROR: #1 MatFDColoringCreate() at /home/yi/app/petsc-3.16.4/src/mat/matfd/fdmatrix.c:471 [0]PETSC ERROR: #2 SNESComputeJacobian_DMDA() at /home/yi/app/petsc-3.16.4/src/snes/utils/dmdasnes.c:173[0]PETSC ERROR: #3 SNESComputeJacobian() at /home/yi/app/petsc-3.16.4/src/snes/interface/snes.c:2864 [0]PETSC ERROR: #4 SNESSolve_NEWTONLS() at /home/yi/app/petsc-3.16.4/src/snes/impls/ls/ls.c:222 [0]PETSC ERROR: #5 SNESSolve() at /home/yi/app/petsc-3.16.4/src/snes/interface/snes.c:4809 [0]PETSC ERROR: #6 User provided function() at User file:0 [0]PETSC ERROR: #7 VecSetErrorIfLocked() at /home/yi/app/petsc-3.16.4/include/petscvec.h:623 [0]PETSC ERROR: #8 VecGetArray() at /home/yi/app/petsc-3.16.4/src/vec/vec/interface/rvector.c:1769 [0]PETSC ERROR: #9 User provided function() at User file:0 [0]PETSC ERROR: #10 MatFDColoringCreate() at /home/yi/app/petsc-3.16.4/src/mat/matfd/fdmatrix.c:471 [0]PETSC ERROR: #11 SNESComputeJacobian_DMDA() at /home/yi/app/petsc-3.16.4/src/snes/utils/dmdasnes.c:173 [0]PETSC ERROR: #12 SNESComputeJacobian() at /home/yi/app/petsc-3.16.4/src/snes/interface/snes.c:2864 [0]PETSC ERROR: #13 SNESSolve_NEWTONLS() at /home/yi/app/petsc-3.16.4/src/snes/impls/ls/ls.c:222 [0]PETSC ERROR: #14 SNESSolve() at /home/yi/app/petsc-3.16.4/src/snes/interface/snes.c:4809 It seems that I have to use a DMSNESSetJacobianLocal() to ?activate? the use of my shell matrix, although the formJacobian() in the DMSNESSetJacobianLocal() is doing nothing. Best wishes, Yi From: Barry Smith Sent: Tuesday, January 9, 2024 4:49 PM To: Yi Hu Cc: petsc-users Subject: Re: [petsc-users] SNES seems not use my matrix-free operation However, my GK_op (which is the reloaded MATOP_MULT) gives me some problem. It is entered but then crashed with Segmentation Violation error. So I guess my indices may be wrong. I wonder do I need to use the local Vec (of dF), and should my output Vec also be in the correct shape (i.e. after calculation I need to transform back into a Vec)? As you can see here, my dF is a tensor defined on every grid point. 
The input for the matrix-vector product is a global vector, as is the result. (Not like the arguments to DMSNESSetJacobianLocal). This means that your MATOP_MULT function needs to do the DMGlobalToLocal() vector operation first then the "unwrapping" from the vector to the array format at the beginning of the routine. Similarly it needs to "unwrap" the result vector as an array. See src/snes/tutorials/ex14f.F90 and in particular the code block PetscCall(DMGlobalToLocalBegin(da,X,INSERT_VALUES,localX,ierr)) PetscCall(DMGlobalToLocalEnd(da,X,INSERT_VALUES,localX,ierr)) ! Get pointers to vector data PetscCall(VecGetArrayReadF90(localX,xx,ierr)) PetscCall(VecGetArrayF90(F,ff,ierr)) Barry You really shouldn't be using DMSNESSetJacobianLocal() for your code. Basically all the DMSNESSetJacobianLocal() gives you is that it automatically handles the global to local mapping and unwrapping of the vector to an array, but it doesn't work for shell matrices. On Jan 9, 2024, at 6:30?AM, Yi Hu wrote: Dear Barry, Thanks for your help. It works when doing first SNESSetJacobian() with my created shell matrix Jac in the main (or module) and then DMSNESSetJacobianLocal() to associate with my DM and an dummy formJacobian callback (which is doing nothing). My SNES can now recognize my shell matrix and do my customized operation. However, my GK_op (which is the reloaded MATOP_MULT) gives me some problem. It is entered but then crashed with Segmentation Violation error. So I guess my indices may be wrong. I wonder do I need to use the local Vec (of dF), and should my output Vec also be in the correct shape (i.e. after calculation I need to transform back into a Vec)? As you can see here, my dF is a tensor defined on every grid point. Best wishes, Yi From: Barry Smith Sent: Monday, January 8, 2024 6:41 PM To: Yi Hu Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] SNES seems not use my matrix-free operation "formJacobian" should not be __creating__ the matrices. Here "form" means computing the numerical values in the matrix (or when using a shell matrix it means keeping a copy of X so that your custom matrix-free multiply knows the base location where the matrix free Jacobian-vector products are computed.) You create the shell matrices up in your main program and pass them in with SNESSetJacobian(). Try first calling SNESSetJacobian() to provide the matrices (provide a dummy function argument) and then call DMSNESSetJacobianLocal() to provide your "formjacobian" function (that does not create the matrices). Barry Yes, "form" is a bad word that should not have been used in our code. On Jan 8, 2024, at 12:24?PM, Yi Hu wrote: Dear PETSc Experts, I am implementing a matrix-free jacobian for my SNES solver in Fortran. (command line option -snes_type newtonls -ksp_type gmres) In the main program, I define my residual and jacobian and matrix-free jacobian like the following, ? call DMDASNESSetFunctionLocal(DM_mech, INSERT_VALUES, formResidual, PETSC_NULL_SNES, err_PETSc) call DMSNESSetJacobianLocal(DM_mech, formJacobian, PETSC_NULL_SNES, err_PETSc) ? subroutine formJacobian(residual_subdomain,F,Jac_pre,Jac,dummy,err_PETSc) #include use petscmat implicit None DMDALocalInfo, dimension(DMDA_LOCAL_INFO_SIZE) :: & residual_subdomain !< DMDA info (needs to be named "in" for macros like XRANGE to work) real(pREAL), dimension(3,3,cells(1),cells(2),cells3), intent(in) :: & F !< deformation gradient field Mat :: Jac, Jac_pre PetscObject :: dummy PetscErrorCode :: err_PETSc PetscInt :: N_dof ! 
global number of DoF, maybe only a placeholder N_dof = 9*product(cells(1:2))*cells3 print*, 'in my jac' call MatCreateShell(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,N_dof,N_dof,0,Jac,err_PETSc) CHKERRQ(err_PETSc) call MatShellSetOperation(Jac,MATOP_MULT,GK_op,err_PETSc) CHKERRQ(err_PETSc) print*, 'in my jac' ! for jac preconditioner call MatCreateShell(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,N_dof,N_dof,0,Jac_pre,err_PETSc) CHKERRQ(err_PETSc) call MatShellSetOperation(Jac_pre,MATOP_MULT,GK_op,err_PETSc) CHKERRQ(err_PETSc) print*, 'in my jac' end subroutine formJacobian subroutine GK_op(Jac,dF,output,err_PETSc) real(pREAL), dimension(3,3,cells(1),cells(2),cells3), intent(in) :: & dF real(pREAL), dimension(3,3,cells(1),cells(2),cells3), intent(out) :: & output real(pREAL), dimension(3,3) :: & deltaF_aim = 0.0_pREAL Mat :: Jac PetscErrorCode :: err_PETSc integer :: i, j, k, e ? a lot of calculations ? print*, 'in GK op' end subroutine GK_op The first question is that: it seems I still need to explicitly define the interface of MatCreateShell() and MatShellSetOperation() to properly use them, even though I include them via ?use petscmat?. It is a little bit strange to me, since some examples do not perform this step. Then the main issue is that I can build my own Jacobian from my call back function formJacobian, and confirm my Jacobian is a shell matrix (by MatView). However, my customized operator GK_op is not called when solving the nonlinear system (not print my ?in GK op?). When I try to monitor my SNES, it gives me some conventional output not mentioning my matrix-free operations. So I guess my customized MATOP_MULT may be not associated with Jacobian. Or my configuration is somehow wrong. Could you help me solve this issue? Thanks, Yi ------------------------------------------------- Stay up to date and follow us on LinkedIn, Twitter and YouTube. Max-Planck-Institut f?r Eisenforschung GmbH Max-Planck-Stra?e 1 D-40237 D?sseldorf Handelsregister B 2533 Amtsgericht D?sseldorf Gesch?ftsf?hrung Prof. Dr. Gerhard Dehm Prof. Dr. J?rg Neugebauer Prof. Dr. Dierk Raabe Dr. Kai de Weldige Ust.-Id.-Nr.: DE 11 93 58 514 Steuernummer: 105 5891 1000 Please consider that invitations and e-mails of our institute are only valid if they end with ?@mpie.de. If you are not sure of the validity please contact rco at mpie.de Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de ------------------------------------------------- ------------------------------------------------- Stay up to date and follow us on LinkedIn, Twitter and YouTube. Max-Planck-Institut f?r Eisenforschung GmbH Max-Planck-Stra?e 1 D-40237 D?sseldorf Handelsregister B 2533 Amtsgericht D?sseldorf Gesch?ftsf?hrung Prof. Dr. Gerhard Dehm Prof. Dr. J?rg Neugebauer Prof. Dr. Dierk Raabe Dr. Kai de Weldige Ust.-Id.-Nr.: DE 11 93 58 514 Steuernummer: 105 5891 1000 Please consider that invitations and e-mails of our institute are only valid if they end with ?@mpie.de. If you are not sure of the validity please contact rco at mpie.de Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de ------------------------------------------------- ------------------------------------------------- Stay up to date and follow us on LinkedIn, Twitter and YouTube. 
Max-Planck-Institut f?r Eisenforschung GmbH Max-Planck-Stra?e 1 D-40237 D?sseldorf Handelsregister B 2533 Amtsgericht D?sseldorf Gesch?ftsf?hrung Prof. Dr. Gerhard Dehm Prof. Dr. J?rg Neugebauer Prof. Dr. Dierk Raabe Dr. Kai de Weldige Ust.-Id.-Nr.: DE 11 93 58 514 Steuernummer: 105 5891 1000 Please consider that invitations and e-mails of our institute are only valid if they end with ?@mpie.de. If you are not sure of the validity please contact rco at mpie.de Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de ------------------------------------------------- ------------------------------------------------- Stay up to date and follow us on LinkedIn, Twitter and YouTube. Max-Planck-Institut f?r Eisenforschung GmbH Max-Planck-Stra?e 1 D-40237 D?sseldorf Handelsregister B 2533 Amtsgericht D?sseldorf Gesch?ftsf?hrung Prof. Dr. Gerhard Dehm Prof. Dr. J?rg Neugebauer Prof. Dr. Dierk Raabe Dr. Kai de Weldige Ust.-Id.-Nr.: DE 11 93 58 514 Steuernummer: 105 5891 1000 Please consider that invitations and e-mails of our institute are only valid if they end with ?@mpie.de. If you are not sure of the validity please contact rco at mpie.de Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de ------------------------------------------------- ------------------------------------------------- Stay up to date and follow us on LinkedIn, Twitter and YouTube. Max-Planck-Institut f?r Eisenforschung GmbH Max-Planck-Stra?e 1 D-40237 D?sseldorf Handelsregister B 2533 Amtsgericht D?sseldorf Gesch?ftsf?hrung Prof. Dr. Gerhard Dehm Prof. Dr. J?rg Neugebauer Prof. Dr. Dierk Raabe Dr. Kai de Weldige Ust.-Id.-Nr.: DE 11 93 58 514 Steuernummer: 105 5891 1000 Please consider that invitations and e-mails of our institute are only valid if they end with ?@mpie.de. If you are not sure of the validity please contact rco at mpie.de Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de ------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From y.hu at mpie.de Tue Jan 16 04:27:55 2024 From: y.hu at mpie.de (Yi Hu) Date: Tue, 16 Jan 2024 11:27:55 +0100 Subject: [petsc-users] SNES seems not use my matrix-free operation In-Reply-To: References: <7B0A8642-74DC-44FF-906A-E11FDB95C331@petsc.dev> <49913a4e-b55d-46b2-9141-a6c1d1ca7cf7@mpie.de> <5EA5452B-7EDE-4D47-8A6C-1D03AA57AD58@petsc.dev> <10A79A13-67A7-4C2E-B6F6-6E73B58856A2@petsc.dev> Message-ID: Just to append the last mail. My defined interface for ctx usage is as follows ? interface MatShellSetContext ??? subroutine MatShellSetContext(mat,ctx_F_base,ierr) ????? use petscmat ????? Mat :: mat ????? Vec :: ctx_F_base ????? PetscErrorCode :: ierr ??? end subroutine MatShellSetContext ? end interface MatShellSetContext ? interface MatShellGetContext ??? subroutine MatShellGetContext(mat,ctx_F_base,ierr) ????? use petscmat ????? Mat :: mat ????? Vec :: ctx_F_base ????? PetscErrorCode :: ierr ??? end subroutine MatShellGetContext ? 
end interface MatShellGetContext From: Yi Hu Sent: Tuesday, January 16, 2024 10:48 AM To: 'Barry Smith' Cc: 'petsc-users' Subject: RE: [petsc-users] SNES seems not use my matrix-free operation Dear Barry, Thanks for your reply. In fact, this still does not work, giving the error ?code not yet written for matrix type shell?. I also tried recording F_global as base in my global shell matrix Jac_PETSc. And fetch it in my GK_op, but the same error occurs. So the question is still, how can I assign the Jac in my formJacobian() to my global defined Jac_PETSc when set up with SNESSetJacobian(snes,Jac_PETSc,Jac_PETSc,formJacobian,0,ierr). As far as I understand, when using SNESSetJacobian() the jac argument is only for reserving memory. And the matrix is not written yet. Only a valid formJacobian() can lead to a write to the matrix. My formJacobian maybe not valid (because the output not gives me print), so the Jac_PETSc is never written for the SNES. Maybe I am wrong about it. Best wishes, Yi From: Barry Smith Sent: Thursday, January 11, 2024 4:47 PM To: Yi Hu Cc: petsc-users Subject: Re: [petsc-users] SNES seems not use my matrix-free operation The following assumes you are not using the shell matrix context for some other purpose subroutine formJacobian(snes,F_global,Jac,Jac_pre,dummy,err_PETSc) SNES :: snes Vec :: F_global ! real(pREAL), dimension(3,3,cells(1),cells(2),cells3), intent(in) :: & ! F Mat :: Jac, Jac_pre PetscObject :: dummy PetscErrorCode :: err_PETSc print*, '@@ start build my jac' PetscCall(MatShellSetContext(Jac,F_global,ierr)) ! record the current base vector where the Jacobian is to be applied print*, '@@ end build my jac' end subroutine formJacobian subroutine Gk_op ... Vec base PetscCall(MatShellGetContext(Jac,base,ierr)) ! use base in the computation of your matrix-free Jacobian vector product .... On Jan 11, 2024, at 5:55?AM, Yi Hu wrote: Now I understand a bit more about the workflow of set jacobian. It seems that the SNES can be really fine-grained. As you point out, J is built via formJacobian() callback, and can be based on previous solution (or the base vector u, as you mentioned). And then KSP can use a customized MATOP_MULT to solve the linear equations J(u)*x=rhs. So I followed your idea about removing DMSNESSetJacobianLocal() and did the following. ?? call MatCreateShell(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,& 9*product(cells(1:2))*cells3,9*product(cells(1:2))*cells3,& 0,Jac_PETSc,err_PETSc) CHKERRQ(err_PETSc) call MatShellSetOperation(Jac_PETSc,MATOP_MULT,GK_op,err_PETSc) CHKERRQ(err_PETSc) call SNESSetJacobian(SNES_mech,Jac_PETSc,Jac_PETSc,formJacobian,0,err_PETSc) CHKERRQ(err_PETSc) ?? And my formJacobian() is subroutine formJacobian(snes,F_global,Jac,Jac_pre,dummy,err_PETSc) SNES :: snes Vec :: F_global ! real(pREAL), dimension(3,3,cells(1),cells(2),cells3), intent(in) :: & ! F Mat :: Jac, Jac_pre PetscObject :: dummy PetscErrorCode :: err_PETSc print*, '@@ start build my jac' call MatCopy(Jac_PETSc,Jac,SAME_NONZERO_PATTERN,err_PETSc) CHKERRQ(err_PETSc) call MatCopy(Jac_PETSc,Jac_pre,SAME_NONZERO_PATTERN,err_PETSc) CHKERRQ(err_PETSc) ! Jac = Jac_PETSc ! Jac_pre = Jac_PETSc print*, '@@ end build my jac' end subroutine formJacobian it turns out that no matter by a simple assignment or MatCopy(), the compiled program gives me the same error as before. So I guess the real jacobian is still not set. I wonder how to get around this and let this built jac in formJacobian() to be the same as my shell matrix. 
Yi From: Barry Smith Sent: Wednesday, January 10, 2024 4:27 PM To: Yi Hu Cc: petsc-users Subject: Re: [petsc-users] SNES seems not use my matrix-free operation By default if SNESSetJacobian() is not called with a function pointer PETSc attempts to compute the Jacobian matrix explicitly with finite differences and coloring. This doesn't makes sense with a shell matrix. Hence the error message below regarding MatFDColoringCreate(). DMSNESSetJacobianLocal() calls SNESSetJacobian() with a function pointer of SNESComputeJacobian_DMLocal() so preventing the error from triggering in your code. You can provide your own function to SNESSetJacobian() and thus not need to call DMSNESSetJacobianLocal(). What you do depends on how you want to record the "base" vector that tells your matrix-free multiply routine where the Jacobian matrix vector product is being applied, that is J(u)*x. u is the "base" vector which is passed to the function provided with SNESSetJacobian(). Barry On Jan 10, 2024, at 6:20?AM, Yi Hu wrote: Thanks for the clarification. It is more clear to me now about the global to local processes after checking the examples, e.g. ksp/ksp/tutorials/ex14f.F90. And for using Vec locally, I followed your advice of VecGet.. and VecRestore? In fact I used DMDAVecGetArrayReadF90() and some other relevant subroutines. For your comment on DMSNESSetJacobianLocal(). It seems that I need to use both SNESSetJacobian() and then DMSNESSetJacobianLocal() to get things working. When I do only SNESSetJacobian(), it does not work, meaning the following does not work ?? call DMDASNESsetFunctionLocal(DM_mech,INSERT_VALUES,formResidual,PETSC_NULL_SNES,err_PETSc) CHKERRQ(err_PETSc) call MatCreateShell(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,& 9*product(cells(1:2))*cells3,9*product(cells(1:2))*cells3,& 0,Jac_PETSc,err_PETSc) CHKERRQ(err_PETSc) call MatShellSetOperation(Jac_PETSc,MATOP_MULT,GK_op,err_PETSc) CHKERRQ(err_PETSc) call SNESSetJacobian(SNES_mech,Jac_PETSc,Jac_PETSc,PETSC_NULL_FUNCTION,0,err_PETSc) CHKERRQ(err_PETSc) !call DMSNESsetJacobianLocal(DM_mech,formJacobian,PETSC_NULL_SNES,err_PETSc) !CHKERRQ(err_PETSc) call SNESsetConvergenceTest(SNES_mech,converged,PETSC_NULL_SNES,PETSC_NULL_FUNCTION,err_PETSc) CHKERRQ(err_PETSc) call SNESSetDM(SNES_mech,DM_mech,err_PETSc) CHKERRQ(err_PETSc) ?? It gives me the message [0]PETSC ERROR: No support for this operation for this object type [0]PETSC ERROR: Code not yet written for matrix type shell [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.16.4, Feb 02, 2022 [0]PETSC ERROR: Configure options PETSC_ARCH=linux-gnu --with-fortran-bindings --with-mpi-f90module-visibility=0 --download-fftw --download-hdf5 --download-hdf5-fortran-bindings --download-fblaslapack --download-ml --download-zlib [0]PETSC ERROR: #1 MatFDColoringCreate() at /home/yi/app/petsc-3.16.4/src/mat/matfd/fdmatrix.c:471 [0]PETSC ERROR: #2 SNESComputeJacobian_DMDA() at /home/yi/app/petsc-3.16.4/src/snes/utils/dmdasnes.c:173[0]PETSC ERROR: #3 SNESComputeJacobian() at /home/yi/app/petsc-3.16.4/src/snes/interface/snes.c:2864 [0]PETSC ERROR: #4 SNESSolve_NEWTONLS() at /home/yi/app/petsc-3.16.4/src/snes/impls/ls/ls.c:222 [0]PETSC ERROR: #5 SNESSolve() at /home/yi/app/petsc-3.16.4/src/snes/interface/snes.c:4809 [0]PETSC ERROR: #6 User provided function() at User file:0 [0]PETSC ERROR: #7 VecSetErrorIfLocked() at /home/yi/app/petsc-3.16.4/include/petscvec.h:623 [0]PETSC ERROR: #8 VecGetArray() at /home/yi/app/petsc-3.16.4/src/vec/vec/interface/rvector.c:1769 [0]PETSC ERROR: #9 User provided function() at User file:0 [0]PETSC ERROR: #10 MatFDColoringCreate() at /home/yi/app/petsc-3.16.4/src/mat/matfd/fdmatrix.c:471 [0]PETSC ERROR: #11 SNESComputeJacobian_DMDA() at /home/yi/app/petsc-3.16.4/src/snes/utils/dmdasnes.c:173 [0]PETSC ERROR: #12 SNESComputeJacobian() at /home/yi/app/petsc-3.16.4/src/snes/interface/snes.c:2864 [0]PETSC ERROR: #13 SNESSolve_NEWTONLS() at /home/yi/app/petsc-3.16.4/src/snes/impls/ls/ls.c:222 [0]PETSC ERROR: #14 SNESSolve() at /home/yi/app/petsc-3.16.4/src/snes/interface/snes.c:4809 It seems that I have to use a DMSNESSetJacobianLocal() to ?activate? the use of my shell matrix, although the formJacobian() in the DMSNESSetJacobianLocal() is doing nothing. Best wishes, Yi From: Barry Smith Sent: Tuesday, January 9, 2024 4:49 PM To: Yi Hu Cc: petsc-users Subject: Re: [petsc-users] SNES seems not use my matrix-free operation However, my GK_op (which is the reloaded MATOP_MULT) gives me some problem. It is entered but then crashed with Segmentation Violation error. So I guess my indices may be wrong. I wonder do I need to use the local Vec (of dF), and should my output Vec also be in the correct shape (i.e. after calculation I need to transform back into a Vec)? As you can see here, my dF is a tensor defined on every grid point. The input for the matrix-vector product is a global vector, as is the result. (Not like the arguments to DMSNESSetJacobianLocal). This means that your MATOP_MULT function needs to do the DMGlobalToLocal() vector operation first then the "unwrapping" from the vector to the array format at the beginning of the routine. Similarly it needs to "unwrap" the result vector as an array. See src/snes/tutorials/ex14f.F90 and in particular the code block PetscCall(DMGlobalToLocalBegin(da,X,INSERT_VALUES,localX,ierr)) PetscCall(DMGlobalToLocalEnd(da,X,INSERT_VALUES,localX,ierr)) ! Get pointers to vector data PetscCall(VecGetArrayReadF90(localX,xx,ierr)) PetscCall(VecGetArrayF90(F,ff,ierr)) Barry You really shouldn't be using DMSNESSetJacobianLocal() for your code. Basically all the DMSNESSetJacobianLocal() gives you is that it automatically handles the global to local mapping and unwrapping of the vector to an array, but it doesn't work for shell matrices. On Jan 9, 2024, at 6:30?AM, Yi Hu wrote: Dear Barry, Thanks for your help. 
It works when doing first SNESSetJacobian() with my created shell matrix Jac in the main (or module) and then DMSNESSetJacobianLocal() to associate with my DM and an dummy formJacobian callback (which is doing nothing). My SNES can now recognize my shell matrix and do my customized operation. However, my GK_op (which is the reloaded MATOP_MULT) gives me some problem. It is entered but then crashed with Segmentation Violation error. So I guess my indices may be wrong. I wonder do I need to use the local Vec (of dF), and should my output Vec also be in the correct shape (i.e. after calculation I need to transform back into a Vec)? As you can see here, my dF is a tensor defined on every grid point. Best wishes, Yi From: Barry Smith Sent: Monday, January 8, 2024 6:41 PM To: Yi Hu Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] SNES seems not use my matrix-free operation "formJacobian" should not be __creating__ the matrices. Here "form" means computing the numerical values in the matrix (or when using a shell matrix it means keeping a copy of X so that your custom matrix-free multiply knows the base location where the matrix free Jacobian-vector products are computed.) You create the shell matrices up in your main program and pass them in with SNESSetJacobian(). Try first calling SNESSetJacobian() to provide the matrices (provide a dummy function argument) and then call DMSNESSetJacobianLocal() to provide your "formjacobian" function (that does not create the matrices). Barry Yes, "form" is a bad word that should not have been used in our code. On Jan 8, 2024, at 12:24?PM, Yi Hu wrote: Dear PETSc Experts, I am implementing a matrix-free jacobian for my SNES solver in Fortran. (command line option -snes_type newtonls -ksp_type gmres) In the main program, I define my residual and jacobian and matrix-free jacobian like the following, ? call DMDASNESSetFunctionLocal(DM_mech, INSERT_VALUES, formResidual, PETSC_NULL_SNES, err_PETSc) call DMSNESSetJacobianLocal(DM_mech, formJacobian, PETSC_NULL_SNES, err_PETSc) ? subroutine formJacobian(residual_subdomain,F,Jac_pre,Jac,dummy,err_PETSc) #include use petscmat implicit None DMDALocalInfo, dimension(DMDA_LOCAL_INFO_SIZE) :: & residual_subdomain !< DMDA info (needs to be named "in" for macros like XRANGE to work) real(pREAL), dimension(3,3,cells(1),cells(2),cells3), intent(in) :: & F !< deformation gradient field Mat :: Jac, Jac_pre PetscObject :: dummy PetscErrorCode :: err_PETSc PetscInt :: N_dof ! global number of DoF, maybe only a placeholder N_dof = 9*product(cells(1:2))*cells3 print*, 'in my jac' call MatCreateShell(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,N_dof,N_dof,0,Jac,err_PETSc) CHKERRQ(err_PETSc) call MatShellSetOperation(Jac,MATOP_MULT,GK_op,err_PETSc) CHKERRQ(err_PETSc) print*, 'in my jac' ! for jac preconditioner call MatCreateShell(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,N_dof,N_dof,0,Jac_pre,err_PETSc) CHKERRQ(err_PETSc) call MatShellSetOperation(Jac_pre,MATOP_MULT,GK_op,err_PETSc) CHKERRQ(err_PETSc) print*, 'in my jac' end subroutine formJacobian subroutine GK_op(Jac,dF,output,err_PETSc) real(pREAL), dimension(3,3,cells(1),cells(2),cells3), intent(in) :: & dF real(pREAL), dimension(3,3,cells(1),cells(2),cells3), intent(out) :: & output real(pREAL), dimension(3,3) :: & deltaF_aim = 0.0_pREAL Mat :: Jac PetscErrorCode :: err_PETSc integer :: i, j, k, e ? a lot of calculations ? 
print*, 'in GK op' end subroutine GK_op The first question is that: it seems I still need to explicitly define the interface of MatCreateShell() and MatShellSetOperation() to properly use them, even though I include them via ?use petscmat?. It is a little bit strange to me, since some examples do not perform this step. Then the main issue is that I can build my own Jacobian from my call back function formJacobian, and confirm my Jacobian is a shell matrix (by MatView). However, my customized operator GK_op is not called when solving the nonlinear system (not print my ?in GK op?). When I try to monitor my SNES, it gives me some conventional output not mentioning my matrix-free operations. So I guess my customized MATOP_MULT may be not associated with Jacobian. Or my configuration is somehow wrong. Could you help me solve this issue? Thanks, Yi ------------------------------------------------- Stay up to date and follow us on LinkedIn, Twitter and YouTube. Max-Planck-Institut f?r Eisenforschung GmbH Max-Planck-Stra?e 1 D-40237 D?sseldorf Handelsregister B 2533 Amtsgericht D?sseldorf Gesch?ftsf?hrung Prof. Dr. Gerhard Dehm Prof. Dr. J?rg Neugebauer Prof. Dr. Dierk Raabe Dr. Kai de Weldige Ust.-Id.-Nr.: DE 11 93 58 514 Steuernummer: 105 5891 1000 Please consider that invitations and e-mails of our institute are only valid if they end with ?@mpie.de. If you are not sure of the validity please contact rco at mpie.de Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de ------------------------------------------------- ------------------------------------------------- Stay up to date and follow us on LinkedIn, Twitter and YouTube. Max-Planck-Institut f?r Eisenforschung GmbH Max-Planck-Stra?e 1 D-40237 D?sseldorf Handelsregister B 2533 Amtsgericht D?sseldorf Gesch?ftsf?hrung Prof. Dr. Gerhard Dehm Prof. Dr. J?rg Neugebauer Prof. Dr. Dierk Raabe Dr. Kai de Weldige Ust.-Id.-Nr.: DE 11 93 58 514 Steuernummer: 105 5891 1000 Please consider that invitations and e-mails of our institute are only valid if they end with ?@mpie.de. If you are not sure of the validity please contact rco at mpie.de Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de ------------------------------------------------- ------------------------------------------------- Stay up to date and follow us on LinkedIn, Twitter and YouTube. Max-Planck-Institut f?r Eisenforschung GmbH Max-Planck-Stra?e 1 D-40237 D?sseldorf Handelsregister B 2533 Amtsgericht D?sseldorf Gesch?ftsf?hrung Prof. Dr. Gerhard Dehm Prof. Dr. J?rg Neugebauer Prof. Dr. Dierk Raabe Dr. Kai de Weldige Ust.-Id.-Nr.: DE 11 93 58 514 Steuernummer: 105 5891 1000 Please consider that invitations and e-mails of our institute are only valid if they end with ?@mpie.de. If you are not sure of the validity please contact rco at mpie.de Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de ------------------------------------------------- ------------------------------------------------- Stay up to date and follow us on LinkedIn, Twitter and YouTube. 
From bsmith at petsc.dev Tue Jan 16 09:15:40 2024
From: bsmith at petsc.dev (Barry Smith)
Date: Tue, 16 Jan 2024 10:15:40 -0500
Subject: [petsc-users] SNES seems not use my matrix-free operation
In-Reply-To:
References: <7B0A8642-74DC-44FF-906A-E11FDB95C331@petsc.dev> <49913a4e-b55d-46b2-9141-a6c1d1ca7cf7@mpie.de> <5EA5452B-7EDE-4D47-8A6C-1D03AA57AD58@petsc.dev> <10A79A13-67A7-4C2E-B6F6-6E73B58856A2@petsc.dev>
Message-ID: <88CF1E54-D8F5-460B-9275-F394721CD202@petsc.dev>

Send your test code that fails.

> On Jan 16, 2024, at 5:27 AM, Yi Hu wrote:
>
> Just to append the last mail. My defined interfaces for the ctx usage are as follows
>
> interface MatShellSetContext
>   subroutine MatShellSetContext(mat,ctx_F_base,ierr)
>     use petscmat
>     Mat :: mat
>     Vec :: ctx_F_base
>     PetscErrorCode :: ierr
>   end subroutine MatShellSetContext
> end interface MatShellSetContext
>
> interface MatShellGetContext
>   subroutine MatShellGetContext(mat,ctx_F_base,ierr)
>     use petscmat
>     Mat :: mat
>     Vec :: ctx_F_base
>     PetscErrorCode :: ierr
>   end subroutine MatShellGetContext
> end interface MatShellGetContext
>
> From: Yi Hu
> Sent: Tuesday, January 16, 2024 10:48 AM
> To: 'Barry Smith'
> Cc: 'petsc-users'
> Subject: RE: [petsc-users] SNES seems not use my matrix-free operation
>
> Dear Barry,
>
> Thanks for your reply. In fact, this still does not work, giving the error "code not yet written for matrix type shell". I also tried recording F_global as the base in my global shell matrix Jac_PETSc and fetching it in my GK_op, but the same error occurs. So the question is still: how can I assign the Jac in my formJacobian() to my globally defined Jac_PETSc when it is set up with SNESSetJacobian(snes,Jac_PETSc,Jac_PETSc,formJacobian,0,ierr)? As far as I understand, when using SNESSetJacobian() the jac argument is only for reserving memory, and the matrix is not written yet. Only a valid formJacobian() can lead to a write to the matrix.
My formJacobian maybe not valid (because the output not gives me print), so the Jac_PETSc is never written for the SNES. Maybe I am wrong about it. > > Best wishes, > Yi > > From: Barry Smith > > Sent: Thursday, January 11, 2024 4:47 PM > To: Yi Hu > > Cc: petsc-users > > Subject: Re: [petsc-users] SNES seems not use my matrix-free operation > > > The following assumes you are not using the shell matrix context for some other purpose > > subroutine formJacobian(snes,F_global,Jac,Jac_pre,dummy,err_PETSc) > > SNES :: snes > Vec :: F_global > > ! real(pREAL), dimension(3,3,cells(1),cells(2),cells3), intent(in) :: & > ! F > Mat :: Jac, Jac_pre > PetscObject :: dummy > PetscErrorCode :: err_PETSc > > print*, '@@ start build my jac' > > PetscCall(MatShellSetContext(Jac,F_global,ierr)) ! record the current base vector where the Jacobian is to be applied > print*, '@@ end build my jac' > > end subroutine formJacobian > > subroutine Gk_op > ... > Vec base > PetscCall(MatShellGetContext(Jac,base,ierr)) > > ! use base in the computation of your matrix-free Jacobian vector product > .... > > > > > On Jan 11, 2024, at 5:55?AM, Yi Hu > wrote: > > Now I understand a bit more about the workflow of set jacobian. It seems that the SNES can be really fine-grained. As you point out, J is built via formJacobian() callback, and can be based on previous solution (or the base vector u, as you mentioned). And then KSP can use a customized MATOP_MULT to solve the linear equations J(u)*x=rhs. > > So I followed your idea about removing DMSNESSetJacobianLocal() and did the following. > > ?? > call MatCreateShell(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,& > 9*product(cells(1:2))*cells3,9*product(cells(1:2))*cells3,& > 0,Jac_PETSc,err_PETSc) > CHKERRQ(err_PETSc) > call MatShellSetOperation(Jac_PETSc,MATOP_MULT,GK_op,err_PETSc) > CHKERRQ(err_PETSc) > call SNESSetJacobian(SNES_mech,Jac_PETSc,Jac_PETSc,formJacobian,0,err_PETSc) > CHKERRQ(err_PETSc) > ?? > > And my formJacobian() is > > subroutine formJacobian(snes,F_global,Jac,Jac_pre,dummy,err_PETSc) > > SNES :: snes > Vec :: F_global > > ! real(pREAL), dimension(3,3,cells(1),cells(2),cells3), intent(in) :: & > ! F > Mat :: Jac, Jac_pre > PetscObject :: dummy > PetscErrorCode :: err_PETSc > > print*, '@@ start build my jac' > > call MatCopy(Jac_PETSc,Jac,SAME_NONZERO_PATTERN,err_PETSc) > CHKERRQ(err_PETSc) > call MatCopy(Jac_PETSc,Jac_pre,SAME_NONZERO_PATTERN,err_PETSc) > CHKERRQ(err_PETSc) > ! Jac = Jac_PETSc > ! Jac_pre = Jac_PETSc > > print*, '@@ end build my jac' > > end subroutine formJacobian > > it turns out that no matter by a simple assignment or MatCopy(), the compiled program gives me the same error as before. So I guess the real jacobian is still not set. I wonder how to get around this and let this built jac in formJacobian() to be the same as my shell matrix. > > Yi > > From: Barry Smith > > Sent: Wednesday, January 10, 2024 4:27 PM > To: Yi Hu > > Cc: petsc-users > > Subject: Re: [petsc-users] SNES seems not use my matrix-free operation > > > By default if SNESSetJacobian() is not called with a function pointer PETSc attempts to compute the Jacobian matrix explicitly with finite differences and coloring. This doesn't makes sense with a shell matrix. Hence the error message below regarding MatFDColoringCreate(). > > DMSNESSetJacobianLocal() calls SNESSetJacobian() with a function pointer of SNESComputeJacobian_DMLocal() so preventing the error from triggering in your code. 
> > You can provide your own function to SNESSetJacobian() and thus not need to call DMSNESSetJacobianLocal(). What you do depends on how you want to record the "base" vector that tells your matrix-free multiply routine where the Jacobian matrix vector product is being applied, that is J(u)*x. u is the "base" vector which is passed to the function provided with SNESSetJacobian(). > > Barry > > > > > On Jan 10, 2024, at 6:20?AM, Yi Hu > wrote: > > Thanks for the clarification. It is more clear to me now about the global to local processes after checking the examples, e.g. ksp/ksp/tutorials/ex14f.F90. > > And for using Vec locally, I followed your advice of VecGet.. and VecRestore? In fact I used DMDAVecGetArrayReadF90() and some other relevant subroutines. > > For your comment on DMSNESSetJacobianLocal(). It seems that I need to use both SNESSetJacobian() and then DMSNESSetJacobianLocal() to get things working. When I do only SNESSetJacobian(), it does not work, meaning the following does not work > > ?? > call DMDASNESsetFunctionLocal(DM_mech,INSERT_VALUES,formResidual,PETSC_NULL_SNES,err_PETSc) > CHKERRQ(err_PETSc) > call MatCreateShell(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,& > 9*product(cells(1:2))*cells3,9*product(cells(1:2))*cells3,& > 0,Jac_PETSc,err_PETSc) > CHKERRQ(err_PETSc) > call MatShellSetOperation(Jac_PETSc,MATOP_MULT,GK_op,err_PETSc) > CHKERRQ(err_PETSc) > call SNESSetJacobian(SNES_mech,Jac_PETSc,Jac_PETSc,PETSC_NULL_FUNCTION,0,err_PETSc) > CHKERRQ(err_PETSc) > !call DMSNESsetJacobianLocal(DM_mech,formJacobian,PETSC_NULL_SNES,err_PETSc) > !CHKERRQ(err_PETSc) > call SNESsetConvergenceTest(SNES_mech,converged,PETSC_NULL_SNES,PETSC_NULL_FUNCTION,err_PETSc) > CHKERRQ(err_PETSc) > call SNESSetDM(SNES_mech,DM_mech,err_PETSc) > CHKERRQ(err_PETSc) > ?? > > It gives me the message > [0]PETSC ERROR: No support for this operation for this object type > [0]PETSC ERROR: Code not yet written for matrix type shell > [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
> [0]PETSC ERROR: Petsc Release Version 3.16.4, Feb 02, 2022 > [0]PETSC ERROR: Configure options PETSC_ARCH=linux-gnu --with-fortran-bindings --with-mpi-f90module-visibility=0 --download-fftw --download-hdf5 --download-hdf5-fortran-bindings --download-fblaslapack --download-ml --download-zlib > [0]PETSC ERROR: #1 MatFDColoringCreate() at /home/yi/app/petsc-3.16.4/src/mat/matfd/fdmatrix.c:471 > [0]PETSC ERROR: #2 SNESComputeJacobian_DMDA() at /home/yi/app/petsc-3.16.4/src/snes/utils/dmdasnes.c:173[0]PETSC ERROR: #3 SNESComputeJacobian() at /home/yi/app/petsc-3.16.4/src/snes/interface/snes.c:2864 > [0]PETSC ERROR: #4 SNESSolve_NEWTONLS() at /home/yi/app/petsc-3.16.4/src/snes/impls/ls/ls.c:222 > [0]PETSC ERROR: #5 SNESSolve() at /home/yi/app/petsc-3.16.4/src/snes/interface/snes.c:4809 > [0]PETSC ERROR: #6 User provided function() at User file:0 > [0]PETSC ERROR: #7 VecSetErrorIfLocked() at /home/yi/app/petsc-3.16.4/include/petscvec.h:623 > [0]PETSC ERROR: #8 VecGetArray() at /home/yi/app/petsc-3.16.4/src/vec/vec/interface/rvector.c:1769 > [0]PETSC ERROR: #9 User provided function() at User file:0 > [0]PETSC ERROR: #10 MatFDColoringCreate() at /home/yi/app/petsc-3.16.4/src/mat/matfd/fdmatrix.c:471 > [0]PETSC ERROR: #11 SNESComputeJacobian_DMDA() at /home/yi/app/petsc-3.16.4/src/snes/utils/dmdasnes.c:173 > [0]PETSC ERROR: #12 SNESComputeJacobian() at /home/yi/app/petsc-3.16.4/src/snes/interface/snes.c:2864 > [0]PETSC ERROR: #13 SNESSolve_NEWTONLS() at /home/yi/app/petsc-3.16.4/src/snes/impls/ls/ls.c:222 > [0]PETSC ERROR: #14 SNESSolve() at /home/yi/app/petsc-3.16.4/src/snes/interface/snes.c:4809 > > It seems that I have to use a DMSNESSetJacobianLocal() to ?activate? the use of my shell matrix, although the formJacobian() in the DMSNESSetJacobianLocal() is doing nothing. > > Best wishes, > Yi > > > > From: Barry Smith > > Sent: Tuesday, January 9, 2024 4:49 PM > To: Yi Hu > > Cc: petsc-users > > Subject: Re: [petsc-users] SNES seems not use my matrix-free operation > > However, my GK_op (which is the reloaded MATOP_MULT) gives me some problem. It is entered but then crashed with Segmentation Violation error. So I guess my indices may be wrong. I wonder do I need to use the local Vec (of dF), and should my output Vec also be in the correct shape (i.e. after calculation I need to transform back into a Vec)? As you can see here, my dF is a tensor defined on every grid point. > > > The input for the matrix-vector product is a global vector, as is the result. (Not like the arguments to DMSNESSetJacobianLocal). > > This means that your MATOP_MULT function needs to do the DMGlobalToLocal() vector operation first then the "unwrapping" from the vector to the array format at the beginning of the routine. Similarly it needs to "unwrap" the result vector as an array. See src/snes/tutorials/ex14f.F90 and in particular the code block > > PetscCall(DMGlobalToLocalBegin(da,X,INSERT_VALUES,localX,ierr)) > PetscCall(DMGlobalToLocalEnd(da,X,INSERT_VALUES,localX,ierr)) > > ! Get pointers to vector data > > PetscCall(VecGetArrayReadF90(localX,xx,ierr)) > PetscCall(VecGetArrayF90(F,ff,ierr)) > > Barry > > You really shouldn't be using DMSNESSetJacobianLocal() for your code. Basically all the DMSNESSetJacobianLocal() gives you is that it automatically handles the global to local mapping and unwrapping of the vector to an array, but it doesn't work for shell matrices. > > > > > > On Jan 9, 2024, at 6:30?AM, Yi Hu > wrote: > > Dear Barry, > > Thanks for your help. 
> > It works when doing first SNESSetJacobian() with my created shell matrix Jac in the main (or module) and then DMSNESSetJacobianLocal() to associate with my DM and an dummy formJacobian callback (which is doing nothing). My SNES can now recognize my shell matrix and do my customized operation. > > However, my GK_op (which is the reloaded MATOP_MULT) gives me some problem. It is entered but then crashed with Segmentation Violation error. So I guess my indices may be wrong. I wonder do I need to use the local Vec (of dF), and should my output Vec also be in the correct shape (i.e. after calculation I need to transform back into a Vec)? As you can see here, my dF is a tensor defined on every grid point. > > Best wishes, > Yi > > From: Barry Smith > > Sent: Monday, January 8, 2024 6:41 PM > To: Yi Hu > > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] SNES seems not use my matrix-free operation > > > "formJacobian" should not be __creating__ the matrices. Here "form" means computing the numerical values in the matrix (or when using a shell matrix it means keeping a copy of X so that your custom matrix-free multiply knows the base location where the matrix free Jacobian-vector products are computed.) > > You create the shell matrices up in your main program and pass them in with SNESSetJacobian(). > > Try first calling SNESSetJacobian() to provide the matrices (provide a dummy function argument) and then call DMSNESSetJacobianLocal() to provide your "formjacobian" function (that does not create the matrices). > > Barry > > > Yes, "form" is a bad word that should not have been used in our code. > > > > > > > > On Jan 8, 2024, at 12:24?PM, Yi Hu > wrote: > > Dear PETSc Experts, > > I am implementing a matrix-free jacobian for my SNES solver in Fortran. (command line option -snes_type newtonls -ksp_type gmres) > > In the main program, I define my residual and jacobian and matrix-free jacobian like the following, > > ? > call DMDASNESSetFunctionLocal(DM_mech, INSERT_VALUES, formResidual, PETSC_NULL_SNES, err_PETSc) > call DMSNESSetJacobianLocal(DM_mech, formJacobian, PETSC_NULL_SNES, err_PETSc) > ? > > subroutine formJacobian(residual_subdomain,F,Jac_pre,Jac,dummy,err_PETSc) > > #include > use petscmat > implicit None > DMDALocalInfo, dimension(DMDA_LOCAL_INFO_SIZE) :: & > residual_subdomain !< DMDA info (needs to be named "in" for macros like XRANGE to work) > real(pREAL), dimension(3,3,cells(1),cells(2),cells3), intent(in) :: & > F !< deformation gradient field > Mat :: Jac, Jac_pre > PetscObject :: dummy > PetscErrorCode :: err_PETSc > PetscInt :: N_dof ! global number of DoF, maybe only a placeholder > > N_dof = 9*product(cells(1:2))*cells3 > > print*, 'in my jac' > > call MatCreateShell(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,N_dof,N_dof,0,Jac,err_PETSc) > CHKERRQ(err_PETSc) > call MatShellSetOperation(Jac,MATOP_MULT,GK_op,err_PETSc) > CHKERRQ(err_PETSc) > > print*, 'in my jac' > > ! 
for jac preconditioner
> call MatCreateShell(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,N_dof,N_dof,0,Jac_pre,err_PETSc)
> CHKERRQ(err_PETSc)
> call MatShellSetOperation(Jac_pre,MATOP_MULT,GK_op,err_PETSc)
> CHKERRQ(err_PETSc)
>
> print*, 'in my jac'
>
> end subroutine formJacobian
>
> subroutine GK_op(Jac,dF,output,err_PETSc)
>   real(pREAL), dimension(3,3,cells(1),cells(2),cells3), intent(in) :: &
>     dF
>   real(pREAL), dimension(3,3,cells(1),cells(2),cells3), intent(out) :: &
>     output
>   real(pREAL), dimension(3,3) :: &
>     deltaF_aim = 0.0_pREAL
>
>   Mat :: Jac
>   PetscErrorCode :: err_PETSc
>
>   integer :: i, j, k, e
>
>   ... a lot of calculations ...
>
>   print*, 'in GK op'
>
> end subroutine GK_op
>
> The first question is that: it seems I still need to explicitly define the interfaces of MatCreateShell() and MatShellSetOperation() to properly use them, even though I include them via "use petscmat". It is a little bit strange to me, since some examples do not perform this step.
>
> Then the main issue is that I can build my own Jacobian from my callback function formJacobian, and confirm that my Jacobian is a shell matrix (by MatView). However, my customized operator GK_op is not called when solving the nonlinear system (it does not print my 'in GK op'). When I try to monitor my SNES, it gives me some conventional output not mentioning my matrix-free operations. So I guess my customized MATOP_MULT may not be associated with the Jacobian, or my configuration is somehow wrong. Could you help me solve this issue?
>
> Thanks,
> Yi
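(Putting the advice in this thread together, the structure Barry describes is: create the shell matrix once in the main program, register it with SNESSetJacobian(), and let the Jacobian callback do nothing more than record the current base vector with MatShellSetContext(), which the multiply routine then picks up with MatShellGetContext(). A minimal sketch of that structure is given below. Apart from the PETSc calls themselves the names are placeholders loosely taken from the thread; this is an illustration of the suggested call sequence, not a tested fix. Note also that the KSP preconditioner has to be one that can work with a shell matrix, for example -pc_type none, since a MATSHELL cannot be factored.)

! (error checking with CHKERRQ omitted for brevity)

! --- once, in the main program / module ---
call MatCreateShell(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, &
                    N_dof, N_dof, 0, Jac_shell, err_PETSc)
call MatShellSetOperation(Jac_shell, MATOP_MULT, GK_op, err_PETSc)
call SNESSetJacobian(SNES_mech, Jac_shell, Jac_shell, formJacobian, 0, err_PETSc)

! --- the Jacobian callback only records the base vector ---
subroutine formJacobian(snes, F_global, Jac, Jac_pre, dummy, err_PETSc)
  SNES           :: snes
  Vec            :: F_global
  Mat            :: Jac, Jac_pre
  PetscObject    :: dummy
  PetscErrorCode :: err_PETSc
  ! keep a reference to the state the Jacobian is linearized around;
  ! GK_op retrieves it with MatShellGetContext() when it forms J(F)*dF
  call MatShellSetContext(Jac, F_global, err_PETSc)
end subroutine formJacobian

! --- inside GK_op, the base vector is recovered like this ---
! Vec :: F_base
! call MatShellGetContext(Jac, F_base, err_PETSc)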
From sawsan.shatanawi at wsu.edu Tue Jan 16 12:43:08 2024
From: sawsan.shatanawi at wsu.edu (Shatanawi, Sawsan Muhammad)
Date: Tue, 16 Jan 2024 18:43:08 +0000
Subject: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code
In-Reply-To:
References: <17B6144E-4F3A-4173-9818-4B03669736C4@petsc.dev> <9BD74F88-20FD-4021-B02A-195A58B72282@petsc.dev>
Message-ID:

Hello all,

Thank you for your valuable help. I will do your recommendations and hope it will run without any issues.
Bests, Sawsan ________________________________ From: Junchao Zhang Sent: Friday, January 12, 2024 8:46 AM To: Shatanawi, Sawsan Muhammad Cc: Barry Smith ; Matthew Knepley ; Mark Adams ; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code [EXTERNAL EMAIL] Hi, Sawsan, First in test_main.F90, you need to call VecGetArrayF90(temp_solution, H_vector, ierr) and VecRestoreArrayF90 (temp_solution, H_vector, ierr) as Barry mentioned. Secondly, in the loop of test_main.F90, it calls GW_solver(). Within it, it calls PetscInitialize()/PetscFinalize(). But without MPI being initialized, PetscInitialize()/PetscFinalize() can only be called once. do timestep =2 , NTSP call GW_boundary_conditions(timestep-1) !print *,HNEW(1,1,1) call GW_elevation() ! print *, GWTOP(2,2,2) call GW_conductance() ! print *, CC(2,2,2) call GW_recharge() ! print *, B_Rech(5,4) call GW_pumping(timestep-1) ! print *, B_pump(2,2,2) call GW_SW(timestep-1) print *,B_RIVER (2,2,2) call GW_solver(timestep-1,N) call GW_deallocate_loop() end do A solution is to delete PetscInitialize()/PetscFinalize() in GW_solver_try.F90 and move it to test_main.F90, outside the do loop. diff --git a/test_main.F90 b/test_main.F90 index b5997c55..107bd3ee 100644 --- a/test_main.F90 +++ b/test_main.F90 @@ -1,5 +1,6 @@ program test_GW +#include use petsc use GW_constants use GW_param_by_user @@ -8,6 +9,9 @@ program test_GW implicit none integer :: N integer :: timestep + PetscErrorCode ierr + + call PetscInitialize(ierr) call GW_domain(N) !print *, "N=",N !print *, DELTAT @@ -37,4 +41,5 @@ program test_GW end do print *, HNEW(NCOL,3,2) call GW_deallocate () + call PetscFinalize(ierr) end program test_GW With that, the MPI error will be fixed. The code could run to gw_deallocate () before abort. There are other memory errors. You can install/use valgrind to fix them. Run it with valgrind ./GW.exe and look through the output Thanks. --Junchao Zhang On Thu, Jan 11, 2024 at 10:49?PM Shatanawi, Sawsan Muhammad > wrote: Hello, Thank you all for your help. I have changed VecGetArray to VecGetArrayF90, and the location of destory call. but I want to make sure that VecGet ArrayF90 is to make a new array( vector) that I can use in the rest of my Fortran code? when I run it and debugged it, I got 5.2000000E-03 50.00000 10.00000 0.0000000E+00 PETSC: Attaching gdb to /weka/data/lab/richey/sawsan/GW_CODE/code2024/SS_GWM/./GW.exe of pid 33065 on display :0.0 on machine sn16 Unable to start debugger in xterm: No such file or directory 0.0000000E+00 Attempting to use an MPI routine after finalizing MPICH srun: error: sn16: task 0: Exited with exit code 1 [sawsan.shatanawi at login-p2n02 SS_GWM]$ gdb ./GW/exe GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-100.el7 Copyright (C) 2013 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later > This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-redhat-linux-gnu". For bug reporting instructions, please see: >... ./GW/exe: No such file or directory. (gdb) run Starting program: No executable file specified. Use the "file" or "exec-file" command. (gdb) bt No stack. 
(gdb) If the highlighted line is the error, I don't know why when I write gdb , it does not show me the location of error The code : sshatanawi/SS_GWM (github.com) I really appreciate your helps Sawsan ________________________________ From: Barry Smith > Sent: Wednesday, January 10, 2024 5:35 PM To: Junchao Zhang > Cc: Shatanawi, Sawsan Muhammad >; Mark Adams >; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code [EXTERNAL EMAIL] On Jan 10, 2024, at 6:49?PM, Junchao Zhang > wrote: Hi, Sawsan, I could build your code and I also could gdb it. $ gdb ./GW.exe ... $ Thread 1 "GW.exe" received signal SIGSEGV, Segmentation fault. 0x00007ffff1e6d44f in vecgetarray_ (x=0x7fffffffa718, fa=0x0, ia=0x7fffffffa75c, ierr=0x0) at /scratch/jczhang/petsc/src/vec/vec/interface/ftn-custom/zvectorf.c:257 257 *ierr = VecGetArray(*x, &lx); (gdb) bt #0 0x00007ffff1e6d44f in vecgetarray_ (x=0x7fffffffa718, fa=0x0, ia=0x7fffffffa75c, ierr=0x0) at /scratch/jczhang/petsc/src/vec/vec/interface/ftn-custom/zvectorf.c:257 #1 0x000000000040b6e3 in gw_solver (t_s=1.40129846e-45, n=300) at GW_solver_try.F90:169 #2 0x000000000040c6a8 in test_gw () at test_main.F90:35 ierr=0x0 caused the segfault. See https://petsc.org/release/manualpages/Vec/VecGetArray/#vecgetarray, you should use VecGetArrayF90 instead. BTW, Barry, the code https://github.com/sshatanawi/SS_GWM/blob/main/GW_solver_try.F90#L169 has "call VecGetArray(temp_solution, H_vector, ierr)". I don't find petsc Fortran examples doing VecGetArray. Do we still support it? This is not the correct calling sequence for VecGetArray() from Fortran. Regardless, definitely should not be writing any new code that uses VecGetArray() from Fortran. Should use VecGetArrayF90(). --Junchao Zhang On Wed, Jan 10, 2024 at 2:38?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: Hello all, I hope you are doing well. Generally, I use gdb to debug the code. I got the attached error message. I have tried to add the flag -start_in_debugger in the make file, but it didn't work, so it seems I was doing it in the wrong way This is the link for the whole code: sshatanawi/SS_GWM (github.com) [https://opengraph.githubassets.com/9eb6cd14baf12f04848ed209b6f502415eb531bdd7b3a5f9696af68663b870c0/sshatanawi/SS_GWM] GitHub - sshatanawi/SS_GWM Contribute to sshatanawi/SS_GWM development by creating an account on GitHub. github.com ? You can read the description of the code in " Model Desprciption.pdf" the compiling file is makefile_f90 where you can find the linked code files I really appreciate your help Bests, Sawsan ________________________________ From: Mark Adams > Sent: Friday, January 5, 2024 4:53 AM To: Shatanawi, Sawsan Muhammad > Cc: Matthew Knepley >; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code [EXTERNAL EMAIL] This is a segv. As Matt said, you need to use a debugger for this or add print statements to narrow down the place where this happens. You will need to learn how to use debuggers to do your project so you might as well start now. If you have a machine with a GUI debugger that is easier but command line debuggers are good to learn anyway. I tend to run debuggers directly (eg, lldb ./a.out -- program-args ...) and use a GUI debugger (eg, Totalview or DDT) if available. Mark On Wed, Dec 20, 2023 at 10:02?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: Hello Matthew, Thank you for your help. 
I am sorry that I keep coming back with my error messages, but I reached a point that I don't know how to fix them, and I don't understand them easily. The list of errors is getting shorter, now I am getting the attached error messages Thank you again, Sawsan ________________________________ From: Matthew Knepley > Sent: Wednesday, December 20, 2023 6:54 PM To: Shatanawi, Sawsan Muhammad > Cc: Barry Smith >; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code [EXTERNAL EMAIL] On Wed, Dec 20, 2023 at 9:49?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: Hello Barry, Thank you a lot for your help, Now I am getting the attached error message. Do not destroy the PC from KSPGetPC() THanks, Matt Bests, Sawsan ________________________________ From: Barry Smith > Sent: Wednesday, December 20, 2023 6:32 PM To: Shatanawi, Sawsan Muhammad > Cc: Mark Adams >; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code [EXTERNAL EMAIL] Instead of call PCCreate(PETSC_COMM_WORLD, pc, ierr) call PCSetType(pc, PCILU,ierr) ! Choose a preconditioner type (ILU) call KSPSetPC(ksp, pc,ierr) ! Associate the preconditioner with the KSP solver do call KSPGetPC(ksp,pc,ierr) call PCSetType(pc, PCILU,ierr) Do not call KSPSetUp(). It will be taken care of automatically during the solve On Dec 20, 2023, at 8:52?PM, Shatanawi, Sawsan Muhammad via petsc-users > wrote: Hello, I don't think that I set preallocation values when I created the matrix, would you please have look at my code. It is just the petsc related part from my code. I was able to fix some of the error messages. Now I have a new set of error messages related to the KSP solver (attached) I appreciate your help Sawsan ________________________________ From: Mark Adams > Sent: Wednesday, December 20, 2023 6:44 AM To: Shatanawi, Sawsan Muhammad > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code [EXTERNAL EMAIL] Did you set preallocation values when you created the matrix? Don't do that. On Wed, Dec 20, 2023 at 9:36?AM Shatanawi, Sawsan Muhammad > wrote: Hello, I am trying to create a sparse matrix( which is as I believe a zero matrix) then adding some nonzero elements to it over a loop, then assembling it Get Outlook for iOS ________________________________ From: Mark Adams > Sent: Wednesday, December 20, 2023 2:48 AM To: Shatanawi, Sawsan Muhammad > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code [EXTERNAL EMAIL] I am guessing that you are creating a matrix, adding to it, finalizing it ("assembly"), and then adding to it again, which is fine, but you are adding new non-zeros to the sparsity pattern. If this is what you want then you can tell the matrix to let you do that. Otherwise you have a bug. Mark On Tue, Dec 19, 2023 at 9:50?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: Hello everyone, I hope this email finds you well. My Name is Sawsan Shatanawi, and I am currently working on developing a Fortran code for simulating groundwater flow in a 3D system. The code involves solving a nonlinear system, and I have created the matrix to be solved using the PCG solver and Picard iteration. However, when I tried to assign it as a PETSc matrix I started getting a lot of error messages. 
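(For reference, a stripped-down sketch of the linear-solve part along the lines suggested above, that is PetscInitialize()/PetscFinalize() called only once outside the time loop, KSPGetPC() instead of PCCreate()/KSPSetPC(), no explicit KSPSetUp(), and VecGetArrayF90()/VecRestoreArrayF90() for reading the solution, is shown below. The variable names only loosely follow the thread and error checking is omitted; it is an illustration of the recommended call sequence, not code from the attached files.)

KSP :: ksp
PC  :: pc
Mat :: A
Vec :: rhs, temp_solution
PetscScalar, pointer :: H_vector(:)   ! must be a pointer for VecGetArrayF90
PetscErrorCode :: ierr

! one-time setup, after PetscInitialize() and outside the time-stepping loop
call KSPCreate(PETSC_COMM_WORLD, ksp, ierr)
call KSPSetType(ksp, KSPCG, ierr)              ! PCG, as used in the model
call KSPGetPC(ksp, pc, ierr)                   ! use the KSP's own PC
call PCSetType(pc, PCILU, ierr)                ! instead of PCCreate()/KSPSetPC()
call KSPSetFromOptions(ksp, ierr)

! inside the Picard / time-stepping loop
call KSPSetOperators(ksp, A, A, ierr)
call KSPSolve(ksp, rhs, temp_solution, ierr)   ! no KSPSetUp() needed, KSPSolve() handles it

! read the solution back into the model's own arrays
call VecGetArrayF90(temp_solution, H_vector, ierr)
! ... copy H_vector(:) into HNEW ...
call VecRestoreArrayF90(temp_solution, H_vector, ierr)

! once, at the very end of the program
call KSPDestroy(ksp, ierr)                     ! do not destroy the PC obtained from KSPGetPC()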
I am kindly asking if someone can help me, I would be happy to share my code with him/her. Please find the attached file contains a list of errors I have gotten Thank you in advance for your time and assistance. Best regards, Sawsan -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From yuanxi at advancesoft.jp Thu Jan 18 01:46:47 2024 From: yuanxi at advancesoft.jp (=?UTF-8?B?6KKB54WV?=) Date: Thu, 18 Jan 2024 16:46:47 +0900 Subject: [petsc-users] MatAssemblyBegin freezes during MPI communication Message-ID: Dear PETSc Experts, My FEM program works well generally, but in some specific cases with multiple CPUs are used, it freezes when calling MatAssemblyBegin where PMPI_Allreduce is called (see attached file). After some investigation, I found that it is most probably due to ? MatSetValue is not called from all CPUs before MatAssemblyBegin For example, when 4 CPUs are used, if there are elements in CPU 0,1,2 but no elements in CPU 3, then all CPUs other than CPU 3 would call MatSetValue function. I want to know 1. If my conjecture could be right? And If so 2. Are there any convenient means to avoid this problem? Thanks, Xi YUAN, PhD Solid Mechanics -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: aa.PNG Type: image/png Size: 112520 bytes Desc: not available URL: From knepley at gmail.com Thu Jan 18 06:46:44 2024 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 18 Jan 2024 07:46:44 -0500 Subject: [petsc-users] MatAssemblyBegin freezes during MPI communication In-Reply-To: References: Message-ID: On Thu, Jan 18, 2024 at 2:47?AM ?? wrote: > Dear PETSc Experts, > > My FEM program works well generally, but in some specific cases with > multiple CPUs are used, it freezes when calling MatAssemblyBegin where > PMPI_Allreduce is called (see attached file). > > After some investigation, I found that it is most probably due to > > ? MatSetValue is not called from all CPUs before MatAssemblyBegin > > For example, when 4 CPUs are used, if there are elements in CPU 0,1,2 but > no elements in CPU 3, then all CPUs other than CPU 3 would call > MatSetValue function. I want to know > > 1. If my conjecture could be right? And If so > No, you do not have to call MatSetValue() from all processes. > 2. Are there any convenient means to avoid this problem? > Are you calling MatAssemblyBegin() from all processes? This is necessary. Thanks, Matt > Thanks, > Xi YUAN, PhD Solid Mechanics > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Thu Jan 18 07:20:00 2024 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Thu, 18 Jan 2024 07:20:00 -0600 Subject: [petsc-users] MatAssemblyBegin freezes during MPI communication In-Reply-To: References: Message-ID: On Thu, Jan 18, 2024 at 1:47?AM ?? 
wrote: > Dear PETSc Experts, > > My FEM program works well generally, but in some specific cases with > multiple CPUs are used, it freezes when calling MatAssemblyBegin where > PMPI_Allreduce is called (see attached file). > > After some investigation, I found that it is most probably due to > > ? MatSetValue is not called from all CPUs before MatAssemblyBegin > > For example, when 4 CPUs are used, if there are elements in CPU 0,1,2 but > no elements in CPU 3, then all CPUs other than CPU 3 would call > MatSetValue function. I want to know > > 1. If my conjecture could be right? And If so > No. All processes do MPI_Allreduce to know if there are incoming values set by others. To know why hanging, you can attach gdb to all MPI processes to see where they are. > > 2. Are there any convenient means to avoid this problem? > > Thanks, > Xi YUAN, PhD Solid Mechanics > -------------- next part -------------- An HTML attachment was scrubbed... URL: From anna at oden.utexas.edu Wed Jan 17 15:51:08 2024 From: anna at oden.utexas.edu (Yesypenko, Anna) Date: Wed, 17 Jan 2024 21:51:08 +0000 Subject: [petsc-users] HashMap Error when populating AIJCUSPARSE matrix Message-ID: Dear Petsc users/developers, I'm experiencing a bug when using petsc4py with GPU support. It may be my mistake in how I set up a AIJCUSPARSE matrix. For larger matrices, I sometimes encounter a error in assigning matrix values; the error is thrown in PetscHMapIJVQuerySet(). Here is a minimum snippet that populates a sparse tridiagonal matrix. ``` from petsc4py import PETSc from scipy.sparse import diags import numpy as np n = int(5e5); nnz = 3 * np.ones(n, dtype=np.int32); nnz[0] = nnz[-1] = 2 A = PETSc.Mat(comm=PETSc.COMM_WORLD) A.createAIJ(size=[n,n],comm=PETSc.COMM_WORLD,nnz=nnz) A.setType('aijcusparse') tmp = diags([-1,2,-1],[-1,0,+1],shape=(n,n)).tocsr() A.setValuesCSR(tmp.indptr,tmp.indices,tmp.data) ####### this is the line where the error is thrown. A.assemble() ``` The error trace is below: ``` File "petsc4py/PETSc/Mat.pyx", line 2603, in petsc4py.PETSc.Mat.setValuesCSR File "petsc4py/PETSc/petscmat.pxi", line 1039, in petsc4py.PETSc.matsetvalues_csr File "petsc4py/PETSc/petscmat.pxi", line 1032, in petsc4py.PETSc.matsetvalues_ijv petsc4py.PETSc.Error: error code 76 [0] MatSetValues() at /work/06368/annayesy/ls6/petsc/src/mat/interface/matrix.c:1497 [0] MatSetValues_Seq_Hash() at /work/06368/annayesy/ls6/petsc/include/../src/mat/impls/aij/seq/seqhashmatsetvalues.h:52 [0] PetscHMapIJVQuerySet() at /work/06368/annayesy/ls6/petsc/include/petsc/private/hashmapijv.h:10 [0] Error in external library [0] [khash] Assertion: `ret >= 0' failed. ``` If I run the same script a handful of times, it will run without errors eventually. Does anyone have insight on why it is behaving this way? I'm running on a node with 3x NVIDIA A100 PCIE 40GB. Thank you! Anna -------------- next part -------------- An HTML attachment was scrubbed... URL: From nicolas.tardieu at edf.fr Thu Jan 18 08:20:22 2024 From: nicolas.tardieu at edf.fr (TARDIEU Nicolas) Date: Thu, 18 Jan 2024 14:20:22 +0000 Subject: [petsc-users] Undestanding how to increase the overlap Message-ID: Dear PETSc Team, I am trying to understand how to increase the overlap of a matrix. I wrote the attached petsc4py script where I build a simple matrix and play with the increaseOverlap method. Unfortunately, before and after the call, nothing changes in the index set. FYI, I have tried to mimic src/ksp/ksp/tutorials/ex82.c line 72:76. 
Here is how I run the script : "mpiexec -n 2 python test_overlap.py" Could you please indicate what I am missing ? Regards, Nicolas -- Nicolas Tardieu Ing PhD Computational Mechanics EDF - R&D Dpt ERMES PARIS-SACLAY, FRANCE Ce message et toutes les pi?ces jointes (ci-apr?s le 'Message') sont ?tablis ? l'intention exclusive des destinataires et les informations qui y figurent sont strictement confidentielles. Toute utilisation de ce Message non conforme ? sa destination, toute diffusion ou toute publication totale ou partielle, est interdite sauf autorisation expresse. Si vous n'?tes pas le destinataire de ce Message, il vous est interdit de le copier, de le faire suivre, de le divulguer ou d'en utiliser tout ou partie. Si vous avez re?u ce Message par erreur, merci de le supprimer de votre syst?me, ainsi que toutes ses copies, et de n'en garder aucune trace sur quelque support que ce soit. Nous vous remercions ?galement d'en avertir imm?diatement l'exp?diteur par retour du message. Il est impossible de garantir que les communications par messagerie ?lectronique arrivent en temps utile, sont s?curis?es ou d?nu?es de toute erreur ou virus. ____________________________________________________ This message and any attachments (the 'Message') are intended solely for the addressees. The information contained in this Message is confidential. Any use of information contained in this Message not in accord with its purpose, any dissemination or disclosure, either whole or partial, is prohibited except formal approval. If you are not the addressee, you may not copy, forward, disclose or use any part of it. If you have received this message in error, please delete it and all copies from your system and notify the sender immediately by return message. E-mail communication cannot be guaranteed to be timely secure, error or virus-free. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: test_overlap.py Type: text/x-python Size: 708 bytes Desc: test_overlap.py URL: From knepley at gmail.com Thu Jan 18 08:29:15 2024 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 18 Jan 2024 09:29:15 -0500 Subject: [petsc-users] Undestanding how to increase the overlap In-Reply-To: References: Message-ID: On Thu, Jan 18, 2024 at 9:24?AM TARDIEU Nicolas via petsc-users < petsc-users at mcs.anl.gov> wrote: > Dear PETSc Team, > > I am trying to understand how to increase the overlap of a matrix. > I wrote the attached petsc4py script where I build a simple matrix and > play with the increaseOverlap method. Unfortunately, before and after the > call, nothing changes in the index set. FYI, I have tried to mimic > src/ksp/ksp/tutorials/ex82.c > line 72:76. > Here is how I run the script : "mpiexec -n 2 python test_overlap.py" > > Could you please indicate what I am missing ? > Usually matrix functions like this take input in global indices. It looks like your isovl is in local indices. Am I reading that correctly? Thanks, Matt > Regards, > Nicolas > -- > Nicolas Tardieu > Ing PhD Computational Mechanics > EDF - R&D Dpt ERMES > PARIS-SACLAY, FRANCE > > > Ce message et toutes les pi?ces jointes (ci-apr?s le 'Message') sont > ?tablis ? l'intention exclusive des destinataires et les informations qui y > figurent sont strictement confidentielles. Toute utilisation de ce Message > non conforme ? sa destination, toute diffusion ou toute publication totale > ou partielle, est interdite sauf autorisation expresse. 
> > Si vous n'?tes pas le destinataire de ce Message, il vous est interdit de > le copier, de le faire suivre, de le divulguer ou d'en utiliser tout ou > partie. Si vous avez re?u ce Message par erreur, merci de le supprimer de > votre syst?me, ainsi que toutes ses copies, et de n'en garder aucune trace > sur quelque support que ce soit. Nous vous remercions ?galement d'en > avertir imm?diatement l'exp?diteur par retour du message. > > Il est impossible de garantir que les communications par messagerie > ?lectronique arrivent en temps utile, sont s?curis?es ou d?nu?es de toute > erreur ou virus. > ____________________________________________________ > > This message and any attachments (the 'Message') are intended solely for > the addressees. The information contained in this Message is confidential. > Any use of information contained in this Message not in accord with its > purpose, any dissemination or disclosure, either whole or partial, is > prohibited except formal approval. > > If you are not the addressee, you may not copy, forward, disclose or use > any part of it. If you have received this message in error, please delete > it and all copies from your system and notify the sender immediately by > return message. > > E-mail communication cannot be guaranteed to be timely secure, error or > virus-free. > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Jan 18 08:33:24 2024 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 18 Jan 2024 09:33:24 -0500 Subject: [petsc-users] HashMap Error when populating AIJCUSPARSE matrix In-Reply-To: References: Message-ID: On Thu, Jan 18, 2024 at 9:04?AM Yesypenko, Anna wrote: > Dear Petsc users/developers, > > I'm experiencing a bug when using petsc4py with GPU support. It may be my > mistake in how I set up a AIJCUSPARSE matrix. > For larger matrices, I sometimes encounter a error in assigning matrix > values; the error is thrown in PetscHMapIJVQuerySet(). > Here is a minimum snippet that populates a sparse tridiagonal matrix. > > ``` > from petsc4py import PETSc > from scipy.sparse import diags > import numpy as np > > n = int(5e5); > > nnz = 3 * np.ones(n, dtype=np.int32); nnz[0] = nnz[-1] = 2 > A = PETSc.Mat(comm=PETSc.COMM_WORLD) > A.createAIJ(size=[n,n],comm=PETSc.COMM_WORLD,nnz=nnz) > A.setType('aijcusparse') > tmp = diags([-1,2,-1],[-1,0,+1],shape=(n,n)).tocsr() > A.setValuesCSR(tmp.indptr,tmp.indices,tmp.data) > ####### this is the line where the error is thrown. > A.assemble() > ``` > I don't have scipy installed. Since the matrix is so small, can you print tmp.indptr,tmp.indices,tmp.data when you run? It seems to be either bad values there, or something is wrong with passing those pointers. 
Thanks, Matt > The error trace is below: > ``` > File "petsc4py/PETSc/Mat.pyx", line 2603, in > petsc4py.PETSc.Mat.setValuesCSR > File "petsc4py/PETSc/petscmat.pxi", line 1039, in > petsc4py.PETSc.matsetvalues_csr > File "petsc4py/PETSc/petscmat.pxi", line 1032, in > petsc4py.PETSc.matsetvalues_ijv > petsc4py.PETSc.Error: error code 76 > [0] MatSetValues() at > /work/06368/annayesy/ls6/petsc/src/mat/interface/matrix.c:1497 > [0] MatSetValues_Seq_Hash() at > /work/06368/annayesy/ls6/petsc/include/../src/mat/impls/aij/seq/seqhashmatsetvalues.h:52 > [0] PetscHMapIJVQuerySet() at > /work/06368/annayesy/ls6/petsc/include/petsc/private/hashmapijv.h:10 > [0] Error in external library > [0] [khash] Assertion: `ret >= 0' failed. > ``` > > If I run the same script a handful of times, it will run without errors > eventually. > Does anyone have insight on why it is behaving this way? I'm running on a > node with 3x NVIDIA A100 PCIE 40GB. > > Thank you! > Anna > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From nicolas.tardieu at edf.fr Thu Jan 18 08:46:22 2024 From: nicolas.tardieu at edf.fr (TARDIEU Nicolas) Date: Thu, 18 Jan 2024 14:46:22 +0000 Subject: [petsc-users] Undestanding how to increase the overlap In-Reply-To: References: Message-ID: Hi Matt, The isovl is in global numbering. I am pasting the output of the script : #-------------------------------------------------------------------------------- # The matrix # ----------- [1,0]:Mat Object: 2 MPI processes [1,0]: type: mpiaij [1,0]:row 0: (0, 0.) [1,0]:row 1: (1, 1.) [1,0]:row 2: (2, 2.) [1,0]:row 3: (3, 3.) [1,0]:row 4: (4, 4.) [1,0]:row 5: (5, 5.) #-------------------------------------------------------------------------------- # The IS isovl before the call to increaseOverlap # ----------- [1,0]:locSize before = 3 [1,0]:IS Object: 1 MPI processes [1,0]: type: stride [1,0]:Number of indices in (stride) set 3 [1,0]:0 0 [1,0]:1 1 [1,0]:2 2 [1,1]:locSize before = 3 [1,1]:IS Object: 1 MPI processes [1,1]: type: stride [1,1]:Number of indices in (stride) set 3 [1,1]:0 3 [1,1]:1 4 [1,1]:2 5 #-------------------------------------------------------------------------------- # The IS isovl after the call to increaseOverlap # ----------- [1,0]:locSize after = 3 [1,0]:IS Object: 1 MPI processes [1,0]: type: general [1,0]:Number of indices in set 3 [1,0]:0 0 [1,0]:1 1 [1,0]:2 2 [1,1]:locSize after = 3 [1,1]:IS Object: 1 MPI processes [1,1]: type: general [1,1]:Number of indices in set 3 [1,1]:0 3 [1,1]:1 4 [1,1]:2 5 #-------------------------------------------------------------------------------- Regards, Nicolas -- Nicolas Tardieu Ing PhD Computational Mechanics EDF - R&D Dpt ERMES PARIS-SACLAY, FRANCE ________________________________ De : knepley at gmail.com Envoy? : jeudi 18 janvier 2024 15:29 ? : TARDIEU Nicolas Cc : petsc-users at mcs.anl.gov Objet : Re: [petsc-users] Undestanding how to increase the overlap On Thu, Jan 18, 2024 at 9:24?AM TARDIEU Nicolas via petsc-users > wrote: Dear PETSc Team, I am trying to understand how to increase the overlap of a matrix. I wrote the attached petsc4py script where I build a simple matrix and play with the increaseOverlap method. Unfortunately, before and after the call, nothing changes in the index set. 
FYI, I have tried to mimic src/ksp/ksp/tutorials/ex82.c line 72:76. Here is how I run the script : "mpiexec -n 2 python test_overlap.py" Could you please indicate what I am missing ? Usually matrix functions like this take input in global indices. It looks like your isovl is in local indices. Am I reading that correctly? Thanks, Matt Regards, Nicolas -- Nicolas Tardieu Ing PhD Computational Mechanics EDF - R&D Dpt ERMES PARIS-SACLAY, FRANCE Ce message et toutes les pi?ces jointes (ci-apr?s le 'Message') sont ?tablis ? l'intention exclusive des destinataires et les informations qui y figurent sont strictement confidentielles. Toute utilisation de ce Message non conforme ? sa destination, toute diffusion ou toute publication totale ou partielle, est interdite sauf autorisation expresse. Si vous n'?tes pas le destinataire de ce Message, il vous est interdit de le copier, de le faire suivre, de le divulguer ou d'en utiliser tout ou partie. Si vous avez re?u ce Message par erreur, merci de le supprimer de votre syst?me, ainsi que toutes ses copies, et de n'en garder aucune trace sur quelque support que ce soit. Nous vous remercions ?galement d'en avertir imm?diatement l'exp?diteur par retour du message. Il est impossible de garantir que les communications par messagerie ?lectronique arrivent en temps utile, sont s?curis?es ou d?nu?es de toute erreur ou virus. ____________________________________________________ This message and any attachments (the 'Message') are intended solely for the addressees. The information contained in this Message is confidential. Any use of information contained in this Message not in accord with its purpose, any dissemination or disclosure, either whole or partial, is prohibited except formal approval. If you are not the addressee, you may not copy, forward, disclose or use any part of it. If you have received this message in error, please delete it and all copies from your system and notify the sender immediately by return message. E-mail communication cannot be guaranteed to be timely secure, error or virus-free. -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ Ce message et toutes les pi?ces jointes (ci-apr?s le 'Message') sont ?tablis ? l'intention exclusive des destinataires et les informations qui y figurent sont strictement confidentielles. Toute utilisation de ce Message non conforme ? sa destination, toute diffusion ou toute publication totale ou partielle, est interdite sauf autorisation expresse. Si vous n'?tes pas le destinataire de ce Message, il vous est interdit de le copier, de le faire suivre, de le divulguer ou d'en utiliser tout ou partie. Si vous avez re?u ce Message par erreur, merci de le supprimer de votre syst?me, ainsi que toutes ses copies, et de n'en garder aucune trace sur quelque support que ce soit. Nous vous remercions ?galement d'en avertir imm?diatement l'exp?diteur par retour du message. Il est impossible de garantir que les communications par messagerie ?lectronique arrivent en temps utile, sont s?curis?es ou d?nu?es de toute erreur ou virus. ____________________________________________________ This message and any attachments (the 'Message') are intended solely for the addressees. The information contained in this Message is confidential. 
Any use of information contained in this Message not in accord with its purpose, any dissemination or disclosure, either whole or partial, is prohibited except formal approval. If you are not the addressee, you may not copy, forward, disclose or use any part of it. If you have received this message in error, please delete it and all copies from your system and notify the sender immediately by return message. E-mail communication cannot be guaranteed to be timely secure, error or virus-free. -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Jan 18 08:49:13 2024 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 18 Jan 2024 09:49:13 -0500 Subject: [petsc-users] Undestanding how to increase the overlap In-Reply-To: References: Message-ID: On Thu, Jan 18, 2024 at 9:46?AM TARDIEU Nicolas wrote: > Hi Matt, > The isovl is in global numbering. I am pasting the output of the script : > I see. Is your matrix diagonal? If so, there is no overlap. You need connections to the other rows in order to have overlapping submatrices. Thanks, Matt > > #-------------------------------------------------------------------------------- > # The matrix > # ----------- > [1,0]:Mat Object: 2 MPI processes > [1,0]: type: mpiaij > [1,0]:row 0: (0, 0.) > [1,0]:row 1: (1, 1.) > [1,0]:row 2: (2, 2.) > [1,0]:row 3: (3, 3.) > [1,0]:row 4: (4, 4.) > [1,0]:row 5: (5, 5.) > > #-------------------------------------------------------------------------------- > # The IS isovl before the call to increaseOverlap > # ----------- > [1,0]:locSize before = 3 > [1,0]:IS Object: 1 MPI processes > [1,0]: type: stride > [1,0]:Number of indices in (stride) set 3 > [1,0]:0 0 > [1,0]:1 1 > [1,0]:2 2 > [1,1]:locSize before = 3 > [1,1]:IS Object: 1 MPI processes > [1,1]: type: stride > [1,1]:Number of indices in (stride) set 3 > [1,1]:0 3 > [1,1]:1 4 > [1,1]:2 5 > > #-------------------------------------------------------------------------------- > # The IS isovl after the call to increaseOverlap > # ----------- > [1,0]:locSize after = 3 > [1,0]:IS Object: 1 MPI processes > [1,0]: type: general > [1,0]:Number of indices in set 3 > [1,0]:0 0 > [1,0]:1 1 > [1,0]:2 2 > [1,1]:locSize after = 3 > [1,1]:IS Object: 1 MPI processes > [1,1]: type: general > [1,1]:Number of indices in set 3 > [1,1]:0 3 > [1,1]:1 4 > [1,1]:2 5 > > #-------------------------------------------------------------------------------- > > Regards, > Nicolas > -- > Nicolas Tardieu > Ing PhD Computational Mechanics > EDF - R&D Dpt ERMES > PARIS-SACLAY, FRANCE > ------------------------------ > *De :* knepley at gmail.com > *Envoy? :* jeudi 18 janvier 2024 15:29 > *? :* TARDIEU Nicolas > *Cc :* petsc-users at mcs.anl.gov > *Objet :* Re: [petsc-users] Undestanding how to increase the overlap > > On Thu, Jan 18, 2024 at 9:24?AM TARDIEU Nicolas via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Dear PETSc Team, > > I am trying to understand how to increase the overlap of a matrix. > I wrote the attached petsc4py script where I build a simple matrix and > play with the increaseOverlap method. Unfortunately, before and after the > call, nothing changes in the index set. FYI, I have tried to mimic > src/ksp/ksp/tutorials/ex82.c > line 72:76. > Here is how I run the script : "mpiexec -n 2 python test_overlap.py" > > Could you please indicate what I am missing ? > > > Usually matrix functions like this take input in global indices. It looks > like your isovl is in local indices. Am I reading that correctly? 
> > Thanks,
> > Matt
>
> Regards,
> Nicolas
> --
> Nicolas Tardieu
> Ing PhD Computational Mechanics
> EDF - R&D Dpt ERMES
> PARIS-SACLAY, FRANCE

--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/

From nicolas.tardieu at edf.fr  Thu Jan 18 09:05:26 2024
From: nicolas.tardieu at edf.fr (TARDIEU Nicolas)
Date: Thu, 18 Jan 2024 15:05:26 +0000
Subject: [petsc-users] Undestanding how to increase the overlap
In-Reply-To: References: Message-ID:

Arrghhh ! Shame on me !
Sorry for that and thank you for the help Matt.
It works like a charm now.

________________________________
From: knepley at gmail.com
Sent: Thursday, January 18, 2024 15:49
Subject: Re: [petsc-users] Undestanding how to increase the overlap

I see. Is your matrix diagonal? If so, there is no overlap. You need connections to the other rows in order to have overlapping submatrices.

Thanks,

    Matt

From mfadams at lbl.gov  Thu Jan 18 09:08:11 2024
From: mfadams at lbl.gov (Mark Adams)
Date: Thu, 18 Jan 2024 10:08:11 -0500
Subject: [petsc-users] Swarm view HDF5
Message-ID:

I am trying to view a DMSwarm with: -weights_view hdf5:part.h5

Vec f;
PetscCall(DMSetOutputSequenceNumber(sw, 0, 0.0));
PetscCall(DMSwarmCreateGlobalVectorFromField(sw, "w_q", &f));
PetscCall(PetscObjectSetName((PetscObject)f, "particle weights"));
PetscCall(VecViewFromOptions(f, NULL, "-weights_view"));
PetscCall(DMSwarmDestroyGlobalVectorFromField(sw, "w_q", &f));

And I get this error. I had this working once and did not set
PetscViewerHDF5PushTimestepping, so I wanted to check.

Thanks,
Mark

[0]PETSC ERROR: Object is in wrong state
[0]PETSC ERROR: Timestepping has not been pushed yet. Call PetscViewerHDF5PushTimestepping() first
[0]PETSC ERROR: WARNING! There are unused option(s) set! Could be the program crashed before usage or a spelling mistake, etc!
[0]PETSC ERROR: Option left: name:-options_left (no value) source: command line
[0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
[0]PETSC ERROR: Petsc Development GIT revision: v3.20.3-461-g585a01bd688 GIT Date: 2024-01-16 23:32:45 +0000 [0]PETSC ERROR: ./ex30k on a arch-macosx-gnu-O named MarksMac-302.local by markadams Thu Jan 18 10:05:53 2024 [0]PETSC ERROR: Configure options CFLAGS="-g -Wno-deprecated-declarations " CXXFLAGS="-g -Wno-deprecated-declarations " COPTFLAGS=-O CXXOPTFLAGS=-O --with-cc=/usr/local/opt/llvm/bin/clang --with-cxx=/usr/local/opt/llvm/bin/clang++ --download-mpich --with-strict-petscerrorcode --download-triangle=1 --with-debugging=0 --download-hdf5=1 PETSC_ARCH=arch-macosx-gnu-O [0]PETSC ERROR: #1 PetscViewerHDF5SetTimestep() at /Users/markadams/Codes/petsc/src/sys/classes/viewer/impls/hdf5/hdf5v.c:990 [0]PETSC ERROR: #2 VecView_Swarm_HDF5_Internal() at /Users/markadams/Codes/petsc/src/dm/impls/swarm/swarm.c:45 [0]PETSC ERROR: #3 VecView_Swarm() at /Users/markadams/Codes/petsc/src/dm/impls/swarm/swarm.c:86 [0]PETSC ERROR: #4 VecView() at /Users/markadams/Codes/petsc/src/vec/vec/interface/vector.c:806 [0]PETSC ERROR: #5 PetscObjectView() at /Users/markadams/Codes/petsc/src/sys/objects/destroy.c:76 [0]PETSC ERROR: #6 PetscObjectViewFromOptions() at /Users/markadams/Codes/petsc/src/sys/objects/destroy.c:128 [0]PETSC ERROR: #7 VecViewFromOptions() at /Users/markadams/Codes/petsc/src/vec/vec/interface/vector.c:691 -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Jan 18 10:25:01 2024 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 18 Jan 2024 11:25:01 -0500 Subject: [petsc-users] Undestanding how to increase the overlap In-Reply-To: References: Message-ID: On Thu, Jan 18, 2024 at 10:05?AM TARDIEU Nicolas wrote: > Arrghhh ! Shame on me ! > Sorry for that and thank you for the help Matt. > I works like a charm now. > No problem. I like when I figure it out :) Thanks, Matt > ------------------------------ > *De :* knepley at gmail.com > *Envoy? :* jeudi 18 janvier 2024 15:49 > *? :* TARDIEU Nicolas > *Cc :* petsc-users at mcs.anl.gov > *Objet :* Re: [petsc-users] Undestanding how to increase the overlap > > On Thu, Jan 18, 2024 at 9:46?AM TARDIEU Nicolas > wrote: > > Hi Matt, > The isovl is in global numbering. I am pasting the output of the script : > > > I see. Is your matrix diagonal? If so, there is no overlap. You need > connections to the other rows in order to have overlapping submatrices. > > Thanks, > > Matt > > > > #-------------------------------------------------------------------------------- > # The matrix > # ----------- > [1,0]:Mat Object: 2 MPI processes > [1,0]: type: mpiaij > [1,0]:row 0: (0, 0.) > [1,0]:row 1: (1, 1.) > [1,0]:row 2: (2, 2.) > [1,0]:row 3: (3, 3.) > [1,0]:row 4: (4, 4.) > [1,0]:row 5: (5, 5.) 
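To make the point above concrete: the matrix shown in the quote is purely diagonal, so each process's rows are coupled to nothing else and MatIncreaseOverlap() correctly has nothing to add. Below is a small standalone C sketch (not from the thread; the file name, sizes and values are made up) in which the rows are coupled to their neighbours, so the index sets do grow:

```c
/* Made-up illustration, not the script from the thread: a tridiagonal matrix,
 * so each row is coupled to its neighbours and MatIncreaseOverlap() has
 * something to add.  Run with, e.g., mpiexec -n 2 ./overlap_demo */
#include <petscmat.h>

int main(int argc, char **argv)
{
  Mat      A;
  IS       is;
  PetscInt i, rstart, rend, n = 6;

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));

  /* Assemble a 1D Laplacian-like matrix: off-diagonal entries couple the rows */
  PetscCall(MatCreateAIJ(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, n, n, 3, NULL, 1, NULL, &A));
  PetscCall(MatGetOwnershipRange(A, &rstart, &rend));
  for (i = rstart; i < rend; i++) {
    if (i > 0) PetscCall(MatSetValue(A, i, i - 1, -1.0, INSERT_VALUES));
    PetscCall(MatSetValue(A, i, i, 2.0, INSERT_VALUES));
    if (i < n - 1) PetscCall(MatSetValue(A, i, i + 1, -1.0, INSERT_VALUES));
  }
  PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
  PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));

  /* Each process starts from its own rows, in global numbering */
  PetscCall(ISCreateStride(PETSC_COMM_SELF, rend - rstart, rstart, 1, &is));
  /* Grow the set by one level of overlap */
  PetscCall(MatIncreaseOverlap(A, 1, &is, 1));
  PetscCall(ISView(is, PETSC_VIEWER_STDOUT_SELF));

  PetscCall(ISDestroy(&is));
  PetscCall(MatDestroy(&A));
  PetscCall(PetscFinalize());
  return 0;
}
```

With two ranks, each process's stride IS over its own rows should pick up one extra neighbouring row after MatIncreaseOverlap(), which is the behaviour the diagonal test matrix could not show.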
--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/

From knepley at gmail.com  Thu Jan 18 10:26:13 2024
From: knepley at gmail.com (Matthew Knepley)
Date: Thu, 18 Jan 2024 11:26:13 -0500
Subject: [petsc-users] Swarm view HDF5
In-Reply-To: References: Message-ID:

On Thu, Jan 18, 2024 at 10:08 AM Mark Adams wrote:

> I am trying to view a DMSwarm with: -weights_view hdf5:part.h5
>
> Vec f;
> PetscCall(DMSetOutputSequenceNumber(sw, 0, 0.0));
> PetscCall(DMSwarmCreateGlobalVectorFromField(sw, "w_q", &f));
> PetscCall(PetscObjectSetName((PetscObject)f, "particle weights"));
> PetscCall(VecViewFromOptions(f, NULL, "-weights_view"));
> PetscCall(DMSwarmDestroyGlobalVectorFromField(sw, "w_q", &f));
>
> And I get this error. I had this working once and did not set
> PetscViewerHDF5PushTimestepping, so I wanted to check.

We probably were not checking then. We might have to check there when we set the timestep.

  Thanks,

     Matt

> Thanks,
> Mark
>
> [0]PETSC ERROR: Object is in wrong state
> [0]PETSC ERROR: Timestepping has not been pushed yet. Call
> PetscViewerHDF5PushTimestepping() first
> [0]PETSC ERROR: WARNING! There are unused option(s) set! Could be the
> program crashed before usage or a spelling mistake, etc!
> [0]PETSC ERROR: Option left: name:-options_left (no value) source: command line
> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
> [0]PETSC ERROR: Petsc Development GIT revision: v3.20.3-461-g585a01bd688 > GIT Date: 2024-01-16 23:32:45 +0000 > [0]PETSC ERROR: ./ex30k on a arch-macosx-gnu-O named MarksMac-302.local by > markadams Thu Jan 18 10:05:53 2024 > [0]PETSC ERROR: Configure options CFLAGS="-g -Wno-deprecated-declarations > " CXXFLAGS="-g -Wno-deprecated-declarations " COPTFLAGS=-O CXXOPTFLAGS=-O > --with-cc=/usr/local/opt/llvm/bin/clang > --with-cxx=/usr/local/opt/llvm/bin/clang++ --download-mpich > --with-strict-petscerrorcode --download-triangle=1 --with-debugging=0 > --download-hdf5=1 PETSC_ARCH=arch-macosx-gnu-O > [0]PETSC ERROR: #1 PetscViewerHDF5SetTimestep() at > /Users/markadams/Codes/petsc/src/sys/classes/viewer/impls/hdf5/hdf5v.c:990 > [0]PETSC ERROR: #2 VecView_Swarm_HDF5_Internal() at > /Users/markadams/Codes/petsc/src/dm/impls/swarm/swarm.c:45 > [0]PETSC ERROR: #3 VecView_Swarm() at > /Users/markadams/Codes/petsc/src/dm/impls/swarm/swarm.c:86 > [0]PETSC ERROR: #4 VecView() at > /Users/markadams/Codes/petsc/src/vec/vec/interface/vector.c:806 > [0]PETSC ERROR: #5 PetscObjectView() at > /Users/markadams/Codes/petsc/src/sys/objects/destroy.c:76 > [0]PETSC ERROR: #6 PetscObjectViewFromOptions() at > /Users/markadams/Codes/petsc/src/sys/objects/destroy.c:128 > [0]PETSC ERROR: #7 VecViewFromOptions() at > /Users/markadams/Codes/petsc/src/vec/vec/interface/vector.c:691 > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From alscheinberg at gmail.com Thu Jan 18 10:55:30 2024 From: alscheinberg at gmail.com (Aaron Scheinberg) Date: Thu, 18 Jan 2024 11:55:30 -0500 Subject: [petsc-users] undefined reference to `petsc_allreduce_ct_th' Message-ID: Hello, I'm getting this error when linking: undefined reference to `petsc_allreduce_ct_th' The instances are regular MPI_Allreduces in my code that are not located in parts of the code related to PETSc, so I'm wondering what is happening to involve PETSc here? Can I configure it to avoid that? I consulted google, the FAQ and skimmed other documentation but didn't see anything. Thanks! Aaron -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Thu Jan 18 11:02:05 2024 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 18 Jan 2024 11:02:05 -0600 (CST) Subject: [petsc-users] undefined reference to `petsc_allreduce_ct_th' In-Reply-To: References: Message-ID: <880caffa-5bdf-3c71-ffe2-01b951c977e4@mcs.anl.gov> On Thu, 18 Jan 2024, Aaron Scheinberg wrote: > Hello, > > I'm getting this error when linking: > > undefined reference to `petsc_allreduce_ct_th' > > The instances are regular MPI_Allreduces in my code that are not located in > parts of the code related to PETSc, so I'm wondering what is happening to > involve PETSc here? This symbol should be in libpetsc.so. Are you including petsc.h - but not linking in -lpetsc - from your code? balay at pj01:~/petsc/arch-linux-c-debug/lib$ nm -Ao libpetsc.so |grep petsc_allreduce_ct_th libpetsc.so:0000000004279a50 B petsc_allreduce_ct_th > Can I configure it to avoid that? I consulted google, > the FAQ and skimmed other documentation but didn't see anything. Thanks! If you wish to avoid petsc logging of MPI messages (but include petsc.h in your code?) 
- you can use in your code: >>>> #define PETSC_HAVE_BROKEN_RECURSIVE_MACRO #include <<<< Or build it with -DPETSC_HAVE_BROKEN_RECURSIVE_MACRO compiler option Satish From bsmith at petsc.dev Thu Jan 18 11:03:28 2024 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 18 Jan 2024 12:03:28 -0500 Subject: [petsc-users] undefined reference to `petsc_allreduce_ct_th' In-Reply-To: References: Message-ID: The PETSc petsclog.h (included by petscsys.h) uses C macro magic to log calls to MPI routines. This is how the symbol is getting into your code. But normally if you use PetscInitialize() and link to the PETSc library the symbol would get resolved. If that part of the code does not need PETSc at all you can not include petscsys.h and instead include mpi.h otherwise you need to track down why when your code gets linked against PETSc libraries that symbol is not resolved. Barry > On Jan 18, 2024, at 11:55?AM, Aaron Scheinberg wrote: > > Hello, > > I'm getting this error when linking: > > undefined reference to `petsc_allreduce_ct_th' > > The instances are regular MPI_Allreduces in my code that are not located in parts of the code related to PETSc, so I'm wondering what is happening to involve PETSc here? Can I configure it to avoid that? I consulted google, the FAQ and skimmed other documentation but didn't see anything. Thanks! > > Aaron -------------- next part -------------- An HTML attachment was scrubbed... URL: From alscheinberg at gmail.com Thu Jan 18 11:56:03 2024 From: alscheinberg at gmail.com (Aaron Scheinberg) Date: Thu, 18 Jan 2024 12:56:03 -0500 Subject: [petsc-users] undefined reference to `petsc_allreduce_ct_th' In-Reply-To: References: Message-ID: Thanks, it turns out there was another installation of PETSc, and it was linking with the wrong one. It builds now. On Thu, Jan 18, 2024 at 12:03?PM Barry Smith wrote: > > The PETSc petsclog.h (included by petscsys.h) uses C macro magic to > log calls to MPI routines. This is how the symbol is getting into your > code. But normally > if you use PetscInitialize() and link to the PETSc library the symbol > would get resolved. > > If that part of the code does not need PETSc at all you can not include > petscsys.h and instead include mpi.h otherwise you need to track down why > when your code gets linked against PETSc libraries that symbol is not > resolved. > > Barry > > > On Jan 18, 2024, at 11:55?AM, Aaron Scheinberg > wrote: > > Hello, > > I'm getting this error when linking: > > undefined reference to `petsc_allreduce_ct_th' > > The instances are regular MPI_Allreduces in my code that are not located > in parts of the code related to PETSc, so I'm wondering what is happening > to involve PETSc here? Can I configure it to avoid that? I consulted > google, the FAQ and skimmed other documentation but didn't see anything. > Thanks! > > Aaron > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pjool at dtu.dk Thu Jan 18 11:59:25 2024 From: pjool at dtu.dk (=?iso-8859-1?Q?Peder_J=F8rgensgaard_Olesen?=) Date: Thu, 18 Jan 2024 17:59:25 +0000 Subject: [petsc-users] ScaLAPACK EPS error Message-ID: Hello, I need to determine the full set of eigenpairs to a rather large (N=16,000) dense Hermitian matrix. I've managed to do this using SLEPc's standard Krylov-Schur EPS, but I think it could be done more efficiently using ScaLAPACK. I receive the following error when attempting this. 
As I understand it, descinit is used to initialize an array, and the variable in question designates the leading dimension of the array, for which it seems an illegal value is somehow passed. I know ScaLAPACK is an external package, but it seems as if the error would be in the call from SLEPc. Any ideas as to what could cause this? Thanks, Peder Error message (excerpt): PETSC ERROR: #1 MatConvert_Dense_ScaLAPACK() at [...]/matscalapack.c:1032 PETSC ERROR: #2 MatConvert at [...]/matrix.c:4250 PETSC ERROR: #3 EPSSetUp_ScaLAPACK() at [...]/scalapack.c:47 PETSC ERROR: #4 EPSSetUp() at [...]/epssetup.c:323 PETSC ERROR: #5 EPSSolve at [...]/epssolve.c:134 PETSC ERROR: ------ Error message ------ PETSC ERROR: Error in external library PETSC ERROR: Error in ScaLAPACK subroutine descinit: info=-9 (...) Log file (excerpt): { 357, 0}: On entry to DESCINIT parameter number 9 had an illegal value [and a few hundred lines similar to this] -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Thu Jan 18 12:20:16 2024 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 18 Jan 2024 13:20:16 -0500 Subject: [petsc-users] Swarm view HDF5 In-Reply-To: References: Message-ID: I had this working at one point. Should I add PetscViewerHDF5PushTimestepping? I don't create a viewer now, but I could make one. Thanks, Mark On Thu, Jan 18, 2024 at 11:26?AM Matthew Knepley wrote: > On Thu, Jan 18, 2024 at 10:08?AM Mark Adams wrote: > >> I am trying to view a DMSwarm with: -weights_view hdf5:part.h5 >> >> Vec f; >> PetscCall(DMSetOutputSequenceNumber(sw, 0, 0.0)); >> PetscCall(DMSwarmCreateGlobalVectorFromField(sw, "w_q", &f)); >> PetscCall(PetscObjectSetName((PetscObject)f, "particle weights")); >> PetscCall(VecViewFromOptions(f, NULL, "-weights_view")); >> PetscCall(DMSwarmDestroyGlobalVectorFromField(sw, "w_q", &f)); >> >> And I get this error. I had this working once and did not set >> PetscViewerHDF5PushTimestepping, so I wanted to check. >> > > We probably were not checking then. We might have to check there when we > set the timestep. > > Thanks, > > Matt > > >> Thanks, >> Mark >> >> >> [0]PETSC ERROR: Object is in wrong state >> [0]PETSC ERROR: Timestepping has not been pushed yet. Call >> PetscViewerHDF5PushTimestepping() first >> [0]PETSC ERROR: WARNING! There are unused option(s) set! Could be the >> program crashed before usage or a spelling mistake, etc! >> [0]PETSC ERROR: Option left: name:-options_left (no value) source: >> command line >> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
>> [0]PETSC ERROR: Petsc Development GIT revision: v3.20.3-461-g585a01bd688 >> GIT Date: 2024-01-16 23:32:45 +0000 >> [0]PETSC ERROR: ./ex30k on a arch-macosx-gnu-O named MarksMac-302.local >> by markadams Thu Jan 18 10:05:53 2024 >> [0]PETSC ERROR: Configure options CFLAGS="-g -Wno-deprecated-declarations >> " CXXFLAGS="-g -Wno-deprecated-declarations " COPTFLAGS=-O CXXOPTFLAGS=-O >> --with-cc=/usr/local/opt/llvm/bin/clang >> --with-cxx=/usr/local/opt/llvm/bin/clang++ --download-mpich >> --with-strict-petscerrorcode --download-triangle=1 --with-debugging=0 >> --download-hdf5=1 PETSC_ARCH=arch-macosx-gnu-O >> [0]PETSC ERROR: #1 PetscViewerHDF5SetTimestep() at >> /Users/markadams/Codes/petsc/src/sys/classes/viewer/impls/hdf5/hdf5v.c:990 >> [0]PETSC ERROR: #2 VecView_Swarm_HDF5_Internal() at >> /Users/markadams/Codes/petsc/src/dm/impls/swarm/swarm.c:45 >> [0]PETSC ERROR: #3 VecView_Swarm() at >> /Users/markadams/Codes/petsc/src/dm/impls/swarm/swarm.c:86 >> [0]PETSC ERROR: #4 VecView() at >> /Users/markadams/Codes/petsc/src/vec/vec/interface/vector.c:806 >> [0]PETSC ERROR: #5 PetscObjectView() at >> /Users/markadams/Codes/petsc/src/sys/objects/destroy.c:76 >> [0]PETSC ERROR: #6 PetscObjectViewFromOptions() at >> /Users/markadams/Codes/petsc/src/sys/objects/destroy.c:128 >> [0]PETSC ERROR: #7 VecViewFromOptions() at >> /Users/markadams/Codes/petsc/src/vec/vec/interface/vector.c:691 >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Thu Jan 18 12:28:26 2024 From: jroman at dsic.upv.es (Jose E. Roman) Date: Thu, 18 Jan 2024 19:28:26 +0100 Subject: [petsc-users] ScaLAPACK EPS error In-Reply-To: References: Message-ID: How are you setting up your input matrix? Are you giving the local sizes or setting them to PETSC_DECIDE? Do you get the same error for different number of MPI processes? Can you send a small code reproducing the error? Jose > El 18 ene 2024, a las 18:59, Peder J?rgensgaard Olesen via petsc-users escribi?: > > Hello, > > I need to determine the full set of eigenpairs to a rather large (N=16,000) dense Hermitian matrix. I've managed to do this using SLEPc's standard Krylov-Schur EPS, but I think it could be done more efficiently using ScaLAPACK. I receive the following error when attempting this. As I understand it, descinit is used to initialize an array, and the variable in question designates the leading dimension of the array, for which it seems an illegal value is somehow passed. > > I know ScaLAPACK is an external package, but it seems as if the error would be in the call from SLEPc. Any ideas as to what could cause this? > > Thanks, > Peder > > Error message (excerpt): > > PETSC ERROR: #1 MatConvert_Dense_ScaLAPACK() at [...]/matscalapack.c:1032 > PETSC ERROR: #2 MatConvert at [...]/matrix.c:4250 > PETSC ERROR: #3 EPSSetUp_ScaLAPACK() at [...]/scalapack.c:47 > PETSC ERROR: #4 EPSSetUp() at [...]/epssetup.c:323 > PETSC ERROR: #5 EPSSolve at [...]/epssolve.c:134 > PETSC ERROR: ------ Error message ------ > PETSC ERROR: Error in external library > PETSC ERROR: Error in ScaLAPACK subroutine descinit: info=-9 > (...) 
> > Log file (excerpt): > { 357, 0}: On entry to DESCINIT parameter number 9 had an illegal value > [and a few hundred lines similar to this] From bsmith at petsc.dev Thu Jan 18 12:29:00 2024 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 18 Jan 2024 13:29:00 -0500 Subject: [petsc-users] ScaLAPACK EPS error In-Reply-To: References: Message-ID: Looks like you are using an older version of PETSc. Could you please switch to the latest and try again and send same information if that also fails. Barry > On Jan 18, 2024, at 12:59?PM, Peder J?rgensgaard Olesen via petsc-users wrote: > > Hello, > > I need to determine the full set of eigenpairs to a rather large (N=16,000) dense Hermitian matrix. I've managed to do this using SLEPc's standard Krylov-Schur EPS, but I think it could be done more efficiently using ScaLAPACK. I receive the following error when attempting this. As I understand it, descinit is used to initialize an array, and the variable in question designates the leading dimension of the array, for which it seems an illegal value is somehow passed. > > I know ScaLAPACK is an external package, but it seems as if the error would be in the call from SLEPc. Any ideas as to what could cause this? > > Thanks, > Peder > > Error message (excerpt): > > PETSC ERROR: #1 MatConvert_Dense_ScaLAPACK() at [...]/matscalapack.c:1032 > PETSC ERROR: #2 MatConvert at [...]/matrix.c:4250 > PETSC ERROR: #3 EPSSetUp_ScaLAPACK() at [...]/scalapack.c:47 > PETSC ERROR: #4 EPSSetUp() at [...]/epssetup.c:323 > PETSC ERROR: #5 EPSSolve at [...]/epssolve.c:134 > PETSC ERROR: ------ Error message ------ > PETSC ERROR: Error in external library > PETSC ERROR: Error in ScaLAPACK subroutine descinit: info=-9 > (...) > > Log file (excerpt): > { 357, 0}: On entry to DESCINIT parameter number 9 had an illegal value > [and a few hundred lines similar to this] -------------- next part -------------- An HTML attachment was scrubbed... URL: From pjool at dtu.dk Thu Jan 18 14:06:46 2024 From: pjool at dtu.dk (=?iso-8859-1?Q?Peder_J=F8rgensgaard_Olesen?=) Date: Thu, 18 Jan 2024 20:06:46 +0000 Subject: [petsc-users] ScaLAPACK EPS error In-Reply-To: References: Message-ID: I set up the matrix using MatCreateDense(), passing PETSC_DECIDE for the local dimensions. The same error appears with 8, 12, and 16 nodes (32 proc/node). I'll have to get back to you regarding a minimal example. Best, Peder ________________________________ Fra: Jose E. Roman Sendt: 18. januar 2024 19:28 Til: Peder J?rgensgaard Olesen Cc: petsc-users at mcs.anl.gov Emne: Re: [petsc-users] ScaLAPACK EPS error How are you setting up your input matrix? Are you giving the local sizes or setting them to PETSC_DECIDE? Do you get the same error for different number of MPI processes? Can you send a small code reproducing the error? Jose > El 18 ene 2024, a las 18:59, Peder J?rgensgaard Olesen via petsc-users escribi?: > > Hello, > > I need to determine the full set of eigenpairs to a rather large (N=16,000) dense Hermitian matrix. I've managed to do this using SLEPc's standard Krylov-Schur EPS, but I think it could be done more efficiently using ScaLAPACK. I receive the following error when attempting this. As I understand it, descinit is used to initialize an array, and the variable in question designates the leading dimension of the array, for which it seems an illegal value is somehow passed. > > I know ScaLAPACK is an external package, but it seems as if the error would be in the call from SLEPc. Any ideas as to what could cause this? 
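While Peder puts together his minimal example, the following is a hypothetical sketch of the kind of driver Jose is asking about — a dense matrix created with PETSC_DECIDE local sizes and handed to the ScaLAPACK EPS. It is written in C, uses a real symmetric test matrix instead of the complex Hermitian one, and is not Peder's code; it only shows how the pieces fit together:

```c
/* Hypothetical minimal driver (not Peder's code): dense matrix with
 * PETSC_DECIDE local sizes, full spectrum via the ScaLAPACK EPS.
 * Requires SLEPc/PETSc configured with ScaLAPACK. */
#include <slepceps.h>

int main(int argc, char **argv)
{
  Mat      A;
  EPS      eps;
  PetscInt n = 1000, i, j, rstart, rend, nconv;

  PetscCall(SlepcInitialize(&argc, &argv, NULL, NULL));

  /* Dense matrix, local row/column sizes left to PETSc */
  PetscCall(MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, n, n, NULL, &A));
  PetscCall(MatGetOwnershipRange(A, &rstart, &rend));
  for (i = rstart; i < rend; i++)
    for (j = 0; j < n; j++) PetscCall(MatSetValue(A, i, j, 1.0 / (i + j + 1), INSERT_VALUES)); /* symmetric test matrix */
  PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
  PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));

  PetscCall(EPSCreate(PETSC_COMM_WORLD, &eps));
  PetscCall(EPSSetOperators(eps, A, NULL));
  PetscCall(EPSSetProblemType(eps, EPS_HEP));   /* Hermitian eigenproblem */
  PetscCall(EPSSetType(eps, EPSSCALAPACK));     /* or -eps_type scalapack */
  PetscCall(EPSSetFromOptions(eps));
  PetscCall(EPSSolve(eps));
  PetscCall(EPSGetConverged(eps, &nconv));
  PetscCall(PetscPrintf(PETSC_COMM_WORLD, "Converged eigenpairs: %" PetscInt_FMT "\n", nconv));

  PetscCall(EPSDestroy(&eps));
  PetscCall(MatDestroy(&A));
  PetscCall(SlepcFinalize());
  return 0;
}
```

If a driver along these lines runs cleanly at several process counts while the real code does not, the difference is most likely in how the dense matrix is created or filled.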
> > Thanks, > Peder > > Error message (excerpt): > > PETSC ERROR: #1 MatConvert_Dense_ScaLAPACK() at [...]/matscalapack.c:1032 > PETSC ERROR: #2 MatConvert at [...]/matrix.c:4250 > PETSC ERROR: #3 EPSSetUp_ScaLAPACK() at [...]/scalapack.c:47 > PETSC ERROR: #4 EPSSetUp() at [...]/epssetup.c:323 > PETSC ERROR: #5 EPSSolve at [...]/epssolve.c:134 > PETSC ERROR: ------ Error message ------ > PETSC ERROR: Error in external library > PETSC ERROR: Error in ScaLAPACK subroutine descinit: info=-9 > (...) > > Log file (excerpt): > { 357, 0}: On entry to DESCINIT parameter number 9 had an illegal value > [and a few hundred lines similar to this] -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Jan 18 14:08:16 2024 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 18 Jan 2024 15:08:16 -0500 Subject: [petsc-users] Swarm view HDF5 In-Reply-To: References: Message-ID: On Thu, Jan 18, 2024 at 1:20?PM Mark Adams wrote: > I had this working at one point. > > Should I add PetscViewerHDF5PushTimestepping? > I don't create a viewer now, but I could make one. > That will make it work. The real fix would be to check at swarm.c:45 to see whether timestepping is set. Thanks, Matt > Thanks, > Mark > > > On Thu, Jan 18, 2024 at 11:26?AM Matthew Knepley > wrote: > >> On Thu, Jan 18, 2024 at 10:08?AM Mark Adams wrote: >> >>> I am trying to view a DMSwarm with: -weights_view hdf5:part.h5 >>> >>> Vec f; >>> PetscCall(DMSetOutputSequenceNumber(sw, 0, 0.0)); >>> PetscCall(DMSwarmCreateGlobalVectorFromField(sw, "w_q", &f)); >>> PetscCall(PetscObjectSetName((PetscObject)f, "particle weights")); >>> PetscCall(VecViewFromOptions(f, NULL, "-weights_view")); >>> PetscCall(DMSwarmDestroyGlobalVectorFromField(sw, "w_q", &f)); >>> >>> And I get this error. I had this working once and did not set >>> PetscViewerHDF5PushTimestepping, so I wanted to check. >>> >> >> We probably were not checking then. We might have to check there when we >> set the timestep. >> >> Thanks, >> >> Matt >> >> >>> Thanks, >>> Mark >>> >>> >>> [0]PETSC ERROR: Object is in wrong state >>> [0]PETSC ERROR: Timestepping has not been pushed yet. Call >>> PetscViewerHDF5PushTimestepping() first >>> [0]PETSC ERROR: WARNING! There are unused option(s) set! Could be the >>> program crashed before usage or a spelling mistake, etc! >>> [0]PETSC ERROR: Option left: name:-options_left (no value) source: >>> command line >>> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
>>> [0]PETSC ERROR: Petsc Development GIT revision: v3.20.3-461-g585a01bd688 >>> GIT Date: 2024-01-16 23:32:45 +0000 >>> [0]PETSC ERROR: ./ex30k on a arch-macosx-gnu-O named MarksMac-302.local >>> by markadams Thu Jan 18 10:05:53 2024 >>> [0]PETSC ERROR: Configure options CFLAGS="-g >>> -Wno-deprecated-declarations " CXXFLAGS="-g -Wno-deprecated-declarations " >>> COPTFLAGS=-O CXXOPTFLAGS=-O --with-cc=/usr/local/opt/llvm/bin/clang >>> --with-cxx=/usr/local/opt/llvm/bin/clang++ --download-mpich >>> --with-strict-petscerrorcode --download-triangle=1 --with-debugging=0 >>> --download-hdf5=1 PETSC_ARCH=arch-macosx-gnu-O >>> [0]PETSC ERROR: #1 PetscViewerHDF5SetTimestep() at >>> /Users/markadams/Codes/petsc/src/sys/classes/viewer/impls/hdf5/hdf5v.c:990 >>> [0]PETSC ERROR: #2 VecView_Swarm_HDF5_Internal() at >>> /Users/markadams/Codes/petsc/src/dm/impls/swarm/swarm.c:45 >>> [0]PETSC ERROR: #3 VecView_Swarm() at >>> /Users/markadams/Codes/petsc/src/dm/impls/swarm/swarm.c:86 >>> [0]PETSC ERROR: #4 VecView() at >>> /Users/markadams/Codes/petsc/src/vec/vec/interface/vector.c:806 >>> [0]PETSC ERROR: #5 PetscObjectView() at >>> /Users/markadams/Codes/petsc/src/sys/objects/destroy.c:76 >>> [0]PETSC ERROR: #6 PetscObjectViewFromOptions() at >>> /Users/markadams/Codes/petsc/src/sys/objects/destroy.c:128 >>> [0]PETSC ERROR: #7 VecViewFromOptions() at >>> /Users/markadams/Codes/petsc/src/vec/vec/interface/vector.c:691 >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From pjool at dtu.dk Thu Jan 18 14:14:26 2024 From: pjool at dtu.dk (=?utf-8?B?UGVkZXIgSsO4cmdlbnNnYWFyZCBPbGVzZW4=?=) Date: Thu, 18 Jan 2024 20:14:26 +0000 Subject: [petsc-users] ScaLAPACK EPS error In-Reply-To: References: Message-ID: It appears my setup doesn't allow me to use versions > 3.17.4, unfortunately (I believe I'll need to speak to admin for this). Best, Peder ________________________________ Fra: Barry Smith Sendt: 18. januar 2024 19:29 Til: Peder J?rgensgaard Olesen Cc: petsc-users at mcs.anl.gov Emne: Re: [petsc-users] ScaLAPACK EPS error Looks like you are using an older version of PETSc. Could you please switch to the latest and try again and send same information if that also fails. Barry On Jan 18, 2024, at 12:59?PM, Peder J?rgensgaard Olesen via petsc-users wrote: Hello, I need to determine the full set of eigenpairs to a rather large (N=16,000) dense Hermitian matrix. I've managed to do this using SLEPc's standard Krylov-Schur EPS, but I think it could be done more efficiently using ScaLAPACK. I receive the following error when attempting this. As I understand it, descinit is used to initialize an array, and the variable in question designates the leading dimension of the array, for which it seems an illegal value is somehow passed. I know ScaLAPACK is an external package, but it seems as if the error would be in the call from SLEPc. Any ideas as to what could cause this? 
Thanks, Peder Error message (excerpt): PETSC ERROR: #1 MatConvert_Dense_ScaLAPACK() at [...]/matscalapack.c:1032 PETSC ERROR: #2 MatConvert at [...]/matrix.c:4250 PETSC ERROR: #3 EPSSetUp_ScaLAPACK() at [...]/scalapack.c:47 PETSC ERROR: #4 EPSSetUp() at [...]/epssetup.c:323 PETSC ERROR: #5 EPSSolve at [...]/epssolve.c:134 PETSC ERROR: ------ Error message ------ PETSC ERROR: Error in external library PETSC ERROR: Error in ScaLAPACK subroutine descinit: info=-9 (...) Log file (excerpt): { 357, 0}: On entry to DESCINIT parameter number 9 had an illegal value [and a few hundred lines similar to this] -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Thu Jan 18 14:35:59 2024 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 18 Jan 2024 15:35:59 -0500 Subject: [petsc-users] HashMap Error when populating AIJCUSPARSE matrix In-Reply-To: References: Message-ID: Do you ever get a problem with 'aij` ? Can you run in a loop with 'aij' to confirm it doesn't fail then? Barry > On Jan 17, 2024, at 4:51?PM, Yesypenko, Anna wrote: > > Dear Petsc users/developers, > > I'm experiencing a bug when using petsc4py with GPU support. It may be my mistake in how I set up a AIJCUSPARSE matrix. > For larger matrices, I sometimes encounter a error in assigning matrix values; the error is thrown in PetscHMapIJVQuerySet(). > Here is a minimum snippet that populates a sparse tridiagonal matrix. > > ``` > from petsc4py import PETSc > from scipy.sparse import diags > import numpy as np > > n = int(5e5); > > nnz = 3 * np.ones(n, dtype=np.int32); nnz[0] = nnz[-1] = 2 > A = PETSc.Mat(comm=PETSc.COMM_WORLD) > A.createAIJ(size=[n,n],comm=PETSc.COMM_WORLD,nnz=nnz) > A.setType('aijcusparse') > tmp = diags([-1,2,-1],[-1,0,+1],shape=(n,n)).tocsr() > A.setValuesCSR(tmp.indptr,tmp.indices,tmp.data) ####### this is the line where the error is thrown. > A.assemble() > ``` > > The error trace is below: > ``` > File "petsc4py/PETSc/Mat.pyx", line 2603, in petsc4py.PETSc.Mat.setValuesCSR > File "petsc4py/PETSc/petscmat.pxi", line 1039, in petsc4py.PETSc.matsetvalues_csr > File "petsc4py/PETSc/petscmat.pxi", line 1032, in petsc4py.PETSc.matsetvalues_ijv > petsc4py.PETSc.Error: error code 76 > [0] MatSetValues() at /work/06368/annayesy/ls6/petsc/src/mat/interface/matrix.c:1497 > [0] MatSetValues_Seq_Hash() at /work/06368/annayesy/ls6/petsc/include/../src/mat/impls/aij/seq/seqhashmatsetvalues.h:52 > [0] PetscHMapIJVQuerySet() at /work/06368/annayesy/ls6/petsc/include/petsc/private/hashmapijv.h:10 > [0] Error in external library > [0] [khash] Assertion: `ret >= 0' failed. > ``` > > If I run the same script a handful of times, it will run without errors eventually. > Does anyone have insight on why it is behaving this way? I'm running on a node with 3x NVIDIA A100 PCIE 40GB. > > Thank you! > Anna -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Thu Jan 18 15:12:08 2024 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 18 Jan 2024 16:12:08 -0500 Subject: [petsc-users] HashMap Error when populating AIJCUSPARSE matrix In-Reply-To: References: Message-ID: It appears to be crashing in kh_resize() in khash.h on a memory allocation failure when it tries to get additional memory for storing the matrix. This code seems to be only using the CPU memory so it should also fail in a similar way with 'aij'. But the matrix is not large and so I don't think it should be running out of memory. 
I cannot reproduce the crash with same parameters on my non-CUDA machine so debugging will be tricky. Barry > On Jan 18, 2024, at 3:35?PM, Barry Smith wrote: > > > Do you ever get a problem with 'aij` ? Can you run in a loop with 'aij' to confirm it doesn't fail then? > > > > Barry > > >> On Jan 17, 2024, at 4:51?PM, Yesypenko, Anna wrote: >> >> Dear Petsc users/developers, >> >> I'm experiencing a bug when using petsc4py with GPU support. It may be my mistake in how I set up a AIJCUSPARSE matrix. >> For larger matrices, I sometimes encounter a error in assigning matrix values; the error is thrown in PetscHMapIJVQuerySet(). >> Here is a minimum snippet that populates a sparse tridiagonal matrix. >> >> ``` >> from petsc4py import PETSc >> from scipy.sparse import diags >> import numpy as np >> >> n = int(5e5); >> >> nnz = 3 * np.ones(n, dtype=np.int32); nnz[0] = nnz[-1] = 2 >> A = PETSc.Mat(comm=PETSc.COMM_WORLD) >> A.createAIJ(size=[n,n],comm=PETSc.COMM_WORLD,nnz=nnz) >> A.setType('aijcusparse') >> tmp = diags([-1,2,-1],[-1,0,+1],shape=(n,n)).tocsr() >> A.setValuesCSR(tmp.indptr,tmp.indices,tmp.data) ####### this is the line where the error is thrown. >> A.assemble() >> ``` >> >> The error trace is below: >> ``` >> File "petsc4py/PETSc/Mat.pyx", line 2603, in petsc4py.PETSc.Mat.setValuesCSR >> File "petsc4py/PETSc/petscmat.pxi", line 1039, in petsc4py.PETSc.matsetvalues_csr >> File "petsc4py/PETSc/petscmat.pxi", line 1032, in petsc4py.PETSc.matsetvalues_ijv >> petsc4py.PETSc.Error: error code 76 >> [0] MatSetValues() at /work/06368/annayesy/ls6/petsc/src/mat/interface/matrix.c:1497 >> [0] MatSetValues_Seq_Hash() at /work/06368/annayesy/ls6/petsc/include/../src/mat/impls/aij/seq/seqhashmatsetvalues.h:52 >> [0] PetscHMapIJVQuerySet() at /work/06368/annayesy/ls6/petsc/include/petsc/private/hashmapijv.h:10 >> [0] Error in external library >> [0] [khash] Assertion: `ret >= 0' failed. >> ``` >> >> If I run the same script a handful of times, it will run without errors eventually. >> Does anyone have insight on why it is behaving this way? I'm running on a node with 3x NVIDIA A100 PCIE 40GB. >> >> Thank you! >> Anna > -------------- next part -------------- An HTML attachment was scrubbed... URL: From anna at oden.utexas.edu Thu Jan 18 15:18:48 2024 From: anna at oden.utexas.edu (Yesypenko, Anna) Date: Thu, 18 Jan 2024 21:18:48 +0000 Subject: [petsc-users] HashMap Error when populating AIJCUSPARSE matrix In-Reply-To: References: Message-ID: Hi Matt, Barry, Apologies for the extra dependency on scipy. I can replicate the error by calling setValue (i,j,v) in a loop as well. In roughly half of 10 runs, the following script fails because of an error in hashmapijv ? the same as my original post. It successfully runs without error the other times. Barry is right that it's CUDA specific. The script runs fine on the CPU. Do you have any suggestions or example scripts on assigning entries to a AIJCUSPARSE matrix? Here is a minimum snippet that doesn't depend on scipy. 
``` from petsc4py import PETSc import numpy as np n = int(5e5); nnz = 3 * np.ones(n, dtype=np.int32) nnz[0] = nnz[-1] = 2 A = PETSc.Mat(comm=PETSc.COMM_WORLD) A.createAIJ(size=[n,n],comm=PETSc.COMM_WORLD,nnz=nnz) A.setType('aijcusparse') A.setValue(0, 0, 2) A.setValue(0, 1, -1) A.setValue(n-1, n-2, -1) A.setValue(n-1, n-1, 2) for index in range(1, n - 1): A.setValue(index, index - 1, -1) A.setValue(index, index, 2) A.setValue(index, index + 1, -1) A.assemble() ``` If it means anything to you, when the hash error occurs, it is for index 67283 after filling 201851 nonzero values. Thank you for your help and suggestions! Anna ________________________________ From: Barry Smith Sent: Thursday, January 18, 2024 2:35 PM To: Yesypenko, Anna Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] HashMap Error when populating AIJCUSPARSE matrix Do you ever get a problem with 'aij` ? Can you run in a loop with 'aij' to confirm it doesn't fail then? Barry On Jan 17, 2024, at 4:51?PM, Yesypenko, Anna wrote: Dear Petsc users/developers, I'm experiencing a bug when using petsc4py with GPU support. It may be my mistake in how I set up a AIJCUSPARSE matrix. For larger matrices, I sometimes encounter a error in assigning matrix values; the error is thrown in PetscHMapIJVQuerySet(). Here is a minimum snippet that populates a sparse tridiagonal matrix. ``` from petsc4py import PETSc from scipy.sparse import diags import numpy as np n = int(5e5); nnz = 3 * np.ones(n, dtype=np.int32); nnz[0] = nnz[-1] = 2 A = PETSc.Mat(comm=PETSc.COMM_WORLD) A.createAIJ(size=[n,n],comm=PETSc.COMM_WORLD,nnz=nnz) A.setType('aijcusparse') tmp = diags([-1,2,-1],[-1,0,+1],shape=(n,n)).tocsr() A.setValuesCSR(tmp.indptr,tmp.indices,tmp.data) ####### this is the line where the error is thrown. A.assemble() ``` The error trace is below: ``` File "petsc4py/PETSc/Mat.pyx", line 2603, in petsc4py.PETSc.Mat.setValuesCSR File "petsc4py/PETSc/petscmat.pxi", line 1039, in petsc4py.PETSc.matsetvalues_csr File "petsc4py/PETSc/petscmat.pxi", line 1032, in petsc4py.PETSc.matsetvalues_ijv petsc4py.PETSc.Error: error code 76 [0] MatSetValues() at /work/06368/annayesy/ls6/petsc/src/mat/interface/matrix.c:1497 [0] MatSetValues_Seq_Hash() at /work/06368/annayesy/ls6/petsc/include/../src/mat/impls/aij/seq/seqhashmatsetvalues.h:52 [0] PetscHMapIJVQuerySet() at /work/06368/annayesy/ls6/petsc/include/petsc/private/hashmapijv.h:10 [0] Error in external library [0] [khash] Assertion: `ret >= 0' failed. ``` If I run the same script a handful of times, it will run without errors eventually. Does anyone have insight on why it is behaving this way? I'm running on a node with 3x NVIDIA A100 PCIE 40GB. Thank you! Anna -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Jan 18 15:28:14 2024 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 18 Jan 2024 16:28:14 -0500 Subject: [petsc-users] HashMap Error when populating AIJCUSPARSE matrix In-Reply-To: References: Message-ID: On Thu, Jan 18, 2024 at 4:18?PM Yesypenko, Anna wrote: > Hi Matt, Barry, > > Apologies for the extra dependency on scipy. I can replicate the error by > calling setValue (i,j,v) in a loop as well. > In roughly half of 10 runs, the following script fails because of an error > in hashmapijv ? the same as my original post. > It successfully runs without error the other times. > > Barry is right that it's CUDA specific. The script runs fine on the CPU. 
> Do you have any suggestions or example scripts on assigning entries to a > AIJCUSPARSE matrix? > Oh, you definitely do not want to be doing this. I believe you would rather 1) Make the CPU matrix and then convert to AIJCUSPARSE. This is efficient. 2) Produce the values on the GPU and call https://petsc.org/main/manualpages/Mat/MatSetPreallocationCOO/ https://petsc.org/main/manualpages/Mat/MatSetValuesCOO/ This is what most people do who are forming matrices directly on the GPU. What you are currently doing is incredibly inefficient, and I think accounts for you running out of memory. It talks back and forth between the CPU and GPU. Thanks, Matt Here is a minimum snippet that doesn't depend on scipy. > ``` > from petsc4py import PETSc > import numpy as np > > n = int(5e5); > nnz = 3 * np.ones(n, dtype=np.int32) > nnz[0] = nnz[-1] = 2 > A = PETSc.Mat(comm=PETSc.COMM_WORLD) > A.createAIJ(size=[n,n],comm=PETSc.COMM_WORLD,nnz=nnz) > A.setType('aijcusparse') > > A.setValue(0, 0, 2) > A.setValue(0, 1, -1) > A.setValue(n-1, n-2, -1) > A.setValue(n-1, n-1, 2) > > for index in range(1, n - 1): > A.setValue(index, index - 1, -1) > A.setValue(index, index, 2) > A.setValue(index, index + 1, -1) > A.assemble() > ``` > If it means anything to you, when the hash error occurs, it is for index > 67283 after filling 201851 nonzero values. > > Thank you for your help and suggestions! > Anna > > ------------------------------ > *From:* Barry Smith > *Sent:* Thursday, January 18, 2024 2:35 PM > *To:* Yesypenko, Anna > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] HashMap Error when populating AIJCUSPARSE > matrix > > > Do you ever get a problem with 'aij` ? Can you run in a loop with > 'aij' to confirm it doesn't fail then? > > > > Barry > > > On Jan 17, 2024, at 4:51?PM, Yesypenko, Anna wrote: > > Dear Petsc users/developers, > > I'm experiencing a bug when using petsc4py with GPU support. It may be my > mistake in how I set up a AIJCUSPARSE matrix. > For larger matrices, I sometimes encounter a error in assigning matrix > values; the error is thrown in PetscHMapIJVQuerySet(). > Here is a minimum snippet that populates a sparse tridiagonal matrix. > > ``` > from petsc4py import PETSc > from scipy.sparse import diags > import numpy as np > > n = int(5e5); > > nnz = 3 * np.ones(n, dtype=np.int32); nnz[0] = nnz[-1] = 2 > A = PETSc.Mat(comm=PETSc.COMM_WORLD) > A.createAIJ(size=[n,n],comm=PETSc.COMM_WORLD,nnz=nnz) > A.setType('aijcusparse') > tmp = diags([-1,2,-1],[-1,0,+1],shape=(n,n)).tocsr() > A.setValuesCSR(tmp.indptr,tmp.indices,tmp.data) > ####### this is the line where the error is thrown. > A.assemble() > ``` > > The error trace is below: > ``` > File "petsc4py/PETSc/Mat.pyx", line 2603, in > petsc4py.PETSc.Mat.setValuesCSR > File "petsc4py/PETSc/petscmat.pxi", line 1039, in > petsc4py.PETSc.matsetvalues_csr > File "petsc4py/PETSc/petscmat.pxi", line 1032, in > petsc4py.PETSc.matsetvalues_ijv > petsc4py.PETSc.Error: error code 76 > [0] MatSetValues() at > /work/06368/annayesy/ls6/petsc/src/mat/interface/matrix.c:1497 > [0] MatSetValues_Seq_Hash() at > /work/06368/annayesy/ls6/petsc/include/../src/mat/impls/aij/seq/seqhashmatsetvalues.h:52 > [0] PetscHMapIJVQuerySet() at > /work/06368/annayesy/ls6/petsc/include/petsc/private/hashmapijv.h:10 > [0] Error in external library > [0] [khash] Assertion: `ret >= 0' failed. > ``` > > If I run the same script a handful of times, it will run without errors > eventually. > Does anyone have insight on why it is behaving this way? 
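To make Matt's two suggestions concrete — the links above point at the C manual pages — here is a hypothetical C sketch of both assembly paths for the same tridiagonal matrix. It is not from the thread, the helper names are made up, and both paths assume a CUDA build of PETSc so that MATAIJCUSPARSE is available:

```c
/* Sketch of the two approaches suggested above (helper names made up). */
#include <petscmat.h>

/* Option 1: assemble a normal AIJ matrix on the CPU, then convert it in place */
static PetscErrorCode BuildByConversion(PetscInt n, Mat *A)
{
  PetscInt i, rstart, rend;

  PetscFunctionBeginUser;
  PetscCall(MatCreateAIJ(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, n, n, 3, NULL, 1, NULL, A));
  PetscCall(MatGetOwnershipRange(*A, &rstart, &rend));
  for (i = rstart; i < rend; i++) {
    if (i > 0) PetscCall(MatSetValue(*A, i, i - 1, -1.0, INSERT_VALUES));
    PetscCall(MatSetValue(*A, i, i, 2.0, INSERT_VALUES));
    if (i < n - 1) PetscCall(MatSetValue(*A, i, i + 1, -1.0, INSERT_VALUES));
  }
  PetscCall(MatAssemblyBegin(*A, MAT_FINAL_ASSEMBLY));
  PetscCall(MatAssemblyEnd(*A, MAT_FINAL_ASSEMBLY));
  PetscCall(MatConvert(*A, MATAIJCUSPARSE, MAT_INPLACE_MATRIX, A));
  PetscFunctionReturn(PETSC_SUCCESS);
}

/* Option 2: COO assembly -- list every (i,j) once, then set all values in one call */
static PetscErrorCode BuildByCOO(PetscInt n, Mat *A)
{
  PetscInt     nloc = PETSC_DECIDE, rstart, rend, i, cnt = 0;
  PetscInt    *coo_i, *coo_j;
  PetscScalar *v;

  PetscFunctionBeginUser;
  PetscCall(PetscSplitOwnership(PETSC_COMM_WORLD, &nloc, &n)); /* same split PETSC_DECIDE gives */
  PetscCallMPI(MPI_Scan(&nloc, &rend, 1, MPIU_INT, MPI_SUM, PETSC_COMM_WORLD));
  rstart = rend - nloc;

  PetscCall(MatCreate(PETSC_COMM_WORLD, A));
  PetscCall(MatSetSizes(*A, nloc, nloc, n, n));
  PetscCall(MatSetType(*A, MATAIJCUSPARSE));

  PetscCall(PetscMalloc3(3 * nloc, &coo_i, 3 * nloc, &coo_j, 3 * nloc, &v));
  for (i = rstart; i < rend; i++) {
    if (i > 0) { coo_i[cnt] = i; coo_j[cnt] = i - 1; v[cnt++] = -1.0; }
    coo_i[cnt] = i; coo_j[cnt] = i; v[cnt++] = 2.0;
    if (i < n - 1) { coo_i[cnt] = i; coo_j[cnt] = i + 1; v[cnt++] = -1.0; }
  }
  PetscCall(MatSetPreallocationCOO(*A, (PetscCount)cnt, coo_i, coo_j));
  PetscCall(MatSetValuesCOO(*A, v, INSERT_VALUES));
  PetscCall(PetscFree3(coo_i, coo_j, v));
  PetscFunctionReturn(PETSC_SUCCESS);
}

int main(int argc, char **argv)
{
  Mat      A, B;
  PetscInt n = 500000; /* same size as the script in the thread */

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  PetscCall(BuildByConversion(n, &A));
  PetscCall(BuildByCOO(n, &B));
  PetscCall(MatDestroy(&A));
  PetscCall(MatDestroy(&B));
  PetscCall(PetscFinalize());
  return 0;
}
```

Either way the entry-by-entry insertion into the GPU matrix type is avoided: option 1 keeps the hash-based insertion entirely on a plain AIJ matrix, and option 2 hands the complete nonzero pattern and values to PETSc in single calls.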
I'm running on a > node with 3x NVIDIA A100 PCIE 40GB. > > Thank you! > Anna > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Thu Jan 18 15:38:59 2024 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 18 Jan 2024 16:38:59 -0500 Subject: [petsc-users] HashMap Error when populating AIJCUSPARSE matrix In-Reply-To: References: Message-ID: <2A5CEB4C-1446-4126-AE8D-A11360B98B2C@petsc.dev> It is using the hash map system for inserting values which only inserts on the CPU, not on the GPU. So I don't see that it would be moving any data to the GPU until the mat assembly() is done which it never gets to. Hence I have trouble understanding why the GPU has anything to do with the crash. I guess I need to try to reproduce it on a GPU system. Barry > On Jan 18, 2024, at 4:28?PM, Matthew Knepley wrote: > > On Thu, Jan 18, 2024 at 4:18?PM Yesypenko, Anna > wrote: >> Hi Matt, Barry, >> >> Apologies for the extra dependency on scipy. I can replicate the error by calling setValue (i,j,v) in a loop as well. >> In roughly half of 10 runs, the following script fails because of an error in hashmapijv ? the same as my original post. >> It successfully runs without error the other times. >> >> Barry is right that it's CUDA specific. The script runs fine on the CPU. >> Do you have any suggestions or example scripts on assigning entries to a AIJCUSPARSE matrix? > > Oh, you definitely do not want to be doing this. I believe you would rather > > 1) Make the CPU matrix and then convert to AIJCUSPARSE. This is efficient. > > 2) Produce the values on the GPU and call > > https://petsc.org/main/manualpages/Mat/MatSetPreallocationCOO/ > https://petsc.org/main/manualpages/Mat/MatSetValuesCOO/ > > This is what most people do who are forming matrices directly on the GPU. > > What you are currently doing is incredibly inefficient, and I think accounts for you running out of memory. > It talks back and forth between the CPU and GPU. > > Thanks, > > Matt > >> Here is a minimum snippet that doesn't depend on scipy. >> ``` >> from petsc4py import PETSc >> import numpy as np >> >> n = int(5e5); >> nnz = 3 * np.ones(n, dtype=np.int32) >> nnz[0] = nnz[-1] = 2 >> A = PETSc.Mat(comm=PETSc.COMM_WORLD) >> A.createAIJ(size=[n,n],comm=PETSc.COMM_WORLD,nnz=nnz) >> A.setType('aijcusparse') >> >> A.setValue(0, 0, 2) >> A.setValue(0, 1, -1) >> A.setValue(n-1, n-2, -1) >> A.setValue(n-1, n-1, 2) >> >> for index in range(1, n - 1): >> A.setValue(index, index - 1, -1) >> A.setValue(index, index, 2) >> A.setValue(index, index + 1, -1) >> A.assemble() >> ``` >> If it means anything to you, when the hash error occurs, it is for index 67283 after filling 201851 nonzero values. >> >> Thank you for your help and suggestions! >> Anna >> >> From: Barry Smith > >> Sent: Thursday, January 18, 2024 2:35 PM >> To: Yesypenko, Anna > >> Cc: petsc-users at mcs.anl.gov > >> Subject: Re: [petsc-users] HashMap Error when populating AIJCUSPARSE matrix >> >> >> Do you ever get a problem with 'aij` ? Can you run in a loop with 'aij' to confirm it doesn't fail then? >> >> >> >> Barry >> >> >>> On Jan 17, 2024, at 4:51?PM, Yesypenko, Anna > wrote: >>> >>> Dear Petsc users/developers, >>> >>> I'm experiencing a bug when using petsc4py with GPU support. 
It may be my mistake in how I set up a AIJCUSPARSE matrix. >>> For larger matrices, I sometimes encounter a error in assigning matrix values; the error is thrown in PetscHMapIJVQuerySet(). >>> Here is a minimum snippet that populates a sparse tridiagonal matrix. >>> >>> ``` >>> from petsc4py import PETSc >>> from scipy.sparse import diags >>> import numpy as np >>> >>> n = int(5e5); >>> >>> nnz = 3 * np.ones(n, dtype=np.int32); nnz[0] = nnz[-1] = 2 >>> A = PETSc.Mat(comm=PETSc.COMM_WORLD) >>> A.createAIJ(size=[n,n],comm=PETSc.COMM_WORLD,nnz=nnz) >>> A.setType('aijcusparse') >>> tmp = diags([-1,2,-1],[-1,0,+1],shape=(n,n)).tocsr() >>> A.setValuesCSR(tmp.indptr,tmp.indices,tmp.data) ####### this is the line where the error is thrown. >>> A.assemble() >>> ``` >>> >>> The error trace is below: >>> ``` >>> File "petsc4py/PETSc/Mat.pyx", line 2603, in petsc4py.PETSc.Mat.setValuesCSR >>> File "petsc4py/PETSc/petscmat.pxi", line 1039, in petsc4py.PETSc.matsetvalues_csr >>> File "petsc4py/PETSc/petscmat.pxi", line 1032, in petsc4py.PETSc.matsetvalues_ijv >>> petsc4py.PETSc.Error: error code 76 >>> [0] MatSetValues() at /work/06368/annayesy/ls6/petsc/src/mat/interface/matrix.c:1497 >>> [0] MatSetValues_Seq_Hash() at /work/06368/annayesy/ls6/petsc/include/../src/mat/impls/aij/seq/seqhashmatsetvalues.h:52 >>> [0] PetscHMapIJVQuerySet() at /work/06368/annayesy/ls6/petsc/include/petsc/private/hashmapijv.h:10 >>> [0] Error in external library >>> [0] [khash] Assertion: `ret >= 0' failed. >>> ``` >>> >>> If I run the same script a handful of times, it will run without errors eventually. >>> Does anyone have insight on why it is behaving this way? I'm running on a node with 3x NVIDIA A100 PCIE 40GB. >>> >>> Thank you! >>> Anna >> > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From anna at oden.utexas.edu Thu Jan 18 15:47:55 2024 From: anna at oden.utexas.edu (Yesypenko, Anna) Date: Thu, 18 Jan 2024 21:47:55 +0000 Subject: [petsc-users] HashMap Error when populating AIJCUSPARSE matrix In-Reply-To: <2A5CEB4C-1446-4126-AE8D-A11360B98B2C@petsc.dev> References: <2A5CEB4C-1446-4126-AE8D-A11360B98B2C@petsc.dev> Message-ID: Hi all, Matt's suggestions worked great! The script works consistently now. What I was doing is a bad way to populate sparse matrices on the GPU ? I'm not sure why it fails but luckily we found a fix. Thank you all for your help and suggestions! Best, Anna ________________________________ From: Barry Smith Sent: Thursday, January 18, 2024 3:38 PM To: Yesypenko, Anna Cc: petsc-users at mcs.anl.gov ; Victor Eijkhout Subject: Re: [petsc-users] HashMap Error when populating AIJCUSPARSE matrix It is using the hash map system for inserting values which only inserts on the CPU, not on the GPU. So I don't see that it would be moving any data to the GPU until the mat assembly() is done which it never gets to. Hence I have trouble understanding why the GPU has anything to do with the crash. I guess I need to try to reproduce it on a GPU system. Barry On Jan 18, 2024, at 4:28?PM, Matthew Knepley wrote: On Thu, Jan 18, 2024 at 4:18?PM Yesypenko, Anna > wrote: Hi Matt, Barry, Apologies for the extra dependency on scipy. I can replicate the error by calling setValue (i,j,v) in a loop as well. 
In roughly half of 10 runs, the following script fails because of an error in hashmapijv ? the same as my original post. It successfully runs without error the other times. Barry is right that it's CUDA specific. The script runs fine on the CPU. Do you have any suggestions or example scripts on assigning entries to a AIJCUSPARSE matrix? Oh, you definitely do not want to be doing this. I believe you would rather 1) Make the CPU matrix and then convert to AIJCUSPARSE. This is efficient. 2) Produce the values on the GPU and call https://petsc.org/main/manualpages/Mat/MatSetPreallocationCOO/ https://petsc.org/main/manualpages/Mat/MatSetValuesCOO/ This is what most people do who are forming matrices directly on the GPU. What you are currently doing is incredibly inefficient, and I think accounts for you running out of memory. It talks back and forth between the CPU and GPU. Thanks, Matt Here is a minimum snippet that doesn't depend on scipy. ``` from petsc4py import PETSc import numpy as np n = int(5e5); nnz = 3 * np.ones(n, dtype=np.int32) nnz[0] = nnz[-1] = 2 A = PETSc.Mat(comm=PETSc.COMM_WORLD) A.createAIJ(size=[n,n],comm=PETSc.COMM_WORLD,nnz=nnz) A.setType('aijcusparse') A.setValue(0, 0, 2) A.setValue(0, 1, -1) A.setValue(n-1, n-2, -1) A.setValue(n-1, n-1, 2) for index in range(1, n - 1): A.setValue(index, index - 1, -1) A.setValue(index, index, 2) A.setValue(index, index + 1, -1) A.assemble() ``` If it means anything to you, when the hash error occurs, it is for index 67283 after filling 201851 nonzero values. Thank you for your help and suggestions! Anna ________________________________ From: Barry Smith > Sent: Thursday, January 18, 2024 2:35 PM To: Yesypenko, Anna > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] HashMap Error when populating AIJCUSPARSE matrix Do you ever get a problem with 'aij` ? Can you run in a loop with 'aij' to confirm it doesn't fail then? Barry On Jan 17, 2024, at 4:51?PM, Yesypenko, Anna > wrote: Dear Petsc users/developers, I'm experiencing a bug when using petsc4py with GPU support. It may be my mistake in how I set up a AIJCUSPARSE matrix. For larger matrices, I sometimes encounter a error in assigning matrix values; the error is thrown in PetscHMapIJVQuerySet(). Here is a minimum snippet that populates a sparse tridiagonal matrix. ``` from petsc4py import PETSc from scipy.sparse import diags import numpy as np n = int(5e5); nnz = 3 * np.ones(n, dtype=np.int32); nnz[0] = nnz[-1] = 2 A = PETSc.Mat(comm=PETSc.COMM_WORLD) A.createAIJ(size=[n,n],comm=PETSc.COMM_WORLD,nnz=nnz) A.setType('aijcusparse') tmp = diags([-1,2,-1],[-1,0,+1],shape=(n,n)).tocsr() A.setValuesCSR(tmp.indptr,tmp.indices,tmp.data) ####### this is the line where the error is thrown. A.assemble() ``` The error trace is below: ``` File "petsc4py/PETSc/Mat.pyx", line 2603, in petsc4py.PETSc.Mat.setValuesCSR File "petsc4py/PETSc/petscmat.pxi", line 1039, in petsc4py.PETSc.matsetvalues_csr File "petsc4py/PETSc/petscmat.pxi", line 1032, in petsc4py.PETSc.matsetvalues_ijv petsc4py.PETSc.Error: error code 76 [0] MatSetValues() at /work/06368/annayesy/ls6/petsc/src/mat/interface/matrix.c:1497 [0] MatSetValues_Seq_Hash() at /work/06368/annayesy/ls6/petsc/include/../src/mat/impls/aij/seq/seqhashmatsetvalues.h:52 [0] PetscHMapIJVQuerySet() at /work/06368/annayesy/ls6/petsc/include/petsc/private/hashmapijv.h:10 [0] Error in external library [0] [khash] Assertion: `ret >= 0' failed. ``` If I run the same script a handful of times, it will run without errors eventually. 
Does anyone have insight on why it is behaving this way? I'm running on a node with 3x NVIDIA A100 PCIE 40GB. Thank you! Anna -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Thu Jan 18 16:43:55 2024 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 18 Jan 2024 17:43:55 -0500 Subject: [petsc-users] HashMap Error when populating AIJCUSPARSE matrix In-Reply-To: <2A5CEB4C-1446-4126-AE8D-A11360B98B2C@petsc.dev> References: <2A5CEB4C-1446-4126-AE8D-A11360B98B2C@petsc.dev> Message-ID: <85B83047-55B2-4E55-B4D1-E8B6DE2FCF51@petsc.dev> Ok, I ran it on an ANL machine with CUDA and it worked fine for many runs, even increased the problem size without producing any problems. Both versions of the Python code. Anna, What version of PETSc are you using? Victor, Does anyone at ANL have access to this TACC system to try to reproduce? Barry > On Jan 18, 2024, at 4:38?PM, Barry Smith wrote: > > > It is using the hash map system for inserting values which only inserts on the CPU, not on the GPU. So I don't see that it would be moving any data to the GPU until the mat assembly() is done which it never gets to. Hence I have trouble understanding why the GPU has anything to do with the crash. > > I guess I need to try to reproduce it on a GPU system. > > Barry > > > > >> On Jan 18, 2024, at 4:28?PM, Matthew Knepley wrote: >> >> On Thu, Jan 18, 2024 at 4:18?PM Yesypenko, Anna > wrote: >>> Hi Matt, Barry, >>> >>> Apologies for the extra dependency on scipy. I can replicate the error by calling setValue (i,j,v) in a loop as well. >>> In roughly half of 10 runs, the following script fails because of an error in hashmapijv ? the same as my original post. >>> It successfully runs without error the other times. >>> >>> Barry is right that it's CUDA specific. The script runs fine on the CPU. >>> Do you have any suggestions or example scripts on assigning entries to a AIJCUSPARSE matrix? >> >> Oh, you definitely do not want to be doing this. I believe you would rather >> >> 1) Make the CPU matrix and then convert to AIJCUSPARSE. This is efficient. >> >> 2) Produce the values on the GPU and call >> >> https://petsc.org/main/manualpages/Mat/MatSetPreallocationCOO/ >> https://petsc.org/main/manualpages/Mat/MatSetValuesCOO/ >> >> This is what most people do who are forming matrices directly on the GPU. >> >> What you are currently doing is incredibly inefficient, and I think accounts for you running out of memory. >> It talks back and forth between the CPU and GPU. >> >> Thanks, >> >> Matt >> >>> Here is a minimum snippet that doesn't depend on scipy. >>> ``` >>> from petsc4py import PETSc >>> import numpy as np >>> >>> n = int(5e5); >>> nnz = 3 * np.ones(n, dtype=np.int32) >>> nnz[0] = nnz[-1] = 2 >>> A = PETSc.Mat(comm=PETSc.COMM_WORLD) >>> A.createAIJ(size=[n,n],comm=PETSc.COMM_WORLD,nnz=nnz) >>> A.setType('aijcusparse') >>> >>> A.setValue(0, 0, 2) >>> A.setValue(0, 1, -1) >>> A.setValue(n-1, n-2, -1) >>> A.setValue(n-1, n-1, 2) >>> >>> for index in range(1, n - 1): >>> A.setValue(index, index - 1, -1) >>> A.setValue(index, index, 2) >>> A.setValue(index, index + 1, -1) >>> A.assemble() >>> ``` >>> If it means anything to you, when the hash error occurs, it is for index 67283 after filling 201851 nonzero values. 
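Matt's other suggestion (assemble as plain 'aij' on the CPU, then convert once) looks roughly like this in petsc4py. A sketch only, assuming the in-place Mat.convert() binding; the loop bounds mirror the snippet above.

```
# Sketch only: assemble an ordinary CPU 'aij' matrix, then convert it once.
from petsc4py import PETSc
import numpy as np

n = int(5e5)
nnz = 3 * np.ones(n, dtype=np.int32)
nnz[0] = nnz[-1] = 2
A = PETSc.Mat().createAIJ([n, n], nnz=nnz, comm=PETSc.COMM_WORLD)
for i in range(n):
    if i > 0:
        A.setValue(i, i - 1, -1.0)
    A.setValue(i, i, 2.0)
    if i < n - 1:
        A.setValue(i, i + 1, -1.0)
A.assemble()
A.convert('aijcusparse')   # one host-to-device transfer after assembly
```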
>>> >>> Thank you for your help and suggestions! >>> Anna >>> >>> From: Barry Smith > >>> Sent: Thursday, January 18, 2024 2:35 PM >>> To: Yesypenko, Anna > >>> Cc: petsc-users at mcs.anl.gov > >>> Subject: Re: [petsc-users] HashMap Error when populating AIJCUSPARSE matrix >>> >>> >>> Do you ever get a problem with 'aij` ? Can you run in a loop with 'aij' to confirm it doesn't fail then? >>> >>> >>> >>> Barry >>> >>> >>>> On Jan 17, 2024, at 4:51?PM, Yesypenko, Anna > wrote: >>>> >>>> Dear Petsc users/developers, >>>> >>>> I'm experiencing a bug when using petsc4py with GPU support. It may be my mistake in how I set up a AIJCUSPARSE matrix. >>>> For larger matrices, I sometimes encounter a error in assigning matrix values; the error is thrown in PetscHMapIJVQuerySet(). >>>> Here is a minimum snippet that populates a sparse tridiagonal matrix. >>>> >>>> ``` >>>> from petsc4py import PETSc >>>> from scipy.sparse import diags >>>> import numpy as np >>>> >>>> n = int(5e5); >>>> >>>> nnz = 3 * np.ones(n, dtype=np.int32); nnz[0] = nnz[-1] = 2 >>>> A = PETSc.Mat(comm=PETSc.COMM_WORLD) >>>> A.createAIJ(size=[n,n],comm=PETSc.COMM_WORLD,nnz=nnz) >>>> A.setType('aijcusparse') >>>> tmp = diags([-1,2,-1],[-1,0,+1],shape=(n,n)).tocsr() >>>> A.setValuesCSR(tmp.indptr,tmp.indices,tmp.data) ####### this is the line where the error is thrown. >>>> A.assemble() >>>> ``` >>>> >>>> The error trace is below: >>>> ``` >>>> File "petsc4py/PETSc/Mat.pyx", line 2603, in petsc4py.PETSc.Mat.setValuesCSR >>>> File "petsc4py/PETSc/petscmat.pxi", line 1039, in petsc4py.PETSc.matsetvalues_csr >>>> File "petsc4py/PETSc/petscmat.pxi", line 1032, in petsc4py.PETSc.matsetvalues_ijv >>>> petsc4py.PETSc.Error: error code 76 >>>> [0] MatSetValues() at /work/06368/annayesy/ls6/petsc/src/mat/interface/matrix.c:1497 >>>> [0] MatSetValues_Seq_Hash() at /work/06368/annayesy/ls6/petsc/include/../src/mat/impls/aij/seq/seqhashmatsetvalues.h:52 >>>> [0] PetscHMapIJVQuerySet() at /work/06368/annayesy/ls6/petsc/include/petsc/private/hashmapijv.h:10 >>>> [0] Error in external library >>>> [0] [khash] Assertion: `ret >= 0' failed. >>>> ``` >>>> >>>> If I run the same script a handful of times, it will run without errors eventually. >>>> Does anyone have insight on why it is behaving this way? I'm running on a node with 3x NVIDIA A100 PCIE 40GB. >>>> >>>> Thank you! >>>> Anna >>> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From anna at oden.utexas.edu Thu Jan 18 17:09:44 2024 From: anna at oden.utexas.edu (Yesypenko, Anna) Date: Thu, 18 Jan 2024 23:09:44 +0000 Subject: [petsc-users] HashMap Error when populating AIJCUSPARSE matrix In-Reply-To: <85B83047-55B2-4E55-B4D1-E8B6DE2FCF51@petsc.dev> References: <2A5CEB4C-1446-4126-AE8D-A11360B98B2C@petsc.dev> <85B83047-55B2-4E55-B4D1-E8B6DE2FCF51@petsc.dev> Message-ID: Hi Barry, I'm using version 3.20.3. The tacc system is lonestar6. Best, Anna ________________________________ From: Barry Smith Sent: Thursday, January 18, 2024 4:43 PM To: Yesypenko, Anna Cc: petsc-users at mcs.anl.gov ; Victor Eijkhout Subject: Re: [petsc-users] HashMap Error when populating AIJCUSPARSE matrix Ok, I ran it on an ANL machine with CUDA and it worked fine for many runs, even increased the problem size without producing any problems. 
Both versions of the Python code. Anna, What version of PETSc are you using? Victor, Does anyone at ANL have access to this TACC system to try to reproduce? Barry On Jan 18, 2024, at 4:38?PM, Barry Smith wrote: It is using the hash map system for inserting values which only inserts on the CPU, not on the GPU. So I don't see that it would be moving any data to the GPU until the mat assembly() is done which it never gets to. Hence I have trouble understanding why the GPU has anything to do with the crash. I guess I need to try to reproduce it on a GPU system. Barry On Jan 18, 2024, at 4:28?PM, Matthew Knepley wrote: On Thu, Jan 18, 2024 at 4:18?PM Yesypenko, Anna > wrote: Hi Matt, Barry, Apologies for the extra dependency on scipy. I can replicate the error by calling setValue (i,j,v) in a loop as well. In roughly half of 10 runs, the following script fails because of an error in hashmapijv ? the same as my original post. It successfully runs without error the other times. Barry is right that it's CUDA specific. The script runs fine on the CPU. Do you have any suggestions or example scripts on assigning entries to a AIJCUSPARSE matrix? Oh, you definitely do not want to be doing this. I believe you would rather 1) Make the CPU matrix and then convert to AIJCUSPARSE. This is efficient. 2) Produce the values on the GPU and call https://petsc.org/main/manualpages/Mat/MatSetPreallocationCOO/ https://petsc.org/main/manualpages/Mat/MatSetValuesCOO/ This is what most people do who are forming matrices directly on the GPU. What you are currently doing is incredibly inefficient, and I think accounts for you running out of memory. It talks back and forth between the CPU and GPU. Thanks, Matt Here is a minimum snippet that doesn't depend on scipy. ``` from petsc4py import PETSc import numpy as np n = int(5e5); nnz = 3 * np.ones(n, dtype=np.int32) nnz[0] = nnz[-1] = 2 A = PETSc.Mat(comm=PETSc.COMM_WORLD) A.createAIJ(size=[n,n],comm=PETSc.COMM_WORLD,nnz=nnz) A.setType('aijcusparse') A.setValue(0, 0, 2) A.setValue(0, 1, -1) A.setValue(n-1, n-2, -1) A.setValue(n-1, n-1, 2) for index in range(1, n - 1): A.setValue(index, index - 1, -1) A.setValue(index, index, 2) A.setValue(index, index + 1, -1) A.assemble() ``` If it means anything to you, when the hash error occurs, it is for index 67283 after filling 201851 nonzero values. Thank you for your help and suggestions! Anna ________________________________ From: Barry Smith > Sent: Thursday, January 18, 2024 2:35 PM To: Yesypenko, Anna > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] HashMap Error when populating AIJCUSPARSE matrix Do you ever get a problem with 'aij` ? Can you run in a loop with 'aij' to confirm it doesn't fail then? Barry On Jan 17, 2024, at 4:51?PM, Yesypenko, Anna > wrote: Dear Petsc users/developers, I'm experiencing a bug when using petsc4py with GPU support. It may be my mistake in how I set up a AIJCUSPARSE matrix. For larger matrices, I sometimes encounter a error in assigning matrix values; the error is thrown in PetscHMapIJVQuerySet(). Here is a minimum snippet that populates a sparse tridiagonal matrix. ``` from petsc4py import PETSc from scipy.sparse import diags import numpy as np n = int(5e5); nnz = 3 * np.ones(n, dtype=np.int32); nnz[0] = nnz[-1] = 2 A = PETSc.Mat(comm=PETSc.COMM_WORLD) A.createAIJ(size=[n,n],comm=PETSc.COMM_WORLD,nnz=nnz) A.setType('aijcusparse') tmp = diags([-1,2,-1],[-1,0,+1],shape=(n,n)).tocsr() A.setValuesCSR(tmp.indptr,tmp.indices,tmp.data) ####### this is the line where the error is thrown. 
A.assemble() ``` The error trace is below: ``` File "petsc4py/PETSc/Mat.pyx", line 2603, in petsc4py.PETSc.Mat.setValuesCSR File "petsc4py/PETSc/petscmat.pxi", line 1039, in petsc4py.PETSc.matsetvalues_csr File "petsc4py/PETSc/petscmat.pxi", line 1032, in petsc4py.PETSc.matsetvalues_ijv petsc4py.PETSc.Error: error code 76 [0] MatSetValues() at /work/06368/annayesy/ls6/petsc/src/mat/interface/matrix.c:1497 [0] MatSetValues_Seq_Hash() at /work/06368/annayesy/ls6/petsc/include/../src/mat/impls/aij/seq/seqhashmatsetvalues.h:52 [0] PetscHMapIJVQuerySet() at /work/06368/annayesy/ls6/petsc/include/petsc/private/hashmapijv.h:10 [0] Error in external library [0] [khash] Assertion: `ret >= 0' failed. ``` If I run the same script a handful of times, it will run without errors eventually. Does anyone have insight on why it is behaving this way? I'm running on a node with 3x NVIDIA A100 PCIE 40GB. Thank you! Anna -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Thu Jan 18 18:10:46 2024 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 18 Jan 2024 19:10:46 -0500 Subject: [petsc-users] HashMap Error when populating AIJCUSPARSE matrix In-Reply-To: References: <2A5CEB4C-1446-4126-AE8D-A11360B98B2C@petsc.dev> <85B83047-55B2-4E55-B4D1-E8B6DE2FCF51@petsc.dev> Message-ID: Thanks. Same version I tried. > On Jan 18, 2024, at 6:09?PM, Yesypenko, Anna wrote: > > Hi Barry, > > I'm using version 3.20.3. The tacc system is lonestar6. > > Best, > Anna > From: Barry Smith > > Sent: Thursday, January 18, 2024 4:43 PM > To: Yesypenko, Anna > > Cc: petsc-users at mcs.anl.gov >; Victor Eijkhout > > Subject: Re: [petsc-users] HashMap Error when populating AIJCUSPARSE matrix > > > Ok, I ran it on an ANL machine with CUDA and it worked fine for many runs, even increased the problem size without producing any problems. Both versions of the Python code. > > Anna, > > What version of PETSc are you using? > > Victor, > > Does anyone at ANL have access to this TACC system to try to reproduce? > > > Barry > > > >> On Jan 18, 2024, at 4:38?PM, Barry Smith > wrote: >> >> >> It is using the hash map system for inserting values which only inserts on the CPU, not on the GPU. So I don't see that it would be moving any data to the GPU until the mat assembly() is done which it never gets to. Hence I have trouble understanding why the GPU has anything to do with the crash. >> >> I guess I need to try to reproduce it on a GPU system. >> >> Barry >> >> >> >> >>> On Jan 18, 2024, at 4:28?PM, Matthew Knepley > wrote: >>> >>> On Thu, Jan 18, 2024 at 4:18?PM Yesypenko, Anna > wrote: >>> Hi Matt, Barry, >>> >>> Apologies for the extra dependency on scipy. I can replicate the error by calling setValue (i,j,v) in a loop as well. >>> In roughly half of 10 runs, the following script fails because of an error in hashmapijv ? the same as my original post. >>> It successfully runs without error the other times. >>> >>> Barry is right that it's CUDA specific. The script runs fine on the CPU. >>> Do you have any suggestions or example scripts on assigning entries to a AIJCUSPARSE matrix? >>> >>> Oh, you definitely do not want to be doing this. I believe you would rather >>> >>> 1) Make the CPU matrix and then convert to AIJCUSPARSE. This is efficient. 
>>> >>> 2) Produce the values on the GPU and call >>> >>> https://petsc.org/main/manualpages/Mat/MatSetPreallocationCOO/ >>> https://petsc.org/main/manualpages/Mat/MatSetValuesCOO/ >>> >>> This is what most people do who are forming matrices directly on the GPU. >>> >>> What you are currently doing is incredibly inefficient, and I think accounts for you running out of memory. >>> It talks back and forth between the CPU and GPU. >>> >>> Thanks, >>> >>> Matt >>> >>> Here is a minimum snippet that doesn't depend on scipy. >>> ``` >>> from petsc4py import PETSc >>> import numpy as np >>> >>> n = int(5e5); >>> nnz = 3 * np.ones(n, dtype=np.int32) >>> nnz[0] = nnz[-1] = 2 >>> A = PETSc.Mat(comm=PETSc.COMM_WORLD) >>> A.createAIJ(size=[n,n],comm=PETSc.COMM_WORLD,nnz=nnz) >>> A.setType('aijcusparse') >>> >>> A.setValue(0, 0, 2) >>> A.setValue(0, 1, -1) >>> A.setValue(n-1, n-2, -1) >>> A.setValue(n-1, n-1, 2) >>> >>> for index in range(1, n - 1): >>> A.setValue(index, index - 1, -1) >>> A.setValue(index, index, 2) >>> A.setValue(index, index + 1, -1) >>> A.assemble() >>> ``` >>> If it means anything to you, when the hash error occurs, it is for index 67283 after filling 201851 nonzero values. >>> >>> Thank you for your help and suggestions! >>> Anna >>> >>> From: Barry Smith > >>> Sent: Thursday, January 18, 2024 2:35 PM >>> To: Yesypenko, Anna > >>> Cc: petsc-users at mcs.anl.gov > >>> Subject: Re: [petsc-users] HashMap Error when populating AIJCUSPARSE matrix >>> >>> >>> Do you ever get a problem with 'aij` ? Can you run in a loop with 'aij' to confirm it doesn't fail then? >>> >>> >>> >>> Barry >>> >>> >>>> On Jan 17, 2024, at 4:51?PM, Yesypenko, Anna > wrote: >>>> >>>> Dear Petsc users/developers, >>>> >>>> I'm experiencing a bug when using petsc4py with GPU support. It may be my mistake in how I set up a AIJCUSPARSE matrix. >>>> For larger matrices, I sometimes encounter a error in assigning matrix values; the error is thrown in PetscHMapIJVQuerySet(). >>>> Here is a minimum snippet that populates a sparse tridiagonal matrix. >>>> >>>> ``` >>>> from petsc4py import PETSc >>>> from scipy.sparse import diags >>>> import numpy as np >>>> >>>> n = int(5e5); >>>> >>>> nnz = 3 * np.ones(n, dtype=np.int32); nnz[0] = nnz[-1] = 2 >>>> A = PETSc.Mat(comm=PETSc.COMM_WORLD) >>>> A.createAIJ(size=[n,n],comm=PETSc.COMM_WORLD,nnz=nnz) >>>> A.setType('aijcusparse') >>>> tmp = diags([-1,2,-1],[-1,0,+1],shape=(n,n)).tocsr() >>>> A.setValuesCSR(tmp.indptr,tmp.indices,tmp.data) ####### this is the line where the error is thrown. >>>> A.assemble() >>>> ``` >>>> >>>> The error trace is below: >>>> ``` >>>> File "petsc4py/PETSc/Mat.pyx", line 2603, in petsc4py.PETSc.Mat.setValuesCSR >>>> File "petsc4py/PETSc/petscmat.pxi", line 1039, in petsc4py.PETSc.matsetvalues_csr >>>> File "petsc4py/PETSc/petscmat.pxi", line 1032, in petsc4py.PETSc.matsetvalues_ijv >>>> petsc4py.PETSc.Error: error code 76 >>>> [0] MatSetValues() at /work/06368/annayesy/ls6/petsc/src/mat/interface/matrix.c:1497 >>>> [0] MatSetValues_Seq_Hash() at /work/06368/annayesy/ls6/petsc/include/../src/mat/impls/aij/seq/seqhashmatsetvalues.h:52 >>>> [0] PetscHMapIJVQuerySet() at /work/06368/annayesy/ls6/petsc/include/petsc/private/hashmapijv.h:10 >>>> [0] Error in external library >>>> [0] [khash] Assertion: `ret >= 0' failed. >>>> ``` >>>> >>>> If I run the same script a handful of times, it will run without errors eventually. >>>> Does anyone have insight on why it is behaving this way? I'm running on a node with 3x NVIDIA A100 PCIE 40GB. 
>>>> >>>> Thank you! >>>> Anna >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From yuanxi at advancesoft.jp Thu Jan 18 20:17:41 2024 From: yuanxi at advancesoft.jp (=?UTF-8?B?6KKB54WV?=) Date: Fri, 19 Jan 2024 11:17:41 +0900 Subject: [petsc-users] MatAssemblyBegin freezes during MPI communication In-Reply-To: References: Message-ID: Thanks for your explanation. It seems that it is due to my calling of MatDiagonalSet() before MatAssemblyBegin(). My problem is resolved by putting MatDiagonalSet() after MatAssemblyBegin(). Much thanks for your help. Xi YUAN, PhD Solid Mechanics 2024?1?18?(?) 22:20 Junchao Zhang : > > > > On Thu, Jan 18, 2024 at 1:47?AM ?? wrote: > >> Dear PETSc Experts, >> >> My FEM program works well generally, but in some specific cases with >> multiple CPUs are used, it freezes when calling MatAssemblyBegin where >> PMPI_Allreduce is called (see attached file). >> >> After some investigation, I found that it is most probably due to >> >> ? MatSetValue is not called from all CPUs before MatAssemblyBegin >> >> For example, when 4 CPUs are used, if there are elements in CPU 0,1,2 but >> no elements in CPU 3, then all CPUs other than CPU 3 would call >> MatSetValue function. I want to know >> >> 1. If my conjecture could be right? And If so >> > No. All processes do MPI_Allreduce to know if there are incoming values > set by others. To know why hanging, you can attach gdb to all MPI > processes to see where they are. > >> >> > 2. Are there any convenient means to avoid this problem? >> >> Thanks, >> Xi YUAN, PhD Solid Mechanics >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pmoschopoulos at outlook.com Fri Jan 19 02:19:48 2024 From: pmoschopoulos at outlook.com (Pantelis Moschopoulos) Date: Fri, 19 Jan 2024 08:19:48 +0000 Subject: [petsc-users] Question about a parallel implementation of PCFIELDSPLIT Message-ID: Dear all, When I am using PCFIELDSPLIT and pc type "schur" in serial mode everything works fine. When I turn now to parallel, I observe that the number of ranks that I can use must divide the number of N without any remainder, where N is the number of unknowns. Otherwise, an error of the following form emerges: "Local columns of A10 3473 do not equal local rows of A00 3471". Can I do something to overcome this? Thanks, Pantelis -------------- next part -------------- An HTML attachment was scrubbed... URL: From pjool at dtu.dk Fri Jan 19 03:35:30 2024 From: pjool at dtu.dk (=?iso-8859-1?Q?Peder_J=F8rgensgaard_Olesen?=) Date: Fri, 19 Jan 2024 09:35:30 +0000 Subject: [petsc-users] ScaLAPACK EPS error In-Reply-To: References: Message-ID: The attached code reproduces the error on my system. Best, Peder ________________________________ Fra: Peder J?rgensgaard Olesen Sendt: 18. januar 2024 21:06 Til: Jose E. Roman Cc: petsc-users at mcs.anl.gov Emne: Sv: [petsc-users] ScaLAPACK EPS error I set up the matrix using MatCreateDense(), passing PETSC_DECIDE for the local dimensions. The same error appears with 8, 12, and 16 nodes (32 proc/node). I'll have to get back to you regarding a minimal example. Best, Peder ________________________________ Fra: Jose E. Roman Sendt: 18. 
januar 2024 19:28 Til: Peder J?rgensgaard Olesen Cc: petsc-users at mcs.anl.gov Emne: Re: [petsc-users] ScaLAPACK EPS error How are you setting up your input matrix? Are you giving the local sizes or setting them to PETSC_DECIDE? Do you get the same error for different number of MPI processes? Can you send a small code reproducing the error? Jose > El 18 ene 2024, a las 18:59, Peder J?rgensgaard Olesen via petsc-users escribi?: > > Hello, > > I need to determine the full set of eigenpairs to a rather large (N=16,000) dense Hermitian matrix. I've managed to do this using SLEPc's standard Krylov-Schur EPS, but I think it could be done more efficiently using ScaLAPACK. I receive the following error when attempting this. As I understand it, descinit is used to initialize an array, and the variable in question designates the leading dimension of the array, for which it seems an illegal value is somehow passed. > > I know ScaLAPACK is an external package, but it seems as if the error would be in the call from SLEPc. Any ideas as to what could cause this? > > Thanks, > Peder > > Error message (excerpt): > > PETSC ERROR: #1 MatConvert_Dense_ScaLAPACK() at [...]/matscalapack.c:1032 > PETSC ERROR: #2 MatConvert at [...]/matrix.c:4250 > PETSC ERROR: #3 EPSSetUp_ScaLAPACK() at [...]/scalapack.c:47 > PETSC ERROR: #4 EPSSetUp() at [...]/epssetup.c:323 > PETSC ERROR: #5 EPSSolve at [...]/epssolve.c:134 > PETSC ERROR: ------ Error message ------ > PETSC ERROR: Error in external library > PETSC ERROR: Error in ScaLAPACK subroutine descinit: info=-9 > (...) > > Log file (excerpt): > { 357, 0}: On entry to DESCINIT parameter number 9 had an illegal value > [and a few hundred lines similar to this] -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: mre.c URL: From jroman at dsic.upv.es Fri Jan 19 05:37:18 2024 From: jroman at dsic.upv.es (Jose E. Roman) Date: Fri, 19 Jan 2024 12:37:18 +0100 Subject: [petsc-users] ScaLAPACK EPS error In-Reply-To: References: Message-ID: I tried your example with both real and complex arithmetic, with version 3.20. I did not finish the computation, but I can confirm that the EPSSetUp() phase completes without errors. There is a fix that was made in PETSc around version 3.17, it may affect you if you don't have it: https://gitlab.com/petsc/petsc/-/merge_requests/5441 I would suggest upgrading to the latest version. Jose > El 19 ene 2024, a las 10:35, Peder J?rgensgaard Olesen escribi?: > > The attached code reproduces the error on my system. > > Best, > PederFra: Peder J?rgensgaard Olesen > Sendt: 18. januar 2024 21:06 > Til: Jose E. Roman > Cc: petsc-users at mcs.anl.gov > Emne: Sv: [petsc-users] ScaLAPACK EPS error > I set up the matrix using MatCreateDense(), passing PETSC_DECIDE for the local dimensions. > > The same error appears with 8, 12, and 16 nodes (32 proc/node). > > I'll have to get back to you regarding a minimal example. > > Best, > PederFra: Jose E. Roman > Sendt: 18. januar 2024 19:28 > Til: Peder J?rgensgaard Olesen > Cc: petsc-users at mcs.anl.gov > Emne: Re: [petsc-users] ScaLAPACK EPS error > How are you setting up your input matrix? Are you giving the local sizes or setting them to PETSC_DECIDE? > Do you get the same error for different number of MPI processes? > Can you send a small code reproducing the error? 
> > Jose > > > > El 18 ene 2024, a las 18:59, Peder J?rgensgaard Olesen via petsc-users escribi?: > > > > Hello, > > > > I need to determine the full set of eigenpairs to a rather large (N=16,000) dense Hermitian matrix. I've managed to do this using SLEPc's standard Krylov-Schur EPS, but I think it could be done more efficiently using ScaLAPACK. I receive the following error when attempting this. As I understand it, descinit is used to initialize an array, and the variable in question designates the leading dimension of the array, for which it seems an illegal value is somehow passed. > > > > I know ScaLAPACK is an external package, but it seems as if the error would be in the call from SLEPc. Any ideas as to what could cause this? > > > > Thanks, > > Peder > > > > Error message (excerpt): > > > > PETSC ERROR: #1 MatConvert_Dense_ScaLAPACK() at [...]/matscalapack.c:1032 > > PETSC ERROR: #2 MatConvert at [...]/matrix.c:4250 > > PETSC ERROR: #3 EPSSetUp_ScaLAPACK() at [...]/scalapack.c:47 > > PETSC ERROR: #4 EPSSetUp() at [...]/epssetup.c:323 > > PETSC ERROR: #5 EPSSolve at [...]/epssolve.c:134 > > PETSC ERROR: ------ Error message ------ > > PETSC ERROR: Error in external library > > PETSC ERROR: Error in ScaLAPACK subroutine descinit: info=-9 > > (...) > > > > Log file (excerpt): > > { 357, 0}: On entry to DESCINIT parameter number 9 had an illegal value > > [and a few hundred lines similar to this] > > > From bsmith at petsc.dev Fri Jan 19 09:28:33 2024 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 19 Jan 2024 10:28:33 -0500 Subject: [petsc-users] Question about a parallel implementation of PCFIELDSPLIT In-Reply-To: References: Message-ID: Generally fieldsplit is used on problems that have a natural "split" of the variables into two or more subsets. For example u0,v0,u1,v1,u2,v2,u3,v4 This is often indicated in the vectors and matrices with the "blocksize" argument, 2 in this case. DM also often provides this information. When laying out a vector/matrix with a blocksize one must ensure that an equal number of of the subsets appears on each MPI process. So, for example, if the above vector is distributed over 3 MPI processes one could use u0,v0,u1,v1 u2,v2 u3,v3 but one cannot use u0,v0,u1 v1,u2,v2 u3,v3. Another way to think about it is that one must split up the vector as indexed by block among the processes. For most multicomponent problems this type of decomposition is very natural in the logic of the code. Barry > On Jan 19, 2024, at 3:19?AM, Pantelis Moschopoulos wrote: > > Dear all, > > When I am using PCFIELDSPLIT and pc type "schur" in serial mode everything works fine. When I turn now to parallel, I observe that the number of ranks that I can use must divide the number of N without any remainder, where N is the number of unknowns. Otherwise, an error of the following form emerges: "Local columns of A10 3473 do not equal local rows of A00 3471". > > Can I do something to overcome this? > > Thanks, > Pantelis -------------- next part -------------- An HTML attachment was scrubbed... 
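To make the layout requirement concrete, here is a minimal petsc4py sketch that distributes whole blocks across ranks; the two-field interpretation (bs = 2) and the problem size are assumptions made for the example.

```
# Sketch only: split the blocks, not the scalar unknowns, so a (u_i, v_i) pair
# never straddles two ranks.
from petsc4py import PETSc

comm = PETSc.COMM_WORLD
rank, size = comm.getRank(), comm.getSize()

bs = 2                    # two interleaved fields, u and v
nblocks = 100000          # hypothetical number of nodes (u_i, v_i)

local_blocks = nblocks // size + (1 if rank < nblocks % size else 0)

x = PETSc.Vec().create(comm=comm)
x.setSizes((local_blocks * bs, nblocks * bs))   # local size is a multiple of bs
x.setBlockSize(bs)
x.setUp()
```

Giving the matrix the same local row and column sizes keeps the A00 and A10 blocks of the Schur factorization aligned, which is what the "Local columns of A10 ... do not equal local rows of A00 ..." message is complaining about.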
URL: From dave.mayhem23 at gmail.com Fri Jan 19 11:35:17 2024 From: dave.mayhem23 at gmail.com (Dave May) Date: Fri, 19 Jan 2024 09:35:17 -0800 Subject: [petsc-users] Trying to understand -log_view when using HIP kernels (ex34) Message-ID: Hi all, I am trying to understand the logging information associated with the %flops-performed-on-the-gpu reported by -log_view when running src/ksp/ksp/tutorials/ex34 with the following options -da_grid_x 192 -da_grid_y 192 -da_grid_z 192 -dm_mat_type seqaijhipsparse -dm_vec_type seqhip -ksp_max_it 10 -ksp_monitor -ksp_type richardson -ksp_view -log_view -mg_coarse_ksp_max_it 2 -mg_coarse_ksp_type richardson -mg_coarse_pc_type none -mg_levels_ksp_type richardson -mg_levels_pc_type none -options_left -pc_mg_levels 3 -pc_mg_log -pc_type mg This config is not intended to actually solve the problem, rather it is a stripped down set of options designed to understand what parts of the smoothers are being executed on the GPU. With respect to the log file attached, my first set of questions related to the data reported under "Event Stage 2: MG Apply". [1] Why is the log littered with nan's? * I don't understand how and why "GPU Mflop/s" should be reported as nan when a value is given for "GPU %F" (see MatMult for example). * For events executed on the GPU, I assume the column "Time (sec)" relates to "CPU execute time", this would explain why we see a nan in "Time (sec)" for MatMult. If my assumption is correct, how should I interpret the column "Flop (Max)" which is showing 1.92e+09? I would assume of "Time (sec)" relates to the CPU then "Flop (Max)" should also relate to CPU and GPU flops would be logged in "GPU Mflop/s" [2] More curious is that within "Event Stage 2: MG Apply" KSPSolve, MGSmooth Level 0, MGSmooth Level 1, MGSmooth Level 2 all report "GPU %F" as 93. I believe this value should be 100 as the smoother (and coarse grid solver) are configured as richardson(2)+none and thus should run entirely on the GPU. Furthermore, when one inspects all events listed under "Event Stage 2: MG Apply" those events which do flops correctly report "GPU %F" as 100. And the events showing "GPU %F" = 0 such as MatHIPSPARSCopyTo, VecCopy, VecSet, PCApply, DCtxSync don't do any flops (on the CPU or GPU) - which is also correct (although non GPU events should show nan??) Hence I am wondering what is the explanation for the missing 7% from "GPU %F" for KSPSolve and MGSmooth {0,1,2}?? Does anyone understand this -log_view, or can explain to me how to interpret it? It could simply be that: a) something is messed up with -pc_mg_log b) something is messed up with the PETSc build c) I am putting too much faith in -log_view and should profile the code differently. Either way I'd really like to understand what is going on. Cheers, Dave -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ex34_192_mg_seqhip_richardson_pcnone.o5748667 Type: application/octet-stream Size: 25092 bytes Desc: not available URL: From bsmith at petsc.dev Fri Jan 19 13:26:41 2024 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 19 Jan 2024 14:26:41 -0500 Subject: [petsc-users] Trying to understand -log_view when using HIP kernels (ex34) In-Reply-To: References: Message-ID: <0D321AB8-9E8F-4484-8D52-AE39FCCD8644@petsc.dev> Nans indicate we do not have valid computational times for these operations; think of them as Not Available. 
Providing valid times for the "inner" operations listed with Nans requires inaccurate times (higher) for the outer operations, since extra synchronization between the CPU and GPU must be done to get valid times for the inner options. We opted to have the best valid times for the outer operations since those times reflect the time of the application. > On Jan 19, 2024, at 12:35?PM, Dave May wrote: > > Hi all, > > I am trying to understand the logging information associated with the %flops-performed-on-the-gpu reported by -log_view when running > src/ksp/ksp/tutorials/ex34 > with the following options > -da_grid_x 192 > -da_grid_y 192 > -da_grid_z 192 > -dm_mat_type seqaijhipsparse > -dm_vec_type seqhip > -ksp_max_it 10 > -ksp_monitor > -ksp_type richardson > -ksp_view > -log_view > -mg_coarse_ksp_max_it 2 > -mg_coarse_ksp_type richardson > -mg_coarse_pc_type none > -mg_levels_ksp_type richardson > -mg_levels_pc_type none > -options_left > -pc_mg_levels 3 > -pc_mg_log > -pc_type mg > > This config is not intended to actually solve the problem, rather it is a stripped down set of options designed to understand what parts of the smoothers are being executed on the GPU. > > With respect to the log file attached, my first set of questions related to the data reported under "Event Stage 2: MG Apply". > > [1] Why is the log littered with nan's? > * I don't understand how and why "GPU Mflop/s" should be reported as nan when a value is given for "GPU %F" (see MatMult for example). > > * For events executed on the GPU, I assume the column "Time (sec)" relates to "CPU execute time", this would explain why we see a nan in "Time (sec)" for MatMult. > If my assumption is correct, how should I interpret the column "Flop (Max)" which is showing 1.92e+09? > I would assume of "Time (sec)" relates to the CPU then "Flop (Max)" should also relate to CPU and GPU flops would be logged in "GPU Mflop/s" > > [2] More curious is that within "Event Stage 2: MG Apply" KSPSolve, MGSmooth Level 0, MGSmooth Level 1, MGSmooth Level 2 all report "GPU %F" as 93. I believe this value should be 100 as the smoother (and coarse grid solver) are configured as richardson(2)+none and thus should run entirely on the GPU. > Furthermore, when one inspects all events listed under "Event Stage 2: MG Apply" those events which do flops correctly report "GPU %F" as 100. > And the events showing "GPU %F" = 0 such as > MatHIPSPARSCopyTo, VecCopy, VecSet, PCApply, DCtxSync > don't do any flops (on the CPU or GPU) - which is also correct (although non GPU events should show nan??) > > Hence I am wondering what is the explanation for the missing 7% from "GPU %F" for KSPSolve and MGSmooth {0,1,2}?? > > Does anyone understand this -log_view, or can explain to me how to interpret it? > > It could simply be that: > a) something is messed up with -pc_mg_log > b) something is messed up with the PETSc build > c) I am putting too much faith in -log_view and should profile the code differently. > > Either way I'd really like to understand what is going on. 
> > > Cheers, > Dave > > > > From junchao.zhang at gmail.com Fri Jan 19 13:39:00 2024 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Fri, 19 Jan 2024 13:39:00 -0600 Subject: [petsc-users] Trying to understand -log_view when using HIP kernels (ex34) In-Reply-To: References: Message-ID: Try to also add -log_view_gpu_time, https://petsc.org/release/manualpages/Profiling/PetscLogGpuTime/ --Junchao Zhang On Fri, Jan 19, 2024 at 11:35?AM Dave May wrote: > Hi all, > > I am trying to understand the logging information associated with the > %flops-performed-on-the-gpu reported by -log_view when running > src/ksp/ksp/tutorials/ex34 > with the following options > -da_grid_x 192 > -da_grid_y 192 > -da_grid_z 192 > -dm_mat_type seqaijhipsparse > -dm_vec_type seqhip > -ksp_max_it 10 > -ksp_monitor > -ksp_type richardson > -ksp_view > -log_view > -mg_coarse_ksp_max_it 2 > -mg_coarse_ksp_type richardson > -mg_coarse_pc_type none > -mg_levels_ksp_type richardson > -mg_levels_pc_type none > -options_left > -pc_mg_levels 3 > -pc_mg_log > -pc_type mg > > This config is not intended to actually solve the problem, rather it is a > stripped down set of options designed to understand what parts of the > smoothers are being executed on the GPU. > > With respect to the log file attached, my first set of questions related > to the data reported under "Event Stage 2: MG Apply". > > [1] Why is the log littered with nan's? > * I don't understand how and why "GPU Mflop/s" should be reported as nan > when a value is given for "GPU %F" (see MatMult for example). > > * For events executed on the GPU, I assume the column "Time (sec)" relates > to "CPU execute time", this would explain why we see a nan in "Time (sec)" > for MatMult. > If my assumption is correct, how should I interpret the column "Flop > (Max)" which is showing 1.92e+09? > I would assume of "Time (sec)" relates to the CPU then "Flop (Max)" should > also relate to CPU and GPU flops would be logged in "GPU Mflop/s" > > [2] More curious is that within "Event Stage 2: MG Apply" KSPSolve, > MGSmooth Level 0, MGSmooth Level 1, MGSmooth Level 2 all report "GPU %F" as > 93. I believe this value should be 100 as the smoother (and coarse grid > solver) are configured as richardson(2)+none and thus should run entirely > on the GPU. > Furthermore, when one inspects all events listed under "Event Stage 2: MG > Apply" those events which do flops correctly report "GPU %F" as 100. > And the events showing "GPU %F" = 0 such as > MatHIPSPARSCopyTo, VecCopy, VecSet, PCApply, DCtxSync > don't do any flops (on the CPU or GPU) - which is also correct > (although non GPU events should show nan??) > > Hence I am wondering what is the explanation for the missing 7% from "GPU > %F" for KSPSolve and MGSmooth {0,1,2}?? > > Does anyone understand this -log_view, or can explain to me how to > interpret it? > > It could simply be that: > a) something is messed up with -pc_mg_log > b) something is messed up with the PETSc build > c) I am putting too much faith in -log_view and should profile the code > differently. > > Either way I'd really like to understand what is going on. > > > Cheers, > Dave > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
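For anyone driving a similar solve from petsc4py rather than the ex34 command line, the option Junchao mentions can be injected before PETSc is initialized. A small sketch; the remaining solver options are assumed to come from the command line as before.

```
# Sketch only: turn on per-kernel GPU timings for -log_view from a petsc4py driver.
import sys
import petsc4py
petsc4py.init(sys.argv + ['-log_view', '-log_view_gpu_time'])   # must run before the PETSc import
from petsc4py import PETSc
```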
URL: From dave.mayhem23 at gmail.com Fri Jan 19 14:02:04 2024 From: dave.mayhem23 at gmail.com (Dave May) Date: Fri, 19 Jan 2024 12:02:04 -0800 Subject: [petsc-users] Trying to understand -log_view when using HIP kernels (ex34) In-Reply-To: References: Message-ID: Thank you Barry and Junchao for these explanations. I'll turn on -log_view_gpu_time. Do either of you have any thoughts regarding why the percentage of flop's being reported on the GPU is not 100% for MGSmooth Level {0,1,2} for this solver configuration? This number should have nothing to do with timings as it reports the ratio of operations performed on the GPU and CPU, presumably obtained from PetscLogFlops() and PetscLogGpuFlops(). Cheers, Dave On Fri, 19 Jan 2024 at 11:39, Junchao Zhang wrote: > Try to also add -log_view_gpu_time, > https://petsc.org/release/manualpages/Profiling/PetscLogGpuTime/ > > --Junchao Zhang > > > On Fri, Jan 19, 2024 at 11:35?AM Dave May wrote: > >> Hi all, >> >> I am trying to understand the logging information associated with the >> %flops-performed-on-the-gpu reported by -log_view when running >> src/ksp/ksp/tutorials/ex34 >> with the following options >> -da_grid_x 192 >> -da_grid_y 192 >> -da_grid_z 192 >> -dm_mat_type seqaijhipsparse >> -dm_vec_type seqhip >> -ksp_max_it 10 >> -ksp_monitor >> -ksp_type richardson >> -ksp_view >> -log_view >> -mg_coarse_ksp_max_it 2 >> -mg_coarse_ksp_type richardson >> -mg_coarse_pc_type none >> -mg_levels_ksp_type richardson >> -mg_levels_pc_type none >> -options_left >> -pc_mg_levels 3 >> -pc_mg_log >> -pc_type mg >> >> This config is not intended to actually solve the problem, rather it is a >> stripped down set of options designed to understand what parts of the >> smoothers are being executed on the GPU. >> >> With respect to the log file attached, my first set of questions related >> to the data reported under "Event Stage 2: MG Apply". >> >> [1] Why is the log littered with nan's? >> * I don't understand how and why "GPU Mflop/s" should be reported as nan >> when a value is given for "GPU %F" (see MatMult for example). >> >> * For events executed on the GPU, I assume the column "Time (sec)" >> relates to "CPU execute time", this would explain why we see a nan in "Time >> (sec)" for MatMult. >> If my assumption is correct, how should I interpret the column "Flop >> (Max)" which is showing 1.92e+09? >> I would assume of "Time (sec)" relates to the CPU then "Flop (Max)" >> should also relate to CPU and GPU flops would be logged in "GPU Mflop/s" >> >> [2] More curious is that within "Event Stage 2: MG Apply" KSPSolve, >> MGSmooth Level 0, MGSmooth Level 1, MGSmooth Level 2 all report "GPU %F" as >> 93. I believe this value should be 100 as the smoother (and coarse grid >> solver) are configured as richardson(2)+none and thus should run entirely >> on the GPU. >> Furthermore, when one inspects all events listed under "Event Stage 2: MG >> Apply" those events which do flops correctly report "GPU %F" as 100. >> And the events showing "GPU %F" = 0 such as >> MatHIPSPARSCopyTo, VecCopy, VecSet, PCApply, DCtxSync >> don't do any flops (on the CPU or GPU) - which is also correct >> (although non GPU events should show nan??) >> >> Hence I am wondering what is the explanation for the missing 7% from "GPU >> %F" for KSPSolve and MGSmooth {0,1,2}?? >> >> Does anyone understand this -log_view, or can explain to me how to >> interpret it? 
>> >> It could simply be that: >> a) something is messed up with -pc_mg_log >> b) something is messed up with the PETSc build >> c) I am putting too much faith in -log_view and should profile the code >> differently. >> >> Either way I'd really like to understand what is going on. >> >> >> Cheers, >> Dave >> >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Fri Jan 19 14:17:48 2024 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 19 Jan 2024 15:17:48 -0500 Subject: [petsc-users] Trying to understand -log_view when using HIP kernels (ex34) In-Reply-To: References: Message-ID: <1C095AF6-5F15-42D7-A831-F97D14A23725@petsc.dev> Junchao I run the following on the CI machine, why does this happen? With trivial solver options it runs ok. bsmith at petsc-gpu-02:/scratch/bsmith/petsc/src/ksp/ksp/tutorials$ ./ex34 -da_grid_x 192 -da_grid_y 192 -da_grid_z 192 -dm_mat_type seqaijhipsparse -dm_vec_type seqhip -ksp_max_it 10 -ksp_monitor -ksp_type richardson -ksp_view -log_view -mg_coarse_ksp_max_it 2 -mg_coarse_ksp_type richardson -mg_coarse_pc_type none -mg_levels_ksp_type richardson -mg_levels_pc_type none -options_left -pc_mg_levels 3 -pc_mg_log -pc_type mg [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: GPU error [0]PETSC ERROR: hipSPARSE errorcode 3 (HIPSPARSE_STATUS_INVALID_VALUE) [0]PETSC ERROR: WARNING! There are unused option(s) set! Could be the program crashed before usage or a spelling mistake, etc! [0]PETSC ERROR: Option left: name:-options_left (no value) source: command line [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.20.3, unknown [0]PETSC ERROR: ./ex34 on a named petsc-gpu-02 by bsmith Fri Jan 19 14:15:20 2024 [0]PETSC ERROR: Configure options --package-prefix-hash=/home/bsmith/petsc-hash-pkgs --with-make-np=24 --with-make-test-np=8 --with-hipc=/opt/rocm-5.4.3/bin/hipcc --with-hip-dir=/opt/rocm-5.4.3 COPTFLAGS="-g -O" FOPTFLAGS="-g -O" CXXOPTFLAGS="-g -O" HIPOPTFLAGS="-g -O" --with-cuda=0 --with-hip=1 --with-precision=double --with-clanguage=c --download-kokkos --download-kokkos-kernels --download-hypre --download-magma --with-magma-fortran-bindings=0 --download-mfem --download-metis --with-strict-petscerrorcode PETSC_ARCH=arch-ci-linux-hip-double [0]PETSC ERROR: #1 MatMultAddKernel_SeqAIJHIPSPARSE() at /scratch/bsmith/petsc/src/mat/impls/aij/seq/seqhipsparse/aijhipsparse.hip.cpp:3131 [0]PETSC ERROR: #2 MatMultAdd_SeqAIJHIPSPARSE() at /scratch/bsmith/petsc/src/mat/impls/aij/seq/seqhipsparse/aijhipsparse.hip.cpp:3004 [0]PETSC ERROR: #3 MatMultAdd() at /scratch/bsmith/petsc/src/mat/interface/matrix.c:2770 [0]PETSC ERROR: #4 MatInterpolateAdd() at /scratch/bsmith/petsc/src/mat/interface/matrix.c:8603 [0]PETSC ERROR: #5 PCMGMCycle_Private() at /scratch/bsmith/petsc/src/ksp/pc/impls/mg/mg.c:87 [0]PETSC ERROR: #6 PCMGMCycle_Private() at /scratch/bsmith/petsc/src/ksp/pc/impls/mg/mg.c:83 [0]PETSC ERROR: #7 PCApply_MG_Internal() at /scratch/bsmith/petsc/src/ksp/pc/impls/mg/mg.c:611 [0]PETSC ERROR: #8 PCApply_MG() at /scratch/bsmith/petsc/src/ksp/pc/impls/mg/mg.c:633 [0]PETSC ERROR: #9 PCApply() at /scratch/bsmith/petsc/src/ksp/pc/interface/precon.c:498 [0]PETSC ERROR: #10 KSP_PCApply() at /scratch/bsmith/petsc/include/petsc/private/kspimpl.h:383 [0]PETSC ERROR: #11 KSPSolve_Richardson() at /scratch/bsmith/petsc/src/ksp/ksp/impls/rich/rich.c:106 [0]PETSC ERROR: #12 
KSPSolve_Private() at /scratch/bsmith/petsc/src/ksp/ksp/interface/itfunc.c:906 [0]PETSC ERROR: #13 KSPSolve() at /scratch/bsmith/petsc/src/ksp/ksp/interface/itfunc.c:1079 [0]PETSC ERROR: #14 main() at ex34.c:52 [0]PETSC ERROR: PETSc Option Table entries: Dave, Trying to debug the 7% now, but having trouble running, as you see above. > On Jan 19, 2024, at 3:02?PM, Dave May wrote: > > Thank you Barry and Junchao for these explanations. I'll turn on -log_view_gpu_time. > > Do either of you have any thoughts regarding why the percentage of flop's being reported on the GPU is not 100% for MGSmooth Level {0,1,2} for this solver configuration? > > This number should have nothing to do with timings as it reports the ratio of operations performed on the GPU and CPU, presumably obtained from PetscLogFlops() and PetscLogGpuFlops(). > > Cheers, > Dave > > On Fri, 19 Jan 2024 at 11:39, Junchao Zhang > wrote: >> Try to also add -log_view_gpu_time, https://petsc.org/release/manualpages/Profiling/PetscLogGpuTime/ >> >> --Junchao Zhang >> >> >> On Fri, Jan 19, 2024 at 11:35?AM Dave May > wrote: >>> Hi all, >>> >>> I am trying to understand the logging information associated with the %flops-performed-on-the-gpu reported by -log_view when running >>> src/ksp/ksp/tutorials/ex34 >>> with the following options >>> -da_grid_x 192 >>> -da_grid_y 192 >>> -da_grid_z 192 >>> -dm_mat_type seqaijhipsparse >>> -dm_vec_type seqhip >>> -ksp_max_it 10 >>> -ksp_monitor >>> -ksp_type richardson >>> -ksp_view >>> -log_view >>> -mg_coarse_ksp_max_it 2 >>> -mg_coarse_ksp_type richardson >>> -mg_coarse_pc_type none >>> -mg_levels_ksp_type richardson >>> -mg_levels_pc_type none >>> -options_left >>> -pc_mg_levels 3 >>> -pc_mg_log >>> -pc_type mg >>> >>> This config is not intended to actually solve the problem, rather it is a stripped down set of options designed to understand what parts of the smoothers are being executed on the GPU. >>> >>> With respect to the log file attached, my first set of questions related to the data reported under "Event Stage 2: MG Apply". >>> >>> [1] Why is the log littered with nan's? >>> * I don't understand how and why "GPU Mflop/s" should be reported as nan when a value is given for "GPU %F" (see MatMult for example). >>> >>> * For events executed on the GPU, I assume the column "Time (sec)" relates to "CPU execute time", this would explain why we see a nan in "Time (sec)" for MatMult. >>> If my assumption is correct, how should I interpret the column "Flop (Max)" which is showing 1.92e+09? >>> I would assume of "Time (sec)" relates to the CPU then "Flop (Max)" should also relate to CPU and GPU flops would be logged in "GPU Mflop/s" >>> >>> [2] More curious is that within "Event Stage 2: MG Apply" KSPSolve, MGSmooth Level 0, MGSmooth Level 1, MGSmooth Level 2 all report "GPU %F" as 93. I believe this value should be 100 as the smoother (and coarse grid solver) are configured as richardson(2)+none and thus should run entirely on the GPU. >>> Furthermore, when one inspects all events listed under "Event Stage 2: MG Apply" those events which do flops correctly report "GPU %F" as 100. >>> And the events showing "GPU %F" = 0 such as >>> MatHIPSPARSCopyTo, VecCopy, VecSet, PCApply, DCtxSync >>> don't do any flops (on the CPU or GPU) - which is also correct (although non GPU events should show nan??) >>> >>> Hence I am wondering what is the explanation for the missing 7% from "GPU %F" for KSPSolve and MGSmooth {0,1,2}?? 
>>> >>> Does anyone understand this -log_view, or can explain to me how to interpret it? >>> >>> It could simply be that: >>> a) something is messed up with -pc_mg_log >>> b) something is messed up with the PETSc build >>> c) I am putting too much faith in -log_view and should profile the code differently. >>> >>> Either way I'd really like to understand what is going on. >>> >>> >>> Cheers, >>> Dave >>> >>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Fri Jan 19 14:22:48 2024 From: dave.mayhem23 at gmail.com (Dave May) Date: Fri, 19 Jan 2024 12:22:48 -0800 Subject: [petsc-users] Trying to understand -log_view when using HIP kernels (ex34) In-Reply-To: <1C095AF6-5F15-42D7-A831-F97D14A23725@petsc.dev> References: <1C095AF6-5F15-42D7-A831-F97D14A23725@petsc.dev> Message-ID: Thanks Barry! On Fri, 19 Jan 2024 at 12:18, Barry Smith wrote: > > Junchao > > I run the following on the CI machine, why does this happen? With > trivial solver options it runs ok. > > bsmith at petsc-gpu-02:/scratch/bsmith/petsc/src/ksp/ksp/tutorials$ ./ex34 > -da_grid_x 192 -da_grid_y 192 -da_grid_z 192 -dm_mat_type seqaijhipsparse > -dm_vec_type seqhip -ksp_max_it 10 -ksp_monitor -ksp_type richardson > -ksp_view -log_view -mg_coarse_ksp_max_it 2 -mg_coarse_ksp_type richardson > -mg_coarse_pc_type none -mg_levels_ksp_type richardson -mg_levels_pc_type > none -options_left -pc_mg_levels 3 -pc_mg_log -pc_type mg > > *[0]PETSC ERROR: --------------------- Error Message > --------------------------------------------------------------* > > [0]PETSC ERROR: GPU error > > [0]PETSC ERROR: hipSPARSE errorcode 3 (HIPSPARSE_STATUS_INVALID_VALUE) > > [0]PETSC ERROR: WARNING! There are unused option(s) set! Could be the > program crashed before usage or a spelling mistake, etc! > > [0]PETSC ERROR: Option left: name:-options_left (no value) source: > command line > > [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
> > [0]PETSC ERROR: Petsc Release Version 3.20.3, unknown > > [0]PETSC ERROR: ./ex34 on a named petsc-gpu-02 by bsmith Fri Jan 19 > 14:15:20 2024 > > [0]PETSC ERROR: Configure options > --package-prefix-hash=/home/bsmith/petsc-hash-pkgs --with-make-np=24 > --with-make-test-np=8 --with-hipc=/opt/rocm-5.4.3/bin/hipcc > --with-hip-dir=/opt/rocm-5.4.3 COPTFLAGS="-g -O" FOPTFLAGS="-g -O" > CXXOPTFLAGS="-g -O" HIPOPTFLAGS="-g -O" --with-cuda=0 --with-hip=1 > --with-precision=double --with-clanguage=c --download-kokkos > --download-kokkos-kernels --download-hypre --download-magma > --with-magma-fortran-bindings=0 --download-mfem --download-metis > --with-strict-petscerrorcode PETSC_ARCH=arch-ci-linux-hip-double > > [0]PETSC ERROR: #1 MatMultAddKernel_SeqAIJHIPSPARSE() at > /scratch/bsmith/petsc/src/mat/impls/aij/seq/seqhipsparse/aijhipsparse.hip.cpp:3131 > > [0]PETSC ERROR: #2 MatMultAdd_SeqAIJHIPSPARSE() at > /scratch/bsmith/petsc/src/mat/impls/aij/seq/seqhipsparse/aijhipsparse.hip.cpp:3004 > > [0]PETSC ERROR: #3 MatMultAdd() at > /scratch/bsmith/petsc/src/mat/interface/matrix.c:2770 > > [0]PETSC ERROR: #4 MatInterpolateAdd() at > /scratch/bsmith/petsc/src/mat/interface/matrix.c:8603 > > [0]PETSC ERROR: #5 PCMGMCycle_Private() at > /scratch/bsmith/petsc/src/ksp/pc/impls/mg/mg.c:87 > > [0]PETSC ERROR: #6 PCMGMCycle_Private() at > /scratch/bsmith/petsc/src/ksp/pc/impls/mg/mg.c:83 > > [0]PETSC ERROR: #7 PCApply_MG_Internal() at > /scratch/bsmith/petsc/src/ksp/pc/impls/mg/mg.c:611 > > [0]PETSC ERROR: #8 PCApply_MG() at > /scratch/bsmith/petsc/src/ksp/pc/impls/mg/mg.c:633 > > [0]PETSC ERROR: #9 PCApply() at > /scratch/bsmith/petsc/src/ksp/pc/interface/precon.c:498 > > [0]PETSC ERROR: #10 KSP_PCApply() at > /scratch/bsmith/petsc/include/petsc/private/kspimpl.h:383 > > [0]PETSC ERROR: #11 KSPSolve_Richardson() at > /scratch/bsmith/petsc/src/ksp/ksp/impls/rich/rich.c:106 > > [0]PETSC ERROR: #12 KSPSolve_Private() at > /scratch/bsmith/petsc/src/ksp/ksp/interface/itfunc.c:906 > > [0]PETSC ERROR: #13 KSPSolve() at > /scratch/bsmith/petsc/src/ksp/ksp/interface/itfunc.c:1079 > > [0]PETSC ERROR: #14 main() at ex34.c:52 > > [0]PETSC ERROR: PETSc Option Table entries: > > Dave, > > Trying to debug the 7% now, but having trouble running, as you see > above. > > > > On Jan 19, 2024, at 3:02?PM, Dave May wrote: > > Thank you Barry and Junchao for these explanations. I'll turn on > -log_view_gpu_time. > > Do either of you have any thoughts regarding why the percentage of flop's > being reported on the GPU is not 100% for MGSmooth Level {0,1,2} for this > solver configuration? > > This number should have nothing to do with timings as it reports the ratio > of operations performed on the GPU and CPU, presumably obtained from > PetscLogFlops() and PetscLogGpuFlops(). 
> > Cheers, > Dave > > On Fri, 19 Jan 2024 at 11:39, Junchao Zhang > wrote: > >> Try to also add -log_view_gpu_time, >> https://petsc.org/release/manualpages/Profiling/PetscLogGpuTime/ >> >> --Junchao Zhang >> >> >> On Fri, Jan 19, 2024 at 11:35?AM Dave May >> wrote: >> >>> Hi all, >>> >>> I am trying to understand the logging information associated with the >>> %flops-performed-on-the-gpu reported by -log_view when running >>> src/ksp/ksp/tutorials/ex34 >>> with the following options >>> -da_grid_x 192 >>> -da_grid_y 192 >>> -da_grid_z 192 >>> -dm_mat_type seqaijhipsparse >>> -dm_vec_type seqhip >>> -ksp_max_it 10 >>> -ksp_monitor >>> -ksp_type richardson >>> -ksp_view >>> -log_view >>> -mg_coarse_ksp_max_it 2 >>> -mg_coarse_ksp_type richardson >>> -mg_coarse_pc_type none >>> -mg_levels_ksp_type richardson >>> -mg_levels_pc_type none >>> -options_left >>> -pc_mg_levels 3 >>> -pc_mg_log >>> -pc_type mg >>> >>> This config is not intended to actually solve the problem, rather it is >>> a stripped down set of options designed to understand what parts of the >>> smoothers are being executed on the GPU. >>> >>> With respect to the log file attached, my first set of questions related >>> to the data reported under "Event Stage 2: MG Apply". >>> >>> [1] Why is the log littered with nan's? >>> * I don't understand how and why "GPU Mflop/s" should be reported as nan >>> when a value is given for "GPU %F" (see MatMult for example). >>> >>> * For events executed on the GPU, I assume the column "Time (sec)" >>> relates to "CPU execute time", this would explain why we see a nan in "Time >>> (sec)" for MatMult. >>> If my assumption is correct, how should I interpret the column "Flop >>> (Max)" which is showing 1.92e+09? >>> I would assume of "Time (sec)" relates to the CPU then "Flop (Max)" >>> should also relate to CPU and GPU flops would be logged in "GPU Mflop/s" >>> >>> [2] More curious is that within "Event Stage 2: MG Apply" KSPSolve, >>> MGSmooth Level 0, MGSmooth Level 1, MGSmooth Level 2 all report "GPU %F" as >>> 93. I believe this value should be 100 as the smoother (and coarse grid >>> solver) are configured as richardson(2)+none and thus should run entirely >>> on the GPU. >>> Furthermore, when one inspects all events listed under "Event Stage 2: >>> MG Apply" those events which do flops correctly report "GPU %F" as 100. >>> And the events showing "GPU %F" = 0 such as >>> MatHIPSPARSCopyTo, VecCopy, VecSet, PCApply, DCtxSync >>> don't do any flops (on the CPU or GPU) - which is also correct >>> (although non GPU events should show nan??) >>> >>> Hence I am wondering what is the explanation for the missing 7% from >>> "GPU %F" for KSPSolve and MGSmooth {0,1,2}?? >>> >>> Does anyone understand this -log_view, or can explain to me how to >>> interpret it? >>> >>> It could simply be that: >>> a) something is messed up with -pc_mg_log >>> b) something is messed up with the PETSc build >>> c) I am putting too much faith in -log_view and should profile the code >>> differently. >>> >>> Either way I'd really like to understand what is going on. >>> >>> >>> Cheers, >>> Dave >>> >>> >>> >>> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at petsc.dev Fri Jan 19 14:49:39 2024 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 19 Jan 2024 15:49:39 -0500 Subject: [petsc-users] Trying to understand -log_view when using HIP kernels (ex34) In-Reply-To: <1C095AF6-5F15-42D7-A831-F97D14A23725@petsc.dev> References: <1C095AF6-5F15-42D7-A831-F97D14A23725@petsc.dev> Message-ID: <99023888-F2E7-457C-90D0-6B4E5E3DEFD2@petsc.dev> Junchao, How come vecseqcupm_impl.hpp has PetscCall(PetscLogFlops(n)); instead of logging the flops on the GPU? This could be the root of the problem, the VecShift used to remove the null space from vectors in the solver is logging incorrectly. (For some reason there is no LogEventBegin/End() for VecShift which is why it doesn't get it on line in the -log_view). Barry > On Jan 19, 2024, at 3:17?PM, Barry Smith wrote: > > > Junchao > > I run the following on the CI machine, why does this happen? With trivial solver options it runs ok. > > bsmith at petsc-gpu-02:/scratch/bsmith/petsc/src/ksp/ksp/tutorials$ ./ex34 -da_grid_x 192 -da_grid_y 192 -da_grid_z 192 -dm_mat_type seqaijhipsparse -dm_vec_type seqhip -ksp_max_it 10 -ksp_monitor -ksp_type richardson -ksp_view -log_view -mg_coarse_ksp_max_it 2 -mg_coarse_ksp_type richardson -mg_coarse_pc_type none -mg_levels_ksp_type richardson -mg_levels_pc_type none -options_left -pc_mg_levels 3 -pc_mg_log -pc_type mg > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: GPU error > [0]PETSC ERROR: hipSPARSE errorcode 3 (HIPSPARSE_STATUS_INVALID_VALUE) > [0]PETSC ERROR: WARNING! There are unused option(s) set! Could be the program crashed before usage or a spelling mistake, etc! > [0]PETSC ERROR: Option left: name:-options_left (no value) source: command line > [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
> [0]PETSC ERROR: Petsc Release Version 3.20.3, unknown > [0]PETSC ERROR: ./ex34 on a named petsc-gpu-02 by bsmith Fri Jan 19 14:15:20 2024 > [0]PETSC ERROR: Configure options --package-prefix-hash=/home/bsmith/petsc-hash-pkgs --with-make-np=24 --with-make-test-np=8 --with-hipc=/opt/rocm-5.4.3/bin/hipcc --with-hip-dir=/opt/rocm-5.4.3 COPTFLAGS="-g -O" FOPTFLAGS="-g -O" CXXOPTFLAGS="-g -O" HIPOPTFLAGS="-g -O" --with-cuda=0 --with-hip=1 --with-precision=double --with-clanguage=c --download-kokkos --download-kokkos-kernels --download-hypre --download-magma --with-magma-fortran-bindings=0 --download-mfem --download-metis --with-strict-petscerrorcode PETSC_ARCH=arch-ci-linux-hip-double > [0]PETSC ERROR: #1 MatMultAddKernel_SeqAIJHIPSPARSE() at /scratch/bsmith/petsc/src/mat/impls/aij/seq/seqhipsparse/aijhipsparse.hip.cpp:3131 > [0]PETSC ERROR: #2 MatMultAdd_SeqAIJHIPSPARSE() at /scratch/bsmith/petsc/src/mat/impls/aij/seq/seqhipsparse/aijhipsparse.hip.cpp:3004 > [0]PETSC ERROR: #3 MatMultAdd() at /scratch/bsmith/petsc/src/mat/interface/matrix.c:2770 > [0]PETSC ERROR: #4 MatInterpolateAdd() at /scratch/bsmith/petsc/src/mat/interface/matrix.c:8603 > [0]PETSC ERROR: #5 PCMGMCycle_Private() at /scratch/bsmith/petsc/src/ksp/pc/impls/mg/mg.c:87 > [0]PETSC ERROR: #6 PCMGMCycle_Private() at /scratch/bsmith/petsc/src/ksp/pc/impls/mg/mg.c:83 > [0]PETSC ERROR: #7 PCApply_MG_Internal() at /scratch/bsmith/petsc/src/ksp/pc/impls/mg/mg.c:611 > [0]PETSC ERROR: #8 PCApply_MG() at /scratch/bsmith/petsc/src/ksp/pc/impls/mg/mg.c:633 > [0]PETSC ERROR: #9 PCApply() at /scratch/bsmith/petsc/src/ksp/pc/interface/precon.c:498 > [0]PETSC ERROR: #10 KSP_PCApply() at /scratch/bsmith/petsc/include/petsc/private/kspimpl.h:383 > [0]PETSC ERROR: #11 KSPSolve_Richardson() at /scratch/bsmith/petsc/src/ksp/ksp/impls/rich/rich.c:106 > [0]PETSC ERROR: #12 KSPSolve_Private() at /scratch/bsmith/petsc/src/ksp/ksp/interface/itfunc.c:906 > [0]PETSC ERROR: #13 KSPSolve() at /scratch/bsmith/petsc/src/ksp/ksp/interface/itfunc.c:1079 > [0]PETSC ERROR: #14 main() at ex34.c:52 > [0]PETSC ERROR: PETSc Option Table entries: > > Dave, > > Trying to debug the 7% now, but having trouble running, as you see above. > > > >> On Jan 19, 2024, at 3:02?PM, Dave May wrote: >> >> Thank you Barry and Junchao for these explanations. I'll turn on -log_view_gpu_time. >> >> Do either of you have any thoughts regarding why the percentage of flop's being reported on the GPU is not 100% for MGSmooth Level {0,1,2} for this solver configuration? >> >> This number should have nothing to do with timings as it reports the ratio of operations performed on the GPU and CPU, presumably obtained from PetscLogFlops() and PetscLogGpuFlops(). 
>> >> Cheers, >> Dave >> >> On Fri, 19 Jan 2024 at 11:39, Junchao Zhang > wrote: >>> Try to also add -log_view_gpu_time, https://petsc.org/release/manualpages/Profiling/PetscLogGpuTime/ >>> >>> --Junchao Zhang >>> >>> >>> On Fri, Jan 19, 2024 at 11:35?AM Dave May > wrote: >>>> Hi all, >>>> >>>> I am trying to understand the logging information associated with the %flops-performed-on-the-gpu reported by -log_view when running >>>> src/ksp/ksp/tutorials/ex34 >>>> with the following options >>>> -da_grid_x 192 >>>> -da_grid_y 192 >>>> -da_grid_z 192 >>>> -dm_mat_type seqaijhipsparse >>>> -dm_vec_type seqhip >>>> -ksp_max_it 10 >>>> -ksp_monitor >>>> -ksp_type richardson >>>> -ksp_view >>>> -log_view >>>> -mg_coarse_ksp_max_it 2 >>>> -mg_coarse_ksp_type richardson >>>> -mg_coarse_pc_type none >>>> -mg_levels_ksp_type richardson >>>> -mg_levels_pc_type none >>>> -options_left >>>> -pc_mg_levels 3 >>>> -pc_mg_log >>>> -pc_type mg >>>> >>>> This config is not intended to actually solve the problem, rather it is a stripped down set of options designed to understand what parts of the smoothers are being executed on the GPU. >>>> >>>> With respect to the log file attached, my first set of questions related to the data reported under "Event Stage 2: MG Apply". >>>> >>>> [1] Why is the log littered with nan's? >>>> * I don't understand how and why "GPU Mflop/s" should be reported as nan when a value is given for "GPU %F" (see MatMult for example). >>>> >>>> * For events executed on the GPU, I assume the column "Time (sec)" relates to "CPU execute time", this would explain why we see a nan in "Time (sec)" for MatMult. >>>> If my assumption is correct, how should I interpret the column "Flop (Max)" which is showing 1.92e+09? >>>> I would assume of "Time (sec)" relates to the CPU then "Flop (Max)" should also relate to CPU and GPU flops would be logged in "GPU Mflop/s" >>>> >>>> [2] More curious is that within "Event Stage 2: MG Apply" KSPSolve, MGSmooth Level 0, MGSmooth Level 1, MGSmooth Level 2 all report "GPU %F" as 93. I believe this value should be 100 as the smoother (and coarse grid solver) are configured as richardson(2)+none and thus should run entirely on the GPU. >>>> Furthermore, when one inspects all events listed under "Event Stage 2: MG Apply" those events which do flops correctly report "GPU %F" as 100. >>>> And the events showing "GPU %F" = 0 such as >>>> MatHIPSPARSCopyTo, VecCopy, VecSet, PCApply, DCtxSync >>>> don't do any flops (on the CPU or GPU) - which is also correct (although non GPU events should show nan??) >>>> >>>> Hence I am wondering what is the explanation for the missing 7% from "GPU %F" for KSPSolve and MGSmooth {0,1,2}?? >>>> >>>> Does anyone understand this -log_view, or can explain to me how to interpret it? >>>> >>>> It could simply be that: >>>> a) something is messed up with -pc_mg_log >>>> b) something is messed up with the PETSc build >>>> c) I am putting too much faith in -log_view and should profile the code differently. >>>> >>>> Either way I'd really like to understand what is going on. >>>> >>>> >>>> Cheers, >>>> Dave >>>> >>>> >>>> > -------------- next part -------------- An HTML attachment was scrubbed... 
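For readers trying to follow the point Barry raises about vecseqcupm_impl.hpp, the effect on the "GPU %F" column can be illustrated with a small user-level sketch. This is not the actual PETSc vector code; the event name, class name and flop count below are made up. Flops registered with PetscLogFlops() only enter the host flop total, so any enclosing event (MGSmooth, KSPSolve, ...) reports "GPU %F" below 100 even when the arithmetic ran on the device; registering the same count with PetscLogGpuFlops() is what would keep the ratio at 100.

/* Sketch only: how flop logging feeds the "GPU %F" column of -log_view.
   The event name, class id and flop count are illustrative, not PETSc code. */
#include <petscsys.h>

static PetscErrorCode UserShiftLike(PetscInt n)
{
  PetscClassId  classid;
  PetscLogEvent event;

  PetscFunctionBeginUser;
  PetscCall(PetscClassIdRegister("UserOps", &classid));
  PetscCall(PetscLogEventRegister("UserShift", classid, &event));
  PetscCall(PetscLogEventBegin(event, 0, 0, 0, 0));

  /* ... launch the device kernel that performs the n additions here ... */

  /* Variant (a): what the thread is describing. The work happened on the
     GPU, but the flops are registered as host flops, so "GPU %F" of every
     event containing this call drops below 100. */
  PetscCall(PetscLogFlops(n));

  /* Variant (b): what a device back end would normally do instead
     (left commented out here so the flops are not counted twice):
     PetscCall(PetscLogGpuTimeBegin());
     PetscCall(PetscLogGpuFlops(n));
     PetscCall(PetscLogGpuTimeEnd());                                  */

  PetscCall(PetscLogEventEnd(event, 0, 0, 0, 0));
  PetscFunctionReturn(PETSC_SUCCESS);
}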
URL: From knepley at gmail.com Fri Jan 19 15:31:27 2024 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 19 Jan 2024 16:31:27 -0500 Subject: [petsc-users] Question about a parallel implementation of PCFIELDSPLIT In-Reply-To: References: Message-ID: On Fri, Jan 19, 2024 at 4:25?PM Barry Smith wrote: > > Generally fieldsplit is used on problems that have a natural "split" of > the variables into two or more subsets. For example u0,v0,u1,v1,u2,v2,u3,v4 > This is often indicated in the vectors and matrices with the "blocksize" > argument, 2 in this case. DM also often provides this information. > > When laying out a vector/matrix with a blocksize one must ensure that > an equal number of of the subsets appears on each MPI process. So, for > example, if the above vector is distributed over 3 MPI processes one could > use u0,v0,u1,v1 u2,v2 u3,v3 but one cannot use u0,v0,u1 > v1,u2,v2 u3,v3. Another way to think about it is that one must split up > the vector as indexed by block among the processes. For most multicomponent > problems this type of decomposition is very natural in the logic of the > code. > This blocking is only convenient, not necessary. You can specify your own field division using PCFieldSplitSetIS(). Thanks, Matt > Barry > > > On Jan 19, 2024, at 3:19?AM, Pantelis Moschopoulos < > pmoschopoulos at outlook.com> wrote: > > Dear all, > > When I am using PCFIELDSPLIT and pc type "schur" in serial mode everything > works fine. When I turn now to parallel, I observe that the number of ranks > that I can use must divide the number of N without any remainder, where N > is the number of unknowns. Otherwise, an error of the following form > emerges: "Local columns of A10 3473 do not equal local rows of A00 3471". > > Can I do something to overcome this? > > Thanks, > Pantelis > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Fri Jan 19 17:58:07 2024 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Fri, 19 Jan 2024 17:58:07 -0600 Subject: [petsc-users] Trying to understand -log_view when using HIP kernels (ex34) In-Reply-To: <1C095AF6-5F15-42D7-A831-F97D14A23725@petsc.dev> References: <1C095AF6-5F15-42D7-A831-F97D14A23725@petsc.dev> Message-ID: I reproduced this HIPSPARSE_STATUS_INVALID_VALUE error, but have not yet found obvious input argument errors for this hipsparse call. On Fri, Jan 19, 2024 at 2:18?PM Barry Smith wrote: > > Junchao > > I run the following on the CI machine, why does this happen? With > trivial solver options it runs ok. > > bsmith at petsc-gpu-02:/scratch/bsmith/petsc/src/ksp/ksp/tutorials$ ./ex34 > -da_grid_x 192 -da_grid_y 192 -da_grid_z 192 -dm_mat_type seqaijhipsparse > -dm_vec_type seqhip -ksp_max_it 10 -ksp_monitor -ksp_type richardson > -ksp_view -log_view -mg_coarse_ksp_max_it 2 -mg_coarse_ksp_type richardson > -mg_coarse_pc_type none -mg_levels_ksp_type richardson -mg_levels_pc_type > none -options_left -pc_mg_levels 3 -pc_mg_log -pc_type mg > > *[0]PETSC ERROR: --------------------- Error Message > --------------------------------------------------------------* > > [0]PETSC ERROR: GPU error > > [0]PETSC ERROR: hipSPARSE errorcode 3 (HIPSPARSE_STATUS_INVALID_VALUE) > > [0]PETSC ERROR: WARNING! There are unused option(s) set! 
Could be the > program crashed before usage or a spelling mistake, etc! > > [0]PETSC ERROR: Option left: name:-options_left (no value) source: > command line > > [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.20.3, unknown > > [0]PETSC ERROR: ./ex34 on a named petsc-gpu-02 by bsmith Fri Jan 19 > 14:15:20 2024 > > [0]PETSC ERROR: Configure options > --package-prefix-hash=/home/bsmith/petsc-hash-pkgs --with-make-np=24 > --with-make-test-np=8 --with-hipc=/opt/rocm-5.4.3/bin/hipcc > --with-hip-dir=/opt/rocm-5.4.3 COPTFLAGS="-g -O" FOPTFLAGS="-g -O" > CXXOPTFLAGS="-g -O" HIPOPTFLAGS="-g -O" --with-cuda=0 --with-hip=1 > --with-precision=double --with-clanguage=c --download-kokkos > --download-kokkos-kernels --download-hypre --download-magma > --with-magma-fortran-bindings=0 --download-mfem --download-metis > --with-strict-petscerrorcode PETSC_ARCH=arch-ci-linux-hip-double > > [0]PETSC ERROR: #1 MatMultAddKernel_SeqAIJHIPSPARSE() at > /scratch/bsmith/petsc/src/mat/impls/aij/seq/seqhipsparse/aijhipsparse.hip.cpp:3131 > > [0]PETSC ERROR: #2 MatMultAdd_SeqAIJHIPSPARSE() at > /scratch/bsmith/petsc/src/mat/impls/aij/seq/seqhipsparse/aijhipsparse.hip.cpp:3004 > > [0]PETSC ERROR: #3 MatMultAdd() at > /scratch/bsmith/petsc/src/mat/interface/matrix.c:2770 > > [0]PETSC ERROR: #4 MatInterpolateAdd() at > /scratch/bsmith/petsc/src/mat/interface/matrix.c:8603 > > [0]PETSC ERROR: #5 PCMGMCycle_Private() at > /scratch/bsmith/petsc/src/ksp/pc/impls/mg/mg.c:87 > > [0]PETSC ERROR: #6 PCMGMCycle_Private() at > /scratch/bsmith/petsc/src/ksp/pc/impls/mg/mg.c:83 > > [0]PETSC ERROR: #7 PCApply_MG_Internal() at > /scratch/bsmith/petsc/src/ksp/pc/impls/mg/mg.c:611 > > [0]PETSC ERROR: #8 PCApply_MG() at > /scratch/bsmith/petsc/src/ksp/pc/impls/mg/mg.c:633 > > [0]PETSC ERROR: #9 PCApply() at > /scratch/bsmith/petsc/src/ksp/pc/interface/precon.c:498 > > [0]PETSC ERROR: #10 KSP_PCApply() at > /scratch/bsmith/petsc/include/petsc/private/kspimpl.h:383 > > [0]PETSC ERROR: #11 KSPSolve_Richardson() at > /scratch/bsmith/petsc/src/ksp/ksp/impls/rich/rich.c:106 > > [0]PETSC ERROR: #12 KSPSolve_Private() at > /scratch/bsmith/petsc/src/ksp/ksp/interface/itfunc.c:906 > > [0]PETSC ERROR: #13 KSPSolve() at > /scratch/bsmith/petsc/src/ksp/ksp/interface/itfunc.c:1079 > > [0]PETSC ERROR: #14 main() at ex34.c:52 > > [0]PETSC ERROR: PETSc Option Table entries: > > Dave, > > Trying to debug the 7% now, but having trouble running, as you see > above. > > > > On Jan 19, 2024, at 3:02?PM, Dave May wrote: > > Thank you Barry and Junchao for these explanations. I'll turn on > -log_view_gpu_time. > > Do either of you have any thoughts regarding why the percentage of flop's > being reported on the GPU is not 100% for MGSmooth Level {0,1,2} for this > solver configuration? > > This number should have nothing to do with timings as it reports the ratio > of operations performed on the GPU and CPU, presumably obtained from > PetscLogFlops() and PetscLogGpuFlops(). 
> > Cheers, > Dave > > On Fri, 19 Jan 2024 at 11:39, Junchao Zhang > wrote: > >> Try to also add -log_view_gpu_time, >> https://petsc.org/release/manualpages/Profiling/PetscLogGpuTime/ >> >> --Junchao Zhang >> >> >> On Fri, Jan 19, 2024 at 11:35?AM Dave May >> wrote: >> >>> Hi all, >>> >>> I am trying to understand the logging information associated with the >>> %flops-performed-on-the-gpu reported by -log_view when running >>> src/ksp/ksp/tutorials/ex34 >>> with the following options >>> -da_grid_x 192 >>> -da_grid_y 192 >>> -da_grid_z 192 >>> -dm_mat_type seqaijhipsparse >>> -dm_vec_type seqhip >>> -ksp_max_it 10 >>> -ksp_monitor >>> -ksp_type richardson >>> -ksp_view >>> -log_view >>> -mg_coarse_ksp_max_it 2 >>> -mg_coarse_ksp_type richardson >>> -mg_coarse_pc_type none >>> -mg_levels_ksp_type richardson >>> -mg_levels_pc_type none >>> -options_left >>> -pc_mg_levels 3 >>> -pc_mg_log >>> -pc_type mg >>> >>> This config is not intended to actually solve the problem, rather it is >>> a stripped down set of options designed to understand what parts of the >>> smoothers are being executed on the GPU. >>> >>> With respect to the log file attached, my first set of questions related >>> to the data reported under "Event Stage 2: MG Apply". >>> >>> [1] Why is the log littered with nan's? >>> * I don't understand how and why "GPU Mflop/s" should be reported as nan >>> when a value is given for "GPU %F" (see MatMult for example). >>> >>> * For events executed on the GPU, I assume the column "Time (sec)" >>> relates to "CPU execute time", this would explain why we see a nan in "Time >>> (sec)" for MatMult. >>> If my assumption is correct, how should I interpret the column "Flop >>> (Max)" which is showing 1.92e+09? >>> I would assume of "Time (sec)" relates to the CPU then "Flop (Max)" >>> should also relate to CPU and GPU flops would be logged in "GPU Mflop/s" >>> >>> [2] More curious is that within "Event Stage 2: MG Apply" KSPSolve, >>> MGSmooth Level 0, MGSmooth Level 1, MGSmooth Level 2 all report "GPU %F" as >>> 93. I believe this value should be 100 as the smoother (and coarse grid >>> solver) are configured as richardson(2)+none and thus should run entirely >>> on the GPU. >>> Furthermore, when one inspects all events listed under "Event Stage 2: >>> MG Apply" those events which do flops correctly report "GPU %F" as 100. >>> And the events showing "GPU %F" = 0 such as >>> MatHIPSPARSCopyTo, VecCopy, VecSet, PCApply, DCtxSync >>> don't do any flops (on the CPU or GPU) - which is also correct >>> (although non GPU events should show nan??) >>> >>> Hence I am wondering what is the explanation for the missing 7% from >>> "GPU %F" for KSPSolve and MGSmooth {0,1,2}?? >>> >>> Does anyone understand this -log_view, or can explain to me how to >>> interpret it? >>> >>> It could simply be that: >>> a) something is messed up with -pc_mg_log >>> b) something is messed up with the PETSc build >>> c) I am putting too much faith in -log_view and should profile the code >>> differently. >>> >>> Either way I'd really like to understand what is going on. >>> >>> >>> Cheers, >>> Dave >>> >>> >>> >>> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From berend.vanwachem at ovgu.de Mon Jan 22 09:49:06 2024 From: berend.vanwachem at ovgu.de (Berend van Wachem) Date: Mon, 22 Jan 2024 16:49:06 +0100 Subject: [petsc-users] Unique number in each element of a DMPlex mesh Message-ID: <0323c633-ceac-7988-5a4e-261feb1259b4@ovgu.de> Dear Petsc-Team, Is there a good way to define a unique integer number in each element (e.g. a cell) of a DMPlex mesh, which is in the same location, regardless of the number of processors or the distribution of the mesh over the processors? So, for instance, if I have a DMPlex box mesh, the top-right-front corner element (e.g. cell) will always have the same unique number, regardless of the number of processors the mesh is distributed over? I want to be able to link the results I have achieved with a mesh from DMPlex on a certain number of cores to the same mesh from a DMPlex on a different number of cores. Of course, I could make a tree based on the distance of each element to a certain point (based on the X,Y,Z co-ordinates of the element), and go through this tree in the same way and define an integer based on this, but that seems rather cumbersome. Thanks and best regards, Berend. From knepley at gmail.com Mon Jan 22 11:58:30 2024 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 22 Jan 2024 12:58:30 -0500 Subject: [petsc-users] Unique number in each element of a DMPlex mesh In-Reply-To: <0323c633-ceac-7988-5a4e-261feb1259b4@ovgu.de> References: <0323c633-ceac-7988-5a4e-261feb1259b4@ovgu.de> Message-ID: On Mon, Jan 22, 2024 at 10:49?AM Berend van Wachem wrote: > Dear Petsc-Team, > > Is there a good way to define a unique integer number in each element > (e.g. a cell) of a DMPlex mesh, which is in the same location, > regardless of the number of processors or the distribution of the mesh > over the processors? > > So, for instance, if I have a DMPlex box mesh, the top-right-front > corner element (e.g. cell) will always have the same unique number, > regardless of the number of processors the mesh is distributed over? > > I want to be able to link the results I have achieved with a mesh from > DMPlex on a certain number of cores to the same mesh from a DMPlex on a > different number of cores. > > Of course, I could make a tree based on the distance of each element to > a certain point (based on the X,Y,Z co-ordinates of the element), and go > through this tree in the same way and define an integer based on this, > but that seems rather cumbersome. > I think this is harder than it sounds. The distance will not work because it can be very degenerate. You could lexicographically sort the coordinates, but this is hard in parallel. It is fine if you are willing to gather everything on one process. You could put down a p4est, use the Morton order to number them since this is stable for a given refinement. And then within each box lexicographically sort the centroids. This is definitely cumbersome, but I cannot think of anything else. This also might have parallel problems since you need to know how much overlap you need to fill each box. Thanks, Matt > Thanks and best regards, Berend. -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
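A rough, untested sketch of the "gather the centroids and sort them lexicographically" idea mentioned above, in case it is useful. The struct name, comparison tolerance and helper names are assumptions, and the MPI gather of the keys onto one rank is omitted; on a single rank (or after gathering the mesh, e.g. with DMPlexGetGatherDM) the sort below already yields the full partition-independent ordering.

/* Untested sketch: derive a partition-independent cell index by sorting
   cell centroids lexicographically. CellKey, CompareCentroids and the
   tolerance are made-up names/values; the gather to rank 0 is omitted. */
#include <stdlib.h>
#include <petscdmplex.h>

typedef struct {
  PetscReal x[3]; /* centroid, padded with zeros below the mesh dimension */
  PetscInt  cell; /* local cell number the centroid came from             */
} CellKey;

static int CompareCentroids(const void *a, const void *b)
{
  const CellKey  *ka = (const CellKey *)a, *kb = (const CellKey *)b;
  const PetscReal tol = 1e-12; /* assumed coordinate tolerance */

  for (int d = 0; d < 3; ++d) {
    if (ka->x[d] < kb->x[d] - tol) return -1;
    if (ka->x[d] > kb->x[d] + tol) return 1;
  }
  return 0;
}

static PetscErrorCode BuildStableCellOrder(DM dm, PetscInt *nKeys, CellKey **keys)
{
  PetscInt cStart, cEnd, dim;

  PetscFunctionBeginUser;
  PetscCall(DMGetDimension(dm, &dim));
  PetscCall(DMPlexGetHeightStratum(dm, 0, &cStart, &cEnd));
  *nKeys = cEnd - cStart;
  PetscCall(PetscMalloc1(*nKeys, keys));
  for (PetscInt c = cStart; c < cEnd; ++c) {
    PetscReal vol, centroid[3] = {0.0, 0.0, 0.0};

    PetscCall(DMPlexComputeCellGeometryFVM(dm, c, &vol, centroid, NULL));
    for (PetscInt d = 0; d < 3; ++d) (*keys)[c - cStart].x[d] = (d < dim) ? centroid[d] : 0.0;
    (*keys)[c - cStart].cell = c;
  }
  /* For a distributed mesh, gather the keys onto one rank here; the position
     of a key in the sorted array is then its stable, partition-independent
     index, which can be scattered back to the owning ranks. */
  qsort(*keys, (size_t)(*nKeys), sizeof(CellKey), CompareCentroids);
  PetscFunctionReturn(PETSC_SUCCESS);
}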
URL: From berend.vanwachem at ovgu.de Mon Jan 22 12:58:30 2024 From: berend.vanwachem at ovgu.de (Berend van Wachem) Date: Mon, 22 Jan 2024 19:58:30 +0100 Subject: [petsc-users] Unique number in each element of a DMPlex mesh In-Reply-To: References: <0323c633-ceac-7988-5a4e-261feb1259b4@ovgu.de> Message-ID: <2dfbd6f8-af0c-41df-8650-2497c277f76f@ovgu.de> Dear Matt, Thanks for your quick response. I have a DMPlex with a polyhedral mesh, and have defined a number of vectors with data at the cell center. I have generated data for a number of timesteps, and I write the data for each point to a file together with the (x,y,z) co-ordinate of the cell center. When I want to do a restart from the DMPlex, I recreate the DMplex with the polyhedral mesh, redistribute it, and for each cell center find the corresponding (x,y,z) co-ordinate and insert the data that corresponds to it. This is quite expensive, as it means I need to compare doubles very often. But reading your response, this may not be a bad way of doing it? Thanks, Berend. On 1/22/24 18:58, Matthew Knepley wrote: > On Mon, Jan 22, 2024 at 10:49?AM Berend van Wachem > wrote: > > Dear Petsc-Team, > > Is there a good way to define a unique integer number in each element > (e.g. a cell) of a DMPlex mesh, which is in the same location, > regardless of the number of processors or the distribution of the mesh > over the processors? > > So, for instance, if I have a DMPlex box mesh, the top-right-front > corner element (e.g. cell) will always have the same unique number, > regardless of the number of processors the mesh is distributed over? > > I want to be able to link the results I have achieved with a mesh from > DMPlex on a certain number of cores to the same mesh from a DMPlex on a > different number of cores. > > Of course, I could make a tree based on the distance of each element to > a certain point (based on the X,Y,Z co-ordinates of the element), and go > through this tree in the same way and define an integer based on this, > but that seems rather cumbersome. > > > I think this is harder than it sounds. The distance will not work because it can be very degenerate. > You could lexicographically sort the coordinates, but this is hard in parallel. It is fine if you are willing > to gather everything on one process. You could put down a p4est, use the Morton order to number them since this is stable for a > given refinement. And then within each box lexicographically sort the centroids. This is definitely cumbersome, but I cannot > think of anything else. This also might have parallel problems since you need to know how much overlap you need to fill each box. > > ? Thanks, > > ? ? ? Matt > > Thanks and best regards, Berend. > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to > which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ From knepley at gmail.com Mon Jan 22 13:03:40 2024 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 22 Jan 2024 14:03:40 -0500 Subject: [petsc-users] Unique number in each element of a DMPlex mesh In-Reply-To: <2dfbd6f8-af0c-41df-8650-2497c277f76f@ovgu.de> References: <0323c633-ceac-7988-5a4e-261feb1259b4@ovgu.de> <2dfbd6f8-af0c-41df-8650-2497c277f76f@ovgu.de> Message-ID: On Mon, Jan 22, 2024 at 1:57?PM Berend van Wachem wrote: > Dear Matt, > > Thanks for your quick response. > I have a DMPlex with a polyhedral mesh, and have defined a number of > vectors with data at the cell center. 
I have generated data > for a number of timesteps, and I write the data for each point to a file > together with the (x,y,z) co-ordinate of the cell center. > > When I want to do a restart from the DMPlex, I recreate the DMplex with > the polyhedral mesh, redistribute it, and for each cell > center find the corresponding (x,y,z) co-ordinate and insert the data that > corresponds to it. This is quite expensive, as it > means I need to compare doubles very often. > > But reading your response, this may not be a bad way of doing it? > It always seems to be a game of "what do you want to assume?". I tend to assume that I wrote the DM and Vec in the same order, so when I load them they match. This is how Firedrake I/O works, so that you can load up on a different number of processes (https://arxiv.org/abs/2401.05868). So, are you writing a Vec, and then redistributing and writing another Vec? In the scheme above, you would have to write both DMs. Are you trying to avoid this? Thanks, Matt > Thanks, > > Berend. > > On 1/22/24 18:58, Matthew Knepley wrote: > > On Mon, Jan 22, 2024 at 10:49?AM Berend van Wachem < > berend.vanwachem at ovgu.de > wrote: > > > > Dear Petsc-Team, > > > > Is there a good way to define a unique integer number in each element > > (e.g. a cell) of a DMPlex mesh, which is in the same location, > > regardless of the number of processors or the distribution of the > mesh > > over the processors? > > > > So, for instance, if I have a DMPlex box mesh, the top-right-front > > corner element (e.g. cell) will always have the same unique number, > > regardless of the number of processors the mesh is distributed over? > > > > I want to be able to link the results I have achieved with a mesh > from > > DMPlex on a certain number of cores to the same mesh from a DMPlex > on a > > different number of cores. > > > > Of course, I could make a tree based on the distance of each element > to > > a certain point (based on the X,Y,Z co-ordinates of the element), > and go > > through this tree in the same way and define an integer based on > this, > > but that seems rather cumbersome. > > > > > > I think this is harder than it sounds. The distance will not work > because it can be very degenerate. > > You could lexicographically sort the coordinates, but this is hard in > parallel. It is fine if you are willing > > to gather everything on one process. You could put down a p4est, use the > Morton order to number them since this is stable for a > > given refinement. And then within each box lexicographically sort the > centroids. This is definitely cumbersome, but I cannot > > think of anything else. This also might have parallel problems since you > need to know how much overlap you need to fill each box. > > > > Thanks, > > > > Matt > > > > Thanks and best regards, Berend. > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to > > which their experiments lead. > > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ < > http://www.cse.buffalo.edu/~knepley/> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
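For completeness, the basic "write the DM and the Vec together, read them back in the same order" pattern referred to above looks roughly like the sketch below. It assumes a PETSc build with HDF5, the file name, vector name and section setup are placeholders, and it does not address the parallel-write limitation for polyhedral meshes that comes up next in this thread. For loading on a different number of ranks, the DMPlexTopologyView / DMPlexSectionView / DMPlexGlobalVectorView family (the machinery behind the Firedrake checkpointing paper linked above) is the fuller approach.

/* Untested sketch of checkpointing the DM and the cell data to one HDF5
   file and reading them back in the same order. Assumes a PETSc build with
   HDF5; "restart.h5", "cellData" and the section setup are placeholders. */
#include <petscdmplex.h>
#include <petscviewerhdf5.h>

static PetscErrorCode CheckpointWrite(DM dm, Vec cellData)
{
  PetscViewer v;

  PetscFunctionBeginUser;
  PetscCall(PetscViewerHDF5Open(PetscObjectComm((PetscObject)dm), "restart.h5", FILE_MODE_WRITE, &v));
  PetscCall(PetscViewerPushFormat(v, PETSC_VIEWER_HDF5_PETSC));
  PetscCall(PetscObjectSetName((PetscObject)cellData, "cellData"));
  PetscCall(DMView(dm, v));        /* topology first ...                    */
  PetscCall(VecView(cellData, v)); /* ... then the field, in the same order */
  PetscCall(PetscViewerPopFormat(v));
  PetscCall(PetscViewerDestroy(&v));
  PetscFunctionReturn(PETSC_SUCCESS);
}

static PetscErrorCode CheckpointRead(MPI_Comm comm, DM *dm, Vec *cellData)
{
  PetscViewer v;

  PetscFunctionBeginUser;
  PetscCall(PetscViewerHDF5Open(comm, "restart.h5", FILE_MODE_READ, &v));
  PetscCall(PetscViewerPushFormat(v, PETSC_VIEWER_HDF5_PETSC));
  PetscCall(DMCreate(comm, dm));
  PetscCall(DMSetType(*dm, DMPLEX));
  PetscCall(DMLoad(*dm, v));
  /* ... recreate the same PetscSection / field layout on *dm here ... */
  PetscCall(DMCreateGlobalVector(*dm, cellData));
  PetscCall(PetscObjectSetName((PetscObject)*cellData, "cellData"));
  PetscCall(VecLoad(*cellData, v));
  PetscCall(PetscViewerPopFormat(v));
  PetscCall(PetscViewerDestroy(&v));
  PetscFunctionReturn(PETSC_SUCCESS);
}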
URL: From berend.vanwachem at ovgu.de Mon Jan 22 13:27:42 2024 From: berend.vanwachem at ovgu.de (Berend van Wachem) Date: Mon, 22 Jan 2024 20:27:42 +0100 Subject: [petsc-users] Unique number in each element of a DMPlex mesh In-Reply-To: References: <0323c633-ceac-7988-5a4e-261feb1259b4@ovgu.de> <2dfbd6f8-af0c-41df-8650-2497c277f76f@ovgu.de> Message-ID: Dear Matt, The problem is that I haven't figured out how to write a polyhedral DMplex in parallel. So, currently, I can write the Vec data in parallel, but the cones for the cells/faces/edges/nodes for the mesh from just one process to a file (after gathering the DMplex to a single process). From the restart, I can then read the cone information from one process from the file, recreate the DMPlex, and then redistribute it. In this scenario, the Vec data I read in (in parallel) will not match the correct cells of the DMplex. Hence, I need to put it in the right place afterwards. Best, Berend. On 1/22/24 20:03, Matthew Knepley wrote: > On Mon, Jan 22, 2024 at 1:57?PM Berend van Wachem > wrote: > > Dear Matt, > > Thanks for your quick response. > I have a DMPlex with a polyhedral mesh, and have defined a number of vectors with data at the cell center. I have generated > data > for a number of timesteps, and I write the data for each point to a file together with the (x,y,z) co-ordinate of the cell > center. > > When I want to do a restart from the DMPlex, I recreate the DMplex with the polyhedral mesh, redistribute it, and for each cell > center find the corresponding (x,y,z) co-ordinate and insert the data that corresponds to it. This is quite expensive, as it > means I need to compare doubles very often. > > But reading your response, this may not be a bad way of doing it? > > > It always seems to be a game of "what do you want to assume?". I tend to assume that I wrote the DM and Vec in the same order, > so when I load them they match. This is how Firedrake I/O works, so that you can load up on a different number of processes > (https://arxiv.org/abs/2401.05868 ). > > So, are you writing a Vec, and then redistributing and writing another Vec? In the scheme above, you would have to write both > DMs. Are you trying to avoid this? > > ? Thanks, > > ? ? ?Matt > > Thanks, > > Berend. > > On 1/22/24 18:58, Matthew Knepley wrote: > > On Mon, Jan 22, 2024 at 10:49?AM Berend van Wachem > >> wrote: > > > >? ? ?Dear Petsc-Team, > > > >? ? ?Is there a good way to define a unique integer number in each element > >? ? ?(e.g. a cell) of a DMPlex mesh, which is in the same location, > >? ? ?regardless of the number of processors or the distribution of the mesh > >? ? ?over the processors? > > > >? ? ?So, for instance, if I have a DMPlex box mesh, the top-right-front > >? ? ?corner element (e.g. cell) will always have the same unique number, > >? ? ?regardless of the number of processors the mesh is distributed over? > > > >? ? ?I want to be able to link the results I have achieved with a mesh from > >? ? ?DMPlex on a certain number of cores to the same mesh from a DMPlex on a > >? ? ?different number of cores. > > > >? ? ?Of course, I could make a tree based on the distance of each element to > >? ? ?a certain point (based on the X,Y,Z co-ordinates of the element), and go > >? ? ?through this tree in the same way and define an integer based on this, > >? ? ?but that seems rather cumbersome. > > > > > > I think this is harder than it sounds. The distance will not work because it can be very degenerate. 
> > You could lexicographically sort the coordinates, but this is hard in parallel. It is fine if you are willing > > to gather everything on one process. You could put down a p4est, use the Morton order to number them since this is stable > for a > > given refinement. And then within each box lexicographically sort the centroids. This is definitely cumbersome, but I cannot > > think of anything else. This also might have parallel problems since you need to know how much overlap you need to fill > each box. > > > >? ? Thanks, > > > >? ? ? ? Matt > > > >? ? ?Thanks and best regards, Berend. > > > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any > results to > > which their experiments lead. > > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to > which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ From knepley at gmail.com Mon Jan 22 13:30:32 2024 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 22 Jan 2024 14:30:32 -0500 Subject: [petsc-users] Unique number in each element of a DMPlex mesh In-Reply-To: References: <0323c633-ceac-7988-5a4e-261feb1259b4@ovgu.de> <2dfbd6f8-af0c-41df-8650-2497c277f76f@ovgu.de> Message-ID: On Mon, Jan 22, 2024 at 2:26?PM Berend van Wachem wrote: > Dear Matt, > > The problem is that I haven't figured out how to write a polyhedral DMplex > in parallel. So, currently, I can write the Vec data > in parallel, but the cones for the cells/faces/edges/nodes for the mesh > from just one process to a file (after gathering the > DMplex to a single process). > Ah shoot. Can you send me a polyhedral mesh (or code to generate one) so I can fix the parallel write problem? Or maybe it is already an issue and I forgot? > From the restart, I can then read the cone information from one process > from the file, recreate the DMPlex, and then > redistribute it. In this scenario, the Vec data I read in (in parallel) > will not match the correct cells of the DMplex. Hence, I > need to put it in the right place afterwards. > Yes, then searching makes sense. You could call DMLocatePoints(), but maybe you are doing that. Thanks, Matt > Best, Berend. > > On 1/22/24 20:03, Matthew Knepley wrote: > > On Mon, Jan 22, 2024 at 1:57?PM Berend van Wachem < > berend.vanwachem at ovgu.de > wrote: > > > > Dear Matt, > > > > Thanks for your quick response. > > I have a DMPlex with a polyhedral mesh, and have defined a number of > vectors with data at the cell center. I have generated > > data > > for a number of timesteps, and I write the data for each point to a > file together with the (x,y,z) co-ordinate of the cell > > center. > > > > When I want to do a restart from the DMPlex, I recreate the DMplex > with the polyhedral mesh, redistribute it, and for each cell > > center find the corresponding (x,y,z) co-ordinate and insert the > data that corresponds to it. This is quite expensive, as it > > means I need to compare doubles very often. > > > > But reading your response, this may not be a bad way of doing it? > > > > > > It always seems to be a game of "what do you want to assume?". I tend to > assume that I wrote the DM and Vec in the same order, > > so when I load them they match. This is how Firedrake I/O works, so that > you can load up on a different number of processes > > (https://arxiv.org/abs/2401.05868 ). 
> > > > So, are you writing a Vec, and then redistributing and writing another > Vec? In the scheme above, you would have to write both > > DMs. Are you trying to avoid this? > > > > Thanks, > > > > Matt > > > > Thanks, > > > > Berend. > > > > On 1/22/24 18:58, Matthew Knepley wrote: > > > On Mon, Jan 22, 2024 at 10:49?AM Berend van Wachem < > berend.vanwachem at ovgu.de > > >> > wrote: > > > > > > Dear Petsc-Team, > > > > > > Is there a good way to define a unique integer number in each > element > > > (e.g. a cell) of a DMPlex mesh, which is in the same location, > > > regardless of the number of processors or the distribution of > the mesh > > > over the processors? > > > > > > So, for instance, if I have a DMPlex box mesh, the > top-right-front > > > corner element (e.g. cell) will always have the same unique > number, > > > regardless of the number of processors the mesh is > distributed over? > > > > > > I want to be able to link the results I have achieved with a > mesh from > > > DMPlex on a certain number of cores to the same mesh from a > DMPlex on a > > > different number of cores. > > > > > > Of course, I could make a tree based on the distance of each > element to > > > a certain point (based on the X,Y,Z co-ordinates of the > element), and go > > > through this tree in the same way and define an integer based > on this, > > > but that seems rather cumbersome. > > > > > > > > > I think this is harder than it sounds. The distance will not work > because it can be very degenerate. > > > You could lexicographically sort the coordinates, but this is > hard in parallel. It is fine if you are willing > > > to gather everything on one process. You could put down a p4est, > use the Morton order to number them since this is stable > > for a > > > given refinement. And then within each box lexicographically sort > the centroids. This is definitely cumbersome, but I cannot > > > think of anything else. This also might have parallel problems > since you need to know how much overlap you need to fill > > each box. > > > > > > Thanks, > > > > > > Matt > > > > > > Thanks and best regards, Berend. > > > > > > -- > > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any > > results to > > > which their experiments lead. > > > -- Norbert Wiener > > > > > > https://www.cse.buffalo.edu/~knepley/ < > https://www.cse.buffalo.edu/~knepley/> < > http://www.cse.buffalo.edu/~knepley/ > > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to > > which their experiments lead. > > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ < > http://www.cse.buffalo.edu/~knepley/> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From pmoschopoulos at outlook.com Tue Jan 23 03:23:35 2024 From: pmoschopoulos at outlook.com (Pantelis Moschopoulos) Date: Tue, 23 Jan 2024 09:23:35 +0000 Subject: [petsc-users] =?utf-8?q?=CE=91=CF=80=3A__Question_about_a_parall?= =?utf-8?q?el_implementation_of_PCFIELDSPLIT?= In-Reply-To: References: Message-ID: Dear Matt and Dear Barry, I have some follow up questions regarding FieldSplit. 
Let's assume that I solve again the 3D Stokes flow, but now I also have a global constraint that controls the flow rate at the inlet. The matrix has the same unknowns as before, i.e. ux0,uy0,uz0,p0 // ux1,uy1,uz1,p1 // ..., but the last row (and the last column) corresponds to the contribution of the global constraint equation. I want to incorporate the last row (and last column) into the local block of the velocities (split 0) and the pressure. The problem is how I do that. I have two questions:

1. Should the block size now be 5 in the matrix and vector creation for this problem?
2. Do I have to rely entirely on PCFieldSplitSetIS to create the two blocks? Can I simply augment the previously defined block 0 with the last row of the matrix?

Up to this moment, I use the following commands to create the field split:

ufields(3) = [0, 1, 2]
pfields(1) = [3]

call PCSetType(pc, PCFIELDSPLIT, ierr)
call PCFieldSplitSetBlockSize(pc, 4, ierr)
call PCFieldSplitSetFields(pc, "0", 3, ufields, ufields, ierr)
call PCFieldSplitSetFields(pc, "1", 1, pfields, pfields, ierr)

Thanks,
Pantelis

________________________________
From: Matthew Knepley
Sent: Friday, 19 January 2024 11:31 PM
To: Barry Smith
Cc: Pantelis Moschopoulos ; petsc-users at mcs.anl.gov
Subject: Re: [petsc-users] Question about a parallel implementation of PCFIELDSPLIT

On Fri, Jan 19, 2024 at 4:25 PM Barry Smith wrote:

Generally fieldsplit is used on problems that have a natural "split" of the variables into two or more subsets. For example u0,v0,u1,v1,u2,v2,u3,v3. This is often indicated in the vectors and matrices with the "blocksize" argument, 2 in this case. DM also often provides this information.

When laying out a vector/matrix with a blocksize one must ensure that an equal number of the subsets appears on each MPI process. So, for example, if the above vector is distributed over 3 MPI processes one could use u0,v0,u1,v1 u2,v2 u3,v3 but one cannot use u0,v0,u1 v1,u2,v2 u3,v3. Another way to think about it is that one must split up the vector as indexed by block among the processes. For most multicomponent problems this type of decomposition is very natural in the logic of the code.

This blocking is only convenient, not necessary. You can specify your own field division using PCFieldSplitSetIS().

Thanks,

Matt

Barry

On Jan 19, 2024, at 3:19 AM, Pantelis Moschopoulos wrote:

Dear all,

When I am using PCFIELDSPLIT and pc type "schur" in serial mode everything works fine. When I turn now to parallel, I observe that the number of ranks that I can use must divide the number N without any remainder, where N is the number of unknowns. Otherwise, an error of the following form emerges: "Local columns of A10 3473 do not equal local rows of A00 3471".

Can I do something to overcome this?

Thanks,
Pantelis

--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/
-------------- next part --------------
An HTML attachment was scrubbed...
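A sketch (in C, untested) of what the PCFieldSplitSetIS route in question 2 could look like for this layout, with the trailing constraint row simply appended to split 0. The interlaced ux,uy,uz,p ordering, the ownership of the last global row and all names here are assumptions for illustration.

/* Untested sketch: define the two splits by hand with index sets, putting
   the trailing constraint row into split 0 with the velocities. The
   interlaced ux,uy,uz,p layout, the ownership of the last row and all names
   are assumptions for illustration. */
#include <petscksp.h>

static PetscErrorCode SetupSplitsWithConstraint(PC pc, Mat A, PetscInt nNodesLocal, PetscBool ownsConstraintRow)
{
  PetscInt  rStart, rEnd, nU, nP, cu = 0, cp = 0;
  PetscInt *uIdx, *pIdx;
  IS        isU, isP;

  PetscFunctionBeginUser;
  PetscCall(MatGetOwnershipRange(A, &rStart, &rEnd));
  nU = 3 * nNodesLocal + (ownsConstraintRow ? 1 : 0); /* velocities (+ constraint row) */
  nP = nNodesLocal;                                   /* pressures                     */
  PetscCall(PetscMalloc1(nU, &uIdx));
  PetscCall(PetscMalloc1(nP, &pIdx));
  for (PetscInt i = 0; i < nNodesLocal; ++i) {
    uIdx[cu++] = rStart + 4 * i + 0; /* ux */
    uIdx[cu++] = rStart + 4 * i + 1; /* uy */
    uIdx[cu++] = rStart + 4 * i + 2; /* uz */
    pIdx[cp++] = rStart + 4 * i + 3; /* p  */
  }
  if (ownsConstraintRow) uIdx[cu++] = rEnd - 1; /* global last row lives on this rank */
  PetscCall(ISCreateGeneral(PetscObjectComm((PetscObject)pc), nU, uIdx, PETSC_OWN_POINTER, &isU));
  PetscCall(ISCreateGeneral(PetscObjectComm((PetscObject)pc), nP, pIdx, PETSC_OWN_POINTER, &isP));
  PetscCall(PCSetType(pc, PCFIELDSPLIT));
  PetscCall(PCFieldSplitSetIS(pc, "0", isU));
  PetscCall(PCFieldSplitSetIS(pc, "1", isP));
  PetscCall(ISDestroy(&isU));
  PetscCall(ISDestroy(&isP));
  PetscFunctionReturn(PETSC_SUCCESS);
}

Note that PCFieldSplitSetBlockSize and PCFieldSplitSetFields are not used at all in this variant; the index sets fully describe the splits, so the constraint row does not have to fit any regular blocking.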
URL: From berend.vanwachem at ovgu.de Tue Jan 23 06:14:44 2024 From: berend.vanwachem at ovgu.de (Berend van Wachem) Date: Tue, 23 Jan 2024 13:14:44 +0100 Subject: [petsc-users] Unique number in each element of a DMPlex mesh In-Reply-To: References: <0323c633-ceac-7988-5a4e-261feb1259b4@ovgu.de> <2dfbd6f8-af0c-41df-8650-2497c277f76f@ovgu.de> Message-ID: <2b40ecff-cf6b-2e74-9429-3bea48114482@ovgu.de> Dear Matt, Please find attached a test for writing a DMPlex with hanging nodes, which is based on a refined DMForest. I've linked the code with the current main git version of Petsc. When the DMPlex gets written to disc, the code crashes with [0]PETSC ERROR: Unknown discretization type for field 0 although I specifically set the discretization for the DMPlex. The DMPlex based on a DMForest has "double" faces when there is a jump in cell size: the larger cell has one large face towards the refined cells, and the adjacent 4 smaller cells each have a face as well. I have written a function to remove the large face in such instances, rebuilding the DM, which seems to work. But I can only do this on 1 process and therefore lose the connectivity between the DM and the locations of the data points of the vector. I can open an issue on the Petsc git, if you prefer? Thanks and best regards, Berend. On 1/22/24 20:30, Matthew Knepley wrote: > On Mon, Jan 22, 2024 at 2:26?PM Berend van Wachem > > wrote: > > Dear Matt, > > The problem is that I haven't figured out how to write a polyhedral > DMplex in parallel. So, currently, I can write the Vec data > in parallel, but the cones for the cells/faces/edges/nodes for the > mesh from just one process to a file (after gathering the > DMplex to a single process). > > > Ah shoot. Can you send me a polyhedral mesh (or code to generate one) so > I can fix the parallel write problem? Or maybe it is already an issue > and I forgot? > > ?From the restart, I can then read the cone information from one > process from the file, recreate the DMPlex, and then > redistribute it. In this scenario, the Vec data I read in (in > parallel) will not match the correct cells of the DMplex. Hence, I > need to put it in the right place afterwards. > > > Yes, then searching makes sense. You could call DMLocatePoints(), but > maybe you are doing that. > > ? Thanks, > > ? ? ?Matt > > Best, Berend. > > On 1/22/24 20:03, Matthew Knepley wrote: > > On Mon, Jan 22, 2024 at 1:57?PM Berend van Wachem > > >> > wrote: > > > >? ? ?Dear Matt, > > > >? ? ?Thanks for your quick response. > >? ? ?I have a DMPlex with a polyhedral mesh, and have defined a > number of vectors with data at the cell center. I have generated > >? ? ?data > >? ? ?for a number of timesteps, and I write the data for each > point to a file together with the (x,y,z) co-ordinate of the cell > >? ? ?center. > > > >? ? ?When I want to do a restart from the DMPlex, I recreate the > DMplex with the polyhedral mesh, redistribute it, and for each cell > >? ? ?center find the corresponding (x,y,z) co-ordinate and insert > the data that corresponds to it. This is quite expensive, as it > >? ? ?means I need to compare doubles very often. > > > >? ? ?But reading your response, this may not be a bad way of doing it? > > > > > > It always seems to be a game of "what do you want to assume?". I > tend to assume that I wrote the DM and Vec in the same order, > > so when I load them they match. This is how Firedrake I/O works, > so that you can load up on a different number of processes > > (https://arxiv.org/abs/2401.05868 > >). 
> > > > So, are you writing a Vec, and then redistributing and writing > another Vec? In the scheme above, you would have to write both > > DMs. Are you trying to avoid this? > > > >? ? Thanks, > > > >? ? ? ?Matt > > > >? ? ?Thanks, > > > >? ? ?Berend. > > > >? ? ?On 1/22/24 18:58, Matthew Knepley wrote: > >? ? ? > On Mon, Jan 22, 2024 at 10:49?AM Berend van Wachem > > > > >? ? ? >>> wrote: > >? ? ? > > >? ? ? >? ? ?Dear Petsc-Team, > >? ? ? > > >? ? ? >? ? ?Is there a good way to define a unique integer number > in each element > >? ? ? >? ? ?(e.g. a cell) of a DMPlex mesh, which is in the same > location, > >? ? ? >? ? ?regardless of the number of processors or the > distribution of the mesh > >? ? ? >? ? ?over the processors? > >? ? ? > > >? ? ? >? ? ?So, for instance, if I have a DMPlex box mesh, the > top-right-front > >? ? ? >? ? ?corner element (e.g. cell) will always have the same > unique number, > >? ? ? >? ? ?regardless of the number of processors the mesh is > distributed over? > >? ? ? > > >? ? ? >? ? ?I want to be able to link the results I have achieved > with a mesh from > >? ? ? >? ? ?DMPlex on a certain number of cores to the same mesh > from a DMPlex on a > >? ? ? >? ? ?different number of cores. > >? ? ? > > >? ? ? >? ? ?Of course, I could make a tree based on the distance > of each element to > >? ? ? >? ? ?a certain point (based on the X,Y,Z co-ordinates of > the element), and go > >? ? ? >? ? ?through this tree in the same way and define an > integer based on this, > >? ? ? >? ? ?but that seems rather cumbersome. > >? ? ? > > >? ? ? > > >? ? ? > I think this is harder than it sounds. The distance will > not work because it can be very degenerate. > >? ? ? > You could lexicographically sort the coordinates, but this > is hard in parallel. It is fine if you are willing > >? ? ? > to gather everything on one process. You could put down a > p4est, use the Morton order to number them since this is stable > >? ? ?for a > >? ? ? > given refinement. And then within each box > lexicographically sort the centroids. This is definitely cumbersome, > but I cannot > >? ? ? > think of anything else. This also might have parallel > problems since you need to know how much overlap you need to fill > >? ? ?each box. > >? ? ? > > >? ? ? >? ? Thanks, > >? ? ? > > >? ? ? >? ? ? ? Matt > >? ? ? > > >? ? ? >? ? ?Thanks and best regards, Berend. > >? ? ? > > >? ? ? > -- > >? ? ? > What most experimenters take for granted before they begin > their experiments is infinitely more interesting than any > >? ? ?results to > >? ? ? > which their experiments lead. > >? ? ? > -- Norbert Wiener > >? ? ? > > >? ? ? > https://www.cse.buffalo.edu/~knepley/ > > > > > >? ? ? >> > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to > > which their experiments lead. > > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- A non-text attachment was scrubbed... 
Name: dmsavetest.c Type: text/x-csrc Size: 6685 bytes Desc: not available URL: From knepley at gmail.com Tue Jan 23 06:51:33 2024 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 23 Jan 2024 07:51:33 -0500 Subject: [petsc-users] Question about a parallel implementation of PCFIELDSPLIT In-Reply-To: References: Message-ID: On Tue, Jan 23, 2024 at 4:23?AM Pantelis Moschopoulos < pmoschopoulos at outlook.com> wrote: > Dear Matt and Dear Barry, > > I have some follow up questions regarding FieldSplit. > Let's assume that I solve again the 3D Stokes flow but now I have also a > global constraint that controls the flow rate at the inlet. Now, the matrix > has the same unknowns as before, i.e. ux0,uy0,uz0,p0//ux1,uy1,uz1,p1//..., > but the last line (and the last column) corresponds to the contribution of > the global constraint equation. I want to incorporate the last line (and > last column) into the local block of velocities (split 0) and the > pressure. The problem is how I do that. I have two questions: > > 1. Now, the block size should be 5 in the matrix and vector creation > for this problem? > > No. Blocksize is only useful when the vector/matrix layout is completely regular, meaning _every_ block looks the same. Here you have a single row to be added in. > > 1. I have to rely entirely on PCFieldSplitSetIS to create the two > blocks? Can I augment simply the previously defined block 0 with the last > line of the matrix? > > If you want to add in a single row, then you have to specify the IS yourself since we cannot generate it from the regular pattern. However, if you know that you will only ever have a single constraint row (which I assume is fairly dense), then I would suggest instead using MatLRC, which Jose developed for SLEPc. This handles the last row/col as a low-rank correction. One step of Sherman-Morrison-Woobury solves this exactly. It requires a solve for A, for which you can use FieldSplit as normal. Thanks, Matt > Up to this moment, I use the following commands to create the Field split: > ufields(3) = [0, 1, 2] > pfields(1) = [3] > > call PCSetType(pc, PCFIELDSPLIT, ierr) > call PCFieldSplitSetBlockSize(pc, 4,ierr) > call PCFieldSplitSetFields(pc, "0", 3, ufields, ufields,ierr) > call PCFieldSplitSetFields(pc, "1", 1, pfields, pfields,ierr) > > Thanks, > Pantelis > > > ------------------------------ > *???:* Matthew Knepley > *????????:* ?????????, 19 ?????????? 2024 11:31 ?? > *????:* Barry Smith > *????.:* Pantelis Moschopoulos ; > petsc-users at mcs.anl.gov > *????:* Re: [petsc-users] Question about a parallel implementation of > PCFIELDSPLIT > > On Fri, Jan 19, 2024 at 4:25?PM Barry Smith wrote: > > > Generally fieldsplit is used on problems that have a natural "split" of > the variables into two or more subsets. For example u0,v0,u1,v1,u2,v2,u3,v4 > This is often indicated in the vectors and matrices with the "blocksize" > argument, 2 in this case. DM also often provides this information. > > When laying out a vector/matrix with a blocksize one must ensure that > an equal number of of the subsets appears on each MPI process. So, for > example, if the above vector is distributed over 3 MPI processes one could > use u0,v0,u1,v1 u2,v2 u3,v3 but one cannot use u0,v0,u1 > v1,u2,v2 u3,v3. Another way to think about it is that one must split up > the vector as indexed by block among the processes. For most multicomponent > problems this type of decomposition is very natural in the logic of the > code. > > > This blocking is only convenient, not necessary. 
You can specify your own > field division using PCFieldSplitSetIS(). > > Thanks, > > Matt > > > Barry > > > On Jan 19, 2024, at 3:19?AM, Pantelis Moschopoulos < > pmoschopoulos at outlook.com> wrote: > > Dear all, > > When I am using PCFIELDSPLIT and pc type "schur" in serial mode everything > works fine. When I turn now to parallel, I observe that the number of ranks > that I can use must divide the number of N without any remainder, where N > is the number of unknowns. Otherwise, an error of the following form > emerges: "Local columns of A10 3473 do not equal local rows of A00 3471". > > Can I do something to overcome this? > > Thanks, > Pantelis > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From pmoschopoulos at outlook.com Tue Jan 23 07:16:42 2024 From: pmoschopoulos at outlook.com (Pantelis Moschopoulos) Date: Tue, 23 Jan 2024 13:16:42 +0000 Subject: [petsc-users] =?utf-8?q?=CE=91=CF=80=3A__Question_about_a_parall?= =?utf-8?q?el_implementation_of_PCFIELDSPLIT?= In-Reply-To: References: Message-ID: Dear Matt, Thank you for your response. This is an idealized setup where I have only one row/column. Sometimes we might need two or even three constraints based on the application. Thus, I will pursue the user-defined IS. When I supply the IS using the command PCFieldSplitSetIS, I do not specify anything in the matrix set up right? Thanks, Pantelis ________________________________ ???: Matthew Knepley ????????: ?????, 23 ?????????? 2024 2:51 ?? ????: Pantelis Moschopoulos ????.: Barry Smith ; petsc-users at mcs.anl.gov ????: Re: [petsc-users] Question about a parallel implementation of PCFIELDSPLIT On Tue, Jan 23, 2024 at 4:23?AM Pantelis Moschopoulos > wrote: Dear Matt and Dear Barry, I have some follow up questions regarding FieldSplit. Let's assume that I solve again the 3D Stokes flow but now I have also a global constraint that controls the flow rate at the inlet. Now, the matrix has the same unknowns as before, i.e. ux0,uy0,uz0,p0//ux1,uy1,uz1,p1//..., but the last line (and the last column) corresponds to the contribution of the global constraint equation. I want to incorporate the last line (and last column) into the local block of velocities (split 0) and the pressure. The problem is how I do that. I have two questions: 1. Now, the block size should be 5 in the matrix and vector creation for this problem? No. Blocksize is only useful when the vector/matrix layout is completely regular, meaning _every_ block looks the same. Here you have a single row to be added in. 1. I have to rely entirely on PCFieldSplitSetIS to create the two blocks? Can I augment simply the previously defined block 0 with the last line of the matrix? If you want to add in a single row, then you have to specify the IS yourself since we cannot generate it from the regular pattern. However, if you know that you will only ever have a single constraint row (which I assume is fairly dense), then I would suggest instead using MatLRC, which Jose developed for SLEPc. This handles the last row/col as a low-rank correction. 
One step of Sherman-Morrison-Woobury solves this exactly. It requires a solve for A, for which you can use FieldSplit as normal. Thanks, Matt Up to this moment, I use the following commands to create the Field split: ufields(3) = [0, 1, 2] pfields(1) = [3] call PCSetType(pc, PCFIELDSPLIT, ierr) call PCFieldSplitSetBlockSize(pc, 4,ierr) call PCFieldSplitSetFields(pc, "0", 3, ufields, ufields,ierr) call PCFieldSplitSetFields(pc, "1", 1, pfields, pfields,ierr) Thanks, Pantelis ________________________________ ???: Matthew Knepley > ????????: ?????????, 19 ?????????? 2024 11:31 ?? ????: Barry Smith > ????.: Pantelis Moschopoulos >; petsc-users at mcs.anl.gov > ????: Re: [petsc-users] Question about a parallel implementation of PCFIELDSPLIT On Fri, Jan 19, 2024 at 4:25?PM Barry Smith > wrote: Generally fieldsplit is used on problems that have a natural "split" of the variables into two or more subsets. For example u0,v0,u1,v1,u2,v2,u3,v4 This is often indicated in the vectors and matrices with the "blocksize" argument, 2 in this case. DM also often provides this information. When laying out a vector/matrix with a blocksize one must ensure that an equal number of of the subsets appears on each MPI process. So, for example, if the above vector is distributed over 3 MPI processes one could use u0,v0,u1,v1 u2,v2 u3,v3 but one cannot use u0,v0,u1 v1,u2,v2 u3,v3. Another way to think about it is that one must split up the vector as indexed by block among the processes. For most multicomponent problems this type of decomposition is very natural in the logic of the code. This blocking is only convenient, not necessary. You can specify your own field division using PCFieldSplitSetIS(). Thanks, Matt Barry On Jan 19, 2024, at 3:19?AM, Pantelis Moschopoulos > wrote: Dear all, When I am using PCFIELDSPLIT and pc type "schur" in serial mode everything works fine. When I turn now to parallel, I observe that the number of ranks that I can use must divide the number of N without any remainder, where N is the number of unknowns. Otherwise, an error of the following form emerges: "Local columns of A10 3473 do not equal local rows of A00 3471". Can I do something to overcome this? Thanks, Pantelis -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Jan 23 07:20:48 2024 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 23 Jan 2024 08:20:48 -0500 Subject: [petsc-users] Question about a parallel implementation of PCFIELDSPLIT In-Reply-To: References: Message-ID: On Tue, Jan 23, 2024 at 8:16?AM Pantelis Moschopoulos < pmoschopoulos at outlook.com> wrote: > Dear Matt, > > Thank you for your response. This is an idealized setup where I have only > one row/column. Sometimes we might need two or even three constraints based > on the application. Thus, I will pursue the user-defined IS. > Anything < 50 I would use MatLRC. The bottleneck is the inversion of a dense matrix of size k x k, where k is the number of constraints. Using an IS is definitely fine, but dense rows can detract from iterative convergence. 
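For reference, a rough C sketch of the generic MATLRC construction (untested; A, N, and the assumption that the constraint occupies the last global row are placeholders, and note that later in this thread a MATNEST formulation is suggested instead, because the bordered system adds an extra column as well as an extra row):

  #include <petscmat.h>

  Mat         A, U, V, M;      /* A: the assembled N x N operator (placeholder) */
  PetscInt    N, rstart, rend, crow;
  PetscScalar one = 1.0;

  /* U: N x 1 dense column that selects the constraint row (all zeros except a 1) */
  MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, N, 1, NULL, &U);
  MatGetOwnershipRange(U, &rstart, &rend);
  crow = N - 1;                                  /* assumed: constraint is the last row */
  if (crow >= rstart && crow < rend) MatSetValue(U, crow, 0, one, INSERT_VALUES);
  MatAssemblyBegin(U, MAT_FINAL_ASSEMBLY);
  MatAssemblyEnd(U, MAT_FINAL_ASSEMBLY);

  /* V: N x 1 dense column holding the coefficients of the constraint equation */
  MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, N, 1, NULL, &V);
  /* ... MatSetValue(V, j, 0, coeff_j, INSERT_VALUES) for the locally owned rows j ... */
  MatAssemblyBegin(V, MAT_FINAL_ASSEMBLY);
  MatAssemblyEnd(V, MAT_FINAL_ASSEMBLY);

  /* M behaves like A + U*V^T; the NULL Vec argument means the small C block is the identity */
  MatCreateLRC(A, U, NULL, V, &M);

Error checking (PetscCall) and the matching destroy calls are omitted to keep the fragment short.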
> When I supply the IS using the command PCFieldSplitSetIS, I do not specify > anything in the matrix set up right? > You should just need to specify the rows for each field as an IS. Thanks, Matt > Thanks, > Pantelis > ------------------------------ > *???:* Matthew Knepley > *????????:* ?????, 23 ?????????? 2024 2:51 ?? > *????:* Pantelis Moschopoulos > *????.:* Barry Smith ; petsc-users at mcs.anl.gov < > petsc-users at mcs.anl.gov> > *????:* Re: [petsc-users] Question about a parallel implementation of > PCFIELDSPLIT > > On Tue, Jan 23, 2024 at 4:23?AM Pantelis Moschopoulos < > pmoschopoulos at outlook.com> wrote: > > Dear Matt and Dear Barry, > > I have some follow up questions regarding FieldSplit. > Let's assume that I solve again the 3D Stokes flow but now I have also a > global constraint that controls the flow rate at the inlet. Now, the matrix > has the same unknowns as before, i.e. ux0,uy0,uz0,p0//ux1,uy1,uz1,p1//..., > but the last line (and the last column) corresponds to the contribution of > the global constraint equation. I want to incorporate the last line (and > last column) into the local block of velocities (split 0) and the > pressure. The problem is how I do that. I have two questions: > > 1. Now, the block size should be 5 in the matrix and vector creation > for this problem? > > No. Blocksize is only useful when the vector/matrix layout is completely > regular, meaning _every_ block looks the same. Here you have a single row > to be added in. > > > 1. I have to rely entirely on PCFieldSplitSetIS to create the two > blocks? Can I augment simply the previously defined block 0 with the last > line of the matrix? > > If you want to add in a single row, then you have to specify the IS > yourself since we cannot generate it from the regular pattern. > > However, if you know that you will only ever have a single constraint row > (which I assume is fairly dense), then I would suggest instead using > MatLRC, which Jose developed for SLEPc. This handles the last row/col as a > low-rank correction. One step of Sherman-Morrison-Woobury solves this > exactly. It requires a solve for A, for which you can use FieldSplit as > normal. > > Thanks, > > Matt > > > Up to this moment, I use the following commands to create the Field split: > ufields(3) = [0, 1, 2] > pfields(1) = [3] > > call PCSetType(pc, PCFIELDSPLIT, ierr) > call PCFieldSplitSetBlockSize(pc, 4,ierr) > call PCFieldSplitSetFields(pc, "0", 3, ufields, ufields,ierr) > call PCFieldSplitSetFields(pc, "1", 1, pfields, pfields,ierr) > > Thanks, > Pantelis > > > ------------------------------ > *???:* Matthew Knepley > *????????:* ?????????, 19 ?????????? 2024 11:31 ?? > *????:* Barry Smith > *????.:* Pantelis Moschopoulos ; > petsc-users at mcs.anl.gov > *????:* Re: [petsc-users] Question about a parallel implementation of > PCFIELDSPLIT > > On Fri, Jan 19, 2024 at 4:25?PM Barry Smith wrote: > > > Generally fieldsplit is used on problems that have a natural "split" of > the variables into two or more subsets. For example u0,v0,u1,v1,u2,v2,u3,v4 > This is often indicated in the vectors and matrices with the "blocksize" > argument, 2 in this case. DM also often provides this information. > > When laying out a vector/matrix with a blocksize one must ensure that > an equal number of of the subsets appears on each MPI process. So, for > example, if the above vector is distributed over 3 MPI processes one could > use u0,v0,u1,v1 u2,v2 u3,v3 but one cannot use u0,v0,u1 > v1,u2,v2 u3,v3. 
Another way to think about it is that one must split up > the vector as indexed by block among the processes. For most multicomponent > problems this type of decomposition is very natural in the logic of the > code. > > > This blocking is only convenient, not necessary. You can specify your own > field division using PCFieldSplitSetIS(). > > Thanks, > > Matt > > > Barry > > > On Jan 19, 2024, at 3:19?AM, Pantelis Moschopoulos < > pmoschopoulos at outlook.com> wrote: > > Dear all, > > When I am using PCFIELDSPLIT and pc type "schur" in serial mode everything > works fine. When I turn now to parallel, I observe that the number of ranks > that I can use must divide the number of N without any remainder, where N > is the number of unknowns. Otherwise, an error of the following form > emerges: "Local columns of A10 3473 do not equal local rows of A00 3471". > > Can I do something to overcome this? > > Thanks, > Pantelis > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From pmoschopoulos at outlook.com Tue Jan 23 08:45:55 2024 From: pmoschopoulos at outlook.com (Pantelis Moschopoulos) Date: Tue, 23 Jan 2024 14:45:55 +0000 Subject: [petsc-users] =?utf-8?q?=CE=91=CF=80=3A__Question_about_a_parall?= =?utf-8?q?el_implementation_of_PCFIELDSPLIT?= In-Reply-To: References: Message-ID: Dear Matt, I read about the MATLRC. However, its correct usage is not clear to me so I have the following questions: 1. The U and V input matrices should be created as dense using MatCreateDense? 2. I use the command MatCreateLRC just to declare the matrix and then MatLRCSetMats to pass the values of the constituents? Then, how do I proceed? How I apply the step of Sherman-Morrison-Woobury formula? I intend to use iterative solvers for A (main matrix) so I will not have its A^-1 at hand which I think is what the Sherman-Morrison-Woobury formula needs. Thanks, Pantelis ?????? ________________________________ ???: Matthew Knepley ????????: ?????, 23 ?????????? 2024 3:20 ?? ????: Pantelis Moschopoulos ????.: Barry Smith ; petsc-users at mcs.anl.gov ????: Re: [petsc-users] Question about a parallel implementation of PCFIELDSPLIT On Tue, Jan 23, 2024 at 8:16?AM Pantelis Moschopoulos > wrote: Dear Matt, Thank you for your response. This is an idealized setup where I have only one row/column. Sometimes we might need two or even three constraints based on the application. Thus, I will pursue the user-defined IS. Anything < 50 I would use MatLRC. The bottleneck is the inversion of a dense matrix of size k x k, where k is the number of constraints. Using an IS is definitely fine, but dense rows can detract from iterative convergence. When I supply the IS using the command PCFieldSplitSetIS, I do not specify anything in the matrix set up right? You should just need to specify the rows for each field as an IS. 
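As a concrete illustration, a minimal C sketch of supplying the two splits by hand (untested; pc is the PCFIELDSPLIT preconditioner pulled from the KSP, and nv, np, velIdx, presIdx are placeholders for the locally owned velocity and pressure row numbers; putting the global constraint row into split "0" is an assumption about the layout discussed above):

  #include <petscksp.h>

  IS        isU, isP;
  PetscInt *velIdx, *presIdx, nv, np;   /* global row indices owned by this rank */

  /* velIdx: locally owned velocity rows, plus the constraint row on the rank
     that owns it; presIdx: locally owned pressure rows                        */
  ISCreateGeneral(PETSC_COMM_WORLD, nv, velIdx, PETSC_COPY_VALUES, &isU);
  ISCreateGeneral(PETSC_COMM_WORLD, np, presIdx, PETSC_COPY_VALUES, &isP);

  PCSetType(pc, PCFIELDSPLIT);
  PCFieldSplitSetIS(pc, "0", isU);      /* velocities (+ constraint row) */
  PCFieldSplitSetIS(pc, "1", isP);      /* pressure                      */
  ISDestroy(&isU);
  ISDestroy(&isP);

When the splits are given this way, the PCFieldSplitSetBlockSize()/PCFieldSplitSetFields() calls quoted earlier in the thread are not needed.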
Thanks, Matt Thanks, Pantelis ________________________________ ???: Matthew Knepley > ????????: ?????, 23 ?????????? 2024 2:51 ?? ????: Pantelis Moschopoulos > ????.: Barry Smith >; petsc-users at mcs.anl.gov > ????: Re: [petsc-users] Question about a parallel implementation of PCFIELDSPLIT On Tue, Jan 23, 2024 at 4:23?AM Pantelis Moschopoulos > wrote: Dear Matt and Dear Barry, I have some follow up questions regarding FieldSplit. Let's assume that I solve again the 3D Stokes flow but now I have also a global constraint that controls the flow rate at the inlet. Now, the matrix has the same unknowns as before, i.e. ux0,uy0,uz0,p0//ux1,uy1,uz1,p1//..., but the last line (and the last column) corresponds to the contribution of the global constraint equation. I want to incorporate the last line (and last column) into the local block of velocities (split 0) and the pressure. The problem is how I do that. I have two questions: 1. Now, the block size should be 5 in the matrix and vector creation for this problem? No. Blocksize is only useful when the vector/matrix layout is completely regular, meaning _every_ block looks the same. Here you have a single row to be added in. 1. I have to rely entirely on PCFieldSplitSetIS to create the two blocks? Can I augment simply the previously defined block 0 with the last line of the matrix? If you want to add in a single row, then you have to specify the IS yourself since we cannot generate it from the regular pattern. However, if you know that you will only ever have a single constraint row (which I assume is fairly dense), then I would suggest instead using MatLRC, which Jose developed for SLEPc. This handles the last row/col as a low-rank correction. One step of Sherman-Morrison-Woobury solves this exactly. It requires a solve for A, for which you can use FieldSplit as normal. Thanks, Matt Up to this moment, I use the following commands to create the Field split: ufields(3) = [0, 1, 2] pfields(1) = [3] call PCSetType(pc, PCFIELDSPLIT, ierr) call PCFieldSplitSetBlockSize(pc, 4,ierr) call PCFieldSplitSetFields(pc, "0", 3, ufields, ufields,ierr) call PCFieldSplitSetFields(pc, "1", 1, pfields, pfields,ierr) Thanks, Pantelis ________________________________ ???: Matthew Knepley > ????????: ?????????, 19 ?????????? 2024 11:31 ?? ????: Barry Smith > ????.: Pantelis Moschopoulos >; petsc-users at mcs.anl.gov > ????: Re: [petsc-users] Question about a parallel implementation of PCFIELDSPLIT On Fri, Jan 19, 2024 at 4:25?PM Barry Smith > wrote: Generally fieldsplit is used on problems that have a natural "split" of the variables into two or more subsets. For example u0,v0,u1,v1,u2,v2,u3,v4 This is often indicated in the vectors and matrices with the "blocksize" argument, 2 in this case. DM also often provides this information. When laying out a vector/matrix with a blocksize one must ensure that an equal number of of the subsets appears on each MPI process. So, for example, if the above vector is distributed over 3 MPI processes one could use u0,v0,u1,v1 u2,v2 u3,v3 but one cannot use u0,v0,u1 v1,u2,v2 u3,v3. Another way to think about it is that one must split up the vector as indexed by block among the processes. For most multicomponent problems this type of decomposition is very natural in the logic of the code. This blocking is only convenient, not necessary. You can specify your own field division using PCFieldSplitSetIS(). 
Thanks, Matt Barry On Jan 19, 2024, at 3:19?AM, Pantelis Moschopoulos > wrote: Dear all, When I am using PCFIELDSPLIT and pc type "schur" in serial mode everything works fine. When I turn now to parallel, I observe that the number of ranks that I can use must divide the number of N without any remainder, where N is the number of unknowns. Otherwise, an error of the following form emerges: "Local columns of A10 3473 do not equal local rows of A00 3471". Can I do something to overcome this? Thanks, Pantelis -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Jan 23 09:21:49 2024 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 23 Jan 2024 10:21:49 -0500 Subject: [petsc-users] Question about a parallel implementation of PCFIELDSPLIT In-Reply-To: References: Message-ID: On Tue, Jan 23, 2024 at 9:45?AM Pantelis Moschopoulos < pmoschopoulos at outlook.com> wrote: > Dear Matt, > > I read about the MATLRC. However, its correct usage is not clear to me so > I have the following questions: > > 1. The U and V input matrices should be created as dense using > MatCreateDense? > > Yes. If you have one row, it looks like a vector, or a matrix with one column. If you have 1 row on the bottom, then U = [0, 0, ..., 0, 1] V = [the row] C = [1] will give you that. However, you have an extra row and column? > > 1. I use the command MatCreateLRC just to declare the matrix and then > MatLRCSetMats to pass the values of the constituents? > > You can use MatCreate(comm, &M) MatSetSizes(M, ...) MatSetType(M, MATLRC) MatLRCSetMats(M, ...) However, you are right that it is a little more complicated, because A is not just the upper block here. > 1. > > Then, how do I proceed? How I apply the step of Sherman-Morrison-Woobury > formula? I intend to use iterative solvers for A (main matrix) so I will > not have its A^-1 at hand which I think is what the > Sherman-Morrison-Woobury formula needs. > I think I was wrong. MatLRC is not the best fit. We should use MatNest instead. Then you could have A u v^t 0 as your matrix. We could still get an explicit Schur complement automatically using nested FieldSplit. So 1) Make the MATNEST matrix as shown above 2) Use PCFIELDSPLIT for it. This should be an easy IS, since there is only one row in the second one. 3) Select the full Schur complement -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_schur_fact_type full -pc_fieldsplit_schur_precondition full 4) Use a recursive FieldSplit (might be able to use -fieldsplit_0_pc_fieldsplit_detect_saddle_point) -fieldsplit_0_pc_type fieldsplit -fieldsplit_0_pc_fieldsplit_0_fields 0,1,2 -fieldsplit_0_pc_fieldsplit_1_fields 3 I think this does what you want, and should be much easier than getting MatLRC to do it. 
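A rough C sketch of step 1), building the bordered operator as a MATNEST (untested; A is the assembled Stokes matrix, uCol and vRow are placeholders for the constraint column and row stored as N x 1 and 1 x N matrices assembled elsewhere, and pc is the PCFIELDSPLIT preconditioner from the KSP):

  #include <petscksp.h>

  Mat A, uCol, vRow, B;
  Mat blocks[4];
  IS  nestRows[2], nestCols[2];

  blocks[0] = A;     blocks[1] = uCol;   /* [ A    u ] */
  blocks[2] = vRow;  blocks[3] = NULL;   /* [ v^T  0 ]  NULL means a zero block */
  MatCreateNest(PETSC_COMM_WORLD, 2, NULL, 2, NULL, blocks, &B);

  /* the index sets describing the two nest rows can be recovered and handed
     to PCFieldSplitSetIS() for the outer split (PCFIELDSPLIT can often pick
     them up from the MATNEST automatically as well)                          */
  MatNestGetISs(B, nestRows, nestCols);
  PCFieldSplitSetIS(pc, "0", nestRows[0]);
  PCFieldSplitSetIS(pc, "1", nestRows[1]);

Steps 2)-4) are then just the command-line options listed above.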
Thanks, Matt > Thanks, > Pantelis > ------------------------------ > *???:* Matthew Knepley > *????????:* ?????, 23 ?????????? 2024 3:20 ?? > *????:* Pantelis Moschopoulos > *????.:* Barry Smith ; petsc-users at mcs.anl.gov < > petsc-users at mcs.anl.gov> > *????:* Re: [petsc-users] Question about a parallel implementation of > PCFIELDSPLIT > > On Tue, Jan 23, 2024 at 8:16?AM Pantelis Moschopoulos < > pmoschopoulos at outlook.com> wrote: > > Dear Matt, > > Thank you for your response. This is an idealized setup where I have only > one row/column. Sometimes we might need two or even three constraints based > on the application. Thus, I will pursue the user-defined IS. > > > Anything < 50 I would use MatLRC. The bottleneck is the inversion of a > dense matrix of size k x k, where k is the number of constraints. Using an > IS is definitely fine, but dense rows can detract from iterative > convergence. > > > When I supply the IS using the command PCFieldSplitSetIS, I do not specify > anything in the matrix set up right? > > > You should just need to specify the rows for each field as an IS. > > Thanks, > > Matt > > > Thanks, > Pantelis > ------------------------------ > *???:* Matthew Knepley > *????????:* ?????, 23 ?????????? 2024 2:51 ?? > *????:* Pantelis Moschopoulos > *????.:* Barry Smith ; petsc-users at mcs.anl.gov < > petsc-users at mcs.anl.gov> > *????:* Re: [petsc-users] Question about a parallel implementation of > PCFIELDSPLIT > > On Tue, Jan 23, 2024 at 4:23?AM Pantelis Moschopoulos < > pmoschopoulos at outlook.com> wrote: > > Dear Matt and Dear Barry, > > I have some follow up questions regarding FieldSplit. > Let's assume that I solve again the 3D Stokes flow but now I have also a > global constraint that controls the flow rate at the inlet. Now, the matrix > has the same unknowns as before, i.e. ux0,uy0,uz0,p0//ux1,uy1,uz1,p1//..., > but the last line (and the last column) corresponds to the contribution of > the global constraint equation. I want to incorporate the last line (and > last column) into the local block of velocities (split 0) and the > pressure. The problem is how I do that. I have two questions: > > 1. Now, the block size should be 5 in the matrix and vector creation > for this problem? > > No. Blocksize is only useful when the vector/matrix layout is completely > regular, meaning _every_ block looks the same. Here you have a single row > to be added in. > > > 1. I have to rely entirely on PCFieldSplitSetIS to create the two > blocks? Can I augment simply the previously defined block 0 with the last > line of the matrix? > > If you want to add in a single row, then you have to specify the IS > yourself since we cannot generate it from the regular pattern. > > However, if you know that you will only ever have a single constraint row > (which I assume is fairly dense), then I would suggest instead using > MatLRC, which Jose developed for SLEPc. This handles the last row/col as a > low-rank correction. One step of Sherman-Morrison-Woobury solves this > exactly. It requires a solve for A, for which you can use FieldSplit as > normal. 
> > Thanks, > > Matt > > > Up to this moment, I use the following commands to create the Field split: > ufields(3) = [0, 1, 2] > pfields(1) = [3] > > call PCSetType(pc, PCFIELDSPLIT, ierr) > call PCFieldSplitSetBlockSize(pc, 4,ierr) > call PCFieldSplitSetFields(pc, "0", 3, ufields, ufields,ierr) > call PCFieldSplitSetFields(pc, "1", 1, pfields, pfields,ierr) > > Thanks, > Pantelis > > > ------------------------------ > *???:* Matthew Knepley > *????????:* ?????????, 19 ?????????? 2024 11:31 ?? > *????:* Barry Smith > *????.:* Pantelis Moschopoulos ; > petsc-users at mcs.anl.gov > *????:* Re: [petsc-users] Question about a parallel implementation of > PCFIELDSPLIT > > On Fri, Jan 19, 2024 at 4:25?PM Barry Smith wrote: > > > Generally fieldsplit is used on problems that have a natural "split" of > the variables into two or more subsets. For example u0,v0,u1,v1,u2,v2,u3,v4 > This is often indicated in the vectors and matrices with the "blocksize" > argument, 2 in this case. DM also often provides this information. > > When laying out a vector/matrix with a blocksize one must ensure that > an equal number of of the subsets appears on each MPI process. So, for > example, if the above vector is distributed over 3 MPI processes one could > use u0,v0,u1,v1 u2,v2 u3,v3 but one cannot use u0,v0,u1 > v1,u2,v2 u3,v3. Another way to think about it is that one must split up > the vector as indexed by block among the processes. For most multicomponent > problems this type of decomposition is very natural in the logic of the > code. > > > This blocking is only convenient, not necessary. You can specify your own > field division using PCFieldSplitSetIS(). > > Thanks, > > Matt > > > Barry > > > On Jan 19, 2024, at 3:19?AM, Pantelis Moschopoulos < > pmoschopoulos at outlook.com> wrote: > > Dear all, > > When I am using PCFIELDSPLIT and pc type "schur" in serial mode everything > works fine. When I turn now to parallel, I observe that the number of ranks > that I can use must divide the number of N without any remainder, where N > is the number of unknowns. Otherwise, an error of the following form > emerges: "Local columns of A10 3473 do not equal local rows of A00 3471". > > Can I do something to overcome this? > > Thanks, > Pantelis > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From pmoschopoulos at outlook.com Tue Jan 23 10:05:59 2024 From: pmoschopoulos at outlook.com (Pantelis Moschopoulos) Date: Tue, 23 Jan 2024 16:05:59 +0000 Subject: [petsc-users] =?utf-8?q?=CE=91=CF=80=3A__Question_about_a_parall?= =?utf-8?q?el_implementation_of_PCFIELDSPLIT?= In-Reply-To: References: Message-ID: Dear Matt, Thank you for your explanation. The new methodology is straightforward to implement. Still, I have one more question . When I use the option -pc_fieldsplit_schur_precondition full, PETSc computes internally the exact Schur complement matrix representation. Based on the example matrix that you send, the Schur complement is: S = -v^t (A^-1) u. How will PETSc will calculate the vector (A^-1) u ? Or it calculates the exact Schur complement matrix differently? Thanks, Pantelis ________________________________ ???: Matthew Knepley ????????: ?????, 23 ?????????? 2024 5:21 ?? ????: Pantelis Moschopoulos ????.: Barry Smith ; petsc-users at mcs.anl.gov ????: Re: [petsc-users] Question about a parallel implementation of PCFIELDSPLIT On Tue, Jan 23, 2024 at 9:45?AM Pantelis Moschopoulos > wrote: Dear Matt, I read about the MATLRC. However, its correct usage is not clear to me so I have the following questions: 1. The U and V input matrices should be created as dense using MatCreateDense? Yes. If you have one row, it looks like a vector, or a matrix with one column. If you have 1 row on the bottom, then U = [0, 0, ..., 0, 1] V = [the row] C = [1] will give you that. However, you have an extra row and column? 1. I use the command MatCreateLRC just to declare the matrix and then MatLRCSetMats to pass the values of the constituents? You can use MatCreate(comm, &M) MatSetSizes(M, ...) MatSetType(M, MATLRC) MatLRCSetMats(M, ...) However, you are right that it is a little more complicated, because A is not just the upper block here. 1. Then, how do I proceed? How I apply the step of Sherman-Morrison-Woobury formula? I intend to use iterative solvers for A (main matrix) so I will not have its A^-1 at hand which I think is what the Sherman-Morrison-Woobury formula needs. I think I was wrong. MatLRC is not the best fit. We should use MatNest instead. Then you could have A u v^t 0 as your matrix. We could still get an explicit Schur complement automatically using nested FieldSplit. So 1) Make the MATNEST matrix as shown above 2) Use PCFIELDSPLIT for it. This should be an easy IS, since there is only one row in the second one. 3) Select the full Schur complement -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_schur_fact_type full -pc_fieldsplit_schur_precondition full 4) Use a recursive FieldSplit (might be able to use -fieldsplit_0_pc_fieldsplit_detect_saddle_point) -fieldsplit_0_pc_type fieldsplit -fieldsplit_0_pc_fieldsplit_0_fields 0,1,2 -fieldsplit_0_pc_fieldsplit_1_fields 3 I think this does what you want, and should be much easier than getting MatLRC to do it. Thanks, Matt Thanks, Pantelis ?????? ________________________________ ???: Matthew Knepley > ????????: ?????, 23 ?????????? 2024 3:20 ?? ????: Pantelis Moschopoulos > ????.: Barry Smith >; petsc-users at mcs.anl.gov > ????: Re: [petsc-users] Question about a parallel implementation of PCFIELDSPLIT On Tue, Jan 23, 2024 at 8:16?AM Pantelis Moschopoulos > wrote: Dear Matt, Thank you for your response. This is an idealized setup where I have only one row/column. Sometimes we might need two or even three constraints based on the application. Thus, I will pursue the user-defined IS. Anything < 50 I would use MatLRC. 
The bottleneck is the inversion of a dense matrix of size k x k, where k is the number of constraints. Using an IS is definitely fine, but dense rows can detract from iterative convergence. When I supply the IS using the command PCFieldSplitSetIS, I do not specify anything in the matrix set up right? You should just need to specify the rows for each field as an IS. Thanks, Matt Thanks, Pantelis ________________________________ ???: Matthew Knepley > ????????: ?????, 23 ?????????? 2024 2:51 ?? ????: Pantelis Moschopoulos > ????.: Barry Smith >; petsc-users at mcs.anl.gov > ????: Re: [petsc-users] Question about a parallel implementation of PCFIELDSPLIT On Tue, Jan 23, 2024 at 4:23?AM Pantelis Moschopoulos > wrote: Dear Matt and Dear Barry, I have some follow up questions regarding FieldSplit. Let's assume that I solve again the 3D Stokes flow but now I have also a global constraint that controls the flow rate at the inlet. Now, the matrix has the same unknowns as before, i.e. ux0,uy0,uz0,p0//ux1,uy1,uz1,p1//..., but the last line (and the last column) corresponds to the contribution of the global constraint equation. I want to incorporate the last line (and last column) into the local block of velocities (split 0) and the pressure. The problem is how I do that. I have two questions: 1. Now, the block size should be 5 in the matrix and vector creation for this problem? No. Blocksize is only useful when the vector/matrix layout is completely regular, meaning _every_ block looks the same. Here you have a single row to be added in. 1. I have to rely entirely on PCFieldSplitSetIS to create the two blocks? Can I augment simply the previously defined block 0 with the last line of the matrix? If you want to add in a single row, then you have to specify the IS yourself since we cannot generate it from the regular pattern. However, if you know that you will only ever have a single constraint row (which I assume is fairly dense), then I would suggest instead using MatLRC, which Jose developed for SLEPc. This handles the last row/col as a low-rank correction. One step of Sherman-Morrison-Woobury solves this exactly. It requires a solve for A, for which you can use FieldSplit as normal. Thanks, Matt Up to this moment, I use the following commands to create the Field split: ufields(3) = [0, 1, 2] pfields(1) = [3] call PCSetType(pc, PCFIELDSPLIT, ierr) call PCFieldSplitSetBlockSize(pc, 4,ierr) call PCFieldSplitSetFields(pc, "0", 3, ufields, ufields,ierr) call PCFieldSplitSetFields(pc, "1", 1, pfields, pfields,ierr) Thanks, Pantelis ________________________________ ???: Matthew Knepley > ????????: ?????????, 19 ?????????? 2024 11:31 ?? ????: Barry Smith > ????.: Pantelis Moschopoulos >; petsc-users at mcs.anl.gov > ????: Re: [petsc-users] Question about a parallel implementation of PCFIELDSPLIT On Fri, Jan 19, 2024 at 4:25?PM Barry Smith > wrote: Generally fieldsplit is used on problems that have a natural "split" of the variables into two or more subsets. For example u0,v0,u1,v1,u2,v2,u3,v4 This is often indicated in the vectors and matrices with the "blocksize" argument, 2 in this case. DM also often provides this information. When laying out a vector/matrix with a blocksize one must ensure that an equal number of of the subsets appears on each MPI process. So, for example, if the above vector is distributed over 3 MPI processes one could use u0,v0,u1,v1 u2,v2 u3,v3 but one cannot use u0,v0,u1 v1,u2,v2 u3,v3. 
Another way to think about it is that one must split up the vector as indexed by block among the processes. For most multicomponent problems this type of decomposition is very natural in the logic of the code. This blocking is only convenient, not necessary. You can specify your own field division using PCFieldSplitSetIS(). Thanks, Matt Barry On Jan 19, 2024, at 3:19?AM, Pantelis Moschopoulos > wrote: Dear all, When I am using PCFIELDSPLIT and pc type "schur" in serial mode everything works fine. When I turn now to parallel, I observe that the number of ranks that I can use must divide the number of N without any remainder, where N is the number of unknowns. Otherwise, an error of the following form emerges: "Local columns of A10 3473 do not equal local rows of A00 3471". Can I do something to overcome this? Thanks, Pantelis -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Jan 23 10:15:28 2024 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 23 Jan 2024 11:15:28 -0500 Subject: [petsc-users] Question about a parallel implementation of PCFIELDSPLIT In-Reply-To: References: Message-ID: On Tue, Jan 23, 2024 at 11:06?AM Pantelis Moschopoulos < pmoschopoulos at outlook.com> wrote: > Dear Matt, > > Thank you for your explanation. The new methodology is straightforward to > implement. > Still, I have one more question . When I use the option > -pc_fieldsplit_schur_precondition full, PETSc computes internally the exact > Schur complement matrix representation. Based on the example matrix that > you send, the Schur complement is: S = -v^t (A^-1) u. How will PETSc will > calculate the vector (A^-1) u ? Or it calculates the exact Schur complement > matrix differently? > FULL calls MatSchurComplementComputeExplicitOperator(), which calls KSPMatSolve() to compute A^{-1} u, which default to KSPSolve for each column if no specialized code is available. So it should just use your solver for the (0,0) block, and then take the dot product with v. Thanks, Matt > Thanks, > Pantelis > ------------------------------ > *???:* Matthew Knepley > *????????:* ?????, 23 ?????????? 2024 5:21 ?? > *????:* Pantelis Moschopoulos > *????.:* Barry Smith ; petsc-users at mcs.anl.gov < > petsc-users at mcs.anl.gov> > *????:* Re: [petsc-users] Question about a parallel implementation of > PCFIELDSPLIT > > On Tue, Jan 23, 2024 at 9:45?AM Pantelis Moschopoulos < > pmoschopoulos at outlook.com> wrote: > > Dear Matt, > > I read about the MATLRC. However, its correct usage is not clear to me so > I have the following questions: > > 1. The U and V input matrices should be created as dense using > MatCreateDense? 
> > > Yes. If you have one row, it looks like a vector, or a matrix with one > column. > > If you have 1 row on the bottom, then > > U = [0, 0, ..., 0, 1] > V = [the row] > C = [1] > > will give you that. However, you have an extra row and column? > > > > 1. I use the command MatCreateLRC just to declare the matrix and then > MatLRCSetMats to pass the values of the constituents? > > You can use > > MatCreate(comm, &M) > MatSetSizes(M, ...) > MatSetType(M, MATLRC) > MatLRCSetMats(M, ...) > > However, you are right that it is a little more complicated, because A is > not just the upper block here. > > > 1. > > Then, how do I proceed? How I apply the step of Sherman-Morrison-Woobury > formula? I intend to use iterative solvers for A (main matrix) so I will > not have its A^-1 at hand which I think is what the > Sherman-Morrison-Woobury formula needs. > > > I think I was wrong. MatLRC is not the best fit. We should use MatNest > instead. Then you could have > > A u > v^t 0 > > as your matrix. We could still get an explicit Schur complement > automatically using nested FieldSplit. So > > 1) Make the MATNEST matrix as shown above > > 2) Use PCFIELDSPLIT for it. This should be an easy IS, since there is > only one row in the second one. > > 3) Select the full Schur complement > > -pc_type fieldsplit > -pc_fieldsplit_type schur > -pc_fieldsplit_schur_fact_type full > -pc_fieldsplit_schur_precondition full > > 4) Use a recursive FieldSplit (might be able to use > -fieldsplit_0_pc_fieldsplit_detect_saddle_point) > > -fieldsplit_0_pc_type fieldsplit > -fieldsplit_0_pc_fieldsplit_0_fields 0,1,2 > -fieldsplit_0_pc_fieldsplit_1_fields 3 > > I think this does what you want, and should be much easier than getting > MatLRC to do it. > > Thanks, > > Matt > > > Thanks, > Pantelis > ------------------------------ > *???:* Matthew Knepley > *????????:* ?????, 23 ?????????? 2024 3:20 ?? > *????:* Pantelis Moschopoulos > *????.:* Barry Smith ; petsc-users at mcs.anl.gov < > petsc-users at mcs.anl.gov> > *????:* Re: [petsc-users] Question about a parallel implementation of > PCFIELDSPLIT > > On Tue, Jan 23, 2024 at 8:16?AM Pantelis Moschopoulos < > pmoschopoulos at outlook.com> wrote: > > Dear Matt, > > Thank you for your response. This is an idealized setup where I have only > one row/column. Sometimes we might need two or even three constraints based > on the application. Thus, I will pursue the user-defined IS. > > > Anything < 50 I would use MatLRC. The bottleneck is the inversion of a > dense matrix of size k x k, where k is the number of constraints. Using an > IS is definitely fine, but dense rows can detract from iterative > convergence. > > > When I supply the IS using the command PCFieldSplitSetIS, I do not specify > anything in the matrix set up right? > > > You should just need to specify the rows for each field as an IS. > > Thanks, > > Matt > > > Thanks, > Pantelis > ------------------------------ > *???:* Matthew Knepley > *????????:* ?????, 23 ?????????? 2024 2:51 ?? > *????:* Pantelis Moschopoulos > *????.:* Barry Smith ; petsc-users at mcs.anl.gov < > petsc-users at mcs.anl.gov> > *????:* Re: [petsc-users] Question about a parallel implementation of > PCFIELDSPLIT > > On Tue, Jan 23, 2024 at 4:23?AM Pantelis Moschopoulos < > pmoschopoulos at outlook.com> wrote: > > Dear Matt and Dear Barry, > > I have some follow up questions regarding FieldSplit. > Let's assume that I solve again the 3D Stokes flow but now I have also a > global constraint that controls the flow rate at the inlet. 
Now, the matrix > has the same unknowns as before, i.e. ux0,uy0,uz0,p0//ux1,uy1,uz1,p1//..., > but the last line (and the last column) corresponds to the contribution of > the global constraint equation. I want to incorporate the last line (and > last column) into the local block of velocities (split 0) and the > pressure. The problem is how I do that. I have two questions: > > 1. Now, the block size should be 5 in the matrix and vector creation > for this problem? > > No. Blocksize is only useful when the vector/matrix layout is completely > regular, meaning _every_ block looks the same. Here you have a single row > to be added in. > > > 1. I have to rely entirely on PCFieldSplitSetIS to create the two > blocks? Can I augment simply the previously defined block 0 with the last > line of the matrix? > > If you want to add in a single row, then you have to specify the IS > yourself since we cannot generate it from the regular pattern. > > However, if you know that you will only ever have a single constraint row > (which I assume is fairly dense), then I would suggest instead using > MatLRC, which Jose developed for SLEPc. This handles the last row/col as a > low-rank correction. One step of Sherman-Morrison-Woobury solves this > exactly. It requires a solve for A, for which you can use FieldSplit as > normal. > > Thanks, > > Matt > > > Up to this moment, I use the following commands to create the Field split: > ufields(3) = [0, 1, 2] > pfields(1) = [3] > > call PCSetType(pc, PCFIELDSPLIT, ierr) > call PCFieldSplitSetBlockSize(pc, 4,ierr) > call PCFieldSplitSetFields(pc, "0", 3, ufields, ufields,ierr) > call PCFieldSplitSetFields(pc, "1", 1, pfields, pfields,ierr) > > Thanks, > Pantelis > > > ------------------------------ > *???:* Matthew Knepley > *????????:* ?????????, 19 ?????????? 2024 11:31 ?? > *????:* Barry Smith > *????.:* Pantelis Moschopoulos ; > petsc-users at mcs.anl.gov > *????:* Re: [petsc-users] Question about a parallel implementation of > PCFIELDSPLIT > > On Fri, Jan 19, 2024 at 4:25?PM Barry Smith wrote: > > > Generally fieldsplit is used on problems that have a natural "split" of > the variables into two or more subsets. For example u0,v0,u1,v1,u2,v2,u3,v4 > This is often indicated in the vectors and matrices with the "blocksize" > argument, 2 in this case. DM also often provides this information. > > When laying out a vector/matrix with a blocksize one must ensure that > an equal number of of the subsets appears on each MPI process. So, for > example, if the above vector is distributed over 3 MPI processes one could > use u0,v0,u1,v1 u2,v2 u3,v3 but one cannot use u0,v0,u1 > v1,u2,v2 u3,v3. Another way to think about it is that one must split up > the vector as indexed by block among the processes. For most multicomponent > problems this type of decomposition is very natural in the logic of the > code. > > > This blocking is only convenient, not necessary. You can specify your own > field division using PCFieldSplitSetIS(). > > Thanks, > > Matt > > > Barry > > > On Jan 19, 2024, at 3:19?AM, Pantelis Moschopoulos < > pmoschopoulos at outlook.com> wrote: > > Dear all, > > When I am using PCFIELDSPLIT and pc type "schur" in serial mode everything > works fine. When I turn now to parallel, I observe that the number of ranks > that I can use must divide the number of N without any remainder, where N > is the number of unknowns. Otherwise, an error of the following form > emerges: "Local columns of A10 3473 do not equal local rows of A00 3471". 
> > Can I do something to overcome this? > > Thanks, > Pantelis > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From michael at paraffinalia.co.uk Tue Jan 23 10:39:46 2024 From: michael at paraffinalia.co.uk (michael at paraffinalia.co.uk) Date: Tue, 23 Jan 2024 16:39:46 +0000 Subject: [petsc-users] Bug in VecNorm, 3.20.3 Message-ID: <713a24497b89ae5b6a304173730e1a8c@paraffinalia.co.uk> Hello, I have used the GMRES solver in PETSc successfully up to now, but on installing the most recent release, 3.20.3, the solver fails by exiting early. Output from the code is: lt-nbi-solve-laplace: starting PETSc solver [23.0537] 0 KSP Residual norm < 1.e-11 Linear solve converged due to CONVERGED_ATOL iterations 0 lt-nbi-solve-laplace: 0 iterations [23.0542] (22.9678) and tracing execution shows the norm returned by VecNorm to be 0. If I modify the function by commenting out line 217 of src/vec/vec/interface/rvector.c /* if (flg) PetscFunctionReturn(PETSC_SUCCESS); */ the code executes correctly: lt-nbi-solve-laplace: starting PETSc solver [22.9392] 0 KSP Residual norm 1.10836 1 KSP Residual norm 0.0778301 2 KSP Residual norm 0.0125121 3 KSP Residual norm 0.00165836 4 KSP Residual norm 0.000164066 5 KSP Residual norm 2.12824e-05 6 KSP Residual norm 4.50696e-06 7 KSP Residual norm 5.85082e-07 Linear solve converged due to CONVERGED_RTOL iterations 7 My compile options are: PETSC_ARCH=linux-gnu-real ./configure --with-mpi=0 --with-scalar-type=real --with-threadsafety --with-debugging=0 --with-log=0 --with-openmp uname -a returns: 5.15.80 #1 SMP PREEMPT Sun Nov 27 13:28:05 CST 2022 x86_64 Intel(R) Core(TM) i5-6200U CPU @ 2.30GHz GenuineIntel GNU/Linux From junchao.zhang at gmail.com Tue Jan 23 12:09:15 2024 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Tue, 23 Jan 2024 12:09:15 -0600 Subject: [petsc-users] Bug in VecNorm, 3.20.3 In-Reply-To: <713a24497b89ae5b6a304173730e1a8c@paraffinalia.co.uk> References: <713a24497b89ae5b6a304173730e1a8c@paraffinalia.co.uk> Message-ID: Do you have an example to reproduce it? --Junchao Zhang On Tue, Jan 23, 2024 at 10:49?AM wrote: > Hello, > > I have used the GMRES solver in PETSc successfully up to now, but on > installing the most recent release, 3.20.3, the solver fails by exiting > early. 
Output from the code is: > > lt-nbi-solve-laplace: starting PETSc solver [23.0537] > 0 KSP Residual norm < 1.e-11 > Linear solve converged due to CONVERGED_ATOL iterations 0 > lt-nbi-solve-laplace: 0 iterations [23.0542] (22.9678) > > and tracing execution shows the norm returned by VecNorm to be 0. > > If I modify the function by commenting out line 217 of > > src/vec/vec/interface/rvector.c > > /* if (flg) PetscFunctionReturn(PETSC_SUCCESS); */ > > the code executes correctly: > > lt-nbi-solve-laplace: starting PETSc solver [22.9392] > 0 KSP Residual norm 1.10836 > 1 KSP Residual norm 0.0778301 > 2 KSP Residual norm 0.0125121 > 3 KSP Residual norm 0.00165836 > 4 KSP Residual norm 0.000164066 > 5 KSP Residual norm 2.12824e-05 > 6 KSP Residual norm 4.50696e-06 > 7 KSP Residual norm 5.85082e-07 > Linear solve converged due to CONVERGED_RTOL iterations 7 > > My compile options are: > > PETSC_ARCH=linux-gnu-real ./configure --with-mpi=0 > --with-scalar-type=real --with-threadsafety --with-debugging=0 > --with-log=0 --with-openmp > > uname -a returns: > > 5.15.80 #1 SMP PREEMPT Sun Nov 27 13:28:05 CST 2022 x86_64 Intel(R) > Core(TM) i5-6200U CPU @ 2.30GHz GenuineIntel GNU/Linux > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Tue Jan 23 12:59:11 2024 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 23 Jan 2024 13:59:11 -0500 Subject: [petsc-users] Bug in VecNorm, 3.20.3 In-Reply-To: <713a24497b89ae5b6a304173730e1a8c@paraffinalia.co.uk> References: <713a24497b89ae5b6a304173730e1a8c@paraffinalia.co.uk> Message-ID: <76FE6626-3B55-4CCE-955E-7D246B83869E@petsc.dev> This could happen if the values in the vector get changed but the PetscObjectState does not get updated. Normally this is impossible, any action that changes a vectors values changes its state (so for example calling VecGetArray()/VecRestoreArray() updates the state. Are you accessing the vector values in any non-standard way? Barry > On Jan 23, 2024, at 11:39?AM, michael at paraffinalia.co.uk wrote: > > Hello, > > I have used the GMRES solver in PETSc successfully up to now, but on installing the most recent release, 3.20.3, the solver fails by exiting early. Output from the code is: > > lt-nbi-solve-laplace: starting PETSc solver [23.0537] > 0 KSP Residual norm < 1.e-11 > Linear solve converged due to CONVERGED_ATOL iterations 0 > lt-nbi-solve-laplace: 0 iterations [23.0542] (22.9678) > > and tracing execution shows the norm returned by VecNorm to be 0. 
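To illustrate the point above about PetscObjectState: values written through the standard accessors bump the vector's state, so VecNorm() knows any cached norm is stale; writing through a saved raw pointer does not. A minimal sketch of the two access patterns (hypothetical user code, not taken from the report; x is a placeholder Vec):

  #include <petscvec.h>

  PetscScalar *a, *stale;

  /* standard access: VecRestoreArray() increases the object state, so the
     next VecNorm() recomputes the norm instead of reusing a cached value   */
  VecGetArray(x, &a);
  a[0] = 2.0;
  VecRestoreArray(x, &a);

  /* non-standard access: keeping a private copy of the pointer and writing
     through it after the restore changes the values without changing the
     state, so the cached-norm shortcut in VecNorm() can return a stale norm */
  VecGetArray(x, &a);
  stale = a;
  VecRestoreArray(x, &a);
  stale[0] = 3.0;   /* values changed, PetscObjectState not updated */

The second pattern is exactly the kind of access the question above is asking about.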
> > If I modify the function by commenting out line 217 of > > src/vec/vec/interface/rvector.c > > /* if (flg) PetscFunctionReturn(PETSC_SUCCESS); */ > > the code executes correctly: > > lt-nbi-solve-laplace: starting PETSc solver [22.9392] > 0 KSP Residual norm 1.10836 > 1 KSP Residual norm 0.0778301 > 2 KSP Residual norm 0.0125121 > 3 KSP Residual norm 0.00165836 > 4 KSP Residual norm 0.000164066 > 5 KSP Residual norm 2.12824e-05 > 6 KSP Residual norm 4.50696e-06 > 7 KSP Residual norm 5.85082e-07 > Linear solve converged due to CONVERGED_RTOL iterations 7 > > My compile options are: > > PETSC_ARCH=linux-gnu-real ./configure --with-mpi=0 --with-scalar-type=real --with-threadsafety --with-debugging=0 --with-log=0 --with-openmp > > uname -a returns: > > 5.15.80 #1 SMP PREEMPT Sun Nov 27 13:28:05 CST 2022 x86_64 Intel(R) Core(TM) i5-6200U CPU @ 2.30GHz GenuineIntel GNU/Linux > From stefano.zampini at gmail.com Tue Jan 23 13:48:24 2024 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Tue, 23 Jan 2024 22:48:24 +0300 Subject: [petsc-users] Bug in VecNorm, 3.20.3 In-Reply-To: <76FE6626-3B55-4CCE-955E-7D246B83869E@petsc.dev> References: <713a24497b89ae5b6a304173730e1a8c@paraffinalia.co.uk> <76FE6626-3B55-4CCE-955E-7D246B83869E@petsc.dev> Message-ID: petsc main in debug mode has some additional checks for this cases. Can you run with the main branch and configure petsc using --with-debugging=1? Il giorno mar 23 gen 2024 alle ore 22:35 Barry Smith ha scritto: > > This could happen if the values in the vector get changed but the > PetscObjectState does not get updated. Normally this is impossible, any > action that changes a vectors values changes its state (so for example > calling VecGetArray()/VecRestoreArray() updates the state. > > Are you accessing the vector values in any non-standard way? > > Barry > > > > On Jan 23, 2024, at 11:39?AM, michael at paraffinalia.co.uk wrote: > > > > Hello, > > > > I have used the GMRES solver in PETSc successfully up to now, but on > installing the most recent release, 3.20.3, the solver fails by exiting > early. Output from the code is: > > > > lt-nbi-solve-laplace: starting PETSc solver [23.0537] > > 0 KSP Residual norm < 1.e-11 > > Linear solve converged due to CONVERGED_ATOL iterations 0 > > lt-nbi-solve-laplace: 0 iterations [23.0542] (22.9678) > > > > and tracing execution shows the norm returned by VecNorm to be 0. > > > > If I modify the function by commenting out line 217 of > > > > src/vec/vec/interface/rvector.c > > > > /* if (flg) PetscFunctionReturn(PETSC_SUCCESS); */ > > > > the code executes correctly: > > > > lt-nbi-solve-laplace: starting PETSc solver [22.9392] > > 0 KSP Residual norm 1.10836 > > 1 KSP Residual norm 0.0778301 > > 2 KSP Residual norm 0.0125121 > > 3 KSP Residual norm 0.00165836 > > 4 KSP Residual norm 0.000164066 > > 5 KSP Residual norm 2.12824e-05 > > 6 KSP Residual norm 4.50696e-06 > > 7 KSP Residual norm 5.85082e-07 > > Linear solve converged due to CONVERGED_RTOL iterations 7 > > > > My compile options are: > > > > PETSC_ARCH=linux-gnu-real ./configure --with-mpi=0 > --with-scalar-type=real --with-threadsafety --with-debugging=0 --with-log=0 > --with-openmp > > > > uname -a returns: > > > > 5.15.80 #1 SMP PREEMPT Sun Nov 27 13:28:05 CST 2022 x86_64 Intel(R) > Core(TM) i5-6200U CPU @ 2.30GHz GenuineIntel GNU/Linux > > > > -- Stefano -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From pmoschopoulos at outlook.com Wed Jan 24 00:29:59 2024 From: pmoschopoulos at outlook.com (Pantelis Moschopoulos) Date: Wed, 24 Jan 2024 06:29:59 +0000 Subject: [petsc-users] =?utf-8?q?=CE=91=CF=80=3A__Question_about_a_parall?= =?utf-8?q?el_implementation_of_PCFIELDSPLIT?= In-Reply-To: References: Message-ID: Thanks very much Matt for the detailed explanations. I was asking about the Schur complement because I have tried a "manual" version of this procedure without the field split. Eventually, it needs the solution of three linear systems, just like A^{-1} u. If you have the LU of A, then everything is perfect. However, if you use iterative solvers, things slow down considerably and I have found that it is faster to just add the constraint in matrix A and solve it with an iterative solver, albeit the larger iteration count that the augmented matrix needs. I will incorporate the suggestions and I will come back with the results. Thanks, Pantelis ________________________________ ???: Matthew Knepley ????????: ?????, 23 ?????????? 2024 6:15 ?? ????: Pantelis Moschopoulos ????.: Barry Smith ; petsc-users at mcs.anl.gov ????: Re: [petsc-users] Question about a parallel implementation of PCFIELDSPLIT On Tue, Jan 23, 2024 at 11:06?AM Pantelis Moschopoulos > wrote: Dear Matt, Thank you for your explanation. The new methodology is straightforward to implement. Still, I have one more question . When I use the option -pc_fieldsplit_schur_precondition full, PETSc computes internally the exact Schur complement matrix representation. Based on the example matrix that you send, the Schur complement is: S = -v^t (A^-1) u. How will PETSc will calculate the vector (A^-1) u ? Or it calculates the exact Schur complement matrix differently? FULL calls MatSchurComplementComputeExplicitOperator(), which calls KSPMatSolve() to compute A^{-1} u, which default to KSPSolve for each column if no specialized code is available. So it should just use your solver for the (0,0) block, and then take the dot product with v. Thanks, Matt Thanks, Pantelis ________________________________ ???: Matthew Knepley > ????????: ?????, 23 ?????????? 2024 5:21 ?? ????: Pantelis Moschopoulos > ????.: Barry Smith >; petsc-users at mcs.anl.gov > ????: Re: [petsc-users] Question about a parallel implementation of PCFIELDSPLIT On Tue, Jan 23, 2024 at 9:45?AM Pantelis Moschopoulos > wrote: Dear Matt, I read about the MATLRC. However, its correct usage is not clear to me so I have the following questions: 1. The U and V input matrices should be created as dense using MatCreateDense? Yes. If you have one row, it looks like a vector, or a matrix with one column. If you have 1 row on the bottom, then U = [0, 0, ..., 0, 1] V = [the row] C = [1] will give you that. However, you have an extra row and column? 1. I use the command MatCreateLRC just to declare the matrix and then MatLRCSetMats to pass the values of the constituents? You can use MatCreate(comm, &M) MatSetSizes(M, ...) MatSetType(M, MATLRC) MatLRCSetMats(M, ...) However, you are right that it is a little more complicated, because A is not just the upper block here. 1. Then, how do I proceed? How I apply the step of Sherman-Morrison-Woobury formula? I intend to use iterative solvers for A (main matrix) so I will not have its A^-1 at hand which I think is what the Sherman-Morrison-Woobury formula needs. I think I was wrong. MatLRC is not the best fit. We should use MatNest instead. Then you could have A u v^t 0 as your matrix. 
We could still get an explicit Schur complement automatically using nested FieldSplit. So 1) Make the MATNEST matrix as shown above 2) Use PCFIELDSPLIT for it. This should be an easy IS, since there is only one row in the second one. 3) Select the full Schur complement -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_schur_fact_type full -pc_fieldsplit_schur_precondition full 4) Use a recursive FieldSplit (might be able to use -fieldsplit_0_pc_fieldsplit_detect_saddle_point) -fieldsplit_0_pc_type fieldsplit -fieldsplit_0_pc_fieldsplit_0_fields 0,1,2 -fieldsplit_0_pc_fieldsplit_1_fields 3 I think this does what you want, and should be much easier than getting MatLRC to do it. Thanks, Matt Thanks, Pantelis ?????? ________________________________ ???: Matthew Knepley > ????????: ?????, 23 ?????????? 2024 3:20 ?? ????: Pantelis Moschopoulos > ????.: Barry Smith >; petsc-users at mcs.anl.gov > ????: Re: [petsc-users] Question about a parallel implementation of PCFIELDSPLIT On Tue, Jan 23, 2024 at 8:16?AM Pantelis Moschopoulos > wrote: Dear Matt, Thank you for your response. This is an idealized setup where I have only one row/column. Sometimes we might need two or even three constraints based on the application. Thus, I will pursue the user-defined IS. Anything < 50 I would use MatLRC. The bottleneck is the inversion of a dense matrix of size k x k, where k is the number of constraints. Using an IS is definitely fine, but dense rows can detract from iterative convergence. When I supply the IS using the command PCFieldSplitSetIS, I do not specify anything in the matrix set up right? You should just need to specify the rows for each field as an IS. Thanks, Matt Thanks, Pantelis ________________________________ ???: Matthew Knepley > ????????: ?????, 23 ?????????? 2024 2:51 ?? ????: Pantelis Moschopoulos > ????.: Barry Smith >; petsc-users at mcs.anl.gov > ????: Re: [petsc-users] Question about a parallel implementation of PCFIELDSPLIT On Tue, Jan 23, 2024 at 4:23?AM Pantelis Moschopoulos > wrote: Dear Matt and Dear Barry, I have some follow up questions regarding FieldSplit. Let's assume that I solve again the 3D Stokes flow but now I have also a global constraint that controls the flow rate at the inlet. Now, the matrix has the same unknowns as before, i.e. ux0,uy0,uz0,p0//ux1,uy1,uz1,p1//..., but the last line (and the last column) corresponds to the contribution of the global constraint equation. I want to incorporate the last line (and last column) into the local block of velocities (split 0) and the pressure. The problem is how I do that. I have two questions: 1. Now, the block size should be 5 in the matrix and vector creation for this problem? No. Blocksize is only useful when the vector/matrix layout is completely regular, meaning _every_ block looks the same. Here you have a single row to be added in. 1. I have to rely entirely on PCFieldSplitSetIS to create the two blocks? Can I augment simply the previously defined block 0 with the last line of the matrix? If you want to add in a single row, then you have to specify the IS yourself since we cannot generate it from the regular pattern. However, if you know that you will only ever have a single constraint row (which I assume is fairly dense), then I would suggest instead using MatLRC, which Jose developed for SLEPc. This handles the last row/col as a low-rank correction. One step of Sherman-Morrison-Woobury solves this exactly. It requires a solve for A, for which you can use FieldSplit as normal. 
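For reference, with the low-rank correction written as U C V^t, the Sherman-Morrison-Woodbury identity reads

  (A + U C V^t)^-1 = A^-1 - A^-1 U (C^-1 + V^t A^-1 U)^-1 V^t A^-1

so each application costs solves with A plus one small dense solve of size k x k, where k is the number of constraints (a single constraint row gives k = 1).
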
Thanks, Matt Up to this moment, I use the following commands to create the Field split: ufields(3) = [0, 1, 2] pfields(1) = [3] call PCSetType(pc, PCFIELDSPLIT, ierr) call PCFieldSplitSetBlockSize(pc, 4,ierr) call PCFieldSplitSetFields(pc, "0", 3, ufields, ufields,ierr) call PCFieldSplitSetFields(pc, "1", 1, pfields, pfields,ierr) Thanks, Pantelis ________________________________ ???: Matthew Knepley > ????????: ?????????, 19 ?????????? 2024 11:31 ?? ????: Barry Smith > ????.: Pantelis Moschopoulos >; petsc-users at mcs.anl.gov > ????: Re: [petsc-users] Question about a parallel implementation of PCFIELDSPLIT On Fri, Jan 19, 2024 at 4:25?PM Barry Smith > wrote: Generally fieldsplit is used on problems that have a natural "split" of the variables into two or more subsets. For example u0,v0,u1,v1,u2,v2,u3,v4 This is often indicated in the vectors and matrices with the "blocksize" argument, 2 in this case. DM also often provides this information. When laying out a vector/matrix with a blocksize one must ensure that an equal number of of the subsets appears on each MPI process. So, for example, if the above vector is distributed over 3 MPI processes one could use u0,v0,u1,v1 u2,v2 u3,v3 but one cannot use u0,v0,u1 v1,u2,v2 u3,v3. Another way to think about it is that one must split up the vector as indexed by block among the processes. For most multicomponent problems this type of decomposition is very natural in the logic of the code. This blocking is only convenient, not necessary. You can specify your own field division using PCFieldSplitSetIS(). Thanks, Matt Barry On Jan 19, 2024, at 3:19?AM, Pantelis Moschopoulos > wrote: Dear all, When I am using PCFIELDSPLIT and pc type "schur" in serial mode everything works fine. When I turn now to parallel, I observe that the number of ranks that I can use must divide the number of N without any remainder, where N is the number of unknowns. Otherwise, an error of the following form emerges: "Local columns of A10 3473 do not equal local rows of A00 3471". Can I do something to overcome this? Thanks, Pantelis -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jourdon.anthon at gmail.com Fri Jan 26 05:48:17 2024 From: jourdon.anthon at gmail.com (Anthony Jourdon) Date: Fri, 26 Jan 2024 12:48:17 +0100 Subject: [petsc-users] Trying to understand -log_view when using HIP kernels (ex34) In-Reply-To: <0D321AB8-9E8F-4484-8D52-AE39FCCD8644@petsc.dev> References: <0D321AB8-9E8F-4484-8D52-AE39FCCD8644@petsc.dev> Message-ID: Hello, Thank you for your answers. I am working with Dave May on this topic. Still running src/ksp/ksp/tutorials/ex34 with the same options reported by Dave, I added the option -log_view_gpu_time. Now the log provides gpu flop/s instead of nans. However, I have trouble understanding the numbers reported in the log (file attached). 1. The numbers reported for Total Mflop/s and GPU Mflop/s are different even when 100% of the work is supposed to be done on the GPU. 2. The numbers reported for GPU Mflop/s are always higher than the numbers reported for Total Mflop/s. As I understand, the Total Mflop/s should be the sum of both GPU and CPU flop/s, but if the gpu does 100% of the work, why are there different numbers reported by the GPU and Total flop/s columns and why the GPU flop/s are always higher than the Total flop/s ? Or am I missing something? Thank you for your attention. Anthony Jourdon Le sam. 20 janv. 2024 ? 02:25, Barry Smith a ?crit : > > Nans indicate we do not have valid computational times for these > operations; think of them as Not Available. Providing valid times for the > "inner" operations listed with Nans requires inaccurate times (higher) for > the outer operations, since extra synchronization between the CPU and GPU > must be done to get valid times for the inner options. We opted to have the > best valid times for the outer operations since those times reflect the > time of the application. > > > > > > > On Jan 19, 2024, at 12:35?PM, Dave May wrote: > > > > Hi all, > > > > I am trying to understand the logging information associated with the > %flops-performed-on-the-gpu reported by -log_view when running > > src/ksp/ksp/tutorials/ex34 > > with the following options > > -da_grid_x 192 > > -da_grid_y 192 > > -da_grid_z 192 > > -dm_mat_type seqaijhipsparse > > -dm_vec_type seqhip > > -ksp_max_it 10 > > -ksp_monitor > > -ksp_type richardson > > -ksp_view > > -log_view > > -mg_coarse_ksp_max_it 2 > > -mg_coarse_ksp_type richardson > > -mg_coarse_pc_type none > > -mg_levels_ksp_type richardson > > -mg_levels_pc_type none > > -options_left > > -pc_mg_levels 3 > > -pc_mg_log > > -pc_type mg > > > > This config is not intended to actually solve the problem, rather it is > a stripped down set of options designed to understand what parts of the > smoothers are being executed on the GPU. > > > > With respect to the log file attached, my first set of questions related > to the data reported under "Event Stage 2: MG Apply". > > > > [1] Why is the log littered with nan's? > > * I don't understand how and why "GPU Mflop/s" should be reported as nan > when a value is given for "GPU %F" (see MatMult for example). > > > > * For events executed on the GPU, I assume the column "Time (sec)" > relates to "CPU execute time", this would explain why we see a nan in "Time > (sec)" for MatMult. > > If my assumption is correct, how should I interpret the column "Flop > (Max)" which is showing 1.92e+09? 
> > I would assume of "Time (sec)" relates to the CPU then "Flop (Max)" > should also relate to CPU and GPU flops would be logged in "GPU Mflop/s" > > > > [2] More curious is that within "Event Stage 2: MG Apply" KSPSolve, > MGSmooth Level 0, MGSmooth Level 1, MGSmooth Level 2 all report "GPU %F" as > 93. I believe this value should be 100 as the smoother (and coarse grid > solver) are configured as richardson(2)+none and thus should run entirely > on the GPU. > > Furthermore, when one inspects all events listed under "Event Stage 2: > MG Apply" those events which do flops correctly report "GPU %F" as 100. > > And the events showing "GPU %F" = 0 such as > > MatHIPSPARSCopyTo, VecCopy, VecSet, PCApply, DCtxSync > > don't do any flops (on the CPU or GPU) - which is also correct (although > non GPU events should show nan??) > > > > Hence I am wondering what is the explanation for the missing 7% from > "GPU %F" for KSPSolve and MGSmooth {0,1,2}?? > > > > Does anyone understand this -log_view, or can explain to me how to > interpret it? > > > > It could simply be that: > > a) something is messed up with -pc_mg_log > > b) something is messed up with the PETSc build > > c) I am putting too much faith in -log_view and should profile the code > differently. > > > > Either way I'd really like to understand what is going on. > > > > > > Cheers, > > Dave > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ex34_192_mg_seqhip_richardson_pcnone_gpulog.out Type: application/octet-stream Size: 22577 bytes Desc: not available URL: From martin.diehl at kuleuven.be Fri Jan 26 07:35:20 2024 From: martin.diehl at kuleuven.be (Martin Diehl) Date: Fri, 26 Jan 2024 13:35:20 +0000 Subject: [petsc-users] User meeting in Cologne: Visit to KU Leuven Message-ID: Dear PETSc team, I was wondering if any of you would be interested to visit the NUMA research group at KU Leuven (https://wms.cs.kuleuven.be/groups/NUMA) for a after or before the user meeting in Cologne. We have/had several PETSc users and in general there seems to be a big overlap between scientific interests. with best regards Martin -- KU Leuven Department of Computer Science Department of Materials Engineering Celestijnenlaan 200a 3001 Leuven, Belgium -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 659 bytes Desc: This is a digitally signed message part URL: From michael at paraffinalia.co.uk Fri Jan 26 08:03:01 2024 From: michael at paraffinalia.co.uk (michael at paraffinalia.co.uk) Date: Fri, 26 Jan 2024 14:03:01 +0000 Subject: [petsc-users] Bug in VecNorm, 3.20.3 In-Reply-To: References: <713a24497b89ae5b6a304173730e1a8c@paraffinalia.co.uk> Message-ID: <13593fc52714c640b907dbca572dc8dd@paraffinalia.co.uk> On 2024-01-23 18:09, Junchao Zhang wrote: > Do you have an example to reproduce it? > > --Junchao Zhang I have put a minimum example on github: https://github.com/mjcarley/petsc-test It does seem that the problem occurs if I do not use the PETSc interface to do a matrix multiplication. In the original code, the PETSc matrix is a wrapper for a Fast Multipole Method evaluation; in the minimum example I have simulated this by using an array as a matrix. 
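In outline, that kind of wrapper looks like the following (a generic MATSHELL sketch with made-up names, not the exact code in the repository; here the callback just applies a dense array stored row-major in the shell context):

#include <petscmat.h>

typedef struct {
  PetscScalar *a; /* n x n entries, row-major */
  PetscInt     n;
} UserCtx;

static PetscErrorCode UserMult(Mat A, Vec x, Vec y)
{
  UserCtx           *ctx;
  const PetscScalar *xa;
  PetscScalar       *ya;

  PetscFunctionBeginUser;
  PetscCall(MatShellGetContext(A, &ctx));
  PetscCall(VecGetArrayRead(x, &xa));
  PetscCall(VecGetArrayWrite(y, &ya));
  for (PetscInt i = 0; i < ctx->n; i++) {
    ya[i] = 0.0;
    for (PetscInt j = 0; j < ctx->n; j++) ya[i] += ctx->a[i * ctx->n + j] * xa[j];
  }
  PetscCall(VecRestoreArrayRead(x, &xa));
  PetscCall(VecRestoreArrayWrite(y, &ya)); /* bumps the Vec state so cached norms are invalidated */
  PetscFunctionReturn(PETSC_SUCCESS);
}

/* hooked up with, e.g.:
   MatCreateShell(PETSC_COMM_SELF, n, n, n, n, &ctx, &A);
   MatShellSetOperation(A, MATOP_MULT, (void (*)(void))UserMult);  */
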
The sample code generates a randomised matrix A and reference solution vector ref, and generates a right hand side b = A*ref which is then supplied as the right hand side for the GMRES solver. If I use the PETSc matrix multiplication, the solver behaves as expected; if I generate b directly from the underlying array for the matrix, I get the result 0 KSP Residual norm < 1.e-11 Linear solve converged due to CONVERGED_ATOL iterations 0 From pierre at joliv.et Fri Jan 26 08:11:23 2024 From: pierre at joliv.et (Pierre Jolivet) Date: Fri, 26 Jan 2024 15:11:23 +0100 Subject: [petsc-users] Bug in VecNorm, 3.20.3 In-Reply-To: <13593fc52714c640b907dbca572dc8dd@paraffinalia.co.uk> References: <713a24497b89ae5b6a304173730e1a8c@paraffinalia.co.uk> <13593fc52714c640b907dbca572dc8dd@paraffinalia.co.uk> Message-ID: <3213C1AB-2C98-4373-9289-9CA05D705908@joliv.et> > On 26 Jan 2024, at 3:03?PM, michael at paraffinalia.co.uk wrote: > > On 2024-01-23 18:09, Junchao Zhang wrote: >> Do you have an example to reproduce it? >> --Junchao Zhang > > I have put a minimum example on github: > > https://github.com/mjcarley/petsc-test > > It does seem that the problem occurs if I do not use the PETSc interface to do a matrix multiplication. > > In the original code, the PETSc matrix is a wrapper for a Fast Multipole Method evaluation; in the minimum example I have simulated this by using an array as a matrix. The sample code generates a randomised matrix A and reference solution vector ref, and generates a right hand side > > b = A*ref > > which is then supplied as the right hand side for the GMRES solver. If I use the PETSc matrix multiplication, the solver behaves as expected; if I generate b directly from the underlying array for the matrix, I get the result You should not use VecGetArrayRead() if you change the Vec, but instead, VecGetArrayWrite(). Does that solve the issue? Thanks, Pierre > 0 KSP Residual norm < 1.e-11 > Linear solve converged due to CONVERGED_ATOL iterations 0 From pierre at joliv.et Fri Jan 26 08:23:54 2024 From: pierre at joliv.et (Pierre Jolivet) Date: Fri, 26 Jan 2024 15:23:54 +0100 Subject: [petsc-users] Bug in VecNorm, 3.20.3 In-Reply-To: <3213C1AB-2C98-4373-9289-9CA05D705908@joliv.et> References: <713a24497b89ae5b6a304173730e1a8c@paraffinalia.co.uk> <13593fc52714c640b907dbca572dc8dd@paraffinalia.co.uk> <3213C1AB-2C98-4373-9289-9CA05D705908@joliv.et> Message-ID: <800C216A-25B5-43B4-9EF5-682AA6D17900@joliv.et> > On 26 Jan 2024, at 3:11?PM, Pierre Jolivet wrote: > >> >> On 26 Jan 2024, at 3:03?PM, michael at paraffinalia.co.uk wrote: >> >> On 2024-01-23 18:09, Junchao Zhang wrote: >>> Do you have an example to reproduce it? >>> --Junchao Zhang >> >> I have put a minimum example on github: >> >> https://github.com/mjcarley/petsc-test >> >> It does seem that the problem occurs if I do not use the PETSc interface to do a matrix multiplication. >> >> In the original code, the PETSc matrix is a wrapper for a Fast Multipole Method evaluation; in the minimum example I have simulated this by using an array as a matrix. The sample code generates a randomised matrix A and reference solution vector ref, and generates a right hand side >> >> b = A*ref >> >> which is then supplied as the right hand side for the GMRES solver. If I use the PETSc matrix multiplication, the solver behaves as expected; if I generate b directly from the underlying array for the matrix, I get the result > > You should not use VecGetArrayRead() if you change the Vec, but instead, VecGetArrayWrite(). > Does that solve the issue? 
Sorry, I sent the message too fast, you are also missing a couple of calls to VecRestoreArray[Read,Write](). These are the ones which will let PETSc know that the Vec has had its state being increased, and that the cached norm are not valid anymore, see https://petsc.org/release/src/vec/vec/interface/rvector.c.html#line2177 Thanks, Pierre > Thanks, > Pierre > >> 0 KSP Residual norm < 1.e-11 >> Linear solve converged due to CONVERGED_ATOL iterations 0 -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Fri Jan 26 09:27:12 2024 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 26 Jan 2024 10:27:12 -0500 Subject: [petsc-users] Trying to understand -log_view when using HIP kernels (ex34) In-Reply-To: References: <0D321AB8-9E8F-4484-8D52-AE39FCCD8644@petsc.dev> Message-ID: When run with -log_view_gpu_time each event has two times: the time of kernel on the GPU (computed directly on the GPU using GPU timers) and the time of the CPU clock. The time on the CPU for the event always encloses the entire kernel (hence its time is always at least as large as the time of the kernel). Basically the CPU time in an event where all the action happens on the GPU is the time of kernel launch plus the time to run the kernel and confirm it is finished. So the GPU flop is the flop rate actually achieved on the GPU while the CPU flop rate is the effective flop rate the user is getting on the application. > On Jan 26, 2024, at 6:48?AM, Anthony Jourdon wrote: > > Hello, > > Thank you for your answers. > I am working with Dave May on this topic. > > Still running src/ksp/ksp/tutorials/ex34 with the same options reported by Dave, I added the option -log_view_gpu_time. > Now the log provides gpu flop/s instead of nans. > However, I have trouble understanding the numbers reported in the log (file attached). > The numbers reported for Total Mflop/s and GPU Mflop/s are different even when 100% of the work is supposed to be done on the GPU. > The numbers reported for GPU Mflop/s are always higher than the numbers reported for Total Mflop/s. > As I understand, the Total Mflop/s should be the sum of both GPU and CPU flop/s, but if the gpu does 100% of the work, why are there different numbers reported by the GPU and Total flop/s columns and why the GPU flop/s are always higher than the Total flop/s ? > Or am I missing something? > > Thank you for your attention. > Anthony Jourdon > > > > Le sam. 20 janv. 2024 ? 02:25, Barry Smith > a ?crit : >> >> Nans indicate we do not have valid computational times for these operations; think of them as Not Available. Providing valid times for the "inner" operations listed with Nans requires inaccurate times (higher) for the outer operations, since extra synchronization between the CPU and GPU must be done to get valid times for the inner options. We opted to have the best valid times for the outer operations since those times reflect the time of the application. 
>> >> >> >> >> >> > On Jan 19, 2024, at 12:35?PM, Dave May > wrote: >> > >> > Hi all, >> > >> > I am trying to understand the logging information associated with the %flops-performed-on-the-gpu reported by -log_view when running >> > src/ksp/ksp/tutorials/ex34 >> > with the following options >> > -da_grid_x 192 >> > -da_grid_y 192 >> > -da_grid_z 192 >> > -dm_mat_type seqaijhipsparse >> > -dm_vec_type seqhip >> > -ksp_max_it 10 >> > -ksp_monitor >> > -ksp_type richardson >> > -ksp_view >> > -log_view >> > -mg_coarse_ksp_max_it 2 >> > -mg_coarse_ksp_type richardson >> > -mg_coarse_pc_type none >> > -mg_levels_ksp_type richardson >> > -mg_levels_pc_type none >> > -options_left >> > -pc_mg_levels 3 >> > -pc_mg_log >> > -pc_type mg >> > >> > This config is not intended to actually solve the problem, rather it is a stripped down set of options designed to understand what parts of the smoothers are being executed on the GPU. >> > >> > With respect to the log file attached, my first set of questions related to the data reported under "Event Stage 2: MG Apply". >> > >> > [1] Why is the log littered with nan's? >> > * I don't understand how and why "GPU Mflop/s" should be reported as nan when a value is given for "GPU %F" (see MatMult for example). >> > >> > * For events executed on the GPU, I assume the column "Time (sec)" relates to "CPU execute time", this would explain why we see a nan in "Time (sec)" for MatMult. >> > If my assumption is correct, how should I interpret the column "Flop (Max)" which is showing 1.92e+09? >> > I would assume of "Time (sec)" relates to the CPU then "Flop (Max)" should also relate to CPU and GPU flops would be logged in "GPU Mflop/s" >> > >> > [2] More curious is that within "Event Stage 2: MG Apply" KSPSolve, MGSmooth Level 0, MGSmooth Level 1, MGSmooth Level 2 all report "GPU %F" as 93. I believe this value should be 100 as the smoother (and coarse grid solver) are configured as richardson(2)+none and thus should run entirely on the GPU. >> > Furthermore, when one inspects all events listed under "Event Stage 2: MG Apply" those events which do flops correctly report "GPU %F" as 100. >> > And the events showing "GPU %F" = 0 such as >> > MatHIPSPARSCopyTo, VecCopy, VecSet, PCApply, DCtxSync >> > don't do any flops (on the CPU or GPU) - which is also correct (although non GPU events should show nan??) >> > >> > Hence I am wondering what is the explanation for the missing 7% from "GPU %F" for KSPSolve and MGSmooth {0,1,2}?? >> > >> > Does anyone understand this -log_view, or can explain to me how to interpret it? >> > >> > It could simply be that: >> > a) something is messed up with -pc_mg_log >> > b) something is messed up with the PETSc build >> > c) I am putting too much faith in -log_view and should profile the code differently. >> > >> > Either way I'd really like to understand what is going on. >> > >> > >> > Cheers, >> > Dave >> > >> > >> > >> > >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From junchao.zhang at gmail.com Fri Jan 26 10:12:22 2024 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Fri, 26 Jan 2024 10:12:22 -0600 Subject: [petsc-users] Bug in VecNorm, 3.20.3 In-Reply-To: <800C216A-25B5-43B4-9EF5-682AA6D17900@joliv.et> References: <713a24497b89ae5b6a304173730e1a8c@paraffinalia.co.uk> <13593fc52714c640b907dbca572dc8dd@paraffinalia.co.uk> <3213C1AB-2C98-4373-9289-9CA05D705908@joliv.et> <800C216A-25B5-43B4-9EF5-682AA6D17900@joliv.et> Message-ID: Yes, VecRestoreArray[Read,Write] should be called after you finish using the array. This casting to (const double **) is bad. Otherwise, compiler could catch the error when you use a const pointer in a non-const way. PetscCall(VecGetArrayRead(ref, (const double **)(&pr))) ; After I used the right VecGet/RestoreArrayWrite(), the code worked. --Junchao Zhang On Fri, Jan 26, 2024 at 8:40?AM Pierre Jolivet wrote: > > On 26 Jan 2024, at 3:11?PM, Pierre Jolivet wrote: > > > On 26 Jan 2024, at 3:03?PM, michael at paraffinalia.co.uk wrote: > > On 2024-01-23 18:09, Junchao Zhang wrote: > > Do you have an example to reproduce it? > --Junchao Zhang > > > I have put a minimum example on github: > > https://github.com/mjcarley/petsc-test > > It does seem that the problem occurs if I do not use the PETSc interface > to do a matrix multiplication. > > In the original code, the PETSc matrix is a wrapper for a Fast Multipole > Method evaluation; in the minimum example I have simulated this by using an > array as a matrix. The sample code generates a randomised matrix A and > reference solution vector ref, and generates a right hand side > > b = A*ref > > which is then supplied as the right hand side for the GMRES solver. If I > use the PETSc matrix multiplication, the solver behaves as expected; if I > generate b directly from the underlying array for the matrix, I get the > result > > > You should not use VecGetArrayRead() if you change the Vec, but instead, > VecGetArrayWrite(). > Does that solve the issue? > > > Sorry, I sent the message too fast, you are also missing a couple of calls > to VecRestoreArray[Read,Write](). > These are the ones which will let PETSc know that the Vec has had its > state being increased, and that the cached norm are not valid anymore, see > https://petsc.org/release/src/vec/vec/interface/rvector.c.html#line2177 > > Thanks, > Pierre > > Thanks, > Pierre > > 0 KSP Residual norm < 1.e-11 > Linear solve converged due to CONVERGED_ATOL iterations 0 > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Sat Jan 27 08:43:10 2024 From: mfadams at lbl.gov (Mark Adams) Date: Sat, 27 Jan 2024 09:43:10 -0500 Subject: [petsc-users] pc_redistribute issue Message-ID: I am not getting ksp_rtol 1e-12 into pc_redistribute correctly? 
* Linear redistribute_ solve converged due to CONVERGED_RTOL iterations 1 0 KSP Residual norm 2.182384017537e+02 1 KSP Residual norm 1.889764161573e-04 * Number of iterations = 1 N = 47628 Residual norm 8.65917e-07 #PETSc Option Table entries: -f S.bin # (source: command line) -ksp_monitor # (source: command line) -ksp_type preonly # (source: command line) -mat_block_size 36 # (source: command line) -mat_view ascii::ascii_info # (source: command line) -options_left # (source: command line) -pc_type redistribute # (source: command line) -redistribute_ksp_converged_reason # (source: command line) *-redistribute_ksp_rtol 1e-12 # (source: command line)*-redistribute_ksp_type gmres # (source: command line) -redistribute_pc_type bjacobi # (source: command line) -redistribute_sub_pc_factor_mat_solver_type mumps # (source: command line) -redistribute_sub_pc_type lu # (source: command line) #End of PETSc Option Table entries *There are no unused options.* -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sat Jan 27 09:24:22 2024 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 27 Jan 2024 10:24:22 -0500 Subject: [petsc-users] pc_redistribute issue In-Reply-To: References: Message-ID: View the solver. Matt On Sat, Jan 27, 2024 at 9:43?AM Mark Adams wrote: > I am not getting ksp_rtol 1e-12 into pc_redistribute correctly? > > > > * Linear redistribute_ solve converged due to CONVERGED_RTOL iterations 1 > 0 KSP Residual norm 2.182384017537e+02 1 KSP Residual norm > 1.889764161573e-04 * > Number of iterations = 1 N = 47628 > Residual norm 8.65917e-07 > #PETSc Option Table entries: > -f S.bin # (source: command line) > -ksp_monitor # (source: command line) > -ksp_type preonly # (source: command line) > -mat_block_size 36 # (source: command line) > -mat_view ascii::ascii_info # (source: command line) > -options_left # (source: command line) > -pc_type redistribute # (source: command line) > -redistribute_ksp_converged_reason # (source: command line) > > *-redistribute_ksp_rtol 1e-12 # (source: command line)*-redistribute_ksp_type > gmres # (source: command line) > -redistribute_pc_type bjacobi # (source: command line) > -redistribute_sub_pc_factor_mat_solver_type mumps # (source: command line) > -redistribute_sub_pc_type lu # (source: command line) > #End of PETSc Option Table entries > *There are no unused options.* > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mfadams at lbl.gov Sat Jan 27 10:44:13 2024 From: mfadams at lbl.gov (Mark Adams) Date: Sat, 27 Jan 2024 11:44:13 -0500 Subject: [petsc-users] pc_redistribute issue In-Reply-To: References: Message-ID: KSP Object: (redistribute_) 1 MPI process type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero * tolerances: relative=1e-12, absolute=1e-50, divergence=10000.* left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (redistribute_) 1 MPI process type: bjacobi number of blocks = 1 Local solver information for first block is in the following KSP and PC objects on rank 0: Use -redistribute_ksp_view ::ascii_info_detail to display information for all blocks KSP Object: (redistribute_sub_) 1 MPI process type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (redistribute_sub_) 1 MPI process type: lu out-of-place factorization tolerance for zero pivot 2.22045e-14 On Sat, Jan 27, 2024 at 10:24?AM Matthew Knepley wrote: > View the solver. > > Matt > > On Sat, Jan 27, 2024 at 9:43?AM Mark Adams wrote: > >> I am not getting ksp_rtol 1e-12 into pc_redistribute correctly? >> >> >> >> * Linear redistribute_ solve converged due to CONVERGED_RTOL iterations >> 1 0 KSP Residual norm 2.182384017537e+02 1 KSP Residual norm >> 1.889764161573e-04 * >> Number of iterations = 1 N = 47628 >> Residual norm 8.65917e-07 >> #PETSc Option Table entries: >> -f S.bin # (source: command line) >> -ksp_monitor # (source: command line) >> -ksp_type preonly # (source: command line) >> -mat_block_size 36 # (source: command line) >> -mat_view ascii::ascii_info # (source: command line) >> -options_left # (source: command line) >> -pc_type redistribute # (source: command line) >> -redistribute_ksp_converged_reason # (source: command line) >> >> *-redistribute_ksp_rtol 1e-12 # (source: command line)*-redistribute_ksp_type >> gmres # (source: command line) >> -redistribute_pc_type bjacobi # (source: command line) >> -redistribute_sub_pc_factor_mat_solver_type mumps # (source: command line) >> -redistribute_sub_pc_type lu # (source: command line) >> #End of PETSc Option Table entries >> *There are no unused options.* >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sat Jan 27 11:51:42 2024 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 27 Jan 2024 12:51:42 -0500 Subject: [petsc-users] pc_redistribute issue In-Reply-To: References: Message-ID: Okay, so the tolerance is right. It must be using ||b|| instead of ||r0||. Run with -redistribute_ksp_monitor_true_residual You might have to force r0. 
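(For the default convergence test that would be something along the lines of

  -redistribute_ksp_converged_use_initial_residual_norm

i.e. the option attached to KSPConvergedDefaultSetUIRNorm(); in code you could grab the inner solver with PCRedistributeGetKSP() and call KSPConvergedDefaultSetUIRNorm() on it. The exact option spelling is worth double-checking with -help.)
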
Thanks, Matt On Sat, Jan 27, 2024 at 11:44?AM Mark Adams wrote: > KSP Object: (redistribute_) 1 MPI process > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > > * tolerances: relative=1e-12, absolute=1e-50, divergence=10000.* left > preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: (redistribute_) 1 MPI process > type: bjacobi > number of blocks = 1 > Local solver information for first block is in the following KSP and > PC objects on rank 0: > Use -redistribute_ksp_view ::ascii_info_detail to display information > for all blocks > KSP Object: (redistribute_sub_) 1 MPI process > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (redistribute_sub_) 1 MPI process > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > > On Sat, Jan 27, 2024 at 10:24?AM Matthew Knepley > wrote: > >> View the solver. >> >> Matt >> >> On Sat, Jan 27, 2024 at 9:43?AM Mark Adams wrote: >> >>> I am not getting ksp_rtol 1e-12 into pc_redistribute correctly? >>> >>> >>> >>> * Linear redistribute_ solve converged due to CONVERGED_RTOL iterations >>> 1 0 KSP Residual norm 2.182384017537e+02 1 KSP Residual norm >>> 1.889764161573e-04 * >>> Number of iterations = 1 N = 47628 >>> Residual norm 8.65917e-07 >>> #PETSc Option Table entries: >>> -f S.bin # (source: command line) >>> -ksp_monitor # (source: command line) >>> -ksp_type preonly # (source: command line) >>> -mat_block_size 36 # (source: command line) >>> -mat_view ascii::ascii_info # (source: command line) >>> -options_left # (source: command line) >>> -pc_type redistribute # (source: command line) >>> -redistribute_ksp_converged_reason # (source: command line) >>> >>> *-redistribute_ksp_rtol 1e-12 # (source: command line)*-redistribute_ksp_type >>> gmres # (source: command line) >>> -redistribute_pc_type bjacobi # (source: command line) >>> -redistribute_sub_pc_factor_mat_solver_type mumps # (source: command >>> line) >>> -redistribute_sub_pc_type lu # (source: command line) >>> #End of PETSc Option Table entries >>> *There are no unused options.* >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Sat Jan 27 12:26:16 2024 From: mfadams at lbl.gov (Mark Adams) Date: Sat, 27 Jan 2024 13:26:16 -0500 Subject: [petsc-users] pc_redistribute issue In-Reply-To: References: Message-ID: Well, that puts the reason after the iterations, which is progress. Oh, I see the preconditioned norm goes down a lot, but the reported residual that you would think is used for testing (see first post) does not go down 12 digits. This matrix is very ill conditioned. LU just gets about 7 digits. Thanks, Mark Residual norms for redistribute_ solve. 
0 KSP preconditioned resid norm 3.988887683909e+16 true resid norm 6.646245659859e+06 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 3.257912040767e+02 true resid norm 1.741027565497e-04 ||r(i)||/||b|| 2.619565472898e-11 Linear redistribute_ solve converged due to CONVERGED_RTOL iterations 1 KSP Object: (redistribute_) 1 MPI process type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-12, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (redistribute_) 1 MPI process type: bjacobi number of blocks = 1 Local solver information for first block is in the following KSP and PC objects on rank 0: Use -redistribute_ksp_view ::ascii_info_detail to display information for all blocks KSP Object: (redistribute_sub_) 1 MPI process type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (redistribute_sub_) 1 MPI process type: lu out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: external factor fill ratio given 0., needed 0. Factored matrix follows: Mat Object: (redistribute_sub_) 1 MPI process type: mumps rows=44378, cols=44378 package used to perform factorization: mumps total: nonzeros=50309372, allocated nonzeros=50309372 MUMPS run parameters: On Sat, Jan 27, 2024 at 12:51?PM Matthew Knepley wrote: > Okay, so the tolerance is right. It must be using ||b|| instead of ||r0||. > Run with > > -redistribute_ksp_monitor_true_residual > > You might have to force r0. > > Thanks, > > Matt > > On Sat, Jan 27, 2024 at 11:44?AM Mark Adams wrote: > >> KSP Object: (redistribute_) 1 MPI process >> type: gmres >> restart=30, using Classical (unmodified) Gram-Schmidt >> Orthogonalization with no iterative refinement >> happy breakdown tolerance 1e-30 >> maximum iterations=10000, initial guess is zero >> >> * tolerances: relative=1e-12, absolute=1e-50, divergence=10000.* left >> preconditioning >> using PRECONDITIONED norm type for convergence test >> PC Object: (redistribute_) 1 MPI process >> type: bjacobi >> number of blocks = 1 >> Local solver information for first block is in the following KSP and >> PC objects on rank 0: >> Use -redistribute_ksp_view ::ascii_info_detail to display information >> for all blocks >> KSP Object: (redistribute_sub_) 1 MPI process >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (redistribute_sub_) 1 MPI process >> type: lu >> out-of-place factorization >> tolerance for zero pivot 2.22045e-14 >> >> On Sat, Jan 27, 2024 at 10:24?AM Matthew Knepley >> wrote: >> >>> View the solver. >>> >>> Matt >>> >>> On Sat, Jan 27, 2024 at 9:43?AM Mark Adams wrote: >>> >>>> I am not getting ksp_rtol 1e-12 into pc_redistribute correctly? 
>>>> >>>> >>>> >>>> * Linear redistribute_ solve converged due to CONVERGED_RTOL iterations >>>> 1 0 KSP Residual norm 2.182384017537e+02 1 KSP Residual norm >>>> 1.889764161573e-04 * >>>> Number of iterations = 1 N = 47628 >>>> Residual norm 8.65917e-07 >>>> #PETSc Option Table entries: >>>> -f S.bin # (source: command line) >>>> -ksp_monitor # (source: command line) >>>> -ksp_type preonly # (source: command line) >>>> -mat_block_size 36 # (source: command line) >>>> -mat_view ascii::ascii_info # (source: command line) >>>> -options_left # (source: command line) >>>> -pc_type redistribute # (source: command line) >>>> -redistribute_ksp_converged_reason # (source: command line) >>>> >>>> *-redistribute_ksp_rtol 1e-12 # (source: command line)*-redistribute_ksp_type >>>> gmres # (source: command line) >>>> -redistribute_pc_type bjacobi # (source: command line) >>>> -redistribute_sub_pc_factor_mat_solver_type mumps # (source: command >>>> line) >>>> -redistribute_sub_pc_type lu # (source: command line) >>>> #End of PETSc Option Table entries >>>> *There are no unused options.* >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Sat Jan 27 14:37:55 2024 From: mfadams at lbl.gov (Mark Adams) Date: Sat, 27 Jan 2024 15:37:55 -0500 Subject: [petsc-users] pc_redistribute issue In-Reply-To: References: Message-ID: Note, pc_redistibute is a great idea but you lose the block size, which is obvious after you realize it, but is error prone. Maybe it would be better to throw an error if bs > 1 and add a -pc_redistribute_ignore_block_size or something for users that want to press on. Thanks, Mark On Sat, Jan 27, 2024 at 1:26?PM Mark Adams wrote: > Well, that puts the reason after the iterations, which is progress. > > Oh, I see the preconditioned norm goes down a lot, but the reported > residual that you would think is used for testing (see first post) does not > go down 12 digits. > This matrix is very ill conditioned. LU just gets about 7 digits. > > Thanks, > Mark > > Residual norms for redistribute_ solve. > 0 KSP preconditioned resid norm 3.988887683909e+16 true resid norm > 6.646245659859e+06 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP preconditioned resid norm 3.257912040767e+02 true resid norm > 1.741027565497e-04 ||r(i)||/||b|| 2.619565472898e-11 > Linear redistribute_ solve converged due to CONVERGED_RTOL iterations 1 > KSP Object: (redistribute_) 1 MPI process > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-12, absolute=1e-50, divergence=10000. 
> left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: (redistribute_) 1 MPI process > type: bjacobi > number of blocks = 1 > Local solver information for first block is in the following KSP and > PC objects on rank 0: > Use -redistribute_ksp_view ::ascii_info_detail to display information > for all blocks > KSP Object: (redistribute_sub_) 1 MPI process > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (redistribute_sub_) 1 MPI process > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: external > factor fill ratio given 0., needed 0. > Factored matrix follows: > Mat Object: (redistribute_sub_) 1 MPI process > type: mumps > rows=44378, cols=44378 > package used to perform factorization: mumps > total: nonzeros=50309372, allocated nonzeros=50309372 > MUMPS run parameters: > > On Sat, Jan 27, 2024 at 12:51?PM Matthew Knepley > wrote: > >> Okay, so the tolerance is right. It must be using ||b|| instead of >> ||r0||. Run with >> >> -redistribute_ksp_monitor_true_residual >> >> You might have to force r0. >> >> Thanks, >> >> Matt >> >> On Sat, Jan 27, 2024 at 11:44?AM Mark Adams wrote: >> >>> KSP Object: (redistribute_) 1 MPI process >>> type: gmres >>> restart=30, using Classical (unmodified) Gram-Schmidt >>> Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=10000, initial guess is zero >>> >>> * tolerances: relative=1e-12, absolute=1e-50, divergence=10000.* >>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: (redistribute_) 1 MPI process >>> type: bjacobi >>> number of blocks = 1 >>> Local solver information for first block is in the following KSP and >>> PC objects on rank 0: >>> Use -redistribute_ksp_view ::ascii_info_detail to display >>> information for all blocks >>> KSP Object: (redistribute_sub_) 1 MPI process >>> type: preonly >>> maximum iterations=10000, initial guess is zero >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>> left preconditioning >>> using NONE norm type for convergence test >>> PC Object: (redistribute_sub_) 1 MPI process >>> type: lu >>> out-of-place factorization >>> tolerance for zero pivot 2.22045e-14 >>> >>> On Sat, Jan 27, 2024 at 10:24?AM Matthew Knepley >>> wrote: >>> >>>> View the solver. >>>> >>>> Matt >>>> >>>> On Sat, Jan 27, 2024 at 9:43?AM Mark Adams wrote: >>>> >>>>> I am not getting ksp_rtol 1e-12 into pc_redistribute correctly? 
>>>>> >>>>> >>>>> >>>>> * Linear redistribute_ solve converged due to CONVERGED_RTOL >>>>> iterations 1 0 KSP Residual norm 2.182384017537e+02 1 KSP Residual norm >>>>> 1.889764161573e-04 * >>>>> Number of iterations = 1 N = 47628 >>>>> Residual norm 8.65917e-07 >>>>> #PETSc Option Table entries: >>>>> -f S.bin # (source: command line) >>>>> -ksp_monitor # (source: command line) >>>>> -ksp_type preonly # (source: command line) >>>>> -mat_block_size 36 # (source: command line) >>>>> -mat_view ascii::ascii_info # (source: command line) >>>>> -options_left # (source: command line) >>>>> -pc_type redistribute # (source: command line) >>>>> -redistribute_ksp_converged_reason # (source: command line) >>>>> >>>>> *-redistribute_ksp_rtol 1e-12 # (source: command line)*-redistribute_ksp_type >>>>> gmres # (source: command line) >>>>> -redistribute_pc_type bjacobi # (source: command line) >>>>> -redistribute_sub_pc_factor_mat_solver_type mumps # (source: command >>>>> line) >>>>> -redistribute_sub_pc_type lu # (source: command line) >>>>> #End of PETSc Option Table entries >>>>> *There are no unused options.* >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Mon Jan 29 11:55:09 2024 From: bsmith at petsc.dev (Barry Smith) Date: Mon, 29 Jan 2024 12:55:09 -0500 Subject: [petsc-users] pc_redistribute issue In-Reply-To: References: Message-ID: <788D51A7-9810-4FFF-A5B2-07D721FB79A1@petsc.dev> Document the change in behavior for matrices with a block size greater than one https://gitlab.com/petsc/petsc/-/merge_requests/7246 > On Jan 27, 2024, at 3:37?PM, Mark Adams wrote: > > Note, pc_redistibute is a great idea but you lose the block size, which is obvious after you realize it, but is error prone. > Maybe it would be better to throw an error if bs > 1 and add a -pc_redistribute_ignore_block_size or something for users that want to press on. > > Thanks, > Mark > > On Sat, Jan 27, 2024 at 1:26?PM Mark Adams > wrote: >> Well, that puts the reason after the iterations, which is progress. >> >> Oh, I see the preconditioned norm goes down a lot, but the reported residual that you would think is used for testing (see first post) does not go down 12 digits. >> This matrix is very ill conditioned. LU just gets about 7 digits. >> >> Thanks, >> Mark >> >> Residual norms for redistribute_ solve. >> 0 KSP preconditioned resid norm 3.988887683909e+16 true resid norm 6.646245659859e+06 ||r(i)||/||b|| 1.000000000000e+00 >> 1 KSP preconditioned resid norm 3.257912040767e+02 true resid norm 1.741027565497e-04 ||r(i)||/||b|| 2.619565472898e-11 >> Linear redistribute_ solve converged due to CONVERGED_RTOL iterations 1 >> KSP Object: (redistribute_) 1 MPI process >> type: gmres >> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >> happy breakdown tolerance 1e-30 >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-12, absolute=1e-50, divergence=10000. 
>> left preconditioning >> using PRECONDITIONED norm type for convergence test >> PC Object: (redistribute_) 1 MPI process >> type: bjacobi >> number of blocks = 1 >> Local solver information for first block is in the following KSP and PC objects on rank 0: >> Use -redistribute_ksp_view ::ascii_info_detail to display information for all blocks >> KSP Object: (redistribute_sub_) 1 MPI process >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (redistribute_sub_) 1 MPI process >> type: lu >> out-of-place factorization >> tolerance for zero pivot 2.22045e-14 >> matrix ordering: external >> factor fill ratio given 0., needed 0. >> Factored matrix follows: >> Mat Object: (redistribute_sub_) 1 MPI process >> type: mumps >> rows=44378, cols=44378 >> package used to perform factorization: mumps >> total: nonzeros=50309372, allocated nonzeros=50309372 >> MUMPS run parameters: >> >> On Sat, Jan 27, 2024 at 12:51?PM Matthew Knepley > wrote: >>> Okay, so the tolerance is right. It must be using ||b|| instead of ||r0||. Run with >>> >>> -redistribute_ksp_monitor_true_residual >>> >>> You might have to force r0. >>> >>> Thanks, >>> >>> Matt >>> >>> On Sat, Jan 27, 2024 at 11:44?AM Mark Adams > wrote: >>>> KSP Object: (redistribute_) 1 MPI process >>>> type: gmres >>>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>>> happy breakdown tolerance 1e-30 >>>> maximum iterations=10000, initial guess is zero >>>> tolerances: relative=1e-12, absolute=1e-50, divergence=10000. >>>> left preconditioning >>>> using PRECONDITIONED norm type for convergence test >>>> PC Object: (redistribute_) 1 MPI process >>>> type: bjacobi >>>> number of blocks = 1 >>>> Local solver information for first block is in the following KSP and PC objects on rank 0: >>>> Use -redistribute_ksp_view ::ascii_info_detail to display information for all blocks >>>> KSP Object: (redistribute_sub_) 1 MPI process >>>> type: preonly >>>> maximum iterations=10000, initial guess is zero >>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>> left preconditioning >>>> using NONE norm type for convergence test >>>> PC Object: (redistribute_sub_) 1 MPI process >>>> type: lu >>>> out-of-place factorization >>>> tolerance for zero pivot 2.22045e-14 >>>> >>>> On Sat, Jan 27, 2024 at 10:24?AM Matthew Knepley > wrote: >>>>> View the solver. >>>>> >>>>> Matt >>>>> >>>>> On Sat, Jan 27, 2024 at 9:43?AM Mark Adams > wrote: >>>>>> I am not getting ksp_rtol 1e-12 into pc_redistribute correctly? 
>>>>>> >>>>>> Linear redistribute_ solve converged due to CONVERGED_RTOL iterations 1 >>>>>> 0 KSP Residual norm 2.182384017537e+02 >>>>>> 1 KSP Residual norm 1.889764161573e-04 >>>>>> Number of iterations = 1 N = 47628 >>>>>> Residual norm 8.65917e-07 >>>>>> #PETSc Option Table entries: >>>>>> -f S.bin # (source: command line) >>>>>> -ksp_monitor # (source: command line) >>>>>> -ksp_type preonly # (source: command line) >>>>>> -mat_block_size 36 # (source: command line) >>>>>> -mat_view ascii::ascii_info # (source: command line) >>>>>> -options_left # (source: command line) >>>>>> -pc_type redistribute # (source: command line) >>>>>> -redistribute_ksp_converged_reason # (source: command line) >>>>>> -redistribute_ksp_rtol 1e-12 # (source: command line) >>>>>> -redistribute_ksp_type gmres # (source: command line) >>>>>> -redistribute_pc_type bjacobi # (source: command line) >>>>>> -redistribute_sub_pc_factor_mat_solver_type mumps # (source: command line) >>>>>> -redistribute_sub_pc_type lu # (source: command line) >>>>>> #End of PETSc Option Table entries >>>>>> There are no unused options. >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From guenther5 at llnl.gov Tue Jan 30 09:53:07 2024 From: guenther5 at llnl.gov (Guenther, Stefanie) Date: Tue, 30 Jan 2024 15:53:07 +0000 Subject: [petsc-users] Parallel vector layout for TAO optimization with separable state/design structure Message-ID: <98B55EC6-EC4A-4973-B06E-7E28E49EA1B8@llnl.gov> Hi Petsc team, I have a question regarding parallel layout of a Petsc vector to be used in TAO optimizers for cases where the optimization variables split into ?design? and ?state? variables (e.g. such as in PDE-constrained optimization as in tao_lcl). In our case, the state variable naturally parallelizes evenly amongst multiple processors and this distribution is fixed. The ?design? vector however does not, it is very short compared to the state vector and it is required on all state-processors when evaluating the objective function and gradient. My question would be how the TAO optimization vector x = [design,state] should be created in such a way that the ?state? part is distributed as needed in our solver, while the design part is not. My only idea so far was to copy the design variables to all processors and augment / interleave the optimization vector as x = [state_proc1,design, state_proc2, design, ? ] . When creating this vector in parallel on PETSC_COMM_WORLD, each processor would then own the same number of variables ( [state_proc, design] ), as long as the numbers match up, and I would only need to be careful when gathering the gradient wrt the design parts from all processors. This seems cumbersome however, and I would be worried whether the optimization problem is harder to solve this way. Is there any other way to achieve this splitting, that I am missing here? 
Note that the distribution of the state itself is given and can not be changed, and that the state vs design vectors have very different (and independent) dimensions. Thanks for your help and thoughts! Best, Stefanie -------------- next part -------------- An HTML attachment was scrubbed... URL: From y.hu at mpie.de Tue Jan 30 10:19:44 2024 From: y.hu at mpie.de (Yi Hu) Date: Tue, 30 Jan 2024 17:19:44 +0100 Subject: [petsc-users] KSP has an extra iteration when use shell matrix Message-ID: <5ea9026c-d1cc-486a-b3e6-fa6294c13bcd@mpie.de> Dear PETSc team, I am still trying to sort out my previous thread https://lists.mcs.anl.gov/pipermail/petsc-users/2024-January/050079.html using a minimal working example. However, I encountered another problem. Basically I combined the basic usage of SNES solver and shell matrix and tried to make it work. The jacobian of my snes is replaced by a customized MATOP_MULT. The minimal example code can be viewed here https://github.com/hyharry/small_petsc_test/blob/master/test_shell_jac/ex1f.F90 When running with -ksp_monitor -ksp_converged_reason, it shows an extra mymult step, and hence ruin my previous converged KSP result. Implement a customized converged call-back also does not help. I am wondering how to skip this extra ksp iteration. Could anyone help me on this? Thanks for your help. Best wishes, Yi ------------------------------------------------- Stay up to date and follow us on LinkedIn, Twitter and YouTube. Max-Planck-Institut f?r Eisenforschung GmbH Max-Planck-Stra?e 1 D-40237 D?sseldorf Handelsregister B 2533 Amtsgericht D?sseldorf Gesch?ftsf?hrung Prof. Dr. Gerhard Dehm Prof. Dr. J?rg Neugebauer Prof. Dr. Dierk Raabe Dr. Kai de Weldige Ust.-Id.-Nr.: DE 11 93 58 514 Steuernummer: 105 5891 1000 Please consider that invitations and e-mails of our institute are only valid if they end with ?@mpie.de. If you are not sure of the validity please contact rco at mpie.de Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de ------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Tue Jan 30 10:30:24 2024 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 30 Jan 2024 11:30:24 -0500 Subject: [petsc-users] Parallel vector layout for TAO optimization with separable state/design structure In-Reply-To: <98B55EC6-EC4A-4973-B06E-7E28E49EA1B8@llnl.gov> References: <98B55EC6-EC4A-4973-B06E-7E28E49EA1B8@llnl.gov> Message-ID: <4F06E5BB-6504-42DE-ACAB-3C760F125906@petsc.dev> This is a problem with MPI programming and optimization; I am unaware of a perfect solution. Put the design variables into the solution vector on MPI rank 0, and when doing your objective/gradient, send the values to all the MPI processes where you use them. You can use a VecScatter to handle the communication you need or MPI_Scatter() etc whatever makes the most sense in your code. Barry > On Jan 30, 2024, at 10:53?AM, Guenther, Stefanie via petsc-users wrote: > > Hi Petsc team, > > I have a question regarding parallel layout of a Petsc vector to be used in TAO optimizers for cases where the optimization variables split into ?design? and ?state? variables (e.g. such as in PDE-constrained optimization as in tao_lcl). In our case, the state variable naturally parallelizes evenly amongst multiple processors and this distribution is fixed. The ?design? 
vector however does not, it is very short compared to the state vector and it is required on all state-processors when evaluating the objective function and gradient. My question would be how the TAO optimization vector x = [design,state] should be created in such a way that the ?state? part is distributed as needed in our solver, while the design part is not. > > My only idea so far was to copy the design variables to all processors and augment / interleave the optimization vector as x = [state_proc1,design, state_proc2, design, ? ] . When creating this vector in parallel on PETSC_COMM_WORLD, each processor would then own the same number of variables ( [state_proc, design] ), as long as the numbers match up, and I would only need to be careful when gathering the gradient wrt the design parts from all processors. > > This seems cumbersome however, and I would be worried whether the optimization problem is harder to solve this way. Is there any other way to achieve this splitting, that I am missing here? Note that the distribution of the state itself is given and can not be changed, and that the state vs design vectors have very different (and independent) dimensions. > > Thanks for your help and thoughts! > Best, > Stefanie -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Tue Jan 30 10:47:07 2024 From: jed at jedbrown.org (Jed Brown) Date: Tue, 30 Jan 2024 09:47:07 -0700 Subject: [petsc-users] Parallel vector layout for TAO optimization with separable state/design structure In-Reply-To: <4F06E5BB-6504-42DE-ACAB-3C760F125906@petsc.dev> References: <98B55EC6-EC4A-4973-B06E-7E28E49EA1B8@llnl.gov> <4F06E5BB-6504-42DE-ACAB-3C760F125906@petsc.dev> Message-ID: <87plxiydis.fsf@jedbrown.org> For a bit of assistance, you can use DMComposite and DMRedundantCreate; see src/snes/tutorials/ex21.c and ex22.c. Note that when computing redundantly, it's critical that the computation be deterministic (i.e., not using atomics or randomness without matching seeds) so the logic stays collective. This merge request may also be relevant and comments related to your needs would be welcome in the discussion. https://gitlab.com/petsc/petsc/-/merge_requests/6531 Barry Smith writes: > This is a problem with MPI programming and optimization; I am unaware of a perfect solution. > > Put the design variables into the solution vector on MPI rank 0, and when doing your objective/gradient, send the values to all the MPI processes where you use them. You can use a VecScatter to handle the communication you need or MPI_Scatter() etc whatever makes the most sense in your code. > > Barry > > >> On Jan 30, 2024, at 10:53?AM, Guenther, Stefanie via petsc-users wrote: >> >> Hi Petsc team, >> >> I have a question regarding parallel layout of a Petsc vector to be used in TAO optimizers for cases where the optimization variables split into ?design? and ?state? variables (e.g. such as in PDE-constrained optimization as in tao_lcl). In our case, the state variable naturally parallelizes evenly amongst multiple processors and this distribution is fixed. The ?design? vector however does not, it is very short compared to the state vector and it is required on all state-processors when evaluating the objective function and gradient. My question would be how the TAO optimization vector x = [design,state] should be created in such a way that the ?state? part is distributed as needed in our solver, while the design part is not. 
>> >> My only idea so far was to copy the design variables to all processors and augment / interleave the optimization vector as x = [state_proc1,design, state_proc2, design, ? ] . When creating this vector in parallel on PETSC_COMM_WORLD, each processor would then own the same number of variables ( [state_proc, design] ), as long as the numbers match up, and I would only need to be careful when gathering the gradient wrt the design parts from all processors. >> >> This seems cumbersome however, and I would be worried whether the optimization problem is harder to solve this way. Is there any other way to achieve this splitting, that I am missing here? Note that the distribution of the state itself is given and can not be changed, and that the state vs design vectors have very different (and independent) dimensions. >> >> Thanks for your help and thoughts! >> Best, >> Stefanie From bsmith at petsc.dev Tue Jan 30 11:35:35 2024 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 30 Jan 2024 12:35:35 -0500 Subject: [petsc-users] KSP has an extra iteration when use shell matrix In-Reply-To: <5ea9026c-d1cc-486a-b3e6-fa6294c13bcd@mpie.de> References: <5ea9026c-d1cc-486a-b3e6-fa6294c13bcd@mpie.de> Message-ID: How do I see a difference? What does "hence ruin my previous converged KSP result" mean? A different answer at the end of the KSP solve? $ ./joe > joe.basic ~/Src/petsc/src/ksp/ksp/tutorials (barry/2023-09-15/fix-log-pcmpi=) arch-fix-log-pcmpi $ ./joe -ksp_monitor -ksp_converged_reason -snes_monitor > joe.monitor ~/Src/petsc/src/ksp/ksp/tutorials (barry/2023-09-15/fix-log-pcmpi=) arch-fix-log-pcmpi $ diff joe.basic joe.monitor 0a1,36 > 0 SNES Function norm 6.041522986797e+00 > 0 KSP Residual norm 6.041522986797e+00 > 1 KSP Residual norm 5.065392549852e-16 > Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1 > 1 SNES Function norm 3.512662245652e+00 > 0 KSP Residual norm 3.512662245652e+00 > 1 KSP Residual norm 6.230314124713e-16 > Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1 > 2 SNES Function norm 8.969285922373e-01 > 0 KSP Residual norm 8.969285922373e-01 > 1 KSP Residual norm 0.000000000000e+00 > Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1 > 3 SNES Function norm 4.863816734540e-01 > 0 KSP Residual norm 4.863816734540e-01 > 1 KSP Residual norm 0.000000000000e+00 > Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1 > 4 SNES Function norm 3.512070785520e-01 > 0 KSP Residual norm 3.512070785520e-01 > 1 KSP Residual norm 0.000000000000e+00 > Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1 > 5 SNES Function norm 2.769700293115e-01 > 0 KSP Residual norm 2.769700293115e-01 > 1 KSP Residual norm 1.104778916974e-16 > Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1 > 6 SNES Function norm 2.055345318150e-01 > 0 KSP Residual norm 2.055345318150e-01 > 1 KSP Residual norm 1.535110861002e-17 > Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1 > 7 SNES Function norm 1.267482220786e-01 > 0 KSP Residual norm 1.267482220786e-01 > 1 KSP Residual norm 1.498679601680e-17 > Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1 > 8 SNES Function norm 3.468150619264e-02 > 0 KSP Residual norm 3.468150619264e-02 > 1 KSP Residual norm 5.944160522951e-18 > Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1 > On Jan 30, 2024, at 11:19?AM, Yi Hu wrote: > > Dear PETSc team, > > I am still trying to sort out my previous thread 
https://lists.mcs.anl.gov/pipermail/petsc-users/2024-January/050079.html using a minimal working example. However, I encountered another problem. Basically I combined the basic usage of SNES solver and shell matrix and tried to make it work. The jacobian of my snes is replaced by a customized MATOP_MULT. The minimal example code can be viewed here https://github.com/hyharry/small_petsc_test/blob/master/test_shell_jac/ex1f.F90 > > When running with -ksp_monitor -ksp_converged_reason, it shows an extra mymult step, and hence ruin my previous converged KSP result. Implement a customized converged call-back also does not help. I am wondering how to skip this extra ksp iteration. Could anyone help me on this? > > Thanks for your help. > > Best wishes, > Yi > > > ------------------------------------------------- > Stay up to date and follow us on LinkedIn, Twitter and YouTube. > > Max-Planck-Institut f?r Eisenforschung GmbH > Max-Planck-Stra?e 1 > D-40237 D?sseldorf > > Handelsregister B 2533 > Amtsgericht D?sseldorf > > Gesch?ftsf?hrung > Prof. Dr. Gerhard Dehm > Prof. Dr. J?rg Neugebauer > Prof. Dr. Dierk Raabe > Dr. Kai de Weldige > > Ust.-Id.-Nr.: DE 11 93 58 514 > Steuernummer: 105 5891 1000 > > > Please consider that invitations and e-mails of our institute are > only valid if they end with ?@mpie.de. > If you are not sure of the validity please contact rco at mpie.de > > Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails > aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. > In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de > ------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From y.hu at mpie.de Tue Jan 30 14:19:18 2024 From: y.hu at mpie.de (Yi Hu) Date: Tue, 30 Jan 2024 21:19:18 +0100 Subject: [petsc-users] KSP has an extra iteration when use shell matrix In-Reply-To: References: <5ea9026c-d1cc-486a-b3e6-fa6294c13bcd@mpie.de> Message-ID: Hello Barry, Thanks for your reply. The monitor options are fine. I actually meant my modification of snes tutorial ex1f.F90 does not work and has some unexpected behavior. I basically wanted to test if I can use a shell matrix as my jacobian (code is here https://github.com/hyharry/small_petsc_test/blob/master/test_shell_jac/ex1f.F90). After compile my modified version and run with these monitor options, it gives me the following, ?( in rhs ) ?( leave rhs ) ? 0 SNES Function norm 6.041522986797e+00 ?++++++++++++ in jac shell +++++++++++ ??? 0 KSP Residual norm 6.041522986797e+00 ?=== start mymult === ?=== done mymult === ??? 1 KSP Residual norm 5.065392549852e-16 ? Linear solve converged due to CONVERGED_RTOL iterations 1 ?=== start mymult === ?=== done mymult === ?( in rhs ) ?( leave rhs ) ? 1 SNES Function norm 3.512662245652e+00 ?++++++++++++ in jac shell +++++++++++ ??? 0 KSP Residual norm 3.512662245652e+00 ?=== start mymult === ?=== done mymult === ??? 1 KSP Residual norm 6.230314124713e-16 ? Linear solve converged due to CONVERGED_RTOL iterations 1 ?=== start mymult === ?=== done mymult === ?( in rhs ) ?( leave rhs ) ?( in rhs ) ?( leave rhs ) ? 2 SNES Function norm 8.969285922373e-01 ?++++++++++++ in jac shell +++++++++++ ??? 0 KSP Residual norm 8.969285922373e-01 ?=== start mymult === ?=== done mymult === ??? 1 KSP Residual norm 0.000000000000e+00 ? Linear solve converged due to CONVERGED_ATOL iterations 1 ?=== start mymult === ?=== done mymult === ?( in rhs ) ?( leave rhs ) ?( in rhs ) ?( leave rhs ) ? 
3 SNES Function norm 4.863816734540e-01 ?++++++++++++ in jac shell +++++++++++ ??? 0 KSP Residual norm 4.863816734540e-01 ?=== start mymult === ?=== done mymult === ??? 1 KSP Residual norm 0.000000000000e+00 ? Linear solve converged due to CONVERGED_ATOL iterations 1 ?=== start mymult === ?=== done mymult === ?( in rhs ) ?( leave rhs ) ?( in rhs ) ?( leave rhs ) ? 4 SNES Function norm 3.512070785520e-01 ?++++++++++++ in jac shell +++++++++++ ??? 0 KSP Residual norm 3.512070785520e-01 ?=== start mymult === ?=== done mymult === ??? 1 KSP Residual norm 0.000000000000e+00 ? Linear solve converged due to CONVERGED_ATOL iterations 1 ?=== start mymult === ?=== done mymult === ?( in rhs ) ?( leave rhs ) ?( in rhs ) ?( leave rhs ) ? 5 SNES Function norm 2.769700293115e-01 ?++++++++++++ in jac shell +++++++++++ ??? 0 KSP Residual norm 2.769700293115e-01 ?=== start mymult === ?=== done mymult === ??? 1 KSP Residual norm 1.104778916974e-16 ? Linear solve converged due to CONVERGED_RTOL iterations 1 ?=== start mymult === ?=== done mymult === ?( in rhs ) ?( leave rhs ) ?( in rhs ) ?( leave rhs ) ? 6 SNES Function norm 2.055345318150e-01 ?++++++++++++ in jac shell +++++++++++ ??? 0 KSP Residual norm 2.055345318150e-01 ?=== start mymult === ?=== done mymult === ??? 1 KSP Residual norm 0.000000000000e+00 ? Linear solve converged due to CONVERGED_ATOL iterations 1 ?=== start mymult === ?=== done mymult === ?( in rhs ) ?( leave rhs ) ?( in rhs ) ?( leave rhs ) ? 7 SNES Function norm 1.267482220786e-01 ?++++++++++++ in jac shell +++++++++++ ??? 0 KSP Residual norm 1.267482220786e-01 ?=== start mymult === ?=== done mymult === ??? 1 KSP Residual norm 1.498679601680e-17 ? Linear solve converged due to CONVERGED_RTOL iterations 1 ?=== start mymult === ?=== done mymult === ?( in rhs ) ?( leave rhs ) ?( in rhs ) ?( leave rhs ) ? 8 SNES Function norm 3.468150619264e-02 ?++++++++++++ in jac shell +++++++++++ ??? 0 KSP Residual norm 3.468150619264e-02 ?=== start mymult === ?=== done mymult === ??? 1 KSP Residual norm 5.944160522951e-18 ? Linear solve converged due to CONVERGED_RTOL iterations 1 ?=== start mymult === ?=== done mymult === ?( in rhs ) ?( leave rhs ) ?( in rhs ) ?( leave rhs ) ?( in rhs ) ?( leave rhs ) ?( in rhs ) ?( leave rhs ) ?( in rhs ) ?( leave rhs ) ?( in rhs ) ?( leave rhs ) ?( in rhs ) ?( leave rhs ) ?( in rhs ) ?( leave rhs ) ?( in rhs ) ?( leave rhs ) ?( in rhs ) ?( leave rhs ) ?( in rhs ) ?( leave rhs ) ?( in rhs ) ?( leave rhs ) ?( in rhs ) ?( leave rhs ) ?( in rhs ) ?( leave rhs ) ?=== start mymult === ?=== done mymult === Nonlinear solve did not converge due to DIVERGED_LINE_SEARCH iterations 8 Number of SNES iterations =???? 8 After each "Linear solve converged due to CONVERGED_ATOL iterations", the code starts to do mymult again. So I thought it did an extra (unwanted) KSP iteration. I would like to ask if this extra iteration could be disabled, or maybe I am wrong about it. Best regards, Yi On 1/30/24 18:35, Barry Smith wrote: > > ? How do I see a difference? What does "hence ruin my previous > converged KSP result" mean? A different answer at the end of the KSP > solve? > > $ ./joe > joe.basic > > ~/Src/petsc/src/ksp/ksp/tutorials*(barry/2023-09-15/fix-log-pcmpi=)*arch-fix-log-pcmpi > > $ ./joe -ksp_monitor -ksp_converged_reason -snes_monitor > joe.monitor > > ~/Src/petsc/src/ksp/ksp/tutorials*(barry/2023-09-15/fix-log-pcmpi=)*arch-fix-log-pcmpi > > $ diff joe.basic joe.monitor > > 0a1,36 > > > ? 
0 SNES Function norm 6.041522986797e+00 > > > 0 KSP Residual norm 6.041522986797e+00 > > > 1 KSP Residual norm 5.065392549852e-16 > > > Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1 > > > ? 1 SNES Function norm 3.512662245652e+00 > > > 0 KSP Residual norm 3.512662245652e+00 > > > 1 KSP Residual norm 6.230314124713e-16 > > > Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1 > > > ? 2 SNES Function norm 8.969285922373e-01 > > > 0 KSP Residual norm 8.969285922373e-01 > > > 1 KSP Residual norm 0.000000000000e+00 > > > Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1 > > > ? 3 SNES Function norm 4.863816734540e-01 > > > 0 KSP Residual norm 4.863816734540e-01 > > > 1 KSP Residual norm 0.000000000000e+00 > > > Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1 > > > ? 4 SNES Function norm 3.512070785520e-01 > > > 0 KSP Residual norm 3.512070785520e-01 > > > 1 KSP Residual norm 0.000000000000e+00 > > > Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1 > > > ? 5 SNES Function norm 2.769700293115e-01 > > > 0 KSP Residual norm 2.769700293115e-01 > > > 1 KSP Residual norm 1.104778916974e-16 > > > Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1 > > > ? 6 SNES Function norm 2.055345318150e-01 > > > 0 KSP Residual norm 2.055345318150e-01 > > > 1 KSP Residual norm 1.535110861002e-17 > > > Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1 > > > ? 7 SNES Function norm 1.267482220786e-01 > > > 0 KSP Residual norm 1.267482220786e-01 > > > 1 KSP Residual norm 1.498679601680e-17 > > > Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1 > > > ? 8 SNES Function norm 3.468150619264e-02 > > > 0 KSP Residual norm 3.468150619264e-02 > > > 1 KSP Residual norm 5.944160522951e-18 > > > Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1 > > > > >> On Jan 30, 2024, at 11:19?AM, Yi Hu wrote: >> >> Dear PETSc team, >> I am still trying to sort out my previous >> threadhttps://lists.mcs.anl.gov/pipermail/petsc-users/2024-January/050079.htmlusing >> a minimal working example. However, I encountered another problem. >> Basically I combined the basic usage of SNES solver and shell matrix >> and tried to make it work. The jacobian of my snes is replaced by a >> customized MATOP_MULT. The minimal example code can be viewed >> herehttps://github.com/hyharry/small_petsc_test/blob/master/test_shell_jac/ex1f.F90 >> When running with -ksp_monitor -ksp_converged_reason, it shows an >> extra mymult step, and hence ruin my previous converged KSP result. >> Implement a customized converged call-back also does not help. I am >> wondering how to skip this extra ksp iteration. Could anyone help me >> on this? >> Thanks for your help. >> Best wishes, >> Yi >> >> >> ------------------------------------------------------------------------ >> ------------------------------------------------- >> Stay?up?to?date?and?follow?us?on?LinkedIn,?Twitter?and?YouTube. >> >> Max-Planck-Institut?f?r?Eisenforschung?GmbH >> Max-Planck-Stra?e?1 >> D-40237?D?sseldorf >> >> Handelsregister?B?2533 >> Amtsgericht?D?sseldorf >> >> Gesch?ftsf?hrung >> Prof.?Dr.?Gerhard?Dehm >> Prof.?Dr.?J?rg?Neugebauer >> Prof.?Dr.?Dierk?Raabe >> Dr.?Kai?de?Weldige >> >> Ust.-Id.-Nr.:?DE?11?93?58?514 >> Steuernummer:?105?5891?1000 >> >> >> Please?consider?that?invitations?and?e-mails?of?our?institute?are >> only?valid?if?they?end?with??@mpie.de. 
>> If?you?are?not?sure?of?the?validity?please?contact rco at mpie.de >> >> Bitte?beachten?Sie,?dass?Einladungen?zu?Veranstaltungen?und?E-Mails >> aus?unserem?Haus?nur?mit?der?Endung??@mpie.de?g?ltig?sind. >> In?Zweifelsf?llen?wenden?Sie?sich?bitte?an rco at mpie.de >> ------------------------------------------------- > ------------------------------------------------- Stay up to date and follow us on LinkedIn, Twitter and YouTube. Max-Planck-Institut f?r Eisenforschung GmbH Max-Planck-Stra?e 1 D-40237 D?sseldorf Handelsregister B 2533 Amtsgericht D?sseldorf Gesch?ftsf?hrung Prof. Dr. Gerhard Dehm Prof. Dr. J?rg Neugebauer Prof. Dr. Dierk Raabe Dr. Kai de Weldige Ust.-Id.-Nr.: DE 11 93 58 514 Steuernummer: 105 5891 1000 Please consider that invitations and e-mails of our institute are only valid if they end with ?@mpie.de. If you are not sure of the validity please contact rco at mpie.de Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de ------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Tue Jan 30 20:18:35 2024 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 30 Jan 2024 21:18:35 -0500 Subject: [petsc-users] KSP has an extra iteration when use shell matrix In-Reply-To: References: <5ea9026c-d1cc-486a-b3e6-fa6294c13bcd@mpie.de> Message-ID: <6C071B22-8D59-4DB8-BCF9-6F31300A648B@petsc.dev> It is not running an extra KSP iteration. This "extra" matmult is normal and occurs in many of the SNESLineSearchApply_* functions, for example, https://petsc.org/release/src/snes/linesearch/impls/bt/linesearchbt.c.html#SNESLineSearchApply_BT It is used to decide if the Newton step results in sufficient decrease of the function value. Barry > On Jan 30, 2024, at 3:19?PM, Yi Hu wrote: > > Hello Barry, > > Thanks for your reply. The monitor options are fine. I actually meant my modification of snes tutorial ex1f.F90 does not work and has some unexpected behavior. I basically wanted to test if I can use a shell matrix as my jacobian (code is here https://github.com/hyharry/small_petsc_test/blob/master/test_shell_jac/ex1f.F90). 
After compile my modified version and run with these monitor options, it gives me the following, > > ( in rhs ) > ( leave rhs ) > 0 SNES Function norm 6.041522986797e+00 > ++++++++++++ in jac shell +++++++++++ > 0 KSP Residual norm 6.041522986797e+00 > === start mymult === > === done mymult === > 1 KSP Residual norm 5.065392549852e-16 > Linear solve converged due to CONVERGED_RTOL iterations 1 > === start mymult === > === done mymult === > ( in rhs ) > ( leave rhs ) > 1 SNES Function norm 3.512662245652e+00 > ++++++++++++ in jac shell +++++++++++ > 0 KSP Residual norm 3.512662245652e+00 > === start mymult === > === done mymult === > 1 KSP Residual norm 6.230314124713e-16 > Linear solve converged due to CONVERGED_RTOL iterations 1 > === start mymult === > === done mymult === > ( in rhs ) > ( leave rhs ) > ( in rhs ) > ( leave rhs ) > 2 SNES Function norm 8.969285922373e-01 > ++++++++++++ in jac shell +++++++++++ > 0 KSP Residual norm 8.969285922373e-01 > === start mymult === > === done mymult === > 1 KSP Residual norm 0.000000000000e+00 > Linear solve converged due to CONVERGED_ATOL iterations 1 > === start mymult === > === done mymult === > ( in rhs ) > ( leave rhs ) > ( in rhs ) > ( leave rhs ) > 3 SNES Function norm 4.863816734540e-01 > ++++++++++++ in jac shell +++++++++++ > 0 KSP Residual norm 4.863816734540e-01 > === start mymult === > === done mymult === > 1 KSP Residual norm 0.000000000000e+00 > Linear solve converged due to CONVERGED_ATOL iterations 1 > === start mymult === > === done mymult === > ( in rhs ) > ( leave rhs ) > ( in rhs ) > ( leave rhs ) > 4 SNES Function norm 3.512070785520e-01 > ++++++++++++ in jac shell +++++++++++ > 0 KSP Residual norm 3.512070785520e-01 > === start mymult === > === done mymult === > 1 KSP Residual norm 0.000000000000e+00 > Linear solve converged due to CONVERGED_ATOL iterations 1 > === start mymult === > === done mymult === > ( in rhs ) > ( leave rhs ) > ( in rhs ) > ( leave rhs ) > 5 SNES Function norm 2.769700293115e-01 > ++++++++++++ in jac shell +++++++++++ > 0 KSP Residual norm 2.769700293115e-01 > === start mymult === > === done mymult === > 1 KSP Residual norm 1.104778916974e-16 > Linear solve converged due to CONVERGED_RTOL iterations 1 > === start mymult === > === done mymult === > ( in rhs ) > ( leave rhs ) > ( in rhs ) > ( leave rhs ) > 6 SNES Function norm 2.055345318150e-01 > ++++++++++++ in jac shell +++++++++++ > 0 KSP Residual norm 2.055345318150e-01 > === start mymult === > === done mymult === > 1 KSP Residual norm 0.000000000000e+00 > Linear solve converged due to CONVERGED_ATOL iterations 1 > === start mymult === > === done mymult === > ( in rhs ) > ( leave rhs ) > ( in rhs ) > ( leave rhs ) > 7 SNES Function norm 1.267482220786e-01 > ++++++++++++ in jac shell +++++++++++ > 0 KSP Residual norm 1.267482220786e-01 > === start mymult === > === done mymult === > 1 KSP Residual norm 1.498679601680e-17 > Linear solve converged due to CONVERGED_RTOL iterations 1 > === start mymult === > === done mymult === > ( in rhs ) > ( leave rhs ) > ( in rhs ) > ( leave rhs ) > 8 SNES Function norm 3.468150619264e-02 > ++++++++++++ in jac shell +++++++++++ > 0 KSP Residual norm 3.468150619264e-02 > === start mymult === > === done mymult === > 1 KSP Residual norm 5.944160522951e-18 > Linear solve converged due to CONVERGED_RTOL iterations 1 > === start mymult === > === done mymult === > ( in rhs ) > ( leave rhs ) > ( in rhs ) > ( leave rhs ) > ( in rhs ) > ( leave rhs ) > ( in rhs ) > ( leave rhs ) > ( in rhs ) > ( leave rhs ) > ( in rhs ) > ( 
leave rhs ) > ( in rhs ) > ( leave rhs ) > ( in rhs ) > ( leave rhs ) > ( in rhs ) > ( leave rhs ) > ( in rhs ) > ( leave rhs ) > ( in rhs ) > ( leave rhs ) > ( in rhs ) > ( leave rhs ) > ( in rhs ) > ( leave rhs ) > ( in rhs ) > ( leave rhs ) > === start mymult === > === done mymult === > Nonlinear solve did not converge due to DIVERGED_LINE_SEARCH iterations 8 > Number of SNES iterations = 8 > > After each "Linear solve converged due to CONVERGED_ATOL iterations", the code starts to do mymult again. So I thought it did an extra (unwanted) KSP iteration. I would like to ask if this extra iteration could be disabled, or maybe I am wrong about it. > > Best regards, > > Yi > > On 1/30/24 18:35, Barry Smith wrote: >> >> How do I see a difference? What does "hence ruin my previous converged KSP result" mean? A different answer at the end of the KSP solve? >> >> $ ./joe > joe.basic >> ~/Src/petsc/src/ksp/ksp/tutorials (barry/2023-09-15/fix-log-pcmpi=) arch-fix-log-pcmpi >> $ ./joe -ksp_monitor -ksp_converged_reason -snes_monitor > joe.monitor >> ~/Src/petsc/src/ksp/ksp/tutorials (barry/2023-09-15/fix-log-pcmpi=) arch-fix-log-pcmpi >> $ diff joe.basic joe.monitor >> 0a1,36 >> > 0 SNES Function norm 6.041522986797e+00 >> > 0 KSP Residual norm 6.041522986797e+00 >> > 1 KSP Residual norm 5.065392549852e-16 >> > Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1 >> > 1 SNES Function norm 3.512662245652e+00 >> > 0 KSP Residual norm 3.512662245652e+00 >> > 1 KSP Residual norm 6.230314124713e-16 >> > Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1 >> > 2 SNES Function norm 8.969285922373e-01 >> > 0 KSP Residual norm 8.969285922373e-01 >> > 1 KSP Residual norm 0.000000000000e+00 >> > Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1 >> > 3 SNES Function norm 4.863816734540e-01 >> > 0 KSP Residual norm 4.863816734540e-01 >> > 1 KSP Residual norm 0.000000000000e+00 >> > Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1 >> > 4 SNES Function norm 3.512070785520e-01 >> > 0 KSP Residual norm 3.512070785520e-01 >> > 1 KSP Residual norm 0.000000000000e+00 >> > Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1 >> > 5 SNES Function norm 2.769700293115e-01 >> > 0 KSP Residual norm 2.769700293115e-01 >> > 1 KSP Residual norm 1.104778916974e-16 >> > Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1 >> > 6 SNES Function norm 2.055345318150e-01 >> > 0 KSP Residual norm 2.055345318150e-01 >> > 1 KSP Residual norm 1.535110861002e-17 >> > Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1 >> > 7 SNES Function norm 1.267482220786e-01 >> > 0 KSP Residual norm 1.267482220786e-01 >> > 1 KSP Residual norm 1.498679601680e-17 >> > Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1 >> > 8 SNES Function norm 3.468150619264e-02 >> > 0 KSP Residual norm 3.468150619264e-02 >> > 1 KSP Residual norm 5.944160522951e-18 >> > Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1 >> >> >> >>> On Jan 30, 2024, at 11:19?AM, Yi Hu wrote: >>> >>> Dear PETSc team, >>> >>> I am still trying to sort out my previous thread https://lists.mcs.anl.gov/pipermail/petsc-users/2024-January/050079.html using a minimal working example. However, I encountered another problem. Basically I combined the basic usage of SNES solver and shell matrix and tried to make it work. The jacobian of my snes is replaced by a customized MATOP_MULT. 
The minimal example code can be viewed here https://github.com/hyharry/small_petsc_test/blob/master/test_shell_jac/ex1f.F90 >>> >>> When running with -ksp_monitor -ksp_converged_reason, it shows an extra mymult step, and hence ruin my previous converged KSP result. Implement a customized converged call-back also does not help. I am wondering how to skip this extra ksp iteration. Could anyone help me on this? >>> >>> Thanks for your help. >>> >>> Best wishes, >>> Yi >>> >>> >>> ------------------------------------------------- >>> Stay up to date and follow us on LinkedIn, Twitter and YouTube. >>> >>> Max-Planck-Institut f?r Eisenforschung GmbH >>> Max-Planck-Stra?e 1 >>> D-40237 D?sseldorf >>> >>> Handelsregister B 2533 >>> Amtsgericht D?sseldorf >>> >>> Gesch?ftsf?hrung >>> Prof. Dr. Gerhard Dehm >>> Prof. Dr. J?rg Neugebauer >>> Prof. Dr. Dierk Raabe >>> Dr. Kai de Weldige >>> >>> Ust.-Id.-Nr.: DE 11 93 58 514 >>> Steuernummer: 105 5891 1000 >>> >>> >>> Please consider that invitations and e-mails of our institute are >>> only valid if they end with ?@mpie.de. >>> If you are not sure of the validity please contact rco at mpie.de >>> >>> Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails >>> aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. >>> In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de >>> ------------------------------------------------- >> > > > ------------------------------------------------- > Stay up to date and follow us on LinkedIn, Twitter and YouTube. > > Max-Planck-Institut f?r Eisenforschung GmbH > Max-Planck-Stra?e 1 > D-40237 D?sseldorf > > Handelsregister B 2533 > Amtsgericht D?sseldorf > > Gesch?ftsf?hrung > Prof. Dr. Gerhard Dehm > Prof. Dr. J?rg Neugebauer > Prof. Dr. Dierk Raabe > Dr. Kai de Weldige > > Ust.-Id.-Nr.: DE 11 93 58 514 > Steuernummer: 105 5891 1000 > > > Please consider that invitations and e-mails of our institute are > only valid if they end with ?@mpie.de. > If you are not sure of the validity please contact rco at mpie.de > > Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails > aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. > In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de > ------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From ngoetting at itp.uni-bremen.de Wed Jan 31 05:51:34 2024 From: ngoetting at itp.uni-bremen.de (=?UTF-8?Q?Niclas_G=C3=B6tting?=) Date: Wed, 31 Jan 2024 12:51:34 +0100 Subject: [petsc-users] Preconditioning of Liouvillian Superoperator Message-ID: <0bc882a8-97e5-4e52-9889-c9e94245afd9@itp.uni-bremen.de> Hi all, I've been trying for the last couple of days to solve a linear system using iterative methods. The system size itself scales exponentially (64^N) with the number of components, so I receive sizes of * (64, 64) for one component * (4096, 4096) for two components * (262144, 262144) for three components I can solve the first two cases with direct solvers and don't run into any problems; however, the last case is the first nontrivial and it's too large for a direct solution, which is why I believe that I need an iterative solver. As I know the solution for the first two cases, I tried to reproduce them using GMRES and failed on the second, because GMRES didn't converge and seems to have been going in the wrong direction (the vector to which it "tries" to converge is a totally different one than the correct solution). 
I went as far as -ksp_max_it 1000000, which takes orders of magnitude longer than the LU solution and I'd intuitively think that GMRES should not take *that* much longer than LU. Here is the information I have about this (4096, 4096) system: * not symmetric (which is why I went for GMRES) * not singular (SVD: condition number 1.427743623238e+06, 0 of 4096 singular values are (nearly) zero) * solving without preconditioning does not converge (DIVERGED_ITS) * solving with iLU and natural ordering fails due to zeros on the diagonal * solving with iLU and RCM ordering does not converge (DIVERGED_ITS) After some searching I also found [this](http://arxiv.org/abs/1504.06768) paper, which mentions the use of ILUTP, which I believe in PETSc should be used via hypre, which, however, threw a SEGV for me, and I'm not sure if it's worth debugging at this point in time, because I might be missing something entirely different. Does anybody have an idea how this system could be solved in finite time, such that the method also scales to the three component problem? Thank you all very much in advance! Best regards Niclas From mfadams at lbl.gov Wed Jan 31 07:21:18 2024 From: mfadams at lbl.gov (Mark Adams) Date: Wed, 31 Jan 2024 08:21:18 -0500 Subject: [petsc-users] Preconditioning of Liouvillian Superoperator In-Reply-To: <0bc882a8-97e5-4e52-9889-c9e94245afd9@itp.uni-bremen.de> References: <0bc882a8-97e5-4e52-9889-c9e94245afd9@itp.uni-bremen.de> Message-ID: Iterative solvers have to be designed for your particular operator. You want to look in your field to see how people solve these problems. (eg, zeros on the diagonal will need something like a block solver or maybe ILU with a particular ordering) I don't personally know anything about this operator. Perhaps someone else can help you, but you will probably need to find this yourself. Also, hypre's ILUTP is not well supported. You could use our (serial) ILU on one processor to experiment with ( https://petsc.org/main/manualpages/PC/PCILU). Mark On Wed, Jan 31, 2024 at 6:51?AM Niclas G?tting wrote: > Hi all, > > I've been trying for the last couple of days to solve a linear system > using iterative methods. The system size itself scales exponentially > (64^N) with the number of components, so I receive sizes of > > * (64, 64) for one component > * (4096, 4096) for two components > * (262144, 262144) for three components > > I can solve the first two cases with direct solvers and don't run into > any problems; however, the last case is the first nontrivial and it's > too large for a direct solution, which is why I believe that I need an > iterative solver. > > As I know the solution for the first two cases, I tried to reproduce > them using GMRES and failed on the second, because GMRES didn't converge > and seems to have been going in the wrong direction (the vector to which > it "tries" to converge is a totally different one than the correct > solution). I went as far as -ksp_max_it 1000000, which takes orders of > magnitude longer than the LU solution and I'd intuitively think that > GMRES should not take *that* much longer than LU. 
Here is the > information I have about this (4096, 4096) system: > > * not symmetric (which is why I went for GMRES) > * not singular (SVD: condition number 1.427743623238e+06, 0 of 4096 > singular values are (nearly) zero) > * solving without preconditioning does not converge (DIVERGED_ITS) > * solving with iLU and natural ordering fails due to zeros on the diagonal > * solving with iLU and RCM ordering does not converge (DIVERGED_ITS) > > After some searching I also found > [this](http://arxiv.org/abs/1504.06768) paper, which mentions the use of > ILUTP, which I believe in PETSc should be used via hypre, which, > however, threw a SEGV for me, and I'm not sure if it's worth debugging > at this point in time, because I might be missing something entirely > different. > > Does anybody have an idea how this system could be solved in finite > time, such that the method also scales to the three component problem? > > Thank you all very much in advance! > > Best regards > Niclas > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Jan 31 08:01:05 2024 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 31 Jan 2024 09:01:05 -0500 Subject: [petsc-users] Preconditioning of Liouvillian Superoperator In-Reply-To: References: <0bc882a8-97e5-4e52-9889-c9e94245afd9@itp.uni-bremen.de> Message-ID: On Wed, Jan 31, 2024 at 8:21?AM Mark Adams wrote: > Iterative solvers have to be designed for your particular operator. > You want to look in your field to see how people solve these problems. > (eg, zeros on the diagonal will need something like a block solver or maybe > ILU with a particular ordering) > I don't personally know anything about this operator. Perhaps someone else > can help you, but you will probably need to find this yourself. > Also, hypre's ILUTP is not well supported. You could use our (serial) ILU > on one processor to experiment with ( > https://petsc.org/main/manualpages/PC/PCILU). > As Mark says, understanding your operator is key here. However, some comments about GMRES. Full GMRES is guaranteed to converge. By default you are using GMRES(30), which has no guarantees. You could look at the effect of increasing the subspace size. This is probably not worth it without first understanding at least the spectrum of the operator, and other analytic characteristics (say is it a PDE, or BIE, etc) Thanks, Matt > Mark > > > On Wed, Jan 31, 2024 at 6:51?AM Niclas G?tting < > ngoetting at itp.uni-bremen.de> wrote: > >> Hi all, >> >> I've been trying for the last couple of days to solve a linear system >> using iterative methods. The system size itself scales exponentially >> (64^N) with the number of components, so I receive sizes of >> >> * (64, 64) for one component >> * (4096, 4096) for two components >> * (262144, 262144) for three components >> >> I can solve the first two cases with direct solvers and don't run into >> any problems; however, the last case is the first nontrivial and it's >> too large for a direct solution, which is why I believe that I need an >> iterative solver. >> >> As I know the solution for the first two cases, I tried to reproduce >> them using GMRES and failed on the second, because GMRES didn't converge >> and seems to have been going in the wrong direction (the vector to which >> it "tries" to converge is a totally different one than the correct >> solution). 
I went as far as -ksp_max_it 1000000, which takes orders of >> magnitude longer than the LU solution and I'd intuitively think that >> GMRES should not take *that* much longer than LU. Here is the >> information I have about this (4096, 4096) system: >> >> * not symmetric (which is why I went for GMRES) >> * not singular (SVD: condition number 1.427743623238e+06, 0 of 4096 >> singular values are (nearly) zero) >> * solving without preconditioning does not converge (DIVERGED_ITS) >> * solving with iLU and natural ordering fails due to zeros on the diagonal >> * solving with iLU and RCM ordering does not converge (DIVERGED_ITS) >> >> After some searching I also found >> [this](http://arxiv.org/abs/1504.06768) paper, which mentions the use of >> ILUTP, which I believe in PETSc should be used via hypre, which, >> however, threw a SEGV for me, and I'm not sure if it's worth debugging >> at this point in time, because I might be missing something entirely >> different. >> >> Does anybody have an idea how this system could be solved in finite >> time, such that the method also scales to the three component problem? >> >> Thank you all very much in advance! >> >> Best regards >> Niclas >> >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From alain.miniussi at oca.eu Wed Jan 31 04:31:23 2024 From: alain.miniussi at oca.eu (Alain O' Miniussi) Date: Wed, 31 Jan 2024 11:31:23 +0100 (CET) Subject: [petsc-users] PETSc init question Message-ID: <1806087555.702818746.1706697083345.JavaMail.zimbra@oca.eu> Hi, It is indicated in: https://petsc.org/release/manualpages/Sys/PetscInitialize/ that the init function will call MPI_Init. What if MPI_Init was already called (as it is the case in my application) and what about MPI_Init_thread. Wouldn't it be more convenient to have a Petsc init function taking a already initialized communicator as argument ? Also, that initialization seems to imply that it is not possible to have multiple instance of PETSc on different communicators. Is that the case ? Thanks ---- Alain Miniussi DSI, P?les Calcul et Genie Log. Observatoire de la C?te d'Azur T?l. : +33609650665 From pierre at joliv.et Wed Jan 31 09:14:52 2024 From: pierre at joliv.et (Pierre Jolivet) Date: Wed, 31 Jan 2024 16:14:52 +0100 Subject: [petsc-users] PETSc init question In-Reply-To: <1806087555.702818746.1706697083345.JavaMail.zimbra@oca.eu> References: <1806087555.702818746.1706697083345.JavaMail.zimbra@oca.eu> Message-ID: <6CAC7B4C-BBB0-4E12-8F5D-2C681341BEDF@joliv.et> > On 31 Jan 2024, at 11:31?AM, Alain O' Miniussi wrote: > > Hi, > > It is indicated in: > https://petsc.org/release/manualpages/Sys/PetscInitialize/ > that the init function will call MPI_Init. > > What if MPI_Init was already called (as it is the case in my application) and what about MPI_Init_thread. Then, MPI_Init() is not called, see the call to MPI_Initialized() in https://petsc.org/release/src/sys/objects/pinit.c.html#PetscInitialize. > Wouldn't it be more convenient to have a Petsc init function taking a already initialized communicator as argument ? > > Also, that initialization seems to imply that it is not possible to have multiple instance of PETSc on different communicators. Is that the case ? 
No, you can initialize MPI yourself and then set PETSC_COMM_WORLD to whatever you need before calling PetscInitialize(). Thanks, Pierre > Thanks > > ---- > Alain Miniussi > DSI, P?les Calcul et Genie Log. > Observatoire de la C?te d'Azur > T?l. : +33609650665 -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Jan 31 09:17:47 2024 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 31 Jan 2024 10:17:47 -0500 Subject: [petsc-users] PETSc init question In-Reply-To: <1806087555.702818746.1706697083345.JavaMail.zimbra@oca.eu> References: <1806087555.702818746.1706697083345.JavaMail.zimbra@oca.eu> Message-ID: On Wed, Jan 31, 2024 at 10:10?AM Alain O' Miniussi wrote: > Hi, > > It is indicated in: > https://petsc.org/release/manualpages/Sys/PetscInitialize/ > that the init function will call MPI_Init. > > What if MPI_Init was already called (as it is the case in my application) >From the page: " PetscInitialize() calls MPI_Init() if that has yet to be called,". Also "Note If for some reason you must call MPI_Init() separately, call it before PetscInitialize()." > and what about MPI_Init_thread. > https://petsc.org/release/manualpages/Sys/PETSC_MPI_THREAD_REQUIRED/ > Wouldn't it be more convenient to have a Petsc init function taking a > already initialized communicator as argument ? > Probably not. > Also, that initialization seems to imply that it is not possible to have > multiple instance of PETSc on different communicators. Is that the case ? > No, this is possible. We have examples for this. You call MPI_init() yourself, set PETSC_COMM_WORLD to the communicator you want, and then call PetscInitialize(). See https://petsc.org/release/manualpages/Sys/PETSC_COMM_WORLD/ Thanks, Matt Thanks, Matt > Thanks > > ---- > Alain Miniussi > DSI, P?les Calcul et Genie Log. > Observatoire de la C?te d'Azur > T?l. : +33609650665 > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Jan 31 11:45:23 2024 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 31 Jan 2024 12:45:23 -0500 Subject: [petsc-users] Preconditioning of Liouvillian Superoperator In-Reply-To: <0bc882a8-97e5-4e52-9889-c9e94245afd9@itp.uni-bremen.de> References: <0bc882a8-97e5-4e52-9889-c9e94245afd9@itp.uni-bremen.de> Message-ID: <5B5E9848-631C-46B2-8DE7-07EA69E3A4A9@petsc.dev> For large problems, preconditioners have to take advantage of some underlying mathematical structure of the operator to perform well (require few iterations). Just black-boxing the system with simple preconditioners will not be effective. So, one needs to look at the Liouvillian Superoperator's structure to see what one can take advantage of. I first noticed that it can be represented as a Kronecker product: A x I or a combination of Kronecker products? In theory, one can take advantage of Kronecker structure to solve such systems much more efficiently than just directly solving the huge system naively as a huge system. In addition it may be possible to use the Kronecker structure of the operator to perform matrix-vector products with the operator much more efficiently than by first explicitly forming the huge matrix representation and doing the multiplies with that. 
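To make that last point concrete, here is a minimal serial petsc4py sketch of a matrix-free operator (only an illustration: it assumes the operator can be written as A ⊗ I + I ⊗ B with small n-by-n factors A and B, which you would have to verify for your Liouvillian; the random matrices are stand-ins for the real one-component operators):

``
import sys
import numpy as np
import petsc4py
petsc4py.init(sys.argv)
from petsc4py import PETSc

n = 64          # one-component dimension
N = n * n       # two-component (Kronecker) dimension

A = np.random.rand(n, n)   # placeholders for the small Kronecker factors
B = np.random.rand(n, n)

class KronShell:
    # Applies y = (A kron I + I kron B) x without forming the N-by-N matrix,
    # using (A kron B) x = vec(A X B^T) with X = x reshaped (n, n) row-major.
    def __init__(self, A, B):
        self.A, self.B = A, B
    def mult(self, mat, x, y):
        m = self.A.shape[0]
        X = x.getArray(readonly=True).reshape(m, m)
        Y = self.A @ X + X @ self.B.T
        y.setArray(Y.ravel())

L = PETSc.Mat().createPython([N, N], KronShell(A, B), comm=PETSc.COMM_SELF)
L.setUp()

x = PETSc.Vec().createSeq(N)
y = x.duplicate()
x.setRandom()
L.mult(x, y)    # applies the 4096-by-4096 operator at O(n^3) cost, no assembly
``

Such a shell operator can be handed to KSPSetOperators() for the matrix-vector products; the preconditioner is then a separate question, since ILU and similar factorizations need an assembled matrix, so one would typically supply an assembled (possibly approximate) matrix as the preconditioning matrix or build a preconditioner that itself exploits the Kronecker structure.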
I suggest some googling with linear solver, preconditioning, Kronecker product. > On Jan 31, 2024, at 6:51?AM, Niclas G?tting wrote: > > Hi all, > > I've been trying for the last couple of days to solve a linear system using iterative methods. The system size itself scales exponentially (64^N) with the number of components, so I receive sizes of > > * (64, 64) for one component > * (4096, 4096) for two components > * (262144, 262144) for three components > > I can solve the first two cases with direct solvers and don't run into any problems; however, the last case is the first nontrivial and it's too large for a direct solution, which is why I believe that I need an iterative solver. > > As I know the solution for the first two cases, I tried to reproduce them using GMRES and failed on the second, because GMRES didn't converge and seems to have been going in the wrong direction (the vector to which it "tries" to converge is a totally different one than the correct solution). I went as far as -ksp_max_it 1000000, which takes orders of magnitude longer than the LU solution and I'd intuitively think that GMRES should not take *that* much longer than LU. Here is the information I have about this (4096, 4096) system: > > * not symmetric (which is why I went for GMRES) > * not singular (SVD: condition number 1.427743623238e+06, 0 of 4096 singular values are (nearly) zero) > * solving without preconditioning does not converge (DIVERGED_ITS) > * solving with iLU and natural ordering fails due to zeros on the diagonal > * solving with iLU and RCM ordering does not converge (DIVERGED_ITS) > > After some searching I also found [this](http://arxiv.org/abs/1504.06768) paper, which mentions the use of ILUTP, which I believe in PETSc should be used via hypre, which, however, threw a SEGV for me, and I'm not sure if it's worth debugging at this point in time, because I might be missing something entirely different. > > Does anybody have an idea how this system could be solved in finite time, such that the method also scales to the three component problem? > > Thank you all very much in advance! > > Best regards > Niclas > From anna at oden.utexas.edu Wed Jan 31 16:06:51 2024 From: anna at oden.utexas.edu (Yesypenko, Anna) Date: Wed, 31 Jan 2024 22:06:51 +0000 Subject: [petsc-users] errors with hypre with MPI and multiple GPUs on a node Message-ID: Dear Petsc devs, I'm encountering an error running hypre on a single node with multiple GPUs. The issue is in the setup phase. I'm trying to troubleshoot, but don't know where to start. Are the system routines PetScCUDAInitialize and PetScCUDAInitializeCheck available in python? How do I verify that GPUs are assigned properly to each MPI process? In this case, I have 3 tasks and 3 GPUs. The code works with pc-type hypre on a single GPU. Any suggestions are appreciated! Below is the error trace: `` TACC: Starting up job 1490124 TACC: Setting up parallel environment for MVAPICH2+mpispawn. TACC: Starting parallel tasks... 
[0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see https://petsc.org/release/faq/#valgrind and https://petsc.org/release/faq/ [0]PETSC ERROR: or try https://docs.nvidia.com/cuda/cuda-memcheck/index.html on NVIDIA CUDA systems to find memory corruption errors [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [0]PETSC ERROR: The line numbers in the error traceback are not always exact. [0]PETSC ERROR: #1 hypre_ParCSRMatrixMigrate() [0]PETSC ERROR: #2 MatBindToCPU_HYPRE() at /work/06368/annayesy/ls6/petsc/src/mat/impls/hypre/mhypre.c:1394 [0]PETSC ERROR: #3 MatAssemblyEnd_HYPRE() at /work/06368/annayesy/ls6/petsc/src/mat/impls/hypre/mhypre.c:1471 [0]PETSC ERROR: #4 MatAssemblyEnd() at /work/06368/annayesy/ls6/petsc/src/mat/interface/matrix.c:5773 [0]PETSC ERROR: #5 MatConvert_AIJ_HYPRE() at /work/06368/annayesy/ls6/petsc/src/mat/impls/hypre/mhypre.c:660 [0]PETSC ERROR: #6 MatConvert() at /work/06368/annayesy/ls6/petsc/src/mat/interface/matrix.c:4421 [0]PETSC ERROR: #7 PCSetUp_HYPRE() at /work/06368/annayesy/ls6/petsc/src/ksp/pc/impls/hypre/hypre.c:245 [0]PETSC ERROR: #8 PCSetUp() at /work/06368/annayesy/ls6/petsc/src/ksp/pc/interface/precon.c:1080 [0]PETSC ERROR: #9 KSPSetUp() at /work/06368/annayesy/ls6/petsc/src/ksp/ksp/interface/itfunc.c:415 [0]PETSC ERROR: #10 KSPSolve_Private() at /work/06368/annayesy/ls6/petsc/src/ksp/ksp/interface/itfunc.c:833 [0]PETSC ERROR: #11 KSPSolve() at /work/06368/annayesy/ls6/petsc/src/ksp/ksp/interface/itfunc.c:1080 application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 `` Below is a minimum working example: `` import numpy,petsc4py,sys,time petsc4py.init(sys.argv) from petsc4py import PETSc from time import time n = int(5e5); comm = PETSc.COMM_WORLD pA = PETSc.Mat(comm=comm) pA.create(comm=comm) pA.setSizes((n,n)) pA.setType(PETSc.Mat.Type.AIJ) pA.setPreallocationNNZ(3) rstart,rend=pA.getOwnershipRange() print("\t Processor %d of %d gets indices %d:%d"%(comm.Get_rank(),comm.Get_size(),rstart,rend)) if (rstart == 0): pA.setValue(0,0,2); pA.setValue(0,1,-1) if (rend == n): pA.setValue(n-1,n-2,-1); pA.setValue(n-1,n-1,2) for index in range(rstart,rend): if (rstart > 0): pA.setValue(index,index-1,-1) pA.setValue(index,index,2) if (rend < n): pA.setValue(index,index+1,-1) pA.assemble() pA = pA.convert(mat_type='aijcusparse') px,pb = pA.createVecs() pb.set(1.0); px.set(1.0) ksp = PETSc.KSP().create() ksp.setOperators(pA) ksp.setConvergenceHistory() ksp.setType('cg') ksp.getPC().setType('hypre') ksp.setTolerances(rtol=1e-10) ksp.solve(pb, px) # error is generated here `` Best, Anna -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Wed Jan 31 17:36:36 2024 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Wed, 31 Jan 2024 17:36:36 -0600 Subject: [petsc-users] errors with hypre with MPI and multiple GPUs on a node In-Reply-To: References: Message-ID: Hi Anna, Since you said "The code works with pc-type hypre on a single GPU.", I was wondering if this is a CUDA devices to MPI ranks binding problem. You can search TACC documentation to find how its job scheduler binds GPUs to MPI ranks (usually via manipulating the CUDA_VISIBLE_DEVICES environment variable) Please follow up if you could not solve it. Thanks. 
--Junchao Zhang On Wed, Jan 31, 2024 at 4:07?PM Yesypenko, Anna wrote: > Dear Petsc devs, > > I'm encountering an error running hypre on a single node with multiple > GPUs. > The issue is in the setup phase. I'm trying to troubleshoot, but don't > know where to start. > Are the system routines PetScCUDAInitialize and PetScCUDAInitializeCheck > available in python? > How do I verify that GPUs are assigned properly to each MPI process? In > this case, I have 3 tasks and 3 GPUs. > > The code works with pc-type hypre on a single GPU. > Any suggestions are appreciated! > > Below is the error trace: > `` > TACC: Starting up job 1490124 > TACC: Setting up parallel environment for MVAPICH2+mpispawn. > TACC: Starting parallel tasks... > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see https://petsc.org/release/faq/#valgrind and > https://petsc.org/release/faq/ > [0]PETSC ERROR: or try > https://docs.nvidia.com/cuda/cuda-memcheck/index.html on NVIDIA CUDA > systems to find memory corruption errors > [0]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > [0]PETSC ERROR: The line numbers in the error traceback are not always > exact. > [0]PETSC ERROR: #1 hypre_ParCSRMatrixMigrate() > [0]PETSC ERROR: #2 MatBindToCPU_HYPRE() at > /work/06368/annayesy/ls6/petsc/src/mat/impls/hypre/mhypre.c:1394 > [0]PETSC ERROR: #3 MatAssemblyEnd_HYPRE() at > /work/06368/annayesy/ls6/petsc/src/mat/impls/hypre/mhypre.c:1471 > [0]PETSC ERROR: #4 MatAssemblyEnd() at > /work/06368/annayesy/ls6/petsc/src/mat/interface/matrix.c:5773 > [0]PETSC ERROR: #5 MatConvert_AIJ_HYPRE() at > /work/06368/annayesy/ls6/petsc/src/mat/impls/hypre/mhypre.c:660 > [0]PETSC ERROR: #6 MatConvert() at > /work/06368/annayesy/ls6/petsc/src/mat/interface/matrix.c:4421 > [0]PETSC ERROR: #7 PCSetUp_HYPRE() at > /work/06368/annayesy/ls6/petsc/src/ksp/pc/impls/hypre/hypre.c:245 > [0]PETSC ERROR: #8 PCSetUp() at > /work/06368/annayesy/ls6/petsc/src/ksp/pc/interface/precon.c:1080 > [0]PETSC ERROR: #9 KSPSetUp() at > /work/06368/annayesy/ls6/petsc/src/ksp/ksp/interface/itfunc.c:415 > [0]PETSC ERROR: #10 KSPSolve_Private() at > /work/06368/annayesy/ls6/petsc/src/ksp/ksp/interface/itfunc.c:833 > [0]PETSC ERROR: #11 KSPSolve() at > /work/06368/annayesy/ls6/petsc/src/ksp/ksp/interface/itfunc.c:1080 > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > `` > > Below is a minimum working example: > `` > import numpy,petsc4py,sys,time > petsc4py.init(sys.argv) > from petsc4py import PETSc > from time import time > > n = int(5e5); > comm = PETSc.COMM_WORLD > > pA = PETSc.Mat(comm=comm) > pA.create(comm=comm) > pA.setSizes((n,n)) > pA.setType(PETSc.Mat.Type.AIJ) > pA.setPreallocationNNZ(3) > rstart,rend=pA.getOwnershipRange() > > print("\t Processor %d of %d gets indices > %d:%d"%(comm.Get_rank(),comm.Get_size(),rstart,rend)) > if (rstart == 0): > pA.setValue(0,0,2); pA.setValue(0,1,-1) > if (rend == n): > pA.setValue(n-1,n-2,-1); pA.setValue(n-1,n-1,2) > > for index in range(rstart,rend): > if (rstart > 0): > pA.setValue(index,index-1,-1) > pA.setValue(index,index,2) > if (rend < n): > pA.setValue(index,index+1,-1) > > pA.assemble() > pA = pA.convert(mat_type='aijcusparse') > > px,pb = pA.createVecs() > pb.set(1.0); px.set(1.0) > > ksp = PETSc.KSP().create() > 
ksp.setOperators(pA) > ksp.setConvergenceHistory() > ksp.setType('cg') > ksp.getPC().setType('hypre') > ksp.setTolerances(rtol=1e-10) > > ksp.solve(pb, px) # error is generated here > `` > > Best, > Anna > -------------- next part -------------- An HTML attachment was scrubbed... URL:
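A quick way to check the binding mentioned above is to run a tiny script with the same launcher command as the failing job and have every rank report what it sees (just a sketch; whether and how the scheduler sets CUDA_VISIBLE_DEVICES per rank is site-specific):

``
import os, sys
import petsc4py
petsc4py.init(sys.argv)
from petsc4py import PETSc

comm = PETSc.COMM_WORLD
rank, size = comm.getRank(), comm.getSize()
# what the launcher handed this rank; unset usually means all GPUs are visible
visible = os.environ.get("CUDA_VISIBLE_DEVICES", "<unset>")
print("rank %d of %d sees CUDA_VISIBLE_DEVICES=%s" % (rank, size, visible), flush=True)
``

The usual setup is one rank per GPU, with each rank either restricted to a single distinct device or with all devices visible and the library selecting one per rank; if the three ranks do not each end up with their own device, that is worth fixing before digging further into the hypre setup.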