From jordi.manyer at monash.edu Mon Dec 4 08:44:41 2023
From: jordi.manyer at monash.edu (Jordi Manyer Fuertes)
Date: Tue, 5 Dec 2023 01:44:41 +1100
Subject: [petsc-users] Help for MatNullSpaceCreateRigidBody
Message-ID: <97c05c18-214b-49f6-85cf-c8f512a3b047@monash.edu>

Dear PETSc users/developers,

I am currently trying to use the method `MatNullSpaceCreateRigidBody` together with `PCGAMG` to efficiently precondition an elasticity solver in 2D/3D.

I have managed to make it work in serial (or with 1 MPI rank) with an h-independent number of iterations (which is great), but the solver diverges in parallel.

I assume it has to do with the coordinate vector I am building the null-space with not being correctly set up. The documentation is not that clear on which nodes exactly have to be set in each partition. Does it require nodes corresponding to owned dofs, or all dofs in each partition (owned+ghost)? What ghost layout should the `Vec` have?

Any other tips about what I might be doing wrong?

Thanks,

Jordi

From bsmith at petsc.dev Mon Dec 4 11:37:15 2023
From: bsmith at petsc.dev (Barry Smith)
Date: Mon, 4 Dec 2023 12:37:15 -0500
Subject: [petsc-users] Help for MatNullSpaceCreateRigidBody
In-Reply-To: <97c05c18-214b-49f6-85cf-c8f512a3b047@monash.edu>
References: <97c05c18-214b-49f6-85cf-c8f512a3b047@monash.edu>
Message-ID: <8CBF0F2E-3FE5-4E0D-8924-63CE130F7C6B@petsc.dev>

   To owned DOF. Any ghosting of the problem is not relevant since the null space created is purely a linear algebra thing that is related to the global vector (not local vectors).

   Barry

> On Dec 4, 2023, at 9:44 AM, Jordi Manyer Fuertes via petsc-users wrote:
>
> Dear PETSc users/developers,
>
> I am currently trying to use the method `MatNullSpaceCreateRigidBody` together with `PCGAMG` to efficiently precondition an elasticity solver in 2D/3D.
>
> I have managed to make it work in serial (or with 1 MPI rank) with an h-independent number of iterations (which is great), but the solver diverges in parallel.
>
> I assume it has to do with the coordinate vector I am building the null-space with not being correctly set up. The documentation is not that clear on which nodes exactly have to be set in each partition. Does it require nodes corresponding to owned dofs, or all dofs in each partition (owned+ghost)? What ghost layout should the `Vec` have?
>
> Any other tips about what I might be doing wrong?
>
> Thanks,
>
> Jordi
>

From knepley at gmail.com Mon Dec 4 11:46:52 2023
From: knepley at gmail.com (Matthew Knepley)
Date: Mon, 4 Dec 2023 12:46:52 -0500
Subject: [petsc-users] Help for MatNullSpaceCreateRigidBody
In-Reply-To: <97c05c18-214b-49f6-85cf-c8f512a3b047@monash.edu>
References: <97c05c18-214b-49f6-85cf-c8f512a3b047@monash.edu>
Message-ID:

On Mon, Dec 4, 2023 at 12:01 PM Jordi Manyer Fuertes via petsc-users <petsc-users at mcs.anl.gov> wrote:
> Dear PETSc users/developers,
>
> I am currently trying to use the method `MatNullSpaceCreateRigidBody` together with `PCGAMG` to efficiently precondition an elasticity solver in 2D/3D.
>
> I have managed to make it work in serial (or with 1 MPI rank) with an h-independent number of iterations (which is great), but the solver diverges in parallel.
>
> I assume it has to do with the coordinate vector I am building the null-space with not being correctly set up. The documentation is not that clear on which nodes exactly have to be set in each partition. Does it require nodes corresponding to owned dofs, or all dofs in each partition (owned+ghost)? What ghost layout should the `Vec` have?
>
> Any other tips about what I might be doing wrong?

What we assume is that you have some elastic problem formulated in primal unknowns (displacements) so that the solution vector looks like this:

  [ d^0_x d^0_y d^0_z d^1_x ..... ]

or whatever spatial dimension you have. We expect to get a global vector that looks like that, but instead of displacements, we get the coordinates that each displacement corresponds to. We make the generators of translations:

  [ 1 0 0 1 0 0 1 0 0 1 0 0... ]
  [ 0 1 0 0 1 0 0 1 0 0 1 0... ]
  [ 0 0 1 0 0 1 0 0 1 0 0 1... ]

for which we do not need the coordinates, and then the generators of rotations about each axis, for which we _do_ need the coordinates, since we need to know how much each point moves if you rotate about some center.

  Does that make sense?

    Thanks,

      Matt

> Thanks,
>
> Jordi

--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jordi.manyer at monash.edu Tue Dec 5 06:57:23 2023
From: jordi.manyer at monash.edu (Jordi Manyer Fuertes)
Date: Tue, 5 Dec 2023 23:57:23 +1100
Subject: [petsc-users] Help for MatNullSpaceCreateRigidBody
In-Reply-To: References: <97c05c18-214b-49f6-85cf-c8f512a3b047@monash.edu>
Message-ID: <95cc9291-9dc9-4499-a84b-7e4952a46b43@monash.edu>

Thanks for the prompt response. Both answers look like what I'm doing.

After playing a bit more with the solver, I managed to make it run in parallel with different boundary conditions (full Dirichlet BCs vs. mixed Neumann + Dirichlet). This raises two questions:

- How relevant are boundary conditions (eliminating Dirichlet rows/cols vs. weak Neumann BCs) to the solver? Should I modify something when changing boundary conditions?

- Also, the solver did well with the old BCs when run on a single processor (but not in parallel). This seems odd, since parallel and serial behavior should be consistent (or not?). Could it be the fault of the PCGAMG? I believe the default local solver is ILU; should I be changing it to LU or something else for this kind of problem?

Thank you both again,

Jordi

On 5/12/23 04:46, Matthew Knepley wrote:
> On Mon, Dec 4, 2023 at 12:01 PM Jordi Manyer Fuertes via petsc-users wrote:
>
> Dear PETSc users/developers,
>
> I am currently trying to use the method `MatNullSpaceCreateRigidBody` together with `PCGAMG` to efficiently precondition an elasticity solver in 2D/3D.
>
> I have managed to make it work in serial (or with 1 MPI rank) with an h-independent number of iterations (which is great), but the solver diverges in parallel.
>
> I assume it has to do with the coordinate vector I am building the null-space with not being correctly set up. The documentation is not that clear on which nodes exactly have to be set in each partition.
> Does it > require nodes corresponding to owned dofs, or all dofs in each > partition > (owned+ghost)? What ghost layout should the `Vec` have? > > Any other tips about what I might be doing wrong? > > > What we assume is that you have some elastic problem formulated in > primal unknowns (displacements) so that the solution vector looks like > this: > > ? [ d^0_x d^0_y d^0_z d^1_x ..... ] > > or whatever spatial dimension you have. We expect to get a global > vector that looks like that, but instead > of displacements, we get the coordinates that each displacement > corresponds to. We make the generators of translations: > > ? [ 1 0 0 1 0 0 1 0 0 1 0 0... ] > ? [ 0 1 0 0 1 0 0 1 0 0 1 0... ] > ? [ 0 0 1 0 0 1 0 0 1 0 0 1... ] > > for which we do not need the coordinates, and then the generators of > rotations about each axis, for which > we _do_ need the coordinates, since we need to know how much each > point moves if you rotate about some center. > > ? Does that make sense? > > ? ?Thanks, > > ? ? ? Matt > > Thanks, > > Jordi > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Dec 5 07:35:15 2023 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 5 Dec 2023 08:35:15 -0500 Subject: [petsc-users] Help for MatNullSpaceCreateRigidBody In-Reply-To: <95cc9291-9dc9-4499-a84b-7e4952a46b43@monash.edu> References: <97c05c18-214b-49f6-85cf-c8f512a3b047@monash.edu> <95cc9291-9dc9-4499-a84b-7e4952a46b43@monash.edu> Message-ID: On Tue, Dec 5, 2023 at 7:57?AM Jordi Manyer Fuertes wrote: > Thanks for the prompt response. Both answers look like what I'm doing. > > After playing a bit more with solver, I managed to make it run in parallel > with different boundary conditions (full dirichlet bcs, vs mixed newmann + > dirichlet). This raises two questions: > > - How relevant are boundary conditions (eliminating dirichlet rows/cols vs > weak newmann bcs) to the solver? Should I modify something when changing > boundary conditions? > > The rigid body kernel is independent of boundary conditions, and is only really important for the coarse grids. However, it is really easy to ruin a solve with inconsistent boundary conditions, or with conditions which cause a singularity at a change point. > - Also, the solver did well with the old bcs when run in a single > processor (but not in parallel). This seems odd, since parallel and serial > behavior should be consistent (or not?). Could it be fault of the PCGAMG? > This is unlikely. We have many parallel tests of elasticity (SNES ex17, ex56, ex77, etc). We do not see problems. It seems more likely that the system might not be assembled correctly in parallel. Did you check that the matrices match? > I believe the default local solver is ILU, shoud I be changing it to LU or > something else for these kind of problems? > Do you mean the smoother for AMG? No, the default is Chebyshev/Jacobi, which is the same in parallel. 
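Putting the answers in this thread together (the coordinate vector covers only locally owned dofs, as Barry says, and is laid out exactly like the displacement vector with the coordinates of each node interlaced, as Matt describes), a minimal sketch of the setup might look like the code below. This is an illustration rather than code posted in the thread; the routine name and the owned-coordinate array argument are assumptions, and only the PETSc calls themselves are real API.

#include <petscmat.h>

/* Sketch: attach a rigid-body near-null space to an assembled elasticity matrix A.
 * owned_node_coords holds dim coordinates per locally OWNED node (no ghosts),
 * ordered like the owned displacement dofs [x0 y0 (z0) x1 y1 (z1) ...]. */
static PetscErrorCode AttachRigidBodyNullSpace(Mat A, PetscInt dim, PetscInt n_owned_nodes, const PetscReal *owned_node_coords)
{
  Vec          coords;
  PetscScalar *c;
  MatNullSpace nsp;

  PetscFunctionBeginUser;
  PetscCall(VecCreate(PetscObjectComm((PetscObject)A), &coords));
  PetscCall(VecSetBlockSize(coords, dim)); /* MatNullSpaceCreateRigidBody reads the spatial dimension from the block size */
  PetscCall(VecSetSizes(coords, dim * n_owned_nodes, PETSC_DECIDE));
  PetscCall(VecSetUp(coords));
  PetscCall(VecGetArray(coords, &c));
  for (PetscInt i = 0; i < dim * n_owned_nodes; i++) c[i] = owned_node_coords[i]; /* owned entries only */
  PetscCall(VecRestoreArray(coords, &c));

  PetscCall(MatNullSpaceCreateRigidBody(coords, &nsp)); /* 3 modes in 2D, 6 in 3D */
  PetscCall(MatSetNearNullSpace(A, nsp));               /* picked up by -pc_type gamg when building coarse spaces */
  PetscCall(MatNullSpaceDestroy(&nsp));
  PetscCall(VecDestroy(&coords));
  PetscFunctionReturn(PETSC_SUCCESS);
}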
Thanks, Matt Thank you both again, > > Jordi > > > On 5/12/23 04:46, Matthew Knepley wrote: > > On Mon, Dec 4, 2023 at 12:01?PM Jordi Manyer Fuertes via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> Dear PETSc users/developpers, >> >> I am currently trying to use the method `MatNullSpaceCreateRigidBody` >> together with `PCGAMG` to efficiently precondition an elasticity solver >> in 2D/3D. >> >> I have managed to make it work in serial (or with 1 MPI rank) with >> h-independent number of iterations (which is great), but the solver >> diverges in parallel. >> >> I assume it has to do with the coordinate vector I am building the >> null-space with not being correctly setup. The documentation is not that >> clear on which nodes exactly have to be set in each partition. Does it >> require nodes corresponding to owned dofs, or all dofs in each partition >> (owned+ghost)? What ghost layout should the `Vec` have? >> >> Any other tips about what I might be doing wrong? >> > > What we assume is that you have some elastic problem formulated in primal > unknowns (displacements) so that the solution vector looks like this: > > [ d^0_x d^0_y d^0_z d^1_x ..... ] > > or whatever spatial dimension you have. We expect to get a global vector > that looks like that, but instead > of displacements, we get the coordinates that each displacement > corresponds to. We make the generators of translations: > > [ 1 0 0 1 0 0 1 0 0 1 0 0... ] > [ 0 1 0 0 1 0 0 1 0 0 1 0... ] > [ 0 0 1 0 0 1 0 0 1 0 0 1... ] > > for which we do not need the coordinates, and then the generators of > rotations about each axis, for which > we _do_ need the coordinates, since we need to know how much each point > moves if you rotate about some center. > > Does that make sense? > > Thanks, > > Matt > > > >> Thanks, >> >> Jordi >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Tue Dec 5 10:09:27 2023 From: mfadams at lbl.gov (Mark Adams) Date: Tue, 5 Dec 2023 11:09:27 -0500 Subject: [petsc-users] Help for MatNullSpaceCreateRigidBody In-Reply-To: References: <97c05c18-214b-49f6-85cf-c8f512a3b047@monash.edu> <95cc9291-9dc9-4499-a84b-7e4952a46b43@monash.edu> Message-ID: I would suggest (excuse me if I missed something): *** Test a simple Jacobi solver in serial and parallel and verify that the convergence history (ksp_monitor) are identical to round-off error *** Test GAMG, serial and parallel, without MatNullSpaceCreateRigidBody and verify that the convergence is close, say well within 20% in the number of iterations *** Next you can get the null space vectors (v_i) and compute q = A*v_i and verify that each q is zero except for the BCs. - You could remove the BCs from A, temporarily, and the q should have norm machine epsilon to make this test simpler. - No need to solve this no-BC A solve. Mark On Tue, Dec 5, 2023 at 8:46?AM Matthew Knepley wrote: > On Tue, Dec 5, 2023 at 7:57?AM Jordi Manyer Fuertes < > jordi.manyer at monash.edu> wrote: > >> Thanks for the prompt response. Both answers look like what I'm doing. 
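A small sketch of the third check above (apply the operator to the rigid-body vectors and look at the resulting norms), assuming the near-null space has already been attached to the matrix A as in the earlier sketch; illustrative code, not from the thread:

/* Sketch: verify that A * v_i is (nearly) zero away from the boundary conditions
 * for each vector v_i of the attached near-null space. */
MatNullSpace nsp;
PetscBool    has_const;
PetscInt     nvec;
const Vec   *vecs;
Vec          q;
PetscReal    norm;

PetscCall(MatGetNearNullSpace(A, &nsp));
PetscCall(MatNullSpaceGetVecs(nsp, &has_const, &nvec, &vecs));
PetscCall(MatCreateVecs(A, NULL, &q));
for (PetscInt i = 0; i < nvec; i++) {
  PetscCall(MatMult(A, vecs[i], q));
  PetscCall(VecNorm(q, NORM_2, &norm));
  PetscCall(PetscPrintf(PETSC_COMM_WORLD, "|| A * v_%" PetscInt_FMT " || = %g\n", i, (double)norm));
}
PetscCall(VecDestroy(&q));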
>> >> After playing a bit more with solver, I managed to make it run in >> parallel with different boundary conditions (full dirichlet bcs, vs mixed >> newmann + dirichlet). This raises two questions: >> >> - How relevant are boundary conditions (eliminating dirichlet rows/cols >> vs weak newmann bcs) to the solver? Should I modify something when changing >> boundary conditions? >> >> The rigid body kernel is independent of boundary conditions, and is only > really important for the coarse grids. However, it is really easy to ruin a > solve with inconsistent boundary conditions, or with conditions which cause > a singularity at a change point. > >> - Also, the solver did well with the old bcs when run in a single >> processor (but not in parallel). This seems odd, since parallel and serial >> behavior should be consistent (or not?). Could it be fault of the PCGAMG? >> > This is unlikely. We have many parallel tests of elasticity (SNES ex17, > ex56, ex77, etc). We do not see problems. It seems more likely that the > system might not be assembled correctly in parallel. Did you check that the > matrices match? > >> I believe the default local solver is ILU, shoud I be changing it to LU >> or something else for these kind of problems? >> > Do you mean the smoother for AMG? No, the default is Chebyshev/Jacobi, > which is the same in parallel. > > Thanks, > > Matt > > Thank you both again, >> >> Jordi >> >> >> On 5/12/23 04:46, Matthew Knepley wrote: >> >> On Mon, Dec 4, 2023 at 12:01?PM Jordi Manyer Fuertes via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> >>> Dear PETSc users/developpers, >>> >>> I am currently trying to use the method `MatNullSpaceCreateRigidBody` >>> together with `PCGAMG` to efficiently precondition an elasticity solver >>> in 2D/3D. >>> >>> I have managed to make it work in serial (or with 1 MPI rank) with >>> h-independent number of iterations (which is great), but the solver >>> diverges in parallel. >>> >>> I assume it has to do with the coordinate vector I am building the >>> null-space with not being correctly setup. The documentation is not that >>> clear on which nodes exactly have to be set in each partition. Does it >>> require nodes corresponding to owned dofs, or all dofs in each partition >>> (owned+ghost)? What ghost layout should the `Vec` have? >>> >>> Any other tips about what I might be doing wrong? >>> >> >> What we assume is that you have some elastic problem formulated in primal >> unknowns (displacements) so that the solution vector looks like this: >> >> [ d^0_x d^0_y d^0_z d^1_x ..... ] >> >> or whatever spatial dimension you have. We expect to get a global vector >> that looks like that, but instead >> of displacements, we get the coordinates that each displacement >> corresponds to. We make the generators of translations: >> >> [ 1 0 0 1 0 0 1 0 0 1 0 0... ] >> [ 0 1 0 0 1 0 0 1 0 0 1 0... ] >> [ 0 0 1 0 0 1 0 0 1 0 0 1... ] >> >> for which we do not need the coordinates, and then the generators of >> rotations about each axis, for which >> we _do_ need the coordinates, since we need to know how much each point >> moves if you rotate about some center. >> >> Does that make sense? >> >> Thanks, >> >> Matt >> >> >> >>> Thanks, >>> >>> Jordi >>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. 
>> -- Norbert Wiener
>>
>> https://www.cse.buffalo.edu/~knepley/
>
> --
> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From alexlindsay239 at gmail.com Tue Dec 5 12:58:54 2023
From: alexlindsay239 at gmail.com (Alexander Lindsay)
Date: Tue, 5 Dec 2023 10:58:54 -0800
Subject: [petsc-users] Pre-check before each line search function evaluation
In-Reply-To: References: <80C2752C-1202-4F86-B3A8-FEA0EBC3833B@petsc.dev>
Message-ID:

Thanks Matt. For the immediate present I will probably use a basic line search with a precheck, but if I want true line searches in the future I will pursue option 2.

On Thu, Nov 30, 2023 at 2:27 PM Matthew Knepley wrote:
> On Thu, Nov 30, 2023 at 5:08 PM Alexander Lindsay <alexlindsay239 at gmail.com> wrote:
>
>> Hi Matt, your derivation is spot on. However, the problem is not linear, which is why I am using SNES. So you need to replace
>>
>> U = A^{-1} f - A^{-1} B L
>>
>> with
>>
>> dU = A^{-1} f - A^{-1} B dL
>
> I see two cases:
>
> 1) There is an easy nonlinear elimination for U. In this case, you do this to get U_1.
>
> 2) There is only a linear elimination. In this case, I don't think the nonlinear system should be phrased only on L, but rather on (U, L) itself. The linear elimination can be used as an excellent preconditioner for the Newton system.
>
> Thanks,
>
> Matt
>
>> On Thu, Nov 30, 2023 at 1:47 PM Matthew Knepley wrote:
>>
>>> On Thu, Nov 30, 2023 at 4:23 PM Alexander Lindsay <alexlindsay239 at gmail.com> wrote:
>>>
>>>> If someone passes me just L, where L represents the "global" degrees of freedom, in this case they represent unknowns on the trace of the mesh, this is insufficient information for me to evaluate my function. Because in truth my degrees of freedom are the sum of the trace unknowns (the unknowns in the global solution vector) and the eliminated unknowns which are entirely local to each element. So I will say my dofs are L + U.
>>>
>>> I want to try and reduce this to the simplest possible thing so that I can understand. We have some system which has two parts to the solution, L and U. If this problem is linear, we have
>>>
>>> / A B \ / U \ = / f \
>>> \ C D / \ L /   \ g /
>>>
>>> and we assume that A is easily invertible, so that
>>>
>>> U + A^{-1} B L = A^{-1} f
>>> U = A^{-1} f - A^{-1} B L
>>>
>>> C U + D L = g
>>> C (A^{-1} f - A^{-1} B L) + D L = g
>>> (D - C A^{-1} B) L = g - C A^{-1} f
>>>
>>> where I have reproduced the Schur complement derivation. Here, given any L, I can construct the corresponding U by inverting A. I know your system may be different, but if you are only solving for L, it should have this property I think.
>>>
>>> Thus, if the line search generates a new L, say L_1, I should be able to get U_1 by just plugging in. If this is not so, can you write out the equations so we can see why this is not true?
>>>
>>> Thanks,
>>>
>>> Matt
>>>
>>>> I start with some initial guess L0 and U0. I perform a finite element assembly procedure on each element which gives me things like K_LL, K_UL, K_LU, K_UU, F_U, and F_L.
I can do some math: >>>> >>>> K_LL = -K_LU * K_UU^-1 * K_UL + K_LL >>>> F_L = -K_LU * K_UU^-1 * F_U + F_L >>>> >>>> And then I feed K_LL and F_L into the global system matrix and vector >>>> respectively. I do something (like a linear solve) which gives me an >>>> increment to L, I'll call it dL. I loop back through and do a finite >>>> element assembly again using **L0 and U0** (or one could in theory save off >>>> the previous assemblies) to once again obtain the same K_LL, K_UL, K_LU, >>>> K_UU, F_U, F_L. And now I can compute the increment for U, dU, according to >>>> >>>> dU = K_UU^-1 * (-F_U - K_UL * dL) >>>> >>>> Armed now with both dL and dU, I am ready to perform a new residual >>>> evaluation with (L0 + dL, U0 + dU) = (L1, U1). >>>> >>>> The key part is that I cannot get U1 (or more generally an arbitrary U) >>>> just given L1 (or more generally an arbitrary L). In order to get U1, I >>>> must know both L0 and dL (and U0 of course). This is because at its core U >>>> is not some auxiliary vector; it represents true degrees of freedom. >>>> >>>> On Thu, Nov 30, 2023 at 12:32?PM Barry Smith wrote: >>>> >>>>> >>>>> Why is this all not part of the function evaluation? >>>>> >>>>> >>>>> > On Nov 30, 2023, at 3:25?PM, Alexander Lindsay < >>>>> alexlindsay239 at gmail.com> wrote: >>>>> > >>>>> > Hi I'm looking at the sources, and I believe the answer is no, but >>>>> is there a dedicated callback that is akin to SNESLineSearchPrecheck but is >>>>> called before *each* function evaluation in a line search method? I am >>>>> using a Hybridized Discontinuous Galerkin method in which most of the >>>>> degrees of freedom are eliminated from the global system. However, an >>>>> accurate function evaluation requires that an update to the "global" dofs >>>>> also trigger an update to the eliminated dofs. >>>>> > >>>>> > I can almost certainly do this manually but I believe it would be >>>>> more prone to error than a dedicated callback. >>>>> >>>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jeremy.theler-ext at ansys.com Tue Dec 5 14:22:25 2023 From: jeremy.theler-ext at ansys.com (Jeremy Theler (External)) Date: Tue, 5 Dec 2023 20:22:25 +0000 Subject: [petsc-users] Help for MatNullSpaceCreateRigidBody In-Reply-To: <95cc9291-9dc9-4499-a84b-7e4952a46b43@monash.edu> References: <97c05c18-214b-49f6-85cf-c8f512a3b047@monash.edu> <95cc9291-9dc9-4499-a84b-7e4952a46b43@monash.edu> Message-ID: just in case it helps, here's one way I have to create the near nullspace: https://github.com/seamplex/feenox/blob/main/src/pdes/mechanical/init.c#L468 -- jeremy ________________________________ From: petsc-users on behalf of Jordi Manyer Fuertes via petsc-users Sent: Tuesday, December 5, 2023 9:57 AM To: Matthew Knepley ; bsmith at petsc.dev Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Help for MatNullSpaceCreateRigidBody [External Sender] Thanks for the prompt response. Both answers look like what I'm doing. 
After playing a bit more with solver, I managed to make it run in parallel with different boundary conditions (full dirichlet bcs, vs mixed newmann + dirichlet). This raises two questions: - How relevant are boundary conditions (eliminating dirichlet rows/cols vs weak newmann bcs) to the solver? Should I modify something when changing boundary conditions? - Also, the solver did well with the old bcs when run in a single processor (but not in parallel). This seems odd, since parallel and serial behavior should be consistent (or not?). Could it be fault of the PCGAMG? I believe the default local solver is ILU, shoud I be changing it to LU or something else for these kind of problems? Thank you both again, Jordi -------------- next part -------------- An HTML attachment was scrubbed... URL: From srvenkat at utexas.edu Tue Dec 5 17:15:37 2023 From: srvenkat at utexas.edu (Sreeram R Venkat) Date: Tue, 5 Dec 2023 17:15:37 -0600 Subject: [petsc-users] Scattering a vector to/from a subset of processors In-Reply-To: References: Message-ID: Hi, I have a follow up question on this. Now, I'm trying to do a scatter and permutation of the vector. Under the same setup as the original example, here are the new Start and Finish states I want to achieve: Start Finish Proc | Entries Proc | Entries 0 | 0,...,8 0 | 0, 12, 24 1 | 9,...,17 1 | 1, 13, 25 2 | 18,...,26 2 | 2, 14, 26 3 | 27,...,35 3 | 3, 15, 27 4 | None 4 | 4, 16, 28 5 | None 5 | 5, 17, 29 6 | None 6 | 6, 18, 30 7 | None 7 | 7, 19, 31 8 | None 8 | 8, 20, 32 9 | None 9 | 9, 21, 33 10 | None 10 | 10, 22, 34 11 | None 11 | 11, 23, 35 So far, I've tried to use ISCreateGeneral(), with each process giving an idx array corresponding to the indices it wants (i.e. idx on P0 looks like [0,12,24] P1 -> [1,13, 25], and so on). Then I use this to create the VecScatter with VecScatterCreate(x, is, y, NULL, &scatter). However, when I try to do the scatter, I get some illegal memory access errors. Is there something wrong with how I define the index sets? Thanks, Sreeram On Thu, Oct 5, 2023 at 12:57?PM Sreeram R Venkat wrote: > Thank you. This works for me. > > Sreeram > > On Wed, Oct 4, 2023 at 6:41?PM Junchao Zhang > wrote: > >> Hi, Sreeram, >> You can try this code. Since x, y are both MPI vectors, we just need to >> say we want to scatter x[0:N] to y[0:N]. The 12 index sets with your >> example on the 12 processes would be [0..8], [9..17], [18..26], [27..35], >> [], ..., []. Actually, you can do it arbitrarily, say, with 12 index sets >> [0..17], [18..35], .., []. PETSc will figure out how to do the >> communication. >> >> PetscInt rstart, rend, N; >> IS ix; >> VecScatter vscat; >> Vec y; >> MPI_Comm comm; >> VecType type; >> >> PetscObjectGetComm((PetscObject)x, &comm); >> VecGetType(x, &type); >> VecGetSize(x, &N); >> VecGetOwnershipRange(x, &rstart, &rend); >> >> VecCreate(comm, &y); >> VecSetSizes(y, PETSC_DECIDE, N); >> VecSetType(y, type); >> >> ISCreateStride(PetscObjectComm((PetscObject)x), rend - rstart, rstart, 1, >> &ix); >> VecScatterCreate(x, ix, y, ix, &vscat); >> >> --Junchao Zhang >> >> >> On Wed, Oct 4, 2023 at 6:03?PM Sreeram R Venkat >> wrote: >> >>> Suppose I am running on 12 processors, and I have a vector "v" of size >>> 36 partitioned over the first 4. v still uses the PETSC_COMM_WORLD, so it >>> has a layout of (9, 9, 9, 9, 0, 0, ..., 0). Now, I would like to >>> repartition it over all 12 processors, so that the layout becomes (3, 3, 3, >>> ..., 3). 
I've been trying to use VecScatter to do this, but I'm not sure >>> what IndexSets to use for the sender and receiver. >>> >>> The result I am trying to achieve is this: >>> >>> Assume the vector is v = <0, 1, 2, ..., 35> >>> >>> Start Finish >>> Proc | Entries Proc | Entries >>> 0 | 0,...,8 0 | 0, 1, 2 >>> 1 | 9,...,17 1 | 3, 4, 5 >>> 2 | 18,...,26 2 | 6, 7, 8 >>> 3 | 27,...,35 3 | 9, 10, 11 >>> 4 | None 4 | 12, 13, 14 >>> 5 | None 5 | 15, 16, 17 >>> 6 | None 6 | 18, 19, 20 >>> 7 | None 7 | 21, 22, 23 >>> 8 | None 8 | 24, 25, 26 >>> 9 | None 9 | 27, 28, 29 >>> 10 | None 10 | 30, 31, 32 >>> 11 | None 11 | 33, 34, 35 >>> >>> Appreciate any help you can provide on this. >>> >>> Thanks, >>> Sreeram >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Tue Dec 5 21:29:59 2023 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Tue, 5 Dec 2023 21:29:59 -0600 Subject: [petsc-users] Scattering a vector to/from a subset of processors In-Reply-To: References: Message-ID: I think your approach is correct. Do you have an example code? --Junchao Zhang On Tue, Dec 5, 2023 at 5:15?PM Sreeram R Venkat wrote: > Hi, I have a follow up question on this. > > Now, I'm trying to do a scatter and permutation of the vector. Under the > same setup as the original example, here are the new Start and Finish > states I want to achieve: > Start Finish > Proc | Entries Proc | Entries > 0 | 0,...,8 0 | 0, 12, 24 > 1 | 9,...,17 1 | 1, 13, 25 > 2 | 18,...,26 2 | 2, 14, 26 > 3 | 27,...,35 3 | 3, 15, 27 > 4 | None 4 | 4, 16, 28 > 5 | None 5 | 5, 17, 29 > 6 | None 6 | 6, 18, 30 > 7 | None 7 | 7, 19, 31 > 8 | None 8 | 8, 20, 32 > 9 | None 9 | 9, 21, 33 > 10 | None 10 | 10, 22, 34 > 11 | None 11 | 11, 23, 35 > > So far, I've tried to use ISCreateGeneral(), with each process giving an > idx array corresponding to the indices it wants (i.e. idx on P0 looks like > [0,12,24] P1 -> [1,13, 25], and so on). > Then I use this to create the VecScatter with VecScatterCreate(x, is, y, > NULL, &scatter). > > However, when I try to do the scatter, I get some illegal memory access > errors. > > Is there something wrong with how I define the index sets? > > Thanks, > Sreeram > > > > > > On Thu, Oct 5, 2023 at 12:57?PM Sreeram R Venkat > wrote: > >> Thank you. This works for me. >> >> Sreeram >> >> On Wed, Oct 4, 2023 at 6:41?PM Junchao Zhang >> wrote: >> >>> Hi, Sreeram, >>> You can try this code. Since x, y are both MPI vectors, we just need to >>> say we want to scatter x[0:N] to y[0:N]. The 12 index sets with your >>> example on the 12 processes would be [0..8], [9..17], [18..26], [27..35], >>> [], ..., []. Actually, you can do it arbitrarily, say, with 12 index sets >>> [0..17], [18..35], .., []. PETSc will figure out how to do the >>> communication. >>> >>> PetscInt rstart, rend, N; >>> IS ix; >>> VecScatter vscat; >>> Vec y; >>> MPI_Comm comm; >>> VecType type; >>> >>> PetscObjectGetComm((PetscObject)x, &comm); >>> VecGetType(x, &type); >>> VecGetSize(x, &N); >>> VecGetOwnershipRange(x, &rstart, &rend); >>> >>> VecCreate(comm, &y); >>> VecSetSizes(y, PETSC_DECIDE, N); >>> VecSetType(y, type); >>> >>> ISCreateStride(PetscObjectComm((PetscObject)x), rend - rstart, rstart, 1, >>> &ix); >>> VecScatterCreate(x, ix, y, ix, &vscat); >>> >>> --Junchao Zhang >>> >>> >>> On Wed, Oct 4, 2023 at 6:03?PM Sreeram R Venkat >>> wrote: >>> >>>> Suppose I am running on 12 processors, and I have a vector "v" of size >>>> 36 partitioned over the first 4. 
v still uses the PETSC_COMM_WORLD, so it >>>> has a layout of (9, 9, 9, 9, 0, 0, ..., 0). Now, I would like to >>>> repartition it over all 12 processors, so that the layout becomes (3, 3, 3, >>>> ..., 3). I've been trying to use VecScatter to do this, but I'm not sure >>>> what IndexSets to use for the sender and receiver. >>>> >>>> The result I am trying to achieve is this: >>>> >>>> Assume the vector is v = <0, 1, 2, ..., 35> >>>> >>>> Start Finish >>>> Proc | Entries Proc | Entries >>>> 0 | 0,...,8 0 | 0, 1, 2 >>>> 1 | 9,...,17 1 | 3, 4, 5 >>>> 2 | 18,...,26 2 | 6, 7, 8 >>>> 3 | 27,...,35 3 | 9, 10, 11 >>>> 4 | None 4 | 12, 13, 14 >>>> 5 | None 5 | 15, 16, 17 >>>> 6 | None 6 | 18, 19, 20 >>>> 7 | None 7 | 21, 22, 23 >>>> 8 | None 8 | 24, 25, 26 >>>> 9 | None 9 | 27, 28, 29 >>>> 10 | None 10 | 30, 31, 32 >>>> 11 | None 11 | 33, 34, 35 >>>> >>>> Appreciate any help you can provide on this. >>>> >>>> Thanks, >>>> Sreeram >>>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From srvenkat at utexas.edu Tue Dec 5 22:09:21 2023 From: srvenkat at utexas.edu (Sreeram R Venkat) Date: Tue, 5 Dec 2023 22:09:21 -0600 Subject: [petsc-users] Scattering a vector to/from a subset of processors In-Reply-To: References: Message-ID: Yes, I have an example code at github.com/s769/petsc-test. Only thing is, when I described the example before, I simplified the actual use case in the code to make things simpler. Here are the extra details relevant to this code: - We assume a 2D processor grid, given by the command-line args -proc_rows and -proc_cols - The total length of the vector is n_m*n_t given by command-line args -nm and -nt. n_m corresponds to a space index and n_t a time index. - In the "Start" phase, the vector is divided into n_m blocks each of size n_t (indexed space->time). The blocks are partitioned over the first row of processors. For example if -nm = 4 and -proc_cols = 4, each processor in the first row will get one block of size n_t. Each processor in the first row gets n_m_local blocks of size n_t, where the sum of all n_m_locals for the first row of processors is n_m. - In the "Finish" phase, the vector is divided into n_t blocks each of size n_m (indexed time->space; this is the reason for the permutation of indices). The blocks are partitioned over all processors. Each processor will get n_t_local blocks of size n_m, where the sum of all n_t_locals for all processors is n_t. I think the basic idea is similar to the previous example, but these details make things a bit more complicated. Please let me know if anything is unclear, and I can try to explain more. Thanks for your help, Sreeram On Tue, Dec 5, 2023 at 9:30?PM Junchao Zhang wrote: > I think your approach is correct. Do you have an example code? > > --Junchao Zhang > > > On Tue, Dec 5, 2023 at 5:15?PM Sreeram R Venkat > wrote: > >> Hi, I have a follow up question on this. >> >> Now, I'm trying to do a scatter and permutation of the vector. 
Under the >> same setup as the original example, here are the new Start and Finish >> states I want to achieve: >> Start Finish >> Proc | Entries Proc | Entries >> 0 | 0,...,8 0 | 0, 12, 24 >> 1 | 9,...,17 1 | 1, 13, 25 >> 2 | 18,...,26 2 | 2, 14, 26 >> 3 | 27,...,35 3 | 3, 15, 27 >> 4 | None 4 | 4, 16, 28 >> 5 | None 5 | 5, 17, 29 >> 6 | None 6 | 6, 18, 30 >> 7 | None 7 | 7, 19, 31 >> 8 | None 8 | 8, 20, 32 >> 9 | None 9 | 9, 21, 33 >> 10 | None 10 | 10, 22, 34 >> 11 | None 11 | 11, 23, 35 >> >> So far, I've tried to use ISCreateGeneral(), with each process giving an >> idx array corresponding to the indices it wants (i.e. idx on P0 looks like >> [0,12,24] P1 -> [1,13, 25], and so on). >> Then I use this to create the VecScatter with VecScatterCreate(x, is, y, >> NULL, &scatter). >> >> However, when I try to do the scatter, I get some illegal memory access >> errors. >> >> Is there something wrong with how I define the index sets? >> >> Thanks, >> Sreeram >> >> >> >> >> >> On Thu, Oct 5, 2023 at 12:57?PM Sreeram R Venkat >> wrote: >> >>> Thank you. This works for me. >>> >>> Sreeram >>> >>> On Wed, Oct 4, 2023 at 6:41?PM Junchao Zhang >>> wrote: >>> >>>> Hi, Sreeram, >>>> You can try this code. Since x, y are both MPI vectors, we just need to >>>> say we want to scatter x[0:N] to y[0:N]. The 12 index sets with your >>>> example on the 12 processes would be [0..8], [9..17], [18..26], [27..35], >>>> [], ..., []. Actually, you can do it arbitrarily, say, with 12 index sets >>>> [0..17], [18..35], .., []. PETSc will figure out how to do the >>>> communication. >>>> >>>> PetscInt rstart, rend, N; >>>> IS ix; >>>> VecScatter vscat; >>>> Vec y; >>>> MPI_Comm comm; >>>> VecType type; >>>> >>>> PetscObjectGetComm((PetscObject)x, &comm); >>>> VecGetType(x, &type); >>>> VecGetSize(x, &N); >>>> VecGetOwnershipRange(x, &rstart, &rend); >>>> >>>> VecCreate(comm, &y); >>>> VecSetSizes(y, PETSC_DECIDE, N); >>>> VecSetType(y, type); >>>> >>>> ISCreateStride(PetscObjectComm((PetscObject)x), rend - rstart, rstart, >>>> 1, &ix); >>>> VecScatterCreate(x, ix, y, ix, &vscat); >>>> >>>> --Junchao Zhang >>>> >>>> >>>> On Wed, Oct 4, 2023 at 6:03?PM Sreeram R Venkat >>>> wrote: >>>> >>>>> Suppose I am running on 12 processors, and I have a vector "v" of size >>>>> 36 partitioned over the first 4. v still uses the PETSC_COMM_WORLD, so it >>>>> has a layout of (9, 9, 9, 9, 0, 0, ..., 0). Now, I would like to >>>>> repartition it over all 12 processors, so that the layout becomes (3, 3, 3, >>>>> ..., 3). I've been trying to use VecScatter to do this, but I'm not sure >>>>> what IndexSets to use for the sender and receiver. >>>>> >>>>> The result I am trying to achieve is this: >>>>> >>>>> Assume the vector is v = <0, 1, 2, ..., 35> >>>>> >>>>> Start Finish >>>>> Proc | Entries Proc | Entries >>>>> 0 | 0,...,8 0 | 0, 1, 2 >>>>> 1 | 9,...,17 1 | 3, 4, 5 >>>>> 2 | 18,...,26 2 | 6, 7, 8 >>>>> 3 | 27,...,35 3 | 9, 10, 11 >>>>> 4 | None 4 | 12, 13, 14 >>>>> 5 | None 5 | 15, 16, 17 >>>>> 6 | None 6 | 18, 19, 20 >>>>> 7 | None 7 | 21, 22, 23 >>>>> 8 | None 8 | 24, 25, 26 >>>>> 9 | None 9 | 27, 28, 29 >>>>> 10 | None 10 | 30, 31, 32 >>>>> 11 | None 11 | 33, 34, 35 >>>>> >>>>> Appreciate any help you can provide on this. >>>>> >>>>> Thanks, >>>>> Sreeram >>>>> >>>> -------------- next part -------------- An HTML attachment was scrubbed... 
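For the space-time layout just described, the finish vector is time-major (global index t*n_m + m) while the start vector is space-major (global index m*n_t + t), so each rank can simply request, for every finish entry it owns, the corresponding source index. A sketch along the lines of the ISCreateGeneral()/VecScatterCreate() approach described above; the names n_m and n_t follow the description in the thread, everything else is illustrative:

/* Sketch: index set for scattering a space-major vector x (n_m blocks of length n_t)
 * into a time-major vector y (n_t blocks of length n_m), with y partitioned over all
 * ranks. Assumes x has been assembled and y created with the desired layout. */
PetscInt   rstart, rend, j = 0, *idx;
IS         ix;
VecScatter vscat;

PetscCall(VecGetOwnershipRange(y, &rstart, &rend));
PetscCall(PetscMalloc1(rend - rstart, &idx));
for (PetscInt g = rstart; g < rend; g++) { /* g = t*n_m + m in the finish (time-major) ordering */
  PetscInt t = g / n_m, m = g % n_m;
  idx[j++] = m * n_t + t;                  /* location of that entry in the start (space-major) vector x */
}
PetscCall(ISCreateGeneral(PETSC_COMM_WORLD, rend - rstart, idx, PETSC_OWN_POINTER, &ix));
PetscCall(VecScatterCreate(x, ix, y, NULL, &vscat));
PetscCall(VecScatterBegin(vscat, x, y, INSERT_VALUES, SCATTER_FORWARD));
PetscCall(VecScatterEnd(vscat, x, y, INSERT_VALUES, SCATTER_FORWARD));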
URL: From junchao.zhang at gmail.com Wed Dec 6 11:12:04 2023 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Wed, 6 Dec 2023 11:12:04 -0600 Subject: [petsc-users] Scattering a vector to/from a subset of processors In-Reply-To: References: Message-ID: Hi, Sreeram, I made an example with your approach. It worked fine as you see the output at the end #include int main(int argc, char **argv) { PetscInt i, j, rstart, rend, n, N, *indices; PetscMPIInt size, rank; IS ix; VecScatter vscat; Vec x, y; PetscFunctionBeginUser; PetscCall(PetscInitialize(&argc, &argv, (char *)0, NULL)); PetscCallMPI(MPI_Comm_size(PETSC_COMM_WORLD, &size)); PetscCallMPI(MPI_Comm_rank(PETSC_COMM_WORLD, &rank)); PetscCall(VecCreate(PETSC_COMM_WORLD, &x)); PetscCall(VecSetFromOptions(x)); PetscCall(PetscObjectSetName((PetscObject)x, "Vec X")); n = (rank < 4) ? 9 : 0; PetscCall(VecSetSizes(x, n, PETSC_DECIDE)); PetscCall(VecGetOwnershipRange(x, &rstart, &rend)); for (i = rstart; i < rend; i++) PetscCall(VecSetValue(x, i, (PetscScalar)i, INSERT_VALUES)); PetscCall(VecAssemblyBegin(x)); PetscCall(VecAssemblyEnd(x)); PetscCall(VecGetSize(x, &N)); PetscCall(VecCreate(PETSC_COMM_WORLD, &y)); PetscCall(VecSetFromOptions(y)); PetscCall(PetscObjectSetName((PetscObject)y, "Vec Y")); PetscCall(VecSetSizes(y, PETSC_DECIDE, N)); PetscCall(VecGetOwnershipRange(y, &rstart, &rend)); PetscCall(PetscMalloc1(rend - rstart, &indices)); for (i = rstart, j = 0; i < rend; i++, j++) indices[j] = rank + size * j; PetscCall(ISCreateGeneral(PETSC_COMM_WORLD, rend - rstart, indices, PETSC_OWN_POINTER, &ix)); PetscCall(VecScatterCreate(x, ix, y, NULL, &vscat)); PetscCall(VecScatterBegin(vscat, x, y, INSERT_VALUES, SCATTER_FORWARD)); PetscCall(VecScatterEnd(vscat, x, y, INSERT_VALUES, SCATTER_FORWARD)); PetscCall(ISView(ix, PETSC_VIEWER_STDOUT_WORLD)); PetscCall(VecView(x, PETSC_VIEWER_STDERR_WORLD)); PetscCall(VecView(y, PETSC_VIEWER_STDERR_WORLD)); PetscCall(VecScatterDestroy(&vscat)); PetscCall(ISDestroy(&ix)); PetscCall(VecDestroy(&x)); PetscCall(VecDestroy(&y)); PetscCall(PetscFinalize()); return 0; } $ mpirun -n 12 ./ex100 IS Object: 12 MPI processes type: general [0] Number of indices in set 3 [0] 0 0 [0] 1 12 [0] 2 24 [1] Number of indices in set 3 [1] 0 1 [1] 1 13 [1] 2 25 [2] Number of indices in set 3 [2] 0 2 [2] 1 14 [2] 2 26 [3] Number of indices in set 3 [3] 0 3 [3] 1 15 [3] 2 27 [4] Number of indices in set 3 [4] 0 4 [4] 1 16 [4] 2 28 [5] Number of indices in set 3 [5] 0 5 [5] 1 17 [5] 2 29 [6] Number of indices in set 3 [6] 0 6 [6] 1 18 [6] 2 30 [7] Number of indices in set 3 [7] 0 7 [7] 1 19 [7] 2 31 [8] Number of indices in set 3 [8] 0 8 [8] 1 20 [8] 2 32 [9] Number of indices in set 3 [9] 0 9 [9] 1 21 [9] 2 33 [10] Number of indices in set 3 [10] 0 10 [10] 1 22 [10] 2 34 [11] Number of indices in set 3 [11] 0 11 [11] 1 23 [11] 2 35 Vec Object: Vec X 12 MPI processes type: mpi Process [0] 0. 1. 2. 3. 4. 5. 6. 7. 8. Process [1] 9. 10. 11. 12. 13. 14. 15. 16. 17. Process [2] 18. 19. 20. 21. 22. 23. 24. 25. 26. Process [3] 27. 28. 29. 30. 31. 32. 33. 34. 35. Process [4] Process [5] Process [6] Process [7] Process [8] Process [9] Process [10] Process [11] Vec Object: Vec Y 12 MPI processes type: mpi Process [0] 0. 12. 24. Process [1] 1. 13. 25. Process [2] 2. 14. 26. Process [3] 3. 15. 27. Process [4] 4. 16. 28. Process [5] 5. 17. 29. Process [6] 6. 18. 30. Process [7] 7. 19. 31. Process [8] 8. 20. 32. Process [9] 9. 21. 33. Process [10] 10. 22. 34. Process [11] 11. 23. 35. 
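A usage note on the example above: the VecAssemblyBegin()/VecAssemblyEnd() pair before the scatter matters. Further down the thread, a missing assembly of x is exactly what caused the reported illegal memory access. The required ordering, reusing the names from the example (a sketch, not new API):

/* values inserted with VecSetValue()/VecSetValues() only become visible after assembly */
PetscCall(VecSetValue(x, i, (PetscScalar)i, INSERT_VALUES));
PetscCall(VecAssemblyBegin(x));
PetscCall(VecAssemblyEnd(x));
/* scatter only after x has been assembled */
PetscCall(VecScatterBegin(vscat, x, y, INSERT_VALUES, SCATTER_FORWARD));
PetscCall(VecScatterEnd(vscat, x, y, INSERT_VALUES, SCATTER_FORWARD));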
--Junchao Zhang On Tue, Dec 5, 2023 at 10:09?PM Sreeram R Venkat wrote: > Yes, I have an example code at github.com/s769/petsc-test. Only thing is, > when I described the example before, I simplified the actual use case in > the code to make things simpler. Here are the extra details relevant to > this code: > > - We assume a 2D processor grid, given by the command-line args > -proc_rows and -proc_cols > - The total length of the vector is n_m*n_t given by command-line args > -nm and -nt. n_m corresponds to a space index and n_t a time index. > - In the "Start" phase, the vector is divided into n_m blocks each of > size n_t (indexed space->time). The blocks are partitioned over the first > row of processors. For example if -nm = 4 and -proc_cols = 4, each > processor in the first row will get one block of size n_t. Each processor > in the first row gets n_m_local blocks of size n_t, where the sum of all > n_m_locals for the first row of processors is n_m. > - In the "Finish" phase, the vector is divided into n_t blocks each of > size n_m (indexed time->space; this is the reason for the permutation of > indices). The blocks are partitioned over all processors. Each processor > will get n_t_local blocks of size n_m, where the sum of all n_t_locals for > all processors is n_t. > > I think the basic idea is similar to the previous example, but these > details make things a bit more complicated. Please let me know if anything > is unclear, and I can try to explain more. > > Thanks for your help, > Sreeram > > On Tue, Dec 5, 2023 at 9:30?PM Junchao Zhang > wrote: > >> I think your approach is correct. Do you have an example code? >> >> --Junchao Zhang >> >> >> On Tue, Dec 5, 2023 at 5:15?PM Sreeram R Venkat >> wrote: >> >>> Hi, I have a follow up question on this. >>> >>> Now, I'm trying to do a scatter and permutation of the vector. Under the >>> same setup as the original example, here are the new Start and Finish >>> states I want to achieve: >>> Start Finish >>> Proc | Entries Proc | Entries >>> 0 | 0,...,8 0 | 0, 12, 24 >>> 1 | 9,...,17 1 | 1, 13, 25 >>> 2 | 18,...,26 2 | 2, 14, 26 >>> 3 | 27,...,35 3 | 3, 15, 27 >>> 4 | None 4 | 4, 16, 28 >>> 5 | None 5 | 5, 17, 29 >>> 6 | None 6 | 6, 18, 30 >>> 7 | None 7 | 7, 19, 31 >>> 8 | None 8 | 8, 20, 32 >>> 9 | None 9 | 9, 21, 33 >>> 10 | None 10 | 10, 22, 34 >>> 11 | None 11 | 11, 23, 35 >>> >>> So far, I've tried to use ISCreateGeneral(), with each process giving an >>> idx array corresponding to the indices it wants (i.e. idx on P0 looks like >>> [0,12,24] P1 -> [1,13, 25], and so on). >>> Then I use this to create the VecScatter with VecScatterCreate(x, is, y, >>> NULL, &scatter). >>> >>> However, when I try to do the scatter, I get some illegal memory access >>> errors. >>> >>> Is there something wrong with how I define the index sets? >>> >>> Thanks, >>> Sreeram >>> >>> >>> >>> >>> >>> On Thu, Oct 5, 2023 at 12:57?PM Sreeram R Venkat >>> wrote: >>> >>>> Thank you. This works for me. >>>> >>>> Sreeram >>>> >>>> On Wed, Oct 4, 2023 at 6:41?PM Junchao Zhang >>>> wrote: >>>> >>>>> Hi, Sreeram, >>>>> You can try this code. Since x, y are both MPI vectors, we just need >>>>> to say we want to scatter x[0:N] to y[0:N]. The 12 index sets with your >>>>> example on the 12 processes would be [0..8], [9..17], [18..26], [27..35], >>>>> [], ..., []. Actually, you can do it arbitrarily, say, with 12 index sets >>>>> [0..17], [18..35], .., []. PETSc will figure out how to do the >>>>> communication. 
>>>>> >>>>> PetscInt rstart, rend, N; >>>>> IS ix; >>>>> VecScatter vscat; >>>>> Vec y; >>>>> MPI_Comm comm; >>>>> VecType type; >>>>> >>>>> PetscObjectGetComm((PetscObject)x, &comm); >>>>> VecGetType(x, &type); >>>>> VecGetSize(x, &N); >>>>> VecGetOwnershipRange(x, &rstart, &rend); >>>>> >>>>> VecCreate(comm, &y); >>>>> VecSetSizes(y, PETSC_DECIDE, N); >>>>> VecSetType(y, type); >>>>> >>>>> ISCreateStride(PetscObjectComm((PetscObject)x), rend - rstart, rstart, >>>>> 1, &ix); >>>>> VecScatterCreate(x, ix, y, ix, &vscat); >>>>> >>>>> --Junchao Zhang >>>>> >>>>> >>>>> On Wed, Oct 4, 2023 at 6:03?PM Sreeram R Venkat >>>>> wrote: >>>>> >>>>>> Suppose I am running on 12 processors, and I have a vector "v" of >>>>>> size 36 partitioned over the first 4. v still uses the PETSC_COMM_WORLD, so >>>>>> it has a layout of (9, 9, 9, 9, 0, 0, ..., 0). Now, I would like to >>>>>> repartition it over all 12 processors, so that the layout becomes (3, 3, 3, >>>>>> ..., 3). I've been trying to use VecScatter to do this, but I'm not sure >>>>>> what IndexSets to use for the sender and receiver. >>>>>> >>>>>> The result I am trying to achieve is this: >>>>>> >>>>>> Assume the vector is v = <0, 1, 2, ..., 35> >>>>>> >>>>>> Start Finish >>>>>> Proc | Entries Proc | Entries >>>>>> 0 | 0,...,8 0 | 0, 1, 2 >>>>>> 1 | 9,...,17 1 | 3, 4, 5 >>>>>> 2 | 18,...,26 2 | 6, 7, 8 >>>>>> 3 | 27,...,35 3 | 9, 10, 11 >>>>>> 4 | None 4 | 12, 13, 14 >>>>>> 5 | None 5 | 15, 16, 17 >>>>>> 6 | None 6 | 18, 19, 20 >>>>>> 7 | None 7 | 21, 22, 23 >>>>>> 8 | None 8 | 24, 25, 26 >>>>>> 9 | None 9 | 27, 28, 29 >>>>>> 10 | None 10 | 30, 31, 32 >>>>>> 11 | None 11 | 33, 34, 35 >>>>>> >>>>>> Appreciate any help you can provide on this. >>>>>> >>>>>> Thanks, >>>>>> Sreeram >>>>>> >>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From srvenkat at utexas.edu Wed Dec 6 13:20:45 2023 From: srvenkat at utexas.edu (Sreeram R Venkat) Date: Wed, 6 Dec 2023 13:20:45 -0600 Subject: [petsc-users] Scattering a vector to/from a subset of processors In-Reply-To: References: Message-ID: Thank you for your help. It turned out the problem was that I forgot to assemble the "x" vector before doing the scatter. It seems to be working now. Thanks, Sreeram On Wed, Dec 6, 2023 at 11:12?AM Junchao Zhang wrote: > Hi, Sreeram, > I made an example with your approach. It worked fine as you see the > output at the end > > #include > int main(int argc, char **argv) > { > PetscInt i, j, rstart, rend, n, N, *indices; > PetscMPIInt size, rank; > IS ix; > VecScatter vscat; > Vec x, y; > > PetscFunctionBeginUser; > PetscCall(PetscInitialize(&argc, &argv, (char *)0, NULL)); > PetscCallMPI(MPI_Comm_size(PETSC_COMM_WORLD, &size)); > PetscCallMPI(MPI_Comm_rank(PETSC_COMM_WORLD, &rank)); > > PetscCall(VecCreate(PETSC_COMM_WORLD, &x)); > PetscCall(VecSetFromOptions(x)); > PetscCall(PetscObjectSetName((PetscObject)x, "Vec X")); > n = (rank < 4) ? 
9 : 0; > PetscCall(VecSetSizes(x, n, PETSC_DECIDE)); > > PetscCall(VecGetOwnershipRange(x, &rstart, &rend)); > for (i = rstart; i < rend; i++) PetscCall(VecSetValue(x, i, > (PetscScalar)i, INSERT_VALUES)); > PetscCall(VecAssemblyBegin(x)); > PetscCall(VecAssemblyEnd(x)); > PetscCall(VecGetSize(x, &N)); > > PetscCall(VecCreate(PETSC_COMM_WORLD, &y)); > PetscCall(VecSetFromOptions(y)); > PetscCall(PetscObjectSetName((PetscObject)y, "Vec Y")); > PetscCall(VecSetSizes(y, PETSC_DECIDE, N)); > > PetscCall(VecGetOwnershipRange(y, &rstart, &rend)); > PetscCall(PetscMalloc1(rend - rstart, &indices)); > for (i = rstart, j = 0; i < rend; i++, j++) indices[j] = rank + size * j; > > PetscCall(ISCreateGeneral(PETSC_COMM_WORLD, rend - rstart, indices, > PETSC_OWN_POINTER, &ix)); > PetscCall(VecScatterCreate(x, ix, y, NULL, &vscat)); > > PetscCall(VecScatterBegin(vscat, x, y, INSERT_VALUES, SCATTER_FORWARD)); > PetscCall(VecScatterEnd(vscat, x, y, INSERT_VALUES, SCATTER_FORWARD)); > > PetscCall(ISView(ix, PETSC_VIEWER_STDOUT_WORLD)); > PetscCall(VecView(x, PETSC_VIEWER_STDERR_WORLD)); > PetscCall(VecView(y, PETSC_VIEWER_STDERR_WORLD)); > > PetscCall(VecScatterDestroy(&vscat)); > PetscCall(ISDestroy(&ix)); > PetscCall(VecDestroy(&x)); > PetscCall(VecDestroy(&y)); > PetscCall(PetscFinalize()); > return 0; > } > > $ mpirun -n 12 ./ex100 > IS Object: 12 MPI processes > type: general > [0] Number of indices in set 3 > [0] 0 0 > [0] 1 12 > [0] 2 24 > [1] Number of indices in set 3 > [1] 0 1 > [1] 1 13 > [1] 2 25 > [2] Number of indices in set 3 > [2] 0 2 > [2] 1 14 > [2] 2 26 > [3] Number of indices in set 3 > [3] 0 3 > [3] 1 15 > [3] 2 27 > [4] Number of indices in set 3 > [4] 0 4 > [4] 1 16 > [4] 2 28 > [5] Number of indices in set 3 > [5] 0 5 > [5] 1 17 > [5] 2 29 > [6] Number of indices in set 3 > [6] 0 6 > [6] 1 18 > [6] 2 30 > [7] Number of indices in set 3 > [7] 0 7 > [7] 1 19 > [7] 2 31 > [8] Number of indices in set 3 > [8] 0 8 > [8] 1 20 > [8] 2 32 > [9] Number of indices in set 3 > [9] 0 9 > [9] 1 21 > [9] 2 33 > [10] Number of indices in set 3 > [10] 0 10 > [10] 1 22 > [10] 2 34 > [11] Number of indices in set 3 > [11] 0 11 > [11] 1 23 > [11] 2 35 > Vec Object: Vec X 12 MPI processes > type: mpi > Process [0] > 0. > 1. > 2. > 3. > 4. > 5. > 6. > 7. > 8. > Process [1] > 9. > 10. > 11. > 12. > 13. > 14. > 15. > 16. > 17. > Process [2] > 18. > 19. > 20. > 21. > 22. > 23. > 24. > 25. > 26. > Process [3] > 27. > 28. > 29. > 30. > 31. > 32. > 33. > 34. > 35. > Process [4] > Process [5] > Process [6] > Process [7] > Process [8] > Process [9] > Process [10] > Process [11] > Vec Object: Vec Y 12 MPI processes > type: mpi > Process [0] > 0. > 12. > 24. > Process [1] > 1. > 13. > 25. > Process [2] > 2. > 14. > 26. > Process [3] > 3. > 15. > 27. > Process [4] > 4. > 16. > 28. > Process [5] > 5. > 17. > 29. > Process [6] > 6. > 18. > 30. > Process [7] > 7. > 19. > 31. > Process [8] > 8. > 20. > 32. > Process [9] > 9. > 21. > 33. > Process [10] > 10. > 22. > 34. > Process [11] > 11. > 23. > 35. > > --Junchao Zhang > > > On Tue, Dec 5, 2023 at 10:09?PM Sreeram R Venkat > wrote: > >> Yes, I have an example code at github.com/s769/petsc-test. Only thing >> is, when I described the example before, I simplified the actual use case >> in the code to make things simpler. Here are the extra details relevant to >> this code: >> >> - We assume a 2D processor grid, given by the command-line args >> -proc_rows and -proc_cols >> - The total length of the vector is n_m*n_t given by command-line >> args -nm and -nt. 
n_m corresponds to a space index and n_t a time index. >> - In the "Start" phase, the vector is divided into n_m blocks each of >> size n_t (indexed space->time). The blocks are partitioned over the first >> row of processors. For example if -nm = 4 and -proc_cols = 4, each >> processor in the first row will get one block of size n_t. Each processor >> in the first row gets n_m_local blocks of size n_t, where the sum of all >> n_m_locals for the first row of processors is n_m. >> - In the "Finish" phase, the vector is divided into n_t blocks each >> of size n_m (indexed time->space; this is the reason for the permutation of >> indices). The blocks are partitioned over all processors. Each processor >> will get n_t_local blocks of size n_m, where the sum of all n_t_locals for >> all processors is n_t. >> >> I think the basic idea is similar to the previous example, but these >> details make things a bit more complicated. Please let me know if anything >> is unclear, and I can try to explain more. >> >> Thanks for your help, >> Sreeram >> >> On Tue, Dec 5, 2023 at 9:30?PM Junchao Zhang >> wrote: >> >>> I think your approach is correct. Do you have an example code? >>> >>> --Junchao Zhang >>> >>> >>> On Tue, Dec 5, 2023 at 5:15?PM Sreeram R Venkat >>> wrote: >>> >>>> Hi, I have a follow up question on this. >>>> >>>> Now, I'm trying to do a scatter and permutation of the vector. Under >>>> the same setup as the original example, here are the new Start and Finish >>>> states I want to achieve: >>>> Start Finish >>>> Proc | Entries Proc | Entries >>>> 0 | 0,...,8 0 | 0, 12, 24 >>>> 1 | 9,...,17 1 | 1, 13, 25 >>>> 2 | 18,...,26 2 | 2, 14, 26 >>>> 3 | 27,...,35 3 | 3, 15, 27 >>>> 4 | None 4 | 4, 16, 28 >>>> 5 | None 5 | 5, 17, 29 >>>> 6 | None 6 | 6, 18, 30 >>>> 7 | None 7 | 7, 19, 31 >>>> 8 | None 8 | 8, 20, 32 >>>> 9 | None 9 | 9, 21, 33 >>>> 10 | None 10 | 10, 22, 34 >>>> 11 | None 11 | 11, 23, 35 >>>> >>>> So far, I've tried to use ISCreateGeneral(), with each process giving >>>> an idx array corresponding to the indices it wants (i.e. idx on P0 looks >>>> like [0,12,24] P1 -> [1,13, 25], and so on). >>>> Then I use this to create the VecScatter with VecScatterCreate(x, is, >>>> y, NULL, &scatter). >>>> >>>> However, when I try to do the scatter, I get some illegal memory access >>>> errors. >>>> >>>> Is there something wrong with how I define the index sets? >>>> >>>> Thanks, >>>> Sreeram >>>> >>>> >>>> >>>> >>>> >>>> On Thu, Oct 5, 2023 at 12:57?PM Sreeram R Venkat >>>> wrote: >>>> >>>>> Thank you. This works for me. >>>>> >>>>> Sreeram >>>>> >>>>> On Wed, Oct 4, 2023 at 6:41?PM Junchao Zhang >>>>> wrote: >>>>> >>>>>> Hi, Sreeram, >>>>>> You can try this code. Since x, y are both MPI vectors, we just need >>>>>> to say we want to scatter x[0:N] to y[0:N]. The 12 index sets with your >>>>>> example on the 12 processes would be [0..8], [9..17], [18..26], [27..35], >>>>>> [], ..., []. Actually, you can do it arbitrarily, say, with 12 index sets >>>>>> [0..17], [18..35], .., []. PETSc will figure out how to do the >>>>>> communication. 
>>>>>> >>>>>> PetscInt rstart, rend, N; >>>>>> IS ix; >>>>>> VecScatter vscat; >>>>>> Vec y; >>>>>> MPI_Comm comm; >>>>>> VecType type; >>>>>> >>>>>> PetscObjectGetComm((PetscObject)x, &comm); >>>>>> VecGetType(x, &type); >>>>>> VecGetSize(x, &N); >>>>>> VecGetOwnershipRange(x, &rstart, &rend); >>>>>> >>>>>> VecCreate(comm, &y); >>>>>> VecSetSizes(y, PETSC_DECIDE, N); >>>>>> VecSetType(y, type); >>>>>> >>>>>> ISCreateStride(PetscObjectComm((PetscObject)x), rend - rstart, >>>>>> rstart, 1, &ix); >>>>>> VecScatterCreate(x, ix, y, ix, &vscat); >>>>>> >>>>>> --Junchao Zhang >>>>>> >>>>>> >>>>>> On Wed, Oct 4, 2023 at 6:03?PM Sreeram R Venkat >>>>>> wrote: >>>>>> >>>>>>> Suppose I am running on 12 processors, and I have a vector "v" of >>>>>>> size 36 partitioned over the first 4. v still uses the PETSC_COMM_WORLD, so >>>>>>> it has a layout of (9, 9, 9, 9, 0, 0, ..., 0). Now, I would like to >>>>>>> repartition it over all 12 processors, so that the layout becomes (3, 3, 3, >>>>>>> ..., 3). I've been trying to use VecScatter to do this, but I'm not sure >>>>>>> what IndexSets to use for the sender and receiver. >>>>>>> >>>>>>> The result I am trying to achieve is this: >>>>>>> >>>>>>> Assume the vector is v = <0, 1, 2, ..., 35> >>>>>>> >>>>>>> Start Finish >>>>>>> Proc | Entries Proc | Entries >>>>>>> 0 | 0,...,8 0 | 0, 1, 2 >>>>>>> 1 | 9,...,17 1 | 3, 4, 5 >>>>>>> 2 | 18,...,26 2 | 6, 7, 8 >>>>>>> 3 | 27,...,35 3 | 9, 10, 11 >>>>>>> 4 | None 4 | 12, 13, 14 >>>>>>> 5 | None 5 | 15, 16, 17 >>>>>>> 6 | None 6 | 18, 19, 20 >>>>>>> 7 | None 7 | 21, 22, 23 >>>>>>> 8 | None 8 | 24, 25, 26 >>>>>>> 9 | None 9 | 27, 28, 29 >>>>>>> 10 | None 10 | 30, 31, 32 >>>>>>> 11 | None 11 | 33, 34, 35 >>>>>>> >>>>>>> Appreciate any help you can provide on this. >>>>>>> >>>>>>> Thanks, >>>>>>> Sreeram >>>>>>> >>>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Wed Dec 6 14:36:59 2023 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Wed, 6 Dec 2023 14:36:59 -0600 Subject: [petsc-users] Scattering a vector to/from a subset of processors In-Reply-To: References: Message-ID: Glad it worked! --Junchao Zhang On Wed, Dec 6, 2023 at 1:20?PM Sreeram R Venkat wrote: > Thank you for your help. It turned out the problem was that I forgot to > assemble the "x" vector before doing the scatter. It seems to be working > now. > > Thanks, > Sreeram > > On Wed, Dec 6, 2023 at 11:12?AM Junchao Zhang > wrote: > >> Hi, Sreeram, >> I made an example with your approach. It worked fine as you see the >> output at the end >> >> #include >> int main(int argc, char **argv) >> { >> PetscInt i, j, rstart, rend, n, N, *indices; >> PetscMPIInt size, rank; >> IS ix; >> VecScatter vscat; >> Vec x, y; >> >> PetscFunctionBeginUser; >> PetscCall(PetscInitialize(&argc, &argv, (char *)0, NULL)); >> PetscCallMPI(MPI_Comm_size(PETSC_COMM_WORLD, &size)); >> PetscCallMPI(MPI_Comm_rank(PETSC_COMM_WORLD, &rank)); >> >> PetscCall(VecCreate(PETSC_COMM_WORLD, &x)); >> PetscCall(VecSetFromOptions(x)); >> PetscCall(PetscObjectSetName((PetscObject)x, "Vec X")); >> n = (rank < 4) ? 
9 : 0; >> PetscCall(VecSetSizes(x, n, PETSC_DECIDE)); >> >> PetscCall(VecGetOwnershipRange(x, &rstart, &rend)); >> for (i = rstart; i < rend; i++) PetscCall(VecSetValue(x, i, >> (PetscScalar)i, INSERT_VALUES)); >> PetscCall(VecAssemblyBegin(x)); >> PetscCall(VecAssemblyEnd(x)); >> PetscCall(VecGetSize(x, &N)); >> >> PetscCall(VecCreate(PETSC_COMM_WORLD, &y)); >> PetscCall(VecSetFromOptions(y)); >> PetscCall(PetscObjectSetName((PetscObject)y, "Vec Y")); >> PetscCall(VecSetSizes(y, PETSC_DECIDE, N)); >> >> PetscCall(VecGetOwnershipRange(y, &rstart, &rend)); >> PetscCall(PetscMalloc1(rend - rstart, &indices)); >> for (i = rstart, j = 0; i < rend; i++, j++) indices[j] = rank + size * j; >> >> PetscCall(ISCreateGeneral(PETSC_COMM_WORLD, rend - rstart, indices, >> PETSC_OWN_POINTER, &ix)); >> PetscCall(VecScatterCreate(x, ix, y, NULL, &vscat)); >> >> PetscCall(VecScatterBegin(vscat, x, y, INSERT_VALUES, SCATTER_FORWARD)); >> PetscCall(VecScatterEnd(vscat, x, y, INSERT_VALUES, SCATTER_FORWARD)); >> >> PetscCall(ISView(ix, PETSC_VIEWER_STDOUT_WORLD)); >> PetscCall(VecView(x, PETSC_VIEWER_STDERR_WORLD)); >> PetscCall(VecView(y, PETSC_VIEWER_STDERR_WORLD)); >> >> PetscCall(VecScatterDestroy(&vscat)); >> PetscCall(ISDestroy(&ix)); >> PetscCall(VecDestroy(&x)); >> PetscCall(VecDestroy(&y)); >> PetscCall(PetscFinalize()); >> return 0; >> } >> >> $ mpirun -n 12 ./ex100 >> IS Object: 12 MPI processes >> type: general >> [0] Number of indices in set 3 >> [0] 0 0 >> [0] 1 12 >> [0] 2 24 >> [1] Number of indices in set 3 >> [1] 0 1 >> [1] 1 13 >> [1] 2 25 >> [2] Number of indices in set 3 >> [2] 0 2 >> [2] 1 14 >> [2] 2 26 >> [3] Number of indices in set 3 >> [3] 0 3 >> [3] 1 15 >> [3] 2 27 >> [4] Number of indices in set 3 >> [4] 0 4 >> [4] 1 16 >> [4] 2 28 >> [5] Number of indices in set 3 >> [5] 0 5 >> [5] 1 17 >> [5] 2 29 >> [6] Number of indices in set 3 >> [6] 0 6 >> [6] 1 18 >> [6] 2 30 >> [7] Number of indices in set 3 >> [7] 0 7 >> [7] 1 19 >> [7] 2 31 >> [8] Number of indices in set 3 >> [8] 0 8 >> [8] 1 20 >> [8] 2 32 >> [9] Number of indices in set 3 >> [9] 0 9 >> [9] 1 21 >> [9] 2 33 >> [10] Number of indices in set 3 >> [10] 0 10 >> [10] 1 22 >> [10] 2 34 >> [11] Number of indices in set 3 >> [11] 0 11 >> [11] 1 23 >> [11] 2 35 >> Vec Object: Vec X 12 MPI processes >> type: mpi >> Process [0] >> 0. >> 1. >> 2. >> 3. >> 4. >> 5. >> 6. >> 7. >> 8. >> Process [1] >> 9. >> 10. >> 11. >> 12. >> 13. >> 14. >> 15. >> 16. >> 17. >> Process [2] >> 18. >> 19. >> 20. >> 21. >> 22. >> 23. >> 24. >> 25. >> 26. >> Process [3] >> 27. >> 28. >> 29. >> 30. >> 31. >> 32. >> 33. >> 34. >> 35. >> Process [4] >> Process [5] >> Process [6] >> Process [7] >> Process [8] >> Process [9] >> Process [10] >> Process [11] >> Vec Object: Vec Y 12 MPI processes >> type: mpi >> Process [0] >> 0. >> 12. >> 24. >> Process [1] >> 1. >> 13. >> 25. >> Process [2] >> 2. >> 14. >> 26. >> Process [3] >> 3. >> 15. >> 27. >> Process [4] >> 4. >> 16. >> 28. >> Process [5] >> 5. >> 17. >> 29. >> Process [6] >> 6. >> 18. >> 30. >> Process [7] >> 7. >> 19. >> 31. >> Process [8] >> 8. >> 20. >> 32. >> Process [9] >> 9. >> 21. >> 33. >> Process [10] >> 10. >> 22. >> 34. >> Process [11] >> 11. >> 23. >> 35. >> >> --Junchao Zhang >> >> >> On Tue, Dec 5, 2023 at 10:09?PM Sreeram R Venkat >> wrote: >> >>> Yes, I have an example code at github.com/s769/petsc-test. Only thing >>> is, when I described the example before, I simplified the actual use case >>> in the code to make things simpler. 
Here are the extra details relevant to >>> this code: >>> >>> - We assume a 2D processor grid, given by the command-line args >>> -proc_rows and -proc_cols >>> - The total length of the vector is n_m*n_t given by command-line >>> args -nm and -nt. n_m corresponds to a space index and n_t a time index. >>> - In the "Start" phase, the vector is divided into n_m blocks each >>> of size n_t (indexed space->time). The blocks are partitioned over the >>> first row of processors. For example if -nm = 4 and -proc_cols = 4, each >>> processor in the first row will get one block of size n_t. Each processor >>> in the first row gets n_m_local blocks of size n_t, where the sum of all >>> n_m_locals for the first row of processors is n_m. >>> - In the "Finish" phase, the vector is divided into n_t blocks each >>> of size n_m (indexed time->space; this is the reason for the permutation of >>> indices). The blocks are partitioned over all processors. Each processor >>> will get n_t_local blocks of size n_m, where the sum of all n_t_locals for >>> all processors is n_t. >>> >>> I think the basic idea is similar to the previous example, but these >>> details make things a bit more complicated. Please let me know if anything >>> is unclear, and I can try to explain more. >>> >>> Thanks for your help, >>> Sreeram >>> >>> On Tue, Dec 5, 2023 at 9:30?PM Junchao Zhang >>> wrote: >>> >>>> I think your approach is correct. Do you have an example code? >>>> >>>> --Junchao Zhang >>>> >>>> >>>> On Tue, Dec 5, 2023 at 5:15?PM Sreeram R Venkat >>>> wrote: >>>> >>>>> Hi, I have a follow up question on this. >>>>> >>>>> Now, I'm trying to do a scatter and permutation of the vector. Under >>>>> the same setup as the original example, here are the new Start and Finish >>>>> states I want to achieve: >>>>> Start Finish >>>>> Proc | Entries Proc | Entries >>>>> 0 | 0,...,8 0 | 0, 12, 24 >>>>> 1 | 9,...,17 1 | 1, 13, 25 >>>>> 2 | 18,...,26 2 | 2, 14, 26 >>>>> 3 | 27,...,35 3 | 3, 15, 27 >>>>> 4 | None 4 | 4, 16, 28 >>>>> 5 | None 5 | 5, 17, 29 >>>>> 6 | None 6 | 6, 18, 30 >>>>> 7 | None 7 | 7, 19, 31 >>>>> 8 | None 8 | 8, 20, 32 >>>>> 9 | None 9 | 9, 21, 33 >>>>> 10 | None 10 | 10, 22, 34 >>>>> 11 | None 11 | 11, 23, 35 >>>>> >>>>> So far, I've tried to use ISCreateGeneral(), with each process giving >>>>> an idx array corresponding to the indices it wants (i.e. idx on P0 looks >>>>> like [0,12,24] P1 -> [1,13, 25], and so on). >>>>> Then I use this to create the VecScatter with VecScatterCreate(x, is, >>>>> y, NULL, &scatter). >>>>> >>>>> However, when I try to do the scatter, I get some illegal memory >>>>> access errors. >>>>> >>>>> Is there something wrong with how I define the index sets? >>>>> >>>>> Thanks, >>>>> Sreeram >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> On Thu, Oct 5, 2023 at 12:57?PM Sreeram R Venkat >>>>> wrote: >>>>> >>>>>> Thank you. This works for me. >>>>>> >>>>>> Sreeram >>>>>> >>>>>> On Wed, Oct 4, 2023 at 6:41?PM Junchao Zhang >>>>>> wrote: >>>>>> >>>>>>> Hi, Sreeram, >>>>>>> You can try this code. Since x, y are both MPI vectors, we just need >>>>>>> to say we want to scatter x[0:N] to y[0:N]. The 12 index sets with your >>>>>>> example on the 12 processes would be [0..8], [9..17], [18..26], [27..35], >>>>>>> [], ..., []. Actually, you can do it arbitrarily, say, with 12 index sets >>>>>>> [0..17], [18..35], .., []. PETSc will figure out how to do the >>>>>>> communication. 
>>>>>>> >>>>>>> PetscInt rstart, rend, N; >>>>>>> IS ix; >>>>>>> VecScatter vscat; >>>>>>> Vec y; >>>>>>> MPI_Comm comm; >>>>>>> VecType type; >>>>>>> >>>>>>> PetscObjectGetComm((PetscObject)x, &comm); >>>>>>> VecGetType(x, &type); >>>>>>> VecGetSize(x, &N); >>>>>>> VecGetOwnershipRange(x, &rstart, &rend); >>>>>>> >>>>>>> VecCreate(comm, &y); >>>>>>> VecSetSizes(y, PETSC_DECIDE, N); >>>>>>> VecSetType(y, type); >>>>>>> >>>>>>> ISCreateStride(PetscObjectComm((PetscObject)x), rend - rstart, >>>>>>> rstart, 1, &ix); >>>>>>> VecScatterCreate(x, ix, y, ix, &vscat); >>>>>>> >>>>>>> --Junchao Zhang >>>>>>> >>>>>>> >>>>>>> On Wed, Oct 4, 2023 at 6:03?PM Sreeram R Venkat >>>>>>> wrote: >>>>>>> >>>>>>>> Suppose I am running on 12 processors, and I have a vector "v" of >>>>>>>> size 36 partitioned over the first 4. v still uses the PETSC_COMM_WORLD, so >>>>>>>> it has a layout of (9, 9, 9, 9, 0, 0, ..., 0). Now, I would like to >>>>>>>> repartition it over all 12 processors, so that the layout becomes (3, 3, 3, >>>>>>>> ..., 3). I've been trying to use VecScatter to do this, but I'm not sure >>>>>>>> what IndexSets to use for the sender and receiver. >>>>>>>> >>>>>>>> The result I am trying to achieve is this: >>>>>>>> >>>>>>>> Assume the vector is v = <0, 1, 2, ..., 35> >>>>>>>> >>>>>>>> Start Finish >>>>>>>> Proc | Entries Proc | Entries >>>>>>>> 0 | 0,...,8 0 | 0, 1, 2 >>>>>>>> 1 | 9,...,17 1 | 3, 4, 5 >>>>>>>> 2 | 18,...,26 2 | 6, 7, 8 >>>>>>>> 3 | 27,...,35 3 | 9, 10, 11 >>>>>>>> 4 | None 4 | 12, 13, 14 >>>>>>>> 5 | None 5 | 15, 16, 17 >>>>>>>> 6 | None 6 | 18, 19, 20 >>>>>>>> 7 | None 7 | 21, 22, 23 >>>>>>>> 8 | None 8 | 24, 25, 26 >>>>>>>> 9 | None 9 | 27, 28, 29 >>>>>>>> 10 | None 10 | 30, 31, 32 >>>>>>>> 11 | None 11 | 33, 34, 35 >>>>>>>> >>>>>>>> Appreciate any help you can provide on this. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Sreeram >>>>>>>> >>>>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From coltonbryant2021 at u.northwestern.edu Wed Dec 6 16:53:42 2023 From: coltonbryant2021 at u.northwestern.edu (Colton Bryant) Date: Wed, 6 Dec 2023 16:53:42 -0600 Subject: [petsc-users] DMSTAG Gathering Vector on single process Message-ID: Hello, I am working on a code in which a DMSTAG object is used to solve a fluid flow problem and I need to gather this flow data on a single process to interact with an existing (serial) library at each timestep of my simulation. After looking around the solution I've tried is: -use DMStagVecSplitToDMDA to extract vectors of each component of the flow -use DMDACreateNaturalVector and DMDAGlobalToNatural to get the components naturally ordered -use VecScatterCreateToZero to set up and then do the scatter to gather on the single process Unless I'm misunderstanding something this method results in a lot of memory allocation/freeing happening at each step of the evolution and I was wondering if there is a way to directly perform such a scatter from the DMSTAG object without splitting as I'm doing here. Any advice would be much appreciated! Best, Colton Bryant -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Wed Dec 6 17:18:28 2023 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 6 Dec 2023 18:18:28 -0500 Subject: [petsc-users] DMSTAG Gathering Vector on single process In-Reply-To: References: Message-ID: On Wed, Dec 6, 2023 at 5:54?PM Colton Bryant < coltonbryant2021 at u.northwestern.edu> wrote: > Hello, > > I am working on a code in which a DMSTAG object is used to solve a fluid > flow problem and I need to gather this flow data on a single process to > interact with an existing (serial) library at each timestep of my > simulation. After looking around the solution I've tried is: > > -use DMStagVecSplitToDMDA to extract vectors of each component of the flow > -use DMDACreateNaturalVector and DMDAGlobalToNatural to get the components > naturally ordered > -use VecScatterCreateToZero to set up and then do the scatter to gather on > the single process > > Unless I'm misunderstanding something this method results in a lot of > memory allocation/freeing happening at each step of the evolution and I was > wondering if there is a way to directly perform such a scatter from the > DMSTAG object without splitting as I'm doing here. > 1) You can see here: https://petsc.org/main/src/dm/impls/stag/stagda.c.html#DMStagVecSplitToDMDA that this function is small. You can do the DMDA creation manually, and then just call DMStagMigrateVecDMDA() each time, which will not create anything. 2) You can create the natural vector upfront, and just scatter each time. 3) You can create the serial vector upfront, and just scatter each time. This is some data movement. You can compress the g2n and 2zero scatters using https://petsc.org/main/manualpages/PetscSF/PetscSFCompose/ as an optimization. Thanks, Matt > Any advice would be much appreciated! > > Best, > Colton Bryant > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From coltonbryant2021 at u.northwestern.edu Wed Dec 6 17:37:59 2023 From: coltonbryant2021 at u.northwestern.edu (Colton Bryant) Date: Wed, 6 Dec 2023 17:37:59 -0600 Subject: [petsc-users] DMSTAG Gathering Vector on single process In-Reply-To: References: Message-ID: Ah excellent! I was not aware of the ability to preallocate the objects and migrate them each time. Thanks! -Colton On Wed, Dec 6, 2023 at 5:18?PM Matthew Knepley wrote: > On Wed, Dec 6, 2023 at 5:54?PM Colton Bryant < > coltonbryant2021 at u.northwestern.edu> wrote: > >> Hello, >> >> I am working on a code in which a DMSTAG object is used to solve a fluid >> flow problem and I need to gather this flow data on a single process to >> interact with an existing (serial) library at each timestep of my >> simulation. After looking around the solution I've tried is: >> >> -use DMStagVecSplitToDMDA to extract vectors of each component of the flow >> -use DMDACreateNaturalVector and DMDAGlobalToNatural to get the >> components naturally ordered >> -use VecScatterCreateToZero to set up and then do the scatter to gather >> on the single process >> >> Unless I'm misunderstanding something this method results in a lot of >> memory allocation/freeing happening at each step of the evolution and I was >> wondering if there is a way to directly perform such a scatter from the >> DMSTAG object without splitting as I'm doing here. 
>> > > 1) You can see here: > > https://petsc.org/main/src/dm/impls/stag/stagda.c.html#DMStagVecSplitToDMDA > > that this function is small. You can do the DMDA creation manually, and > then just call DMStagMigrateVecDMDA() each time, which will not create > anything. > > 2) You can create the natural vector upfront, and just scatter each time. > > 3) You can create the serial vector upfront, and just scatter each time. > > This is some data movement. You can compress the g2n and 2zero scatters > using > > https://petsc.org/main/manualpages/PetscSF/PetscSFCompose/ > > as an optimization. > > Thanks, > > Matt > > >> Any advice would be much appreciated! >> >> Best, >> Colton Bryant >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Dec 6 17:50:52 2023 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 6 Dec 2023 18:50:52 -0500 Subject: [petsc-users] DMSTAG Gathering Vector on single process In-Reply-To: References: Message-ID: <5A08B1BB-933D-4A46-8369-510D1C5AFDC6@petsc.dev> Depending on the serial library you may not need to split the vector into DMDA vectors with DMStagVecSplitToDMDA() for each component. Just global to natural and scatter to zero on the full vector, now the full vector is on the first rank and you can access what you need in that one vector if possible. > On Dec 6, 2023, at 6:37?PM, Colton Bryant wrote: > > Ah excellent! I was not aware of the ability to preallocate the objects and migrate them each time. > > Thanks! > -Colton > > On Wed, Dec 6, 2023 at 5:18?PM Matthew Knepley > wrote: >> On Wed, Dec 6, 2023 at 5:54?PM Colton Bryant > wrote: >>> Hello, >>> >>> I am working on a code in which a DMSTAG object is used to solve a fluid flow problem and I need to gather this flow data on a single process to interact with an existing (serial) library at each timestep of my simulation. After looking around the solution I've tried is: >>> >>> -use DMStagVecSplitToDMDA to extract vectors of each component of the flow >>> -use DMDACreateNaturalVector and DMDAGlobalToNatural to get the components naturally ordered >>> -use VecScatterCreateToZero to set up and then do the scatter to gather on the single process >>> >>> Unless I'm misunderstanding something this method results in a lot of memory allocation/freeing happening at each step of the evolution and I was wondering if there is a way to directly perform such a scatter from the DMSTAG object without splitting as I'm doing here. >> >> 1) You can see here: >> >> https://petsc.org/main/src/dm/impls/stag/stagda.c.html#DMStagVecSplitToDMDA >> >> that this function is small. You can do the DMDA creation manually, and then just call DMStagMigrateVecDMDA() each time, which will not create anything. >> >> 2) You can create the natural vector upfront, and just scatter each time. >> >> 3) You can create the serial vector upfront, and just scatter each time. >> >> This is some data movement. You can compress the g2n and 2zero scatters using >> >> https://petsc.org/main/manualpages/PetscSF/PetscSFCompose/ >> >> as an optimization. >> >> Thanks, >> >> Matt >> >>> Any advice would be much appreciated! 
>>> >>> Best, >>> Colton Bryant >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Dec 6 19:35:04 2023 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 6 Dec 2023 20:35:04 -0500 Subject: [petsc-users] DMSTAG Gathering Vector on single process In-Reply-To: <5A08B1BB-933D-4A46-8369-510D1C5AFDC6@petsc.dev> References: <5A08B1BB-933D-4A46-8369-510D1C5AFDC6@petsc.dev> Message-ID: On Wed, Dec 6, 2023 at 8:10?PM Barry Smith wrote: > > Depending on the serial library you may not need to split the vector > into DMDA vectors with DMStagVecSplitToDMDA() for each component. Just > global to natural and scatter to zero on the full vector, now the full > vector is on the first rank and you can access what you need in that one > vector if possible. > Does DMStag have a GlobalToNatural? Also, the serial code would have to have identical interleaving. Thanks, Matt > On Dec 6, 2023, at 6:37?PM, Colton Bryant < > coltonbryant2021 at u.northwestern.edu> wrote: > > Ah excellent! I was not aware of the ability to preallocate the objects > and migrate them each time. > > Thanks! > -Colton > > On Wed, Dec 6, 2023 at 5:18?PM Matthew Knepley wrote: > >> On Wed, Dec 6, 2023 at 5:54?PM Colton Bryant < >> coltonbryant2021 at u.northwestern.edu> wrote: >> >>> Hello, >>> >>> I am working on a code in which a DMSTAG object is used to solve a fluid >>> flow problem and I need to gather this flow data on a single process to >>> interact with an existing (serial) library at each timestep of my >>> simulation. After looking around the solution I've tried is: >>> >>> -use DMStagVecSplitToDMDA to extract vectors of each component of the >>> flow >>> -use DMDACreateNaturalVector and DMDAGlobalToNatural to get the >>> components naturally ordered >>> -use VecScatterCreateToZero to set up and then do the scatter to gather >>> on the single process >>> >>> Unless I'm misunderstanding something this method results in a lot of >>> memory allocation/freeing happening at each step of the evolution and I was >>> wondering if there is a way to directly perform such a scatter from the >>> DMSTAG object without splitting as I'm doing here. >>> >> >> 1) You can see here: >> >> >> https://petsc.org/main/src/dm/impls/stag/stagda.c.html#DMStagVecSplitToDMDA >> >> that this function is small. You can do the DMDA creation manually, and >> then just call DMStagMigrateVecDMDA() each time, which will not create >> anything. >> >> 2) You can create the natural vector upfront, and just scatter each time. >> >> 3) You can create the serial vector upfront, and just scatter each time. >> >> This is some data movement. You can compress the g2n and 2zero scatters >> using >> >> https://petsc.org/main/manualpages/PetscSF/PetscSFCompose/ >> >> as an optimization. >> >> Thanks, >> >> Matt >> >> >>> Any advice would be much appreciated! >>> >>> Best, >>> Colton Bryant >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. 
>> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Dec 6 20:17:51 2023 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 6 Dec 2023 21:17:51 -0500 Subject: [petsc-users] DMSTAG Gathering Vector on single process In-Reply-To: References: <5A08B1BB-933D-4A46-8369-510D1C5AFDC6@petsc.dev> Message-ID: <73C92F1E-2DB7-416D-A694-AD293027E295@petsc.dev> > On Dec 6, 2023, at 8:35?PM, Matthew Knepley wrote: > > On Wed, Dec 6, 2023 at 8:10?PM Barry Smith > wrote: >> >> Depending on the serial library you may not need to split the vector into DMDA vectors with DMStagVecSplitToDMDA() for each component. Just global to natural and scatter to zero on the full vector, now the full vector is on the first rank and you can access what you need in that one vector if possible. > > Does DMStag have a GlobalToNatural? Good point, it does not appear to have such a thing, though it could. > Also, the serial code would have to have identical interleaving. > > Thanks, > > Matt >>> On Dec 6, 2023, at 6:37?PM, Colton Bryant > wrote: >>> >>> Ah excellent! I was not aware of the ability to preallocate the objects and migrate them each time. >>> >>> Thanks! >>> -Colton >>> >>> On Wed, Dec 6, 2023 at 5:18?PM Matthew Knepley > wrote: >>>> On Wed, Dec 6, 2023 at 5:54?PM Colton Bryant > wrote: >>>>> Hello, >>>>> >>>>> I am working on a code in which a DMSTAG object is used to solve a fluid flow problem and I need to gather this flow data on a single process to interact with an existing (serial) library at each timestep of my simulation. After looking around the solution I've tried is: >>>>> >>>>> -use DMStagVecSplitToDMDA to extract vectors of each component of the flow >>>>> -use DMDACreateNaturalVector and DMDAGlobalToNatural to get the components naturally ordered >>>>> -use VecScatterCreateToZero to set up and then do the scatter to gather on the single process >>>>> >>>>> Unless I'm misunderstanding something this method results in a lot of memory allocation/freeing happening at each step of the evolution and I was wondering if there is a way to directly perform such a scatter from the DMSTAG object without splitting as I'm doing here. >>>> >>>> 1) You can see here: >>>> >>>> https://petsc.org/main/src/dm/impls/stag/stagda.c.html#DMStagVecSplitToDMDA >>>> >>>> that this function is small. You can do the DMDA creation manually, and then just call DMStagMigrateVecDMDA() each time, which will not create anything. >>>> >>>> 2) You can create the natural vector upfront, and just scatter each time. >>>> >>>> 3) You can create the serial vector upfront, and just scatter each time. >>>> >>>> This is some data movement. You can compress the g2n and 2zero scatters using >>>> >>>> https://petsc.org/main/manualpages/PetscSF/PetscSFCompose/ >>>> >>>> as an optimization. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>>> Any advice would be much appreciated! >>>>> >>>>> Best, >>>>> Colton Bryant >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
>>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >> > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From srvenkat at utexas.edu Thu Dec 7 12:17:02 2023 From: srvenkat at utexas.edu (Sreeram R Venkat) Date: Thu, 7 Dec 2023 12:17:02 -0600 Subject: [petsc-users] Matvecs and KSPSolves with multiple vectors Message-ID: I have 2 sequential matrices M and R (both MATSEQAIJCUSPARSE of size n x n) and a vector v of size n*m. v = [v_1 , v_2 ,... , v_m] where v_i has size n. The data for v can be stored either in column-major or row-major order. Now, I want to do 2 types of operations: 1. Matvecs of the form M*v_i = w_i, for i = 1..m. 2. KSPSolves of the form R*x_i = v_i, for i = 1..m. >From what I have read on the documentation, I can think of 2 approaches. 1. Get the pointer to the data in v (column-major) and use it to create a dense matrix V. Then do a MatMatMult with M*V = W, and take the data pointer of W to create the vector w. For KSPSolves, use KSPMatSolve with R and V. 2. Create a MATMAIJ using M/R and use that for matvecs directly with the vector v. I don't know if KSPSolve with the MATMAIJ will know that it is a multiple RHS system and act accordingly. Which would be the more efficient option? As a side-note, I am also wondering if there is a way to use row-major storage of the vector v. The reason is that this could allow for more coalesced memory access when doing matvecs. Thanks, Sreeram -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Thu Dec 7 13:34:15 2023 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 7 Dec 2023 14:34:15 -0500 Subject: [petsc-users] Matvecs and KSPSolves with multiple vectors In-Reply-To: References: Message-ID: <855B7267-E3F9-43CC-9EBE-96276E9A8E42@petsc.dev> > On Dec 7, 2023, at 1:17?PM, Sreeram R Venkat wrote: > > I have 2 sequential matrices M and R (both MATSEQAIJCUSPARSE of size n x n) and a vector v of size n*m. v = [v_1 , v_2 ,... , v_m] where v_i has size n. The data for v can be stored either in column-major or row-major order. Now, I want to do 2 types of operations: > > 1. Matvecs of the form M*v_i = w_i, for i = 1..m. > 2. KSPSolves of the form R*x_i = v_i, for i = 1..m. > > From what I have read on the documentation, I can think of 2 approaches. > > 1. Get the pointer to the data in v (column-major) and use it to create a dense matrix V. Then do a MatMatMult with M*V = W, and take the data pointer of W to create the vector w. For KSPSolves, use KSPMatSolve with R and V. > > 2. Create a MATMAIJ using M/R and use that for matvecs directly with the vector v. I don't know if KSPSolve with the MATMAIJ will know that it is a multiple RHS system and act accordingly. > > Which would be the more efficient option? Use 1. > > As a side-note, I am also wondering if there is a way to use row-major storage of the vector v. No > The reason is that this could allow for more coalesced memory access when doing matvecs. PETSc matrix-vector products use BLAS GMEV matrix-vector products for the computation so in theory they should already be well-optimized > > Thanks, > Sreeram -------------- next part -------------- An HTML attachment was scrubbed... 
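
For reference, a minimal sketch of the "option 1" recommended just above, written out with the PETSc calls it relies on. The helper name, the kspR argument (a KSP assumed to be already set up from R), and the sizes n and m are illustrative assumptions rather than code from this thread; on the GPU the same pattern would apply with MatCreateSeqDenseCUDA() and a device pointer in place of MatCreateSeqDense().

#include <petscksp.h>

/* Sketch of "option 1" (illustrative, assumed names): wrap the column-major
   storage of v in a dense n-by-m matrix without copying, then do all m
   matvecs with one MatMatMult and all m solves with one KSPMatSolve. */
static PetscErrorCode MultiColumnMultAndSolve(Mat M, KSP kspR, Vec v, PetscInt n, PetscInt m, Mat *W, Mat *X)
{
  const PetscScalar *varr;
  Mat                V;

  PetscFunctionBeginUser;
  PetscCall(VecGetArrayRead(v, &varr));
  /* V shares v's storage (no copy); column i of V is v_i, used read-only below */
  PetscCall(MatCreateSeqDense(PETSC_COMM_SELF, n, m, (PetscScalar *)varr, &V));

  /* all m matvecs at once: W = M * V */
  PetscCall(MatMatMult(M, V, MAT_INITIAL_MATRIX, PETSC_DEFAULT, W));

  /* all m solves at once: R * X = V, using the KSP built from R */
  PetscCall(MatDuplicate(V, MAT_DO_NOT_COPY_VALUES, X));
  PetscCall(KSPMatSolve(kspR, V, *X));

  PetscCall(MatDestroy(&V)); /* V only borrowed varr, so v's data is untouched */
  PetscCall(VecRestoreArrayRead(v, &varr));
  PetscFunctionReturn(PETSC_SUCCESS);
}

A column of W can then be viewed as a Vec without copying, e.g. via MatDenseGetArrayRead() together with VecCreateSeqWithArray().
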
URL: From pierre at joliv.et Thu Dec 7 14:02:19 2023 From: pierre at joliv.et (Pierre Jolivet) Date: Thu, 7 Dec 2023 21:02:19 +0100 Subject: [petsc-users] Matvecs and KSPSolves with multiple vectors In-Reply-To: <855B7267-E3F9-43CC-9EBE-96276E9A8E42@petsc.dev> References: <855B7267-E3F9-43CC-9EBE-96276E9A8E42@petsc.dev> Message-ID: To expand on Barry?s answer, we have observed repeatedly that MatMatMult with MatAIJ performs better than MatMult with MatMAIJ, you can reproduce this on your own with https://petsc.org/release/src/mat/tests/ex237.c.html. Also, I?m guessing you are using some sort of preconditioner within your KSP. Not all are ?KSPMatSolve-ready?, i.e., they may treat blocks of right-hand sides column by column, which is very inefficient. You could run your code with -info dump and send us dump.0 to see what needs to be done on our end to make things more efficient, should you not be satisfied with the current performance of the code. Thanks, Pierre > On 7 Dec 2023, at 8:34?PM, Barry Smith wrote: > > > >> On Dec 7, 2023, at 1:17?PM, Sreeram R Venkat wrote: >> >> I have 2 sequential matrices M and R (both MATSEQAIJCUSPARSE of size n x n) and a vector v of size n*m. v = [v_1 , v_2 ,... , v_m] where v_i has size n. The data for v can be stored either in column-major or row-major order. Now, I want to do 2 types of operations: >> >> 1. Matvecs of the form M*v_i = w_i, for i = 1..m. >> 2. KSPSolves of the form R*x_i = v_i, for i = 1..m. >> >> From what I have read on the documentation, I can think of 2 approaches. >> >> 1. Get the pointer to the data in v (column-major) and use it to create a dense matrix V. Then do a MatMatMult with M*V = W, and take the data pointer of W to create the vector w. For KSPSolves, use KSPMatSolve with R and V. >> >> 2. Create a MATMAIJ using M/R and use that for matvecs directly with the vector v. I don't know if KSPSolve with the MATMAIJ will know that it is a multiple RHS system and act accordingly. >> >> Which would be the more efficient option? > > Use 1. >> >> As a side-note, I am also wondering if there is a way to use row-major storage of the vector v. > > No > >> The reason is that this could allow for more coalesced memory access when doing matvecs. > > PETSc matrix-vector products use BLAS GMEV matrix-vector products for the computation so in theory they should already be well-optimized > >> >> Thanks, >> Sreeram -------------- next part -------------- An HTML attachment was scrubbed... URL: From srvenkat at utexas.edu Thu Dec 7 14:37:49 2023 From: srvenkat at utexas.edu (Sreeram R Venkat) Date: Thu, 7 Dec 2023 14:37:49 -0600 Subject: [petsc-users] Matvecs and KSPSolves with multiple vectors In-Reply-To: References: <855B7267-E3F9-43CC-9EBE-96276E9A8E42@petsc.dev> Message-ID: Thank you Barry and Pierre; I will proceed with the first option. I want to use the AMGX preconditioner for the KSP. I will try it out and see how it performs. Thanks, Sreeram On Thu, Dec 7, 2023 at 2:02?PM Pierre Jolivet wrote: > To expand on Barry?s answer, we have observed repeatedly that MatMatMult > with MatAIJ performs better than MatMult with MatMAIJ, you can reproduce > this on your own with https://petsc.org/release/src/mat/tests/ex237.c.html > . > Also, I?m guessing you are using some sort of preconditioner within your > KSP. > Not all are ?KSPMatSolve-ready?, i.e., they may treat blocks of right-hand > sides column by column, which is very inefficient. 
> You could run your code with -info dump and send us dump.0 to see what > needs to be done on our end to make things more efficient, should you not > be satisfied with the current performance of the code. > > Thanks, > Pierre > > On 7 Dec 2023, at 8:34?PM, Barry Smith wrote: > > > > On Dec 7, 2023, at 1:17?PM, Sreeram R Venkat wrote: > > I have 2 sequential matrices M and R (both MATSEQAIJCUSPARSE of size n x > n) and a vector v of size n*m. v = [v_1 , v_2 ,... , v_m] where v_i has > size n. The data for v can be stored either in column-major or row-major > order. Now, I want to do 2 types of operations: > > 1. Matvecs of the form M*v_i = w_i, for i = 1..m. > 2. KSPSolves of the form R*x_i = v_i, for i = 1..m. > > From what I have read on the documentation, I can think of 2 approaches. > > 1. Get the pointer to the data in v (column-major) and use it to create a > dense matrix V. Then do a MatMatMult with M*V = W, and take the data > pointer of W to create the vector w. For KSPSolves, use KSPMatSolve with R > and V. > > 2. Create a MATMAIJ using M/R and use that for matvecs directly with the > vector v. I don't know if KSPSolve with the MATMAIJ will know that it is a > multiple RHS system and act accordingly. > > Which would be the more efficient option? > > > Use 1. > > > As a side-note, I am also wondering if there is a way to use row-major > storage of the vector v. > > > No > > The reason is that this could allow for more coalesced memory access when > doing matvecs. > > > PETSc matrix-vector products use BLAS GMEV matrix-vector products for > the computation so in theory they should already be well-optimized > > > Thanks, > Sreeram > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre at joliv.et Thu Dec 7 15:02:54 2023 From: pierre at joliv.et (Pierre Jolivet) Date: Thu, 7 Dec 2023 22:02:54 +0100 Subject: [petsc-users] Matvecs and KSPSolves with multiple vectors In-Reply-To: References: <855B7267-E3F9-43CC-9EBE-96276E9A8E42@petsc.dev> Message-ID: > On 7 Dec 2023, at 9:37?PM, Sreeram R Venkat wrote: > > Thank you Barry and Pierre; I will proceed with the first option. > > I want to use the AMGX preconditioner for the KSP. I will try it out and see how it performs. Just FYI, AMGX does not handle systems with multiple RHS, and thus has no PCMatApply() implementation. BoomerAMG does, and there is a PCMatApply_HYPRE_BoomerAMG() implementation. But let us know if you need assistance figuring things out. Thanks, Pierre > Thanks, > Sreeram > > On Thu, Dec 7, 2023 at 2:02?PM Pierre Jolivet > wrote: >> To expand on Barry?s answer, we have observed repeatedly that MatMatMult with MatAIJ performs better than MatMult with MatMAIJ, you can reproduce this on your own with https://petsc.org/release/src/mat/tests/ex237.c.html. >> Also, I?m guessing you are using some sort of preconditioner within your KSP. >> Not all are ?KSPMatSolve-ready?, i.e., they may treat blocks of right-hand sides column by column, which is very inefficient. >> You could run your code with -info dump and send us dump.0 to see what needs to be done on our end to make things more efficient, should you not be satisfied with the current performance of the code. >> >> Thanks, >> Pierre >> >>> On 7 Dec 2023, at 8:34?PM, Barry Smith > wrote: >>> >>> >>> >>>> On Dec 7, 2023, at 1:17?PM, Sreeram R Venkat > wrote: >>>> >>>> I have 2 sequential matrices M and R (both MATSEQAIJCUSPARSE of size n x n) and a vector v of size n*m. v = [v_1 , v_2 ,... , v_m] where v_i has size n. 
The data for v can be stored either in column-major or row-major order. Now, I want to do 2 types of operations: >>>> >>>> 1. Matvecs of the form M*v_i = w_i, for i = 1..m. >>>> 2. KSPSolves of the form R*x_i = v_i, for i = 1..m. >>>> >>>> From what I have read on the documentation, I can think of 2 approaches. >>>> >>>> 1. Get the pointer to the data in v (column-major) and use it to create a dense matrix V. Then do a MatMatMult with M*V = W, and take the data pointer of W to create the vector w. For KSPSolves, use KSPMatSolve with R and V. >>>> >>>> 2. Create a MATMAIJ using M/R and use that for matvecs directly with the vector v. I don't know if KSPSolve with the MATMAIJ will know that it is a multiple RHS system and act accordingly. >>>> >>>> Which would be the more efficient option? >>> >>> Use 1. >>>> >>>> As a side-note, I am also wondering if there is a way to use row-major storage of the vector v. >>> >>> No >>> >>>> The reason is that this could allow for more coalesced memory access when doing matvecs. >>> >>> PETSc matrix-vector products use BLAS GMEV matrix-vector products for the computation so in theory they should already be well-optimized >>> >>>> >>>> Thanks, >>>> Sreeram >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From srvenkat at utexas.edu Thu Dec 7 15:10:58 2023 From: srvenkat at utexas.edu (Sreeram R Venkat) Date: Thu, 7 Dec 2023 15:10:58 -0600 Subject: [petsc-users] Matvecs and KSPSolves with multiple vectors In-Reply-To: References: <855B7267-E3F9-43CC-9EBE-96276E9A8E42@petsc.dev> Message-ID: Oh, in that case I will try out BoomerAMG. Getting AMGX to build correctly was also tricky so hopefully the HYPRE build will be easier. Thanks, Sreeram On Thu, Dec 7, 2023, 3:03 PM Pierre Jolivet wrote: > > > On 7 Dec 2023, at 9:37?PM, Sreeram R Venkat wrote: > > Thank you Barry and Pierre; I will proceed with the first option. > > I want to use the AMGX preconditioner for the KSP. I will try it out and > see how it performs. > > > Just FYI, AMGX does not handle systems with multiple RHS, and thus has no > PCMatApply() implementation. > BoomerAMG does, and there is a PCMatApply_HYPRE_BoomerAMG() implementation. > But let us know if you need assistance figuring things out. > > Thanks, > Pierre > > Thanks, > Sreeram > > On Thu, Dec 7, 2023 at 2:02?PM Pierre Jolivet wrote: > >> To expand on Barry?s answer, we have observed repeatedly that MatMatMult >> with MatAIJ performs better than MatMult with MatMAIJ, you can reproduce >> this on your own with >> https://petsc.org/release/src/mat/tests/ex237.c.html. >> Also, I?m guessing you are using some sort of preconditioner within your >> KSP. >> Not all are ?KSPMatSolve-ready?, i.e., they may treat blocks of >> right-hand sides column by column, which is very inefficient. >> You could run your code with -info dump and send us dump.0 to see what >> needs to be done on our end to make things more efficient, should you not >> be satisfied with the current performance of the code. >> >> Thanks, >> Pierre >> >> On 7 Dec 2023, at 8:34?PM, Barry Smith wrote: >> >> >> >> On Dec 7, 2023, at 1:17?PM, Sreeram R Venkat wrote: >> >> I have 2 sequential matrices M and R (both MATSEQAIJCUSPARSE of size n x >> n) and a vector v of size n*m. v = [v_1 , v_2 ,... , v_m] where v_i has >> size n. The data for v can be stored either in column-major or row-major >> order. Now, I want to do 2 types of operations: >> >> 1. Matvecs of the form M*v_i = w_i, for i = 1..m. >> 2. 
KSPSolves of the form R*x_i = v_i, for i = 1..m. >> >> From what I have read on the documentation, I can think of 2 approaches. >> >> 1. Get the pointer to the data in v (column-major) and use it to create a >> dense matrix V. Then do a MatMatMult with M*V = W, and take the data >> pointer of W to create the vector w. For KSPSolves, use KSPMatSolve with R >> and V. >> >> 2. Create a MATMAIJ using M/R and use that for matvecs directly with the >> vector v. I don't know if KSPSolve with the MATMAIJ will know that it is a >> multiple RHS system and act accordingly. >> >> Which would be the more efficient option? >> >> >> Use 1. >> >> >> As a side-note, I am also wondering if there is a way to use row-major >> storage of the vector v. >> >> >> No >> >> The reason is that this could allow for more coalesced memory access when >> doing matvecs. >> >> >> PETSc matrix-vector products use BLAS GMEV matrix-vector products for >> the computation so in theory they should already be well-optimized >> >> >> Thanks, >> Sreeram >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Thu Dec 7 16:03:58 2023 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 7 Dec 2023 17:03:58 -0500 Subject: [petsc-users] Matvecs and KSPSolves with multiple vectors In-Reply-To: References: <855B7267-E3F9-43CC-9EBE-96276E9A8E42@petsc.dev> Message-ID: N.B., AMGX interface is a bit experimental. Mark On Thu, Dec 7, 2023 at 4:11?PM Sreeram R Venkat wrote: > Oh, in that case I will try out BoomerAMG. Getting AMGX to build correctly > was also tricky so hopefully the HYPRE build will be easier. > > Thanks, > Sreeram > > On Thu, Dec 7, 2023, 3:03 PM Pierre Jolivet wrote: > >> >> >> On 7 Dec 2023, at 9:37?PM, Sreeram R Venkat wrote: >> >> Thank you Barry and Pierre; I will proceed with the first option. >> >> I want to use the AMGX preconditioner for the KSP. I will try it out and >> see how it performs. >> >> >> Just FYI, AMGX does not handle systems with multiple RHS, and thus has no >> PCMatApply() implementation. >> BoomerAMG does, and there is a PCMatApply_HYPRE_BoomerAMG() >> implementation. >> But let us know if you need assistance figuring things out. >> >> Thanks, >> Pierre >> >> Thanks, >> Sreeram >> >> On Thu, Dec 7, 2023 at 2:02?PM Pierre Jolivet wrote: >> >>> To expand on Barry?s answer, we have observed repeatedly that MatMatMult >>> with MatAIJ performs better than MatMult with MatMAIJ, you can reproduce >>> this on your own with >>> https://petsc.org/release/src/mat/tests/ex237.c.html. >>> Also, I?m guessing you are using some sort of preconditioner within your >>> KSP. >>> Not all are ?KSPMatSolve-ready?, i.e., they may treat blocks of >>> right-hand sides column by column, which is very inefficient. >>> You could run your code with -info dump and send us dump.0 to see what >>> needs to be done on our end to make things more efficient, should you not >>> be satisfied with the current performance of the code. >>> >>> Thanks, >>> Pierre >>> >>> On 7 Dec 2023, at 8:34?PM, Barry Smith wrote: >>> >>> >>> >>> On Dec 7, 2023, at 1:17?PM, Sreeram R Venkat >>> wrote: >>> >>> I have 2 sequential matrices M and R (both MATSEQAIJCUSPARSE of size n x >>> n) and a vector v of size n*m. v = [v_1 , v_2 ,... , v_m] where v_i has >>> size n. The data for v can be stored either in column-major or row-major >>> order. Now, I want to do 2 types of operations: >>> >>> 1. Matvecs of the form M*v_i = w_i, for i = 1..m. >>> 2. KSPSolves of the form R*x_i = v_i, for i = 1..m. 
>>> >>> From what I have read on the documentation, I can think of 2 approaches. >>> >>> 1. Get the pointer to the data in v (column-major) and use it to create >>> a dense matrix V. Then do a MatMatMult with M*V = W, and take the data >>> pointer of W to create the vector w. For KSPSolves, use KSPMatSolve with R >>> and V. >>> >>> 2. Create a MATMAIJ using M/R and use that for matvecs directly with the >>> vector v. I don't know if KSPSolve with the MATMAIJ will know that it is a >>> multiple RHS system and act accordingly. >>> >>> Which would be the more efficient option? >>> >>> >>> Use 1. >>> >>> >>> As a side-note, I am also wondering if there is a way to use row-major >>> storage of the vector v. >>> >>> >>> No >>> >>> The reason is that this could allow for more coalesced memory access >>> when doing matvecs. >>> >>> >>> PETSc matrix-vector products use BLAS GMEV matrix-vector products for >>> the computation so in theory they should already be well-optimized >>> >>> >>> Thanks, >>> Sreeram >>> >>> >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From srvenkat at utexas.edu Fri Dec 8 12:53:13 2023 From: srvenkat at utexas.edu (Sreeram R Venkat) Date: Fri, 8 Dec 2023 12:53:13 -0600 Subject: [petsc-users] Configure error while building PETSc with CUDA/MVAPICH2-GDR Message-ID: I am trying to build PETSc with CUDA using the CUDA-Aware MVAPICH2-GDR. Here is my configure command: ./configure PETSC_ARCH=linux-c-debug-mvapich2-gdr --download-hypre --with-cuda=true --cuda-dir=$TACC_CUDA_DIR --with-hdf5=true --with-hdf5-dir=$TACC_PHDF5_DIR --download-elemental --download-metis --download-parmetis --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 which errors with: UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): --------------------------------------------------------------------------------------------- CUDA compile failed with arch flags " -ccbin mpic++ -std=c++14 -Xcompiler -fPIC -Xcompiler -fvisibility=hidden -g -lineinfo -gencode arch=compute_80,code=sm_80" generated from "--with-cuda-arch=80" The same configure command works when I use the Intel MPI and I can build with CUDA. The full config.log file is attached. Please let me know if you need any other information. I appreciate your help with this. Thanks, Sreeram -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: application/octet-stream Size: 2306445 bytes Desc: not available URL: From knepley at gmail.com Fri Dec 8 13:00:33 2023 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 8 Dec 2023 14:00:33 -0500 Subject: [petsc-users] Configure error while building PETSc with CUDA/MVAPICH2-GDR In-Reply-To: References: Message-ID: On Fri, Dec 8, 2023 at 1:54?PM Sreeram R Venkat wrote: > I am trying to build PETSc with CUDA using the CUDA-Aware MVAPICH2-GDR. 
> > Here is my configure command: > > ./configure PETSC_ARCH=linux-c-debug-mvapich2-gdr --download-hypre > --with-cuda=true --cuda-dir=$TACC_CUDA_DIR --with-hdf5=true > --with-hdf5-dir=$TACC_PHDF5_DIR --download-elemental --download-metis > --download-parmetis --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 > > which errors with: > > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for > details): > > --------------------------------------------------------------------------------------------- > CUDA compile failed with arch flags " -ccbin mpic++ -std=c++14 > -Xcompiler -fPIC > -Xcompiler -fvisibility=hidden -g -lineinfo -gencode > arch=compute_80,code=sm_80" > generated from "--with-cuda-arch=80" > > > > The same configure command works when I use the Intel MPI and I can build > with CUDA. The full config.log file is attached. Please let me know if you > need any other information. I appreciate your help with this. > The proximate error is Executing: nvcc -c -o /tmp/petsc-kn3f29gl/config.packages.cuda/conftest.o -I/tmp/petsc-kn3f29gl/config.setCompilers -I/tmp/petsc-kn3f29gl/config.types -I/tmp/petsc-kn3f29gl/config.packages.cuda -ccbin mpic++ -std=c++14 -Xcompiler -fPIC -Xcompiler -fvisibility=hidden -g -lineinfo -gencode arch=compute_80,code=sm_80 /tmp/petsc-kn3f29gl/config.packages.cuda/ conftest.cu stdout: /opt/apps/cuda/11.4/include/crt/sm_80_rt.hpp(141): error: more than one instance of overloaded function "__nv_associate_access_property_impl" has "C" linkage 1 error detected in the compilation of "/tmp/petsc-kn3f29gl/config.packages.cuda/conftest.cu". Possible ERROR while running compiler: exit code 1 stderr: /opt/apps/cuda/11.4/include/crt/sm_80_rt.hpp(141): error: more than one instance of overloaded function "__nv_associate_access_property_impl" has "C" linkage 1 error detected in the compilation of "/tmp/petsc-kn3f29gl/config.packages.cuda This looks like screwed up headers to me, but I will let someone that understands CUDA compilation reply. Thanks, Matt Thanks, > Sreeram > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Fri Dec 8 13:14:58 2023 From: balay at mcs.anl.gov (Satish Balay) Date: Fri, 8 Dec 2023 13:14:58 -0600 (CST) Subject: [petsc-users] Configure error while building PETSc with CUDA/MVAPICH2-GDR In-Reply-To: References: Message-ID: <1da79702-c1eb-0ad8-6efc-64580e02bd07@mcs.anl.gov> Executing: mpicc -show stdout: icc -I/opt/apps/cuda/11.4/include -I/opt/apps/cuda/11.4/include -lcuda -L/opt/apps/cuda/11.4/lib64/stubs -L/opt/apps/cuda/11.4/lib64 -lcudart -lrt -Wl,-rpath,/opt/apps/cuda/11.4/lib64 -Wl,-rpath,XORIGIN/placeholder -Wl,--build-id -L/opt/apps/cuda/11.4/lib64/ -lm -I/opt/apps/intel19/mvapich2-gdr/2.3.7/include -L/opt/apps/intel19/mvapich2-gdr/2.3.7/lib64 -Wl,-rpath -Wl,/opt/apps/intel19/mvapich2-gdr/2.3.7/lib64 -Wl,--enable-new-dtags -lmpi Checking for program /opt/apps/cuda/12.0/bin/nvcc...found Looks like you are trying to mix in 2 different cuda versions in this build. Perhaps you need to use cuda-11.4 - with this install of mvapich.. Satish On Fri, 8 Dec 2023, Matthew Knepley wrote: > On Fri, Dec 8, 2023 at 1:54?PM Sreeram R Venkat wrote: > > > I am trying to build PETSc with CUDA using the CUDA-Aware MVAPICH2-GDR. 
> > > > Here is my configure command: > > > > ./configure PETSC_ARCH=linux-c-debug-mvapich2-gdr --download-hypre > > --with-cuda=true --cuda-dir=$TACC_CUDA_DIR --with-hdf5=true > > --with-hdf5-dir=$TACC_PHDF5_DIR --download-elemental --download-metis > > --download-parmetis --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 > > > > which errors with: > > > > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for > > details): > > > > --------------------------------------------------------------------------------------------- > > CUDA compile failed with arch flags " -ccbin mpic++ -std=c++14 > > -Xcompiler -fPIC > > -Xcompiler -fvisibility=hidden -g -lineinfo -gencode > > arch=compute_80,code=sm_80" > > generated from "--with-cuda-arch=80" > > > > > > > > The same configure command works when I use the Intel MPI and I can build > > with CUDA. The full config.log file is attached. Please let me know if you > > need any other information. I appreciate your help with this. > > > > The proximate error is > > Executing: nvcc -c -o /tmp/petsc-kn3f29gl/config.packages.cuda/conftest.o > -I/tmp/petsc-kn3f29gl/config.setCompilers > -I/tmp/petsc-kn3f29gl/config.types > -I/tmp/petsc-kn3f29gl/config.packages.cuda -ccbin mpic++ -std=c++14 > -Xcompiler -fPIC -Xcompiler -fvisibility=hidden -g -lineinfo -gencode > arch=compute_80,code=sm_80 /tmp/petsc-kn3f29gl/config.packages.cuda/ > conftest.cu > stdout: > /opt/apps/cuda/11.4/include/crt/sm_80_rt.hpp(141): error: more than one > instance of overloaded function "__nv_associate_access_property_impl" has > "C" linkage > 1 error detected in the compilation of > "/tmp/petsc-kn3f29gl/config.packages.cuda/conftest.cu". > Possible ERROR while running compiler: exit code 1 > stderr: > /opt/apps/cuda/11.4/include/crt/sm_80_rt.hpp(141): error: more than one > instance of overloaded function "__nv_associate_access_property_impl" has > "C" linkage > > 1 error detected in the compilation of > "/tmp/petsc-kn3f29gl/config.packages.cuda > > This looks like screwed up headers to me, but I will let someone that > understands CUDA compilation reply. > > Thanks, > > Matt > > Thanks, > > Sreeram > > > > > From srvenkat at utexas.edu Fri Dec 8 15:29:20 2023 From: srvenkat at utexas.edu (Sreeram R Venkat) Date: Fri, 8 Dec 2023 15:29:20 -0600 Subject: [petsc-users] Configure error while building PETSc with CUDA/MVAPICH2-GDR In-Reply-To: <1da79702-c1eb-0ad8-6efc-64580e02bd07@mcs.anl.gov> References: <1da79702-c1eb-0ad8-6efc-64580e02bd07@mcs.anl.gov> Message-ID: Thank you, changing to CUDA 11.4 fixed the issue. The mvapich2-gdr module didn't require CUDA 11.4 as a dependency, so I was using 12.0 On Fri, Dec 8, 2023 at 1:15?PM Satish Balay wrote: > Executing: mpicc -show > stdout: icc -I/opt/apps/cuda/11.4/include -I/opt/apps/cuda/11.4/include > -lcuda -L/opt/apps/cuda/11.4/lib64/stubs -L/opt/apps/cuda/11.4/lib64 > -lcudart -lrt -Wl,-rpath,/opt/apps/cuda/11.4/lib64 > -Wl,-rpath,XORIGIN/placeholder -Wl,--build-id -L/opt/apps/cuda/11.4/lib64/ > -lm -I/opt/apps/intel19/mvapich2-gdr/2.3.7/include > -L/opt/apps/intel19/mvapich2-gdr/2.3.7/lib64 -Wl,-rpath > -Wl,/opt/apps/intel19/mvapich2-gdr/2.3.7/lib64 -Wl,--enable-new-dtags -lmpi > > Checking for program /opt/apps/cuda/12.0/bin/nvcc...found > > Looks like you are trying to mix in 2 different cuda versions in this > build. > > Perhaps you need to use cuda-11.4 - with this install of mvapich.. 
> > Satish > > On Fri, 8 Dec 2023, Matthew Knepley wrote: > > > On Fri, Dec 8, 2023 at 1:54?PM Sreeram R Venkat > wrote: > > > > > I am trying to build PETSc with CUDA using the CUDA-Aware MVAPICH2-GDR. > > > > > > Here is my configure command: > > > > > > ./configure PETSC_ARCH=linux-c-debug-mvapich2-gdr --download-hypre > > > --with-cuda=true --cuda-dir=$TACC_CUDA_DIR --with-hdf5=true > > > --with-hdf5-dir=$TACC_PHDF5_DIR --download-elemental --download-metis > > > --download-parmetis --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 > > > > > > which errors with: > > > > > > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for > > > details): > > > > > > > --------------------------------------------------------------------------------------------- > > > CUDA compile failed with arch flags " -ccbin mpic++ -std=c++14 > > > -Xcompiler -fPIC > > > -Xcompiler -fvisibility=hidden -g -lineinfo -gencode > > > arch=compute_80,code=sm_80" > > > generated from "--with-cuda-arch=80" > > > > > > > > > > > > The same configure command works when I use the Intel MPI and I can > build > > > with CUDA. The full config.log file is attached. Please let me know if > you > > > need any other information. I appreciate your help with this. > > > > > > > The proximate error is > > > > Executing: nvcc -c -o /tmp/petsc-kn3f29gl/config.packages.cuda/conftest.o > > -I/tmp/petsc-kn3f29gl/config.setCompilers > > -I/tmp/petsc-kn3f29gl/config.types > > -I/tmp/petsc-kn3f29gl/config.packages.cuda -ccbin mpic++ -std=c++14 > > -Xcompiler -fPIC -Xcompiler -fvisibility=hidden -g -lineinfo -gencode > > arch=compute_80,code=sm_80 /tmp/petsc-kn3f29gl/config.packages.cuda/ > > conftest.cu > > stdout: > > /opt/apps/cuda/11.4/include/crt/sm_80_rt.hpp(141): error: more than one > > instance of overloaded function "__nv_associate_access_property_impl" has > > "C" linkage > > 1 error detected in the compilation of > > "/tmp/petsc-kn3f29gl/config.packages.cuda/conftest.cu". > > Possible ERROR while running compiler: exit code 1 > > stderr: > > /opt/apps/cuda/11.4/include/crt/sm_80_rt.hpp(141): error: more than one > > instance of overloaded function "__nv_associate_access_property_impl" has > > "C" linkage > > > > 1 error detected in the compilation of > > "/tmp/petsc-kn3f29gl/config.packages.cuda > > > > This looks like screwed up headers to me, but I will let someone that > > understands CUDA compilation reply. > > > > Thanks, > > > > Matt > > > > Thanks, > > > Sreeram > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From srvenkat at utexas.edu Fri Dec 8 16:16:54 2023 From: srvenkat at utexas.edu (Sreeram R Venkat) Date: Fri, 8 Dec 2023 16:16:54 -0600 Subject: [petsc-users] Configure error while building PETSc with CUDA/MVAPICH2-GDR In-Reply-To: References: <1da79702-c1eb-0ad8-6efc-64580e02bd07@mcs.anl.gov> Message-ID: Actually, when I compile my program with this build of PETSc and run, I still get the error: PETSC ERROR: PETSc is configured with GPU support, but your MPI is not GPU-aware. For better performance, please use a GPU-aware MPI. I have the mvapich2-gdr module loaded and MV2_USE_CUDA=1. Is there anything else I need to do? Thanks, Sreeram On Fri, Dec 8, 2023 at 3:29?PM Sreeram R Venkat wrote: > Thank you, changing to CUDA 11.4 fixed the issue. 
The mvapich2-gdr module > didn't require CUDA 11.4 as a dependency, so I was using 12.0 > > On Fri, Dec 8, 2023 at 1:15?PM Satish Balay wrote: > >> Executing: mpicc -show >> stdout: icc -I/opt/apps/cuda/11.4/include -I/opt/apps/cuda/11.4/include >> -lcuda -L/opt/apps/cuda/11.4/lib64/stubs -L/opt/apps/cuda/11.4/lib64 >> -lcudart -lrt -Wl,-rpath,/opt/apps/cuda/11.4/lib64 >> -Wl,-rpath,XORIGIN/placeholder -Wl,--build-id -L/opt/apps/cuda/11.4/lib64/ >> -lm -I/opt/apps/intel19/mvapich2-gdr/2.3.7/include >> -L/opt/apps/intel19/mvapich2-gdr/2.3.7/lib64 -Wl,-rpath >> -Wl,/opt/apps/intel19/mvapich2-gdr/2.3.7/lib64 -Wl,--enable-new-dtags -lmpi >> >> Checking for program /opt/apps/cuda/12.0/bin/nvcc...found >> >> Looks like you are trying to mix in 2 different cuda versions in this >> build. >> >> Perhaps you need to use cuda-11.4 - with this install of mvapich.. >> >> Satish >> >> On Fri, 8 Dec 2023, Matthew Knepley wrote: >> >> > On Fri, Dec 8, 2023 at 1:54?PM Sreeram R Venkat >> wrote: >> > >> > > I am trying to build PETSc with CUDA using the CUDA-Aware >> MVAPICH2-GDR. >> > > >> > > Here is my configure command: >> > > >> > > ./configure PETSC_ARCH=linux-c-debug-mvapich2-gdr --download-hypre >> > > --with-cuda=true --cuda-dir=$TACC_CUDA_DIR --with-hdf5=true >> > > --with-hdf5-dir=$TACC_PHDF5_DIR --download-elemental --download-metis >> > > --download-parmetis --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 >> > > >> > > which errors with: >> > > >> > > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log >> for >> > > details): >> > > >> > > >> --------------------------------------------------------------------------------------------- >> > > CUDA compile failed with arch flags " -ccbin mpic++ -std=c++14 >> > > -Xcompiler -fPIC >> > > -Xcompiler -fvisibility=hidden -g -lineinfo -gencode >> > > arch=compute_80,code=sm_80" >> > > generated from "--with-cuda-arch=80" >> > > >> > > >> > > >> > > The same configure command works when I use the Intel MPI and I can >> build >> > > with CUDA. The full config.log file is attached. Please let me know >> if you >> > > need any other information. I appreciate your help with this. >> > > >> > >> > The proximate error is >> > >> > Executing: nvcc -c -o >> /tmp/petsc-kn3f29gl/config.packages.cuda/conftest.o >> > -I/tmp/petsc-kn3f29gl/config.setCompilers >> > -I/tmp/petsc-kn3f29gl/config.types >> > -I/tmp/petsc-kn3f29gl/config.packages.cuda -ccbin mpic++ -std=c++14 >> > -Xcompiler -fPIC -Xcompiler -fvisibility=hidden -g -lineinfo -gencode >> > arch=compute_80,code=sm_80 /tmp/petsc-kn3f29gl/config.packages.cuda/ >> > conftest.cu >> > stdout: >> > /opt/apps/cuda/11.4/include/crt/sm_80_rt.hpp(141): error: more than one >> > instance of overloaded function "__nv_associate_access_property_impl" >> has >> > "C" linkage >> > 1 error detected in the compilation of >> > "/tmp/petsc-kn3f29gl/config.packages.cuda/conftest.cu". >> > Possible ERROR while running compiler: exit code 1 >> > stderr: >> > /opt/apps/cuda/11.4/include/crt/sm_80_rt.hpp(141): error: more than one >> > instance of overloaded function "__nv_associate_access_property_impl" >> has >> > "C" linkage >> > >> > 1 error detected in the compilation of >> > "/tmp/petsc-kn3f29gl/config.packages.cuda >> > >> > This looks like screwed up headers to me, but I will let someone that >> > understands CUDA compilation reply. >> > >> > Thanks, >> > >> > Matt >> > >> > Thanks, >> > > Sreeram >> > > >> > >> > >> > > > -------------- next part -------------- An HTML attachment was scrubbed... 
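A quick way to separate the two layers is to test the MPI library by itself, outside PETSc: allreduce a buffer that lives in device memory and see whether the result comes back correctly. The following is only a rough smoke-test sketch (untested here, error checking omitted), built with the same mpicc and linked against the CUDA runtime; it is not one of the attachments from this thread.

#include <mpi.h>
#include <cuda_runtime.h>
#include <stdio.h>

int main(int argc, char **argv)
{
  int     rank;
  double  one = 1.0, sum = 0.0;
  double *dsend, *drecv;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  cudaMalloc((void **)&dsend, sizeof(double));
  cudaMalloc((void **)&drecv, sizeof(double));
  cudaMemcpy(dsend, &one, sizeof(double), cudaMemcpyHostToDevice);
  /* Handing device pointers to MPI only works if the MPI is CUDA-aware */
  MPI_Allreduce(dsend, drecv, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
  cudaMemcpy(&sum, drecv, sizeof(double), cudaMemcpyDeviceToHost);
  if (rank == 0) printf("sum over device buffers = %g (expect number of ranks)\n", sum);
  cudaFree(dsend);
  cudaFree(drecv);
  MPI_Finalize();
  return 0;
}

Run it in the same environment used for the PETSc job (mvapich2-gdr module loaded, MV2_USE_CUDA=1). If this crashes or prints the wrong sum, the problem is in the MPI setup itself rather than in PETSc.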
URL: From mfadams at lbl.gov Fri Dec 8 17:30:34 2023 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 8 Dec 2023 18:30:34 -0500 Subject: [petsc-users] Configure error while building PETSc with CUDA/MVAPICH2-GDR In-Reply-To: References: <1da79702-c1eb-0ad8-6efc-64580e02bd07@mcs.anl.gov> Message-ID: You may need to set some env variables. This can be system specific so you might want to look at docs or ask TACC how to run with GPU-aware MPI. Mark On Fri, Dec 8, 2023 at 5:17?PM Sreeram R Venkat wrote: > Actually, when I compile my program with this build of PETSc and run, I > still get the error: > > PETSC ERROR: PETSc is configured with GPU support, but your MPI is not > GPU-aware. For better performance, please use a GPU-aware MPI. > > I have the mvapich2-gdr module loaded and MV2_USE_CUDA=1. > > Is there anything else I need to do? > > Thanks, > Sreeram > > On Fri, Dec 8, 2023 at 3:29?PM Sreeram R Venkat > wrote: > >> Thank you, changing to CUDA 11.4 fixed the issue. The mvapich2-gdr module >> didn't require CUDA 11.4 as a dependency, so I was using 12.0 >> >> On Fri, Dec 8, 2023 at 1:15?PM Satish Balay wrote: >> >>> Executing: mpicc -show >>> stdout: icc -I/opt/apps/cuda/11.4/include -I/opt/apps/cuda/11.4/include >>> -lcuda -L/opt/apps/cuda/11.4/lib64/stubs -L/opt/apps/cuda/11.4/lib64 >>> -lcudart -lrt -Wl,-rpath,/opt/apps/cuda/11.4/lib64 >>> -Wl,-rpath,XORIGIN/placeholder -Wl,--build-id -L/opt/apps/cuda/11.4/lib64/ >>> -lm -I/opt/apps/intel19/mvapich2-gdr/2.3.7/include >>> -L/opt/apps/intel19/mvapich2-gdr/2.3.7/lib64 -Wl,-rpath >>> -Wl,/opt/apps/intel19/mvapich2-gdr/2.3.7/lib64 -Wl,--enable-new-dtags -lmpi >>> >>> Checking for program /opt/apps/cuda/12.0/bin/nvcc...found >>> >>> Looks like you are trying to mix in 2 different cuda versions in this >>> build. >>> >>> Perhaps you need to use cuda-11.4 - with this install of mvapich.. >>> >>> Satish >>> >>> On Fri, 8 Dec 2023, Matthew Knepley wrote: >>> >>> > On Fri, Dec 8, 2023 at 1:54?PM Sreeram R Venkat >>> wrote: >>> > >>> > > I am trying to build PETSc with CUDA using the CUDA-Aware >>> MVAPICH2-GDR. >>> > > >>> > > Here is my configure command: >>> > > >>> > > ./configure PETSC_ARCH=linux-c-debug-mvapich2-gdr --download-hypre >>> > > --with-cuda=true --cuda-dir=$TACC_CUDA_DIR --with-hdf5=true >>> > > --with-hdf5-dir=$TACC_PHDF5_DIR --download-elemental --download-metis >>> > > --download-parmetis --with-cc=mpicc --with-cxx=mpicxx >>> --with-fc=mpif90 >>> > > >>> > > which errors with: >>> > > >>> > > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log >>> for >>> > > details): >>> > > >>> > > >>> --------------------------------------------------------------------------------------------- >>> > > CUDA compile failed with arch flags " -ccbin mpic++ -std=c++14 >>> > > -Xcompiler -fPIC >>> > > -Xcompiler -fvisibility=hidden -g -lineinfo -gencode >>> > > arch=compute_80,code=sm_80" >>> > > generated from "--with-cuda-arch=80" >>> > > >>> > > >>> > > >>> > > The same configure command works when I use the Intel MPI and I can >>> build >>> > > with CUDA. The full config.log file is attached. Please let me know >>> if you >>> > > need any other information. I appreciate your help with this. 
>>> > > >>> > >>> > The proximate error is >>> > >>> > Executing: nvcc -c -o >>> /tmp/petsc-kn3f29gl/config.packages.cuda/conftest.o >>> > -I/tmp/petsc-kn3f29gl/config.setCompilers >>> > -I/tmp/petsc-kn3f29gl/config.types >>> > -I/tmp/petsc-kn3f29gl/config.packages.cuda -ccbin mpic++ -std=c++14 >>> > -Xcompiler -fPIC -Xcompiler -fvisibility=hidden -g -lineinfo -gencode >>> > arch=compute_80,code=sm_80 /tmp/petsc-kn3f29gl/config.packages.cuda/ >>> > conftest.cu >>> > stdout: >>> > /opt/apps/cuda/11.4/include/crt/sm_80_rt.hpp(141): error: more than one >>> > instance of overloaded function "__nv_associate_access_property_impl" >>> has >>> > "C" linkage >>> > 1 error detected in the compilation of >>> > "/tmp/petsc-kn3f29gl/config.packages.cuda/conftest.cu". >>> > Possible ERROR while running compiler: exit code 1 >>> > stderr: >>> > /opt/apps/cuda/11.4/include/crt/sm_80_rt.hpp(141): error: more than one >>> > instance of overloaded function "__nv_associate_access_property_impl" >>> has >>> > "C" linkage >>> > >>> > 1 error detected in the compilation of >>> > "/tmp/petsc-kn3f29gl/config.packages.cuda >>> > >>> > This looks like screwed up headers to me, but I will let someone that >>> > understands CUDA compilation reply. >>> > >>> > Thanks, >>> > >>> > Matt >>> > >>> > Thanks, >>> > > Sreeram >>> > > >>> > >>> > >>> > >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From almaeder at student.ethz.ch Sat Dec 9 05:54:41 2023 From: almaeder at student.ethz.ch (Maeder Alexander) Date: Sat, 9 Dec 2023 11:54:41 +0000 Subject: [petsc-users] PETSc and MPI-3/RMA Message-ID: <8d365fe0be30429db2b7064412e49d2a@student.ethz.ch> I am a new user of PETSc and want to know more about the underlying implementation for matrix-vector multiplication (Ax=y). PETSc utilizes a 1D distribution and communicates only parts of the vector x utilized depending on the sparsity pattern of A. Is the communication of x done with MPI-3 RMA and utilizes cuda-aware mpi for RMA? Best regards, Alexander Maeder -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Sat Dec 9 18:13:38 2023 From: jed at jedbrown.org (Jed Brown) Date: Sat, 09 Dec 2023 17:13:38 -0700 Subject: [petsc-users] PETSc and MPI-3/RMA In-Reply-To: <8d365fe0be30429db2b7064412e49d2a@student.ethz.ch> References: <8d365fe0be30429db2b7064412e49d2a@student.ethz.ch> Message-ID: <871qbuq5hp.fsf@jedbrown.org> It uses nonblocking point-to-point by default since that tends to perform better and is less prone to MPI implementation bugs, but you can select `-sf_type window` to try it, or use other strategies here depending on the sort of problem you're working with. #define PETSCSFBASIC "basic" #define PETSCSFNEIGHBOR "neighbor" #define PETSCSFALLGATHERV "allgatherv" #define PETSCSFALLGATHER "allgather" #define PETSCSFGATHERV "gatherv" #define PETSCSFGATHER "gather" #define PETSCSFALLTOALL "alltoall" #define PETSCSFWINDOW "window" PETSc does try to use GPU-aware MPI, though implementation bugs are present on many machines and it often requires a delicate environment arrangement. "Maeder Alexander" writes: > I am a new user of PETSc > > and want to know more about the underlying implementation for matrix-vector multiplication (Ax=y). > > PETSc utilizes a 1D distribution and communicates only parts of the vector x utilized depending on the sparsity pattern of A. > > Is the communication of x done with MPI-3 RMA and utilizes cuda-aware mpi for RMA? 
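For concreteness, the product itself is just MatMult on a distributed matrix; the halo exchange of x behind it goes through a PetscSF, so the backends listed above can be compared from the command line, for example -sf_type window for the MPI-3 RMA path versus the default point-to-point. Below is a minimal sketch of such an experiment (untested, assuming a standard PETSc build; with a CUDA-enabled build one would additionally pass the usual -vec_type cuda / -mat_type aijcusparse options so that GPU-aware MPI comes into play):

#include <petscmat.h>

int main(int argc, char **argv)
{
  Mat      A;
  Vec      x, y;
  PetscInt i, rstart, rend, N = 1000;

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  PetscCall(MatCreate(PETSC_COMM_WORLD, &A));
  PetscCall(MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, N, N));
  PetscCall(MatSetFromOptions(A));
  PetscCall(MatSetUp(A));
  PetscCall(MatGetOwnershipRange(A, &rstart, &rend));
  for (i = rstart; i < rend; i++) {
    PetscCall(MatSetValue(A, i, i, 2.0, INSERT_VALUES));
    /* off-diagonal entries force communication of off-process parts of x */
    if (i + 1 < N) PetscCall(MatSetValue(A, i, i + 1, -1.0, INSERT_VALUES));
    if (i > 0) PetscCall(MatSetValue(A, i, i - 1, -1.0, INSERT_VALUES));
  }
  PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
  PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));
  PetscCall(MatCreateVecs(A, &x, &y));
  PetscCall(VecSet(x, 1.0));
  /* time this with -log_view while switching -sf_type basic / neighbor / window */
  PetscCall(MatMult(A, x, y));
  PetscCall(VecDestroy(&x));
  PetscCall(VecDestroy(&y));
  PetscCall(MatDestroy(&A));
  PetscCall(PetscFinalize());
  return 0;
}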
> > > Best regards, > > > Alexander Maeder From stephan.koehler at math.tu-freiberg.de Sun Dec 10 01:20:20 2023 From: stephan.koehler at math.tu-freiberg.de (=?UTF-8?Q?Stephan_K=C3=B6hler?=) Date: Sun, 10 Dec 2023 08:20:20 +0100 Subject: [petsc-users] Bug Report TaoALMM class In-Reply-To: References: Message-ID: Dear PETSc/Tao team, this is still an open issue andI haven't heard anything else so far that I'm wrong. Kind regards, Stephan K?hler Am 18.07.23 um 02:21 schrieb Matthew Knepley: > Toby and Hansol, > > Has anyone looked at this? > > Thanks, > > Matt > > On Mon, Jun 12, 2023 at 8:24?AM Stephan K?hler < > stephan.koehler at math.tu-freiberg.de> wrote: > >> Dear PETSc/Tao team, >> >> I think there might be a bug in the Tao ALMM class: In the function >> TaoALMMComputeAugLagAndGradient_Private(), see, eg. >> >> https://petsc.org/release/src/tao/constrained/impls/almm/almm.c.html#TAOALMM >> line 648 the gradient seems to be wrong. >> >> The given function and gradient computation is >> Lc = F + Ye^TCe + Yi^T(Ci - S) + 0.5*mu*[Ce^TCe + (Ci - S)^T(Ci - S)], >> dLc/dX = dF/dX + Ye^TAe + Yi^TAi + 0.5*mu*[Ce^TAe + (Ci - S)^TAi], >> >> but I think the gradient should be (without 0.5) >> >> dLc/dX = dF/dX + Ye^TAe + Yi^TAi + mu*[Ce^TAe + (Ci - S)^TAi]. >> >> Kind regards, >> Stephan K?hler >> >> -- >> Stephan K?hler >> TU Bergakademie Freiberg >> Institut f?r numerische Mathematik und Optimierung >> >> Akademiestra?e 6 >> 09599 Freiberg >> Geb?udeteil Mittelbau, Zimmer 2.07 >> >> Telefon: +49 (0)3731 39-3173 (B?ro) >> >> -- Stephan K?hler TU Bergakademie Freiberg Institut f?r numerische Mathematik und Optimierung Akademiestra?e 6 09599 Freiberg Geb?udeteil Mittelbau, Zimmer 2.07 Telefon: +49 (0)3731 39-3188 (B?ro) -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: OpenPGP_0xC9BF2C20DFE9F713.asc Type: application/pgp-keys Size: 758 bytes Desc: OpenPGP public key URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: OpenPGP_signature.asc Type: application/pgp-signature Size: 236 bytes Desc: OpenPGP digital signature URL: From stephan.koehler at math.tu-freiberg.de Sun Dec 10 01:40:56 2023 From: stephan.koehler at math.tu-freiberg.de (=?UTF-8?Q?Stephan_K=C3=B6hler?=) Date: Sun, 10 Dec 2023 08:40:56 +0100 Subject: [petsc-users] Bug report VecNorm Message-ID: <8dd5bc56-2b25-4ad0-ace7-2752b622b858@math.tu-freiberg.de> Dear PETSc/Tao team, there is a bug in the voector interface:? In the function VecNorm, see, eg. https://petsc.org/release/src/vec/vec/interface/rvector.c.html#VecNorm line 197 the check for consistency in line 214 is done on the wrong communicator.? The communicator should be PETSC_COMM_SELF. Otherwise the program may hang when PetscCheck is executed. Please find a minimal example attached. Kind regards, Stephan K?hler -- Stephan K?hler TU Bergakademie Freiberg Institut f?r numerische Mathematik und Optimierung Akademiestra?e 6 09599 Freiberg Geb?udeteil Mittelbau, Zimmer 2.07 Telefon: +49 (0)3731 39-3188 (B?ro) -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: minimal_ex_vec_norm.cpp Type: text/x-c++src Size: 1792 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: OpenPGP_0xC9BF2C20DFE9F713.asc Type: application/pgp-keys Size: 758 bytes Desc: OpenPGP public key URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: OpenPGP_signature.asc Type: application/pgp-signature Size: 236 bytes Desc: OpenPGP digital signature URL: From bsmith at petsc.dev Sun Dec 10 09:00:10 2023 From: bsmith at petsc.dev (Barry Smith) Date: Sun, 10 Dec 2023 10:00:10 -0500 Subject: [petsc-users] Bug report VecNorm In-Reply-To: <8dd5bc56-2b25-4ad0-ace7-2752b622b858@math.tu-freiberg.de> References: <8dd5bc56-2b25-4ad0-ace7-2752b622b858@math.tu-freiberg.de> Message-ID: <394A4E7D-302C-4B51-931D-DE1CBEBA4A61@petsc.dev> I don't fully understand your code and what it is trying to demonstrate, but VecGetArrayWrite is Logically Collective. Having if(rank == 0) { PetscCall(VecGetArrayWrite(vec, &xx)); PetscCall(VecRestoreArrayWrite(vec, &xx)); } is not allowed. The reason is that VecRestoreArrayWrite() changes the PetscObjectState of the vector, and this state must be changed consistently across all MPI processes that share the vector. > On Dec 10, 2023, at 2:40?AM, Stephan K?hler wrote: > > Dear PETSc/Tao team, > > there is a bug in the voector interface: In the function > VecNorm, see, eg. https://petsc.org/release/src/vec/vec/interface/rvector.c.html#VecNorm line 197 the check for consistency in line 214 is done on the wrong communicator. The communicator should be PETSC_COMM_SELF. > Otherwise the program may hang when PetscCheck is executed. > > Please find a minimal example attached. > > > Kind regards, > Stephan K?hler > -- > Stephan K?hler > TU Bergakademie Freiberg > Institut f?r numerische Mathematik und Optimierung > > Akademiestra?e 6 > 09599 Freiberg > Geb?udeteil Mittelbau, Zimmer 2.07 > > Telefon: +49 (0)3731 39-3188 (B?ro) > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sun Dec 10 11:54:02 2023 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 10 Dec 2023 12:54:02 -0500 Subject: [petsc-users] Bug report VecNorm In-Reply-To: <8dd5bc56-2b25-4ad0-ace7-2752b622b858@math.tu-freiberg.de> References: <8dd5bc56-2b25-4ad0-ace7-2752b622b858@math.tu-freiberg.de> Message-ID: On Sun, Dec 10, 2023 at 2:41?AM Stephan K?hler < stephan.koehler at math.tu-freiberg.de> wrote: > Dear PETSc/Tao team, > > there is a bug in the voector interface: In the function > VecNorm, see, eg. > https://petsc.org/release/src/vec/vec/interface/rvector.c.html#VecNorm > line 197 the check for consistency in line 214 is done on the wrong > communicator. The communicator should be PETSC_COMM_SELF. > Otherwise the program may hang when PetscCheck is executed. > > Please find a minimal example attached. > This is entirely right. I will fix it. Thanks, Matt > > > Kind regards, > Stephan K?hler > > -- > Stephan K?hler > TU Bergakademie Freiberg > Institut f?r numerische Mathematik und Optimierung > > Akademiestra?e 6 > 09599 Freiberg > Geb?udeteil Mittelbau, Zimmer 2.07 > > Telefon: +49 (0)3731 39-3188 (B?ro) > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
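Spelled out, the allowed version of that pattern keeps the Get/Restore pair collective and only restricts which entries are modified; each rank touches at most its own locally owned part. A minimal sketch (untested; it uses VecGetArray rather than VecGetArrayWrite, since ranks other than 0 leave their values alone):

#include <petscvec.h>

int main(int argc, char **argv)
{
  Vec          v;
  PetscScalar *a;
  PetscInt     i, rstart, rend;
  PetscMPIInt  rank;
  PetscReal    nrm;

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  PetscCallMPI(MPI_Comm_rank(PETSC_COMM_WORLD, &rank));
  PetscCall(VecCreate(PETSC_COMM_WORLD, &v));
  PetscCall(VecSetSizes(v, PETSC_DECIDE, 100));
  PetscCall(VecSetFromOptions(v));
  PetscCall(VecSet(v, 0.0));

  /* Logically collective: every rank makes the Get/Restore pair, so the
     PetscObjectState is bumped consistently on all processes, even though
     only rank 0 modifies (its own, locally owned) entries. */
  PetscCall(VecGetOwnershipRange(v, &rstart, &rend));
  PetscCall(VecGetArray(v, &a));
  if (rank == 0) {
    for (i = 0; i < rend - rstart; i++) a[i] = (PetscScalar)(rstart + i);
  }
  PetscCall(VecRestoreArray(v, &a));

  PetscCall(VecNorm(v, NORM_2, &nrm));
  PetscCall(PetscPrintf(PETSC_COMM_WORLD, "||v|| = %g\n", (double)nrm));
  PetscCall(VecDestroy(&v));
  PetscCall(PetscFinalize());
  return 0;
}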
URL: From knepley at gmail.com Sun Dec 10 11:57:28 2023 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 10 Dec 2023 12:57:28 -0500 Subject: [petsc-users] Bug report VecNorm In-Reply-To: References: <8dd5bc56-2b25-4ad0-ace7-2752b622b858@math.tu-freiberg.de> Message-ID: On Sun, Dec 10, 2023 at 12:54?PM Matthew Knepley wrote: > On Sun, Dec 10, 2023 at 2:41?AM Stephan K?hler < > stephan.koehler at math.tu-freiberg.de> wrote: > >> Dear PETSc/Tao team, >> >> there is a bug in the voector interface: In the function >> VecNorm, see, eg. >> https://petsc.org/release/src/vec/vec/interface/rvector.c.html#VecNorm >> line 197 the check for consistency in line 214 is done on the wrong >> communicator. The communicator should be PETSC_COMM_SELF. >> Otherwise the program may hang when PetscCheck is executed. >> >> Please find a minimal example attached. >> > > This is entirely right. I will fix it. > Here is the MR. Thanks, Matt > Thanks, > > Matt > > >> >> >> Kind regards, >> Stephan K?hler >> >> -- >> Stephan K?hler >> TU Bergakademie Freiberg >> Institut f?r numerische Mathematik und Optimierung >> >> Akademiestra?e 6 >> 09599 Freiberg >> Geb?udeteil Mittelbau, Zimmer 2.07 >> >> Telefon: +49 (0)3731 39-3188 (B?ro) >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sun Dec 10 11:57:41 2023 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 10 Dec 2023 12:57:41 -0500 Subject: [petsc-users] Bug report VecNorm In-Reply-To: References: <8dd5bc56-2b25-4ad0-ace7-2752b622b858@math.tu-freiberg.de> Message-ID: On Sun, Dec 10, 2023 at 12:57?PM Matthew Knepley wrote: > On Sun, Dec 10, 2023 at 12:54?PM Matthew Knepley > wrote: > >> On Sun, Dec 10, 2023 at 2:41?AM Stephan K?hler < >> stephan.koehler at math.tu-freiberg.de> wrote: >> >>> Dear PETSc/Tao team, >>> >>> there is a bug in the voector interface: In the function >>> VecNorm, see, eg. >>> https://petsc.org/release/src/vec/vec/interface/rvector.c.html#VecNorm >>> line 197 the check for consistency in line 214 is done on the wrong >>> communicator. The communicator should be PETSC_COMM_SELF. >>> Otherwise the program may hang when PetscCheck is executed. >>> >>> Please find a minimal example attached. >>> >> >> This is entirely right. I will fix it. >> > > Here is the MR. > https://gitlab.com/petsc/petsc/-/merge_requests/7102 Thanks, Matt > Thanks, > > Matt > > >> Thanks, >> >> Matt >> >> >>> >>> >>> Kind regards, >>> Stephan K?hler >>> >>> -- >>> Stephan K?hler >>> TU Bergakademie Freiberg >>> Institut f?r numerische Mathematik und Optimierung >>> >>> Akademiestra?e 6 >>> 09599 Freiberg >>> Geb?udeteil Mittelbau, Zimmer 2.07 >>> >>> Telefon: +49 (0)3731 39-3188 (B?ro) >>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. 
>> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre at joliv.et Sun Dec 10 12:47:43 2023 From: pierre at joliv.et (Pierre Jolivet) Date: Sun, 10 Dec 2023 19:47:43 +0100 Subject: [petsc-users] Bug report VecNorm In-Reply-To: <8dd5bc56-2b25-4ad0-ace7-2752b622b858@math.tu-freiberg.de> References: <8dd5bc56-2b25-4ad0-ace7-2752b622b858@math.tu-freiberg.de> Message-ID: > On 10 Dec 2023, at 8:40?AM, Stephan K?hler wrote: > > Dear PETSc/Tao team, > > there is a bug in the voector interface: In the function > VecNorm, see, eg. https://petsc.org/release/src/vec/vec/interface/rvector.c.html#VecNorm line 197 the check for consistency in line 214 is done on the wrong communicator. The communicator should be PETSC_COMM_SELF. > Otherwise the program may hang when PetscCheck is executed. I think the communicator should not be changed, but instead, the check/conditional should be changed, ? la PetscValidLogicalCollectiveBool(). Thanks, Pierre > Please find a minimal example attached. > > > Kind regards, > Stephan K?hler > -- > Stephan K?hler > TU Bergakademie Freiberg > Institut f?r numerische Mathematik und Optimierung > > Akademiestra?e 6 > 09599 Freiberg > Geb?udeteil Mittelbau, Zimmer 2.07 > > Telefon: +49 (0)3731 39-3188 (B?ro) > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Sun Dec 10 19:57:21 2023 From: jed at jedbrown.org (Jed Brown) Date: Sun, 10 Dec 2023 18:57:21 -0700 Subject: [petsc-users] Bug report VecNorm In-Reply-To: References: <8dd5bc56-2b25-4ad0-ace7-2752b622b858@math.tu-freiberg.de> Message-ID: <87v895o60u.fsf@jedbrown.org> Pierre Jolivet writes: >> On 10 Dec 2023, at 8:40?AM, Stephan K?hler wrote: >> >> Dear PETSc/Tao team, >> >> there is a bug in the voector interface: In the function >> VecNorm, see, eg. https://petsc.org/release/src/vec/vec/interface/rvector.c.html#VecNorm line 197 the check for consistency in line 214 is done on the wrong communicator. The communicator should be PETSC_COMM_SELF. >> Otherwise the program may hang when PetscCheck is executed. > > I think the communicator should not be changed, but instead, the check/conditional should be changed, ? la PetscValidLogicalCollectiveBool(). I agree -- it's no extra cost to discover collectively whether all, none, or some have the norm. In this case, it could be a MPI_SUM, in which case the error message could report how many processes took each path. From yc17470 at connect.um.edu.mo Mon Dec 11 03:32:23 2023 From: yc17470 at connect.um.edu.mo (Gong Yujie) Date: Mon, 11 Dec 2023 09:32:23 +0000 Subject: [petsc-users] Question on output vector in vtk file Message-ID: Dear PETSc developers, I have a DMPlex DM with 1 field 1dof. I'd like to output a vector with block size equals to 3. It seems that there is no response using command line option or using some code about PetscViewer. The DM is generated with (note that I'm not using PetscFE for discretization, just for allocate dof.) 
PetscCall(DMPlexCreateExodusFromFile(PETSC_COMM_WORLD,"tube.exo",interpolate,&dm)); PetscCall(PetscFECreateLagrange(PETSC_COMM_SELF,dim,1,PETSC_TRUE,1,PETSC_DETERMINE,&fe)); PetscCall(PetscObjectSetName((PetscObject)fe,"potential_field")); PetscCall(DMSetField(dm,0,NULL,(PetscObject)fe)); PetscCall(DMPlexDistribute(dm,0,&sf,&dmParallel)); The Vector is created using PetscCall(DMCreateGlobalVector(dm,&phi_1)); PetscCall(VecSetLocalToGlobalMapping(phi_1,Itog)); PetscCall(VecGetLocalSize(phi_1,&vec_local_size_test)); PetscCall(VecCreateMPI(PETSC_COMM_WORLD, vec_local_size_test*3, PETSC_DETERMINE, &u_grad_psi)); PetscCall(VecSetBlockSize(u_grad_psi, 3)); PetscCall(VecSetLocalToGlobalMapping(u_grad_psi,Itog)); The output command line option is just --vec_view vtk:test.vtk. The PETSc version I'm using is 3.19.5. Could you please give me some advice? Best Regards, Yujie -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Dec 11 07:03:51 2023 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 11 Dec 2023 08:03:51 -0500 Subject: [petsc-users] Bug report VecNorm In-Reply-To: <394A4E7D-302C-4B51-931D-DE1CBEBA4A61@petsc.dev> References: <8dd5bc56-2b25-4ad0-ace7-2752b622b858@math.tu-freiberg.de> <394A4E7D-302C-4B51-931D-DE1CBEBA4A61@petsc.dev> Message-ID: We already merged the fix. Thanks, Matt On Mon, Dec 11, 2023 at 6:00?AM Barry Smith wrote: > > I don't fully understand your code and what it is trying to > demonstrate, but VecGetArrayWrite is Logically Collective. Having > > if(rank == 0) > { > PetscCall(VecGetArrayWrite(vec, &xx)); > PetscCall(VecRestoreArrayWrite(vec, &xx)); > } > > is not allowed. The reason is that VecRestoreArrayWrite() changes the > PetscObjectState of the vector, and this state must be changed consistently > across all MPI processes that share the vector. > > > > On Dec 10, 2023, at 2:40?AM, Stephan K?hler < > stephan.koehler at math.tu-freiberg.de> wrote: > > Dear PETSc/Tao team, > > there is a bug in the voector interface: In the function > VecNorm, see, eg. > https://petsc.org/release/src/vec/vec/interface/rvector.c.html#VecNorm > line 197 the check for consistency in line 214 is done on the wrong > communicator. The communicator should be PETSC_COMM_SELF. > Otherwise the program may hang when PetscCheck is executed. > > Please find a minimal example attached. > > > Kind regards, > Stephan K?hler > > -- > Stephan K?hler > TU Bergakademie Freiberg > Institut f?r numerische Mathematik und Optimierung > > Akademiestra?e 6 > 09599 Freiberg > Geb?udeteil Mittelbau, Zimmer 2.07 > > Telefon: +49 (0)3731 39-3188 (B?ro) > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Dec 11 07:07:21 2023 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 11 Dec 2023 08:07:21 -0500 Subject: [petsc-users] Question on output vector in vtk file In-Reply-To: References: Message-ID: On Mon, Dec 11, 2023 at 4:32?AM Gong Yujie wrote: > Dear PETSc developers, > > I have a DMPlex DM with 1 field 1dof. I'd like to output a vector with > block size equals to 3. It seems that there is no response using command > line option or using some code about PetscViewer. > I am not sure how we can do this. 
If you only have 1 dof per cell (I assume), how can we have a blocksize of 3? Thanks, Matt > The DM is generated with (note that I'm not using PetscFE for > discretization, just for allocate dof.) > > *PetscCall(DMPlexCreateExodusFromFile(PETSC_COMM_WORLD,"tube.exo",interpolate,&dm));* > > *PetscCall(PetscFECreateLagrange(PETSC_COMM_SELF,dim,1,PETSC_TRUE,1,PETSC_DETERMINE,&fe));* > *PetscCall(PetscObjectSetName((PetscObject)fe,"potential_field"));* > *PetscCall(DMSetField(dm,0,NULL,(PetscObject)fe));* > *PetscCall(DMPlexDistribute(dm,0,&sf,&dmParallel));* > > The Vector is created using > *PetscCall(DMCreateGlobalVector(dm,&phi_1));* > *PetscCall(VecSetLocalToGlobalMapping(phi_1,Itog));* > *PetscCall(VecGetLocalSize(phi_1,&vec_local_size_test));* > *PetscCall(VecCreateMPI(PETSC_COMM_WORLD, vec_local_size_test*3, > PETSC_DETERMINE, &u_grad_psi));* > *PetscCall(VecSetBlockSize(u_grad_psi, 3));* > *PetscCall(VecSetLocalToGlobalMapping(u_grad_psi,Itog));* > > The output command line option is just --vec_view vtk:test.vtk. The PETSc > version I'm using is 3.19.5. > > Could you please give me some advice? > > Best Regards, > Yujie > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From 1807580692 at qq.com Mon Dec 11 01:51:54 2023 From: 1807580692 at qq.com (=?gb18030?B?MTgwNzU4MDY5Mg==?=) Date: Mon, 11 Dec 2023 15:51:54 +0800 Subject: [petsc-users] (no subject) Message-ID: Hello, I have encountered some problems. Here are some of my configurations. OS Version and Type:  Linux daihuanhe-Aspire-A315-55G 5.15.0-89-generic #99~20.04.1-Ubuntu SMP Thu Nov 2 15:16:47 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux PETSc Version: #define PETSC_VERSION_RELEASE    1 #define PETSC_VERSION_MAJOR      3 #define PETSC_VERSION_MINOR      19 #define PETSC_VERSION_SUBMINOR   0 #define PETSC_RELEASE_DATE       "Mar 30, 2023" #define PETSC_VERSION_DATE       "unknown" MPI implementation: MPICH Compiler and version: Gnu C The problem is when I type  ?mpiexec -n 4 ./ex19 -lidvelocity 100 -prandtl 0.72 -grashof 10000 -da_grid_x 64 -da_grid_y 64 -snes_type newtonls -ksp_type gmres -pc_type fieldsplit -pc_fieldsplit_type symmetric_multiplicative -pc_fieldsplit_block_size 4 -pc_fieldsplit_0_fields 0,1,2,3 -pc_fieldsplit_1_fields 0,1,2,3 -fieldsplit_0_pc_type asm -fieldsplit_0_pc_asm_type restrict -fieldsplit_0_pc_asm_overlap 5 -fieldsplit_0_sub_pc_type lu -fieldsplit_1_pc_type asm -fieldsplit_1_pc_asm_type restrict -fieldsplit_1_pc_asm_overlap 5 -fieldsplit_1_sub_pc_type lu  -snes_monitor -snes_converged_reason -fieldsplit_0_ksp_atol 1e-10  -fieldsplit_1_ksp_atol 1e-10  -fieldsplit_0_ksp_rtol 1e-6  -fieldsplit_1_ksp_rtol 1e-6 -fieldsplit_0_snes_atol 1e-10  -fieldsplit_1_snes_atol 1e-10  -fieldsplit_0_snes_rtol 1e-6  -fieldsplit_1_snes_rtol 1e-6? in the command line, where my path is /petsc/src/snes/tutorials. It returns  ?WARNING! There are options you set that were not used! WARNING! could be spelling mistake, etc! There are 8 unused database options. 
They are: Option left: name:-fieldsplit_0_ksp_atol value: 1e-10 source: command line Option left: name:-fieldsplit_0_ksp_rtol value: 1e-6 source: command line Option left: name:-fieldsplit_0_snes_atol value: 1e-10 source: command line Option left: name:-fieldsplit_0_snes_rtol value: 1e-6 source: command line Option left: name:-fieldsplit_1_ksp_atol value: 1e-10 source: command line Option left: name:-fieldsplit_1_ksp_rtol value: 1e-6 source: command line Option left: name:-fieldsplit_1_snes_atol value: 1e-10 source: command line Option left: name:-fieldsplit_1_snes_rtol value: 1e-6 source: command line?. Please tell me what should I do?Thank you very much. 1807580692 1807580692 at qq.com   -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Mon Dec 11 11:00:23 2023 From: bsmith at petsc.dev (Barry Smith) Date: Mon, 11 Dec 2023 12:00:23 -0500 Subject: [petsc-users] (no subject) In-Reply-To: References: Message-ID: The snes options are not relevant since the parts of a PCFIELDSPLIT are always linear problems. By default PCFIELDSPLIT uses a KSP type of preonly on each split (that is it applies the preconditioner exactly once inside the PCApply_FieldSplit() hence the -fieldsplit_*_ksp_ options are not relevent. You can use -fieldsplit_ksp_type gmres for example to have it use gmres on each of the splits, but note that then you should use -ksp_type fgmres since using gmres inside a preconditioner results in a nonlinear preconditioner. You can always run with -ksp_view to see the solver being used and the prefixes that currently make sense. Barry > On Dec 11, 2023, at 2:51?AM, 1807580692 <1807580692 at qq.com> wrote: > > Hello, I have encountered some problems. Here are some of my configurations. > OS Version and Type: Linux daihuanhe-Aspire-A315-55G 5.15.0-89-generic #99~20.04.1-Ubuntu SMP Thu Nov 2 15:16:47 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux > PETSc Version: #define PETSC_VERSION_RELEASE 1 > #define PETSC_VERSION_MAJOR 3 > #define PETSC_VERSION_MINOR 19 > #define PETSC_VERSION_SUBMINOR 0 > #define PETSC_RELEASE_DATE "Mar 30, 2023" > #define PETSC_VERSION_DATE "unknown" > MPI implementation: MPICH > Compiler and version: Gnu C > The problem is when I type > ?mpiexec -n 4 ./ex19 -lidvelocity 100 -prandtl 0.72 -grashof 10000 -da_grid_x 64 -da_grid_y 64 -snes_type newtonls -ksp_type gmres -pc_type fieldsplit -pc_fieldsplit_type symmetric_multiplicative -pc_fieldsplit_block_size 4 -pc_fieldsplit_0_fields 0,1,2,3 -pc_fieldsplit_1_fields 0,1,2,3 -fieldsplit_0_pc_type asm -fieldsplit_0_pc_asm_type restrict -fieldsplit_0_pc_asm_overlap 5 -fieldsplit_0_sub_pc_type lu -fieldsplit_1_pc_type asm -fieldsplit_1_pc_asm_type restrict -fieldsplit_1_pc_asm_overlap 5 -fieldsplit_1_sub_pc_type lu -snes_monitor -snes_converged_reason -fieldsplit_0_ksp_atol 1e-10 -fieldsplit_1_ksp_atol 1e-10 -fieldsplit_0_ksp_rtol 1e-6 -fieldsplit_1_ksp_rtol 1e-6 -fieldsplit_0_snes_atol 1e-10 -fieldsplit_1_snes_atol 1e-10 -fieldsplit_0_snes_rtol 1e-6 -fieldsplit_1_snes_rtol 1e-6? > in the command line, where my path is /petsc/src/snes/tutorials. > > It returns > ?WARNING! There are options you set that were not used! > WARNING! could be spelling mistake, etc! > There are 8 unused database options. 
They are: > Option left: name:-fieldsplit_0_ksp_atol value: 1e-10 source: command line > Option left: name:-fieldsplit_0_ksp_rtol value: 1e-6 source: command line > Option left: name:-fieldsplit_0_snes_atol value: 1e-10 source: command line > Option left: name:-fieldsplit_0_snes_rtol value: 1e-6 source: command line > Option left: name:-fieldsplit_1_ksp_atol value: 1e-10 source: command line > Option left: name:-fieldsplit_1_ksp_rtol value: 1e-6 source: command line > Option left: name:-fieldsplit_1_snes_atol value: 1e-10 source: command line > Option left: name:-fieldsplit_1_snes_rtol value: 1e-6 source: command line?. > Please tell me what should I do?Thank you very much. > > 1807580692 > 1807580692 at qq.com > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Bruce.Palmer at pnnl.gov Tue Dec 12 10:27:42 2023 From: Bruce.Palmer at pnnl.gov (Palmer, Bruce J) Date: Tue, 12 Dec 2023 16:27:42 +0000 Subject: [petsc-users] Fortran Interface Message-ID: Does documentation for the PETSc fortran interface still exist? I looked at the web pages for 3.20 (petsc.org/release) but if you go under the tab C/Fortran API, only descriptions for the C interface are there. Bruce Palmer -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Dec 12 10:31:55 2023 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 12 Dec 2023 11:31:55 -0500 Subject: [petsc-users] Fortran Interface In-Reply-To: References: Message-ID: On Tue, Dec 12, 2023 at 11:27?AM Palmer, Bruce J via petsc-users < petsc-users at mcs.anl.gov> wrote: > Does documentation for the PETSc fortran interface still exist? I looked > at the web pages for 3.20 (petsc.org/release) but if you go under the tab > C/Fortran API, only descriptions for the C interface are there. > I think after the most recent changes, the interface was supposed to be very close to C, so we just document the differences on specific pages, and put the general stuff here: https://petsc.org/release/manual/fortran/ Thanks, Matt > Bruce Palmer > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From Bruce.Palmer at pnnl.gov Tue Dec 12 10:40:12 2023 From: Bruce.Palmer at pnnl.gov (Palmer, Bruce J) Date: Tue, 12 Dec 2023 16:40:12 +0000 Subject: [petsc-users] Fortran Interface In-Reply-To: References: Message-ID: Thanks! It might be useful if there were a link to this page near the top of the C/Fortran API page. Bruce From: Matthew Knepley Date: Tuesday, December 12, 2023 at 8:33 AM To: Palmer, Bruce J Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Fortran Interface Check twice before you click! This email originated from outside PNNL. On Tue, Dec 12, 2023 at 11:27?AM Palmer, Bruce J via petsc-users > wrote: Does documentation for the PETSc fortran interface still exist? I looked at the web pages for 3.20 (petsc.org/release) but if you go under the tab C/Fortran API, only descriptions for the C interface are there. 
I think after the most recent changes, the interface was supposed to be very close to C, so we just document the differences on specific pages, and put the general stuff here: https://petsc.org/release/manual/fortran/ Thanks, Matt Bruce Palmer -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Tue Dec 12 11:07:32 2023 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 12 Dec 2023 12:07:32 -0500 Subject: [petsc-users] Fortran Interface In-Reply-To: References: Message-ID: It is unlikely we will ever be able to maintain full manual pages for Fortran for all routines. But yes, the current pages are C-centric. Do you have any suggestions on what we could add to the current manual pages or how to format them etc that would make them better for Fortran users who are not used to C? A Fortran synopsis as well as the C one, or a single synopsis that is easier for both Fortran and C users to follow? Barry I am not sure it is trivial to automatically generate the Fortran synposis with appropriate use and include information but one could argue that we should. > On Dec 12, 2023, at 11:40?AM, Palmer, Bruce J via petsc-users wrote: > > Thanks! It might be useful if there were a link to this page near the top of the C/Fortran API page. > > Bruce > > From: Matthew Knepley > > Date: Tuesday, December 12, 2023 at 8:33 AM > To: Palmer, Bruce J > > Cc: petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] Fortran Interface > > Check twice before you click! This email originated from outside PNNL. > > On Tue, Dec 12, 2023 at 11:27?AM Palmer, Bruce J via petsc-users > wrote: > Does documentation for the PETSc fortran interface still exist? I looked at the web pages for 3.20 (petsc.org/release ) but if you go under the tab C/Fortran API, only descriptions for the C interface are there. > > I think after the most recent changes, the interface was supposed to be very close to C, so we just document the differences on specific pages, and put the general stuff here: > > https://petsc.org/release/manual/fortran/ > > Thanks, > > Matt > > Bruce Palmer > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From s_g at berkeley.edu Tue Dec 12 11:17:14 2023 From: s_g at berkeley.edu (Sanjay Govindjee) Date: Tue, 12 Dec 2023 09:17:14 -0800 Subject: [petsc-users] Fortran Interface In-Reply-To: References: Message-ID: <0085ad64-2045-44ad-a686-86b5ae5c88a9@berkeley.edu> I agree with Bruce that having a link to https://petsc.org/release/manual/fortran/ at the top of the C/Fortran API page (https://petsc.org/release/manualpages/) would be helpful.?? The C descriptions themselves are 98% of the way there for Fortran users (like myself).? The only time that more information would be help on the manual pages themselves is when there is a strong variance between the C and Fortran usage, but that can not be easily automated. -sanjay On 12/12/23 9:07 AM, Barry Smith wrote: > > ? It is unlikely we will ever be able to maintain full manual pages > for Fortran for all routines. 
But yes, the current pages are C-centric. > > ? Do you have any suggestions on what we could add to the current > manual pages or how to format them etc that would make them better for > Fortran users who are not used to C? ?A Fortran synopsis as well as > the C one, or a single synopsis that is easier for both Fortran and C > users to follow? > > ? Barry > > I am not sure it is trivial to automatically generate the Fortran > synposis with appropriate use and include information but one could > argue that we should. > > > >> On Dec 12, 2023, at 11:40?AM, Palmer, Bruce J via petsc-users >> wrote: >> >> Thanks! It might be useful if there were a link to this page near the >> top of the C/Fortran API page. >> Bruce >> >> *From:*Matthew Knepley >> *Date:*Tuesday, December 12, 2023 at 8:33 AM >> *To:*Palmer, Bruce J >> *Cc:*petsc-users at mcs.anl.gov >> *Subject:*Re: [petsc-users] Fortran Interface >> >> Check twice before you click! This email originated from outside PNNL. >> On Tue, Dec 12, 2023 at 11:27?AM Palmer, Bruce J via petsc-users >> wrote: >> >> Does documentation for the PETSc fortran interface still exist? I >> looked at the web pages for 3.20 (petsc.org/release >> ) but if you go under the tab C/Fortran >> API, only descriptions for the C interface are there. >> >> I think after the most recent changes, the interface was supposed to >> be very close to C, so we just document the differences on specific >> pages, and put the general stuff here: >> https://petsc.org/release/manual/fortran/ >> ? ?Thanks, >> ? ? ?Matt >> >> Bruce Palmer >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which >> their experiments lead. >> -- Norbert Wiener >> https://www.cse.buffalo.edu/~knepley/ >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From onur.notonur at proton.me Tue Dec 12 11:21:31 2023 From: onur.notonur at proton.me (onur.notonur) Date: Tue, 12 Dec 2023 17:21:31 +0000 Subject: [petsc-users] DMPlex "Could not find orientation for quadrilateral" Message-ID: Hi, I hope this email finds you well. I am currently working on importing an OpenFOAM PolyMesh into DMPlex, and I've encountered an issue. The PolyMesh format includes face owner cells/neighbor cells and face-to-vertex connectivity. I was using the "DMPlexCreateFromCellListPetsc()" function, which required cell-to-vertex connectivity. However, when attempting to create the cell connectivity using an edge loop [p_0, p_1, ..., p_7] (p_n and p_(n+1) are valid edges in my mesh), I encountered an error stating, "Could not find orientation for quadrilateral." (Actually at first, I generated the connectivity list by simply creating a cell-to-face list and then using that to create a cell-to-vertex list. (just map over the list and remove duplicates) This created a DMPlex successfully, however, resulted in a mesh that was incorrect when looking with ParaView. I think that was because of I stated wrong edge loop to create cells) I understand that I may need to follow a different format for connectivity, but I'm not sure what that format is. My current mesh is hexahedral, consisting of 8 corner elements(if important). I would appreciate any guidance on a general approach to address this issue. Thank you for your time and assistance. Best, Onur Sent with Proton Mail secure email. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bldenton at buffalo.edu Tue Dec 12 11:22:37 2023 From: bldenton at buffalo.edu (Brandon Denton) Date: Tue, 12 Dec 2023 17:22:37 +0000 Subject: [petsc-users] Applying Natural Boundary Conditions using PETSc FEM Technology Message-ID: Good Afternoon, I am currently working on an Inviscid Navier-Stokes problem and would like to apply DM_BC_NATURAL boundary conditions to my domain. Looking through the example files on petsc.org, I noticed that in almost all cases there are the following series of calls. PetscCall(DMAddBoundary(dm, DM_BC_NATURAL, "wall", label, 1, &id, 0, 0, NULL, NULL, NULL, user, &bd)); PetscCall(PetscDSGetBoundary(ds, bd, &wf, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL)); PetscCall(PetscWeakFormSetIndexBdResidual(wf, label, id, 0, 0, 0, f0_bd_u, 0, NULL)); Is this the standard way of applying Natural boundary conditions in PETSc for FEM? Also, I noticed in the signature for the f0_bd_u function, there is a const PetscReal n[] array. What is this array and what information does it hold. Is it the normal vector at the point? static void f0_bd_u(PetscInt dim, PetscInt Nf, PetscInt NfAux, const PetscInt uOff[], const PetscInt uOff_x[], const PetscScalar u[], const PetscScalar u_t[], const PetscScalar u_x[], const PetscInt aOff[], const PetscInt aOff_x[], const PetscScalar a[], const PetscScalar a_t[], const PetscScalar a_x[], PetscReal t, const PetscReal x[], const PetscReal n[], PetscInt numConstants, const PetscScalar constants[], PetscScalar f0[]) Thank you in advance for your time. Brandon -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmolinos at us.es Tue Dec 12 13:36:01 2023 From: mmolinos at us.es (MIGUEL MOLINOS PEREZ) Date: Tue, 12 Dec 2023 19:36:01 +0000 Subject: [petsc-users] Domain decomposition in PETSc for Molecular Dynamics In-Reply-To: <2BEA961D-00D5-4880-A162-7262E398C048@us.es> References: <2BEA961D-00D5-4880-A162-7262E398C048@us.es> Message-ID: Dear Matthew and Mark, Thank you for four useful guidance. I have taken as a starting point the example in "dm/tutorials/swarm_ex3.c" to build a first approximation for domain decomposition in my molecular dynamics code (diffusive molecular dynamic to be more precise :-) ). And I must say that I am very happy with the result. However, in my journey integrating domain decomposition into my code, I am facing some new (and expected) problems. The first is in the implementation of the nearest neighbor algorithm (list of atoms closest to each atom). My current approach to the problem is a brute force algorithm (double loop through the list of atoms and calculate the distance). However, it seems that if I call the "neighbours" function after the "DMSwarmMigrate" function the search algorithm does not work correctly. My thoughts / hints are: * The two nested for loops should be done on the global indexing of the atoms instead of the local one (I don't know how to get this number). * If I print the mean position of atom #0 (for example) each range prints a different value of the average position. One of them is the correct position corresponding to site #0, the others are different (but identically labeled) atomic sites. Which means that the site_i index is not bijective. I believe that solving this problem will increase my understanding of the domain decomposition approach and may allow me to fix the remaining parts of my code. Any additional comments are greatly appreciated. 
For instance, I will be happy to be pointed to any piece of code (petsc examples for example) with solves a similar problem in order to self-learn learn by example. Many thanks in advance. Best, Miguel This is the piece of code (simplified) which computes the list of neighbours for each atomic site. DMD is a structure which contains the atomistic information (DMSWARM), and the background mesh and bounding cell (DMDA and DMShell) int neighbours(DMD* Simulation) { PetscFunctionBegin; PetscCallMPI(MPI_Comm_rank(PETSC_COMM_WORLD, &rank)); PetscCall(DMSwarmGetLocalSize(Simulation->atomistic_data, &n_atoms_local)); //! Get array with the mean position of the atoms DMSwarmGetField(Simulation->atomistic_data, DMSwarmPICField_coor, &blocksize, NULL, (void**)&mean_q_ptr); Eigen::Map mean_q(mean_q_ptr, n_atoms_local, dim); int* neigh = Simulation->neigh; int* numneigh = Simulation->numneigh; for (unsigned int site_i = 0; site_i < n_atoms_local; site_i++) { //! Get mean position of site i Eigen::Vector3d mean_q_i = mean_q.block<1, 3>(site_i, 0); //! Search neighbourhs in the main cell (+ periodic cells) for (unsigned site_j = 0; site_j < n_atoms_local; site_j++) { if (site_i != site_j) { //! Get mean position of site j in the periodic box Eigen::Vector3d mean_q_j = mean_q.block<1, 3>(site_j, 0); //! Check is site j is the neibourhood of the site i double norm_r_ij = (mean_q_i - mean_q_j).norm(); if ((norm_r_ij <= r_cutoff_ADP) && (numneigh[site_i] < maxneigh)) { neigh[site_i * maxneigh + numneigh[site_i]] = site_j; numneigh[site_i] += 1; } } } } // MPI for loop (site_i) DMSwarmRestoreField(Simulation->atomistic_data, DMSwarmPICField_coor, &blocksize, NULL, (void**)&mean_q_ptr); return EXIT_SUCCESS; } This is the piece of code that I use to read the atomic positions (mean_q) from a file: //! @brief mean_q: Mean value of each atomic position double* mean_q; PetscCall(DMSwarmGetField(atomistic_data, DMSwarmPICField_coor, &blocksize, NULL, (void**)&mean_q)); cnt = 0; for (PetscInt site_i = 0; site_i < n_atoms_local; site_i++) { if (cnt < n_atoms) { mean_q[blocksize * cnt + 0] = Simulation_file.mean_q[cnt * dim + 0]; mean_q[blocksize * cnt + 1] = Simulation_file.mean_q[cnt * dim + 1]; mean_q[blocksize * cnt + 2] = Simulation_file.mean_q[cnt * dim + 2]; cnt++; } } PetscCall(DMSwarmRestoreField(atomistic_data, DMSwarmPICField_coor, &blocksize, NULL, (void**)&mean_q)); [Screenshot 2023-12-12 at 19.42.13.png] On 4 Nov 2023, at 15:50, MIGUEL MOLINOS PEREZ wrote: ?Thank you Mark! I will have a look to it. Best, Miguel On 4 Nov 2023, at 13:54, Matthew Knepley wrote: ? On Sat, Nov 4, 2023 at 8:40?AM Mark Adams > wrote: Hi MIGUEL, This might be a good place to start: https://petsc.org/main/manual/vec/ Feel free to ask more specific questions, but the docs are a good place to start. Thanks, Mark On Fri, Nov 3, 2023 at 5:19?AM MIGUEL MOLINOS PEREZ > wrote: Dear all, I am currently working on the development of a in-house molecular dynamics code using PETSc and C++. So far the code works great, however it is a little bit slow since I am not exploiting MPI for PETSc vectors. I was wondering if there is a way to perform the domain decomposition efficiently using some PETSc functionality. Any feedback is highly appreciated. It sounds like you mean "is there a way to specify a communication construct that can send my particle information automatically". We use PetscSF for that. You can see how this works with the DMSwarm class, which represents a particle discretization. 
You can either use that, or if it does not work for you, do the same things with your class. Thanks, Matt Best regards, Miguel -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Screenshot 2023-12-12 at 19.42.13.png Type: image/png Size: 1252024 bytes Desc: Screenshot 2023-12-12 at 19.42.13.png URL: From Bruce.Palmer at pnnl.gov Tue Dec 12 13:54:57 2023 From: Bruce.Palmer at pnnl.gov (Palmer, Bruce J) Date: Tue, 12 Dec 2023 19:54:57 +0000 Subject: [petsc-users] Fortran Interface In-Reply-To: References: Message-ID: I think having a link to your fortran interface page on the C/Fortran API tab is probably sufficient, particularly if the interfaces are similar. If functions have significant differences between C and Fortran, it would be helpful if the notes about it are on the page describing the function. I?m the project lead for Global Arrays and we wrote our API documentation in LaTeX. Each function has C and Fortran-specific documentation as well as some generic documentation that can apply to either interface. We run the tex files through a preprocessor that filters out just the C or Fortran-specific text to build the documentation for the C or Fortran API. It sorta works, but it is a fair amount of effort to keep everything synched up and we have a lot fewer functions in our API than you do. The one advantage is that everything about a particular function is located in one spot, so it makes it relatively easy to fix everything up if you make changes. Bruce From: Barry Smith Date: Tuesday, December 12, 2023 at 9:07 AM To: Palmer, Bruce J Cc: Matthew Knepley , petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Fortran Interface It is unlikely we will ever be able to maintain full manual pages for Fortran for all routines. But yes, the current pages are C-centric. Do you have any suggestions on what we could add to the current manual pages or how to format them etc that would make them better for Fortran users who are not used to C? A Fortran synopsis as well as the C one, or a single synopsis that is easier for both Fortran and C users to follow? Barry I am not sure it is trivial to automatically generate the Fortran synposis with appropriate use and include information but one could argue that we should. On Dec 12, 2023, at 11:40?AM, Palmer, Bruce J via petsc-users wrote: Thanks! It might be useful if there were a link to this page near the top of the C/Fortran API page. Bruce From: Matthew Knepley > Date: Tuesday, December 12, 2023 at 8:33 AM To: Palmer, Bruce J > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Fortran Interface Check twice before you click! This email originated from outside PNNL. On Tue, Dec 12, 2023 at 11:27?AM Palmer, Bruce J via petsc-users > wrote: Does documentation for the PETSc fortran interface still exist? I looked at the web pages for 3.20 (petsc.org/release) but if you go under the tab C/Fortran API, only descriptions for the C interface are there. 
I think after the most recent changes, the interface was supposed to be very close to C, so we just document the differences on specific pages, and put the general stuff here: https://petsc.org/release/manual/fortran/ Thanks, Matt Bruce Palmer -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From Bruce.Palmer at pnnl.gov Tue Dec 12 14:22:59 2023 From: Bruce.Palmer at pnnl.gov (Palmer, Bruce J) Date: Tue, 12 Dec 2023 20:22:59 +0000 Subject: [petsc-users] Fortran Interface In-Reply-To: References: Message-ID: What do you do with something like a void pointer? I?m looking at the TaoSetObjectiveAndGradient function and it wants to pass a void *ctx pointer. You can set this to null, but apparently you have to specify the type. What type should I use? Is there something called PETSC_NULL_VOID or PETSC_NULL_CONTEXT or do I use something else? From: Matthew Knepley Date: Tuesday, December 12, 2023 at 8:33 AM To: Palmer, Bruce J Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Fortran Interface Check twice before you click! This email originated from outside PNNL. On Tue, Dec 12, 2023 at 11:27?AM Palmer, Bruce J via petsc-users > wrote: Does documentation for the PETSc fortran interface still exist? I looked at the web pages for 3.20 (petsc.org/release) but if you go under the tab C/Fortran API, only descriptions for the C interface are there. I think after the most recent changes, the interface was supposed to be very close to C, so we just document the differences on specific pages, and put the general stuff here: https://petsc.org/release/manual/fortran/ Thanks, Matt Bruce Palmer -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From rlmackie862 at gmail.com Tue Dec 12 16:06:10 2023 From: rlmackie862 at gmail.com (Randall Mackie) Date: Tue, 12 Dec 2023 14:06:10 -0800 Subject: [petsc-users] valgrind errors Message-ID: It now seems to me that petsc+mpich is no longer valgrind clean, or I am doing something wrong. 
A simple program: Program test #include "petsc/finclude/petscsys.h" use petscsys PetscInt :: ierr call PetscInitialize(PETSC_NULL_CHARACTER,ierr) call PetscFinalize(ierr) end program test PETSc compiled in debug mode, complex scalars, and download-mpich, when run with valgrind generates errors like these: ==3997== Syscall param writev(vector[...]) points to uninitialised byte(s) ==3997== at 0x8C31867: writev (writev.c:26) ==3997== by 0x9C20DE4: MPL_large_writev (mpl_sock.c:31) ==3997== by 0x9BF1050: MPIDI_CH3I_Sock_writev (sock.c:2689) ==3997== by 0x9BF9812: MPIDI_CH3_iStartMsgv (ch3_istartmsgv.c:92) ==3997== by 0x9BA7790: MPIDI_CH3_EagerContigSend (ch3u_eager.c:191) ==3997== by 0x9BCA7EC: MPID_Send (mpid_send.c:132) ==3997== by 0x9BCAC64: MPID_Send_coll (mpid_send.c:206) ==3997== by 0x9A2AC7C: MPIC_Send (helper_fns.c:126) ==3997== by 0x993A645: MPIR_Bcast_intra_binomial (bcast_intra_binomial.c:146) ==3997== by 0x99FF64A: MPIR_Bcast_allcomm_auto (mpir_coll.c:323) ==3997== by 0x99FFC06: MPIR_Bcast_impl (mpir_coll.c:420) ==3997== by 0x99FCF86: MPID_Bcast (mpid_coll.h:30) ==3997== by 0x99FFE13: MPIR_Bcast (mpir_coll.c:465) ==3997== by 0x974A513: internal_Bcast (bcast.c:93) ==3997== by 0x974A72B: PMPI_Bcast (bcast.c:143) ==3997== by 0x4B8D6DB: PETScParseFortranArgs_Private (zstart.c:182) ==3997== by 0x4B8DDFA: PetscInitFortran_Private (zstart.c:200) ==3997== by 0x4B34931: PetscInitialize_Common (pinit.c:974) ==3997== by 0x4B8E8C7: petscinitializef_ (zstart.c:284) ==3997== by 0x4959434: __petscsys_MOD_petscinitializenohelp (petscsysmod.F90:374) ==3997== Address 0x1ffeffcac0 is on thread 1's stack ==3997== in frame #4, created by MPIDI_CH3_EagerContigSend (ch3u_eager.c:160) ==3997== Uninitialised value was created by a stack allocation ==3997== at 0x9BA7601: MPIDI_CH3_EagerContigSend (ch3u_eager.c:160) ==3997== ==3997== Syscall param write(buf) points to uninitialised byte(s) ==3997== at 0x8C2B697: write (write.c:26) ==3997== by 0x9BF0F1D: MPIDI_CH3I_Sock_write (sock.c:2614) ==3997== by 0x9BF7AAE: MPIDI_CH3_iStartMsg (ch3_istartmsg.c:68) ==3997== by 0x9BA7A27: MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:262) ==3997== by 0x9BCA766: MPID_Send (mpid_send.c:119) ==3997== by 0x9BCAC64: MPID_Send_coll (mpid_send.c:206) ==3997== by 0x9A2AC7C: MPIC_Send (helper_fns.c:126) ==3997== by 0x993A645: MPIR_Bcast_intra_binomial (bcast_intra_binomial.c:146) ==3997== by 0x99FF64A: MPIR_Bcast_allcomm_auto (mpir_coll.c:323) ==3997== by 0x99FFC06: MPIR_Bcast_impl (mpir_coll.c:420) ==3997== by 0x99FCF86: MPID_Bcast (mpid_coll.h:30) ==3997== by 0x99FFE13: MPIR_Bcast (mpir_coll.c:465) ==3997== by 0x974A513: internal_Bcast (bcast.c:93) ==3997== by 0x974A72B: PMPI_Bcast (bcast.c:143) ==3997== by 0x4DB95A2: PetscOptionsGetenv (pdisplay.c:61) ==3997== by 0x4E0D745: PetscStrreplace (str.c:572) ==3997== by 0x4AC8DEA: PetscOptionsFilename (options.c:416) ==3997== by 0x4ACF0B5: PetscOptionsInsertFile (options.c:632) ==3997== by 0x4AD3CB5: PetscOptionsInsert (options.c:861) ==3997== by 0x4B8E0EF: PetscInitFortran_Private (zstart.c:206) ==3997== Address 0x1ffeff7998 is on thread 1's stack ==3997== in frame #3, created by MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:223) ==3997== Uninitialised value was created by a stack allocation ==3997== at 0x9BA788F: MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:223) ==3997== Is this a known issue or am I doing something wrong? Thanks, Randy -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From s.roongta at mpie.de Tue Dec 12 16:49:48 2023 From: s.roongta at mpie.de (Sharan Roongta) Date: Tue, 12 Dec 2023 23:49:48 +0100 Subject: [petsc-users] difference in Face Sets in latest petsc release Message-ID: <3518806234-6560@xmail1.mpie.de> Hello, I see discrepancy in the size/value of the 'Face Sets' printed in the current release v3.20.2 , and v3.18.6 Attached is the .msh file -dm_view with v3.18.6 DM Object: Generated Mesh 1 MPI process ? type: plex Generated Mesh in 3 dimensions: ? Number of 0-cells per rank: 14 ? Number of 1-cells per rank: 49 ? Number of 2-cells per rank: 60 ? Number of 3-cells per rank: 24 Labels: ? celltype: 4 strata with value/size (0 (14), 6 (24), 3 (60), 1 (49)) ? depth: 4 strata with value/size (0 (14), 1 (49), 2 (60), 3 (24)) ? Cell Sets: 1 strata with value/size (1 (24)) ? Face Sets: 5 strata with value/size (1 (4), 2 (4), 3 (4), 4 (4), 5 (4)) -dm_view with the current release (commit?4b9a870af96) DM Object: Generated Mesh 1 MPI process ? type: plex Generated Mesh in 3 dimensions: ? Number of 0-cells per rank: 14 ? Number of 1-cells per rank: 49 ? Number of 2-cells per rank: 60 ? Number of 3-cells per rank: 24 Labels: ? celltype: 4 strata with value/size (0 (14), 6 (24), 3 (60), 1 (49)) ? depth: 4 strata with value/size (0 (14), 1 (49), 2 (60), 3 (24)) ? Cell Sets: 1 strata with value/size (1 (24)) ? Face Sets: 12 strata with value/size (1 (5), 2 (5), 3 (5), 4 (5), 5 (5), 6 (1), 7 (1), 8 (1), 9 (1), 10 (1), 11 (1), 12 (1)) I believe the older version printed the correct thing??Has something changed in the interpretation of Face Sets? Thanks, Sharan Group - Theory & Simulation Department of Microstructure Physics & Alloy Design ------------------------------------------------- Stay up to date and follow us on LinkedIn, Twitter and YouTube. Max-Planck-Institut f?r Eisenforschung GmbH Max-Planck-Stra?e 1 D-40237 D?sseldorf Handelsregister B 2533 Amtsgericht D?sseldorf Gesch?ftsf?hrung Prof. Dr. Gerhard Dehm Prof. Dr. J?rg Neugebauer Prof. Dr. Dierk Raabe Dr. Kai de Weldige Ust.-Id.-Nr.: DE 11 93 58 514 Steuernummer: 105 5891 1000 Please consider that invitations and e-mails of our institute are only valid if they end with ?@mpie.de. If you are not sure of the validity please contact rco at mpie.de Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de ------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: m1.tar.xz Type: application/x-xz Size: 844 bytes Desc: not available URL: From knepley at gmail.com Tue Dec 12 17:02:14 2023 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 12 Dec 2023 18:02:14 -0500 Subject: [petsc-users] Applying Natural Boundary Conditions using PETSc FEM Technology In-Reply-To: References: Message-ID: On Tue, Dec 12, 2023 at 12:23?PM Brandon Denton via petsc-users < petsc-users at mcs.anl.gov> wrote: > Good Afternoon, > > I am currently working on an Inviscid Navier-Stokes problem and would like > to apply DM_BC_NATURAL boundary conditions to my domain. Looking through > the example files on petsc.org, I noticed that in almost all cases there > are the following series of calls. 
> > PetscCall(DMAddBoundary(dm, DM_BC_NATURAL, "wall", label, 1, &id, 0, 0, > NULL, NULL, NULL, user, &bd)); > PetscCall(PetscDSGetBoundary(ds, bd, &wf, NULL, NULL, NULL, NULL, NULL, > NULL, NULL, NULL, NULL, NULL, NULL)); > PetscCall(PetscWeakFormSetIndexBdResidual(wf, label, id, 0, 0, 0, f0_bd_u, > 0, NULL)); > > Is this the standard way of applying Natural boundary conditions in PETSc > for FEM? > Yes. The problem is that AddBoundary was designed just to deliver boundary values, but inhomogeneous Neumann conditions really want weak forms, and the weak form interface came later. It is a little clunky. > Also, I noticed in the signature for the f0_bd_u function, there is a > const PetscReal n[] array. What is this array and what information does it > hold. Is it the normal vector at the point? > That is the normal at the evaluation point. Thanks, Matt > static void f0_bd_u(PetscInt dim, PetscInt Nf, PetscInt NfAux, const > PetscInt uOff[], const PetscInt uOff_x[], const PetscScalar u[], const > PetscScalar u_t[], const PetscScalar u_x[], const PetscInt aOff[], const > PetscInt aOff_x[], const PetscScalar a[], const PetscScalar a_t[], const > PetscScalar a_x[], PetscReal t, const PetscReal x[], const PetscReal n[], > PetscInt numConstants, const PetscScalar constants[], PetscScalar f0[]) > > Thank you in advance for your time. > Brandon > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Dec 12 17:16:40 2023 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 12 Dec 2023 18:16:40 -0500 Subject: [petsc-users] DMPlex "Could not find orientation for quadrilateral" In-Reply-To: References: Message-ID: On Tue, Dec 12, 2023 at 12:22?PM onur.notonur via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hi, > > I hope this email finds you well. I am currently working on importing an > OpenFOAM PolyMesh into DMPlex, and I've encountered an issue. The PolyMesh > format includes face owner cells/neighbor cells and face-to-vertex > connectivity. I was using the "DMPlexCreateFromCellListPetsc()" function, > which required cell-to-vertex connectivity. However, when attempting to > create the cell connectivity using an edge loop [p_0, p_1, ..., p_7] (p_n > and p_(n+1) are valid edges in my mesh), I encountered an error stating, > "Could not find orientation for quadrilateral." > > (Actually at first, I generated the connectivity list by simply creating a > cell-to-face list and then using that to create a cell-to-vertex list. > (just map over the list and remove duplicates) This created a DMPlex > successfully, however, resulted in a mesh that was incorrect when looking > with ParaView. I think that was because of I stated wrong edge loop to > create cells) > > I understand that I may need to follow a different format for > connectivity, but I'm not sure what that format is. My current mesh is > hexahedral, consisting of 8 corner elements(if important). I would > appreciate any guidance on a general approach to address this issue. > Can you start by giving the PolyMesh format, or some URL with it documented? Thanks, Matt > Thank you for your time and assistance. > Best, > Onur > Sent with Proton Mail secure email. 
> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Dec 12 17:51:46 2023 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 12 Dec 2023 18:51:46 -0500 Subject: [petsc-users] difference in Face Sets in latest petsc release In-Reply-To: <3518806234-6560@xmail1.mpie.de> References: <3518806234-6560@xmail1.mpie.de> Message-ID: On Tue, Dec 12, 2023 at 5:50?PM Sharan Roongta wrote: > Hello, > > I see discrepancy in the size/value of the 'Face Sets' printed in the > current release v3.20.2 , and v3.18.6 > > Attached is the .msh file > > -dm_view with v3.18.6 > DM Object: Generated Mesh 1 MPI process > type: plex > Generated Mesh in 3 dimensions: > Number of 0-cells per rank: 14 > Number of 1-cells per rank: 49 > Number of 2-cells per rank: 60 > Number of 3-cells per rank: 24 > Labels: > celltype: 4 strata with value/size (0 (14), 6 (24), 3 (60), 1 (49)) > depth: 4 strata with value/size (0 (14), 1 (49), 2 (60), 3 (24)) > Cell Sets: 1 strata with value/size (1 (24)) > Face Sets: 5 strata with value/size (1 (4), 2 (4), 3 (4), 4 (4), 5 (4)) > > > -dm_view with the current release (commit 4b9a870af96) > > DM Object: Generated Mesh 1 MPI process > type: plex > Generated Mesh in 3 dimensions: > Number of 0-cells per rank: 14 > Number of 1-cells per rank: 49 > Number of 2-cells per rank: 60 > Number of 3-cells per rank: 24 > Labels: > celltype: 4 strata with value/size (0 (14), 6 (24), 3 (60), 1 (49)) > depth: 4 strata with value/size (0 (14), 1 (49), 2 (60), 3 (24)) > Cell Sets: 1 strata with value/size (1 (24)) > Face Sets: 12 strata with value/size (1 (5), 2 (5), 3 (5), 4 (5), 5 (5), > 6 (1), 7 (1), 8 (1), 9 (1), 10 (1), 11 (1), 12 (1)) > > I believe the older version printed the correct thing? Has something > changed in the interpretation of Face Sets? > Yes. In the older version, I was only labeling cells, faces, and vertices. There were complaints, so I put in the edge labels. If you check, all the additional labels are on edges, and checking your .msh file, those edges clearly have those labels. Thanks, Matt > Thanks, > Sharan > > *Group - Theory & Simulation* > *Department of Microstructure Physics & Alloy Design* > > > ------------------------------ > ------------------------------------------------- > Stay up to date and follow us on LinkedIn, Twitter and YouTube. > > Max-Planck-Institut f?r Eisenforschung GmbH > Max-Planck-Stra?e 1 > D-40237 D?sseldorf > > Handelsregister B 2533 > Amtsgericht D?sseldorf > > Gesch?ftsf?hrung > Prof. Dr. Gerhard Dehm > Prof. Dr. J?rg Neugebauer > Prof. Dr. Dierk Raabe > Dr. Kai de Weldige > > Ust.-Id.-Nr.: DE 11 93 58 514 > Steuernummer: 105 5891 1000 > > > Please consider that invitations and e-mails of our institute are > only valid if they end with ?@mpie.de. > If you are not sure of the validity please contact rco at mpie.de > > Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails > aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. > In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de > ------------------------------------------------- > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Dec 12 18:01:13 2023 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 12 Dec 2023 19:01:13 -0500 Subject: [petsc-users] Domain decomposition in PETSc for Molecular Dynamics In-Reply-To: References: <2BEA961D-00D5-4880-A162-7262E398C048@us.es> Message-ID: On Tue, Dec 12, 2023 at 2:36?PM MIGUEL MOLINOS PEREZ wrote: > Dear Matthew and Mark, > > Thank you for four useful guidance. I have taken as a starting point the > example in "dm/tutorials/swarm_ex3.c" to build a first approximation for > domain decomposition in my molecular dynamics code (diffusive molecular > dynamic to be more precise :-) ). And I must say that I am very happy with > the result. However, in my journey integrating domain decomposition into my > code, I am facing some new (and expected) problems. The first is in the > implementation of the nearest neighbor algorithm (list of atoms closest to > each atom). > Can you help me understand this? For a given atom, there should be a single "closest" atom (barring degeneracies in distance). What do you mean by the list of closest atoms? Thanks, Matt > My current approach to the problem is a brute force algorithm (double loop > through the list of atoms and calculate the distance). However, it seems > that if I call the "neighbours" function after the "DMSwarmMigrate" > function the search algorithm does not work correctly. My thoughts / hints > are: > > - The two nested for loops should be done on the global indexing of > the atoms instead of the local one (I don't know how to get this number). > - If I print the mean position of atom #0 (for example) each range > prints a different value of the average position. One of them is the > correct position corresponding to site #0, the others are different (but > identically labeled) atomic sites. Which means that the site_i index is not > bijective. > > > I believe that solving this problem will increase my understanding of the > domain decomposition approach and may allow me to fix the remaining parts > of my code. > > Any additional comments are greatly appreciated. For instance, I will be > happy to be pointed to any piece of code (petsc examples for example) with > solves a similar problem in order to self-learn learn by example. > > Many thanks in advance. > > Best, > Miguel > > This is the piece of code (simplified) which computes the list of > neighbours for each atomic site. DMD is a structure which contains the > atomistic information (DMSWARM), and the background mesh and bounding > cell (DMDA and DMShell) > > int neighbours(DMD* Simulation) { > > PetscFunctionBegin; > PetscCallMPI(MPI_Comm_rank(PETSC_COMM_WORLD, &rank)); > > PetscCall(DMSwarmGetLocalSize(Simulation->atomistic_data, &n_atoms_local > )); > > //! Get array with the mean position of the atoms > DMSwarmGetField(Simulation->atomistic_data, DMSwarmPICField_coor, & > blocksize, NULL, > (void**)&mean_q_ptr); > Eigen::Map mean_q(mean_q_ptr, n_atoms_local, dim); > > int* neigh = Simulation->neigh; > int* numneigh = Simulation->numneigh; > > for (unsigned int site_i = 0; site_i < n_atoms_local; site_i++) { > > //! Get mean position of site i > Eigen::Vector3d mean_q_i = mean_q.block<1, 3>(site_i, 0); > > //! Search neighbourhs in the main cell (+ periodic cells) > for (unsigned site_j = 0; site_j < n_atoms_local; site_j++) { > if (site_i != site_j) { > //! 
Get mean position of site j in the periodic box > Eigen::Vector3d mean_q_j = mean_q.block<1, 3>(site_j, 0); > > //! Check is site j is the neibourhood of the site i > double norm_r_ij = (mean_q_i - mean_q_j).norm(); > if ((norm_r_ij <= r_cutoff_ADP) && (numneigh[site_i] < maxneigh)) { > neigh[site_i * maxneigh + numneigh[site_i]] = site_j; > numneigh[site_i] += 1; > } > } > } > > } // MPI for loop (site_i) > > DMSwarmRestoreField(Simulation->atomistic_data, DMSwarmPICField_coor, & > blocksize, > NULL, (void**)&mean_q_ptr); > > return EXIT_SUCCESS; > } > > > This is the piece of code that I use to read the atomic positions (mean_q) > from a file: > //! @brief mean_q: Mean value of each atomic position > double* mean_q; > PetscCall(DMSwarmGetField(atomistic_data, DMSwarmPICField_coor, &blocksize > , > NULL, (void**)&mean_q)); > > cnt = 0; > for (PetscInt site_i = 0; site_i < n_atoms_local; site_i++) { > if (cnt < n_atoms) { > mean_q[blocksize * cnt + 0] = Simulation_file.mean_q[cnt * dim + 0]; > mean_q[blocksize * cnt + 1] = Simulation_file.mean_q[cnt * dim + 1]; > mean_q[blocksize * cnt + 2] = Simulation_file.mean_q[cnt * dim + 2]; > > cnt++; > } > } > PetscCall(DMSwarmRestoreField(atomistic_data, DMSwarmPICField_coor, > &blocksize, NULL, (void**)&mean_q)); > > > > [image: Screenshot 2023-12-12 at 19.42.13.png] > > On 4 Nov 2023, at 15:50, MIGUEL MOLINOS PEREZ wrote: > > ?Thank you Mark! I will have a look to it. > > Best, > Miguel > > > On 4 Nov 2023, at 13:54, Matthew Knepley wrote: > > ? > On Sat, Nov 4, 2023 at 8:40?AM Mark Adams wrote: > >> Hi MIGUEL, >> >> This might be a good place to start: https://petsc.org/main/manual/vec/ >> Feel free to ask more specific questions, but the docs are a good place >> to start. >> >> Thanks, >> Mark >> >> On Fri, Nov 3, 2023 at 5:19?AM MIGUEL MOLINOS PEREZ >> wrote: >> >>> Dear all, >>> >>> I am currently working on the development of a in-house molecular >>> dynamics code using PETSc and C++. So far the code works great, however it >>> is a little bit slow since I am not exploiting MPI for PETSc vectors. I was >>> wondering if there is a way to perform the domain decomposition efficiently >>> using some PETSc functionality. Any feedback is highly appreciated. >>> >> > It sounds like you mean "is there a way to specify a communication > construct that can send my particle > information automatically". We use PetscSF for that. You can see how this > works with the DMSwarm class, which represents a particle discretization. > You can either use that, or if it does not work for you, do the same things > with your class. > > Thanks, > > Matt > > >> Best regards, >>> Miguel >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: Screenshot 2023-12-12 at 19.42.13.png Type: image/png Size: 1252024 bytes Desc: not available URL: From s.roongta at mpie.de Tue Dec 12 18:13:02 2023 From: s.roongta at mpie.de (Sharan Roongta) Date: Wed, 13 Dec 2023 01:13:02 +0100 Subject: [petsc-users] difference in Face Sets in latest petsc release In-Reply-To: Message-ID: <3523697218-10216@xmail1.mpie.de> Thanks for the clarification. However, would it be more consistent to differentiate between 0D (vertex sets), 1D (edge sets), 2d (faces) and 3D (cell sets)?? If I want to now apply boundary condition on a face with tag 1, it would contain the 4 edges making up that face and an additional edge with the same physical tag?? Basically, I can?t differentiate between the two entities Thanks, Sharan? Group - Theory & Simulation Department of Microstructure Physics & Alloy Design From: Matthew Knepley To: Sharan Roongta Cc: "petsc-users at mcs.anl.gov" Sent: 13/12/2023 12:51 AM Subject: Re: [petsc-users] difference in Face Sets in latest petsc release On Tue, Dec 12, 2023 at 5:50?PM Sharan Roongta wrote: Hello, I see discrepancy in the size/value of the 'Face Sets' printed in the current release v3.20.2 , and v3.18.6 Attached is the .msh file -dm_view with v3.18.6 DM Object: Generated Mesh 1 MPI process ? type: plex Generated Mesh in 3 dimensions: ? Number of 0-cells per rank: 14 ? Number of 1-cells per rank: 49 ? Number of 2-cells per rank: 60 ? Number of 3-cells per rank: 24 Labels: ? celltype: 4 strata with value/size (0 (14), 6 (24), 3 (60), 1 (49)) ? depth: 4 strata with value/size (0 (14), 1 (49), 2 (60), 3 (24)) ? Cell Sets: 1 strata with value/size (1 (24)) ? Face Sets: 5 strata with value/size (1 (4), 2 (4), 3 (4), 4 (4), 5 (4)) -dm_view with the current release (commit?4b9a870af96) DM Object: Generated Mesh 1 MPI process ? type: plex Generated Mesh in 3 dimensions: ? Number of 0-cells per rank: 14 ? Number of 1-cells per rank: 49 ? Number of 2-cells per rank: 60 ? Number of 3-cells per rank: 24 Labels: ? celltype: 4 strata with value/size (0 (14), 6 (24), 3 (60), 1 (49)) ? depth: 4 strata with value/size (0 (14), 1 (49), 2 (60), 3 (24)) ? Cell Sets: 1 strata with value/size (1 (24)) ? Face Sets: 12 strata with value/size (1 (5), 2 (5), 3 (5), 4 (5), 5 (5), 6 (1), 7 (1), 8 (1), 9 (1), 10 (1), 11 (1), 12 (1)) I believe the older version printed the correct thing??Has something changed in the interpretation of Face Sets? Yes. In the older version, I was only labeling cells, faces, and vertices. There were complaints, so I put in the edge labels. If you check, all the additional labels are on edges, and checking your .msh file, those edges clearly have those labels. ? Thanks, ? ? ?Matt ? Thanks, Sharan Group - Theory & Simulation Department of Microstructure Physics & Alloy Design ---------------- ------------------------------------------------- Stay?up?to?date?and?follow?us?on?LinkedIn,?Twitter?and?YouTube. Max-Planck-Institut?f?r?Eisenforschung?GmbH Max-Planck-Stra?e?1 D-40237?D?sseldorf ? Handelsregister?B?2533? Amtsgericht?D?sseldorf ? Gesch?ftsf?hrung Prof.?Dr.?Gerhard?Dehm Prof.?Dr.?J?rg?Neugebauer Prof.?Dr.?Dierk?Raabe Dr.?Kai?de?Weldige ? Ust.-Id.-Nr.:?DE?11?93?58?514? Steuernummer:?105?5891?1000 Please?consider?that?invitations?and?e-mails?of?our?institute?are? only?valid?if?they?end?with??@mpie.de.? If?you?are?not?sure?of?the?validity?please?contact?rco at mpie.de Bitte?beachten?Sie,?dass?Einladungen?zu?Veranstaltungen?und?E-Mails aus?unserem?Haus?nur?mit?der?Endung??@mpie.de?g?ltig?sind.? 
In?Zweifelsf?llen?wenden?Sie?sich?bitte?an?rco at mpie.de ------------------------------------------------- -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ ------------------------------------------------- Stay up to date and follow us on LinkedIn, Twitter and YouTube. Max-Planck-Institut f?r Eisenforschung GmbH Max-Planck-Stra?e 1 D-40237 D?sseldorf Handelsregister B 2533 Amtsgericht D?sseldorf Gesch?ftsf?hrung Prof. Dr. Gerhard Dehm Prof. Dr. J?rg Neugebauer Prof. Dr. Dierk Raabe Dr. Kai de Weldige Ust.-Id.-Nr.: DE 11 93 58 514 Steuernummer: 105 5891 1000 Please consider that invitations and e-mails of our institute are only valid if they end with ?@mpie.de. If you are not sure of the validity please contact rco at mpie.de Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de ------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Dec 12 18:20:22 2023 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 12 Dec 2023 19:20:22 -0500 Subject: [petsc-users] difference in Face Sets in latest petsc release In-Reply-To: <3523697218-10216@xmail1.mpie.de> References: <3523697218-10216@xmail1.mpie.de> Message-ID: On Tue, Dec 12, 2023 at 7:13?PM Sharan Roongta wrote: > Thanks for the clarification. > However, would it be more consistent to differentiate between 0D (vertex > sets), 1D (edge sets), 2d (faces) and 3D (cell sets)? > If I want to now apply boundary condition on a face with tag 1, it would > contain the 4 edges making up that face and an additional edge with the > same physical tag? > Basically, I can?t differentiate between the two entities > When we do this in PyLith, we only tag the things we want tagged. In this mesh, absolutely everything is tagged, and with overlapping tag ranges. That seems counterproductive. 
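If it helps, here is a hedged sketch (illustrative only; the label name and the value 1 come from the mesh above, everything else is made up) of how the genuine faces can be picked out of such a stratum by keeping only the height-1 points and skipping the tagged edges:

static PetscErrorCode KeepOnlyFaces(DM dm)
{
  DMLabel         label;
  IS              is;
  const PetscInt *points;
  PetscInt        n, fStart, fEnd, p;

  PetscFunctionBeginUser;
  PetscCall(DMGetLabel(dm, "Face Sets", &label));
  PetscCall(DMPlexGetHeightStratum(dm, 1, &fStart, &fEnd)); /* faces = height-1 points */
  PetscCall(DMLabelGetStratumIS(label, 1, &is));            /* all points tagged with value 1 */
  if (is) {
    PetscCall(ISGetLocalSize(is, &n));
    PetscCall(ISGetIndices(is, &points));
    for (p = 0; p < n; ++p) {
      if (points[p] >= fStart && points[p] < fEnd) {
        /* points[p] is a genuine face; tagged edges and vertices fall outside [fStart, fEnd) */
      }
    }
    PetscCall(ISRestoreIndices(is, &points));
    PetscCall(ISDestroy(&is));
  }
  PetscFunctionReturn(PETSC_SUCCESS);
}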
Thanks, Matt > Thanks, > Sharan > > *Group - Theory & Simulation* > *Department of Microstructure Physics & Alloy Design* > > > > * From: * Matthew Knepley > * To: * Sharan Roongta > * Cc: * "petsc-users at mcs.anl.gov" > * Sent: * 13/12/2023 12:51 AM > * Subject: * Re: [petsc-users] difference in Face Sets in latest petsc > release > > On Tue, Dec 12, 2023 at 5:50?PM Sharan Roongta wrote: > > Hello, > > I see discrepancy in the size/value of the 'Face Sets' printed in the > current release v3.20.2 , and v3.18.6 > > Attached is the .msh file > > -dm_view with v3.18.6 > DM Object: Generated Mesh 1 MPI process > type: plex > Generated Mesh in 3 dimensions: > Number of 0-cells per rank: 14 > Number of 1-cells per rank: 49 > Number of 2-cells per rank: 60 > Number of 3-cells per rank: 24 > Labels: > celltype: 4 strata with value/size (0 (14), 6 (24), 3 (60), 1 (49)) > depth: 4 strata with value/size (0 (14), 1 (49), 2 (60), 3 (24)) > Cell Sets: 1 strata with value/size (1 (24)) > Face Sets: 5 strata with value/size (1 (4), 2 (4), 3 (4), 4 (4), 5 (4)) > > > -dm_view with the current release (commit 4b9a870af96) > > DM Object: Generated Mesh 1 MPI process > type: plex > Generated Mesh in 3 dimensions: > Number of 0-cells per rank: 14 > Number of 1-cells per rank: 49 > Number of 2-cells per rank: 60 > Number of 3-cells per rank: 24 > Labels: > celltype: 4 strata with value/size (0 (14), 6 (24), 3 (60), 1 (49)) > depth: 4 strata with value/size (0 (14), 1 (49), 2 (60), 3 (24)) > Cell Sets: 1 strata with value/size (1 (24)) > Face Sets: 12 strata with value/size (1 (5), 2 (5), 3 (5), 4 (5), 5 (5), > 6 (1), 7 (1), 8 (1), 9 (1), 10 (1), 11 (1), 12 (1)) > > I believe the older version printed the correct thing? Has something > changed in the interpretation of Face Sets? > > > Yes. In the older version, I was only labeling cells, faces, and vertices. > There were complaints, so I put in the edge labels. If you check, all the > additional labels are on edges, and checking your .msh file, those edges > clearly have those labels. > > Thanks, > > Matt > > > Thanks, > Sharan > > *Group - Theory & Simulation* > *Department of Microstructure Physics & Alloy Design* > > > ------------------------------ > ------------------------------------------------- > Stay up to date and follow us on LinkedIn, Twitter and YouTube. > > Max-Planck-Institut f?r Eisenforschung GmbH > Max-Planck-Stra?e 1 > D-40237 D?sseldorf > > Handelsregister B 2533 > Amtsgericht D?sseldorf > > Gesch?ftsf?hrung > Prof. Dr. Gerhard Dehm > Prof. Dr. J?rg Neugebauer > Prof. Dr. Dierk Raabe > Dr. Kai de Weldige > > Ust.-Id.-Nr.: DE 11 93 58 514 > Steuernummer: 105 5891 1000 > > > Please consider that invitations and e-mails of our institute are > only valid if they end with ?@mpie.de. > If you are not sure of the validity please contact rco at mpie.de > > Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails > aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. > In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de > ------------------------------------------------- > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > > > ------------------------------ > ------------------------------------------------- > Stay up to date and follow us on LinkedIn, Twitter and YouTube. 
> > Max-Planck-Institut f?r Eisenforschung GmbH > Max-Planck-Stra?e 1 > D-40237 D?sseldorf > > Handelsregister B 2533 > Amtsgericht D?sseldorf > > Gesch?ftsf?hrung > Prof. Dr. Gerhard Dehm > Prof. Dr. J?rg Neugebauer > Prof. Dr. Dierk Raabe > Dr. Kai de Weldige > > Ust.-Id.-Nr.: DE 11 93 58 514 > Steuernummer: 105 5891 1000 > > > Please consider that invitations and e-mails of our institute are > only valid if they end with ?@mpie.de. > If you are not sure of the validity please contact rco at mpie.de > > Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails > aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. > In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de > ------------------------------------------------- > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmolinos at us.es Tue Dec 12 18:28:17 2023 From: mmolinos at us.es (MIGUEL MOLINOS PEREZ) Date: Wed, 13 Dec 2023 00:28:17 +0000 Subject: [petsc-users] Domain decomposition in PETSc for Molecular Dynamics In-Reply-To: References: <2BEA961D-00D5-4880-A162-7262E398C048@us.es> Message-ID: <9072F31D-93C8-43B6-8D96-5DB4CFDECFF9@us.es> I meant the list of atoms which lies inside of a sphere of radius R_cutoff centered at the mean position of a given atom. Best, Miguel On 13 Dec 2023, at 01:14, Matthew Knepley wrote: ? On Tue, Dec 12, 2023 at 2:36?PM MIGUEL MOLINOS PEREZ > wrote: Dear Matthew and Mark, Thank you for four useful guidance. I have taken as a starting point the example in "dm/tutorials/swarm_ex3.c" to build a first approximation for domain decomposition in my molecular dynamics code (diffusive molecular dynamic to be more precise :-) ). And I must say that I am very happy with the result. However, in my journey integrating domain decomposition into my code, I am facing some new (and expected) problems. The first is in the implementation of the nearest neighbor algorithm (list of atoms closest to each atom). Can you help me understand this? For a given atom, there should be a single "closest" atom (barring degeneracies in distance). What do you mean by the list of closest atoms? Thanks, Matt My current approach to the problem is a brute force algorithm (double loop through the list of atoms and calculate the distance). However, it seems that if I call the "neighbours" function after the "DMSwarmMigrate" function the search algorithm does not work correctly. My thoughts / hints are: * The two nested for loops should be done on the global indexing of the atoms instead of the local one (I don't know how to get this number). * If I print the mean position of atom #0 (for example) each range prints a different value of the average position. One of them is the correct position corresponding to site #0, the others are different (but identically labeled) atomic sites. Which means that the site_i index is not bijective. I believe that solving this problem will increase my understanding of the domain decomposition approach and may allow me to fix the remaining parts of my code. Any additional comments are greatly appreciated. For instance, I will be happy to be pointed to any piece of code (petsc examples for example) with solves a similar problem in order to self-learn learn by example. Many thanks in advance. 
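For the search itself, a linked-cell (binning) pass with bin width r_cutoff is the standard way to avoid the all-pairs double loop. The sketch below is only illustrative plain C with invented names; it ignores periodic images and parallel ghost atoms, and it is not code from this thread:

#include <stdlib.h>
#include <math.h>

/* Hedged sketch: bin the n local points (coordinates q[3*i+d]) into cells of
   width >= r_cut inside the box [lo, hi], then compare each point only
   against the 27 surrounding bins instead of all other points. */
static void cell_list_neighbours(int n, const double *q, const double lo[3],
                                 const double hi[3], double r_cut,
                                 int maxneigh, int *neigh, int *numneigh)
{
  int nb[3], *head, *next, i, d;
  for (d = 0; d < 3; ++d) {
    nb[d] = (int)floor((hi[d] - lo[d]) / r_cut);
    if (nb[d] < 1) nb[d] = 1;
  }
  head = malloc(sizeof(int) * nb[0] * nb[1] * nb[2]);
  next = malloc(sizeof(int) * n);
  for (i = 0; i < nb[0] * nb[1] * nb[2]; ++i) head[i] = -1;

  /* bin every point into a singly linked list per cell */
  for (i = 0; i < n; ++i) {
    int c[3], b;
    for (d = 0; d < 3; ++d) {
      c[d] = (int)((q[3 * i + d] - lo[d]) / r_cut);
      if (c[d] < 0) c[d] = 0;
      if (c[d] >= nb[d]) c[d] = nb[d] - 1;
    }
    b = (c[2] * nb[1] + c[1]) * nb[0] + c[0];
    next[i] = head[b];
    head[b] = i;
    numneigh[i] = 0;
  }

  /* visit only the 27 neighbouring bins of each point */
  for (i = 0; i < n; ++i) {
    int c[3];
    for (d = 0; d < 3; ++d) {
      c[d] = (int)((q[3 * i + d] - lo[d]) / r_cut);
      if (c[d] < 0) c[d] = 0;
      if (c[d] >= nb[d]) c[d] = nb[d] - 1;
    }
    for (int dz = -1; dz <= 1; ++dz)
      for (int dy = -1; dy <= 1; ++dy)
        for (int dx = -1; dx <= 1; ++dx) {
          int cx = c[0] + dx, cy = c[1] + dy, cz = c[2] + dz, j;
          if (cx < 0 || cy < 0 || cz < 0 || cx >= nb[0] || cy >= nb[1] || cz >= nb[2]) continue;
          for (j = head[(cz * nb[1] + cy) * nb[0] + cx]; j >= 0; j = next[j]) {
            double r2 = 0.0;
            if (j == i) continue;
            for (d = 0; d < 3; ++d) {
              double dq = q[3 * i + d] - q[3 * j + d];
              r2 += dq * dq;
            }
            if (r2 <= r_cut * r_cut && numneigh[i] < maxneigh) {
              neigh[i * maxneigh + numneigh[i]] = j;
              numneigh[i] += 1;
            }
          }
        }
  }
  free(head);
  free(next);
}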
Best, Miguel This is the piece of code (simplified) which computes the list of neighbours for each atomic site. DMD is a structure which contains the atomistic information (DMSWARM), and the background mesh and bounding cell (DMDA and DMShell) int neighbours(DMD* Simulation) { PetscFunctionBegin; PetscCallMPI(MPI_Comm_rank(PETSC_COMM_WORLD, &rank)); PetscCall(DMSwarmGetLocalSize(Simulation->atomistic_data, &n_atoms_local)); //! Get array with the mean position of the atoms DMSwarmGetField(Simulation->atomistic_data, DMSwarmPICField_coor, &blocksize, NULL, (void**)&mean_q_ptr); Eigen::Map mean_q(mean_q_ptr, n_atoms_local, dim); int* neigh = Simulation->neigh; int* numneigh = Simulation->numneigh; for (unsigned int site_i = 0; site_i < n_atoms_local; site_i++) { //! Get mean position of site i Eigen::Vector3d mean_q_i = mean_q.block<1, 3>(site_i, 0); //! Search neighbourhs in the main cell (+ periodic cells) for (unsigned site_j = 0; site_j < n_atoms_local; site_j++) { if (site_i != site_j) { //! Get mean position of site j in the periodic box Eigen::Vector3d mean_q_j = mean_q.block<1, 3>(site_j, 0); //! Check is site j is the neibourhood of the site i double norm_r_ij = (mean_q_i - mean_q_j).norm(); if ((norm_r_ij <= r_cutoff_ADP) && (numneigh[site_i] < maxneigh)) { neigh[site_i * maxneigh + numneigh[site_i]] = site_j; numneigh[site_i] += 1; } } } } // MPI for loop (site_i) DMSwarmRestoreField(Simulation->atomistic_data, DMSwarmPICField_coor, &blocksize, NULL, (void**)&mean_q_ptr); return EXIT_SUCCESS; } This is the piece of code that I use to read the atomic positions (mean_q) from a file: //! @brief mean_q: Mean value of each atomic position double* mean_q; PetscCall(DMSwarmGetField(atomistic_data, DMSwarmPICField_coor, &blocksize, NULL, (void**)&mean_q)); cnt = 0; for (PetscInt site_i = 0; site_i < n_atoms_local; site_i++) { if (cnt < n_atoms) { mean_q[blocksize * cnt + 0] = Simulation_file.mean_q[cnt * dim + 0]; mean_q[blocksize * cnt + 1] = Simulation_file.mean_q[cnt * dim + 1]; mean_q[blocksize * cnt + 2] = Simulation_file.mean_q[cnt * dim + 2]; cnt++; } } PetscCall(DMSwarmRestoreField(atomistic_data, DMSwarmPICField_coor, &blocksize, NULL, (void**)&mean_q)); On 4 Nov 2023, at 15:50, MIGUEL MOLINOS PEREZ > wrote: ?Thank you Mark! I will have a look to it. Best, Miguel On 4 Nov 2023, at 13:54, Matthew Knepley > wrote: ? On Sat, Nov 4, 2023 at 8:40?AM Mark Adams > wrote: Hi MIGUEL, This might be a good place to start: https://petsc.org/main/manual/vec/ Feel free to ask more specific questions, but the docs are a good place to start. Thanks, Mark On Fri, Nov 3, 2023 at 5:19?AM MIGUEL MOLINOS PEREZ > wrote: Dear all, I am currently working on the development of a in-house molecular dynamics code using PETSc and C++. So far the code works great, however it is a little bit slow since I am not exploiting MPI for PETSc vectors. I was wondering if there is a way to perform the domain decomposition efficiently using some PETSc functionality. Any feedback is highly appreciated. It sounds like you mean "is there a way to specify a communication construct that can send my particle information automatically". We use PetscSF for that. You can see how this works with the DMSwarm class, which represents a particle discretization. You can either use that, or if it does not work for you, do the same things with your class. 
Thanks, Matt Best regards, Miguel -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Screenshot 2023-12-12 at 19.42.13.png Type: image/png Size: 1252024 bytes Desc: Screenshot 2023-12-12 at 19.42.13.png URL: From knepley at gmail.com Tue Dec 12 18:43:29 2023 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 12 Dec 2023 19:43:29 -0500 Subject: [petsc-users] Domain decomposition in PETSc for Molecular Dynamics In-Reply-To: <9072F31D-93C8-43B6-8D96-5DB4CFDECFF9@us.es> References: <2BEA961D-00D5-4880-A162-7262E398C048@us.es> <9072F31D-93C8-43B6-8D96-5DB4CFDECFF9@us.es> Message-ID: On Tue, Dec 12, 2023 at 7:28?PM MIGUEL MOLINOS PEREZ wrote: > I meant the list of atoms which lies inside of a sphere of radius R_cutoff > centered at the mean position of a given atom. > Okay, this is possible in parallel, but would require hard work and I have not done it yet, although I think all the tools are coded. In serial, there are at least two ways to do it. First, you can use a k-d tree implementation, since they usually have the radius query. I have not put one of these in, because I did not like any implementation, but Underworld and Firedrake have and it works fine. Second, you can choose a grid size of R_cutoff for the background grid, and then check neighbors. This is probably how I would start. Thanks, Matt > Best, > Miguel > > On 13 Dec 2023, at 01:14, Matthew Knepley wrote: > > ? > On Tue, Dec 12, 2023 at 2:36?PM MIGUEL MOLINOS PEREZ > wrote: > >> Dear Matthew and Mark, >> >> Thank you for four useful guidance. I have taken as a starting point the >> example in "dm/tutorials/swarm_ex3.c" to build a first approximation for >> domain decomposition in my molecular dynamics code (diffusive molecular >> dynamic to be more precise :-) ). And I must say that I am very happy with >> the result. However, in my journey integrating domain decomposition into my >> code, I am facing some new (and expected) problems. The first is in the >> implementation of the nearest neighbor algorithm (list of atoms closest to >> each atom). >> > > Can you help me understand this? For a given atom, there should be a > single "closest" atom (barring > degeneracies in distance). What do you mean by the list of closest atoms? > > Thanks, > > Matt > > >> My current approach to the problem is a brute force algorithm (double >> loop through the list of atoms and calculate the distance). However, it >> seems that if I call the "neighbours" function after the "DMSwarmMigrate" >> function the search algorithm does not work correctly. My thoughts / hints >> are: >> >> - The two nested for loops should be done on the global indexing of >> the atoms instead of the local one (I don't know how to get this number). >> - If I print the mean position of atom #0 (for example) each range >> prints a different value of the average position. One of them is the >> correct position corresponding to site #0, the others are different (but >> identically labeled) atomic sites. 
Which means that the site_i index is not >> bijective. >> >> >> I believe that solving this problem will increase my understanding of the >> domain decomposition approach and may allow me to fix the remaining parts >> of my code. >> >> Any additional comments are greatly appreciated. For instance, I will be >> happy to be pointed to any piece of code (petsc examples for example) with >> solves a similar problem in order to self-learn learn by example. >> >> Many thanks in advance. >> >> Best, >> Miguel >> >> This is the piece of code (simplified) which computes the list of >> neighbours for each atomic site. DMD is a structure which contains the >> atomistic information (DMSWARM), and the background mesh and bounding >> cell (DMDA and DMShell) >> >> int neighbours(DMD* Simulation) { >> >> PetscFunctionBegin; >> PetscCallMPI(MPI_Comm_rank(PETSC_COMM_WORLD, &rank)); >> >> PetscCall(DMSwarmGetLocalSize(Simulation->atomistic_data, &n_atoms_local >> )); >> >> //! Get array with the mean position of the atoms >> DMSwarmGetField(Simulation->atomistic_data, DMSwarmPICField_coor, & >> blocksize, NULL, >> (void**)&mean_q_ptr); >> Eigen::Map mean_q(mean_q_ptr, n_atoms_local, dim); >> >> int* neigh = Simulation->neigh; >> int* numneigh = Simulation->numneigh; >> >> for (unsigned int site_i = 0; site_i < n_atoms_local; site_i++) { >> >> //! Get mean position of site i >> Eigen::Vector3d mean_q_i = mean_q.block<1, 3>(site_i, 0); >> >> //! Search neighbourhs in the main cell (+ periodic cells) >> for (unsigned site_j = 0; site_j < n_atoms_local; site_j++) { >> if (site_i != site_j) { >> //! Get mean position of site j in the periodic box >> Eigen::Vector3d mean_q_j = mean_q.block<1, 3>(site_j, 0); >> >> //! Check is site j is the neibourhood of the site i >> double norm_r_ij = (mean_q_i - mean_q_j).norm(); >> if ((norm_r_ij <= r_cutoff_ADP) && (numneigh[site_i] < maxneigh)) { >> neigh[site_i * maxneigh + numneigh[site_i]] = site_j; >> numneigh[site_i] += 1; >> } >> } >> } >> >> } // MPI for loop (site_i) >> >> DMSwarmRestoreField(Simulation->atomistic_data, DMSwarmPICField_coor, & >> blocksize, >> NULL, (void**)&mean_q_ptr); >> >> return EXIT_SUCCESS; >> } >> >> >> This is the piece of code that I use to read the atomic positions >> (mean_q) from a file: >> //! @brief mean_q: Mean value of each atomic position >> double* mean_q; >> PetscCall(DMSwarmGetField(atomistic_data, DMSwarmPICField_coor, & >> blocksize, >> NULL, (void**)&mean_q)); >> >> cnt = 0; >> for (PetscInt site_i = 0; site_i < n_atoms_local; site_i++) { >> if (cnt < n_atoms) { >> mean_q[blocksize * cnt + 0] = Simulation_file.mean_q[cnt * dim + 0]; >> mean_q[blocksize * cnt + 1] = Simulation_file.mean_q[cnt * dim + 1]; >> mean_q[blocksize * cnt + 2] = Simulation_file.mean_q[cnt * dim + 2]; >> >> cnt++; >> } >> } >> PetscCall(DMSwarmRestoreField(atomistic_data, DMSwarmPICField_coor, >> &blocksize, NULL, (void**)&mean_q)); >> >> >> >> >> >> On 4 Nov 2023, at 15:50, MIGUEL MOLINOS PEREZ wrote: >> >> ?Thank you Mark! I will have a look to it. >> >> Best, >> Miguel >> >> >> On 4 Nov 2023, at 13:54, Matthew Knepley wrote: >> >> ? >> On Sat, Nov 4, 2023 at 8:40?AM Mark Adams wrote: >> >>> Hi MIGUEL, >>> >>> This might be a good place to start: https://petsc.org/main/manual/vec/ >>> Feel free to ask more specific questions, but the docs are a good place >>> to start. 
>>> >>> Thanks, >>> Mark >>> >>> On Fri, Nov 3, 2023 at 5:19?AM MIGUEL MOLINOS PEREZ >>> wrote: >>> >>>> Dear all, >>>> >>>> I am currently working on the development of a in-house molecular >>>> dynamics code using PETSc and C++. So far the code works great, however it >>>> is a little bit slow since I am not exploiting MPI for PETSc vectors. I was >>>> wondering if there is a way to perform the domain decomposition efficiently >>>> using some PETSc functionality. Any feedback is highly appreciated. >>>> >>> >> It sounds like you mean "is there a way to specify a communication >> construct that can send my particle >> information automatically". We use PetscSF for that. You can see how this >> works with the DMSwarm class, which represents a particle discretization. >> You can either use that, or if it does not work for you, do the same things >> with your class. >> >> Thanks, >> >> Matt >> >> >>> Best regards, >>>> Miguel >>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Tue Dec 12 20:53:36 2023 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Tue, 12 Dec 2023 19:53:36 -0700 Subject: [petsc-users] valgrind errors In-Reply-To: References: Message-ID: I was able to reproduce it. Let me ask MPICH developers. --Junchao Zhang On Tue, Dec 12, 2023 at 3:06?PM Randall Mackie wrote: > It now seems to me that petsc+mpich is no longer valgrind clean, or I am > doing something wrong. 
> > A simple program: > > > Program test > > #include "petsc/finclude/petscsys.h" > use petscsys > > PetscInt :: ierr > > call PetscInitialize(PETSC_NULL_CHARACTER,ierr) > call PetscFinalize(ierr) > > end program test > > > PETSc compiled in debug mode, complex scalars, and download-mpich, when > run with valgrind generates errors like these: > > ==3997== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==3997== at 0x8C31867: writev (writev.c:26) > ==3997== by 0x9C20DE4: MPL_large_writev (mpl_sock.c:31) > ==3997== by 0x9BF1050: MPIDI_CH3I_Sock_writev (sock.c:2689) > ==3997== by 0x9BF9812: MPIDI_CH3_iStartMsgv (ch3_istartmsgv.c:92) > ==3997== by 0x9BA7790: MPIDI_CH3_EagerContigSend (ch3u_eager.c:191) > ==3997== by 0x9BCA7EC: MPID_Send (mpid_send.c:132) > ==3997== by 0x9BCAC64: MPID_Send_coll (mpid_send.c:206) > ==3997== by 0x9A2AC7C: MPIC_Send (helper_fns.c:126) > ==3997== by 0x993A645: MPIR_Bcast_intra_binomial > (bcast_intra_binomial.c:146) > ==3997== by 0x99FF64A: MPIR_Bcast_allcomm_auto (mpir_coll.c:323) > ==3997== by 0x99FFC06: MPIR_Bcast_impl (mpir_coll.c:420) > ==3997== by 0x99FCF86: MPID_Bcast (mpid_coll.h:30) > ==3997== by 0x99FFE13: MPIR_Bcast (mpir_coll.c:465) > ==3997== by 0x974A513: internal_Bcast (bcast.c:93) > ==3997== by 0x974A72B: PMPI_Bcast (bcast.c:143) > ==3997== by 0x4B8D6DB: PETScParseFortranArgs_Private (zstart.c:182) > ==3997== by 0x4B8DDFA: PetscInitFortran_Private (zstart.c:200) > ==3997== by 0x4B34931: PetscInitialize_Common (pinit.c:974) > ==3997== by 0x4B8E8C7: petscinitializef_ (zstart.c:284) > ==3997== by 0x4959434: __petscsys_MOD_petscinitializenohelp > (petscsysmod.F90:374) > ==3997== Address 0x1ffeffcac0 is on thread 1's stack > ==3997== in frame #4, created by MPIDI_CH3_EagerContigSend > (ch3u_eager.c:160) > ==3997== Uninitialised value was created by a stack allocation > ==3997== at 0x9BA7601: MPIDI_CH3_EagerContigSend (ch3u_eager.c:160) > ==3997== > > ==3997== Syscall param write(buf) points to uninitialised byte(s) > ==3997== at 0x8C2B697: write (write.c:26) > ==3997== by 0x9BF0F1D: MPIDI_CH3I_Sock_write (sock.c:2614) > ==3997== by 0x9BF7AAE: MPIDI_CH3_iStartMsg (ch3_istartmsg.c:68) > ==3997== by 0x9BA7A27: MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:262) > ==3997== by 0x9BCA766: MPID_Send (mpid_send.c:119) > ==3997== by 0x9BCAC64: MPID_Send_coll (mpid_send.c:206) > ==3997== by 0x9A2AC7C: MPIC_Send (helper_fns.c:126) > ==3997== by 0x993A645: MPIR_Bcast_intra_binomial > (bcast_intra_binomial.c:146) > ==3997== by 0x99FF64A: MPIR_Bcast_allcomm_auto (mpir_coll.c:323) > ==3997== by 0x99FFC06: MPIR_Bcast_impl (mpir_coll.c:420) > ==3997== by 0x99FCF86: MPID_Bcast (mpid_coll.h:30) > ==3997== by 0x99FFE13: MPIR_Bcast (mpir_coll.c:465) > ==3997== by 0x974A513: internal_Bcast (bcast.c:93) > ==3997== by 0x974A72B: PMPI_Bcast (bcast.c:143) > ==3997== by 0x4DB95A2: PetscOptionsGetenv (pdisplay.c:61) > ==3997== by 0x4E0D745: PetscStrreplace (str.c:572) > ==3997== by 0x4AC8DEA: PetscOptionsFilename (options.c:416) > ==3997== by 0x4ACF0B5: PetscOptionsInsertFile (options.c:632) > ==3997== by 0x4AD3CB5: PetscOptionsInsert (options.c:861) > ==3997== by 0x4B8E0EF: PetscInitFortran_Private (zstart.c:206) > ==3997== Address 0x1ffeff7998 is on thread 1's stack > ==3997== in frame #3, created by MPIDI_CH3_EagerContigShortSend > (ch3u_eager.c:223) > ==3997== Uninitialised value was created by a stack allocation > ==3997== at 0x9BA788F: MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:223) > ==3997== > > Is this a known issue or am I doing something wrong? 
> > Thanks, Randy > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Tue Dec 12 21:10:40 2023 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 12 Dec 2023 22:10:40 -0500 Subject: [petsc-users] Fortran Interface In-Reply-To: References: Message-ID: <94202A46-2221-4F08-8B35-1BE16C7937AC@petsc.dev> See https://petsc.gitlab.io/-/petsc/-/jobs/5739238224/artifacts/public/html/manual/fortran.html#ch-fortran and https://gitlab.com/petsc/petsc/-/merge_requests/7114 > On Dec 12, 2023, at 3:22?PM, Palmer, Bruce J via petsc-users wrote: > > What do you do with something like a void pointer? I?m looking at the TaoSetObjectiveAndGradient function and it wants to pass a void *ctx pointer. You can set this to null, but apparently you have to specify the type. What type should I use? Is there something called PETSC_NULL_VOID or PETSC_NULL_CONTEXT or do I use something else? > > From: Matthew Knepley > > Date: Tuesday, December 12, 2023 at 8:33 AM > To: Palmer, Bruce J > > Cc: petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] Fortran Interface > > Check twice before you click! This email originated from outside PNNL. > > On Tue, Dec 12, 2023 at 11:27?AM Palmer, Bruce J via petsc-users > wrote: > Does documentation for the PETSc fortran interface still exist? I looked at the web pages for 3.20 (petsc.org/release ) but if you go under the tab C/Fortran API, only descriptions for the C interface are there. > > I think after the most recent changes, the interface was supposed to be very close to C, so we just document the differences on specific pages, and put the general stuff here: > > https://petsc.org/release/manual/fortran/ > > Thanks, > > Matt > > Bruce Palmer > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From s_g at berkeley.edu Tue Dec 12 22:17:06 2023 From: s_g at berkeley.edu (Sanjay Govindjee) Date: Tue, 12 Dec 2023 20:17:06 -0800 Subject: [petsc-users] Fortran Interface In-Reply-To: <94202A46-2221-4F08-8B35-1BE16C7937AC@petsc.dev> References: <94202A46-2221-4F08-8B35-1BE16C7937AC@petsc.dev> Message-ID: did you mean to write type (userctx) ctx in this example? subroutine func(snes, x, f, ctx, ierr) SNES snes Vec x,f type (userctx) user PetscErrorCode ierr ... external func SNESSetFunction(snes, r, func, ctx, ierr) SNES snes Vec r PetscErrorCode ierr type (userctx) user On Tue, Dec 12, 2023 at 7:10?PM Barry Smith wrote: > > See > https://petsc.gitlab.io/-/petsc/-/jobs/5739238224/artifacts/public/html/manual/fortran.html#ch-fortran > and https://gitlab.com/petsc/petsc/-/merge_requests/7114 > > > > On Dec 12, 2023, at 3:22?PM, Palmer, Bruce J via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > What do you do with something like a void pointer? I?m looking at the > TaoSetObjectiveAndGradient function and it wants to pass a void *ctx > pointer. You can set this to null, but apparently you have to specify the > type. What type should I use? Is there something called PETSC_NULL_VOID or > PETSC_NULL_CONTEXT or do I use something else? > > > *From: *Matthew Knepley > *Date: *Tuesday, December 12, 2023 at 8:33 AM > *To: *Palmer, Bruce J > *Cc: *petsc-users at mcs.anl.gov > *Subject: *Re: [petsc-users] Fortran Interface > Check twice before you click! 
This email originated from outside PNNL. > > On Tue, Dec 12, 2023 at 11:27?AM Palmer, Bruce J via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Does documentation for the PETSc fortran interface still exist? I looked > at the web pages for 3.20 (petsc.org/release) but if you go under the tab > C/Fortran API, only descriptions for the C interface are there. > > > I think after the most recent changes, the interface was supposed to be > very close to C, so we just document the differences on specific pages, and > put the general stuff here: > > https://petsc.org/release/manual/fortran/ > > Thanks, > > Matt > > > Bruce Palmer > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Tue Dec 12 22:22:03 2023 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Tue, 12 Dec 2023 21:22:03 -0700 Subject: [petsc-users] valgrind errors In-Reply-To: References: Message-ID: MPICH folks confirmed it's an MPICH problem and an issue is created at https://github.com/pmodels/mpich/issues/6843 --Junchao Zhang On Tue, Dec 12, 2023 at 7:53?PM Junchao Zhang wrote: > I was able to reproduce it. Let me ask MPICH developers. > > --Junchao Zhang > > > On Tue, Dec 12, 2023 at 3:06?PM Randall Mackie > wrote: > >> It now seems to me that petsc+mpich is no longer valgrind clean, or I am >> doing something wrong. >> >> A simple program: >> >> >> Program test >> >> #include "petsc/finclude/petscsys.h" >> use petscsys >> >> PetscInt :: ierr >> >> call PetscInitialize(PETSC_NULL_CHARACTER,ierr) >> call PetscFinalize(ierr) >> >> end program test >> >> >> PETSc compiled in debug mode, complex scalars, and download-mpich, when >> run with valgrind generates errors like these: >> >> ==3997== Syscall param writev(vector[...]) points to uninitialised byte(s) >> ==3997== at 0x8C31867: writev (writev.c:26) >> ==3997== by 0x9C20DE4: MPL_large_writev (mpl_sock.c:31) >> ==3997== by 0x9BF1050: MPIDI_CH3I_Sock_writev (sock.c:2689) >> ==3997== by 0x9BF9812: MPIDI_CH3_iStartMsgv (ch3_istartmsgv.c:92) >> ==3997== by 0x9BA7790: MPIDI_CH3_EagerContigSend (ch3u_eager.c:191) >> ==3997== by 0x9BCA7EC: MPID_Send (mpid_send.c:132) >> ==3997== by 0x9BCAC64: MPID_Send_coll (mpid_send.c:206) >> ==3997== by 0x9A2AC7C: MPIC_Send (helper_fns.c:126) >> ==3997== by 0x993A645: MPIR_Bcast_intra_binomial >> (bcast_intra_binomial.c:146) >> ==3997== by 0x99FF64A: MPIR_Bcast_allcomm_auto (mpir_coll.c:323) >> ==3997== by 0x99FFC06: MPIR_Bcast_impl (mpir_coll.c:420) >> ==3997== by 0x99FCF86: MPID_Bcast (mpid_coll.h:30) >> ==3997== by 0x99FFE13: MPIR_Bcast (mpir_coll.c:465) >> ==3997== by 0x974A513: internal_Bcast (bcast.c:93) >> ==3997== by 0x974A72B: PMPI_Bcast (bcast.c:143) >> ==3997== by 0x4B8D6DB: PETScParseFortranArgs_Private (zstart.c:182) >> ==3997== by 0x4B8DDFA: PetscInitFortran_Private (zstart.c:200) >> ==3997== by 0x4B34931: PetscInitialize_Common (pinit.c:974) >> ==3997== by 0x4B8E8C7: petscinitializef_ (zstart.c:284) >> ==3997== by 0x4959434: __petscsys_MOD_petscinitializenohelp >> (petscsysmod.F90:374) >> ==3997== Address 0x1ffeffcac0 is on thread 1's stack >> ==3997== in frame #4, created by MPIDI_CH3_EagerContigSend >> (ch3u_eager.c:160) >> ==3997== Uninitialised value was created by a stack allocation >> ==3997== at 0x9BA7601: 
MPIDI_CH3_EagerContigSend (ch3u_eager.c:160) >> ==3997== >> >> ==3997== Syscall param write(buf) points to uninitialised byte(s) >> ==3997== at 0x8C2B697: write (write.c:26) >> ==3997== by 0x9BF0F1D: MPIDI_CH3I_Sock_write (sock.c:2614) >> ==3997== by 0x9BF7AAE: MPIDI_CH3_iStartMsg (ch3_istartmsg.c:68) >> ==3997== by 0x9BA7A27: MPIDI_CH3_EagerContigShortSend >> (ch3u_eager.c:262) >> ==3997== by 0x9BCA766: MPID_Send (mpid_send.c:119) >> ==3997== by 0x9BCAC64: MPID_Send_coll (mpid_send.c:206) >> ==3997== by 0x9A2AC7C: MPIC_Send (helper_fns.c:126) >> ==3997== by 0x993A645: MPIR_Bcast_intra_binomial >> (bcast_intra_binomial.c:146) >> ==3997== by 0x99FF64A: MPIR_Bcast_allcomm_auto (mpir_coll.c:323) >> ==3997== by 0x99FFC06: MPIR_Bcast_impl (mpir_coll.c:420) >> ==3997== by 0x99FCF86: MPID_Bcast (mpid_coll.h:30) >> ==3997== by 0x99FFE13: MPIR_Bcast (mpir_coll.c:465) >> ==3997== by 0x974A513: internal_Bcast (bcast.c:93) >> ==3997== by 0x974A72B: PMPI_Bcast (bcast.c:143) >> ==3997== by 0x4DB95A2: PetscOptionsGetenv (pdisplay.c:61) >> ==3997== by 0x4E0D745: PetscStrreplace (str.c:572) >> ==3997== by 0x4AC8DEA: PetscOptionsFilename (options.c:416) >> ==3997== by 0x4ACF0B5: PetscOptionsInsertFile (options.c:632) >> ==3997== by 0x4AD3CB5: PetscOptionsInsert (options.c:861) >> ==3997== by 0x4B8E0EF: PetscInitFortran_Private (zstart.c:206) >> ==3997== Address 0x1ffeff7998 is on thread 1's stack >> ==3997== in frame #3, created by MPIDI_CH3_EagerContigShortSend >> (ch3u_eager.c:223) >> ==3997== Uninitialised value was created by a stack allocation >> ==3997== at 0x9BA788F: MPIDI_CH3_EagerContigShortSend >> (ch3u_eager.c:223) >> ==3997== >> >> Is this a known issue or am I doing something wrong? >> >> Thanks, Randy >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From onur.notonur at proton.me Wed Dec 13 02:12:51 2023 From: onur.notonur at proton.me (onur.notonur) Date: Wed, 13 Dec 2023 08:12:51 +0000 Subject: [petsc-users] DMPlex "Could not find orientation for quadrilateral" In-Reply-To: References: Message-ID: Hi, This page explains polyMesh in the section 4.1.2: https://www.openfoam.com/documentation/user-guide/4-mesh-generation-and-conversion/4.1-mesh-description (Also I added my polyMesh files as attachment.) After my first email I generated valid mesh by finding 2 of the 6 faces which have no common vertex, then arranging them to look like this: 1-2-3-4 is my first face, 5-6-7-8 is my second face. But this approach isn't a general solution (but currently it works because my mesh consists of only hexahedral elements), If I could learn the true solution I'd be happy! Thank you again 1 2 +------+. |`. | `. | 4+--+---+3 | | | | 5+---+--+8 | `. | `.| 6+------+7 Sent with [Proton Mail](https://proton.me/) secure email. On Wednesday, December 13th, 2023 at 2:16 AM, Matthew Knepley wrote: > On Tue, Dec 12, 2023 at 12:22?PM onur.notonur via petsc-users wrote: > >> Hi, >> >> I hope this email finds you well. I am currently working on importing an OpenFOAM PolyMesh into DMPlex, and I've encountered an issue. The PolyMesh format includes face owner cells/neighbor cells and face-to-vertex connectivity. I was using the "DMPlexCreateFromCellListPetsc()" function, which required cell-to-vertex connectivity. However, when attempting to create the cell connectivity using an edge loop [p_0, p_1, ..., p_7] (p_n and p_(n+1) are valid edges in my mesh), I encountered an error stating, "Could not find orientation for quadrilateral." 
>> >> (Actually at first, I generated the connectivity list by simply creating a cell-to-face list and then using that to create a cell-to-vertex list. (just map over the list and remove duplicates) This created a DMPlex successfully, however, resulted in a mesh that was incorrect when looking with ParaView. I think that was because of I stated wrong edge loop to create cells) >> >> I understand that I may need to follow a different format for connectivity, but I'm not sure what that format is. My current mesh is hexahedral, consisting of 8 corner elements(if important). I would appreciate any guidance on a general approach to address this issue. > > Can you start by giving the PolyMesh format, or some URL with it documented? > > Thanks, > > Matt > >> Thank you for your time and assistance. >> >> Best, >> Onur >> >> Sent with Proton Mail secure email. > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > [https://www.cse.buffalo.edu/~knepley/](http://www.cse.buffalo.edu/~knepley/) -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: polyMesh.zip Type: application/zip Size: 426688 bytes Desc: not available URL: From 1807580692 at qq.com Wed Dec 13 02:51:09 2023 From: 1807580692 at qq.com (=?gb18030?B?MTgwNzU4MDY5Mg==?=) Date: Wed, 13 Dec 2023 16:51:09 +0800 Subject: [petsc-users] (no subject) Message-ID: Hello, I have encountered some problems. Here are some of my configurations. OS Version and Type:  Linux daihuanhe-Aspire-A315-55G 5.15.0-89-generic #99~20.04.1-Ubuntu SMP Thu Nov 2 15:16:47 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux PETSc Version: #define PETSC_VERSION_RELEASE    1 #define PETSC_VERSION_MAJOR      3 #define PETSC_VERSION_MINOR      19 #define PETSC_VERSION_SUBMINOR   0 #define PETSC_RELEASE_DATE       "Mar 30, 2023" #define PETSC_VERSION_DATE       "unknown" MPI implementation: MPICH Compiler and version: Gnu C The problem is when I type ?mpiexec -n 4 ./ex19_1 -lidvelocity 100 -prandtl 0.72 -grashof 10000 -da_grid_x 128 -da_grid_y 128 -snes_type newtonls -pc_type fieldsplit -pc_fieldsplit_type multiplicative -pc_fieldsplit_block_size 4 -pc_fieldsplit_0_fields 0,2 -pc_fieldsplit_1_fields 1,3 -fieldsplit_0_pc_type asm -fieldsplit_0_pc_asm_type restrict -fieldsplit_0_pc_asm_overlap 5 -fieldsplit_0_sub_pc_type lu -fieldsplit_1_pc_type asm -fieldsplit_1_pc_asm_type restrict -fieldsplit_1_pc_asm_overlap 5 -fieldsplit_1_sub_pc_type lu -snes_monitor -snes_converged_reason  -fieldsplit_ksp_type gmres -fieldsplit_0_ksp_atol 1e-10 -fieldsplit_0_ksp_rtol 1e-8 -fieldsplit_1_ksp_atol 1e-10 -fieldsplit_1_ksp_rtol 1e-8? in the command line, where my path is /petsc/src/snes/tutorials. It returns ?lid velocity = 100., prandtl # = 0.72, grashof # = 10000.   0 SNES Function norm 1.125212317214e+03 [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Invalid argument [0]PETSC ERROR: Block size 2 is incompatible with the indices: non consecutive indices 0 2 [0]PETSC ERROR: WARNING! There are unused option(s) set! Could be the program crashed before usage or a spelling mistake, etc! 
[0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [1]PETSC ERROR: Invalid argument [1]PETSC ERROR: Block size 2 is incompatible with the indices: non consecutive indices 16384 16386 [1]PETSC ERROR: WARNING! There are unused option(s) set! Could be the program crashed before usage or a spelling mistake, etc! [1]PETSC ERROR:   Option left: name:-fieldsplit_0_ksp_atol value: 1e-10 source: command line [2]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [2]PETSC ERROR: Invalid argument [2]PETSC ERROR: Block size 2 is incompatible with the indices: non consecutive indices 32768 32770 [2]PETSC ERROR: WARNING! There are unused option(s) set! Could be the program crashed before usage or a spelling mistake, etc! [2]PETSC ERROR:   Option left: name:-fieldsplit_0_ksp_atol value: 1e-10 source: command line [2]PETSC ERROR:   Option left: name:-fieldsplit_0_ksp_rtol value: 1e-8 source: command line [2]PETSC ERROR: [3]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [3]PETSC ERROR: Invalid argument [3]PETSC ERROR: Block size 2 is incompatible with the indices: non consecutive indices 49152 49154 [3]PETSC ERROR: WARNING! There are unused option(s) set! Could be the program crashed before usage or a spelling mistake, etc! [3]PETSC ERROR:   Option left: name:-fieldsplit_0_ksp_atol value: 1e-10 source: command line [3]PETSC ERROR:   Option left: name:-fieldsplit_0_ksp_rtol value: 1e-8 source: command line [3]PETSC ERROR:   Option left: name:-fieldsplit_0_ksp_atol value: 1e-10 source: command line [0]PETSC ERROR:   Option left: name:-fieldsplit_0_ksp_rtol value: 1e-8 source: command line [0]PETSC ERROR:   Option left: name:-fieldsplit_0_pc_asm_overlap value: 5 source: command line [0]PETSC ERROR:   Option left: name:-fieldsplit_0_pc_asm_type value: restrict source: command line [0]PETSC ERROR:   Option left: name:-fieldsplit_0_pc_type value: asm source: command line [0]PETSC ERROR:   Option left: name:-fieldsplit_0_sub_pc_type value: lu source: command line [0]PETSC ERROR:   Option left: name:-fieldsplit_1_ksp_atol value: 1e-10 source: command line [0]PETSC ERROR:   Option left: name:-fieldsplit_1_ksp_rtol value: 1e-8 source: command line [0]PETSC ERROR:   Option left: name:-fieldsplit_1_pc_asm_overlap value: 5 source: command line [0]PETSC ERROR: [1]PETSC ERROR:   Option left: name:-fieldsplit_0_ksp_rtol value: 1e-8 source: command line [1]PETSC ERROR:   Option left: name:-fieldsplit_0_pc_asm_overlap value: 5 source: command line [1]PETSC ERROR:   Option left: name:-fieldsplit_0_pc_asm_type value: restrict source: command line [1]PETSC ERROR:   Option left: name:-fieldsplit_0_pc_type value: asm source: command line   Option left: name:-fieldsplit_0_pc_asm_overlap value: 5 source: command line [2]PETSC ERROR:   Option left: name:-fieldsplit_0_pc_asm_type value: restrict source: command line [2]PETSC ERROR:   Option left: name:-fieldsplit_0_pc_type value: asm source: command line [2]PETSC ERROR:   Option left: name:-fieldsplit_0_sub_pc_type value: lu source: command line [2]PETSC ERROR:   Option left: name:-fieldsplit_1_ksp_atol value: 1e-10 source: command line [2]PETSC ERROR:   Option left: name:-fieldsplit_0_pc_asm_overlap value: 5 source: command line [3]PETSC ERROR:   Option left: name:-fieldsplit_0_pc_asm_type value: restrict source: command line [3]PETSC ERROR:   Option left: 
name:-fieldsplit_0_pc_type value: asm source: command line [3]PETSC ERROR:   Option left: name:-fieldsplit_0_sub_pc_type value: lu source: command line   Option left: name:-fieldsplit_1_pc_asm_type value: restrict source: command line [0]PETSC ERROR:   Option left: name:-fieldsplit_1_pc_type value: asm source: command line [0]PETSC ERROR:   Option left: name:-fieldsplit_1_sub_pc_type value: lu source: command line [0]PETSC ERROR:   Option left: name:-fieldsplit_ksp_type value: gmres source: command line [0]PETSC ERROR:   Option left: name:-snes_converged_reason (no value) source: command line [1]PETSC ERROR:   Option left: name:-fieldsplit_0_sub_pc_type value: lu source: command line [1]PETSC ERROR:   Option left: name:-fieldsplit_1_ksp_atol value: 1e-10 source: command line [1]PETSC ERROR:   Option left: name:-fieldsplit_1_ksp_rtol value: 1e-8 source: command line [2]PETSC ERROR:   Option left: name:-fieldsplit_1_pc_asm_overlap value: 5 source: command line [2]PETSC ERROR:   Option left: name:-fieldsplit_1_pc_asm_type value: restrict source: command line [2]PETSC ERROR:   Option left: name:-fieldsplit_1_pc_type value: asm source: command line [3]PETSC ERROR:   Option left: name:-fieldsplit_1_ksp_atol value: 1e-10 source: command line [3]PETSC ERROR:   Option left: name:-fieldsplit_1_ksp_rtol value: 1e-8 source: command line [3]PETSC ERROR: [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.20.2, unknown [0]PETSC ERROR: ./ex19_1 on a arch-linux-c-debug named daihuanhe-Aspire-A315-55G by daihuanhe Wed Dec 13 16:47:08 2023   Option left: name:-fieldsplit_1_ksp_rtol value: 1e-8 source: command line [1]PETSC ERROR:   Option left: name:-fieldsplit_1_pc_asm_overlap value: 5 source: command line [1]PETSC ERROR:   Option left: name:-fieldsplit_1_pc_asm_type value: restrict source: command line [1]PETSC ERROR: [2]PETSC ERROR:   Option left: name:-fieldsplit_1_sub_pc_type value: lu source: command line [2]PETSC ERROR:   Option left: name:-fieldsplit_ksp_type value: gmres source: command line [2]PETSC ERROR:   Option left: name:-snes_converged_reason (no value) source: command line   Option left: name:-fieldsplit_1_pc_asm_overlap value: 5 source: command line [3]PETSC ERROR:   Option left: name:-fieldsplit_1_pc_asm_type value: restrict source: command line [3]PETSC ERROR:   Option left: name:-fieldsplit_1_pc_type value: asm source: command line [3]PETSC ERROR: [0]PETSC ERROR: Configure options   Option left: name:-fieldsplit_1_pc_type value: asm source: command line [1]PETSC ERROR:   Option left: name:-fieldsplit_1_sub_pc_type value: lu source: command line [1]PETSC ERROR:   Option left: name:-fieldsplit_ksp_type value: gmres source: command line [2]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. [2]PETSC ERROR: Petsc Release Version 3.20.2, unknown [2]PETSC ERROR:   Option left: name:-fieldsplit_1_sub_pc_type value: lu source: command line [3]PETSC ERROR:   Option left: name:-fieldsplit_ksp_type value: gmres source: command line [3]PETSC ERROR:   Option left: name:-snes_converged_reason (no value) source: command line [0]PETSC ERROR: #1 ISSetBlockSize() at /home/daihuanhe/petsc-v3.20.2/src/vec/is/is/interface/index.c:1933 [0]PETSC ERROR: [1]PETSC ERROR:   Option left: name:-snes_converged_reason (no value) source: command line [1]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
[1]PETSC ERROR: Petsc Release Version 3.20.2, unknown ./ex19_1 on a arch-linux-c-debug named daihuanhe-Aspire-A315-55G by daihuanhe Wed Dec 13 16:47:08 2023 [2]PETSC ERROR: Configure options [2]PETSC ERROR: [3]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. [3]PETSC ERROR: Petsc Release Version 3.20.2, unknown [3]PETSC ERROR: ./ex19_1 on a arch-linux-c-debug named daihuanhe-Aspire-A315-55G by daihuanhe Wed Dec 13 16:47:08 2023 #2 PCSetUp_FieldSplit() at /home/daihuanhe/petsc-v3.20.2/src/ksp/pc/impls/fieldsplit/fieldsplit.c:633 [0]PETSC ERROR: #3 PCSetUp() at /home/daihuanhe/petsc-v3.20.2/src/ksp/pc/interface/precon.c:1080 [1]PETSC ERROR: ./ex19_1 on a arch-linux-c-debug named daihuanhe-Aspire-A315-55G by daihuanhe Wed Dec 13 16:47:08 2023 [1]PETSC ERROR: Configure options #1 ISSetBlockSize() at /home/daihuanhe/petsc-v3.20.2/src/vec/is/is/interface/index.c:1933 [2]PETSC ERROR: #2 PCSetUp_FieldSplit() at /home/daihuanhe/petsc-v3.20.2/src/ksp/pc/impls/fieldsplit/fieldsplit.c:633 [2]PETSC ERROR: [3]PETSC ERROR: Configure options [3]PETSC ERROR: #1 ISSetBlockSize() at /home/daihuanhe/petsc-v3.20.2/src/vec/is/is/interface/index.c:1933 [0]PETSC ERROR: #4 KSPSetUp() at /home/daihuanhe/petsc-v3.20.2/src/ksp/ksp/interface/itfunc.c:415 [0]PETSC ERROR: #5 KSPSolve_Private() at /home/daihuanhe/petsc-v3.20.2/src/ksp/ksp/interface/itfunc.c:836 [1]PETSC ERROR: #1 ISSetBlockSize() at /home/daihuanhe/petsc-v3.20.2/src/vec/is/is/interface/index.c:1933 [1]PETSC ERROR: #2 PCSetUp_FieldSplit() at /home/daihuanhe/petsc-v3.20.2/src/ksp/pc/impls/fieldsplit/fieldsplit.c:633 #3 PCSetUp() at /home/daihuanhe/petsc-v3.20.2/src/ksp/pc/interface/precon.c:1080 [2]PETSC ERROR: #4 KSPSetUp() at /home/daihuanhe/petsc-v3.20.2/src/ksp/ksp/interface/itfunc.c:415 [2]PETSC ERROR: #5 KSPSolve_Private() at /home/daihuanhe/petsc-v3.20.2/src/ksp/ksp/interface/itfunc.c:836 [3]PETSC ERROR: #2 PCSetUp_FieldSplit() at /home/daihuanhe/petsc-v3.20.2/src/ksp/pc/impls/fieldsplit/fieldsplit.c:633 [3]PETSC ERROR: [0]PETSC ERROR: #6 KSPSolve() at /home/daihuanhe/petsc-v3.20.2/src/ksp/ksp/interface/itfunc.c:1083 [0]PETSC ERROR: #7 SNESSolve_NEWTONLS() at /home/daihuanhe/petsc-v3.20.2/src/snes/impls/ls/ls.c:215 [0]PETSC ERROR: [1]PETSC ERROR: #3 PCSetUp() at /home/daihuanhe/petsc-v3.20.2/src/ksp/pc/interface/precon.c:1080 [1]PETSC ERROR: #4 KSPSetUp() at /home/daihuanhe/petsc-v3.20.2/src/ksp/ksp/interface/itfunc.c:415 [1]PETSC ERROR: [2]PETSC ERROR: #6 KSPSolve() at /home/daihuanhe/petsc-v3.20.2/src/ksp/ksp/interface/itfunc.c:1083 [2]PETSC ERROR: #7 SNESSolve_NEWTONLS() at /home/daihuanhe/petsc-v3.20.2/src/snes/impls/ls/ls.c:215 [2]PETSC ERROR: #3 PCSetUp() at /home/daihuanhe/petsc-v3.20.2/src/ksp/pc/interface/precon.c:1080 [3]PETSC ERROR: #4 KSPSetUp() at /home/daihuanhe/petsc-v3.20.2/src/ksp/ksp/interface/itfunc.c:415 [3]PETSC ERROR: #5 KSPSolve_Private() at /home/daihuanhe/petsc-v3.20.2/src/ksp/ksp/interface/itfunc.c:836 [3]PETSC ERROR: #6 KSPSolve() at /home/daihuanhe/petsc-v3.20.2/src/ksp/ksp/interface/itfunc.c:1083 #8 SNESSolve() at /home/daihuanhe/petsc-v3.20.2/src/snes/interface/snes.c:4659 [0]PETSC ERROR: #9 main() at ex19_1.c:159 [0]PETSC ERROR: PETSc Option Table entries: [0]PETSC ERROR: -da_grid_x 128 (source: command line) [0]PETSC ERROR: #5 KSPSolve_Private() at /home/daihuanhe/petsc-v3.20.2/src/ksp/ksp/interface/itfunc.c:836 [1]PETSC ERROR: #6 KSPSolve() at /home/daihuanhe/petsc-v3.20.2/src/ksp/ksp/interface/itfunc.c:1083 [1]PETSC ERROR: #7 SNESSolve_NEWTONLS() at 
/home/daihuanhe/petsc-v3.20.2/src/snes/impls/ls/ls.c:215 [1]PETSC ERROR: #8 SNESSolve() at /home/daihuanhe/petsc-v3.20.2/src/snes/interface/snes.c:4659 #8 SNESSolve() at /home/daihuanhe/petsc-v3.20.2/src/snes/interface/snes.c:4659 [2]PETSC ERROR: #9 main() at ex19_1.c:159 [2]PETSC ERROR: PETSc Option Table entries: [2]PETSC ERROR: -da_grid_x 128 (source: command line) [2]PETSC ERROR: -da_grid_y 128 (source: command line) [3]PETSC ERROR: #7 SNESSolve_NEWTONLS() at /home/daihuanhe/petsc-v3.20.2/src/snes/impls/ls/ls.c:215 [3]PETSC ERROR: #8 SNESSolve() at /home/daihuanhe/petsc-v3.20.2/src/snes/interface/snes.c:4659 [3]PETSC ERROR: -da_grid_y 128 (source: command line) [0]PETSC ERROR: -fieldsplit_0_ksp_atol 1e-10 (source: command line) [0]PETSC ERROR: -fieldsplit_0_ksp_rtol 1e-8 (source: command line) [0]PETSC ERROR: -fieldsplit_0_pc_asm_overlap 5 (source: command line) [1]PETSC ERROR: #9 main() at ex19_1.c:159 [1]PETSC ERROR: PETSc Option Table entries: [1]PETSC ERROR: -da_grid_x 128 (source: command line) [2]PETSC ERROR: -fieldsplit_0_ksp_atol 1e-10 (source: command line) [2]PETSC ERROR: -fieldsplit_0_ksp_rtol 1e-8 (source: command line) [2]PETSC ERROR: -fieldsplit_0_pc_asm_overlap 5 (source: command line) [2]PETSC ERROR: #9 main() at ex19_1.c:159 [3]PETSC ERROR: PETSc Option Table entries: [3]PETSC ERROR: -da_grid_x 128 (source: command line) [3]PETSC ERROR: [0]PETSC ERROR: -fieldsplit_0_pc_asm_type restrict (source: command line) [0]PETSC ERROR: -fieldsplit_0_pc_type asm (source: command line) [0]PETSC ERROR: -fieldsplit_0_sub_pc_type lu (source: command line) [0]PETSC ERROR: [1]PETSC ERROR: -da_grid_y 128 (source: command line) [1]PETSC ERROR: -fieldsplit_0_ksp_atol 1e-10 (source: command line) [1]PETSC ERROR: -fieldsplit_0_ksp_rtol 1e-8 (source: command line) -fieldsplit_0_pc_asm_type restrict (source: command line) [2]PETSC ERROR: -fieldsplit_0_pc_type asm (source: command line) [2]PETSC ERROR: -fieldsplit_0_sub_pc_type lu (source: command line) [2]PETSC ERROR: -fieldsplit_1_ksp_atol 1e-10 (source: command line) -da_grid_y 128 (source: command line) [3]PETSC ERROR: -fieldsplit_0_ksp_atol 1e-10 (source: command line) [3]PETSC ERROR: -fieldsplit_0_ksp_rtol 1e-8 (source: command line) [3]PETSC ERROR: -fieldsplit_1_ksp_atol 1e-10 (source: command line) [0]PETSC ERROR: -fieldsplit_1_ksp_rtol 1e-8 (source: command line) [0]PETSC ERROR: -fieldsplit_1_pc_asm_overlap 5 (source: command line) [0]PETSC ERROR: [1]PETSC ERROR: -fieldsplit_0_pc_asm_overlap 5 (source: command line) [1]PETSC ERROR: -fieldsplit_0_pc_asm_type restrict (source: command line) [1]PETSC ERROR: -fieldsplit_0_pc_type asm (source: command line) [2]PETSC ERROR: -fieldsplit_1_ksp_rtol 1e-8 (source: command line) [2]PETSC ERROR: -fieldsplit_1_pc_asm_overlap 5 (source: command line) [2]PETSC ERROR: -fieldsplit_1_pc_asm_type restrict (source: command line) [2]PETSC ERROR: -fieldsplit_0_pc_asm_overlap 5 (source: command line) [3]PETSC ERROR: -fieldsplit_0_pc_asm_type restrict (source: command line) [3]PETSC ERROR: -fieldsplit_0_pc_type asm (source: command line) [3]PETSC ERROR: -fieldsplit_1_pc_asm_type restrict (source: command line) [0]PETSC ERROR: -fieldsplit_1_pc_type asm (source: command line) [0]PETSC ERROR: -fieldsplit_1_sub_pc_type lu (source: command line) [0]PETSC ERROR: [1]PETSC ERROR: -fieldsplit_0_sub_pc_type lu (source: command line) [1]PETSC ERROR: -fieldsplit_1_ksp_atol 1e-10 (source: command line) [1]PETSC ERROR: -fieldsplit_1_ksp_rtol 1e-8 (source: command line) -fieldsplit_1_pc_type asm (source: command line) 
[2]PETSC ERROR: -fieldsplit_1_sub_pc_type lu (source: command line) [2]PETSC ERROR: -fieldsplit_ksp_type gmres (source: command line) [2]PETSC ERROR: -grashof 10000 (source: command line) -fieldsplit_0_sub_pc_type lu (source: command line) [3]PETSC ERROR: -fieldsplit_1_ksp_atol 1e-10 (source: command line) [3]PETSC ERROR: -fieldsplit_1_ksp_rtol 1e-8 (source: command line) [3]PETSC ERROR: -fieldsplit_ksp_type gmres (source: command line) [0]PETSC ERROR: -grashof 10000 (source: command line) [0]PETSC ERROR: -lidvelocity 100 (source: command line) [0]PETSC ERROR: -pc_fieldsplit_0_fields 0,2 (source: command line) [1]PETSC ERROR: -fieldsplit_1_pc_asm_overlap 5 (source: command line) [1]PETSC ERROR: -fieldsplit_1_pc_asm_type restrict (source: command line) [1]PETSC ERROR: -fieldsplit_1_pc_type asm (source: command line) [2]PETSC ERROR: -lidvelocity 100 (source: command line) [2]PETSC ERROR: -pc_fieldsplit_0_fields 0,2 (source: command line) [2]PETSC ERROR: -pc_fieldsplit_1_fields 1,3 (source: command line) [2]PETSC ERROR: -fieldsplit_1_pc_asm_overlap 5 (source: command line) [3]PETSC ERROR: -fieldsplit_1_pc_asm_type restrict (source: command line) [3]PETSC ERROR: -fieldsplit_1_pc_type asm (source: command line) [3]PETSC ERROR: [0]PETSC ERROR: -pc_fieldsplit_1_fields 1,3 (source: command line) [0]PETSC ERROR: -pc_fieldsplit_block_size 4 (source: command line) [0]PETSC ERROR: -pc_fieldsplit_type multiplicative (source: command line) [1]PETSC ERROR: -fieldsplit_1_sub_pc_type lu (source: command line) [1]PETSC ERROR: -fieldsplit_ksp_type gmres (source: command line) [1]PETSC ERROR: -grashof 10000 (source: command line) -pc_fieldsplit_block_size 4 (source: command line) [2]PETSC ERROR: -pc_fieldsplit_type multiplicative (source: command line) [2]PETSC ERROR: -pc_type fieldsplit (source: command line) [2]PETSC ERROR: -prandtl 0.72 (source: command line) -fieldsplit_1_sub_pc_type lu (source: command line) [3]PETSC ERROR: -fieldsplit_ksp_type gmres (source: command line) [3]PETSC ERROR: -grashof 10000 (source: command line) [3]PETSC ERROR: [0]PETSC ERROR: -pc_type fieldsplit (source: command line) [0]PETSC ERROR: -prandtl 0.72 (source: command line) [0]PETSC ERROR: -snes_converged_reason (source: command line) [1]PETSC ERROR: -lidvelocity 100 (source: command line) [1]PETSC ERROR: -pc_fieldsplit_0_fields 0,2 (source: command line) [1]PETSC ERROR: -pc_fieldsplit_1_fields 1,3 (source: command line) [2]PETSC ERROR: -snes_converged_reason (source: command line) [2]PETSC ERROR: -snes_monitor (source: command line) [2]PETSC ERROR: -snes_type newtonls (source: command line) -lidvelocity 100 (source: command line) [3]PETSC ERROR: -pc_fieldsplit_0_fields 0,2 (source: command line) [3]PETSC ERROR: -pc_fieldsplit_1_fields 1,3 (source: command line) [3]PETSC ERROR: [0]PETSC ERROR: -snes_monitor (source: command line) [0]PETSC ERROR: -snes_type newtonls (source: command line) [0]PETSC ERROR: [1]PETSC ERROR: -pc_fieldsplit_block_size 4 (source: command line) [1]PETSC ERROR: -pc_fieldsplit_type multiplicative (source: command line) [1]PETSC ERROR: -pc_type fieldsplit (source: command line) [2]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- Abort(62) on node 2 (rank 0 in comm 16): application called MPI_Abort(MPI_COMM_SELF, 62) - process 0 -pc_fieldsplit_block_size 4 (source: command line) [3]PETSC ERROR: -pc_fieldsplit_type multiplicative (source: command line) [3]PETSC ERROR: -pc_type fieldsplit (source: command line) [3]PETSC ERROR: 
----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- Abort(62) on node 0 (rank 0 in comm 16): application called MPI_Abort(MPI_COMM_SELF, 62) - process 0 [1]PETSC ERROR: -prandtl 0.72 (source: command line) [1]PETSC ERROR: -snes_converged_reason (source: command line) [1]PETSC ERROR: -snes_monitor (source: command line) -prandtl 0.72 (source: command line) [3]PETSC ERROR: -snes_converged_reason (source: command line) [3]PETSC ERROR: -snes_monitor (source: command line) [1]PETSC ERROR: -snes_type newtonls (source: command line) [1]PETSC ERROR: [3]PETSC ERROR: -snes_type newtonls (source: command line) [3]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- Abort(62) on node 1 (rank 0 in comm 16): application called MPI_Abort(MPI_COMM_SELF, 62) - process 0 Abort(62) on node 3 (rank 0 in comm 16): application called MPI_Abort(MPI_COMM_SELF, 62) - process 0?. Please tell me what should I do? Thank you very much. 1807580692 1807580692 at qq.com -------------- next part -------------- An HTML attachment was scrubbed... URL: 
From mmolinos at us.es Wed Dec 13 04:18:50 2023 From: mmolinos at us.es (MIGUEL MOLINOS PEREZ) Date: Wed, 13 Dec 2023 10:18:50 +0000 Subject: [petsc-users] Domain decomposition in PETSc for Molecular Dynamics In-Reply-To: References: <2BEA961D-00D5-4880-A162-7262E398C048@us.es> <9072F31D-93C8-43B6-8D96-5DB4CFDECFF9@us.es> Message-ID: Thank you for the feedback. Seems like the second option is the right one (according to MD literature). Best, Miguel On 13 Dec 2023, at 01:43, Matthew Knepley wrote: On Tue, Dec 12, 2023 at 7:28 PM MIGUEL MOLINOS PEREZ > wrote: I meant the list of atoms which lies inside of a sphere of radius R_cutoff centered at the mean position of a given atom. Okay, this is possible in parallel, but would require hard work and I have not done it yet, although I think all the tools are coded. In serial, there are at least two ways to do it. First, you can use a k-d tree implementation, since they usually have the radius query. I have not put one of these in, because I did not like any implementation, but Underworld and Firedrake have and it works fine. Second, you can choose a grid size of R_cutoff for the background grid, and then check neighbors. This is probably how I would start. Thanks, Matt Best, Miguel On 13 Dec 2023, at 01:14, Matthew Knepley > wrote: On Tue, Dec 12, 2023 at 2:36 PM MIGUEL MOLINOS PEREZ > wrote: Dear Matthew and Mark, Thank you for your useful guidance. I have taken as a starting point the example in "dm/tutorials/swarm_ex3.c" to build a first approximation for domain decomposition in my molecular dynamics code (diffusive molecular dynamics to be more precise :-) ). And I must say that I am very happy with the result. However, in my journey integrating domain decomposition into my code, I am facing some new (and expected) problems. The first is in the implementation of the nearest neighbor algorithm (list of atoms closest to each atom). Can you help me understand this? For a given atom, there should be a single "closest" atom (barring degeneracies in distance). What do you mean by the list of closest atoms? Thanks, Matt My current approach to the problem is a brute force algorithm (double loop through the list of atoms and calculate the distance).
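[For reference, a minimal sketch of the background-grid (cell-list) search that Matt suggests above -- this is not code from the thread, and all names in it are made up. Atoms are binned into cubic cells of edge R_cutoff, so each atom is compared only against atoms in its own and the 26 surrounding cells instead of against every atom; periodic images and the ghost atoms needed across MPI ranks are deliberately left out of the sketch.]

#include <array>
#include <cmath>
#include <unordered_map>
#include <vector>

using Vec3 = std::array<double, 3>;

// Pack three (possibly negative) cell indices into one hashable key.
static long long cellKey(int cx, int cy, int cz) {
  const long long B = 1 << 19;  // supports cell indices in (-2^19, 2^19)
  return ((cx + B) * 2 * B + (cy + B)) * 2 * B + (cz + B);
}

// Build, for every atom, the list of atoms within r_cutoff of it.
std::vector<std::vector<int>> neighbourLists(const std::vector<Vec3>& q, double r_cutoff) {
  std::unordered_map<long long, std::vector<int>> cells;
  auto cellOf = [&](double x) { return (int)std::floor(x / r_cutoff); };

  // Bin every atom into a cubic cell of edge r_cutoff.
  for (int i = 0; i < (int)q.size(); ++i)
    cells[cellKey(cellOf(q[i][0]), cellOf(q[i][1]), cellOf(q[i][2]))].push_back(i);

  std::vector<std::vector<int>> neigh(q.size());
  const double r2 = r_cutoff * r_cutoff;
  for (int i = 0; i < (int)q.size(); ++i) {
    const int cx = cellOf(q[i][0]), cy = cellOf(q[i][1]), cz = cellOf(q[i][2]);
    // Only the 27 cells around atom i can contain neighbours within r_cutoff.
    for (int dx = -1; dx <= 1; ++dx)
      for (int dy = -1; dy <= 1; ++dy)
        for (int dz = -1; dz <= 1; ++dz) {
          auto it = cells.find(cellKey(cx + dx, cy + dy, cz + dz));
          if (it == cells.end()) continue;
          for (int j : it->second) {
            if (j == i) continue;
            const double ex = q[i][0] - q[j][0], ey = q[i][1] - q[j][1], ez = q[i][2] - q[j][2];
            if (ex * ex + ey * ey + ez * ez <= r2) neigh[i].push_back(j);
          }
        }
  }
  return neigh;
}

[The same binning also gives a natural criterion for which atoms have to be exchanged between neighbouring ranks before the search, which is the part that still needs the DMSwarm machinery discussed in this thread.]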
However, it seems that if I call the "neighbours" function after the "DMSwarmMigrate" function the search algorithm does not work correctly. My thoughts / hints are: * The two nested for loops should be done on the global indexing of the atoms instead of the local one (I don't know how to get this number). * If I print the mean position of atom #0 (for example) each range prints a different value of the average position. One of them is the correct position corresponding to site #0, the others are different (but identically labeled) atomic sites. Which means that the site_i index is not bijective. I believe that solving this problem will increase my understanding of the domain decomposition approach and may allow me to fix the remaining parts of my code. Any additional comments are greatly appreciated. For instance, I will be happy to be pointed to any piece of code (petsc examples for example) with solves a similar problem in order to self-learn learn by example. Many thanks in advance. Best, Miguel This is the piece of code (simplified) which computes the list of neighbours for each atomic site. DMD is a structure which contains the atomistic information (DMSWARM), and the background mesh and bounding cell (DMDA and DMShell) int neighbours(DMD* Simulation) { PetscFunctionBegin; PetscCallMPI(MPI_Comm_rank(PETSC_COMM_WORLD, &rank)); PetscCall(DMSwarmGetLocalSize(Simulation->atomistic_data, &n_atoms_local)); //! Get array with the mean position of the atoms DMSwarmGetField(Simulation->atomistic_data, DMSwarmPICField_coor, &blocksize, NULL, (void**)&mean_q_ptr); Eigen::Map mean_q(mean_q_ptr, n_atoms_local, dim); int* neigh = Simulation->neigh; int* numneigh = Simulation->numneigh; for (unsigned int site_i = 0; site_i < n_atoms_local; site_i++) { //! Get mean position of site i Eigen::Vector3d mean_q_i = mean_q.block<1, 3>(site_i, 0); //! Search neighbourhs in the main cell (+ periodic cells) for (unsigned site_j = 0; site_j < n_atoms_local; site_j++) { if (site_i != site_j) { //! Get mean position of site j in the periodic box Eigen::Vector3d mean_q_j = mean_q.block<1, 3>(site_j, 0); //! Check is site j is the neibourhood of the site i double norm_r_ij = (mean_q_i - mean_q_j).norm(); if ((norm_r_ij <= r_cutoff_ADP) && (numneigh[site_i] < maxneigh)) { neigh[site_i * maxneigh + numneigh[site_i]] = site_j; numneigh[site_i] += 1; } } } } // MPI for loop (site_i) DMSwarmRestoreField(Simulation->atomistic_data, DMSwarmPICField_coor, &blocksize, NULL, (void**)&mean_q_ptr); return EXIT_SUCCESS; } This is the piece of code that I use to read the atomic positions (mean_q) from a file: //! @brief mean_q: Mean value of each atomic position double* mean_q; PetscCall(DMSwarmGetField(atomistic_data, DMSwarmPICField_coor, &blocksize, NULL, (void**)&mean_q)); cnt = 0; for (PetscInt site_i = 0; site_i < n_atoms_local; site_i++) { if (cnt < n_atoms) { mean_q[blocksize * cnt + 0] = Simulation_file.mean_q[cnt * dim + 0]; mean_q[blocksize * cnt + 1] = Simulation_file.mean_q[cnt * dim + 1]; mean_q[blocksize * cnt + 2] = Simulation_file.mean_q[cnt * dim + 2]; cnt++; } } PetscCall(DMSwarmRestoreField(atomistic_data, DMSwarmPICField_coor, &blocksize, NULL, (void**)&mean_q)); On 4 Nov 2023, at 15:50, MIGUEL MOLINOS PEREZ > wrote: ?Thank you Mark! I will have a look to it. Best, Miguel On 4 Nov 2023, at 13:54, Matthew Knepley > wrote: ? 
On Sat, Nov 4, 2023 at 8:40?AM Mark Adams > wrote: Hi MIGUEL, This might be a good place to start: https://petsc.org/main/manual/vec/ Feel free to ask more specific questions, but the docs are a good place to start. Thanks, Mark On Fri, Nov 3, 2023 at 5:19?AM MIGUEL MOLINOS PEREZ > wrote: Dear all, I am currently working on the development of a in-house molecular dynamics code using PETSc and C++. So far the code works great, however it is a little bit slow since I am not exploiting MPI for PETSc vectors. I was wondering if there is a way to perform the domain decomposition efficiently using some PETSc functionality. Any feedback is highly appreciated. It sounds like you mean "is there a way to specify a communication construct that can send my particle information automatically". We use PetscSF for that. You can see how this works with the DMSwarm class, which represents a particle discretization. You can either use that, or if it does not work for you, do the same things with your class. Thanks, Matt Best regards, Miguel -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Wed Dec 13 05:38:15 2023 From: mfadams at lbl.gov (Mark Adams) Date: Wed, 13 Dec 2023 06:38:15 -0500 Subject: [petsc-users] Fortran Interface In-Reply-To: References: <94202A46-2221-4F08-8B35-1BE16C7937AC@petsc.dev> Message-ID: yep, that's a cut and paste bug. "user" is a common name for a user context. On Tue, Dec 12, 2023 at 11:17?PM Sanjay Govindjee via petsc-users < petsc-users at mcs.anl.gov> wrote: > did you mean to write > > type (userctx) ctx > > in this example? > > > subroutine func(snes, x, f, ctx, ierr) > SNES snes > Vec x,f > type (userctx) user > PetscErrorCode ierr > ... > > external func > SNESSetFunction(snes, r, func, ctx, ierr) > SNES snes > Vec r > PetscErrorCode ierr > type (userctx) user > > > > > On Tue, Dec 12, 2023 at 7:10?PM Barry Smith wrote: > >> >> See >> https://petsc.gitlab.io/-/petsc/-/jobs/5739238224/artifacts/public/html/manual/fortran.html#ch-fortran >> and https://gitlab.com/petsc/petsc/-/merge_requests/7114 >> >> >> >> On Dec 12, 2023, at 3:22?PM, Palmer, Bruce J via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> >> What do you do with something like a void pointer? I?m looking at the >> TaoSetObjectiveAndGradient function and it wants to pass a void *ctx >> pointer. You can set this to null, but apparently you have to specify the >> type. What type should I use? Is there something called PETSC_NULL_VOID or >> PETSC_NULL_CONTEXT or do I use something else? >> >> >> *From: *Matthew Knepley >> *Date: *Tuesday, December 12, 2023 at 8:33 AM >> *To: *Palmer, Bruce J >> *Cc: *petsc-users at mcs.anl.gov >> *Subject: *Re: [petsc-users] Fortran Interface >> Check twice before you click! This email originated from outside PNNL. 
>> >> On Tue, Dec 12, 2023 at 11:27?AM Palmer, Bruce J via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> >> Does documentation for the PETSc fortran interface still exist? I looked >> at the web pages for 3.20 (petsc.org/release) but if you go under the >> tab C/Fortran API, only descriptions for the C interface are there. >> >> >> I think after the most recent changes, the interface was supposed to be >> very close to C, so we just document the differences on specific pages, and >> put the general stuff here: >> >> https://petsc.org/release/manual/fortran/ >> >> Thanks, >> >> Matt >> >> >> Bruce Palmer >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Dec 13 08:27:50 2023 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 13 Dec 2023 09:27:50 -0500 Subject: [petsc-users] Fortran Interface In-Reply-To: References: <94202A46-2221-4F08-8B35-1BE16C7937AC@petsc.dev> Message-ID: <68FF47E6-F54A-4321-9C37-E256C66E21EF@petsc.dev> fixed > On Dec 12, 2023, at 11:17?PM, Sanjay Govindjee wrote: > > did you mean to write > type (userctx) ctx > in this example? > > subroutine func(snes, x, f, ctx, ierr) > SNES snes > Vec x,f > type (userctx) user > PetscErrorCode ierr > ... > > external func > SNESSetFunction(snes, r, func, ctx, ierr) > SNES snes > Vec r > PetscErrorCode ierr > type (userctx) user > > > > On Tue, Dec 12, 2023 at 7:10?PM Barry Smith > wrote: >> >> See https://petsc.gitlab.io/-/petsc/-/jobs/5739238224/artifacts/public/html/manual/fortran.html#ch-fortran and https://gitlab.com/petsc/petsc/-/merge_requests/7114 >> >> >> >>> On Dec 12, 2023, at 3:22?PM, Palmer, Bruce J via petsc-users > wrote: >>> >>> What do you do with something like a void pointer? I?m looking at the TaoSetObjectiveAndGradient function and it wants to pass a void *ctx pointer. You can set this to null, but apparently you have to specify the type. What type should I use? Is there something called PETSC_NULL_VOID or PETSC_NULL_CONTEXT or do I use something else? >>> >>> From: Matthew Knepley > >>> Date: Tuesday, December 12, 2023 at 8:33 AM >>> To: Palmer, Bruce J > >>> Cc: petsc-users at mcs.anl.gov > >>> Subject: Re: [petsc-users] Fortran Interface >>> >>> Check twice before you click! This email originated from outside PNNL. >>> >>> On Tue, Dec 12, 2023 at 11:27?AM Palmer, Bruce J via petsc-users > wrote: >>> Does documentation for the PETSc fortran interface still exist? I looked at the web pages for 3.20 (petsc.org/release ) but if you go under the tab C/Fortran API, only descriptions for the C interface are there. >>> >>> I think after the most recent changes, the interface was supposed to be very close to C, so we just document the differences on specific pages, and put the general stuff here: >>> >>> https://petsc.org/release/manual/fortran/ >>> >>> Thanks, >>> >>> Matt >>> >>> Bruce Palmer >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From coltonbryant2021 at u.northwestern.edu Wed Dec 13 10:21:43 2023 From: coltonbryant2021 at u.northwestern.edu (Colton Bryant) Date: Wed, 13 Dec 2023 10:21:43 -0600 Subject: [petsc-users] DMSTAG Gathering Vector on single process In-Reply-To: <73C92F1E-2DB7-416D-A694-AD293027E295@petsc.dev> References: <5A08B1BB-933D-4A46-8369-510D1C5AFDC6@petsc.dev> <73C92F1E-2DB7-416D-A694-AD293027E295@petsc.dev> Message-ID: Hi, Thanks for the help last week. The suggestions made the implementation I had much cleaner. I had one follow up question. Is there a way to sort of undo this operation? I know the vec scatter can be done backwards to distribute the arrays but I didn't see an easy way to migrate the DMDA vectors back into the DMStag object. Thanks for any advice. -Colton On Wed, Dec 6, 2023 at 8:18?PM Barry Smith wrote: > > > On Dec 6, 2023, at 8:35?PM, Matthew Knepley wrote: > > On Wed, Dec 6, 2023 at 8:10?PM Barry Smith wrote: > >> >> Depending on the serial library you may not need to split the vector >> into DMDA vectors with DMStagVecSplitToDMDA() for each component. Just >> global to natural and scatter to zero on the full vector, now the full >> vector is on the first rank and you can access what you need in that one >> vector if possible. >> > > Does DMStag have a GlobalToNatural? > > > Good point, it does not appear to have such a thing, though it could. > > Also, the serial code would have to have identical interleaving. > > Thanks, > > Matt > >> On Dec 6, 2023, at 6:37?PM, Colton Bryant < >> coltonbryant2021 at u.northwestern.edu> wrote: >> >> Ah excellent! I was not aware of the ability to preallocate the objects >> and migrate them each time. >> >> Thanks! >> -Colton >> >> On Wed, Dec 6, 2023 at 5:18?PM Matthew Knepley wrote: >> >>> On Wed, Dec 6, 2023 at 5:54?PM Colton Bryant < >>> coltonbryant2021 at u.northwestern.edu> wrote: >>> >>>> Hello, >>>> >>>> I am working on a code in which a DMSTAG object is used to solve a >>>> fluid flow problem and I need to gather this flow data on a single process >>>> to interact with an existing (serial) library at each timestep of my >>>> simulation. After looking around the solution I've tried is: >>>> >>>> -use DMStagVecSplitToDMDA to extract vectors of each component of the >>>> flow >>>> -use DMDACreateNaturalVector and DMDAGlobalToNatural to get the >>>> components naturally ordered >>>> -use VecScatterCreateToZero to set up and then do the scatter to gather >>>> on the single process >>>> >>>> Unless I'm misunderstanding something this method results in a lot of >>>> memory allocation/freeing happening at each step of the evolution and I was >>>> wondering if there is a way to directly perform such a scatter from the >>>> DMSTAG object without splitting as I'm doing here. >>>> >>> >>> 1) You can see here: >>> >>> >>> https://petsc.org/main/src/dm/impls/stag/stagda.c.html#DMStagVecSplitToDMDA >>> >>> that this function is small. You can do the DMDA creation manually, and >>> then just call DMStagMigrateVecDMDA() each time, which will not create >>> anything. >>> >>> 2) You can create the natural vector upfront, and just scatter each time. >>> >>> 3) You can create the serial vector upfront, and just scatter each time. >>> >>> This is some data movement. You can compress the g2n and 2zero scatters >>> using >>> >>> https://petsc.org/main/manualpages/PetscSF/PetscSFCompose/ >>> >>> as an optimization. >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Any advice would be much appreciated! 
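[A compact, illustrative sketch of the gather/scatter round trip described in the steps above, with the objects created once and reused each time step as suggested later in the thread. Here 'da' and 'vda' are assumed to be the DMDA and its global vector produced by DMStagVecSplitToDMDA(); note that moving data from the DMDA vectors back into the DMStag vector is exactly the missing piece discussed in this thread, so only the DMDA side is shown.]

/* created once, reused every time step */
Vec        natural, on_rank0;
VecScatter to_zero;

PetscCall(DMDACreateNaturalVector(da, &natural));
PetscCall(VecScatterCreateToZero(natural, &to_zero, &on_rank0));

/* each time step: gather the DMDA global vector vda onto rank 0 ... */
PetscCall(DMDAGlobalToNaturalBegin(da, vda, INSERT_VALUES, natural));
PetscCall(DMDAGlobalToNaturalEnd(da, vda, INSERT_VALUES, natural));
PetscCall(VecScatterBegin(to_zero, natural, on_rank0, INSERT_VALUES, SCATTER_FORWARD));
PetscCall(VecScatterEnd(to_zero, natural, on_rank0, INSERT_VALUES, SCATTER_FORWARD));
/* ... rank 0 hands the array of on_rank0 to the serial library ... */

/* ... then push the (possibly modified) data back out again */
PetscCall(VecScatterBegin(to_zero, on_rank0, natural, INSERT_VALUES, SCATTER_REVERSE));
PetscCall(VecScatterEnd(to_zero, on_rank0, natural, INSERT_VALUES, SCATTER_REVERSE));
PetscCall(DMDANaturalToGlobalBegin(da, natural, INSERT_VALUES, vda));
PetscCall(DMDANaturalToGlobalEnd(da, natural, INSERT_VALUES, vda));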
>>>> >>>> Best, >>>> Colton Bryant >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Bruce.Palmer at pnnl.gov Wed Dec 13 10:42:16 2023 From: Bruce.Palmer at pnnl.gov (Palmer, Bruce J) Date: Wed, 13 Dec 2023 16:42:16 +0000 Subject: [petsc-users] Fortran Interface In-Reply-To: <68FF47E6-F54A-4321-9C37-E256C66E21EF@petsc.dev> References: <94202A46-2221-4F08-8B35-1BE16C7937AC@petsc.dev> <68FF47E6-F54A-4321-9C37-E256C66E21EF@petsc.dev> Message-ID: Thanks, that clears things up nicely. Bruce From: Barry Smith Date: Wednesday, December 13, 2023 at 6:28 AM To: Sanjay Govindjee Cc: Palmer, Bruce J , petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Fortran Interface fixed On Dec 12, 2023, at 11:17?PM, Sanjay Govindjee wrote: did you mean to write type (userctx) ctx in this example? subroutine func(snes, x, f, ctx, ierr) SNES snes Vec x,f type (userctx) user PetscErrorCode ierr ... external func SNESSetFunction(snes, r, func, ctx, ierr) SNES snes Vec r PetscErrorCode ierr type (userctx) user On Tue, Dec 12, 2023 at 7:10?PM Barry Smith > wrote: See https://petsc.gitlab.io/-/petsc/-/jobs/5739238224/artifacts/public/html/manual/fortran.html#ch-fortran and https://gitlab.com/petsc/petsc/-/merge_requests/7114 On Dec 12, 2023, at 3:22?PM, Palmer, Bruce J via petsc-users > wrote: What do you do with something like a void pointer? I?m looking at the TaoSetObjectiveAndGradient function and it wants to pass a void *ctx pointer. You can set this to null, but apparently you have to specify the type. What type should I use? Is there something called PETSC_NULL_VOID or PETSC_NULL_CONTEXT or do I use something else? From: Matthew Knepley > Date: Tuesday, December 12, 2023 at 8:33 AM To: Palmer, Bruce J > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Fortran Interface Check twice before you click! This email originated from outside PNNL. On Tue, Dec 12, 2023 at 11:27?AM Palmer, Bruce J via petsc-users > wrote: Does documentation for the PETSc fortran interface still exist? I looked at the web pages for 3.20 (petsc.org/release) but if you go under the tab C/Fortran API, only descriptions for the C interface are there. I think after the most recent changes, the interface was supposed to be very close to C, so we just document the differences on specific pages, and put the general stuff here: https://petsc.org/release/manual/fortran/ Thanks, Matt Bruce Palmer -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Wed Dec 13 10:48:23 2023 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 13 Dec 2023 11:48:23 -0500 Subject: [petsc-users] DMSTAG Gathering Vector on single process In-Reply-To: References: <5A08B1BB-933D-4A46-8369-510D1C5AFDC6@petsc.dev> <73C92F1E-2DB7-416D-A694-AD293027E295@petsc.dev> Message-ID: On Wed, Dec 13, 2023 at 11:22?AM Colton Bryant < coltonbryant2021 at u.northwestern.edu> wrote: > Hi, > > Thanks for the help last week. The suggestions made the implementation I > had much cleaner. I had one follow up question. Is there a way to sort of > undo this operation? I know the vec scatter can be done backwards to > distribute the arrays but I didn't see an easy way to migrate the DMDA > vectors back into the DMStag object. > It is not there. However, writing it would be straightforward. I would 1) Expose DMStagMigrateVecDMDA(), which is not currently public 2) Add a ScatterMode argument 3) Put in code that calls DMStagSetValuesStencil(), rather than GetValuesStencil() We would be happy to take on MR on this, and could help in the review. Thanks, MAtt > Thanks for any advice. > > -Colton > > On Wed, Dec 6, 2023 at 8:18?PM Barry Smith wrote: > >> >> >> On Dec 6, 2023, at 8:35?PM, Matthew Knepley wrote: >> >> On Wed, Dec 6, 2023 at 8:10?PM Barry Smith wrote: >> >>> >>> Depending on the serial library you may not need to split the vector >>> into DMDA vectors with DMStagVecSplitToDMDA() for each component. Just >>> global to natural and scatter to zero on the full vector, now the full >>> vector is on the first rank and you can access what you need in that one >>> vector if possible. >>> >> >> Does DMStag have a GlobalToNatural? >> >> >> Good point, it does not appear to have such a thing, though it could. >> >> Also, the serial code would have to have identical interleaving. >> >> Thanks, >> >> Matt >> >>> On Dec 6, 2023, at 6:37?PM, Colton Bryant < >>> coltonbryant2021 at u.northwestern.edu> wrote: >>> >>> Ah excellent! I was not aware of the ability to preallocate the objects >>> and migrate them each time. >>> >>> Thanks! >>> -Colton >>> >>> On Wed, Dec 6, 2023 at 5:18?PM Matthew Knepley >>> wrote: >>> >>>> On Wed, Dec 6, 2023 at 5:54?PM Colton Bryant < >>>> coltonbryant2021 at u.northwestern.edu> wrote: >>>> >>>>> Hello, >>>>> >>>>> I am working on a code in which a DMSTAG object is used to solve a >>>>> fluid flow problem and I need to gather this flow data on a single process >>>>> to interact with an existing (serial) library at each timestep of my >>>>> simulation. After looking around the solution I've tried is: >>>>> >>>>> -use DMStagVecSplitToDMDA to extract vectors of each component of the >>>>> flow >>>>> -use DMDACreateNaturalVector and DMDAGlobalToNatural to get the >>>>> components naturally ordered >>>>> -use VecScatterCreateToZero to set up and then do the scatter to >>>>> gather on the single process >>>>> >>>>> Unless I'm misunderstanding something this method results in a lot of >>>>> memory allocation/freeing happening at each step of the evolution and I was >>>>> wondering if there is a way to directly perform such a scatter from the >>>>> DMSTAG object without splitting as I'm doing here. >>>>> >>>> >>>> 1) You can see here: >>>> >>>> >>>> https://petsc.org/main/src/dm/impls/stag/stagda.c.html#DMStagVecSplitToDMDA >>>> >>>> that this function is small. You can do the DMDA creation manually, and >>>> then just call DMStagMigrateVecDMDA() each time, which will not create >>>> anything. 
>>>> >>>> 2) You can create the natural vector upfront, and just scatter each >>>> time. >>>> >>>> 3) You can create the serial vector upfront, and just scatter each time. >>>> >>>> This is some data movement. You can compress the g2n and 2zero scatters >>>> using >>>> >>>> https://petsc.org/main/manualpages/PetscSF/PetscSFCompose/ >>>> >>>> as an optimization. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Any advice would be much appreciated! >>>>> >>>>> Best, >>>>> Colton Bryant >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> >> >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From coltonbryant2021 at u.northwestern.edu Wed Dec 13 11:02:21 2023 From: coltonbryant2021 at u.northwestern.edu (Colton Bryant) Date: Wed, 13 Dec 2023 11:02:21 -0600 Subject: [petsc-users] DMSTAG Gathering Vector on single process In-Reply-To: References: <5A08B1BB-933D-4A46-8369-510D1C5AFDC6@petsc.dev> <73C92F1E-2DB7-416D-A694-AD293027E295@petsc.dev> Message-ID: Ok! Thanks for the reply. I'll take a look when I get a chance. -Colton On Wed, Dec 13, 2023 at 10:48?AM Matthew Knepley wrote: > On Wed, Dec 13, 2023 at 11:22?AM Colton Bryant < > coltonbryant2021 at u.northwestern.edu> wrote: > >> Hi, >> >> Thanks for the help last week. The suggestions made the implementation I >> had much cleaner. I had one follow up question. Is there a way to sort of >> undo this operation? I know the vec scatter can be done backwards to >> distribute the arrays but I didn't see an easy way to migrate the DMDA >> vectors back into the DMStag object. >> > > It is not there. However, writing it would be straightforward. I would > > 1) Expose DMStagMigrateVecDMDA(), which is not currently public > > 2) Add a ScatterMode argument > > 3) Put in code that calls DMStagSetValuesStencil(), rather than > GetValuesStencil() > > We would be happy to take on MR on this, and could help in the review. > > Thanks, > > MAtt > > >> Thanks for any advice. >> >> -Colton >> >> On Wed, Dec 6, 2023 at 8:18?PM Barry Smith wrote: >> >>> >>> >>> On Dec 6, 2023, at 8:35?PM, Matthew Knepley wrote: >>> >>> On Wed, Dec 6, 2023 at 8:10?PM Barry Smith wrote: >>> >>>> >>>> Depending on the serial library you may not need to split the vector >>>> into DMDA vectors with DMStagVecSplitToDMDA() for each component. Just >>>> global to natural and scatter to zero on the full vector, now the full >>>> vector is on the first rank and you can access what you need in that one >>>> vector if possible. >>>> >>> >>> Does DMStag have a GlobalToNatural? >>> >>> >>> Good point, it does not appear to have such a thing, though it could. >>> >>> Also, the serial code would have to have identical interleaving. 
>>> >>> Thanks, >>> >>> Matt >>> >>>> On Dec 6, 2023, at 6:37?PM, Colton Bryant < >>>> coltonbryant2021 at u.northwestern.edu> wrote: >>>> >>>> Ah excellent! I was not aware of the ability to preallocate the objects >>>> and migrate them each time. >>>> >>>> Thanks! >>>> -Colton >>>> >>>> On Wed, Dec 6, 2023 at 5:18?PM Matthew Knepley >>>> wrote: >>>> >>>>> On Wed, Dec 6, 2023 at 5:54?PM Colton Bryant < >>>>> coltonbryant2021 at u.northwestern.edu> wrote: >>>>> >>>>>> Hello, >>>>>> >>>>>> I am working on a code in which a DMSTAG object is used to solve a >>>>>> fluid flow problem and I need to gather this flow data on a single process >>>>>> to interact with an existing (serial) library at each timestep of my >>>>>> simulation. After looking around the solution I've tried is: >>>>>> >>>>>> -use DMStagVecSplitToDMDA to extract vectors of each component of the >>>>>> flow >>>>>> -use DMDACreateNaturalVector and DMDAGlobalToNatural to get the >>>>>> components naturally ordered >>>>>> -use VecScatterCreateToZero to set up and then do the scatter to >>>>>> gather on the single process >>>>>> >>>>>> Unless I'm misunderstanding something this method results in a lot of >>>>>> memory allocation/freeing happening at each step of the evolution and I was >>>>>> wondering if there is a way to directly perform such a scatter from the >>>>>> DMSTAG object without splitting as I'm doing here. >>>>>> >>>>> >>>>> 1) You can see here: >>>>> >>>>> >>>>> https://petsc.org/main/src/dm/impls/stag/stagda.c.html#DMStagVecSplitToDMDA >>>>> >>>>> that this function is small. You can do the DMDA creation manually, >>>>> and then just call DMStagMigrateVecDMDA() each time, which will not >>>>> create anything. >>>>> >>>>> 2) You can create the natural vector upfront, and just scatter each >>>>> time. >>>>> >>>>> 3) You can create the serial vector upfront, and just scatter each >>>>> time. >>>>> >>>>> This is some data movement. You can compress the g2n and 2zero >>>>> scatters using >>>>> >>>>> https://petsc.org/main/manualpages/PetscSF/PetscSFCompose/ >>>>> >>>>> as an optimization. >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> Any advice would be much appreciated! >>>>>> >>>>>> Best, >>>>>> Colton Bryant >>>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>> >>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >>> >>> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 
From mfadams at lbl.gov Wed Dec 13 13:54:17 2023 From: mfadams at lbl.gov (Mark Adams) Date: Wed, 13 Dec 2023 14:54:17 -0500 Subject: [petsc-users] PETSc 3.14 to PETSc 3.20: Different (slower) convergence for classical AMG (sequential and especially in parallel) In-Reply-To: <4c9f02898f324fcd8be1fe5dcc9f0416@cea.fr> References: <4c9f02898f324fcd8be1fe5dcc9f0416@cea.fr> Message-ID: Hi Pierre, Sorry I missed this post and your issues were brought to my attention today. First, the classical version is not supported well. The postdoc that wrote the code is long gone and I don't know the code at all. It is really a reference implementation that someone could build on and is not meant for production. In 10 years you are the first user that has contacted us. The hypre package is a very good AMG solver and it uses classical AMG as the main solver. I wrote GAMG ("agg"), which is a smoothed aggregation AMG solver and is very different from classical. I would suggest you move to hypre or '-pc_gamg_type agg'. The coarsening was developed in this time frame and there was a lot of churn, as a new strategy for aggressive coarsening did not work well for some users and I had to add the old method back in and then made it the default (again). This change missed v3.20, but you can get the old aggressive strategy with '-pc_gamg_aggressive_square_graph'. Run with -options_left to check that it is being used. As far as your output goes (nice formatting, thank you), the coarse grid is smaller in the new code (rows=41, cols=41 | rows=30, cols=30); "square graph" should fix this. You can also try not using aggressive coarsening at all with '-pc_gamg_aggressive_coarsening 0'. Let me know how it goes and let's try to get you into a more sustainable state ... I really try not to change this code but sometimes need to. Thanks, Mark On Mon, Oct 9, 2023 at 10:43 AM LEDAC Pierre wrote: > Hello all, > > > I am struggling to find the same convergence in iterations when using > classical algebraic multigrid in my code with PETSc 3.20 compared to PETSc > 3.14. > > > I am using, in order to solve a Poisson system: > > *-ksp_type cg -pc_type gamg -pc_gamg_type classical* > > > I read the different release notes between 3.15 and 3.20: > > https://petsc.org/release/changes/317 > > https://petsc.org/main/manualpages/PC/PCGAMGSetThreshold/ > > > and had a look at the mailing list archive (especially this one: > https://www.mail-archive.com/petsc-users at mcs.anl.gov/msg46688.html), > > so I added some other options to try to get the same behaviour as PETSc > 3.14: > > > *-ksp_type cg -pc_type gamg -pc_gamg_type classical -mg_levels_pc_type > sor -pc_gamg_threshold 0.* > > > It improves the convergence but there is still a difference in convergence > (26 vs 18 iterations). > > On another of my test cases, the number of levels is different (e.g. 6 vs > 4) as well, and here it is the same, but with a different coarsening according > to the output from the -ksp_view option. > > The main point is that the convergence dramatically degrades in parallel > on a third test case, so unfortunately I can't upgrade to PETSc 3.20 for now. > > I am sending you the partial report (petsc_314_vs_petsc_320.ksp_view) with > -ksp_view (left PETSc 3.14, right PETSc 3.20) and the configure/command > line options used (in petsc_XXX_petsc.TU files). > > > Could my issue be related to the following 3.18 changes? I have not tried the > first one. > > > - > > Remove PCGAMGSetSymGraph() and -pc_gamg_sym_graph.
The user should now > indicate symmetry and structural symmetry using MatSetOption > () and GAMG > will symmetrize the graph if a symmetric options is not set > - > > Change -pc_gamg_reuse_interpolation default from false to true. > > > Any advice would be greatly appreciated, > > > Pierre LEDAC > Commissariat ? l??nergie atomique et aux ?nergies alternatives > Centre de SACLAY > DES/ISAS/DM2S/SGLS/LCAN > B?timent 451 ? point courrier n?43 > F-91191 Gif-sur-Yvette > +33 1 69 08 04 03 > +33 6 83 42 05 79 > -------------- next part -------------- An HTML attachment was scrubbed... URL: From srvenkat at utexas.edu Wed Dec 13 16:05:41 2023 From: srvenkat at utexas.edu (Sreeram R Venkat) Date: Wed, 13 Dec 2023 16:05:41 -0600 Subject: [petsc-users] Matvecs and KSPSolves with multiple vectors In-Reply-To: References: <855B7267-E3F9-43CC-9EBE-96276E9A8E42@petsc.dev> Message-ID: Hello Pierre, I am trying out the KSPMatSolve with the BoomerAMG preconditioner. However, I am noticing that it is still solving column by column (this is stated explicitly in the info dump attached). I looked at the code for KSPMatSolve_Private() and saw that as long as ksp->ops->matsolve is true, it should do the batched solve, though I'm not sure where that gets set. I am using the options -pc_type hypre -pc_hypre_type boomeramg when running the code. Can you please help me with this? Thanks, Sreeram On Thu, Dec 7, 2023 at 4:04?PM Mark Adams wrote: > N.B., AMGX interface is a bit experimental. > Mark > > On Thu, Dec 7, 2023 at 4:11?PM Sreeram R Venkat > wrote: > >> Oh, in that case I will try out BoomerAMG. Getting AMGX to build >> correctly was also tricky so hopefully the HYPRE build will be easier. >> >> Thanks, >> Sreeram >> >> On Thu, Dec 7, 2023, 3:03 PM Pierre Jolivet wrote: >> >>> >>> >>> On 7 Dec 2023, at 9:37?PM, Sreeram R Venkat wrote: >>> >>> Thank you Barry and Pierre; I will proceed with the first option. >>> >>> I want to use the AMGX preconditioner for the KSP. I will try it out and >>> see how it performs. >>> >>> >>> Just FYI, AMGX does not handle systems with multiple RHS, and thus has >>> no PCMatApply() implementation. >>> BoomerAMG does, and there is a PCMatApply_HYPRE_BoomerAMG() >>> implementation. >>> But let us know if you need assistance figuring things out. >>> >>> Thanks, >>> Pierre >>> >>> Thanks, >>> Sreeram >>> >>> On Thu, Dec 7, 2023 at 2:02?PM Pierre Jolivet wrote: >>> >>>> To expand on Barry?s answer, we have observed repeatedly that >>>> MatMatMult with MatAIJ performs better than MatMult with MatMAIJ, you can >>>> reproduce this on your own with >>>> https://petsc.org/release/src/mat/tests/ex237.c.html. >>>> Also, I?m guessing you are using some sort of preconditioner within >>>> your KSP. >>>> Not all are ?KSPMatSolve-ready?, i.e., they may treat blocks of >>>> right-hand sides column by column, which is very inefficient. >>>> You could run your code with -info dump and send us dump.0 to see what >>>> needs to be done on our end to make things more efficient, should you not >>>> be satisfied with the current performance of the code. >>>> >>>> Thanks, >>>> Pierre >>>> >>>> On 7 Dec 2023, at 8:34?PM, Barry Smith wrote: >>>> >>>> >>>> >>>> On Dec 7, 2023, at 1:17?PM, Sreeram R Venkat >>>> wrote: >>>> >>>> I have 2 sequential matrices M and R (both MATSEQAIJCUSPARSE of size n >>>> x n) and a vector v of size n*m. v = [v_1 , v_2 ,... , v_m] where v_i has >>>> size n. The data for v can be stored either in column-major or row-major >>>> order. 
Now, I want to do 2 types of operations: >>>> >>>> 1. Matvecs of the form M*v_i = w_i, for i = 1..m. >>>> 2. KSPSolves of the form R*x_i = v_i, for i = 1..m. >>>> >>>> From what I have read on the documentation, I can think of 2 >>>> approaches. >>>> >>>> 1. Get the pointer to the data in v (column-major) and use it to create >>>> a dense matrix V. Then do a MatMatMult with M*V = W, and take the data >>>> pointer of W to create the vector w. For KSPSolves, use KSPMatSolve with R >>>> and V. >>>> >>>> 2. Create a MATMAIJ using M/R and use that for matvecs directly with >>>> the vector v. I don't know if KSPSolve with the MATMAIJ will know that it >>>> is a multiple RHS system and act accordingly. >>>> >>>> Which would be the more efficient option? >>>> >>>> >>>> Use 1. >>>> >>>> >>>> As a side-note, I am also wondering if there is a way to use row-major >>>> storage of the vector v. >>>> >>>> >>>> No >>>> >>>> The reason is that this could allow for more coalesced memory access >>>> when doing matvecs. >>>> >>>> >>>> PETSc matrix-vector products use BLAS GMEV matrix-vector products for >>>> the computation so in theory they should already be well-optimized >>>> >>>> >>>> Thanks, >>>> Sreeram >>>> >>>> >>>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: dump.0 Type: application/octet-stream Size: 1940230 bytes Desc: not available URL: From 2111191 at tongji.edu.cn Wed Dec 13 05:52:48 2023 From: 2111191 at tongji.edu.cn (2111191 at tongji.edu.cn) Date: Wed, 13 Dec 2023 19:52:48 +0800 (GMT+08:00) Subject: [petsc-users] =?utf-8?b?5o2V6I63?= Message-ID: <376cb39c.11543.18c6305b919.Coremail.2111191@tongji.edu.cn> Dear SLEPc Developers, I a am student from Tongji University. Recently I am trying to write a c++ program for matrix solving, which requires importing the PETSc library that you have developed. However a lot of errors occur in the cpp file when I use #include directly. I also try to use extern "C" but it gives me the error in the picture below. Is there a good way to use the PETSc library in a c++ program? (I compiled using cmake and my compiler is g++ (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44)). My cmakelists.txt is: cmake_minimum_required(VERSION 3.1.0) set(CMAKE_INSTALL_RPATH_USE_LINK_PATH TRUE) set(PETSC $ENV{PETSC_DIR}/$ENV{PETSC_ARCH}) set(SLEPC $ENV{SLEPC_DIR}/$ENV{PETSC_ARCH}) set(ENV{PKG_CONFIG_PATH} ${PETSC}/lib/pkgconfig:${SLEPC}/lib/pkgconfig) set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++11") set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -std=c99") project(test) add_executable(${PROJECT_NAME} eigen_test2.cpp) find_package(PkgConfig REQUIRED) pkg_search_module(PETSc REQUIRED IMPORTED_TARGET PETSc) target_link_libraries(${PROJECT_NAME} PkgConfig::PETSc) The testing code is:eigen_test2.cpp extern "C"{ //#include #include #include #include #include } int main(int argc,char **argv) { return 0; } Best regards Weijie Xu -------------- next part -------------- A non-text attachment was scrubbed... Name: ??.PNG Type: image/png Size: 63554 bytes Desc: not available URL: From 2111191 at tongji.edu.cn Wed Dec 13 21:13:01 2023 From: 2111191 at tongji.edu.cn (2111191 at tongji.edu.cn) Date: Thu, 14 Dec 2023 11:13:01 +0800 (GMT+08:00) Subject: [petsc-users] Some question about compiling c++ program including PETSc using cmake Message-ID: <771d0fcf.3be16.18c6650324c.Coremail.2111191@tongji.edu.cn> Dear SLEPc Developers, I a am student from Tongji University. 
Recently I am trying to write a c++ program for matrix solving, which requires importing the PETSc library that you have developed. However a lot of errors occur in the cpp file when I use #include directly. I also try to use extern "C" but it gives me the error in the picture below. Is there a good way to use the PETSc library in a c++ program? (I compiled using cmake and my compiler is g++ (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44)). My cmakelists.txt is: cmake_minimum_required(VERSION 3.1.0) set(CMAKE_INSTALL_RPATH_USE_LINK_PATH TRUE) set(PETSC $ENV{PETSC_DIR}/$ENV{PETSC_ARCH}) set(SLEPC $ENV{SLEPC_DIR}/$ENV{PETSC_ARCH}) set(ENV{PKG_CONFIG_PATH} ${PETSC}/lib/pkgconfig:${SLEPC}/lib/pkgconfig) set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++11") set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -std=c99") project(test) add_executable(${PROJECT_NAME} eigen_test2.cpp) find_package(PkgConfig REQUIRED) pkg_search_module(PETSc REQUIRED IMPORTED_TARGET PETSc) target_link_libraries(${PROJECT_NAME} PkgConfig::PETSc) The testing code is:eigen_test2.cpp extern "C"{ //#include #include #include #include #include } int main(int argc,char **argv) { return 0; } Best regards Weijie Xu From pierre at joliv.et Thu Dec 14 00:41:24 2023 From: pierre at joliv.et (Pierre Jolivet) Date: Thu, 14 Dec 2023 07:41:24 +0100 Subject: [petsc-users] Matvecs and KSPSolves with multiple vectors In-Reply-To: References: <855B7267-E3F9-43CC-9EBE-96276E9A8E42@petsc.dev> Message-ID: <9F227CDE-815C-4446-A424-0BBDA38B1E79@joliv.et> Hello Sreeram, KSPCG (PETSc implementation of CG) does not handle solves with multiple columns at once. There is only a single native PETSc KSP implementation which handles solves with multiple columns at once: KSPPREONLY. If you use --download-hpddm, you can use a CG (or GMRES, or more advanced methods) implementation which handles solves with multiple columns at once (via -ksp_type hpddm -ksp_hpddm_type cg or KSPSetType(ksp, KSPHPDDM); KSPHPDDMSetType(ksp, KSP_HPDDM_TYPE_CG);). I?m the main author of HPDDM, there is preliminary support for device matrices, but if it?s not working as intended/not faster than column by column, I?d be happy to have a deeper look (maybe in private), because most (if not all) of my users interested in (pseudo-)block Krylov solvers (i.e., solvers that treat right-hand sides in a single go) are using plain host matrices. Thanks, Pierre PS: you could have a look at https://www.sciencedirect.com/science/article/abs/pii/S0898122121000055 to understand the philosophy behind block iterative methods in PETSc (and in HPDDM), src/mat/tests/ex237.c, the benchmark I mentioned earlier, was developed in the context of this paper to produce Figures 2-3. Note that this paper is now slightly outdated, since then, PCHYPRE and PCMG (among others) have been made ?PCMatApply()-ready?. > On 13 Dec 2023, at 11:05?PM, Sreeram R Venkat wrote: > > Hello Pierre, > > I am trying out the KSPMatSolve with the BoomerAMG preconditioner. However, I am noticing that it is still solving column by column (this is stated explicitly in the info dump attached). I looked at the code for KSPMatSolve_Private() and saw that as long as ksp->ops->matsolve is true, it should do the batched solve, though I'm not sure where that gets set. > > I am using the options -pc_type hypre -pc_hypre_type boomeramg when running the code. > > Can you please help me with this? > > Thanks, > Sreeram > > > On Thu, Dec 7, 2023 at 4:04?PM Mark Adams > wrote: >> N.B., AMGX interface is a bit experimental. 
>> Mark >> >> On Thu, Dec 7, 2023 at 4:11?PM Sreeram R Venkat > wrote: >>> Oh, in that case I will try out BoomerAMG. Getting AMGX to build correctly was also tricky so hopefully the HYPRE build will be easier. >>> >>> Thanks, >>> Sreeram >>> >>> On Thu, Dec 7, 2023, 3:03 PM Pierre Jolivet > wrote: >>>> >>>> >>>>> On 7 Dec 2023, at 9:37?PM, Sreeram R Venkat > wrote: >>>>> >>>>> Thank you Barry and Pierre; I will proceed with the first option. >>>>> >>>>> I want to use the AMGX preconditioner for the KSP. I will try it out and see how it performs. >>>> >>>> Just FYI, AMGX does not handle systems with multiple RHS, and thus has no PCMatApply() implementation. >>>> BoomerAMG does, and there is a PCMatApply_HYPRE_BoomerAMG() implementation. >>>> But let us know if you need assistance figuring things out. >>>> >>>> Thanks, >>>> Pierre >>>> >>>>> Thanks, >>>>> Sreeram >>>>> >>>>> On Thu, Dec 7, 2023 at 2:02?PM Pierre Jolivet > wrote: >>>>>> To expand on Barry?s answer, we have observed repeatedly that MatMatMult with MatAIJ performs better than MatMult with MatMAIJ, you can reproduce this on your own with https://petsc.org/release/src/mat/tests/ex237.c.html. >>>>>> Also, I?m guessing you are using some sort of preconditioner within your KSP. >>>>>> Not all are ?KSPMatSolve-ready?, i.e., they may treat blocks of right-hand sides column by column, which is very inefficient. >>>>>> You could run your code with -info dump and send us dump.0 to see what needs to be done on our end to make things more efficient, should you not be satisfied with the current performance of the code. >>>>>> >>>>>> Thanks, >>>>>> Pierre >>>>>> >>>>>>> On 7 Dec 2023, at 8:34?PM, Barry Smith > wrote: >>>>>>> >>>>>>> >>>>>>> >>>>>>>> On Dec 7, 2023, at 1:17?PM, Sreeram R Venkat > wrote: >>>>>>>> >>>>>>>> I have 2 sequential matrices M and R (both MATSEQAIJCUSPARSE of size n x n) and a vector v of size n*m. v = [v_1 , v_2 ,... , v_m] where v_i has size n. The data for v can be stored either in column-major or row-major order. Now, I want to do 2 types of operations: >>>>>>>> >>>>>>>> 1. Matvecs of the form M*v_i = w_i, for i = 1..m. >>>>>>>> 2. KSPSolves of the form R*x_i = v_i, for i = 1..m. >>>>>>>> >>>>>>>> From what I have read on the documentation, I can think of 2 approaches. >>>>>>>> >>>>>>>> 1. Get the pointer to the data in v (column-major) and use it to create a dense matrix V. Then do a MatMatMult with M*V = W, and take the data pointer of W to create the vector w. For KSPSolves, use KSPMatSolve with R and V. >>>>>>>> >>>>>>>> 2. Create a MATMAIJ using M/R and use that for matvecs directly with the vector v. I don't know if KSPSolve with the MATMAIJ will know that it is a multiple RHS system and act accordingly. >>>>>>>> >>>>>>>> Which would be the more efficient option? >>>>>>> >>>>>>> Use 1. >>>>>>>> >>>>>>>> As a side-note, I am also wondering if there is a way to use row-major storage of the vector v. >>>>>>> >>>>>>> No >>>>>>> >>>>>>>> The reason is that this could allow for more coalesced memory access when doing matvecs. >>>>>>> >>>>>>> PETSc matrix-vector products use BLAS GMEV matrix-vector products for the computation so in theory they should already be well-optimized >>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Sreeram >>>>>> >>>> > -------------- next part -------------- An HTML attachment was scrubbed... 
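To make the advice above concrete, here is a hedged sketch of approach 1 combined with KSPHPDDM: the m right-hand sides are wrapped in a dense matrix and solved in one KSPMatSolve() call. It assumes PETSc was configured with --download-hpddm, that R is the assembled operator with n local rows, and that vdata points to the existing column-major n-by-m right-hand-side storage; all of these names are placeholders.

  Mat B, X;
  KSP ksp;
  PetscCall(MatCreateDense(PETSC_COMM_WORLD, n, PETSC_DECIDE, PETSC_DETERMINE, m, vdata, &B)); /* wraps existing storage, no copy */
  PetscCall(MatCreateDense(PETSC_COMM_WORLD, n, PETSC_DECIDE, PETSC_DETERMINE, m, NULL, &X));  /* solution block */
  PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
  PetscCall(KSPSetOperators(ksp, R, R));
  PetscCall(KSPSetType(ksp, KSPHPDDM));
  PetscCall(KSPHPDDMSetType(ksp, KSP_HPDDM_TYPE_CG)); /* treats all m columns in a single block solve */
  PetscCall(KSPSetFromOptions(ksp));                  /* e.g. -pc_type hypre -pc_hypre_type boomeramg */
  PetscCall(KSPMatSolve(ksp, B, X));

Running with -info dump, as discussed earlier in this thread, shows whether the columns are actually treated in one go or one by one.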
URL: From pierre at joliv.et Thu Dec 14 00:45:38 2023 From: pierre at joliv.et (Pierre Jolivet) Date: Thu, 14 Dec 2023 07:45:38 +0100 Subject: [petsc-users] Some question about compiling c++ program including PETSc using cmake In-Reply-To: <771d0fcf.3be16.18c6650324c.Coremail.2111191@tongji.edu.cn> References: <771d0fcf.3be16.18c6650324c.Coremail.2111191@tongji.edu.cn> Message-ID: > On 14 Dec 2023, at 4:13?AM, 2111191--- via petsc-users wrote: > > Dear SLEPc Developers, > > I a am student from Tongji University. Recently I am trying to write a c++ program for matrix solving, which requires importing the PETSc library that you have developed. However a lot of errors occur in the cpp file when I use #include directly. I also try to use extern "C" but it gives me the error in the picture below. Is there a good way to use the PETSc library in a c++ program? (I compiled using cmake and my compiler is g++ (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44)). This compiler (gcc 4.8.5) is known to not be C++11 compliant, but you are using the -std=c++11 flag. Furthermore, since version 3.18 (or maybe slightly later), PETSc requires a C++11-compliant compiler if using C++. Could you switch to a newer compiler, or try to reconfigure? Also, you should not put all the include inside an extern { }. In any case, you?ll need to send the compilation error log and configure.log to petsc-maint at mcs.anl.gov if you want further help, as we won?t be able to give a better diagnosis with just the currently provided information. Thanks, Pierre > My cmakelists.txt is: > > cmake_minimum_required(VERSION 3.1.0) > > set(CMAKE_INSTALL_RPATH_USE_LINK_PATH TRUE) > > set(PETSC $ENV{PETSC_DIR}/$ENV{PETSC_ARCH}) > set(SLEPC $ENV{SLEPC_DIR}/$ENV{PETSC_ARCH}) > set(ENV{PKG_CONFIG_PATH} ${PETSC}/lib/pkgconfig:${SLEPC}/lib/pkgconfig) > > set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++11") > set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -std=c99") > > project(test) > > add_executable(${PROJECT_NAME} eigen_test2.cpp) > find_package(PkgConfig REQUIRED) > > pkg_search_module(PETSc REQUIRED IMPORTED_TARGET PETSc) > target_link_libraries(${PROJECT_NAME} PkgConfig::PETSc) > > The testing code is:eigen_test2.cpp > extern "C"{ > //#include > #include > #include > #include > #include > } > > int main(int argc,char **argv) > { > return 0; > } > > > > Best regards > > Weijie Xu -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Dec 14 06:52:58 2023 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 14 Dec 2023 07:52:58 -0500 Subject: [petsc-users] =?utf-8?b?5o2V6I63?= In-Reply-To: <376cb39c.11543.18c6305b919.Coremail.2111191@tongji.edu.cn> References: <376cb39c.11543.18c6305b919.Coremail.2111191@tongji.edu.cn> Message-ID: On Thu, Dec 14, 2023 at 1:27?AM 2111191--- via petsc-users < petsc-users at mcs.anl.gov> wrote: > Dear SLEPc Developers, > > I a am student from Tongji University. Recently I am trying to write a c++ > program for matrix solving, which requires importing the PETSc library that > you have developed. However a lot of errors occur in the cpp file when I > use #include directly. I also try to use extern "C" but it > gives me the error in the picture below. Is there a good way to use the > PETSc library in a c++ program? (I compiled using cmake and my compiler is > g++ (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44)). 
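Concretely, the test file could be reduced to something like the sketch below once the extern "C" wrapper is removed (the PETSc/SLEPc headers are meant to be included directly from C++ and handle linkage themselves). The header names here are guesses, since the archive stripped the originals; include whichever PETSc/SLEPc interfaces are actually needed, and note that a C++11-capable compiler is still required, as pointed out above.

  // eigen_test2.cpp -- include the headers directly, without extern "C"
  #include <petscksp.h>
  #include <slepceps.h>

  int main(int argc, char **argv)
  {
    PetscCall(SlepcInitialize(&argc, &argv, NULL, NULL)); // also initializes PETSc
    /* ... build matrices and solve here ... */
    PetscCall(SlepcFinalize());
    return 0;
  }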
> > My cmakelists.txt is: > > cmake_minimum_required(VERSION 3.1.0) > > set(CMAKE_INSTALL_RPATH_USE_LINK_PATH TRUE) > > set(PETSC $ENV{PETSC_DIR}/$ENV{PETSC_ARCH}) > set(SLEPC $ENV{SLEPC_DIR}/$ENV{PETSC_ARCH}) > set(ENV{PKG_CONFIG_PATH} ${PETSC}/lib/pkgconfig:${SLEPC}/lib/pkgconfig) > > set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++11") > set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -std=c99") > > project(test) > > add_executable(${PROJECT_NAME} eigen_test2.cpp) > find_package(PkgConfig REQUIRED) > > pkg_search_module(PETSc REQUIRED IMPORTED_TARGET PETSc) > target_link_libraries(${PROJECT_NAME} PkgConfig::PETSc) > > The testing code is:eigen_test2.cpp > First, get rid of the "extern C" in front of the headers. Thanks, Matt > extern "C"{ > //#include > #include > #include > #include > #include > } > > int main(int argc,char **argv) > { > return 0; > } > > > > Best regards > > Weijie Xu > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From Pierre.LEDAC at cea.fr Thu Dec 14 09:15:22 2023 From: Pierre.LEDAC at cea.fr (LEDAC Pierre) Date: Thu, 14 Dec 2023 15:15:22 +0000 Subject: [petsc-users] PETSc 3.14 to PETSc 3.20: Different (slower) convergence for classical AMG (sequential and especially in parallel) In-Reply-To: References: <4c9f02898f324fcd8be1fe5dcc9f0416@cea.fr>, Message-ID: <6fee51e3277d470f9f2d4d51f0bd453a@cea.fr> Hello Mark, Thanks for your answer. Indeed, I didn't see the information that classical AMG was not really supported: -solver2_pc_gamg_type : Type of AMG method (only 'agg' supported and useful) (one of) classical geo agg (PCGAMGSetType) We switched very recently from GAMG("agg") to GAMG("classical") for a weak scaling test up to 32000 cores, where we saw very good scalability with GAMG("classical") compared to GAMG("agg"). But it was with PETSc 3.14... So today, we are going to upgrade to 3.20 and focus on GAMG("agg") or Hypre Classical AMG. We will see how it compares. May I ask you what is your point of view of the current state of the GPU versions of GAMG("agg") versus Hypre AMG Classical ? In fact, the reason of our move from 3.14 to 3.20 is to take advantage of all the progress in PETSc and Hypre on accelerated solvers/preconditioners during the last 2 years. Greatly appreciate your help, Pierre LEDAC Commissariat ? l??nergie atomique et aux ?nergies alternatives Centre de SACLAY DES/ISAS/DM2S/SGLS/LCAN B?timent 451 ? point courrier n?43 F-91191 Gif-sur-Yvette +33 1 69 08 04 03 +33 6 83 42 05 79 ________________________________ De : Mark Adams Envoy? : mercredi 13 d?cembre 2023 20:54:17 ? : LEDAC Pierre Cc : petsc-users at mcs.anl.gov; BRUNETON Adrien Objet : Re: [petsc-users] PETSc 3.14 to PETSc 3.20: Different (slower) convergence for classical AMG (sequential and especially in parallel) Hi Pierre, Sorry I missed this post and your issues were brought to my attention today. First, the classic version is not supported well. The postdoc that wrote the code is long gone and I don't know the code at all. It is really a reference implementation that someone could build on and is not meant for production. In 10 years you are the first user that has connected us. The hypre package is a very good AMG solver and it uses classical AMG as the main solver. 
I wrote GAMG ("agg") which is a smoothed aggregation AMG solver and is very different from classical. I would suggest you move to hypre or '-pc_gamg_type agg'. The coarsening was developed in this time frame and there was a lot of churn as a new strategy for aggressive coarsening did not work well for some users and I had to add the old method in and then made it the default (again). This change missed v3.20, but you can get the old aggressive strategy with '-pc_gamg_aggressive_square_graph'. Check with -options_left to check that it is being used. As far as your output (nice formatting, thank you), the coarse grid is smaller in the new code. rows=41, cols=41 | rows=30, cols=30 "square graph" should fix this. You can also try not using aggressive coarsening with: You could try '-pc_gamg_aggressive_coarsening 0' Let me know how it goes and let's try to get you into a more sustainable state ... I really try not to change this code but sometimes need to. Thanks, Mark On Mon, Oct 9, 2023 at 10:43?AM LEDAC Pierre > wrote: Hello all, I am struggling to find the same convergence in iterations when using classical algebric multigrid in my code with PETSc 3.20 compared to PETSc 3.14. I am using in order to solve a Poisson system: -ksp_type cg -pc_type gamg -pc_gamg_type classical I read the different releases notes between 3.15 and 3.20: https://petsc.org/release/changes/317 https://petsc.org/main/manualpages/PC/PCGAMGSetThreshold/ And have a look at the archive mailing list (especially this one: https://www.mail-archive.com/petsc-users at mcs.anl.gov/msg46688.html) so I added some other options to try to have the same behaviour than PETSc 3.14: -ksp_type cg -pc_type gamg -pc_gamg_type classical -mg_levels_pc_type sor -pc_gamg_threshold 0. It improves the convergence but there still a different convergence though (26 vs 18 iterations). On another of my test case, the number of levels is different (e.g. 6 vs 4) also, and here it is the same, but with a different coarsening according to the output from the -ksp_view option The main point is that the convergence dramatically degrades in parallel on a third test case, so I can't upgrade to PETSc 3.20 for now unhappily. I send you the partial report (petsc_314_vs_petsc_320.ksp_view) with -ksp_view (left PETSc 3.14, right PETSc 3.20) and the configure/command line options used (in petsc_XXX_petsc.TU files). Could my issue related to the following 3.18 change ? I have not tried the first one. * Remove PCGAMGSetSymGraph() and -pc_gamg_sym_graph. The user should now indicate symmetry and structural symmetry using MatSetOption() and GAMG will symmetrize the graph if a symmetric options is not set * Change -pc_gamg_reuse_interpolation default from false to true. Any advice would be greatly appreciated, Pierre LEDAC Commissariat ? l??nergie atomique et aux ?nergies alternatives Centre de SACLAY DES/ISAS/DM2S/SGLS/LCAN B?timent 451 ? point courrier n?43 F-91191 Gif-sur-Yvette +33 1 69 08 04 03 +33 6 83 42 05 79 -------------- next part -------------- An HTML attachment was scrubbed... 
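For reference, a small sketch of selecting CG plus GAMG's smoothed-aggregation ("agg") flavor in code, while leaving the coarsening knobs discussed above (-pc_gamg_aggressive_square_graph, -pc_gamg_aggressive_coarsening 0, -pc_gamg_threshold ...) to the options database. A, b and x stand for the already assembled Poisson operator and vectors.

  KSP ksp;
  PC  pc;
  PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
  PetscCall(KSPSetOperators(ksp, A, A));
  PetscCall(KSPSetType(ksp, KSPCG));
  PetscCall(KSPGetPC(ksp, &pc));
  PetscCall(PCSetType(pc, PCGAMG));
  PetscCall(PCGAMGSetType(pc, PCGAMGAGG)); /* "agg"; PCGAMGCLASSICAL is the unsupported reference path */
  PetscCall(KSPSetFromOptions(ksp));       /* picks up -pc_gamg_* and -mg_levels_* overrides */
  PetscCall(KSPSolve(ksp, b, x));

Swapping PCGAMG for PCHYPRE (with -pc_hypre_type boomeramg) gives the classical-AMG alternative mentioned above without further code changes.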
URL: From mfadams at lbl.gov Thu Dec 14 10:50:59 2023 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 14 Dec 2023 11:50:59 -0500 Subject: [petsc-users] PETSc 3.14 to PETSc 3.20: Different (slower) convergence for classical AMG (sequential and especially in parallel) In-Reply-To: <6fee51e3277d470f9f2d4d51f0bd453a@cea.fr> References: <4c9f02898f324fcd8be1fe5dcc9f0416@cea.fr> <6fee51e3277d470f9f2d4d51f0bd453a@cea.fr> Message-ID: On Thu, Dec 14, 2023 at 10:15?AM LEDAC Pierre wrote: > Hello Mark, > > > Thanks for your answer. Indeed, I didn't see the information that > classical AMG was not really supported: > > > -solver2_pc_gamg_type : Type of AMG method (only > 'agg' supported and useful) (one of) classical geo agg (PCGAMGSetType) > > We switched very recently from GAMG("agg") to GAMG("classical") for a weak > scaling test up to 32000 cores, where we saw very good scalability with GAMG("classical") > compared to GAMG("agg"). But it was with PETSc 3.14... > AMG is sensitive to parameters. What PDE and discretization are you solving? For example, I recently optimized the Q2 Laplacian benchmark and found good scaling with -pc_gamg_threshold 0.04 -pc_gamg_threshold_scale .25 Hypre scaled well without optimization (see below), > > So today, we are going to upgrade to 3.20 and focus on GAMG("agg") or > Hypre Classical AMG. We will see how it compares. > You might want to update to v3.20.2 That has some of my recent GAMG updates. > May I ask you what is your point of view of the current state of the GPU > versions of GAMG("agg") versus Hypre AMG Classical ? > Hypre is well supported with several developers over decades, whereas I really just maintain GAMG + I add some things like anisotropy support recently/currently. But, I build on the PETSc sparse linear algebra that is well supported in PETSc and hypre, and we have several good people doing that. TL;DR Both run the solve and matrix setup phase on the GPU. Hypre puts the graph setup phase on the GPU, but this phase is 1) not well suited to GPUs and 2) is amortized in most applications (just done once). GAMG is easier to deal with because it is built-in and the interface to hypre can be fragile with respect to GPUs (eg, if you use '-mat_type hypre') in my experience. If performance is critical and you have the time to put into it, hypre will be a good option, and GAMG can be a backup. > > In fact, the reason of our move from 3.14 to 3.20 is to take advantage of > all the progress in PETSc and Hypre on accelerated solvers/preconditioners > during the last 2 years. > > And I can give you advice on GAMG parameters, if you send me the output with '-info :pc' (and 'grep GAMG'). Thanks, Mark > Greatly appreciate your help, > > Pierre LEDAC > Commissariat ? l??nergie atomique et aux ?nergies alternatives > Centre de SACLAY > DES/ISAS/DM2S/SGLS/LCAN > B?timent 451 ? point courrier n?43 > F-91191 Gif-sur-Yvette > +33 1 69 08 04 03 > +33 6 83 42 05 79 > ------------------------------ > *De :* Mark Adams > *Envoy? :* mercredi 13 d?cembre 2023 20:54:17 > *? :* LEDAC Pierre > *Cc :* petsc-users at mcs.anl.gov; BRUNETON Adrien > *Objet :* Re: [petsc-users] PETSc 3.14 to PETSc 3.20: Different (slower) > convergence for classical AMG (sequential and especially in parallel) > > Hi Pierre, > > Sorry I missed this post and your issues were brought to my attention > today. > > First, the classic version is not supported well. The postdoc that wrote > the code is long gone and I don't know the code at all. 
> It is really a reference implementation that someone could build on and is > not meant for production. > In 10 years you are the first user that has connected us. > > The hypre package is a very good AMG solver and it uses classical AMG as > the main solver. > I wrote GAMG ("agg") which is a smoothed aggregation AMG solver and is > very different from classical. > I would suggest you move to hypre or '-pc_gamg_type agg'. > > The coarsening was developed in this time frame and there was a lot of > churn as a new strategy for aggressive coarsening did not work well for > some users and I had to add the old method in and then made it the default > (again). > This change missed v3.20, but you can get the old aggressive strategy with > '-pc_gamg_aggressive_square_graph'. > Check with -options_left to check that it is being used. > > As far as your output (nice formatting, thank you), the coarse grid is > smaller in the new code. > rows=41, cols=41 | rows=30, cols=30 > "square graph" should fix this. > > You can also try not using aggressive coarsening with: > You could try '-pc_gamg_aggressive_coarsening 0' > > Let me know how it goes and let's try to get you into a more sustainable > state ... I really try not to change this code but sometimes need to. > > Thanks, > Mark > > > > > > On Mon, Oct 9, 2023 at 10:43?AM LEDAC Pierre wrote: > >> Hello all, >> >> >> I am struggling to find the same convergence in iterations when using >> classical algebric multigrid in my code with PETSc 3.20 compared to PETSc >> 3.14. >> >> >> I am using in order to solve a Poisson system: >> >> *-ksp_type cg -pc_type gamg -pc_gamg_type classical* >> >> >> I read the different releases notes between 3.15 and 3.20: >> >> https://petsc.org/release/changes/317 >> >> https://petsc.org/main/manualpages/PC/PCGAMGSetThreshold/ >> >> >> And have a look at the archive mailing list (especially this one: >> https://www.mail-archive.com/petsc-users at mcs.anl.gov/msg46688.html) >> >> so I added some other options to try to have the same behaviour than >> PETSc 3.14: >> >> >> *-ksp_type cg -pc_type gamg -pc_gamg_type classical *-mg_levels_pc_type >> sor -pc_gamg_threshold 0. >> >> >> It improves the convergence but there still a different convergence >> though (26 vs 18 iterations). >> >> On another of my test case, the number of levels is different (e.g. 6 vs >> 4) also, and here it is the same, but with a different coarsening according >> to the output from the -ksp_view option >> >> The main point is that the convergence dramatically degrades in parallel >> on a third test case, so I can't upgrade to PETSc 3.20 for now unhappily. >> >> I send you the partial report (petsc_314_vs_petsc_320.ksp_view) with >> -ksp_view (left PETSc 3.14, right PETSc 3.20) and the configure/command >> line options used (in petsc_XXX_petsc.TU files). >> >> >> Could my issue related to the following 3.18 change ? I have not tried >> the first one. >> >> >> - >> >> Remove PCGAMGSetSymGraph() and -pc_gamg_sym_graph. The user should >> now indicate symmetry and structural symmetry using MatSetOption >> () and GAMG >> will symmetrize the graph if a symmetric options is not set >> - >> >> Change -pc_gamg_reuse_interpolation default from false to true. >> >> >> Any advice would be greatly appreciated, >> >> >> Pierre LEDAC >> Commissariat ? l??nergie atomique et aux ?nergies alternatives >> Centre de SACLAY >> DES/ISAS/DM2S/SGLS/LCAN >> B?timent 451 ? 
point courrier n?43 >> F-91191 Gif-sur-Yvette >> +33 1 69 08 04 03 >> +33 6 83 42 05 79 >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From srvenkat at utexas.edu Thu Dec 14 13:02:04 2023 From: srvenkat at utexas.edu (Sreeram R Venkat) Date: Thu, 14 Dec 2023 13:02:04 -0600 Subject: [petsc-users] Matvecs and KSPSolves with multiple vectors In-Reply-To: <9F227CDE-815C-4446-A424-0BBDA38B1E79@joliv.et> References: <855B7267-E3F9-43CC-9EBE-96276E9A8E42@petsc.dev> <9F227CDE-815C-4446-A424-0BBDA38B1E79@joliv.et> Message-ID: Hello Pierre, Thank you for your reply. I tried out the HPDDM CG as you said, and it seems to be doing the batched solves, but the KSP is not converging due to a NaN or Inf being generated. I also noticed there are a lot of host-to-device and device-to-host copies of the matrices (the non-batched KSP solve did not have any memcopies). I have attached dump.0 again. Could you please take a look? Thanks, Sreeram On Thu, Dec 14, 2023 at 12:42?AM Pierre Jolivet wrote: > Hello Sreeram, > KSPCG (PETSc implementation of CG) does not handle solves with multiple > columns at once. > There is only a single native PETSc KSP implementation which handles > solves with multiple columns at once: KSPPREONLY. > If you use --download-hpddm, you can use a CG (or GMRES, or more advanced > methods) implementation which handles solves with multiple columns at once > (via -ksp_type hpddm -ksp_hpddm_type cg or KSPSetType(ksp, KSPHPDDM); > KSPHPDDMSetType(ksp, KSP_HPDDM_TYPE_CG);). > I?m the main author of HPDDM, there is preliminary support for device > matrices, but if it?s not working as intended/not faster than column by > column, I?d be happy to have a deeper look (maybe in private), because most > (if not all) of my users interested in (pseudo-)block Krylov solvers (i.e., > solvers that treat right-hand sides in a single go) are using plain host > matrices. > > Thanks, > Pierre > > PS: you could have a look at > https://www.sciencedirect.com/science/article/abs/pii/S0898122121000055 to > understand the philosophy behind block iterative methods in PETSc (and in > HPDDM), src/mat/tests/ex237.c, the benchmark I mentioned earlier, was > developed in the context of this paper to produce Figures 2-3. Note that > this paper is now slightly outdated, since then, PCHYPRE and PCMG (among > others) have been made ?PCMatApply()-ready?. > > On 13 Dec 2023, at 11:05?PM, Sreeram R Venkat wrote: > > Hello Pierre, > > I am trying out the KSPMatSolve with the BoomerAMG preconditioner. > However, I am noticing that it is still solving column by column (this is > stated explicitly in the info dump attached). I looked at the code for > KSPMatSolve_Private() and saw that as long as ksp->ops->matsolve is true, > it should do the batched solve, though I'm not sure where that gets set. > > I am using the options -pc_type hypre -pc_hypre_type boomeramg when > running the code. > > Can you please help me with this? > > Thanks, > Sreeram > > > On Thu, Dec 7, 2023 at 4:04?PM Mark Adams wrote: > >> N.B., AMGX interface is a bit experimental. >> Mark >> >> On Thu, Dec 7, 2023 at 4:11?PM Sreeram R Venkat >> wrote: >> >>> Oh, in that case I will try out BoomerAMG. Getting AMGX to build >>> correctly was also tricky so hopefully the HYPRE build will be easier. 
>>> >>> Thanks, >>> Sreeram >>> >>> On Thu, Dec 7, 2023, 3:03 PM Pierre Jolivet wrote: >>> >>>> >>>> >>>> On 7 Dec 2023, at 9:37?PM, Sreeram R Venkat >>>> wrote: >>>> >>>> Thank you Barry and Pierre; I will proceed with the first option. >>>> >>>> I want to use the AMGX preconditioner for the KSP. I will try it out >>>> and see how it performs. >>>> >>>> >>>> Just FYI, AMGX does not handle systems with multiple RHS, and thus has >>>> no PCMatApply() implementation. >>>> BoomerAMG does, and there is a PCMatApply_HYPRE_BoomerAMG() >>>> implementation. >>>> But let us know if you need assistance figuring things out. >>>> >>>> Thanks, >>>> Pierre >>>> >>>> Thanks, >>>> Sreeram >>>> >>>> On Thu, Dec 7, 2023 at 2:02?PM Pierre Jolivet wrote: >>>> >>>>> To expand on Barry?s answer, we have observed repeatedly that >>>>> MatMatMult with MatAIJ performs better than MatMult with MatMAIJ, you can >>>>> reproduce this on your own with >>>>> https://petsc.org/release/src/mat/tests/ex237.c.html. >>>>> Also, I?m guessing you are using some sort of preconditioner within >>>>> your KSP. >>>>> Not all are ?KSPMatSolve-ready?, i.e., they may treat blocks of >>>>> right-hand sides column by column, which is very inefficient. >>>>> You could run your code with -info dump and send us dump.0 to see what >>>>> needs to be done on our end to make things more efficient, should you not >>>>> be satisfied with the current performance of the code. >>>>> >>>>> Thanks, >>>>> Pierre >>>>> >>>>> On 7 Dec 2023, at 8:34?PM, Barry Smith wrote: >>>>> >>>>> >>>>> >>>>> On Dec 7, 2023, at 1:17?PM, Sreeram R Venkat >>>>> wrote: >>>>> >>>>> I have 2 sequential matrices M and R (both MATSEQAIJCUSPARSE of size n >>>>> x n) and a vector v of size n*m. v = [v_1 , v_2 ,... , v_m] where v_i has >>>>> size n. The data for v can be stored either in column-major or row-major >>>>> order. Now, I want to do 2 types of operations: >>>>> >>>>> 1. Matvecs of the form M*v_i = w_i, for i = 1..m. >>>>> 2. KSPSolves of the form R*x_i = v_i, for i = 1..m. >>>>> >>>>> From what I have read on the documentation, I can think of 2 >>>>> approaches. >>>>> >>>>> 1. Get the pointer to the data in v (column-major) and use it to >>>>> create a dense matrix V. Then do a MatMatMult with M*V = W, and take the >>>>> data pointer of W to create the vector w. For KSPSolves, use KSPMatSolve >>>>> with R and V. >>>>> >>>>> 2. Create a MATMAIJ using M/R and use that for matvecs directly with >>>>> the vector v. I don't know if KSPSolve with the MATMAIJ will know that it >>>>> is a multiple RHS system and act accordingly. >>>>> >>>>> Which would be the more efficient option? >>>>> >>>>> >>>>> Use 1. >>>>> >>>>> >>>>> As a side-note, I am also wondering if there is a way to use row-major >>>>> storage of the vector v. >>>>> >>>>> >>>>> No >>>>> >>>>> The reason is that this could allow for more coalesced memory access >>>>> when doing matvecs. >>>>> >>>>> >>>>> PETSc matrix-vector products use BLAS GMEV matrix-vector products >>>>> for the computation so in theory they should already be well-optimized >>>>> >>>>> >>>>> Thanks, >>>>> Sreeram >>>>> >>>>> >>>>> >>>> > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
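One small, hedged addition that can help pin down the NaN/Inf reported above: query the converged reason right after the blocked solve (ksp, B and X stand for the objects already used in the code being discussed).

  KSPConvergedReason reason;
  PetscCall(KSPMatSolve(ksp, B, X));
  PetscCall(KSPGetConvergedReason(ksp, &reason));
  if (reason < 0) PetscCall(PetscPrintf(PETSC_COMM_WORLD, "Solve diverged: %s\n", KSPConvergedReasons[reason]));
  /* KSP_DIVERGED_NANORINF here typically means the operator or the
     preconditioner produced invalid values for the blocked right-hand sides */

The same information is printed automatically when running with -ksp_converged_reason.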
Name: dump.0 Type: application/octet-stream Size: 1565937 bytes Desc: not available URL: From facklerpw at ornl.gov Thu Dec 14 13:05:33 2023 From: facklerpw at ornl.gov (Fackler, Philip) Date: Thu, 14 Dec 2023 19:05:33 +0000 Subject: [petsc-users] Call to DMSetMatrixPreallocateSkip not changing allocation behavior Message-ID: I'm using the following sequence of functions related to the Jacobian matrix: DMDACreate1d(..., &da); DMSetFromOptions(da); DMSetUp(da); DMSetMatType(da, MATAIJKOKKOS); DMSetMatrixPreallocateSkip(da, PETSC_TRUE); Mat J; DMCreateMatrix(da, &J); MatSetPreallocationCOO(J, ...); I recently added the call to DMSetMatrixPreallocateSkip, hoping the allocation would be delayed to MatSetPreallocationCOO, and that it would require less memory. The documentation says that the data structures will not be preallocated. The following data from heaptrack shows that the allocation is still happening in the call to DMCreateMatrix. [cid:bda9ef12-a46f-47b2-9b9b-a4b2808b6b13] Can someone help me understand this? Thanks, Philip Fackler Research Software Engineer, Application Engineering Group Advanced Computing Systems Research Section Computer Science and Mathematics Division Oak Ridge National Laboratory -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 54775 bytes Desc: image.png URL: From pierre at joliv.et Thu Dec 14 13:12:28 2023 From: pierre at joliv.et (Pierre Jolivet) Date: Thu, 14 Dec 2023 20:12:28 +0100 Subject: [petsc-users] Matvecs and KSPSolves with multiple vectors In-Reply-To: References: <855B7267-E3F9-43CC-9EBE-96276E9A8E42@petsc.dev> <9F227CDE-815C-4446-A424-0BBDA38B1E79@joliv.et> Message-ID: > On 14 Dec 2023, at 8:02?PM, Sreeram R Venkat wrote: > > Hello Pierre, > > Thank you for your reply. I tried out the HPDDM CG as you said, and it seems to be doing the batched solves, but the KSP is not converging due to a NaN or Inf being generated. I also noticed there are a lot of host-to-device and device-to-host copies of the matrices (the non-batched KSP solve did not have any memcopies). I have attached dump.0 again. Could you please take a look? Yes, but you?d need to send me something I can run with your set of options (if you are more confident doing this in private, you can remove the list from c/c). Not all BoomerAMG smoothers handle blocks of right-hand sides, and there is not much error checking, so instead of erroring out, this may be the reason why you are getting garbage. Thanks, Pierre > Thanks, > Sreeram > > On Thu, Dec 14, 2023 at 12:42?AM Pierre Jolivet > wrote: >> Hello Sreeram, >> KSPCG (PETSc implementation of CG) does not handle solves with multiple columns at once. >> There is only a single native PETSc KSP implementation which handles solves with multiple columns at once: KSPPREONLY. >> If you use --download-hpddm, you can use a CG (or GMRES, or more advanced methods) implementation which handles solves with multiple columns at once (via -ksp_type hpddm -ksp_hpddm_type cg or KSPSetType(ksp, KSPHPDDM); KSPHPDDMSetType(ksp, KSP_HPDDM_TYPE_CG);). 
>> I?m the main author of HPDDM, there is preliminary support for device matrices, but if it?s not working as intended/not faster than column by column, I?d be happy to have a deeper look (maybe in private), because most (if not all) of my users interested in (pseudo-)block Krylov solvers (i.e., solvers that treat right-hand sides in a single go) are using plain host matrices. >> >> Thanks, >> Pierre >> >> PS: you could have a look at https://www.sciencedirect.com/science/article/abs/pii/S0898122121000055 to understand the philosophy behind block iterative methods in PETSc (and in HPDDM), src/mat/tests/ex237.c, the benchmark I mentioned earlier, was developed in the context of this paper to produce Figures 2-3. Note that this paper is now slightly outdated, since then, PCHYPRE and PCMG (among others) have been made ?PCMatApply()-ready?. >> >>> On 13 Dec 2023, at 11:05?PM, Sreeram R Venkat > wrote: >>> >>> Hello Pierre, >>> >>> I am trying out the KSPMatSolve with the BoomerAMG preconditioner. However, I am noticing that it is still solving column by column (this is stated explicitly in the info dump attached). I looked at the code for KSPMatSolve_Private() and saw that as long as ksp->ops->matsolve is true, it should do the batched solve, though I'm not sure where that gets set. >>> >>> I am using the options -pc_type hypre -pc_hypre_type boomeramg when running the code. >>> >>> Can you please help me with this? >>> >>> Thanks, >>> Sreeram >>> >>> >>> On Thu, Dec 7, 2023 at 4:04?PM Mark Adams > wrote: >>>> N.B., AMGX interface is a bit experimental. >>>> Mark >>>> >>>> On Thu, Dec 7, 2023 at 4:11?PM Sreeram R Venkat > wrote: >>>>> Oh, in that case I will try out BoomerAMG. Getting AMGX to build correctly was also tricky so hopefully the HYPRE build will be easier. >>>>> >>>>> Thanks, >>>>> Sreeram >>>>> >>>>> On Thu, Dec 7, 2023, 3:03 PM Pierre Jolivet > wrote: >>>>>> >>>>>> >>>>>>> On 7 Dec 2023, at 9:37?PM, Sreeram R Venkat > wrote: >>>>>>> >>>>>>> Thank you Barry and Pierre; I will proceed with the first option. >>>>>>> >>>>>>> I want to use the AMGX preconditioner for the KSP. I will try it out and see how it performs. >>>>>> >>>>>> Just FYI, AMGX does not handle systems with multiple RHS, and thus has no PCMatApply() implementation. >>>>>> BoomerAMG does, and there is a PCMatApply_HYPRE_BoomerAMG() implementation. >>>>>> But let us know if you need assistance figuring things out. >>>>>> >>>>>> Thanks, >>>>>> Pierre >>>>>> >>>>>>> Thanks, >>>>>>> Sreeram >>>>>>> >>>>>>> On Thu, Dec 7, 2023 at 2:02?PM Pierre Jolivet > wrote: >>>>>>>> To expand on Barry?s answer, we have observed repeatedly that MatMatMult with MatAIJ performs better than MatMult with MatMAIJ, you can reproduce this on your own with https://petsc.org/release/src/mat/tests/ex237.c.html. >>>>>>>> Also, I?m guessing you are using some sort of preconditioner within your KSP. >>>>>>>> Not all are ?KSPMatSolve-ready?, i.e., they may treat blocks of right-hand sides column by column, which is very inefficient. >>>>>>>> You could run your code with -info dump and send us dump.0 to see what needs to be done on our end to make things more efficient, should you not be satisfied with the current performance of the code. 
>>>>>>>> >>>>>>>> Thanks, >>>>>>>> Pierre >>>>>>>> >>>>>>>>> On 7 Dec 2023, at 8:34?PM, Barry Smith > wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>>> On Dec 7, 2023, at 1:17?PM, Sreeram R Venkat > wrote: >>>>>>>>>> >>>>>>>>>> I have 2 sequential matrices M and R (both MATSEQAIJCUSPARSE of size n x n) and a vector v of size n*m. v = [v_1 , v_2 ,... , v_m] where v_i has size n. The data for v can be stored either in column-major or row-major order. Now, I want to do 2 types of operations: >>>>>>>>>> >>>>>>>>>> 1. Matvecs of the form M*v_i = w_i, for i = 1..m. >>>>>>>>>> 2. KSPSolves of the form R*x_i = v_i, for i = 1..m. >>>>>>>>>> >>>>>>>>>> From what I have read on the documentation, I can think of 2 approaches. >>>>>>>>>> >>>>>>>>>> 1. Get the pointer to the data in v (column-major) and use it to create a dense matrix V. Then do a MatMatMult with M*V = W, and take the data pointer of W to create the vector w. For KSPSolves, use KSPMatSolve with R and V. >>>>>>>>>> >>>>>>>>>> 2. Create a MATMAIJ using M/R and use that for matvecs directly with the vector v. I don't know if KSPSolve with the MATMAIJ will know that it is a multiple RHS system and act accordingly. >>>>>>>>>> >>>>>>>>>> Which would be the more efficient option? >>>>>>>>> >>>>>>>>> Use 1. >>>>>>>>>> >>>>>>>>>> As a side-note, I am also wondering if there is a way to use row-major storage of the vector v. >>>>>>>>> >>>>>>>>> No >>>>>>>>> >>>>>>>>>> The reason is that this could allow for more coalesced memory access when doing matvecs. >>>>>>>>> >>>>>>>>> PETSc matrix-vector products use BLAS GMEV matrix-vector products for the computation so in theory they should already be well-optimized >>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Sreeram >>>>>>>> >>>>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Thu Dec 14 14:49:25 2023 From: jed at jedbrown.org (Jed Brown) Date: Thu, 14 Dec 2023 13:49:25 -0700 Subject: [petsc-users] Call to DMSetMatrixPreallocateSkip not changing allocation behavior In-Reply-To: References: Message-ID: <871qboech6.fsf@jedbrown.org> 17 GB for a 1D DMDA, wow. :-) Could you try applying this diff to make it work for DMDA (it's currently handled by DMPlex)? diff --git i/src/dm/impls/da/fdda.c w/src/dm/impls/da/fdda.c index cad4d926504..bd2a3bda635 100644 --- i/src/dm/impls/da/fdda.c +++ w/src/dm/impls/da/fdda.c @@ -675,19 +675,21 @@ PetscErrorCode DMCreateMatrix_DA(DM da, Mat *J) specialized setting routines depend only on the particular preallocation details of the matrix, not the type itself. 
*/ - PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPIAIJSetPreallocation_C", &aij)); - if (!aij) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqAIJSetPreallocation_C", &aij)); - if (!aij) { - PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPIBAIJSetPreallocation_C", &baij)); - if (!baij) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqBAIJSetPreallocation_C", &baij)); - if (!baij) { - PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPISBAIJSetPreallocation_C", &sbaij)); - if (!sbaij) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqSBAIJSetPreallocation_C", &sbaij)); - if (!sbaij) { - PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPISELLSetPreallocation_C", &sell)); - if (!sell) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqSELLSetPreallocation_C", &sell)); + if (!dm->prealloc_skip) { // Flag is likely set when user intends to use MatSetPreallocationCOO() + PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPIAIJSetPreallocation_C", &aij)); + if (!aij) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqAIJSetPreallocation_C", &aij)); + if (!aij) { + PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPIBAIJSetPreallocation_C", &baij)); + if (!baij) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqBAIJSetPreallocation_C", &baij)); + if (!baij) { + PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPISBAIJSetPreallocation_C", &sbaij)); + if (!sbaij) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqSBAIJSetPreallocation_C", &sbaij)); + if (!sbaij) { + PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPISELLSetPreallocation_C", &sell)); + if (!sell) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqSELLSetPreallocation_C", &sell)); + } + if (!sell) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatISSetPreallocation_C", &is)); } - if (!sell) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatISSetPreallocation_C", &is)); } } if (aij) { "Fackler, Philip via petsc-users" writes: > I'm using the following sequence of functions related to the Jacobian matrix: > > DMDACreate1d(..., &da); > DMSetFromOptions(da); > DMSetUp(da); > DMSetMatType(da, MATAIJKOKKOS); > DMSetMatrixPreallocateSkip(da, PETSC_TRUE); > Mat J; > DMCreateMatrix(da, &J); > MatSetPreallocationCOO(J, ...); > > I recently added the call to DMSetMatrixPreallocateSkip, hoping the allocation would be delayed to MatSetPreallocationCOO, and that it would require less memory. The documentation says that the data structures will not be preallocated. The following data from heaptrack shows that the allocation is still happening in the call to DMCreateMatrix. > > [cid:bda9ef12-a46f-47b2-9b9b-a4b2808b6b13] > > Can someone help me understand this? 
> > Thanks, > > Philip Fackler > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > Oak Ridge National Laboratory From knepley at gmail.com Thu Dec 14 15:19:01 2023 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 14 Dec 2023 16:19:01 -0500 Subject: [petsc-users] Call to DMSetMatrixPreallocateSkip not changing allocation behavior In-Reply-To: References: Message-ID: On Thu, Dec 14, 2023 at 2:06?PM Fackler, Philip via petsc-users < petsc-users at mcs.anl.gov> wrote: > I'm using the following sequence of functions related to the Jacobian > matrix: > > DMDACreate1d(..., &da); > DMSetFromOptions(da); > DMSetUp(da); > DMSetMatType(da, MATAIJKOKKOS); > DMSetMatrixPreallocateSkip(da, PETSC_TRUE); > Mat J; > DMCreateMatrix(da, &J); > MatSetPreallocationCOO(J, ...); > > I recently added the call to DMSetMatrixPreallocateSkip, hoping the > allocation would be delayed to MatSetPreallocationCOO, and that it would > require less memory. The documentation > says > that the data structures will not be preallocated. > You are completely correct. DMDA is just ignoring this flag. We will fix it. Thanks for catching this. Matt > The following data from heaptrack shows that the allocation is still > happening in the call to DMCreateMatrix. > > > Can someone help me understand this? > > Thanks, > > > *Philip Fackler * > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > *Oak Ridge National Laboratory* > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 54775 bytes Desc: not available URL: From jed at jedbrown.org Thu Dec 14 15:27:53 2023 From: jed at jedbrown.org (Jed Brown) Date: Thu, 14 Dec 2023 14:27:53 -0700 Subject: [petsc-users] Call to DMSetMatrixPreallocateSkip not changing allocation behavior In-Reply-To: <871qboech6.fsf@jedbrown.org> References: <871qboech6.fsf@jedbrown.org> Message-ID: <87wmtgcw4m.fsf@jedbrown.org> I had a one-character typo in the diff above. This MR to release should work now. https://gitlab.com/petsc/petsc/-/merge_requests/7120 Jed Brown writes: > 17 GB for a 1D DMDA, wow. :-) > > Could you try applying this diff to make it work for DMDA (it's currently handled by DMPlex)? > > diff --git i/src/dm/impls/da/fdda.c w/src/dm/impls/da/fdda.c > index cad4d926504..bd2a3bda635 100644 > --- i/src/dm/impls/da/fdda.c > +++ w/src/dm/impls/da/fdda.c > @@ -675,19 +675,21 @@ PetscErrorCode DMCreateMatrix_DA(DM da, Mat *J) > specialized setting routines depend only on the particular preallocation > details of the matrix, not the type itself. 
> */ > - PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPIAIJSetPreallocation_C", &aij)); > - if (!aij) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqAIJSetPreallocation_C", &aij)); > - if (!aij) { > - PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPIBAIJSetPreallocation_C", &baij)); > - if (!baij) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqBAIJSetPreallocation_C", &baij)); > - if (!baij) { > - PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPISBAIJSetPreallocation_C", &sbaij)); > - if (!sbaij) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqSBAIJSetPreallocation_C", &sbaij)); > - if (!sbaij) { > - PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPISELLSetPreallocation_C", &sell)); > - if (!sell) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqSELLSetPreallocation_C", &sell)); > + if (!dm->prealloc_skip) { // Flag is likely set when user intends to use MatSetPreallocationCOO() > + PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPIAIJSetPreallocation_C", &aij)); > + if (!aij) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqAIJSetPreallocation_C", &aij)); > + if (!aij) { > + PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPIBAIJSetPreallocation_C", &baij)); > + if (!baij) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqBAIJSetPreallocation_C", &baij)); > + if (!baij) { > + PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPISBAIJSetPreallocation_C", &sbaij)); > + if (!sbaij) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqSBAIJSetPreallocation_C", &sbaij)); > + if (!sbaij) { > + PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPISELLSetPreallocation_C", &sell)); > + if (!sell) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqSELLSetPreallocation_C", &sell)); > + } > + if (!sell) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatISSetPreallocation_C", &is)); > } > - if (!sell) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatISSetPreallocation_C", &is)); > } > } > if (aij) { > > > "Fackler, Philip via petsc-users" writes: > >> I'm using the following sequence of functions related to the Jacobian matrix: >> >> DMDACreate1d(..., &da); >> DMSetFromOptions(da); >> DMSetUp(da); >> DMSetMatType(da, MATAIJKOKKOS); >> DMSetMatrixPreallocateSkip(da, PETSC_TRUE); >> Mat J; >> DMCreateMatrix(da, &J); >> MatSetPreallocationCOO(J, ...); >> >> I recently added the call to DMSetMatrixPreallocateSkip, hoping the allocation would be delayed to MatSetPreallocationCOO, and that it would require less memory. The documentation says that the data structures will not be preallocated. The following data from heaptrack shows that the allocation is still happening in the call to DMCreateMatrix. >> >> [cid:bda9ef12-a46f-47b2-9b9b-a4b2808b6b13] >> >> Can someone help me understand this? >> >> Thanks, >> >> Philip Fackler >> Research Software Engineer, Application Engineering Group >> Advanced Computing Systems Research Section >> Computer Science and Mathematics Division >> Oak Ridge National Laboratory From srvenkat at utexas.edu Thu Dec 14 16:45:00 2023 From: srvenkat at utexas.edu (Sreeram R Venkat) Date: Thu, 14 Dec 2023 16:45:00 -0600 Subject: [petsc-users] Matvecs and KSPSolves with multiple vectors In-Reply-To: References: <855B7267-E3F9-43CC-9EBE-96276E9A8E42@petsc.dev> <9F227CDE-815C-4446-A424-0BBDA38B1E79@joliv.et> Message-ID: Thanks, I will try to create a minimal reproducible example. 
This may take me some time though, as I need to figure out how to extract only the relevant parts (the full program this solve is used in is getting quite complex). I'll also try out some of the BoomerAMG options to see if that helps. Thanks, Sreeram On Thu, Dec 14, 2023, 1:12?PM Pierre Jolivet wrote: > > > On 14 Dec 2023, at 8:02?PM, Sreeram R Venkat wrote: > > Hello Pierre, > > Thank you for your reply. I tried out the HPDDM CG as you said, and it > seems to be doing the batched solves, but the KSP is not converging due to > a NaN or Inf being generated. I also noticed there are a lot of > host-to-device and device-to-host copies of the matrices (the non-batched > KSP solve did not have any memcopies). I have attached dump.0 again. Could > you please take a look? > > > Yes, but you?d need to send me something I can run with your set of > options (if you are more confident doing this in private, you can remove > the list from c/c). > Not all BoomerAMG smoothers handle blocks of right-hand sides, and there > is not much error checking, so instead of erroring out, this may be the > reason why you are getting garbage. > > Thanks, > Pierre > > Thanks, > Sreeram > > On Thu, Dec 14, 2023 at 12:42?AM Pierre Jolivet wrote: > >> Hello Sreeram, >> KSPCG (PETSc implementation of CG) does not handle solves with multiple >> columns at once. >> There is only a single native PETSc KSP implementation which handles >> solves with multiple columns at once: KSPPREONLY. >> If you use --download-hpddm, you can use a CG (or GMRES, or more advanced >> methods) implementation which handles solves with multiple columns at once >> (via -ksp_type hpddm -ksp_hpddm_type cg or KSPSetType(ksp, KSPHPDDM); >> KSPHPDDMSetType(ksp, KSP_HPDDM_TYPE_CG);). >> I?m the main author of HPDDM, there is preliminary support for device >> matrices, but if it?s not working as intended/not faster than column by >> column, I?d be happy to have a deeper look (maybe in private), because most >> (if not all) of my users interested in (pseudo-)block Krylov solvers (i.e., >> solvers that treat right-hand sides in a single go) are using plain host >> matrices. >> >> Thanks, >> Pierre >> >> PS: you could have a look at >> https://www.sciencedirect.com/science/article/abs/pii/S0898122121000055 to >> understand the philosophy behind block iterative methods in PETSc (and in >> HPDDM), src/mat/tests/ex237.c, the benchmark I mentioned earlier, was >> developed in the context of this paper to produce Figures 2-3. Note that >> this paper is now slightly outdated, since then, PCHYPRE and PCMG (among >> others) have been made ?PCMatApply()-ready?. >> >> On 13 Dec 2023, at 11:05?PM, Sreeram R Venkat >> wrote: >> >> Hello Pierre, >> >> I am trying out the KSPMatSolve with the BoomerAMG preconditioner. >> However, I am noticing that it is still solving column by column (this is >> stated explicitly in the info dump attached). I looked at the code for >> KSPMatSolve_Private() and saw that as long as ksp->ops->matsolve is >> true, it should do the batched solve, though I'm not sure where that gets >> set. >> >> I am using the options -pc_type hypre -pc_hypre_type boomeramg when >> running the code. >> >> Can you please help me with this? >> >> Thanks, >> Sreeram >> >> >> On Thu, Dec 7, 2023 at 4:04?PM Mark Adams wrote: >> >>> N.B., AMGX interface is a bit experimental. >>> Mark >>> >>> On Thu, Dec 7, 2023 at 4:11?PM Sreeram R Venkat >>> wrote: >>> >>>> Oh, in that case I will try out BoomerAMG. 
Getting AMGX to build >>>> correctly was also tricky so hopefully the HYPRE build will be easier. >>>> >>>> Thanks, >>>> Sreeram >>>> >>>> On Thu, Dec 7, 2023, 3:03 PM Pierre Jolivet wrote: >>>> >>>>> >>>>> >>>>> On 7 Dec 2023, at 9:37?PM, Sreeram R Venkat >>>>> wrote: >>>>> >>>>> Thank you Barry and Pierre; I will proceed with the first option. >>>>> >>>>> I want to use the AMGX preconditioner for the KSP. I will try it out >>>>> and see how it performs. >>>>> >>>>> >>>>> Just FYI, AMGX does not handle systems with multiple RHS, and thus has >>>>> no PCMatApply() implementation. >>>>> BoomerAMG does, and there is a PCMatApply_HYPRE_BoomerAMG() >>>>> implementation. >>>>> But let us know if you need assistance figuring things out. >>>>> >>>>> Thanks, >>>>> Pierre >>>>> >>>>> Thanks, >>>>> Sreeram >>>>> >>>>> On Thu, Dec 7, 2023 at 2:02?PM Pierre Jolivet wrote: >>>>> >>>>>> To expand on Barry?s answer, we have observed repeatedly that >>>>>> MatMatMult with MatAIJ performs better than MatMult with MatMAIJ, you can >>>>>> reproduce this on your own with >>>>>> https://petsc.org/release/src/mat/tests/ex237.c.html. >>>>>> Also, I?m guessing you are using some sort of preconditioner within >>>>>> your KSP. >>>>>> Not all are ?KSPMatSolve-ready?, i.e., they may treat blocks of >>>>>> right-hand sides column by column, which is very inefficient. >>>>>> You could run your code with -info dump and send us dump.0 to see >>>>>> what needs to be done on our end to make things more efficient, should you >>>>>> not be satisfied with the current performance of the code. >>>>>> >>>>>> Thanks, >>>>>> Pierre >>>>>> >>>>>> On 7 Dec 2023, at 8:34?PM, Barry Smith wrote: >>>>>> >>>>>> >>>>>> >>>>>> On Dec 7, 2023, at 1:17?PM, Sreeram R Venkat >>>>>> wrote: >>>>>> >>>>>> I have 2 sequential matrices M and R (both MATSEQAIJCUSPARSE of size >>>>>> n x n) and a vector v of size n*m. v = [v_1 , v_2 ,... , v_m] where v_i has >>>>>> size n. The data for v can be stored either in column-major or row-major >>>>>> order. Now, I want to do 2 types of operations: >>>>>> >>>>>> 1. Matvecs of the form M*v_i = w_i, for i = 1..m. >>>>>> 2. KSPSolves of the form R*x_i = v_i, for i = 1..m. >>>>>> >>>>>> From what I have read on the documentation, I can think of 2 >>>>>> approaches. >>>>>> >>>>>> 1. Get the pointer to the data in v (column-major) and use it to >>>>>> create a dense matrix V. Then do a MatMatMult with M*V = W, and take the >>>>>> data pointer of W to create the vector w. For KSPSolves, use KSPMatSolve >>>>>> with R and V. >>>>>> >>>>>> 2. Create a MATMAIJ using M/R and use that for matvecs directly with >>>>>> the vector v. I don't know if KSPSolve with the MATMAIJ will know that it >>>>>> is a multiple RHS system and act accordingly. >>>>>> >>>>>> Which would be the more efficient option? >>>>>> >>>>>> >>>>>> Use 1. >>>>>> >>>>>> >>>>>> As a side-note, I am also wondering if there is a way to use >>>>>> row-major storage of the vector v. >>>>>> >>>>>> >>>>>> No >>>>>> >>>>>> The reason is that this could allow for more coalesced memory access >>>>>> when doing matvecs. >>>>>> >>>>>> >>>>>> PETSc matrix-vector products use BLAS GMEV matrix-vector products >>>>>> for the computation so in theory they should already be well-optimized >>>>>> >>>>>> >>>>>> Thanks, >>>>>> Sreeram >>>>>> >>>>>> >>>>>> >>>>> >> >> >> > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From vittorio.sciortino at uniba.it Thu Dec 14 22:20:10 2023 From: vittorio.sciortino at uniba.it (Vittorio Sciortino) Date: Fri, 15 Dec 2023 05:20:10 +0100 Subject: [petsc-users] PETSc configuration to solve Poisson equation on a 2D cartesian grid of points with nVidia GPUs (CUDA) Message-ID: <5b01e23b0de85bc1c5f09b58ee29ef79@uniba.it> Dear PETSc developers, My name is Vittorio Sciortion, I am a PhD student in Italy and I am really curious about the applications and possibilities of your library. I would ask you two questions about PETSc. My study case consists in the development of a 2D electrostatic Particle In Cell code which simulates a plasma interacting with the shaped surface of adjacent divertor mono-blocks. This type of scenario requires to solve the electro-static Poisson equation on the whole set of grid nodes (a cartesian grid) applying some boundary conditions. Currently, we are using the KSPSolve subroutine set to apply the gmres iterative method in conjunction with hypre (used as pre-conditioner). Some boundary conditons are necessary for our specific problem (Dirichlet and Neumann conditions on specific line of points). I have two small curiosity about the possibilities offered by your library, which is very interesting: 1. are we using the best possible pair to solve our problem? 2. currently, PETSc is compiled with openMP parallelization and the iterative method is executed on the CPU. Is it possible to configure the compilation of our library to execute these iterations on a nVidia GPU? Which are the best compilation options that you suggest for your library? thank you in advance Greetings Vittorio Sciortino PhD student in Physics Bari, Italy Recently, I sent a subscribe request to the users mailing list using another e-mail, because this one could be deactivated in two/three months. private email: vsciortino.phdcourse at gmail.com -- Vittorio Sciortino ________________________________________________________________________________________________ Sostieni la formazione e la ricerca universitaria con il tuo 5 per mille all'Universit? di Bari. Firma la casella "Finanziamento della ricerca scientifica e della Universit?" indicando il codice fiscale 80002170720. Il tuo contributo pu? fare la differenza: oggi pi? che mai! ________________________________________________________________________________________________ From pierre at joliv.et Fri Dec 15 01:01:10 2023 From: pierre at joliv.et (Pierre Jolivet) Date: Fri, 15 Dec 2023 08:01:10 +0100 Subject: [petsc-users] Matvecs and KSPSolves with multiple vectors In-Reply-To: References: <855B7267-E3F9-43CC-9EBE-96276E9A8E42@petsc.dev> <9F227CDE-815C-4446-A424-0BBDA38B1E79@joliv.et> Message-ID: <186F2922-CD1B-4778-BDB3-8DA6EAB8D976@joliv.et> > On 14 Dec 2023, at 11:45?PM, Sreeram R Venkat wrote: > > Thanks, I will try to create a minimal reproducible example. This may take me some time though, as I need to figure out how to extract only the relevant parts (the full program this solve is used in is getting quite complex). You could just do -ksp_view_mat binary:Amat.bin -ksp_view_pmat binary:Pmat.bin -ksp_view_rhs binary:rhs.bin and send me those three files (I?m guessing your are using double-precision scalars with 32-bit PetscInt). > I'll also try out some of the BoomerAMG options to see if that helps. These should work (this is where all ?PCMatApply()-ready? 
PC are being tested): https://petsc.org/release/src/ksp/ksp/tutorials/ex79.c.html#line215 You can see it?s also testing PCHYPRE + KSPHPDDM on device (but not with HIP). I?m aware the performance should not be optimal (see your comment about host/device copies), I?ve money to hire someone to work on this but: a) I need to find the correct engineer/post-doc, b) I currently don?t have good use cases (of course, I could generate a synthetic benchmark, for science). So even if you send me the three Mat, a MWE would be appreciated if the KSPMatSolve() is performance-critical for you (see point b) from above). Thanks, Pierre > Thanks, > Sreeram > > On Thu, Dec 14, 2023, 1:12?PM Pierre Jolivet > wrote: >> >> >>> On 14 Dec 2023, at 8:02?PM, Sreeram R Venkat > wrote: >>> >>> Hello Pierre, >>> >>> Thank you for your reply. I tried out the HPDDM CG as you said, and it seems to be doing the batched solves, but the KSP is not converging due to a NaN or Inf being generated. I also noticed there are a lot of host-to-device and device-to-host copies of the matrices (the non-batched KSP solve did not have any memcopies). I have attached dump.0 again. Could you please take a look? >> >> Yes, but you?d need to send me something I can run with your set of options (if you are more confident doing this in private, you can remove the list from c/c). >> Not all BoomerAMG smoothers handle blocks of right-hand sides, and there is not much error checking, so instead of erroring out, this may be the reason why you are getting garbage. >> >> Thanks, >> Pierre >> >>> Thanks, >>> Sreeram >>> >>> On Thu, Dec 14, 2023 at 12:42?AM Pierre Jolivet > wrote: >>>> Hello Sreeram, >>>> KSPCG (PETSc implementation of CG) does not handle solves with multiple columns at once. >>>> There is only a single native PETSc KSP implementation which handles solves with multiple columns at once: KSPPREONLY. >>>> If you use --download-hpddm, you can use a CG (or GMRES, or more advanced methods) implementation which handles solves with multiple columns at once (via -ksp_type hpddm -ksp_hpddm_type cg or KSPSetType(ksp, KSPHPDDM); KSPHPDDMSetType(ksp, KSP_HPDDM_TYPE_CG);). >>>> I?m the main author of HPDDM, there is preliminary support for device matrices, but if it?s not working as intended/not faster than column by column, I?d be happy to have a deeper look (maybe in private), because most (if not all) of my users interested in (pseudo-)block Krylov solvers (i.e., solvers that treat right-hand sides in a single go) are using plain host matrices. >>>> >>>> Thanks, >>>> Pierre >>>> >>>> PS: you could have a look at https://www.sciencedirect.com/science/article/abs/pii/S0898122121000055 to understand the philosophy behind block iterative methods in PETSc (and in HPDDM), src/mat/tests/ex237.c, the benchmark I mentioned earlier, was developed in the context of this paper to produce Figures 2-3. Note that this paper is now slightly outdated, since then, PCHYPRE and PCMG (among others) have been made ?PCMatApply()-ready?. >>>> >>>>> On 13 Dec 2023, at 11:05?PM, Sreeram R Venkat > wrote: >>>>> >>>>> Hello Pierre, >>>>> >>>>> I am trying out the KSPMatSolve with the BoomerAMG preconditioner. However, I am noticing that it is still solving column by column (this is stated explicitly in the info dump attached). I looked at the code for KSPMatSolve_Private() and saw that as long as ksp->ops->matsolve is true, it should do the batched solve, though I'm not sure where that gets set. 
>>>>> >>>>> I am using the options -pc_type hypre -pc_hypre_type boomeramg when running the code. >>>>> >>>>> Can you please help me with this? >>>>> >>>>> Thanks, >>>>> Sreeram >>>>> >>>>> >>>>> On Thu, Dec 7, 2023 at 4:04?PM Mark Adams > wrote: >>>>>> N.B., AMGX interface is a bit experimental. >>>>>> Mark >>>>>> >>>>>> On Thu, Dec 7, 2023 at 4:11?PM Sreeram R Venkat > wrote: >>>>>>> Oh, in that case I will try out BoomerAMG. Getting AMGX to build correctly was also tricky so hopefully the HYPRE build will be easier. >>>>>>> >>>>>>> Thanks, >>>>>>> Sreeram >>>>>>> >>>>>>> On Thu, Dec 7, 2023, 3:03 PM Pierre Jolivet > wrote: >>>>>>>> >>>>>>>> >>>>>>>>> On 7 Dec 2023, at 9:37?PM, Sreeram R Venkat > wrote: >>>>>>>>> >>>>>>>>> Thank you Barry and Pierre; I will proceed with the first option. >>>>>>>>> >>>>>>>>> I want to use the AMGX preconditioner for the KSP. I will try it out and see how it performs. >>>>>>>> >>>>>>>> Just FYI, AMGX does not handle systems with multiple RHS, and thus has no PCMatApply() implementation. >>>>>>>> BoomerAMG does, and there is a PCMatApply_HYPRE_BoomerAMG() implementation. >>>>>>>> But let us know if you need assistance figuring things out. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Pierre >>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Sreeram >>>>>>>>> >>>>>>>>> On Thu, Dec 7, 2023 at 2:02?PM Pierre Jolivet > wrote: >>>>>>>>>> To expand on Barry?s answer, we have observed repeatedly that MatMatMult with MatAIJ performs better than MatMult with MatMAIJ, you can reproduce this on your own with https://petsc.org/release/src/mat/tests/ex237.c.html. >>>>>>>>>> Also, I?m guessing you are using some sort of preconditioner within your KSP. >>>>>>>>>> Not all are ?KSPMatSolve-ready?, i.e., they may treat blocks of right-hand sides column by column, which is very inefficient. >>>>>>>>>> You could run your code with -info dump and send us dump.0 to see what needs to be done on our end to make things more efficient, should you not be satisfied with the current performance of the code. >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Pierre >>>>>>>>>> >>>>>>>>>>> On 7 Dec 2023, at 8:34?PM, Barry Smith > wrote: >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> On Dec 7, 2023, at 1:17?PM, Sreeram R Venkat > wrote: >>>>>>>>>>>> >>>>>>>>>>>> I have 2 sequential matrices M and R (both MATSEQAIJCUSPARSE of size n x n) and a vector v of size n*m. v = [v_1 , v_2 ,... , v_m] where v_i has size n. The data for v can be stored either in column-major or row-major order. Now, I want to do 2 types of operations: >>>>>>>>>>>> >>>>>>>>>>>> 1. Matvecs of the form M*v_i = w_i, for i = 1..m. >>>>>>>>>>>> 2. KSPSolves of the form R*x_i = v_i, for i = 1..m. >>>>>>>>>>>> >>>>>>>>>>>> From what I have read on the documentation, I can think of 2 approaches. >>>>>>>>>>>> >>>>>>>>>>>> 1. Get the pointer to the data in v (column-major) and use it to create a dense matrix V. Then do a MatMatMult with M*V = W, and take the data pointer of W to create the vector w. For KSPSolves, use KSPMatSolve with R and V. >>>>>>>>>>>> >>>>>>>>>>>> 2. Create a MATMAIJ using M/R and use that for matvecs directly with the vector v. I don't know if KSPSolve with the MATMAIJ will know that it is a multiple RHS system and act accordingly. >>>>>>>>>>>> >>>>>>>>>>>> Which would be the more efficient option? >>>>>>>>>>> >>>>>>>>>>> Use 1. >>>>>>>>>>>> >>>>>>>>>>>> As a side-note, I am also wondering if there is a way to use row-major storage of the vector v. 
>>>>>>>>>>> >>>>>>>>>>> No >>>>>>>>>>> >>>>>>>>>>>> The reason is that this could allow for more coalesced memory access when doing matvecs. >>>>>>>>>>> >>>>>>>>>>> PETSc matrix-vector products use BLAS GMEV matrix-vector products for the computation so in theory they should already be well-optimized >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Sreeram >>>>>>>>>> >>>>>>>> >>>>> >>>> >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Fri Dec 15 08:17:55 2023 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 15 Dec 2023 09:17:55 -0500 Subject: [petsc-users] PETSc configuration to solve Poisson equation on a 2D cartesian grid of points with nVidia GPUs (CUDA) In-Reply-To: <5b01e23b0de85bc1c5f09b58ee29ef79@uniba.it> References: <5b01e23b0de85bc1c5f09b58ee29ef79@uniba.it> Message-ID: Hi Vittorio, PETSc does provide support for your application and some of us (eg, me and Matt) work with fusion PIC applications. 1) I am not sure how you handle boundary conditions with a Cartesian grid so let me give two responses: 1.1) With Cartesian grids, geometric multigrid may be usable and that can be fast and easier to use. PETSc supports geometric and algebraic multigrid, including interfaces to third party libraries like hypre. Hypre is an excellent solver, but you can probably use CG as your KSP method instead of GMRES. 1.2) PETSc provides support for unstructured mesh management and discretizations and you switch to an unstructured grid, but I understand we all have priorities. Unstructured grids are probably a better long term solution for you. 2) PETSc is portable with linear algebra back-ends that execute on any "device". Our OpenMP support is only through the Kokkos back-end and we have custom CUDA and HIP backends that are built on vendor libraries. The Kokkos back-end also supports CUDA, HIP and SYCL and we rely on Kokkos any other architectures at this point. BTW, the Kokkos back-end also has an option to use vendor back-ends or Kokkos Kernels for linear algebra and they are often better than the vendors libraries. Hope this helps, Mark On Fri, Dec 15, 2023 at 12:41?AM Vittorio Sciortino < vittorio.sciortino at uniba.it> wrote: > > Dear PETSc developers, > > My name is Vittorio Sciortion, I am a PhD student in Italy and I am > really curious about the applications and possibilities of your > library. I would ask you two questions about PETSc. > > My study case consists in the development of a 2D electrostatic Particle > In Cell code which simulates a plasma interacting with the shaped > surface of adjacent divertor mono-blocks. > This type of scenario requires to solve the electro-static Poisson > equation on the whole set of grid nodes (a cartesian grid) applying some > boundary conditions. > Currently, we are using the KSPSolve subroutine set to apply the gmres > iterative method in conjunction with hypre (used as pre-conditioner). > Some boundary conditons are necessary for our specific problem > (Dirichlet and Neumann conditions on specific line of points). > I have two small curiosity about the possibilities offered by your > library, which is very interesting: > > 1. are we using the best possible pair to solve our problem? > > 2. currently, PETSc is compiled with openMP parallelization and the > iterative method is executed on the CPU. > Is it possible to configure the compilation of our library to execute > these iterations on a nVidia GPU? Which are the best compilation options > that you suggest for your library? 
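As a rough sketch of what Mark describes (hedged: the exact flags depend on the local CUDA toolchain, the MPI stack, and on how hypre itself was built; the executable name and process count below are placeholders), a CUDA-enabled build is typically configured with something like

   ./configure --with-cuda --download-hypre

and, when the grid and solver objects take their types from the options database (for example through a DMDA), the Poisson solve can be moved to the GPU at run time with

   mpiexec -n 4 ./my_pic_code -dm_vec_type cuda -dm_mat_type aijcusparse -ksp_type cg -pc_type gamg -log_view

-pc_type hypre -pc_hypre_type boomeramg remains an option in place of gamg, and -log_view reports how much of the solve actually executed on the device.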
> > thank you in advance > Greetings > Vittorio Sciortino > PhD student in Physics > Bari, Italy > > Recently, I sent a subscribe request to the users mailing list using > another e-mail, because this one could be deactivated in two/three > months. private email: vsciortino.phdcourse at gmail.com > -- > Vittorio Sciortino > > ________________________________________________________________________________________________ > Sostieni la formazione e la ricerca universitaria con il tuo 5 per mille > all'Universit? di Bari. > Firma la casella "Finanziamento della ricerca scientifica e della > Universit?" > indicando il codice fiscale 80002170720. > > Il tuo contributo pu? fare la differenza: oggi pi? che mai! > > ________________________________________________________________________________________________ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre at joliv.et Sat Dec 16 11:25:54 2023 From: pierre at joliv.et (Pierre Jolivet) Date: Sat, 16 Dec 2023 18:25:54 +0100 Subject: [petsc-users] Matvecs and KSPSolves with multiple vectors In-Reply-To: References: <855B7267-E3F9-43CC-9EBE-96276E9A8E42@petsc.dev> <9F227CDE-815C-4446-A424-0BBDA38B1E79@joliv.et> <186F2922-CD1B-4778-BDB3-8DA6EAB8D976@joliv.et> Message-ID: <7921468C-F4B9-4BC0-ADD1-2F84619E8A7E@joliv.et> Unfortunately, I am not able to reproduce such a failure with your input matrix. I?ve used ex79 that I linked previously and the system is properly solved. $ ./ex79 -pc_type hypre -ksp_type hpddm -ksp_hpddm_type cg -ksp_converged_reason -ksp_view_mat ascii::ascii_info -ksp_view_rhs ascii::ascii_info Linear solve converged due to CONVERGED_RTOL iterations 6 Mat Object: 1 MPI process type: seqaijcusparse rows=289, cols=289 total: nonzeros=2401, allocated nonzeros=2401 total number of mallocs used during MatSetValues calls=0 not using I-node routines Mat Object: 1 MPI process type: seqdensecuda rows=289, cols=10 total: nonzeros=2890, allocated nonzeros=2890 total number of mallocs used during MatSetValues calls=0 You mentioned in a subsequent email that you are interested in systems with at most 1E6 unknowns, and up to 1E4 right-hand sides. I?m not sure you can expect significant gains from using GPU for such systems. Probably, the fastest approach would indeed be -pc_type lu -ksp_type preonly -ksp_matsolve_batch_size 100 or something, depending on the memory available on your host. Thanks, Pierre > On 15 Dec 2023, at 9:52?PM, Sreeram R Venkat wrote: > > Here are the ksp_view files. I set the options -ksp_error_if_not_converged to try to get the vectors that caused the error. I noticed that some of the KSPMatSolves converge while others don't. In the code, the solves are called as: > > input vector v --> insert data of v into a dense mat --> KSPMatSolve() --> MatMatMult() --> KSPMatSolve() --> insert data of dense mat into output vector w -- output w > > The operator used in the KSP is a Laplacian-like operator, and the MatMatMult is with a Mass Matrix. The whole thing is supposed to be a solve with a biharmonic-like operator. I can also run it with only the first KSPMatSolve (i.e. just a Laplacian-like operator). In that case, the KSP reportedly converges after 0 iterations (see the next line), but this causes problems in other parts of the code later on. > > I saw that sometimes the first KSPMatSolve "converges" after 0 iterations due to CONVERGED_RTOL. Then, the second KSPMatSolve produces a NaN/Inf. I tried setting ksp_min_it, but that didn't seem to do anything. 
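As a host-side illustration of the "insert data of v into a dense mat --> KSPMatSolve()" step described above (a sketch only: n, m, v, and ksp are taken from the surrounding discussion, everything else is illustrative, and a device-resident dense type such as MATDENSECUDA may be preferable when the operator is MATSEQAIJCUSPARSE):

/* v holds m right-hand sides of length n, stored column-major (see above) */
PetscScalar *varray;
Mat          V, X;

PetscCall(VecGetArray(v, &varray));
PetscCall(MatCreateDense(PETSC_COMM_SELF, n, m, n, m, varray, &V)); /* wraps varray, no copy */
PetscCall(MatCreateDense(PETSC_COMM_SELF, n, m, n, m, NULL, &X));   /* block of solutions */
PetscCall(KSPSetType(ksp, KSPHPDDM));                /* pseudo-block solver, needs --download-hpddm */
PetscCall(KSPHPDDMSetType(ksp, KSP_HPDDM_TYPE_CG));  /* same as -ksp_type hpddm -ksp_hpddm_type cg */
PetscCall(KSPMatSolve(ksp, V, X));                   /* all m columns solved in one call */
PetscCall(MatDestroy(&V));
PetscCall(MatDestroy(&X));
PetscCall(VecRestoreArray(v, &varray));

The resulting X can then go through MatMatMult() with the mass matrix and a second KSPMatSolve() for the biharmonic-like chain, and the -ksp_matsolve_batch_size option mentioned above limits how many columns are treated per block.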
> > I'll keep trying different options and also try to get the MWE made (this KSPMatSolve is pretty performance critical for us). > > Thanks for all your help, > Sreeram > > On Fri, Dec 15, 2023 at 1:01?AM Pierre Jolivet > wrote: >> >>> On 14 Dec 2023, at 11:45?PM, Sreeram R Venkat > wrote: >>> >>> Thanks, I will try to create a minimal reproducible example. This may take me some time though, as I need to figure out how to extract only the relevant parts (the full program this solve is used in is getting quite complex). >> >> You could just do -ksp_view_mat binary:Amat.bin -ksp_view_pmat binary:Pmat.bin -ksp_view_rhs binary:rhs.bin and send me those three files (I?m guessing your are using double-precision scalars with 32-bit PetscInt). >> >>> I'll also try out some of the BoomerAMG options to see if that helps. >> >> These should work (this is where all ?PCMatApply()-ready? PC are being tested): https://petsc.org/release/src/ksp/ksp/tutorials/ex79.c.html#line215 >> You can see it?s also testing PCHYPRE + KSPHPDDM on device (but not with HIP). >> I?m aware the performance should not be optimal (see your comment about host/device copies), I?ve money to hire someone to work on this but: a) I need to find the correct engineer/post-doc, b) I currently don?t have good use cases (of course, I could generate a synthetic benchmark, for science). >> So even if you send me the three Mat, a MWE would be appreciated if the KSPMatSolve() is performance-critical for you (see point b) from above). >> >> Thanks, >> Pierre >> >>> Thanks, >>> Sreeram >>> >>> On Thu, Dec 14, 2023, 1:12?PM Pierre Jolivet > wrote: >>>> >>>> >>>>> On 14 Dec 2023, at 8:02?PM, Sreeram R Venkat > wrote: >>>>> >>>>> Hello Pierre, >>>>> >>>>> Thank you for your reply. I tried out the HPDDM CG as you said, and it seems to be doing the batched solves, but the KSP is not converging due to a NaN or Inf being generated. I also noticed there are a lot of host-to-device and device-to-host copies of the matrices (the non-batched KSP solve did not have any memcopies). I have attached dump.0 again. Could you please take a look? >>>> >>>> Yes, but you?d need to send me something I can run with your set of options (if you are more confident doing this in private, you can remove the list from c/c). >>>> Not all BoomerAMG smoothers handle blocks of right-hand sides, and there is not much error checking, so instead of erroring out, this may be the reason why you are getting garbage. >>>> >>>> Thanks, >>>> Pierre >>>> >>>>> Thanks, >>>>> Sreeram >>>>> >>>>> On Thu, Dec 14, 2023 at 12:42?AM Pierre Jolivet > wrote: >>>>>> Hello Sreeram, >>>>>> KSPCG (PETSc implementation of CG) does not handle solves with multiple columns at once. >>>>>> There is only a single native PETSc KSP implementation which handles solves with multiple columns at once: KSPPREONLY. >>>>>> If you use --download-hpddm, you can use a CG (or GMRES, or more advanced methods) implementation which handles solves with multiple columns at once (via -ksp_type hpddm -ksp_hpddm_type cg or KSPSetType(ksp, KSPHPDDM); KSPHPDDMSetType(ksp, KSP_HPDDM_TYPE_CG);). >>>>>> I?m the main author of HPDDM, there is preliminary support for device matrices, but if it?s not working as intended/not faster than column by column, I?d be happy to have a deeper look (maybe in private), because most (if not all) of my users interested in (pseudo-)block Krylov solvers (i.e., solvers that treat right-hand sides in a single go) are using plain host matrices. 
>>>>>> >>>>>> Thanks, >>>>>> Pierre >>>>>> >>>>>> PS: you could have a look at https://www.sciencedirect.com/science/article/abs/pii/S0898122121000055 to understand the philosophy behind block iterative methods in PETSc (and in HPDDM), src/mat/tests/ex237.c, the benchmark I mentioned earlier, was developed in the context of this paper to produce Figures 2-3. Note that this paper is now slightly outdated, since then, PCHYPRE and PCMG (among others) have been made ?PCMatApply()-ready?. >>>>>> >>>>>>> On 13 Dec 2023, at 11:05?PM, Sreeram R Venkat > wrote: >>>>>>> >>>>>>> Hello Pierre, >>>>>>> >>>>>>> I am trying out the KSPMatSolve with the BoomerAMG preconditioner. However, I am noticing that it is still solving column by column (this is stated explicitly in the info dump attached). I looked at the code for KSPMatSolve_Private() and saw that as long as ksp->ops->matsolve is true, it should do the batched solve, though I'm not sure where that gets set. >>>>>>> >>>>>>> I am using the options -pc_type hypre -pc_hypre_type boomeramg when running the code. >>>>>>> >>>>>>> Can you please help me with this? >>>>>>> >>>>>>> Thanks, >>>>>>> Sreeram >>>>>>> >>>>>>> >>>>>>> On Thu, Dec 7, 2023 at 4:04?PM Mark Adams > wrote: >>>>>>>> N.B., AMGX interface is a bit experimental. >>>>>>>> Mark >>>>>>>> >>>>>>>> On Thu, Dec 7, 2023 at 4:11?PM Sreeram R Venkat > wrote: >>>>>>>>> Oh, in that case I will try out BoomerAMG. Getting AMGX to build correctly was also tricky so hopefully the HYPRE build will be easier. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Sreeram >>>>>>>>> >>>>>>>>> On Thu, Dec 7, 2023, 3:03 PM Pierre Jolivet > wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> On 7 Dec 2023, at 9:37?PM, Sreeram R Venkat > wrote: >>>>>>>>>>> >>>>>>>>>>> Thank you Barry and Pierre; I will proceed with the first option. >>>>>>>>>>> >>>>>>>>>>> I want to use the AMGX preconditioner for the KSP. I will try it out and see how it performs. >>>>>>>>>> >>>>>>>>>> Just FYI, AMGX does not handle systems with multiple RHS, and thus has no PCMatApply() implementation. >>>>>>>>>> BoomerAMG does, and there is a PCMatApply_HYPRE_BoomerAMG() implementation. >>>>>>>>>> But let us know if you need assistance figuring things out. >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Pierre >>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Sreeram >>>>>>>>>>> >>>>>>>>>>> On Thu, Dec 7, 2023 at 2:02?PM Pierre Jolivet > wrote: >>>>>>>>>>>> To expand on Barry?s answer, we have observed repeatedly that MatMatMult with MatAIJ performs better than MatMult with MatMAIJ, you can reproduce this on your own with https://petsc.org/release/src/mat/tests/ex237.c.html. >>>>>>>>>>>> Also, I?m guessing you are using some sort of preconditioner within your KSP. >>>>>>>>>>>> Not all are ?KSPMatSolve-ready?, i.e., they may treat blocks of right-hand sides column by column, which is very inefficient. >>>>>>>>>>>> You could run your code with -info dump and send us dump.0 to see what needs to be done on our end to make things more efficient, should you not be satisfied with the current performance of the code. >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Pierre >>>>>>>>>>>> >>>>>>>>>>>>> On 7 Dec 2023, at 8:34?PM, Barry Smith > wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> On Dec 7, 2023, at 1:17?PM, Sreeram R Venkat > wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> I have 2 sequential matrices M and R (both MATSEQAIJCUSPARSE of size n x n) and a vector v of size n*m. v = [v_1 , v_2 ,... , v_m] where v_i has size n. 
The data for v can be stored either in column-major or row-major order. Now, I want to do 2 types of operations: >>>>>>>>>>>>>> >>>>>>>>>>>>>> 1. Matvecs of the form M*v_i = w_i, for i = 1..m. >>>>>>>>>>>>>> 2. KSPSolves of the form R*x_i = v_i, for i = 1..m. >>>>>>>>>>>>>> >>>>>>>>>>>>>> From what I have read on the documentation, I can think of 2 approaches. >>>>>>>>>>>>>> >>>>>>>>>>>>>> 1. Get the pointer to the data in v (column-major) and use it to create a dense matrix V. Then do a MatMatMult with M*V = W, and take the data pointer of W to create the vector w. For KSPSolves, use KSPMatSolve with R and V. >>>>>>>>>>>>>> >>>>>>>>>>>>>> 2. Create a MATMAIJ using M/R and use that for matvecs directly with the vector v. I don't know if KSPSolve with the MATMAIJ will know that it is a multiple RHS system and act accordingly. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Which would be the more efficient option? >>>>>>>>>>>>> >>>>>>>>>>>>> Use 1. >>>>>>>>>>>>>> >>>>>>>>>>>>>> As a side-note, I am also wondering if there is a way to use row-major storage of the vector v. >>>>>>>>>>>>> >>>>>>>>>>>>> No >>>>>>>>>>>>> >>>>>>>>>>>>>> The reason is that this could allow for more coalesced memory access when doing matvecs. >>>>>>>>>>>>> >>>>>>>>>>>>> PETSc matrix-vector products use BLAS GMEV matrix-vector products for the computation so in theory they should already be well-optimized >>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> Sreeram >>>>>>>>>>>> >>>>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From joauma.marichal at uclouvain.be Mon Dec 18 04:09:36 2023 From: joauma.marichal at uclouvain.be (Joauma Marichal) Date: Mon, 18 Dec 2023 10:09:36 +0000 Subject: [petsc-users] [petsc-maint] DMSwarm on multiple processors In-Reply-To: References: Message-ID: Hello, Sorry for the delay. I attach the file that I obtain when running the code with the debug mode. Thanks for your help. Best regards, Joauma De : Matthew Knepley Date : jeudi, 23 novembre 2023 ? 15:32 ? : Joauma Marichal Cc : petsc-maint at mcs.anl.gov , petsc-users at mcs.anl.gov Objet : Re: [petsc-maint] DMSwarm on multiple processors On Thu, Nov 23, 2023 at 9:01?AM Joauma Marichal > wrote: Hello, My problem persists? Is there anything I could try? Yes. It appears to be failing from a call inside PetscSFSetUpRanks(). It does allocation, and the failure is in libc, and it only happens on larger examples, so I suspect some allocation problem. Can you rebuild with debugging and run this example? Then we can see if the allocation fails. Thanks, Matt Thanks a lot. Best regards, Joauma De : Matthew Knepley > Date : mercredi, 25 octobre 2023 ? 14:45 ? : Joauma Marichal > Cc : petsc-maint at mcs.anl.gov >, petsc-users at mcs.anl.gov > Objet : Re: [petsc-maint] DMSwarm on multiple processors On Wed, Oct 25, 2023 at 8:32?AM Joauma Marichal via petsc-maint > wrote: Hello, I am using the DMSwarm library in some Eulerian-Lagrangian approach to have vapor bubbles in water. I have obtained nice results recently and wanted to perform bigger simulations. 
Unfortunately, when I increase the number of processors used to run the simulation, I get the following error: free(): invalid size [cns136:590327] *** Process received signal *** [cns136:590327] Signal: Aborted (6) [cns136:590327] Signal code: (-6) [cns136:590327] [ 0] /lib64/libc.so.6(+0x4eb20)[0x7f56cd4c9b20] [cns136:590327] [ 1] /lib64/libc.so.6(gsignal+0x10f)[0x7f56cd4c9a9f] [cns136:590327] [ 2] /lib64/libc.so.6(abort+0x127)[0x7f56cd49ce05] [cns136:590327] [ 3] /lib64/libc.so.6(+0x91037)[0x7f56cd50c037] [cns136:590327] [ 4] /lib64/libc.so.6(+0x9819c)[0x7f56cd51319c] [cns136:590327] [ 5] /lib64/libc.so.6(+0x99aac)[0x7f56cd514aac] [cns136:590327] [ 6] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscSFSetUpRanks+0x4c4)[0x7f56cea71e64] [cns136:590327] [ 7] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(+0x841642)[0x7f56cea83642] [cns136:590327] [ 8] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscSFSetUp+0x9e)[0x7f56cea7043e] [cns136:590327] [ 9] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(VecScatterCreate+0x164e)[0x7f56cea7bbde] [cns136:590327] [10] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMSetUp_DA_3D+0x3e38)[0x7f56cee84dd8] [cns136:590327] [11] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMSetUp_DA+0xd8)[0x7f56cee9b448] [cns136:590327] [12] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMSetUp+0x20)[0x7f56cededa20] [cns136:590327] [13] ./cobpor[0x4418dc] [cns136:590327] [14] ./cobpor[0x408b63] [cns136:590327] [15] /lib64/libc.so.6(__libc_start_main+0xf3)[0x7f56cd4b5cf3] [cns136:590327] [16] ./cobpor[0x40bdee] [cns136:590327] *** End of error message *** -------------------------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code. Per user-direction, the job has been aborted. -------------------------------------------------------------------------- -------------------------------------------------------------------------- mpiexec noticed that process rank 84 with PID 590327 on node cns136 exited on signal 6 (Aborted). -------------------------------------------------------------------------- When I reduce the number of processors the error disappears and when I run my code without the vapor bubbles it also works. The problem seems to take place at this moment: DMCreate(PETSC_COMM_WORLD,swarm); DMSetType(*swarm,DMSWARM); DMSetDimension(*swarm,3); DMSwarmSetType(*swarm,DMSWARM_PIC); DMSwarmSetCellDM(*swarm,*dmcell); Thanks a lot for your help. Things that would help us track this down: 1) The smallest example where it fails 2) The smallest number of processes where it fails 3) A stack trace of the failure 4) A simple example that we can run that also fails Thanks, Matt Best regards, Joauma -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: slurm-3184479.out Type: application/octet-stream Size: 55415 bytes Desc: slurm-3184479.out URL: From knepley at gmail.com Mon Dec 18 05:00:02 2023 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 18 Dec 2023 06:00:02 -0500 Subject: [petsc-users] [petsc-maint] DMSwarm on multiple processors In-Reply-To: References: Message-ID: On Mon, Dec 18, 2023 at 5:09?AM Joauma Marichal < joauma.marichal at uclouvain.be> wrote: > Hello, > > > > Sorry for the delay. I attach the file that I obtain when running the code > with the debug mode. > Okay, we can now see where this is happening: malloc_consolidate(): invalid chunk size [cns263:3265170] *** Process received signal *** [cns263:3265170] Signal: Aborted (6) [cns263:3265170] Signal code: (-6) [cns263:3265170] [ 0] /lib64/libc.so.6(+0x4eb20)[0x7f3bd9148b20] [cns263:3265170] [ 1] /lib64/libc.so.6(gsignal+0x10f)[0x7f3bd9148a9f] [cns263:3265170] [ 2] /lib64/libc.so.6(abort+0x127)[0x7f3bd911be05] [cns263:3265170] [ 3] /lib64/libc.so.6(+0x91037)[0x7f3bd918b037] [cns263:3265170] [ 4] /lib64/libc.so.6(+0x9819c)[0x7f3bd919219c] [cns263:3265170] [ 5] /lib64/libc.so.6(+0x98b68)[0x7f3bd9192b68] [cns263:3265170] [ 6] /lib64/libc.so.6(+0x9af18)[0x7f3bd9194f18] [cns263:3265170] [ 7] /lib64/libc.so.6(__libc_malloc+0x1e2)[0x7f3bd9196822] [cns263:3265170] [ 8] /lib64/libc.so.6(posix_memalign+0x3c)[0x7f3bd91980fc] [cns263:3265170] [ 9] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscMallocAlign+0x45)[0x7f3bda5f1625] [cns263:3265170] [10] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscMallocA+0x297)[0x7f3bda5f1b07] [cns263:3265170] [11] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMCreate+0x5b)[0x7f3bdaa73c1b] [cns263:3265170] [12] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMDACreate+0x9)[0x7f3bdab0a2f9] [cns263:3265170] [13] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMDACreate3d+0x9a)[0x7f3bdab07dea] [cns263:3265170] [14] ./cobpor[0x402de8] [cns263:3265170] [15] /lib64/libc.so.6(__libc_start_main+0xf3)[0x7f3bd9134cf3] [cns263:3265170] [16] ./cobpor[0x40304e] [cns263:3265170] *** End of error message *** However, this is not great. First, the amount of memory being allocated is quite small, and this does not appear to be an Out of Memory error. Second, the error occurs in libc: malloc_consolidate(): invalid chunk size which means something is wrong internally. I agree with this analysis ( https://stackoverflow.com/questions/18760999/sample-example-program-to-get-the-malloc-consolidate-error) that says you have probably overwritten memory somewhere in your code. I recommend running under valgrind, or using Address Sanitizer from clang. Thanks, Matt Thanks for your help. > > > > Best regards, > > > > Joauma > > > > *De : *Matthew Knepley > *Date : *jeudi, 23 novembre 2023 ? 15:32 > *? : *Joauma Marichal > *Cc : *petsc-maint at mcs.anl.gov , > petsc-users at mcs.anl.gov > *Objet : *Re: [petsc-maint] DMSwarm on multiple processors > > On Thu, Nov 23, 2023 at 9:01?AM Joauma Marichal < > joauma.marichal at uclouvain.be> wrote: > > Hello, > > > > My problem persists? Is there anything I could try? > > > > Yes. It appears to be failing from a call inside PetscSFSetUpRanks(). It > does allocation, and the failure > > is in libc, and it only happens on larger examples, so I suspect some > allocation problem. Can you rebuild with debugging and run this example? > Then we can see if the allocation fails. 
> > > > Thanks, > > Matt > > > > Thanks a lot. > > > > Best regards, > > > > Joauma > > > > *De : *Matthew Knepley > *Date : *mercredi, 25 octobre 2023 ? 14:45 > *? : *Joauma Marichal > *Cc : *petsc-maint at mcs.anl.gov , > petsc-users at mcs.anl.gov > *Objet : *Re: [petsc-maint] DMSwarm on multiple processors > > On Wed, Oct 25, 2023 at 8:32?AM Joauma Marichal via petsc-maint < > petsc-maint at mcs.anl.gov> wrote: > > Hello, > > > > I am using the DMSwarm library in some Eulerian-Lagrangian approach to > have vapor bubbles in water. > > I have obtained nice results recently and wanted to perform bigger > simulations. Unfortunately, when I increase the number of processors used > to run the simulation, I get the following error: > > > > free(): invalid size > > [cns136:590327] *** Process received signal *** > > [cns136:590327] Signal: Aborted (6) > > [cns136:590327] Signal code: (-6) > > [cns136:590327] [ 0] /lib64/libc.so.6(+0x4eb20)[0x7f56cd4c9b20] > > [cns136:590327] [ 1] /lib64/libc.so.6(gsignal+0x10f)[0x7f56cd4c9a9f] > > [cns136:590327] [ 2] /lib64/libc.so.6(abort+0x127)[0x7f56cd49ce05] > > [cns136:590327] [ 3] /lib64/libc.so.6(+0x91037)[0x7f56cd50c037] > > [cns136:590327] [ 4] /lib64/libc.so.6(+0x9819c)[0x7f56cd51319c] > > [cns136:590327] [ 5] /lib64/libc.so.6(+0x99aac)[0x7f56cd514aac] > > [cns136:590327] [ 6] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscSFSetUpRanks+0x4c4)[0x7f56cea71e64] > > [cns136:590327] [ 7] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(+0x841642)[0x7f56cea83642] > > [cns136:590327] [ 8] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscSFSetUp+0x9e)[0x7f56cea7043e] > > [cns136:590327] [ 9] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(VecScatterCreate+0x164e)[0x7f56cea7bbde] > > [cns136:590327] [10] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMSetUp_DA_3D+0x3e38)[0x7f56cee84dd8] > > [cns136:590327] [11] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMSetUp_DA+0xd8)[0x7f56cee9b448] > > [cns136:590327] [12] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMSetUp+0x20)[0x7f56cededa20] > > [cns136:590327] [13] ./cobpor[0x4418dc] > > [cns136:590327] [14] ./cobpor[0x408b63] > > [cns136:590327] [15] > /lib64/libc.so.6(__libc_start_main+0xf3)[0x7f56cd4b5cf3] > > [cns136:590327] [16] ./cobpor[0x40bdee] > > [cns136:590327] *** End of error message *** > > -------------------------------------------------------------------------- > > Primary job terminated normally, but 1 process returned > > a non-zero exit code. Per user-direction, the job has been aborted. > > -------------------------------------------------------------------------- > > -------------------------------------------------------------------------- > > mpiexec noticed that process rank 84 with PID 590327 on node cns136 exited > on signal 6 (Aborted). > > -------------------------------------------------------------------------- > > > > When I reduce the number of processors the error disappears and when I run > my code without the vapor bubbles it also works. > > The problem seems to take place at this moment: > > > > DMCreate(PETSC_COMM_WORLD,swarm); > > DMSetType(*swarm,DMSWARM); > > DMSetDimension(*swarm,3); > > DMSwarmSetType(*swarm,DMSWARM_PIC); > > DMSwarmSetCellDM(*swarm,*dmcell); > > > > > > Thanks a lot for your help. 
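For context, the creation sequence quoted just above is normally completed along the following lines before particles are used; this is only a sketch, with the field name "mass", the sizes, and the comments being illustrative rather than taken from the actual code:

/* continuation of the quoted setup; swarm is still a DM* and dmcell the background DM */
PetscCall(DMSwarmRegisterPetscDatatypeField(*swarm, "mass", 1, PETSC_REAL)); /* per-particle field */
PetscCall(DMSwarmFinalizeFieldRegister(*swarm));        /* no further fields after this call */
PetscCall(DMSwarmSetLocalSizes(*swarm, 1000, 100));     /* local particle count + resize buffer */
/* fill the built-in DMSwarmPICField_coor field with particle coordinates, then */
PetscCall(DMSwarmMigrate(*swarm, PETSC_TRUE));          /* send particles to the ranks owning their cells */

Note that both stack traces quoted in this thread abort inside the background DMDA calls (DMSetUp_DA_3D, DMDACreate3d) rather than inside DMSwarm itself.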
> > > > Things that would help us track this down: > > > > 1) The smallest example where it fails > > > > 2) The smallest number of processes where it fails > > > > 3) A stack trace of the failure > > > > 4) A simple example that we can run that also fails > > > > Thanks, > > > > Matt > > > > Best regards, > > > > Joauma > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From facklerpw at ornl.gov Mon Dec 18 12:54:27 2023 From: facklerpw at ornl.gov (Fackler, Philip) Date: Mon, 18 Dec 2023 18:54:27 +0000 Subject: [petsc-users] [EXTERNAL] Re: Call to DMSetMatrixPreallocateSkip not changing allocation behavior In-Reply-To: <87wmtgcw4m.fsf@jedbrown.org> References: <871qboech6.fsf@jedbrown.org> <87wmtgcw4m.fsf@jedbrown.org> Message-ID: Jed, That seems to have worked (ridiculously well). It's now 55MB, and it's happening in the call to MatSetPreallocationCOO. Thank you, Philip Fackler Research Software Engineer, Application Engineering Group Advanced Computing Systems Research Section Computer Science and Mathematics Division Oak Ridge National Laboratory ________________________________ From: Jed Brown Sent: Thursday, December 14, 2023 16:27 To: Fackler, Philip ; petsc-users at mcs.anl.gov ; xolotl-psi-development at lists.sourceforge.net Subject: [EXTERNAL] Re: [petsc-users] Call to DMSetMatrixPreallocateSkip not changing allocation behavior I had a one-character typo in the diff above. This MR to release should work now. https://urldefense.us/v2/url?u=https-3A__gitlab.com_petsc_petsc_-2D_merge-5Frequests_7120&d=DwIBAg&c=v4IIwRuZAmwupIjowmMWUmLasxPEgYsgNI-O7C4ViYc&r=DAkLCjn8leYU-uJ-kfNEQMhPZWx9lzc4d5KgIR-RZWQ&m=v9sHqomCGBRWotign4NcwYwOpszOJehUGs_EO3eGn4SSZqxnfK7Iv15-X8nO1lii&s=h_jIP-6WcIjR6LssfGrV6Z2DojlN_w7Me4-a4rBE074&e= Jed Brown writes: > 17 GB for a 1D DMDA, wow. :-) > > Could you try applying this diff to make it work for DMDA (it's currently handled by DMPlex)? > > diff --git i/src/dm/impls/da/fdda.c w/src/dm/impls/da/fdda.c > index cad4d926504..bd2a3bda635 100644 > --- i/src/dm/impls/da/fdda.c > +++ w/src/dm/impls/da/fdda.c > @@ -675,19 +675,21 @@ PetscErrorCode DMCreateMatrix_DA(DM da, Mat *J) > specialized setting routines depend only on the particular preallocation > details of the matrix, not the type itself. 
> */ > - PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPIAIJSetPreallocation_C", &aij)); > - if (!aij) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqAIJSetPreallocation_C", &aij)); > - if (!aij) { > - PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPIBAIJSetPreallocation_C", &baij)); > - if (!baij) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqBAIJSetPreallocation_C", &baij)); > - if (!baij) { > - PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPISBAIJSetPreallocation_C", &sbaij)); > - if (!sbaij) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqSBAIJSetPreallocation_C", &sbaij)); > - if (!sbaij) { > - PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPISELLSetPreallocation_C", &sell)); > - if (!sell) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqSELLSetPreallocation_C", &sell)); > + if (!dm->prealloc_skip) { // Flag is likely set when user intends to use MatSetPreallocationCOO() > + PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPIAIJSetPreallocation_C", &aij)); > + if (!aij) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqAIJSetPreallocation_C", &aij)); > + if (!aij) { > + PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPIBAIJSetPreallocation_C", &baij)); > + if (!baij) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqBAIJSetPreallocation_C", &baij)); > + if (!baij) { > + PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPISBAIJSetPreallocation_C", &sbaij)); > + if (!sbaij) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqSBAIJSetPreallocation_C", &sbaij)); > + if (!sbaij) { > + PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPISELLSetPreallocation_C", &sell)); > + if (!sell) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqSELLSetPreallocation_C", &sell)); > + } > + if (!sell) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatISSetPreallocation_C", &is)); > } > - if (!sell) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatISSetPreallocation_C", &is)); > } > } > if (aij) { > > > "Fackler, Philip via petsc-users" writes: > >> I'm using the following sequence of functions related to the Jacobian matrix: >> >> DMDACreate1d(..., &da); >> DMSetFromOptions(da); >> DMSetUp(da); >> DMSetMatType(da, MATAIJKOKKOS); >> DMSetMatrixPreallocateSkip(da, PETSC_TRUE); >> Mat J; >> DMCreateMatrix(da, &J); >> MatSetPreallocationCOO(J, ...); >> >> I recently added the call to DMSetMatrixPreallocateSkip, hoping the allocation would be delayed to MatSetPreallocationCOO, and that it would require less memory. The documentation says that the data structures will not be preallocated. The following data from heaptrack shows that the allocation is still happening in the call to DMCreateMatrix. >> >> [cid:bda9ef12-a46f-47b2-9b9b-a4b2808b6b13] >> >> Can someone help me understand this? >> >> Thanks, >> >> Philip Fackler >> Research Software Engineer, Application Engineering Group >> Advanced Computing Systems Research Section >> Computer Science and Mathematics Division >> Oak Ridge National Laboratory -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jed at jedbrown.org Mon Dec 18 13:47:06 2023 From: jed at jedbrown.org (Jed Brown) Date: Mon, 18 Dec 2023 12:47:06 -0700 Subject: [petsc-users] [EXTERNAL] Re: Call to DMSetMatrixPreallocateSkip not changing allocation behavior In-Reply-To: References: <871qboech6.fsf@jedbrown.org> <87wmtgcw4m.fsf@jedbrown.org> Message-ID: <874jgfb8ed.fsf@jedbrown.org> Great, thanks for letting us know. It'll merge to release shortly and thus be in petsc >= 3.20.3. "Fackler, Philip" writes: > Jed, > > That seems to have worked (ridiculously well). It's now 55MB, and it's happening in the call to MatSetPreallocationCOO. > > Thank you, > > Philip Fackler > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > Oak Ridge National Laboratory > ________________________________ > From: Jed Brown > Sent: Thursday, December 14, 2023 16:27 > To: Fackler, Philip ; petsc-users at mcs.anl.gov ; xolotl-psi-development at lists.sourceforge.net > Subject: [EXTERNAL] Re: [petsc-users] Call to DMSetMatrixPreallocateSkip not changing allocation behavior > > I had a one-character typo in the diff above. This MR to release should work now. > > https://urldefense.us/v2/url?u=https-3A__gitlab.com_petsc_petsc_-2D_merge-5Frequests_7120&d=DwIBAg&c=v4IIwRuZAmwupIjowmMWUmLasxPEgYsgNI-O7C4ViYc&r=DAkLCjn8leYU-uJ-kfNEQMhPZWx9lzc4d5KgIR-RZWQ&m=v9sHqomCGBRWotign4NcwYwOpszOJehUGs_EO3eGn4SSZqxnfK7Iv15-X8nO1lii&s=h_jIP-6WcIjR6LssfGrV6Z2DojlN_w7Me4-a4rBE074&e= > > Jed Brown writes: > >> 17 GB for a 1D DMDA, wow. :-) >> >> Could you try applying this diff to make it work for DMDA (it's currently handled by DMPlex)? >> >> diff --git i/src/dm/impls/da/fdda.c w/src/dm/impls/da/fdda.c >> index cad4d926504..bd2a3bda635 100644 >> --- i/src/dm/impls/da/fdda.c >> +++ w/src/dm/impls/da/fdda.c >> @@ -675,19 +675,21 @@ PetscErrorCode DMCreateMatrix_DA(DM da, Mat *J) >> specialized setting routines depend only on the particular preallocation >> details of the matrix, not the type itself. 
>> */ >> - PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPIAIJSetPreallocation_C", &aij)); >> - if (!aij) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqAIJSetPreallocation_C", &aij)); >> - if (!aij) { >> - PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPIBAIJSetPreallocation_C", &baij)); >> - if (!baij) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqBAIJSetPreallocation_C", &baij)); >> - if (!baij) { >> - PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPISBAIJSetPreallocation_C", &sbaij)); >> - if (!sbaij) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqSBAIJSetPreallocation_C", &sbaij)); >> - if (!sbaij) { >> - PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPISELLSetPreallocation_C", &sell)); >> - if (!sell) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqSELLSetPreallocation_C", &sell)); >> + if (!dm->prealloc_skip) { // Flag is likely set when user intends to use MatSetPreallocationCOO() >> + PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPIAIJSetPreallocation_C", &aij)); >> + if (!aij) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqAIJSetPreallocation_C", &aij)); >> + if (!aij) { >> + PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPIBAIJSetPreallocation_C", &baij)); >> + if (!baij) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqBAIJSetPreallocation_C", &baij)); >> + if (!baij) { >> + PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPISBAIJSetPreallocation_C", &sbaij)); >> + if (!sbaij) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqSBAIJSetPreallocation_C", &sbaij)); >> + if (!sbaij) { >> + PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPISELLSetPreallocation_C", &sell)); >> + if (!sell) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqSELLSetPreallocation_C", &sell)); >> + } >> + if (!sell) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatISSetPreallocation_C", &is)); >> } >> - if (!sell) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatISSetPreallocation_C", &is)); >> } >> } >> if (aij) { >> >> >> "Fackler, Philip via petsc-users" writes: >> >>> I'm using the following sequence of functions related to the Jacobian matrix: >>> >>> DMDACreate1d(..., &da); >>> DMSetFromOptions(da); >>> DMSetUp(da); >>> DMSetMatType(da, MATAIJKOKKOS); >>> DMSetMatrixPreallocateSkip(da, PETSC_TRUE); >>> Mat J; >>> DMCreateMatrix(da, &J); >>> MatSetPreallocationCOO(J, ...); >>> >>> I recently added the call to DMSetMatrixPreallocateSkip, hoping the allocation would be delayed to MatSetPreallocationCOO, and that it would require less memory. The documentation says that the data structures will not be preallocated. The following data from heaptrack shows that the allocation is still happening in the call to DMCreateMatrix. >>> >>> [cid:bda9ef12-a46f-47b2-9b9b-a4b2808b6b13] >>> >>> Can someone help me understand this? >>> >>> Thanks, >>> >>> Philip Fackler >>> Research Software Engineer, Application Engineering Group >>> Advanced Computing Systems Research Section >>> Computer Science and Mathematics Division >>> Oak Ridge National Laboratory From joauma.marichal at uclouvain.be Tue Dec 19 04:10:56 2023 From: joauma.marichal at uclouvain.be (Joauma Marichal) Date: Tue, 19 Dec 2023 10:10:56 +0000 Subject: [petsc-users] [petsc-maint] DMSwarm on multiple processors In-Reply-To: References: Message-ID: Hello, I have used Address Sanitizer to check any memory errors. On my computer, no errors are found. 
Unfortunately, on the supercomputer that I am using, I get lots of errors? I attach my log files (running on 1 and 70 procs). Do you have any idea of what I could do? Thanks a lot for your help. Best regards, Joauma De : Matthew Knepley Date : lundi, 18 d?cembre 2023 ? 12:00 ? : Joauma Marichal Cc : petsc-maint at mcs.anl.gov , petsc-users at mcs.anl.gov Objet : Re: [petsc-maint] DMSwarm on multiple processors On Mon, Dec 18, 2023 at 5:09?AM Joauma Marichal > wrote: Hello, Sorry for the delay. I attach the file that I obtain when running the code with the debug mode. Okay, we can now see where this is happening: malloc_consolidate(): invalid chunk size [cns263:3265170] *** Process received signal *** [cns263:3265170] Signal: Aborted (6) [cns263:3265170] Signal code: (-6) [cns263:3265170] [ 0] /lib64/libc.so.6(+0x4eb20)[0x7f3bd9148b20] [cns263:3265170] [ 1] /lib64/libc.so.6(gsignal+0x10f)[0x7f3bd9148a9f] [cns263:3265170] [ 2] /lib64/libc.so.6(abort+0x127)[0x7f3bd911be05] [cns263:3265170] [ 3] /lib64/libc.so.6(+0x91037)[0x7f3bd918b037] [cns263:3265170] [ 4] /lib64/libc.so.6(+0x9819c)[0x7f3bd919219c] [cns263:3265170] [ 5] /lib64/libc.so.6(+0x98b68)[0x7f3bd9192b68] [cns263:3265170] [ 6] /lib64/libc.so.6(+0x9af18)[0x7f3bd9194f18] [cns263:3265170] [ 7] /lib64/libc.so.6(__libc_malloc+0x1e2)[0x7f3bd9196822] [cns263:3265170] [ 8] /lib64/libc.so.6(posix_memalign+0x3c)[0x7f3bd91980fc] [cns263:3265170] [ 9] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscMallocAlign+0x45)[0x7f3bda5f1625] [cns263:3265170] [10] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscMallocA+0x297)[0x7f3bda5f1b07] [cns263:3265170] [11] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMCreate+0x5b)[0x7f3bdaa73c1b] [cns263:3265170] [12] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMDACreate+0x9)[0x7f3bdab0a2f9] [cns263:3265170] [13] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMDACreate3d+0x9a)[0x7f3bdab07dea] [cns263:3265170] [14] ./cobpor[0x402de8] [cns263:3265170] [15] /lib64/libc.so.6(__libc_start_main+0xf3)[0x7f3bd9134cf3] [cns263:3265170] [16] ./cobpor[0x40304e] [cns263:3265170] *** End of error message *** However, this is not great. First, the amount of memory being allocated is quite small, and this does not appear to be an Out of Memory error. Second, the error occurs in libc: malloc_consolidate(): invalid chunk size which means something is wrong internally. I agree with this analysis (https://stackoverflow.com/questions/18760999/sample-example-program-to-get-the-malloc-consolidate-error) that says you have probably overwritten memory somewhere in your code. I recommend running under valgrind, or using Address Sanitizer from clang. Thanks, Matt Thanks for your help. Best regards, Joauma De : Matthew Knepley > Date : jeudi, 23 novembre 2023 ? 15:32 ? : Joauma Marichal > Cc : petsc-maint at mcs.anl.gov >, petsc-users at mcs.anl.gov > Objet : Re: [petsc-maint] DMSwarm on multiple processors On Thu, Nov 23, 2023 at 9:01?AM Joauma Marichal > wrote: Hello, My problem persists? Is there anything I could try? Yes. It appears to be failing from a call inside PetscSFSetUpRanks(). It does allocation, and the failure is in libc, and it only happens on larger examples, so I suspect some allocation problem. Can you rebuild with debugging and run this example? Then we can see if the allocation fails. Thanks, Matt Thanks a lot. Best regards, Joauma De : Matthew Knepley > Date : mercredi, 25 octobre 2023 ? 
14:45 ? : Joauma Marichal > Cc : petsc-maint at mcs.anl.gov >, petsc-users at mcs.anl.gov > Objet : Re: [petsc-maint] DMSwarm on multiple processors On Wed, Oct 25, 2023 at 8:32?AM Joauma Marichal via petsc-maint > wrote: Hello, I am using the DMSwarm library in some Eulerian-Lagrangian approach to have vapor bubbles in water. I have obtained nice results recently and wanted to perform bigger simulations. Unfortunately, when I increase the number of processors used to run the simulation, I get the following error: free(): invalid size [cns136:590327] *** Process received signal *** [cns136:590327] Signal: Aborted (6) [cns136:590327] Signal code: (-6) [cns136:590327] [ 0] /lib64/libc.so.6(+0x4eb20)[0x7f56cd4c9b20] [cns136:590327] [ 1] /lib64/libc.so.6(gsignal+0x10f)[0x7f56cd4c9a9f] [cns136:590327] [ 2] /lib64/libc.so.6(abort+0x127)[0x7f56cd49ce05] [cns136:590327] [ 3] /lib64/libc.so.6(+0x91037)[0x7f56cd50c037] [cns136:590327] [ 4] /lib64/libc.so.6(+0x9819c)[0x7f56cd51319c] [cns136:590327] [ 5] /lib64/libc.so.6(+0x99aac)[0x7f56cd514aac] [cns136:590327] [ 6] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscSFSetUpRanks+0x4c4)[0x7f56cea71e64] [cns136:590327] [ 7] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(+0x841642)[0x7f56cea83642] [cns136:590327] [ 8] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscSFSetUp+0x9e)[0x7f56cea7043e] [cns136:590327] [ 9] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(VecScatterCreate+0x164e)[0x7f56cea7bbde] [cns136:590327] [10] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMSetUp_DA_3D+0x3e38)[0x7f56cee84dd8] [cns136:590327] [11] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMSetUp_DA+0xd8)[0x7f56cee9b448] [cns136:590327] [12] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMSetUp+0x20)[0x7f56cededa20] [cns136:590327] [13] ./cobpor[0x4418dc] [cns136:590327] [14] ./cobpor[0x408b63] [cns136:590327] [15] /lib64/libc.so.6(__libc_start_main+0xf3)[0x7f56cd4b5cf3] [cns136:590327] [16] ./cobpor[0x40bdee] [cns136:590327] *** End of error message *** -------------------------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code. Per user-direction, the job has been aborted. -------------------------------------------------------------------------- -------------------------------------------------------------------------- mpiexec noticed that process rank 84 with PID 590327 on node cns136 exited on signal 6 (Aborted). -------------------------------------------------------------------------- When I reduce the number of processors the error disappears and when I run my code without the vapor bubbles it also works. The problem seems to take place at this moment: DMCreate(PETSC_COMM_WORLD,swarm); DMSetType(*swarm,DMSWARM); DMSetDimension(*swarm,3); DMSwarmSetType(*swarm,DMSWARM_PIC); DMSwarmSetCellDM(*swarm,*dmcell); Thanks a lot for your help. Things that would help us track this down: 1) The smallest example where it fails 2) The smallest number of processes where it fails 3) A stack trace of the failure 4) A simple example that we can run that also fails Thanks, Matt Best regards, Joauma -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: log_1proc Type: application/octet-stream Size: 14935 bytes Desc: log_1proc URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: log_70proc Type: application/octet-stream Size: 174297 bytes Desc: log_70proc URL: From knepley at gmail.com Tue Dec 19 07:29:58 2023 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 19 Dec 2023 08:29:58 -0500 Subject: [petsc-users] [petsc-maint] DMSwarm on multiple processors In-Reply-To: References: Message-ID: On Tue, Dec 19, 2023 at 5:11?AM Joauma Marichal < joauma.marichal at uclouvain.be> wrote: > Hello, > > > > I have used Address Sanitizer to check any memory errors. On my computer, > no errors are found. Unfortunately, on the supercomputer that I am using, I > get lots of errors? I attach my log files (running on 1 and 70 procs). > > Do you have any idea of what I could do? > Run the same parallel configuration as you do on the supercomputer. If that is fine, I would suggest Address Sanitizer there. Something is corrupting the stack, and it appears that it is connected to that machine, rather than the library. Do you have access to a second parallel machine? Thanks, Matt > Thanks a lot for your help. > > > > Best regards, > > > > Joauma > > > > *De : *Matthew Knepley > *Date : *lundi, 18 d?cembre 2023 ? 12:00 > *? : *Joauma Marichal > *Cc : *petsc-maint at mcs.anl.gov , > petsc-users at mcs.anl.gov > *Objet : *Re: [petsc-maint] DMSwarm on multiple processors > > On Mon, Dec 18, 2023 at 5:09?AM Joauma Marichal < > joauma.marichal at uclouvain.be> wrote: > > Hello, > > > > Sorry for the delay. I attach the file that I obtain when running the code > with the debug mode. 
> > > > Okay, we can now see where this is happening: > > > > malloc_consolidate(): invalid chunk size > [cns263:3265170] *** Process received signal *** > [cns263:3265170] Signal: Aborted (6) > [cns263:3265170] Signal code: (-6) > [cns263:3265170] [ 0] /lib64/libc.so.6(+0x4eb20)[0x7f3bd9148b20] > [cns263:3265170] [ 1] /lib64/libc.so.6(gsignal+0x10f)[0x7f3bd9148a9f] > [cns263:3265170] [ 2] /lib64/libc.so.6(abort+0x127)[0x7f3bd911be05] > [cns263:3265170] [ 3] /lib64/libc.so.6(+0x91037)[0x7f3bd918b037] > [cns263:3265170] [ 4] /lib64/libc.so.6(+0x9819c)[0x7f3bd919219c] > [cns263:3265170] [ 5] /lib64/libc.so.6(+0x98b68)[0x7f3bd9192b68] > [cns263:3265170] [ 6] /lib64/libc.so.6(+0x9af18)[0x7f3bd9194f18] > [cns263:3265170] [ 7] /lib64/libc.so.6(__libc_malloc+0x1e2)[0x7f3bd9196822] > [cns263:3265170] [ 8] /lib64/libc.so.6(posix_memalign+0x3c)[0x7f3bd91980fc] > [cns263:3265170] [ 9] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscMallocAlign+0x45)[0x7f3bda5f1625] > [cns263:3265170] [10] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscMallocA+0x297)[0x7f3bda5f1b07] > [cns263:3265170] [11] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMCreate+0x5b)[0x7f3bdaa73c1b] > [cns263:3265170] [12] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMDACreate+0x9)[0x7f3bdab0a2f9] > [cns263:3265170] [13] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMDACreate3d+0x9a)[0x7f3bdab07dea] > [cns263:3265170] [14] ./cobpor[0x402de8] > [cns263:3265170] [15] > /lib64/libc.so.6(__libc_start_main+0xf3)[0x7f3bd9134cf3] > [cns263:3265170] [16] ./cobpor[0x40304e] > [cns263:3265170] *** End of error message *** > > > > However, this is not great. First, the amount of memory being allocated is > quite small, and this does not appear to be an Out of Memory error. Second, > the error occurs in libc: > > > > malloc_consolidate(): invalid chunk size > > > > which means something is wrong internally. I agree with this analysis ( > https://stackoverflow.com/questions/18760999/sample-example-program-to-get-the-malloc-consolidate-error) > that says you have probably overwritten memory somewhere in your code. I > recommend running under valgrind, or using Address Sanitizer from clang. > > > > Thanks, > > > > Matt > > > > Thanks for your help. > > > > Best regards, > > > > Joauma > > > > *De : *Matthew Knepley > *Date : *jeudi, 23 novembre 2023 ? 15:32 > *? : *Joauma Marichal > *Cc : *petsc-maint at mcs.anl.gov , > petsc-users at mcs.anl.gov > *Objet : *Re: [petsc-maint] DMSwarm on multiple processors > > On Thu, Nov 23, 2023 at 9:01?AM Joauma Marichal < > joauma.marichal at uclouvain.be> wrote: > > Hello, > > > > My problem persists? Is there anything I could try? > > > > Yes. It appears to be failing from a call inside PetscSFSetUpRanks(). It > does allocation, and the failure > > is in libc, and it only happens on larger examples, so I suspect some > allocation problem. Can you rebuild with debugging and run this example? > Then we can see if the allocation fails. > > > > Thanks, > > Matt > > > > Thanks a lot. > > > > Best regards, > > > > Joauma > > > > *De : *Matthew Knepley > *Date : *mercredi, 25 octobre 2023 ? 14:45 > *? 
: *Joauma Marichal > *Cc : *petsc-maint at mcs.anl.gov , > petsc-users at mcs.anl.gov > *Objet : *Re: [petsc-maint] DMSwarm on multiple processors > > On Wed, Oct 25, 2023 at 8:32?AM Joauma Marichal via petsc-maint < > petsc-maint at mcs.anl.gov> wrote: > > Hello, > > > > I am using the DMSwarm library in some Eulerian-Lagrangian approach to > have vapor bubbles in water. > > I have obtained nice results recently and wanted to perform bigger > simulations. Unfortunately, when I increase the number of processors used > to run the simulation, I get the following error: > > > > free(): invalid size > > [cns136:590327] *** Process received signal *** > > [cns136:590327] Signal: Aborted (6) > > [cns136:590327] Signal code: (-6) > > [cns136:590327] [ 0] /lib64/libc.so.6(+0x4eb20)[0x7f56cd4c9b20] > > [cns136:590327] [ 1] /lib64/libc.so.6(gsignal+0x10f)[0x7f56cd4c9a9f] > > [cns136:590327] [ 2] /lib64/libc.so.6(abort+0x127)[0x7f56cd49ce05] > > [cns136:590327] [ 3] /lib64/libc.so.6(+0x91037)[0x7f56cd50c037] > > [cns136:590327] [ 4] /lib64/libc.so.6(+0x9819c)[0x7f56cd51319c] > > [cns136:590327] [ 5] /lib64/libc.so.6(+0x99aac)[0x7f56cd514aac] > > [cns136:590327] [ 6] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscSFSetUpRanks+0x4c4)[0x7f56cea71e64] > > [cns136:590327] [ 7] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(+0x841642)[0x7f56cea83642] > > [cns136:590327] [ 8] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscSFSetUp+0x9e)[0x7f56cea7043e] > > [cns136:590327] [ 9] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(VecScatterCreate+0x164e)[0x7f56cea7bbde] > > [cns136:590327] [10] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMSetUp_DA_3D+0x3e38)[0x7f56cee84dd8] > > [cns136:590327] [11] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMSetUp_DA+0xd8)[0x7f56cee9b448] > > [cns136:590327] [12] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMSetUp+0x20)[0x7f56cededa20] > > [cns136:590327] [13] ./cobpor[0x4418dc] > > [cns136:590327] [14] ./cobpor[0x408b63] > > [cns136:590327] [15] > /lib64/libc.so.6(__libc_start_main+0xf3)[0x7f56cd4b5cf3] > > [cns136:590327] [16] ./cobpor[0x40bdee] > > [cns136:590327] *** End of error message *** > > -------------------------------------------------------------------------- > > Primary job terminated normally, but 1 process returned > > a non-zero exit code. Per user-direction, the job has been aborted. > > -------------------------------------------------------------------------- > > -------------------------------------------------------------------------- > > mpiexec noticed that process rank 84 with PID 590327 on node cns136 exited > on signal 6 (Aborted). > > -------------------------------------------------------------------------- > > > > When I reduce the number of processors the error disappears and when I run > my code without the vapor bubbles it also works. > > The problem seems to take place at this moment: > > > > DMCreate(PETSC_COMM_WORLD,swarm); > > DMSetType(*swarm,DMSWARM); > > DMSetDimension(*swarm,3); > > DMSwarmSetType(*swarm,DMSWARM_PIC); > > DMSwarmSetCellDM(*swarm,*dmcell); > > > > > > Thanks a lot for your help. 
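For reference, a minimal error-checked version of the swarm setup quoted just above could look like the following sketch in C. It assumes dmcell is a DMDA that has already been created and passed through DMSetUp(); the "radius" field, the buffer size, and the helper name CreateBubbleSwarm are illustrative only and not taken from the original code.

    #include <petscdmda.h>
    #include <petscdmswarm.h>

    /* Sketch: swarm setup with error checking. Assumes "dmcell" is an
       already created and set-up DMDA for the background Eulerian grid. */
    static PetscErrorCode CreateBubbleSwarm(DM dmcell, DM *swarm)
    {
      PetscFunctionBeginUser;
      PetscCall(DMCreate(PETSC_COMM_WORLD, swarm));
      PetscCall(DMSetType(*swarm, DMSWARM));
      PetscCall(DMSetDimension(*swarm, 3));
      PetscCall(DMSwarmSetType(*swarm, DMSWARM_PIC));
      PetscCall(DMSwarmSetCellDM(*swarm, dmcell));
      /* Illustrative field registration: one scalar per particle. */
      PetscCall(DMSwarmRegisterPetscDatatypeField(*swarm, "radius", 1, PETSC_REAL));
      PetscCall(DMSwarmFinalizeFieldRegister(*swarm));
      PetscCall(DMSwarmSetLocalSizes(*swarm, 0, 4)); /* start empty, small resize buffer */
      PetscFunctionReturn(0);
    }

Wrapping each call in PetscCall() does not by itself fix memory corruption, but it ensures any PETSc-detected failure is reported with a full traceback at the offending call instead of surfacing later as a bare libc abort like the ones in the logs.
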
> > > > Things that would help us track this down: > > > > 1) The smallest example where it fails > > > > 2) The smallest number of processes where it fails > > > > 3) A stack trace of the failure > > > > 4) A simple example that we can run that also fails > > > > Thanks, > > > > Matt > > > > Best regards, > > > > Joauma > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From sawsan.shatanawi at wsu.edu Tue Dec 19 19:28:52 2023 From: sawsan.shatanawi at wsu.edu (Shatanawi, Sawsan Muhammad) Date: Wed, 20 Dec 2023 01:28:52 +0000 Subject: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code Message-ID: Hello everyone, I hope this email finds you well. My Name is Sawsan Shatanawi, and I am currently working on developing a Fortran code for simulating groundwater flow in a 3D system. The code involves solving a nonlinear system, and I have created the matrix to be solved using the PCG solver and Picard iteration. However, when I tried to assign it as a PETSc matrix I started getting a lot of error messages. I am kindly asking if someone can help me, I would be happy to share my code with him/her. Please find the attached file contains a list of errors I have gotten Thank you in advance for your time and assistance. Best regards, Sawsan -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: out.txt URL: From srvenkat at utexas.edu Wed Dec 20 01:42:27 2023 From: srvenkat at utexas.edu (Sreeram R Venkat) Date: Wed, 20 Dec 2023 13:12:27 +0530 Subject: [petsc-users] Matvecs and KSPSolves with multiple vectors In-Reply-To: <7921468C-F4B9-4BC0-ADD1-2F84619E8A7E@joliv.et> References: <855B7267-E3F9-43CC-9EBE-96276E9A8E42@petsc.dev> <9F227CDE-815C-4446-A424-0BBDA38B1E79@joliv.et> <186F2922-CD1B-4778-BDB3-8DA6EAB8D976@joliv.et> <7921468C-F4B9-4BC0-ADD1-2F84619E8A7E@joliv.et> Message-ID: Ok, I think the error I'm getting has something to do with how the multiple solves are being done in succession. I'll try to see if there's anything I'm doing wrong there. One question about the -pc_type lu -ksp_type preonly method: do you know which parts of the solve (factorization/triangular solves) are done on host and which are done on device? Thanks, Sreeram On Sat, Dec 16, 2023 at 10:56?PM Pierre Jolivet wrote: > Unfortunately, I am not able to reproduce such a failure with your input > matrix. > I?ve used ex79 that I linked previously and the system is properly solved. 
> $ ./ex79 -pc_type hypre -ksp_type hpddm -ksp_hpddm_type cg > -ksp_converged_reason -ksp_view_mat ascii::ascii_info -ksp_view_rhs > ascii::ascii_info > Linear solve converged due to CONVERGED_RTOL iterations 6 > Mat Object: 1 MPI process > type: seqaijcusparse > rows=289, cols=289 > total: nonzeros=2401, allocated nonzeros=2401 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > Mat Object: 1 MPI process > type: seqdensecuda > rows=289, cols=10 > total: nonzeros=2890, allocated nonzeros=2890 > total number of mallocs used during MatSetValues calls=0 > > You mentioned in a subsequent email that you are interested in systems > with at most 1E6 unknowns, and up to 1E4 right-hand sides. > I?m not sure you can expect significant gains from using GPU for such > systems. > Probably, the fastest approach would indeed be -pc_type lu -ksp_type > preonly -ksp_matsolve_batch_size 100 or something, depending on the memory > available on your host. > > Thanks, > Pierre > > On 15 Dec 2023, at 9:52?PM, Sreeram R Venkat wrote: > > Here are the ksp_view files. I set the options > -ksp_error_if_not_converged to try to get the vectors that caused the > error. I noticed that some of the KSPMatSolves converge while others don't. > In the code, the solves are called as: > > input vector v --> insert data of v into a dense mat --> KSPMatSolve() --> > MatMatMult() --> KSPMatSolve() --> insert data of dense mat into output > vector w -- output w > > The operator used in the KSP is a Laplacian-like operator, and the > MatMatMult is with a Mass Matrix. The whole thing is supposed to be a solve > with a biharmonic-like operator. I can also run it with only the first > KSPMatSolve (i.e. just a Laplacian-like operator). In that case, the KSP > reportedly converges after 0 iterations (see the next line), but this > causes problems in other parts of the code later on. > > I saw that sometimes the first KSPMatSolve "converges" after 0 iterations > due to CONVERGED_RTOL. Then, the second KSPMatSolve produces a NaN/Inf. I > tried setting ksp_min_it, but that didn't seem to do anything. > > I'll keep trying different options and also try to get the MWE made (this > KSPMatSolve is pretty performance critical for us). > > Thanks for all your help, > Sreeram > > On Fri, Dec 15, 2023 at 1:01?AM Pierre Jolivet wrote: > >> >> On 14 Dec 2023, at 11:45?PM, Sreeram R Venkat >> wrote: >> >> Thanks, I will try to create a minimal reproducible example. This may >> take me some time though, as I need to figure out how to extract only the >> relevant parts (the full program this solve is used in is getting quite >> complex). >> >> >> You could just do -ksp_view_mat binary:Amat.bin -ksp_view_pmat >> binary:Pmat.bin -ksp_view_rhs binary:rhs.bin and send me those three files >> (I?m guessing your are using double-precision scalars with 32-bit PetscInt). >> >> I'll also try out some of the BoomerAMG options to see if that helps. >> >> >> These should work (this is where all ?PCMatApply()-ready? PC are being >> tested): >> https://petsc.org/release/src/ksp/ksp/tutorials/ex79.c.html#line215 >> You can see it?s also testing PCHYPRE + KSPHPDDM on device (but not with >> HIP). >> I?m aware the performance should not be optimal (see your comment about >> host/device copies), I?ve money to hire someone to work on this but: a) I >> need to find the correct engineer/post-doc, b) I currently don?t have good >> use cases (of course, I could generate a synthetic benchmark, for science). 
>> So even if you send me the three Mat, a MWE would be appreciated if the >> KSPMatSolve() is performance-critical for you (see point b) from above). >> >> Thanks, >> Pierre >> >> Thanks, >> Sreeram >> >> On Thu, Dec 14, 2023, 1:12?PM Pierre Jolivet wrote: >> >>> >>> >>> On 14 Dec 2023, at 8:02?PM, Sreeram R Venkat >>> wrote: >>> >>> Hello Pierre, >>> >>> Thank you for your reply. I tried out the HPDDM CG as you said, and it >>> seems to be doing the batched solves, but the KSP is not converging due to >>> a NaN or Inf being generated. I also noticed there are a lot of >>> host-to-device and device-to-host copies of the matrices (the non-batched >>> KSP solve did not have any memcopies). I have attached dump.0 again. Could >>> you please take a look? >>> >>> >>> Yes, but you?d need to send me something I can run with your set of >>> options (if you are more confident doing this in private, you can remove >>> the list from c/c). >>> Not all BoomerAMG smoothers handle blocks of right-hand sides, and there >>> is not much error checking, so instead of erroring out, this may be the >>> reason why you are getting garbage. >>> >>> Thanks, >>> Pierre >>> >>> Thanks, >>> Sreeram >>> >>> On Thu, Dec 14, 2023 at 12:42?AM Pierre Jolivet wrote: >>> >>>> Hello Sreeram, >>>> KSPCG (PETSc implementation of CG) does not handle solves with multiple >>>> columns at once. >>>> There is only a single native PETSc KSP implementation which handles >>>> solves with multiple columns at once: KSPPREONLY. >>>> If you use --download-hpddm, you can use a CG (or GMRES, or more >>>> advanced methods) implementation which handles solves with multiple columns >>>> at once (via -ksp_type hpddm -ksp_hpddm_type cg or KSPSetType(ksp, >>>> KSPHPDDM); KSPHPDDMSetType(ksp, KSP_HPDDM_TYPE_CG);). >>>> I?m the main author of HPDDM, there is preliminary support for device >>>> matrices, but if it?s not working as intended/not faster than column by >>>> column, I?d be happy to have a deeper look (maybe in private), because most >>>> (if not all) of my users interested in (pseudo-)block Krylov solvers (i.e., >>>> solvers that treat right-hand sides in a single go) are using plain host >>>> matrices. >>>> >>>> Thanks, >>>> Pierre >>>> >>>> PS: you could have a look at >>>> https://www.sciencedirect.com/science/article/abs/pii/S0898122121000055 to >>>> understand the philosophy behind block iterative methods in PETSc (and in >>>> HPDDM), src/mat/tests/ex237.c, the benchmark I mentioned earlier, was >>>> developed in the context of this paper to produce Figures 2-3. Note that >>>> this paper is now slightly outdated, since then, PCHYPRE and PCMG (among >>>> others) have been made ?PCMatApply()-ready?. >>>> >>>> On 13 Dec 2023, at 11:05?PM, Sreeram R Venkat >>>> wrote: >>>> >>>> Hello Pierre, >>>> >>>> I am trying out the KSPMatSolve with the BoomerAMG preconditioner. >>>> However, I am noticing that it is still solving column by column (this is >>>> stated explicitly in the info dump attached). I looked at the code for >>>> KSPMatSolve_Private() and saw that as long as ksp->ops->matsolve is >>>> true, it should do the batched solve, though I'm not sure where that gets >>>> set. >>>> >>>> I am using the options -pc_type hypre -pc_hypre_type boomeramg when >>>> running the code. >>>> >>>> Can you please help me with this? >>>> >>>> Thanks, >>>> Sreeram >>>> >>>> >>>> On Thu, Dec 7, 2023 at 4:04?PM Mark Adams wrote: >>>> >>>>> N.B., AMGX interface is a bit experimental. 
>>>>> Mark >>>>> >>>>> On Thu, Dec 7, 2023 at 4:11?PM Sreeram R Venkat >>>>> wrote: >>>>> >>>>>> Oh, in that case I will try out BoomerAMG. Getting AMGX to build >>>>>> correctly was also tricky so hopefully the HYPRE build will be easier. >>>>>> >>>>>> Thanks, >>>>>> Sreeram >>>>>> >>>>>> On Thu, Dec 7, 2023, 3:03 PM Pierre Jolivet wrote: >>>>>> >>>>>>> >>>>>>> >>>>>>> On 7 Dec 2023, at 9:37?PM, Sreeram R Venkat >>>>>>> wrote: >>>>>>> >>>>>>> Thank you Barry and Pierre; I will proceed with the first option. >>>>>>> >>>>>>> I want to use the AMGX preconditioner for the KSP. I will try it out >>>>>>> and see how it performs. >>>>>>> >>>>>>> >>>>>>> Just FYI, AMGX does not handle systems with multiple RHS, and thus >>>>>>> has no PCMatApply() implementation. >>>>>>> BoomerAMG does, and there is a PCMatApply_HYPRE_BoomerAMG() >>>>>>> implementation. >>>>>>> But let us know if you need assistance figuring things out. >>>>>>> >>>>>>> Thanks, >>>>>>> Pierre >>>>>>> >>>>>>> Thanks, >>>>>>> Sreeram >>>>>>> >>>>>>> On Thu, Dec 7, 2023 at 2:02?PM Pierre Jolivet >>>>>>> wrote: >>>>>>> >>>>>>>> To expand on Barry?s answer, we have observed repeatedly that >>>>>>>> MatMatMult with MatAIJ performs better than MatMult with MatMAIJ, you can >>>>>>>> reproduce this on your own with >>>>>>>> https://petsc.org/release/src/mat/tests/ex237.c.html. >>>>>>>> Also, I?m guessing you are using some sort of preconditioner within >>>>>>>> your KSP. >>>>>>>> Not all are ?KSPMatSolve-ready?, i.e., they may treat blocks of >>>>>>>> right-hand sides column by column, which is very inefficient. >>>>>>>> You could run your code with -info dump and send us dump.0 to see >>>>>>>> what needs to be done on our end to make things more efficient, should you >>>>>>>> not be satisfied with the current performance of the code. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Pierre >>>>>>>> >>>>>>>> On 7 Dec 2023, at 8:34?PM, Barry Smith wrote: >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Dec 7, 2023, at 1:17?PM, Sreeram R Venkat >>>>>>>> wrote: >>>>>>>> >>>>>>>> I have 2 sequential matrices M and R (both MATSEQAIJCUSPARSE of >>>>>>>> size n x n) and a vector v of size n*m. v = [v_1 , v_2 ,... , v_m] where >>>>>>>> v_i has size n. The data for v can be stored either in column-major or >>>>>>>> row-major order. Now, I want to do 2 types of operations: >>>>>>>> >>>>>>>> 1. Matvecs of the form M*v_i = w_i, for i = 1..m. >>>>>>>> 2. KSPSolves of the form R*x_i = v_i, for i = 1..m. >>>>>>>> >>>>>>>> From what I have read on the documentation, I can think of 2 >>>>>>>> approaches. >>>>>>>> >>>>>>>> 1. Get the pointer to the data in v (column-major) and use it to >>>>>>>> create a dense matrix V. Then do a MatMatMult with M*V = W, and take the >>>>>>>> data pointer of W to create the vector w. For KSPSolves, use KSPMatSolve >>>>>>>> with R and V. >>>>>>>> >>>>>>>> 2. Create a MATMAIJ using M/R and use that for matvecs directly >>>>>>>> with the vector v. I don't know if KSPSolve with the MATMAIJ will know that >>>>>>>> it is a multiple RHS system and act accordingly. >>>>>>>> >>>>>>>> Which would be the more efficient option? >>>>>>>> >>>>>>>> >>>>>>>> Use 1. >>>>>>>> >>>>>>>> >>>>>>>> As a side-note, I am also wondering if there is a way to use >>>>>>>> row-major storage of the vector v. >>>>>>>> >>>>>>>> >>>>>>>> No >>>>>>>> >>>>>>>> The reason is that this could allow for more coalesced memory >>>>>>>> access when doing matvecs. 
>>>>>>>> >>>>>>>> >>>>>>>> PETSc matrix-vector products use BLAS GMEV matrix-vector products >>>>>>>> for the computation so in theory they should already be well-optimized >>>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Sreeram >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> >>>> >>>> >>>> >>> >>> >>> >> > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre at joliv.et Wed Dec 20 01:51:21 2023 From: pierre at joliv.et (Pierre Jolivet) Date: Wed, 20 Dec 2023 08:51:21 +0100 Subject: [petsc-users] Matvecs and KSPSolves with multiple vectors In-Reply-To: References: <855B7267-E3F9-43CC-9EBE-96276E9A8E42@petsc.dev> <9F227CDE-815C-4446-A424-0BBDA38B1E79@joliv.et> <186F2922-CD1B-4778-BDB3-8DA6EAB8D976@joliv.et> <7921468C-F4B9-4BC0-ADD1-2F84619E8A7E@joliv.et> Message-ID: > On 20 Dec 2023, at 8:42?AM, Sreeram R Venkat wrote: > > Ok, I think the error I'm getting has something to do with how the multiple solves are being done in succession. I'll try to see if there's anything I'm doing wrong there. > > One question about the -pc_type lu -ksp_type preonly method: do you know which parts of the solve (factorization/triangular solves) are done on host and which are done on device? I think only the triangular solves can be done on device. Since you have many right-hand sides, it may not be that bad. GPU people will hopefully give you a more insightful answer. Thanks, Pierre > Thanks, > Sreeram > > On Sat, Dec 16, 2023 at 10:56?PM Pierre Jolivet > wrote: >> Unfortunately, I am not able to reproduce such a failure with your input matrix. >> I?ve used ex79 that I linked previously and the system is properly solved. >> $ ./ex79 -pc_type hypre -ksp_type hpddm -ksp_hpddm_type cg -ksp_converged_reason -ksp_view_mat ascii::ascii_info -ksp_view_rhs ascii::ascii_info >> Linear solve converged due to CONVERGED_RTOL iterations 6 >> Mat Object: 1 MPI process >> type: seqaijcusparse >> rows=289, cols=289 >> total: nonzeros=2401, allocated nonzeros=2401 >> total number of mallocs used during MatSetValues calls=0 >> not using I-node routines >> Mat Object: 1 MPI process >> type: seqdensecuda >> rows=289, cols=10 >> total: nonzeros=2890, allocated nonzeros=2890 >> total number of mallocs used during MatSetValues calls=0 >> >> You mentioned in a subsequent email that you are interested in systems with at most 1E6 unknowns, and up to 1E4 right-hand sides. >> I?m not sure you can expect significant gains from using GPU for such systems. >> Probably, the fastest approach would indeed be -pc_type lu -ksp_type preonly -ksp_matsolve_batch_size 100 or something, depending on the memory available on your host. >> >> Thanks, >> Pierre >> >>> On 15 Dec 2023, at 9:52?PM, Sreeram R Venkat > wrote: >>> >>> Here are the ksp_view files. I set the options -ksp_error_if_not_converged to try to get the vectors that caused the error. I noticed that some of the KSPMatSolves converge while others don't. In the code, the solves are called as: >>> >>> input vector v --> insert data of v into a dense mat --> KSPMatSolve() --> MatMatMult() --> KSPMatSolve() --> insert data of dense mat into output vector w -- output w >>> >>> The operator used in the KSP is a Laplacian-like operator, and the MatMatMult is with a Mass Matrix. The whole thing is supposed to be a solve with a biharmonic-like operator. I can also run it with only the first KSPMatSolve (i.e. just a Laplacian-like operator). 
In that case, the KSP reportedly converges after 0 iterations (see the next line), but this causes problems in other parts of the code later on. >>> >>> I saw that sometimes the first KSPMatSolve "converges" after 0 iterations due to CONVERGED_RTOL. Then, the second KSPMatSolve produces a NaN/Inf. I tried setting ksp_min_it, but that didn't seem to do anything. >>> >>> I'll keep trying different options and also try to get the MWE made (this KSPMatSolve is pretty performance critical for us). >>> >>> Thanks for all your help, >>> Sreeram >>> >>> On Fri, Dec 15, 2023 at 1:01?AM Pierre Jolivet > wrote: >>>> >>>>> On 14 Dec 2023, at 11:45?PM, Sreeram R Venkat > wrote: >>>>> >>>>> Thanks, I will try to create a minimal reproducible example. This may take me some time though, as I need to figure out how to extract only the relevant parts (the full program this solve is used in is getting quite complex). >>>> >>>> You could just do -ksp_view_mat binary:Amat.bin -ksp_view_pmat binary:Pmat.bin -ksp_view_rhs binary:rhs.bin and send me those three files (I?m guessing your are using double-precision scalars with 32-bit PetscInt). >>>> >>>>> I'll also try out some of the BoomerAMG options to see if that helps. >>>> >>>> These should work (this is where all ?PCMatApply()-ready? PC are being tested): https://petsc.org/release/src/ksp/ksp/tutorials/ex79.c.html#line215 >>>> You can see it?s also testing PCHYPRE + KSPHPDDM on device (but not with HIP). >>>> I?m aware the performance should not be optimal (see your comment about host/device copies), I?ve money to hire someone to work on this but: a) I need to find the correct engineer/post-doc, b) I currently don?t have good use cases (of course, I could generate a synthetic benchmark, for science). >>>> So even if you send me the three Mat, a MWE would be appreciated if the KSPMatSolve() is performance-critical for you (see point b) from above). >>>> >>>> Thanks, >>>> Pierre >>>> >>>>> Thanks, >>>>> Sreeram >>>>> >>>>> On Thu, Dec 14, 2023, 1:12?PM Pierre Jolivet > wrote: >>>>>> >>>>>> >>>>>>> On 14 Dec 2023, at 8:02?PM, Sreeram R Venkat > wrote: >>>>>>> >>>>>>> Hello Pierre, >>>>>>> >>>>>>> Thank you for your reply. I tried out the HPDDM CG as you said, and it seems to be doing the batched solves, but the KSP is not converging due to a NaN or Inf being generated. I also noticed there are a lot of host-to-device and device-to-host copies of the matrices (the non-batched KSP solve did not have any memcopies). I have attached dump.0 again. Could you please take a look? >>>>>> >>>>>> Yes, but you?d need to send me something I can run with your set of options (if you are more confident doing this in private, you can remove the list from c/c). >>>>>> Not all BoomerAMG smoothers handle blocks of right-hand sides, and there is not much error checking, so instead of erroring out, this may be the reason why you are getting garbage. >>>>>> >>>>>> Thanks, >>>>>> Pierre >>>>>> >>>>>>> Thanks, >>>>>>> Sreeram >>>>>>> >>>>>>> On Thu, Dec 14, 2023 at 12:42?AM Pierre Jolivet > wrote: >>>>>>>> Hello Sreeram, >>>>>>>> KSPCG (PETSc implementation of CG) does not handle solves with multiple columns at once. >>>>>>>> There is only a single native PETSc KSP implementation which handles solves with multiple columns at once: KSPPREONLY. 
>>>>>>>> If you use --download-hpddm, you can use a CG (or GMRES, or more advanced methods) implementation which handles solves with multiple columns at once (via -ksp_type hpddm -ksp_hpddm_type cg or KSPSetType(ksp, KSPHPDDM); KSPHPDDMSetType(ksp, KSP_HPDDM_TYPE_CG);). >>>>>>>> I?m the main author of HPDDM, there is preliminary support for device matrices, but if it?s not working as intended/not faster than column by column, I?d be happy to have a deeper look (maybe in private), because most (if not all) of my users interested in (pseudo-)block Krylov solvers (i.e., solvers that treat right-hand sides in a single go) are using plain host matrices. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Pierre >>>>>>>> >>>>>>>> PS: you could have a look at https://www.sciencedirect.com/science/article/abs/pii/S0898122121000055 to understand the philosophy behind block iterative methods in PETSc (and in HPDDM), src/mat/tests/ex237.c, the benchmark I mentioned earlier, was developed in the context of this paper to produce Figures 2-3. Note that this paper is now slightly outdated, since then, PCHYPRE and PCMG (among others) have been made ?PCMatApply()-ready?. >>>>>>>> >>>>>>>>> On 13 Dec 2023, at 11:05?PM, Sreeram R Venkat > wrote: >>>>>>>>> >>>>>>>>> Hello Pierre, >>>>>>>>> >>>>>>>>> I am trying out the KSPMatSolve with the BoomerAMG preconditioner. However, I am noticing that it is still solving column by column (this is stated explicitly in the info dump attached). I looked at the code for KSPMatSolve_Private() and saw that as long as ksp->ops->matsolve is true, it should do the batched solve, though I'm not sure where that gets set. >>>>>>>>> >>>>>>>>> I am using the options -pc_type hypre -pc_hypre_type boomeramg when running the code. >>>>>>>>> >>>>>>>>> Can you please help me with this? >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Sreeram >>>>>>>>> >>>>>>>>> >>>>>>>>> On Thu, Dec 7, 2023 at 4:04?PM Mark Adams > wrote: >>>>>>>>>> N.B., AMGX interface is a bit experimental. >>>>>>>>>> Mark >>>>>>>>>> >>>>>>>>>> On Thu, Dec 7, 2023 at 4:11?PM Sreeram R Venkat > wrote: >>>>>>>>>>> Oh, in that case I will try out BoomerAMG. Getting AMGX to build correctly was also tricky so hopefully the HYPRE build will be easier. >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Sreeram >>>>>>>>>>> >>>>>>>>>>> On Thu, Dec 7, 2023, 3:03 PM Pierre Jolivet > wrote: >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> On 7 Dec 2023, at 9:37?PM, Sreeram R Venkat > wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> Thank you Barry and Pierre; I will proceed with the first option. >>>>>>>>>>>>> >>>>>>>>>>>>> I want to use the AMGX preconditioner for the KSP. I will try it out and see how it performs. >>>>>>>>>>>> >>>>>>>>>>>> Just FYI, AMGX does not handle systems with multiple RHS, and thus has no PCMatApply() implementation. >>>>>>>>>>>> BoomerAMG does, and there is a PCMatApply_HYPRE_BoomerAMG() implementation. >>>>>>>>>>>> But let us know if you need assistance figuring things out. >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Pierre >>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> Sreeram >>>>>>>>>>>>> >>>>>>>>>>>>> On Thu, Dec 7, 2023 at 2:02?PM Pierre Jolivet > wrote: >>>>>>>>>>>>>> To expand on Barry?s answer, we have observed repeatedly that MatMatMult with MatAIJ performs better than MatMult with MatMAIJ, you can reproduce this on your own with https://petsc.org/release/src/mat/tests/ex237.c.html. >>>>>>>>>>>>>> Also, I?m guessing you are using some sort of preconditioner within your KSP. 
>>>>>>>>>>>>>> Not all are ?KSPMatSolve-ready?, i.e., they may treat blocks of right-hand sides column by column, which is very inefficient. >>>>>>>>>>>>>> You could run your code with -info dump and send us dump.0 to see what needs to be done on our end to make things more efficient, should you not be satisfied with the current performance of the code. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> Pierre >>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 7 Dec 2023, at 8:34?PM, Barry Smith > wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Dec 7, 2023, at 1:17?PM, Sreeram R Venkat > wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I have 2 sequential matrices M and R (both MATSEQAIJCUSPARSE of size n x n) and a vector v of size n*m. v = [v_1 , v_2 ,... , v_m] where v_i has size n. The data for v can be stored either in column-major or row-major order. Now, I want to do 2 types of operations: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 1. Matvecs of the form M*v_i = w_i, for i = 1..m. >>>>>>>>>>>>>>>> 2. KSPSolves of the form R*x_i = v_i, for i = 1..m. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> From what I have read on the documentation, I can think of 2 approaches. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 1. Get the pointer to the data in v (column-major) and use it to create a dense matrix V. Then do a MatMatMult with M*V = W, and take the data pointer of W to create the vector w. For KSPSolves, use KSPMatSolve with R and V. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 2. Create a MATMAIJ using M/R and use that for matvecs directly with the vector v. I don't know if KSPSolve with the MATMAIJ will know that it is a multiple RHS system and act accordingly. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Which would be the more efficient option? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Use 1. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> As a side-note, I am also wondering if there is a way to use row-major storage of the vector v. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> No >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> The reason is that this could allow for more coalesced memory access when doing matvecs. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> PETSc matrix-vector products use BLAS GMEV matrix-vector products for the computation so in theory they should already be well-optimized >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> Sreeram >>>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>> >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From joauma.marichal at uclouvain.be Wed Dec 20 03:12:43 2023 From: joauma.marichal at uclouvain.be (Joauma Marichal) Date: Wed, 20 Dec 2023 09:12:43 +0000 Subject: [petsc-users] [petsc-maint] DMSwarm on multiple processors In-Reply-To: References: Message-ID: Hello, I used Address Sanitizer on my laptop and I have no leaks. I do have access to another machine (managed by the same people as the previous one) but I obtain similar errors? Thanks again for your help. Best regards, Joauma De : Matthew Knepley Date : mardi, 19 d?cembre 2023 ? 14:30 ? : Joauma Marichal Cc : petsc-maint at mcs.anl.gov , petsc-users at mcs.anl.gov Objet : Re: [petsc-maint] DMSwarm on multiple processors On Tue, Dec 19, 2023 at 5:11?AM Joauma Marichal > wrote: Hello, I have used Address Sanitizer to check any memory errors. On my computer, no errors are found. Unfortunately, on the supercomputer that I am using, I get lots of errors? I attach my log files (running on 1 and 70 procs). Do you have any idea of what I could do? Run the same parallel configuration as you do on the supercomputer. 
If that is fine, I would suggest Address Sanitizer there. Something is corrupting the stack, and it appears that it is connected to that machine, rather than the library. Do you have access to a second parallel machine? Thanks, Matt Thanks a lot for your help. Best regards, Joauma De : Matthew Knepley > Date : lundi, 18 d?cembre 2023 ? 12:00 ? : Joauma Marichal > Cc : petsc-maint at mcs.anl.gov >, petsc-users at mcs.anl.gov > Objet : Re: [petsc-maint] DMSwarm on multiple processors On Mon, Dec 18, 2023 at 5:09?AM Joauma Marichal > wrote: Hello, Sorry for the delay. I attach the file that I obtain when running the code with the debug mode. Okay, we can now see where this is happening: malloc_consolidate(): invalid chunk size [cns263:3265170] *** Process received signal *** [cns263:3265170] Signal: Aborted (6) [cns263:3265170] Signal code: (-6) [cns263:3265170] [ 0] /lib64/libc.so.6(+0x4eb20)[0x7f3bd9148b20] [cns263:3265170] [ 1] /lib64/libc.so.6(gsignal+0x10f)[0x7f3bd9148a9f] [cns263:3265170] [ 2] /lib64/libc.so.6(abort+0x127)[0x7f3bd911be05] [cns263:3265170] [ 3] /lib64/libc.so.6(+0x91037)[0x7f3bd918b037] [cns263:3265170] [ 4] /lib64/libc.so.6(+0x9819c)[0x7f3bd919219c] [cns263:3265170] [ 5] /lib64/libc.so.6(+0x98b68)[0x7f3bd9192b68] [cns263:3265170] [ 6] /lib64/libc.so.6(+0x9af18)[0x7f3bd9194f18] [cns263:3265170] [ 7] /lib64/libc.so.6(__libc_malloc+0x1e2)[0x7f3bd9196822] [cns263:3265170] [ 8] /lib64/libc.so.6(posix_memalign+0x3c)[0x7f3bd91980fc] [cns263:3265170] [ 9] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscMallocAlign+0x45)[0x7f3bda5f1625] [cns263:3265170] [10] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscMallocA+0x297)[0x7f3bda5f1b07] [cns263:3265170] [11] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMCreate+0x5b)[0x7f3bdaa73c1b] [cns263:3265170] [12] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMDACreate+0x9)[0x7f3bdab0a2f9] [cns263:3265170] [13] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMDACreate3d+0x9a)[0x7f3bdab07dea] [cns263:3265170] [14] ./cobpor[0x402de8] [cns263:3265170] [15] /lib64/libc.so.6(__libc_start_main+0xf3)[0x7f3bd9134cf3] [cns263:3265170] [16] ./cobpor[0x40304e] [cns263:3265170] *** End of error message *** However, this is not great. First, the amount of memory being allocated is quite small, and this does not appear to be an Out of Memory error. Second, the error occurs in libc: malloc_consolidate(): invalid chunk size which means something is wrong internally. I agree with this analysis (https://stackoverflow.com/questions/18760999/sample-example-program-to-get-the-malloc-consolidate-error) that says you have probably overwritten memory somewhere in your code. I recommend running under valgrind, or using Address Sanitizer from clang. Thanks, Matt Thanks for your help. Best regards, Joauma De : Matthew Knepley > Date : jeudi, 23 novembre 2023 ? 15:32 ? : Joauma Marichal > Cc : petsc-maint at mcs.anl.gov >, petsc-users at mcs.anl.gov > Objet : Re: [petsc-maint] DMSwarm on multiple processors On Thu, Nov 23, 2023 at 9:01?AM Joauma Marichal > wrote: Hello, My problem persists? Is there anything I could try? Yes. It appears to be failing from a call inside PetscSFSetUpRanks(). It does allocation, and the failure is in libc, and it only happens on larger examples, so I suspect some allocation problem. Can you rebuild with debugging and run this example? Then we can see if the allocation fails. Thanks, Matt Thanks a lot. 
Best regards, Joauma De : Matthew Knepley > Date : mercredi, 25 octobre 2023 ? 14:45 ? : Joauma Marichal > Cc : petsc-maint at mcs.anl.gov >, petsc-users at mcs.anl.gov > Objet : Re: [petsc-maint] DMSwarm on multiple processors On Wed, Oct 25, 2023 at 8:32?AM Joauma Marichal via petsc-maint > wrote: Hello, I am using the DMSwarm library in some Eulerian-Lagrangian approach to have vapor bubbles in water. I have obtained nice results recently and wanted to perform bigger simulations. Unfortunately, when I increase the number of processors used to run the simulation, I get the following error: free(): invalid size [cns136:590327] *** Process received signal *** [cns136:590327] Signal: Aborted (6) [cns136:590327] Signal code: (-6) [cns136:590327] [ 0] /lib64/libc.so.6(+0x4eb20)[0x7f56cd4c9b20] [cns136:590327] [ 1] /lib64/libc.so.6(gsignal+0x10f)[0x7f56cd4c9a9f] [cns136:590327] [ 2] /lib64/libc.so.6(abort+0x127)[0x7f56cd49ce05] [cns136:590327] [ 3] /lib64/libc.so.6(+0x91037)[0x7f56cd50c037] [cns136:590327] [ 4] /lib64/libc.so.6(+0x9819c)[0x7f56cd51319c] [cns136:590327] [ 5] /lib64/libc.so.6(+0x99aac)[0x7f56cd514aac] [cns136:590327] [ 6] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscSFSetUpRanks+0x4c4)[0x7f56cea71e64] [cns136:590327] [ 7] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(+0x841642)[0x7f56cea83642] [cns136:590327] [ 8] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscSFSetUp+0x9e)[0x7f56cea7043e] [cns136:590327] [ 9] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(VecScatterCreate+0x164e)[0x7f56cea7bbde] [cns136:590327] [10] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMSetUp_DA_3D+0x3e38)[0x7f56cee84dd8] [cns136:590327] [11] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMSetUp_DA+0xd8)[0x7f56cee9b448] [cns136:590327] [12] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMSetUp+0x20)[0x7f56cededa20] [cns136:590327] [13] ./cobpor[0x4418dc] [cns136:590327] [14] ./cobpor[0x408b63] [cns136:590327] [15] /lib64/libc.so.6(__libc_start_main+0xf3)[0x7f56cd4b5cf3] [cns136:590327] [16] ./cobpor[0x40bdee] [cns136:590327] *** End of error message *** -------------------------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code. Per user-direction, the job has been aborted. -------------------------------------------------------------------------- -------------------------------------------------------------------------- mpiexec noticed that process rank 84 with PID 590327 on node cns136 exited on signal 6 (Aborted). -------------------------------------------------------------------------- When I reduce the number of processors the error disappears and when I run my code without the vapor bubbles it also works. The problem seems to take place at this moment: DMCreate(PETSC_COMM_WORLD,swarm); DMSetType(*swarm,DMSWARM); DMSetDimension(*swarm,3); DMSwarmSetType(*swarm,DMSWARM_PIC); DMSwarmSetCellDM(*swarm,*dmcell); Thanks a lot for your help. Things that would help us track this down: 1) The smallest example where it fails 2) The smallest number of processes where it fails 3) A stack trace of the failure 4) A simple example that we can run that also fails Thanks, Matt Best regards, Joauma -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: log_1proc_SC2 Type: application/octet-stream Size: 89034 bytes Desc: log_1proc_SC2 URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: log_70proc_SC2 Type: application/octet-stream Size: 172731 bytes Desc: log_70proc_SC2 URL: From mfadams at lbl.gov Wed Dec 20 04:48:07 2023 From: mfadams at lbl.gov (Mark Adams) Date: Wed, 20 Dec 2023 05:48:07 -0500 Subject: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code In-Reply-To: References: Message-ID: I am guessing that you are creating a matrix, adding to it, finalizing it ("assembly"), and then adding to it again, which is fine, but you are adding new non-zeros to the sparsity pattern. If this is what you want then you can tell the matrix to let you do that. Otherwise you have a bug. Mark On Tue, Dec 19, 2023 at 9:50?PM Shatanawi, Sawsan Muhammad via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hello everyone, > > I hope this email finds you well. > > My Name is Sawsan Shatanawi, and I am currently working on developing a > Fortran code for simulating groundwater flow in a 3D system. The code > involves solving a nonlinear system, and I have created the matrix to be > solved using the PCG solver and Picard iteration. However, when I tried > to assign it as a PETSc matrix I started getting a lot of error messages. > > I am kindly asking if someone can help me, I would be happy to share my > code with him/her. > > Please find the attached file contains a list of errors I have gotten > > Thank you in advance for your time and assistance. > > Best regards, > > Sawsan > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Dec 20 06:58:43 2023 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 20 Dec 2023 07:58:43 -0500 Subject: [petsc-users] [petsc-maint] DMSwarm on multiple processors In-Reply-To: References: Message-ID: On Wed, Dec 20, 2023 at 4:12?AM Joauma Marichal < joauma.marichal at uclouvain.be> wrote: > Hello, > > > > I used Address Sanitizer on my laptop and I have no leaks. > > I do have access to another machine (managed by the same people as the > previous one) but I obtain similar errors? > Let me understand: 1) You have run the exact same problem on two different parallel machines, and gotten the same error, meaning on the second machine, it printed malloc_consolidate(): invalid chunk size Is this true? 2) You have run the exact same problem on the same number of processes on your own machine under Address Sanitizer with no errors? Thanks, Matt > Thanks again for your help. 
> > > > Best regards, > > > > Joauma > > > > *De : *Matthew Knepley > *Date : *mardi, 19 d?cembre 2023 ? 14:30 > *? : *Joauma Marichal > *Cc : *petsc-maint at mcs.anl.gov , > petsc-users at mcs.anl.gov > *Objet : *Re: [petsc-maint] DMSwarm on multiple processors > > On Tue, Dec 19, 2023 at 5:11?AM Joauma Marichal < > joauma.marichal at uclouvain.be> wrote: > > Hello, > > > > I have used Address Sanitizer to check any memory errors. On my computer, > no errors are found. Unfortunately, on the supercomputer that I am using, I > get lots of errors? I attach my log files (running on 1 and 70 procs). > > Do you have any idea of what I could do? > > > > Run the same parallel configuration as you do on the supercomputer. If > that is fine, I would suggest Address Sanitizer there. Something is > corrupting the stack, and it appears that it is connected to that machine, > rather than the library. Do you have access to a second parallel machine? > > > > Thanks, > > > > Matt > > > > Thanks a lot for your help. > > > > Best regards, > > > > Joauma > > > > *De : *Matthew Knepley > *Date : *lundi, 18 d?cembre 2023 ? 12:00 > *? : *Joauma Marichal > *Cc : *petsc-maint at mcs.anl.gov , > petsc-users at mcs.anl.gov > *Objet : *Re: [petsc-maint] DMSwarm on multiple processors > > On Mon, Dec 18, 2023 at 5:09?AM Joauma Marichal < > joauma.marichal at uclouvain.be> wrote: > > Hello, > > > > Sorry for the delay. I attach the file that I obtain when running the code > with the debug mode. > > > > Okay, we can now see where this is happening: > > > > malloc_consolidate(): invalid chunk size > [cns263:3265170] *** Process received signal *** > [cns263:3265170] Signal: Aborted (6) > [cns263:3265170] Signal code: (-6) > [cns263:3265170] [ 0] /lib64/libc.so.6(+0x4eb20)[0x7f3bd9148b20] > [cns263:3265170] [ 1] /lib64/libc.so.6(gsignal+0x10f)[0x7f3bd9148a9f] > [cns263:3265170] [ 2] /lib64/libc.so.6(abort+0x127)[0x7f3bd911be05] > [cns263:3265170] [ 3] /lib64/libc.so.6(+0x91037)[0x7f3bd918b037] > [cns263:3265170] [ 4] /lib64/libc.so.6(+0x9819c)[0x7f3bd919219c] > [cns263:3265170] [ 5] /lib64/libc.so.6(+0x98b68)[0x7f3bd9192b68] > [cns263:3265170] [ 6] /lib64/libc.so.6(+0x9af18)[0x7f3bd9194f18] > [cns263:3265170] [ 7] /lib64/libc.so.6(__libc_malloc+0x1e2)[0x7f3bd9196822] > [cns263:3265170] [ 8] /lib64/libc.so.6(posix_memalign+0x3c)[0x7f3bd91980fc] > [cns263:3265170] [ 9] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscMallocAlign+0x45)[0x7f3bda5f1625] > [cns263:3265170] [10] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscMallocA+0x297)[0x7f3bda5f1b07] > [cns263:3265170] [11] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMCreate+0x5b)[0x7f3bdaa73c1b] > [cns263:3265170] [12] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMDACreate+0x9)[0x7f3bdab0a2f9] > [cns263:3265170] [13] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMDACreate3d+0x9a)[0x7f3bdab07dea] > [cns263:3265170] [14] ./cobpor[0x402de8] > [cns263:3265170] [15] > /lib64/libc.so.6(__libc_start_main+0xf3)[0x7f3bd9134cf3] > [cns263:3265170] [16] ./cobpor[0x40304e] > [cns263:3265170] *** End of error message *** > > > > However, this is not great. First, the amount of memory being allocated is > quite small, and this does not appear to be an Out of Memory error. Second, > the error occurs in libc: > > > > malloc_consolidate(): invalid chunk size > > > > which means something is wrong internally. 
I agree with this analysis ( > https://stackoverflow.com/questions/18760999/sample-example-program-to-get-the-malloc-consolidate-error) > that says you have probably overwritten memory somewhere in your code. I > recommend running under valgrind, or using Address Sanitizer from clang. > > > > Thanks, > > > > Matt > > > > Thanks for your help. > > > > Best regards, > > > > Joauma > > > > *De : *Matthew Knepley > *Date : *jeudi, 23 novembre 2023 ? 15:32 > *? : *Joauma Marichal > *Cc : *petsc-maint at mcs.anl.gov , > petsc-users at mcs.anl.gov > *Objet : *Re: [petsc-maint] DMSwarm on multiple processors > > On Thu, Nov 23, 2023 at 9:01?AM Joauma Marichal < > joauma.marichal at uclouvain.be> wrote: > > Hello, > > > > My problem persists? Is there anything I could try? > > > > Yes. It appears to be failing from a call inside PetscSFSetUpRanks(). It > does allocation, and the failure > > is in libc, and it only happens on larger examples, so I suspect some > allocation problem. Can you rebuild with debugging and run this example? > Then we can see if the allocation fails. > > > > Thanks, > > Matt > > > > Thanks a lot. > > > > Best regards, > > > > Joauma > > > > *De : *Matthew Knepley > *Date : *mercredi, 25 octobre 2023 ? 14:45 > *? : *Joauma Marichal > *Cc : *petsc-maint at mcs.anl.gov , > petsc-users at mcs.anl.gov > *Objet : *Re: [petsc-maint] DMSwarm on multiple processors > > On Wed, Oct 25, 2023 at 8:32?AM Joauma Marichal via petsc-maint < > petsc-maint at mcs.anl.gov> wrote: > > Hello, > > > > I am using the DMSwarm library in some Eulerian-Lagrangian approach to > have vapor bubbles in water. > > I have obtained nice results recently and wanted to perform bigger > simulations. Unfortunately, when I increase the number of processors used > to run the simulation, I get the following error: > > > > free(): invalid size > > [cns136:590327] *** Process received signal *** > > [cns136:590327] Signal: Aborted (6) > > [cns136:590327] Signal code: (-6) > > [cns136:590327] [ 0] /lib64/libc.so.6(+0x4eb20)[0x7f56cd4c9b20] > > [cns136:590327] [ 1] /lib64/libc.so.6(gsignal+0x10f)[0x7f56cd4c9a9f] > > [cns136:590327] [ 2] /lib64/libc.so.6(abort+0x127)[0x7f56cd49ce05] > > [cns136:590327] [ 3] /lib64/libc.so.6(+0x91037)[0x7f56cd50c037] > > [cns136:590327] [ 4] /lib64/libc.so.6(+0x9819c)[0x7f56cd51319c] > > [cns136:590327] [ 5] /lib64/libc.so.6(+0x99aac)[0x7f56cd514aac] > > [cns136:590327] [ 6] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscSFSetUpRanks+0x4c4)[0x7f56cea71e64] > > [cns136:590327] [ 7] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(+0x841642)[0x7f56cea83642] > > [cns136:590327] [ 8] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscSFSetUp+0x9e)[0x7f56cea7043e] > > [cns136:590327] [ 9] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(VecScatterCreate+0x164e)[0x7f56cea7bbde] > > [cns136:590327] [10] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMSetUp_DA_3D+0x3e38)[0x7f56cee84dd8] > > [cns136:590327] [11] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMSetUp_DA+0xd8)[0x7f56cee9b448] > > [cns136:590327] [12] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMSetUp+0x20)[0x7f56cededa20] > > [cns136:590327] [13] ./cobpor[0x4418dc] > > [cns136:590327] [14] ./cobpor[0x408b63] > > [cns136:590327] [15] > /lib64/libc.so.6(__libc_start_main+0xf3)[0x7f56cd4b5cf3] > > [cns136:590327] [16] ./cobpor[0x40bdee] > > 
[cns136:590327] *** End of error message *** > > -------------------------------------------------------------------------- > > Primary job terminated normally, but 1 process returned > > a non-zero exit code. Per user-direction, the job has been aborted. > > -------------------------------------------------------------------------- > > -------------------------------------------------------------------------- > > mpiexec noticed that process rank 84 with PID 590327 on node cns136 exited > on signal 6 (Aborted). > > -------------------------------------------------------------------------- > > > > When I reduce the number of processors the error disappears and when I run > my code without the vapor bubbles it also works. > > The problem seems to take place at this moment: > > > > DMCreate(PETSC_COMM_WORLD,swarm); > > DMSetType(*swarm,DMSWARM); > > DMSetDimension(*swarm,3); > > DMSwarmSetType(*swarm,DMSWARM_PIC); > > DMSwarmSetCellDM(*swarm,*dmcell); > > > > > > Thanks a lot for your help. > > > > Things that would help us track this down: > > > > 1) The smallest example where it fails > > > > 2) The smallest number of processes where it fails > > > > 3) A stack trace of the failure > > > > 4) A simple example that we can run that also fails > > > > Thanks, > > > > Matt > > > > Best regards, > > > > Joauma > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Dec 20 07:58:01 2023 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 20 Dec 2023 08:58:01 -0500 Subject: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code In-Reply-To: References: Message-ID: On Tue, Dec 19, 2023 at 9:50?PM Shatanawi, Sawsan Muhammad via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hello everyone, > > I hope this email finds you well. > > My Name is Sawsan Shatanawi, and I am currently working on developing a > Fortran code for simulating groundwater flow in a 3D system. The code > involves solving a nonlinear system, and I have created the matrix to be > solved using the PCG solver and Picard iteration. However, when I tried > to assign it as a PETSc matrix I started getting a lot of error messages. > > I am kindly asking if someone can help me, I would be happy to share my > code with him/her. 
> > Please find the attached file contains a list of errors I have gotten > This error indicates that your preallocation is not sufficient for the values you want to insert. Now in PETSc you can just remove your preallocation, and PETSc will automatically allocate correctly. Thanks, Matt > Thank you in advance for your time and assistance. > > Best regards, > > Sawsan > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From sawsan.shatanawi at wsu.edu Wed Dec 20 08:36:35 2023 From: sawsan.shatanawi at wsu.edu (Shatanawi, Sawsan Muhammad) Date: Wed, 20 Dec 2023 14:36:35 +0000 Subject: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code In-Reply-To: References: Message-ID: Hello, I am trying to create a sparse matrix( which is as I believe a zero matrix) then adding some nonzero elements to it over a loop, then assembling it Get Outlook for iOS ________________________________ From: Mark Adams Sent: Wednesday, December 20, 2023 2:48 AM To: Shatanawi, Sawsan Muhammad Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code [EXTERNAL EMAIL] I am guessing that you are creating a matrix, adding to it, finalizing it ("assembly"), and then adding to it again, which is fine, but you are adding new non-zeros to the sparsity pattern. If this is what you want then you can tell the matrix to let you do that. Otherwise you have a bug. Mark On Tue, Dec 19, 2023 at 9:50?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: Hello everyone, I hope this email finds you well. My Name is Sawsan Shatanawi, and I am currently working on developing a Fortran code for simulating groundwater flow in a 3D system. The code involves solving a nonlinear system, and I have created the matrix to be solved using the PCG solver and Picard iteration. However, when I tried to assign it as a PETSc matrix I started getting a lot of error messages. I am kindly asking if someone can help me, I would be happy to share my code with him/her. Please find the attached file contains a list of errors I have gotten Thank you in advance for your time and assistance. Best regards, Sawsan -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Wed Dec 20 08:44:47 2023 From: mfadams at lbl.gov (Mark Adams) Date: Wed, 20 Dec 2023 09:44:47 -0500 Subject: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code In-Reply-To: References: Message-ID: Did you set preallocation values when you created the matrix? Don't do that. 
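A minimal Fortran sketch of the assembly pattern being recommended here (the size n, the diagonal fill, and the loop are illustrative assumptions, not taken from the attached code): let MatSetUp() handle the allocation, insert the entries in a loop, assemble once, and only relax the check at the end if entries at genuinely new locations have to go in after that first assembly.

      program assembly_sketch
#include <petsc/finclude/petscmat.h>
      use petscmat
      implicit none
      Mat            :: A
      PetscInt       :: n, i
      PetscScalar    :: val
      PetscErrorCode :: ierr

      call PetscInitialize(PETSC_NULL_CHARACTER, ierr)
      n = 10                                   ! illustrative global size
      call MatCreate(PETSC_COMM_WORLD, A, ierr)
      call MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n, ierr)
      call MatSetType(A, MATAIJ, ierr)
      call MatSetFromOptions(A, ierr)
      call MatSetUp(A, ierr)                   ! no manual preallocation

      do i = 0, n-1                            ! global indices are 0-based
         val = 2.0
         call MatSetValue(A, i, i, val, INSERT_VALUES, ierr)
      end do
      call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr)
      call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr)

      ! only if entries at brand-new locations must go in after this first
      ! assembly; depending on the exact error message, the check to relax
      ! is MAT_NEW_NONZERO_ALLOCATION_ERR or MAT_NEW_NONZERO_LOCATION_ERR
      call MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE, ierr)

      call MatDestroy(A, ierr)
      call PetscFinalize(ierr)
      end program assembly_sketch

With this pattern PETSc sizes the storage itself, which is what the preallocation advice above amounts to.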
On Wed, Dec 20, 2023 at 9:36?AM Shatanawi, Sawsan Muhammad < sawsan.shatanawi at wsu.edu> wrote: > Hello, > > I am trying to create a sparse matrix( which is as I believe a zero > matrix) then adding some nonzero elements to it over a loop, then > assembling it > > Get Outlook for iOS > ------------------------------ > *From:* Mark Adams > *Sent:* Wednesday, December 20, 2023 2:48 AM > *To:* Shatanawi, Sawsan Muhammad > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] Help with Integrating PETSc into Fortran > Groundwater Flow Simulation Code > > > *[EXTERNAL EMAIL]* > I am guessing that you are creating a matrix, adding to it, finalizing it > ("assembly"), and then adding to it again, which is fine, but you are > adding new non-zeros to the sparsity pattern. > If this is what you want then you can tell the matrix to let you do that. > Otherwise you have a bug. > > Mark > > On Tue, Dec 19, 2023 at 9:50?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: > >> Hello everyone, >> >> I hope this email finds you well. >> >> My Name is Sawsan Shatanawi, and I am currently working on developing a >> Fortran code for simulating groundwater flow in a 3D system. The code >> involves solving a nonlinear system, and I have created the matrix to be >> solved using the PCG solver and Picard iteration. However, when I tried >> to assign it as a PETSc matrix I started getting a lot of error messages. >> >> I am kindly asking if someone can help me, I would be happy to share my >> code with him/her. >> >> Please find the attached file contains a list of errors I have gotten >> >> Thank you in advance for your time and assistance. >> >> Best regards, >> >> Sawsan >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From y.hu at mpie.de Wed Dec 20 10:11:03 2023 From: y.hu at mpie.de (Yi Hu) Date: Wed, 20 Dec 2023 17:11:03 +0100 Subject: [petsc-users] fortran interface to snes matrix-free jacobian Message-ID: Dear PETSc team, My ?solution scheme relies on a matrix-free jacobian in the SNES solver. I saw the useful C interface like MatCreateSNESMF(), DMSNESCreateJacobianMF(). I am wondering if you have the fortran equivalence? I think for my problem in the main program I need to do DMDASNESsetJacobianLocal(DM, INSERT_VALUES, myJacobian, ctx, err_petsc). Then in myJacobian() subroutine I have to create the operator from DMSNESCreateJacobianMF(), and register my own MATOP_MULT from MatShellSetOperation(). Am I correct? Are these fortran subroutines available? I saw an example in ts module as ex22f_mf.F90 which behaves similar as what I would like to do. Because I would like to use ngmres, I then need to stay in the SNES. ? Thanks for your help. Best wishes, Yi ------------------------------------------------- Stay up to date and follow us on LinkedIn, Twitter and YouTube. Max-Planck-Institut f?r Eisenforschung GmbH Max-Planck-Stra?e 1 D-40237 D?sseldorf Handelsregister B 2533 Amtsgericht D?sseldorf Gesch?ftsf?hrung Prof. Dr. Gerhard Dehm Prof. Dr. J?rg Neugebauer Prof. Dr. Dierk Raabe Dr. Kai de Weldige Ust.-Id.-Nr.: DE 11 93 58 514 Steuernummer: 105 5891 1000 Please consider that invitations and e-mails of our institute are only valid if they end with ?@mpie.de. If you are not sure of the validity please contact rco at mpie.de Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. 
In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de ------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Wed Dec 20 10:40:27 2023 From: jed at jedbrown.org (Jed Brown) Date: Wed, 20 Dec 2023 09:40:27 -0700 Subject: [petsc-users] fortran interface to snes matrix-free jacobian In-Reply-To: References: Message-ID: <87h6kcakuc.fsf@jedbrown.org> Are you wanting an analytic matrix-free operator or one created for you based on finite differencing? If the latter, just use -snes_mf or -snes_mf_operator. https://petsc.org/release/manual/snes/#jacobian-evaluation Yi Hu writes: > Dear PETSc team, > > My ?solution scheme relies on a matrix-free jacobian in the SNES solver. I saw the useful C interface like MatCreateSNESMF(), DMSNESCreateJacobianMF(). I am wondering if you have the fortran equivalence? > > I think for my problem in the main program I need to do DMDASNESsetJacobianLocal(DM, INSERT_VALUES, myJacobian, ctx, err_petsc). Then in myJacobian() subroutine I have to create the operator from DMSNESCreateJacobianMF(), and register my own MATOP_MULT from MatShellSetOperation(). Am I correct? > > Are these fortran subroutines available? I saw an example in ts module as ex22f_mf.F90 which behaves similar as what I would like to do. Because I would like to use ngmres, I then need to stay in the SNES. ? > > Thanks for your help. > > Best wishes, > Yi > > ------------------------------------------------- > Stay up to date and follow us on LinkedIn, Twitter and YouTube. > > Max-Planck-Institut f?r Eisenforschung GmbH > Max-Planck-Stra?e 1 > D-40237 D?sseldorf > > Handelsregister B 2533 > Amtsgericht D?sseldorf > > Gesch?ftsf?hrung > Prof. Dr. Gerhard Dehm > Prof. Dr. J?rg Neugebauer > Prof. Dr. Dierk Raabe > Dr. Kai de Weldige > > Ust.-Id.-Nr.: DE 11 93 58 514 > Steuernummer: 105 5891 1000 > > > Please consider that invitations and e-mails of our institute are > only valid if they end with ?@mpie.de. > If you are not sure of the validity please contact rco at mpie.de > > Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails > aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. > In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de > ------------------------------------------------- From y.hu at mpie.de Wed Dec 20 10:44:24 2023 From: y.hu at mpie.de (Yi Hu) Date: Wed, 20 Dec 2023 17:44:24 +0100 Subject: [petsc-users] fortran interface to snes matrix-free jacobian In-Reply-To: <87h6kcakuc.fsf@jedbrown.org> References: <87h6kcakuc.fsf@jedbrown.org> Message-ID: Dear Jed, Thanks for your reply. I have an analytical one to implement. Best, Yi -----Original Message----- From: Jed Brown Sent: Wednesday, December 20, 2023 5:40 PM To: Yi Hu ; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] fortran interface to snes matrix-free jacobian Are you wanting an analytic matrix-free operator or one created for you based on finite differencing? If the latter, just use -snes_mf or -snes_mf_operator. https://petsc.org/release/manual/snes/#jacobian-evaluation Yi Hu writes: > Dear PETSc team, > > My ?solution scheme relies on a matrix-free jacobian in the SNES solver. I saw the useful C interface like MatCreateSNESMF(), DMSNESCreateJacobianMF(). I am wondering if you have the fortran equivalence? > > I think for my problem in the main program I need to do DMDASNESsetJacobianLocal(DM, INSERT_VALUES, myJacobian, ctx, err_petsc). 
Then in myJacobian() subroutine I have to create the operator from DMSNESCreateJacobianMF(), and register my own MATOP_MULT from MatShellSetOperation(). Am I correct? > > Are these fortran subroutines available? I saw an example in ts module > as ex22f_mf.F90 which behaves similar as what I would like to do. Because I would like to use ngmres, I then need to stay in the SNES. > > Thanks for your help. > > Best wishes, > Yi > > ------------------------------------------------- > Stay up to date and follow us on LinkedIn, Twitter and YouTube. > > Max-Planck-Institut f?r Eisenforschung GmbH Max-Planck-Stra?e 1 > D-40237 D?sseldorf > > Handelsregister B 2533 > Amtsgericht D?sseldorf > > Gesch?ftsf?hrung > Prof. Dr. Gerhard Dehm > Prof. Dr. J?rg Neugebauer > Prof. Dr. Dierk Raabe > Dr. Kai de Weldige > > Ust.-Id.-Nr.: DE 11 93 58 514 > Steuernummer: 105 5891 1000 > > > Please consider that invitations and e-mails of our institute are only > valid if they end with ?@mpie.de. > If you are not sure of the validity please contact rco at mpie.de > > Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails > aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. > In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de > ------------------------------------------------- ------------------------------------------------- Stay up to date and follow us on LinkedIn, Twitter and YouTube. Max-Planck-Institut f?r Eisenforschung GmbH Max-Planck-Stra?e 1 D-40237 D?sseldorf Handelsregister B 2533 Amtsgericht D?sseldorf Gesch?ftsf?hrung Prof. Dr. Gerhard Dehm Prof. Dr. J?rg Neugebauer Prof. Dr. Dierk Raabe Dr. Kai de Weldige Ust.-Id.-Nr.: DE 11 93 58 514 Steuernummer: 105 5891 1000 Please consider that invitations and e-mails of our institute are only valid if they end with ?@mpie.de. If you are not sure of the validity please contact rco at mpie.de Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de ------------------------------------------------- From jed at jedbrown.org Wed Dec 20 10:52:16 2023 From: jed at jedbrown.org (Jed Brown) Date: Wed, 20 Dec 2023 09:52:16 -0700 Subject: [petsc-users] fortran interface to snes matrix-free jacobian In-Reply-To: References: <87h6kcakuc.fsf@jedbrown.org> Message-ID: <87a5q4akan.fsf@jedbrown.org> Then just use MatShell. I see the docs need some work to clarify this, but MatCreateSNESMF is to specify matrix-free finite differencing from code (perhaps where one wants to customize parameters). Yi Hu writes: > Dear Jed, > > Thanks for your reply. I have an analytical one to implement. > > Best, Yi > > -----Original Message----- > From: Jed Brown > Sent: Wednesday, December 20, 2023 5:40 PM > To: Yi Hu ; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] fortran interface to snes matrix-free jacobian > > Are you wanting an analytic matrix-free operator or one created for you based on finite differencing? If the latter, just use -snes_mf or -snes_mf_operator. > > https://petsc.org/release/manual/snes/#jacobian-evaluation > > Yi Hu writes: > >> Dear PETSc team, >> >> My ?solution scheme relies on a matrix-free jacobian in the SNES solver. I saw the useful C interface like MatCreateSNESMF(), DMSNESCreateJacobianMF(). I am wondering if you have the fortran equivalence? >> >> I think for my problem in the main program I need to do DMDASNESsetJacobianLocal(DM, INSERT_VALUES, myJacobian, ctx, err_petsc). 
Then in myJacobian() subroutine I have to create the operator from DMSNESCreateJacobianMF(), and register my own MATOP_MULT from MatShellSetOperation(). Am I correct? >> >> Are these fortran subroutines available? I saw an example in ts module >> as ex22f_mf.F90 which behaves similar as what I would like to do. Because I would like to use ngmres, I then need to stay in the SNES. >> >> Thanks for your help. >> >> Best wishes, >> Yi >> >> ------------------------------------------------- >> Stay up to date and follow us on LinkedIn, Twitter and YouTube. >> >> Max-Planck-Institut f?r Eisenforschung GmbH Max-Planck-Stra?e 1 >> D-40237 D?sseldorf >> >> Handelsregister B 2533 >> Amtsgericht D?sseldorf >> >> Gesch?ftsf?hrung >> Prof. Dr. Gerhard Dehm >> Prof. Dr. J?rg Neugebauer >> Prof. Dr. Dierk Raabe >> Dr. Kai de Weldige >> >> Ust.-Id.-Nr.: DE 11 93 58 514 >> Steuernummer: 105 5891 1000 >> >> >> Please consider that invitations and e-mails of our institute are only >> valid if they end with ?@mpie.de. >> If you are not sure of the validity please contact rco at mpie.de >> >> Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails >> aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. >> In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de >> ------------------------------------------------- > > > ------------------------------------------------- > Stay up to date and follow us on LinkedIn, Twitter and YouTube. > > Max-Planck-Institut f?r Eisenforschung GmbH > Max-Planck-Stra?e 1 > D-40237 D?sseldorf > > Handelsregister B 2533 > Amtsgericht D?sseldorf > > Gesch?ftsf?hrung > Prof. Dr. Gerhard Dehm > Prof. Dr. J?rg Neugebauer > Prof. Dr. Dierk Raabe > Dr. Kai de Weldige > > Ust.-Id.-Nr.: DE 11 93 58 514 > Steuernummer: 105 5891 1000 > > > Please consider that invitations and e-mails of our institute are > only valid if they end with ?@mpie.de. > If you are not sure of the validity please contact rco at mpie.de > > Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails > aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. > In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de > ------------------------------------------------- From bsmith at petsc.dev Wed Dec 20 13:14:53 2023 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 20 Dec 2023 14:14:53 -0500 Subject: [petsc-users] fortran interface to snes matrix-free jacobian In-Reply-To: References: <87h6kcakuc.fsf@jedbrown.org> Message-ID: <45B09E03-4B23-4DE2-B4BC-7DD44629E0FD@petsc.dev> > On Dec 20, 2023, at 11:44?AM, Yi Hu wrote: > > Dear Jed, > > Thanks for your reply. I have an analytical one to implement. > > Best, Yi > > -----Original Message----- > From: Jed Brown > Sent: Wednesday, December 20, 2023 5:40 PM > To: Yi Hu ; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] fortran interface to snes matrix-free jacobian > > Are you wanting an analytic matrix-free operator or one created for you based on finite differencing? If the latter, just use -snes_mf or -snes_mf_operator. > > https://petsc.org/release/manual/snes/#jacobian-evaluation > > Yi Hu writes: > >> Dear PETSc team, >> >> My solution scheme relies on a matrix-free jacobian in the SNES solver. I saw the useful C interface like MatCreateSNESMF(), DMSNESCreateJacobianMF(). I am wondering if you have the fortran equivalence? You can use DMSNESCreateJacobianMF() (MatCreateSNESMF is not appropriate when you are providing the operation). 
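(Barry's follow-up further down corrects this to the MatCreateShell() route that Jed suggested.) A minimal Fortran sketch of that MatShell approach, in the spirit of the ts ex22f_mf.F90 example mentioned earlier; the subroutine names, the ndof argument, and the null context arguments are illustrative assumptions, and a real code would attach its own context and implement the analytic action.

      ! --- sketch only; all routines assumed to live in one source file ---
#include <petsc/finclude/petscsnes.h>

      subroutine SetupShellJacobian(snes, ndof, ierr)
      use petscsnes
      implicit none
      SNES           :: snes
      PetscInt       :: ndof           ! local number of unknowns
      PetscErrorCode :: ierr
      Mat            :: J
      external MyFormJacobian, MyJacobianMult

      call MatCreateShell(PETSC_COMM_WORLD, ndof, ndof,               &
                          PETSC_DETERMINE, PETSC_DETERMINE,           &
                          PETSC_NULL_INTEGER, J, ierr)
      call MatShellSetOperation(J, MATOP_MULT, MyJacobianMult, ierr)
      ! the shell matrix is passed as both Amat and Pmat
      call SNESSetJacobian(snes, J, J, MyFormJacobian,                &
                           PETSC_NULL_INTEGER, ierr)
      end subroutine SetupShellJacobian

      ! called by SNES at each new iterate; refresh whatever state the
      ! mult routine below needs (nothing is stored in this sketch)
      subroutine MyFormJacobian(snes, x, Amat, Pmat, dummy, ierr)
      use petscsnes
      implicit none
      SNES           :: snes
      Vec            :: x
      Mat            :: Amat, Pmat
      PetscInt       :: dummy
      PetscErrorCode :: ierr
      ierr = 0
      end subroutine MyFormJacobian

      ! y = J(x_k)*x : the analytic Jacobian action goes here
      subroutine MyJacobianMult(Amat, x, y, ierr)
      use petscsnes
      implicit none
      Mat            :: Amat
      Vec            :: x, y
      PetscErrorCode :: ierr
      call VecCopy(x, y, ierr)         ! placeholder: identity action
      end subroutine MyJacobianMult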
>> >> I think for my problem in the main program I need to do DMDASNESsetJacobianLocal(DM, INSERT_VALUES, myJacobian, ctx, err_petsc). Then in myJacobian() subroutine I have to create the operator from DMSNESCreateJacobianMF(), and register my own MATOP_MULT from MatShellSetOperation(). Am I correct? Not exactly. Do not use DMDASNESsetJacobianLocal() use DMSNESCreateJacobianMF() to create a Mat J where you create the SNES and use SNESSetJacobian() and pass the J matrix in along with myJacobian(). >> >> Are these fortran subroutines available? I saw an example in ts module >> as ex22f_mf.F90 which behaves similar as what I would like to do. Because I would like to use ngmres, I then need to stay in the SNES. >> >> Thanks for your help. >> >> Best wishes, >> Yi >> >> ------------------------------------------------- >> Stay up to date and follow us on LinkedIn, Twitter and YouTube. >> >> Max-Planck-Institut f?r Eisenforschung GmbH Max-Planck-Stra?e 1 >> D-40237 D?sseldorf >> >> Handelsregister B 2533 >> Amtsgericht D?sseldorf >> >> Gesch?ftsf?hrung >> Prof. Dr. Gerhard Dehm >> Prof. Dr. J?rg Neugebauer >> Prof. Dr. Dierk Raabe >> Dr. Kai de Weldige >> >> Ust.-Id.-Nr.: DE 11 93 58 514 >> Steuernummer: 105 5891 1000 >> >> >> Please consider that invitations and e-mails of our institute are only >> valid if they end with ?@mpie.de. >> If you are not sure of the validity please contact rco at mpie.de >> >> Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails >> aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. >> In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de >> ------------------------------------------------- > > > ------------------------------------------------- > Stay up to date and follow us on LinkedIn, Twitter and YouTube. > > Max-Planck-Institut f?r Eisenforschung GmbH > Max-Planck-Stra?e 1 > D-40237 D?sseldorf > > Handelsregister B 2533 > Amtsgericht D?sseldorf > > Gesch?ftsf?hrung > Prof. Dr. Gerhard Dehm > Prof. Dr. J?rg Neugebauer > Prof. Dr. Dierk Raabe > Dr. Kai de Weldige > > Ust.-Id.-Nr.: DE 11 93 58 514 > Steuernummer: 105 5891 1000 > > > Please consider that invitations and e-mails of our institute are > only valid if they end with ?@mpie.de. > If you are not sure of the validity please contact rco at mpie.de > > Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails > aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. > In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de > ------------------------------------------------- > From bsmith at petsc.dev Wed Dec 20 13:34:02 2023 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 20 Dec 2023 14:34:02 -0500 Subject: [petsc-users] fortran interface to snes matrix-free jacobian In-Reply-To: <45B09E03-4B23-4DE2-B4BC-7DD44629E0FD@petsc.dev> References: <87h6kcakuc.fsf@jedbrown.org> <45B09E03-4B23-4DE2-B4BC-7DD44629E0FD@petsc.dev> Message-ID: <8A145074-7057-4329-8A2A-C37510287BAD@petsc.dev> I apologize; please ignore my answer below. Use MatCreateShell() as indicated by Jed. > On Dec 20, 2023, at 2:14?PM, Barry Smith wrote: > > > >> On Dec 20, 2023, at 11:44?AM, Yi Hu wrote: >> >> Dear Jed, >> >> Thanks for your reply. I have an analytical one to implement. 
>> >> Best, Yi >> >> -----Original Message----- >> From: Jed Brown >> Sent: Wednesday, December 20, 2023 5:40 PM >> To: Yi Hu ; petsc-users at mcs.anl.gov >> Subject: Re: [petsc-users] fortran interface to snes matrix-free jacobian >> >> Are you wanting an analytic matrix-free operator or one created for you based on finite differencing? If the latter, just use -snes_mf or -snes_mf_operator. >> >> https://petsc.org/release/manual/snes/#jacobian-evaluation >> >> Yi Hu writes: >> >>> Dear PETSc team, >>> >>> My solution scheme relies on a matrix-free jacobian in the SNES solver. I saw the useful C interface like MatCreateSNESMF(), DMSNESCreateJacobianMF(). I am wondering if you have the fortran equivalence? > > You can use DMSNESCreateJacobianMF() (MatCreateSNESMF is not appropriate when you are providing the operation). > > >>> >>> I think for my problem in the main program I need to do DMDASNESsetJacobianLocal(DM, INSERT_VALUES, myJacobian, ctx, err_petsc). Then in myJacobian() subroutine I have to create the operator from DMSNESCreateJacobianMF(), and register my own MATOP_MULT from MatShellSetOperation(). Am I correct? > > Not exactly. Do not use DMDASNESsetJacobianLocal() use DMSNESCreateJacobianMF() to create a Mat J where you create the SNES and use SNESSetJacobian() and pass the J matrix in along with myJacobian(). > >>> >>> Are these fortran subroutines available? I saw an example in ts module >>> as ex22f_mf.F90 which behaves similar as what I would like to do. Because I would like to use ngmres, I then need to stay in the SNES. >>> >>> Thanks for your help. >>> >>> Best wishes, >>> Yi >>> >>> ------------------------------------------------- >>> Stay up to date and follow us on LinkedIn, Twitter and YouTube. >>> >>> Max-Planck-Institut f?r Eisenforschung GmbH Max-Planck-Stra?e 1 >>> D-40237 D?sseldorf >>> >>> Handelsregister B 2533 >>> Amtsgericht D?sseldorf >>> >>> Gesch?ftsf?hrung >>> Prof. Dr. Gerhard Dehm >>> Prof. Dr. J?rg Neugebauer >>> Prof. Dr. Dierk Raabe >>> Dr. Kai de Weldige >>> >>> Ust.-Id.-Nr.: DE 11 93 58 514 >>> Steuernummer: 105 5891 1000 >>> >>> >>> Please consider that invitations and e-mails of our institute are only >>> valid if they end with ?@mpie.de. >>> If you are not sure of the validity please contact rco at mpie.de >>> >>> Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails >>> aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. >>> In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de >>> ------------------------------------------------- >> >> >> ------------------------------------------------- >> Stay up to date and follow us on LinkedIn, Twitter and YouTube. >> >> Max-Planck-Institut f?r Eisenforschung GmbH >> Max-Planck-Stra?e 1 >> D-40237 D?sseldorf >> >> Handelsregister B 2533 >> Amtsgericht D?sseldorf >> >> Gesch?ftsf?hrung >> Prof. Dr. Gerhard Dehm >> Prof. Dr. J?rg Neugebauer >> Prof. Dr. Dierk Raabe >> Dr. Kai de Weldige >> >> Ust.-Id.-Nr.: DE 11 93 58 514 >> Steuernummer: 105 5891 1000 >> >> >> Please consider that invitations and e-mails of our institute are >> only valid if they end with ?@mpie.de. >> If you are not sure of the validity please contact rco at mpie.de >> >> Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails >> aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. >> In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de >> ------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From kenneth.c.hall at duke.edu Wed Dec 20 15:19:54 2023 From: kenneth.c.hall at duke.edu (Kenneth C Hall) Date: Wed, 20 Dec 2023 21:19:54 +0000 Subject: [petsc-users] SLEPc/NEP for shell matrice T(lambda) and T'(lambda) In-Reply-To: <19A0679D-2A50-4E64-A805-F26582562B9A@dsic.upv.es> References: <89E53665-4C0D-4583-9C90-13C4C108A4EA@dsic.upv.es> <442B3841-B668-4185-9C6F-D03CA481CA26@dsic.upv.es> <19A0679D-2A50-4E64-A805-F26582562B9A@dsic.upv.es> Message-ID: <94CFE0C0-7A64-4E9E-800B-B18CEAF83BFF@duke.edu> Jose, I have been revisiting the issue of SLEPc/NEP for shell matrices T(lambda) and T'(lambda). I am having problems running SLEPc/NEP with -nep_type nleigs. I have compiled two versions of PETSc/SLEPc: petsc-arch-real / slepc-arch-real ./configure --with-cc=gcc-13 --with-cxx=g++-13 --with-fc=gfortran --COPTFLAGS='-O3 -fopenmp' --CXXOPTFLAGS='-O3 -fopenmp' --FOPTFLAGS='-O3 -fopenmp' --with-debugging=1 --with-logging=1 --with-scalar-type=real --with-precision=double --download-fblaslapack --with-openmp --with-mpi=0 petsc-arch-complex / slepc-arch-complex ./configure --with-cc=gcc-13 --with-cxx=g++-13 --with-fc=gfortran --COPTFLAGS='-O3 -fopenmp' --CXXOPTFLAGS='-O3 -fopenmp' --FOPTFLAGS='-O3 -fopenmp' --with-debugging=1 --with-logging=1 --with-scalar-type=complex --with-precision=double --download-fblaslapack --with-openmp --with-mpi=0 I use gfortran on an Apple Mac Mini M1. Both the PETSc and SLEPc versions are the latest development versions as of today (a6690fd8 and 267bd1cd, respectively). I ran the ex54f90 test cases: % main-arch-real -nep_type slp -nep_slp_ksp_type gmres -nep_slp_pc_type none % main-arch-complex -nep_type slp -nep_slp_ksp_type gmres -nep_slp_pc_type none % main-arch-real -nep_type nleigs -rg_interval_endpoints 0.2,1.1 -nep_nleigs_ksp_type gmres -nep_nleigs_pc_type none % main-arch-complex -nep_type nleigs -rg_interval_endpoints 0.2,1.1,-.1,.1 -nep_nleigs_ksp_type gmres -nep_nleigs_pc_type none Both the slp cases ran as expected and gave the correct answer. However, both the real and complex architectures failed for the nleigs case. For the complex case, none of the callback functions appear to have been called. For the real case, only the MatMult_A routine appears to be called, 100 times and returns each time, sweeping over lambda from 0.2 to 1.1. Any suggestions would be welcome. Best regards, Kenneth Hall ?On 10/18/23, 9:16 AM, "Jose E. Roman" > wrote: By the way, the MATOP_DESTROY stuff produced segmentation fault in some compilers (in gfortran it worked well). The reason was having the callback functions inside CONTAINS, that is why we have removed it and used regular subroutines instead. Jose > El 18 oct 2023, a las 15:11, Kenneth C Hall > escribi?: > > Jose, > > Thank you. I have downloaded and will take a look. I will try the new example and then implement in my actual problem. I will keep you posted as to my results. > > Thank you and best regards, > Kenneth > > From: Jose E. Roman > > Sent: Tuesday, October 17, 2023 2:31 PM > To: Kenneth C Hall > > Cc: petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] SLEPc/NEP for shell matrice T(lambda) and T'(lambda) > > Kenneth, > > I have worked a bit more on your example and put it in SLEPc https://urldefense.com/v3/__https://gitlab.com/slepc/slepc/-/merge_requests/596__;!!OToaGQ!oSqCpmczx5VDi5025aO5T3WqW-MwGnKUSzxKVkdyXTHo9vuxP4GYnDfMoYxavvWRAA0WdcwX3tiVaiXWT0dh2-o$ > This version also has MATOP_DESTROY to avoid memory leaks. > > Thanks. 
> Jose > > > > El 12 oct 2023, a las 20:59, Kenneth C Hall > escribi?: > > > > Jose, > > > > Thanks very much for this. I will give it a try and let you know how it works. > > > > Best regards, > > Kenneth > > > > From: Jose E. Roman > > > Date: Thursday, October 12, 2023 at 2:12 PM > > To: Kenneth C Hall > > > Cc: petsc-users at mcs.anl.gov > > > Subject: Re: [petsc-users] SLEPc/NEP for shell matrice T(lambda) and T'(lambda) > > > > I am attaching your example modified with the context stuff. > > > > With the PETSc branch that I indicated, now it works with NLEIGS, for instance: > > > > $ ./test_nep -nep_nleigs_ksp_type gmres -nep_nleigs_pc_type none -rg_interval_endpoints 0.2,1.1 -nep_target 0.8 -nep_nev 5 -n 400 -nep_monitor -nep_view -nep_error_relative ::ascii_info_detail > > > > And also other solvers such as SLP: > > > > $ ./test_nep -nep_type slp -nep_slp_ksp_type gmres -nep_slp_pc_type none -nep_target 0.8 -nep_nev 5 -n 400 -nep_monitor -nep_error_relative ::ascii_info_detail > > > > I will clean the example code an add it as a SLEPc example. > > > > Regards, > > Jose > > > > > > > El 11 oct 2023, a las 17:27, Kenneth C Hall > escribi?: > > > > > > Jose, > > > > > > Thanks very much for your help with this. Greatly appreciated. I will look at the MR. Please let me know if you do get the Fortran example working. > > > > > > Thanks, and best regards, > > > Kenneth > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: nep_transript.txt URL: From sawsan.shatanawi at wsu.edu Wed Dec 20 19:52:07 2023 From: sawsan.shatanawi at wsu.edu (Shatanawi, Sawsan Muhammad) Date: Thu, 21 Dec 2023 01:52:07 +0000 Subject: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code In-Reply-To: References: Message-ID: Hello, I don't think that I set preallocation values when I created the matrix, would you please have look at my code. It is just the petsc related part from my code. I was able to fix some of the error messages. Now I have a new set of error messages related to the KSP solver (attached) I appreciate your help? Sawsan ________________________________ From: Mark Adams Sent: Wednesday, December 20, 2023 6:44 AM To: Shatanawi, Sawsan Muhammad Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code [EXTERNAL EMAIL] Did you set preallocation values when you created the matrix? Don't do that. On Wed, Dec 20, 2023 at 9:36?AM Shatanawi, Sawsan Muhammad > wrote: Hello, I am trying to create a sparse matrix( which is as I believe a zero matrix) then adding some nonzero elements to it over a loop, then assembling it Get Outlook for iOS ________________________________ From: Mark Adams > Sent: Wednesday, December 20, 2023 2:48 AM To: Shatanawi, Sawsan Muhammad > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code [EXTERNAL EMAIL] I am guessing that you are creating a matrix, adding to it, finalizing it ("assembly"), and then adding to it again, which is fine, but you are adding new non-zeros to the sparsity pattern. If this is what you want then you can tell the matrix to let you do that. Otherwise you have a bug. Mark On Tue, Dec 19, 2023 at 9:50?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: Hello everyone, I hope this email finds you well. 
My Name is Sawsan Shatanawi, and I am currently working on developing a Fortran code for simulating groundwater flow in a 3D system. The code involves solving a nonlinear system, and I have created the matrix to be solved using the PCG solver and Picard iteration. However, when I tried to assign it as a PETSc matrix I started getting a lot of error messages. I am kindly asking if someone can help me, I would be happy to share my code with him/her. Please find the attached file contains a list of errors I have gotten Thank you in advance for your time and assistance. Best regards, Sawsan -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Matrix_RHS.F90 Type: application/octet-stream Size: 7250 bytes Desc: Matrix_RHS.F90 URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: out.txt URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: solver.F90 Type: application/octet-stream Size: 6717 bytes Desc: solver.F90 URL: From bsmith at petsc.dev Wed Dec 20 20:32:10 2023 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 20 Dec 2023 21:32:10 -0500 Subject: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code In-Reply-To: References: Message-ID: <17B6144E-4F3A-4173-9818-4B03669736C4@petsc.dev> Instead of call PCCreate(PETSC_COMM_WORLD, pc, ierr) call PCSetType(pc, PCILU,ierr) ! Choose a preconditioner type (ILU) call KSPSetPC(ksp, pc,ierr) ! Associate the preconditioner with the KSP solver do call KSPGetPC(ksp,pc,ierr) call PCSetType(pc, PCILU,ierr) Do not call KSPSetUp(). It will be taken care of automatically during the solve > On Dec 20, 2023, at 8:52?PM, Shatanawi, Sawsan Muhammad via petsc-users wrote: > > Hello, > I don't think that I set preallocation values when I created the matrix, would you please have look at my code. It is just the petsc related part from my code. > I was able to fix some of the error messages. Now I have a new set of error messages related to the KSP solver (attached) > > I appreciate your help? > > Sawsan > From: Mark Adams > > Sent: Wednesday, December 20, 2023 6:44 AM > To: Shatanawi, Sawsan Muhammad > > Cc: petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code > > [EXTERNAL EMAIL] > Did you set preallocation values when you created the matrix? > Don't do that. > > On Wed, Dec 20, 2023 at 9:36?AM Shatanawi, Sawsan Muhammad > wrote: > Hello, > > I am trying to create a sparse matrix( which is as I believe a zero matrix) then adding some nonzero elements to it over a loop, then assembling it > > Get Outlook for iOS > From: Mark Adams > > Sent: Wednesday, December 20, 2023 2:48 AM > To: Shatanawi, Sawsan Muhammad > > Cc: petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code > > [EXTERNAL EMAIL] > I am guessing that you are creating a matrix, adding to it, finalizing it ("assembly"), and then adding to it again, which is fine, but you are adding new non-zeros to the sparsity pattern. > If this is what you want then you can tell the matrix to let you do that. > Otherwise you have a bug. > > Mark > > On Tue, Dec 19, 2023 at 9:50?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: > Hello everyone, > > I hope this email finds you well. 
> > My Name is Sawsan Shatanawi, and I am currently working on developing a Fortran code for simulating groundwater flow in a 3D system. The code involves solving a nonlinear system, and I have created the matrix to be solved using the PCG solver and Picard iteration. However, when I tried to assign it as a PETSc matrix I started getting a lot of error messages. > > I am kindly asking if someone can help me, I would be happy to share my code with him/her. > > Please find the attached file contains a list of errors I have gotten > > Thank you in advance for your time and assistance. > Best regards, > > Sawsan > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sawsan.shatanawi at wsu.edu Wed Dec 20 20:49:13 2023 From: sawsan.shatanawi at wsu.edu (Shatanawi, Sawsan Muhammad) Date: Thu, 21 Dec 2023 02:49:13 +0000 Subject: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code In-Reply-To: <17B6144E-4F3A-4173-9818-4B03669736C4@petsc.dev> References: <17B6144E-4F3A-4173-9818-4B03669736C4@petsc.dev> Message-ID: Hello Barry, Thank you a lot for your help, Now I am getting the attached error message. Bests, Sawsan ________________________________ From: Barry Smith Sent: Wednesday, December 20, 2023 6:32 PM To: Shatanawi, Sawsan Muhammad Cc: Mark Adams ; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code [EXTERNAL EMAIL] Instead of call PCCreate(PETSC_COMM_WORLD, pc, ierr) call PCSetType(pc, PCILU,ierr) ! Choose a preconditioner type (ILU) call KSPSetPC(ksp, pc,ierr) ! Associate the preconditioner with the KSP solver do call KSPGetPC(ksp,pc,ierr) call PCSetType(pc, PCILU,ierr) Do not call KSPSetUp(). It will be taken care of automatically during the solve On Dec 20, 2023, at 8:52?PM, Shatanawi, Sawsan Muhammad via petsc-users wrote: Hello, I don't think that I set preallocation values when I created the matrix, would you please have look at my code. It is just the petsc related part from my code. I was able to fix some of the error messages. Now I have a new set of error messages related to the KSP solver (attached) I appreciate your help? Sawsan ________________________________ From: Mark Adams > Sent: Wednesday, December 20, 2023 6:44 AM To: Shatanawi, Sawsan Muhammad > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code [EXTERNAL EMAIL] Did you set preallocation values when you created the matrix? Don't do that. On Wed, Dec 20, 2023 at 9:36?AM Shatanawi, Sawsan Muhammad > wrote: Hello, I am trying to create a sparse matrix( which is as I believe a zero matrix) then adding some nonzero elements to it over a loop, then assembling it Get Outlook for iOS ________________________________ From: Mark Adams > Sent: Wednesday, December 20, 2023 2:48 AM To: Shatanawi, Sawsan Muhammad > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code [EXTERNAL EMAIL] I am guessing that you are creating a matrix, adding to it, finalizing it ("assembly"), and then adding to it again, which is fine, but you are adding new non-zeros to the sparsity pattern. If this is what you want then you can tell the matrix to let you do that. Otherwise you have a bug. Mark On Tue, Dec 19, 2023 at 9:50?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: Hello everyone, I hope this email finds you well. 
My Name is Sawsan Shatanawi, and I am currently working on developing a Fortran code for simulating groundwater flow in a 3D system. The code involves solving a nonlinear system, and I have created the matrix to be solved using the PCG solver and Picard iteration. However, when I tried to assign it as a PETSc matrix I started getting a lot of error messages. I am kindly asking if someone can help me, I would be happy to share my code with him/her. Please find the attached file contains a list of errors I have gotten Thank you in advance for your time and assistance. Best regards, Sawsan -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: out2.txt URL: From knepley at gmail.com Wed Dec 20 20:54:33 2023 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 20 Dec 2023 21:54:33 -0500 Subject: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code In-Reply-To: References: <17B6144E-4F3A-4173-9818-4B03669736C4@petsc.dev> Message-ID: On Wed, Dec 20, 2023 at 9:49?PM Shatanawi, Sawsan Muhammad via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hello Barry, > > Thank you a lot for your help, Now I am getting the attached error message. > Do not destroy the PC from KSPGetPC() THanks, Matt > Bests, > Sawsan > ------------------------------ > *From:* Barry Smith > *Sent:* Wednesday, December 20, 2023 6:32 PM > *To:* Shatanawi, Sawsan Muhammad > *Cc:* Mark Adams ; petsc-users at mcs.anl.gov < > petsc-users at mcs.anl.gov> > *Subject:* Re: [petsc-users] Help with Integrating PETSc into Fortran > Groundwater Flow Simulation Code > > > *[EXTERNAL EMAIL]* > > Instead of > > call PCCreate(PETSC_COMM_WORLD, pc, ierr) > call PCSetType(pc, PCILU,ierr) ! Choose a preconditioner type (ILU) > call KSPSetPC(ksp, pc,ierr) ! Associate the preconditioner with the > KSP solver > > do > > call KSPGetPC(ksp,pc,ierr) > call PCSetType(pc, PCILU,ierr) > > Do not call KSPSetUp(). It will be taken care of automatically during the > solve > > > > On Dec 20, 2023, at 8:52?PM, Shatanawi, Sawsan Muhammad via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Hello, > I don't think that I set preallocation values when I created the matrix, > would you please have look at my code. It is just the petsc related part > from my code. > I was able to fix some of the error messages. Now I have a new set of > error messages related to the KSP solver (attached) > > I appreciate your help > > Sawsan > ------------------------------ > *From:* Mark Adams > *Sent:* Wednesday, December 20, 2023 6:44 AM > *To:* Shatanawi, Sawsan Muhammad > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] Help with Integrating PETSc into Fortran > Groundwater Flow Simulation Code > > *[EXTERNAL EMAIL]* > Did you set preallocation values when you created the matrix? > Don't do that. 
> > On Wed, Dec 20, 2023 at 9:36?AM Shatanawi, Sawsan Muhammad < > sawsan.shatanawi at wsu.edu> wrote: > > Hello, > > I am trying to create a sparse matrix( which is as I believe a zero > matrix) then adding some nonzero elements to it over a loop, then > assembling it > > Get Outlook for iOS > > ------------------------------ > *From:* Mark Adams > *Sent:* Wednesday, December 20, 2023 2:48 AM > *To:* Shatanawi, Sawsan Muhammad > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] Help with Integrating PETSc into Fortran > Groundwater Flow Simulation Code > > *[EXTERNAL EMAIL]* > I am guessing that you are creating a matrix, adding to it, finalizing it > ("assembly"), and then adding to it again, which is fine, but you are > adding new non-zeros to the sparsity pattern. > If this is what you want then you can tell the matrix to let you do that. > Otherwise you have a bug. > > Mark > > On Tue, Dec 19, 2023 at 9:50?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: > > Hello everyone, > > I hope this email finds you well. > > My Name is Sawsan Shatanawi, and I am currently working on developing a > Fortran code for simulating groundwater flow in a 3D system. The code > involves solving a nonlinear system, and I have created the matrix to be > solved using the PCG solver and Picard iteration. However, when I tried > to assign it as a PETSc matrix I started getting a lot of error messages. > > I am kindly asking if someone can help me, I would be happy to share my > code with him/her. > > Please find the attached file contains a list of errors I have gotten > > Thank you in advance for your time and assistance. > > Best regards, > > Sawsan > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From sawsan.shatanawi at wsu.edu Wed Dec 20 21:02:17 2023 From: sawsan.shatanawi at wsu.edu (Shatanawi, Sawsan Muhammad) Date: Thu, 21 Dec 2023 03:02:17 +0000 Subject: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code In-Reply-To: References: <17B6144E-4F3A-4173-9818-4B03669736C4@petsc.dev> Message-ID: Hello Matthew, Thank you for your help. I am sorry that I keep coming back with my error messages, but I reached a point that I don't know how to fix them, and I don't understand them easily. The list of errors is getting shorter, now I am getting the attached error messages Thank you again, Sawsan ________________________________ From: Matthew Knepley Sent: Wednesday, December 20, 2023 6:54 PM To: Shatanawi, Sawsan Muhammad Cc: Barry Smith ; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code [EXTERNAL EMAIL] On Wed, Dec 20, 2023 at 9:49?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: Hello Barry, Thank you a lot for your help, Now I am getting the attached error message. Do not destroy the PC from KSPGetPC() THanks, Matt Bests, Sawsan ________________________________ From: Barry Smith > Sent: Wednesday, December 20, 2023 6:32 PM To: Shatanawi, Sawsan Muhammad > Cc: Mark Adams >; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code [EXTERNAL EMAIL] Instead of call PCCreate(PETSC_COMM_WORLD, pc, ierr) call PCSetType(pc, PCILU,ierr) ! 
Choose a preconditioner type (ILU) call KSPSetPC(ksp, pc,ierr) ! Associate the preconditioner with the KSP solver do call KSPGetPC(ksp,pc,ierr) call PCSetType(pc, PCILU,ierr) Do not call KSPSetUp(). It will be taken care of automatically during the solve On Dec 20, 2023, at 8:52?PM, Shatanawi, Sawsan Muhammad via petsc-users > wrote: Hello, I don't think that I set preallocation values when I created the matrix, would you please have look at my code. It is just the petsc related part from my code. I was able to fix some of the error messages. Now I have a new set of error messages related to the KSP solver (attached) I appreciate your help Sawsan ________________________________ From: Mark Adams > Sent: Wednesday, December 20, 2023 6:44 AM To: Shatanawi, Sawsan Muhammad > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code [EXTERNAL EMAIL] Did you set preallocation values when you created the matrix? Don't do that. On Wed, Dec 20, 2023 at 9:36?AM Shatanawi, Sawsan Muhammad > wrote: Hello, I am trying to create a sparse matrix( which is as I believe a zero matrix) then adding some nonzero elements to it over a loop, then assembling it Get Outlook for iOS ________________________________ From: Mark Adams > Sent: Wednesday, December 20, 2023 2:48 AM To: Shatanawi, Sawsan Muhammad > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code [EXTERNAL EMAIL] I am guessing that you are creating a matrix, adding to it, finalizing it ("assembly"), and then adding to it again, which is fine, but you are adding new non-zeros to the sparsity pattern. If this is what you want then you can tell the matrix to let you do that. Otherwise you have a bug. Mark On Tue, Dec 19, 2023 at 9:50?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: Hello everyone, I hope this email finds you well. My Name is Sawsan Shatanawi, and I am currently working on developing a Fortran code for simulating groundwater flow in a 3D system. The code involves solving a nonlinear system, and I have created the matrix to be solved using the PCG solver and Picard iteration. However, when I tried to assign it as a PETSc matrix I started getting a lot of error messages. I am kindly asking if someone can help me, I would be happy to share my code with him/her. Please find the attached file contains a list of errors I have gotten Thank you in advance for your time and assistance. Best regards, Sawsan -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
Name: out3.txt URL: From srvenkat at utexas.edu Wed Dec 20 22:04:09 2023 From: srvenkat at utexas.edu (Sreeram R Venkat) Date: Thu, 21 Dec 2023 09:34:09 +0530 Subject: [petsc-users] Matvecs and KSPSolves with multiple vectors In-Reply-To: References: <855B7267-E3F9-43CC-9EBE-96276E9A8E42@petsc.dev> <9F227CDE-815C-4446-A424-0BBDA38B1E79@joliv.et> <186F2922-CD1B-4778-BDB3-8DA6EAB8D976@joliv.et> <7921468C-F4B9-4BC0-ADD1-2F84619E8A7E@joliv.et> Message-ID: Would using the CHOLMOD Cholesky factorization ( https://petsc.org/release/manualpages/Mat/MATSOLVERCHOLMOD/) let us do the factorization on device as well? On Wed, Dec 20, 2023 at 1:21?PM Pierre Jolivet wrote: > > > On 20 Dec 2023, at 8:42?AM, Sreeram R Venkat wrote: > > Ok, I think the error I'm getting has something to do with how the > multiple solves are being done in succession. I'll try to see if there's > anything I'm doing wrong there. > > One question about the -pc_type lu -ksp_type preonly method: do you know > which parts of the solve (factorization/triangular solves) are done on host > and which are done on device? > > > I think only the triangular solves can be done on device. > Since you have many right-hand sides, it may not be that bad. > GPU people will hopefully give you a more insightful answer. > > Thanks, > Pierre > > Thanks, > Sreeram > > On Sat, Dec 16, 2023 at 10:56?PM Pierre Jolivet wrote: > >> Unfortunately, I am not able to reproduce such a failure with your input >> matrix. >> I?ve used ex79 that I linked previously and the system is properly solved. >> $ ./ex79 -pc_type hypre -ksp_type hpddm -ksp_hpddm_type cg >> -ksp_converged_reason -ksp_view_mat ascii::ascii_info -ksp_view_rhs >> ascii::ascii_info >> Linear solve converged due to CONVERGED_RTOL iterations 6 >> Mat Object: 1 MPI process >> type: seqaijcusparse >> rows=289, cols=289 >> total: nonzeros=2401, allocated nonzeros=2401 >> total number of mallocs used during MatSetValues calls=0 >> not using I-node routines >> Mat Object: 1 MPI process >> type: seqdensecuda >> rows=289, cols=10 >> total: nonzeros=2890, allocated nonzeros=2890 >> total number of mallocs used during MatSetValues calls=0 >> >> You mentioned in a subsequent email that you are interested in systems >> with at most 1E6 unknowns, and up to 1E4 right-hand sides. >> I?m not sure you can expect significant gains from using GPU for such >> systems. >> Probably, the fastest approach would indeed be -pc_type lu -ksp_type >> preonly -ksp_matsolve_batch_size 100 or something, depending on the memory >> available on your host. >> >> Thanks, >> Pierre >> >> On 15 Dec 2023, at 9:52?PM, Sreeram R Venkat wrote: >> >> Here are the ksp_view files. I set the options >> -ksp_error_if_not_converged to try to get the vectors that caused the >> error. I noticed that some of the KSPMatSolves converge while others don't. >> In the code, the solves are called as: >> >> input vector v --> insert data of v into a dense mat --> KSPMatSolve() >> --> MatMatMult() --> KSPMatSolve() --> insert data of dense mat into output >> vector w -- output w >> >> The operator used in the KSP is a Laplacian-like operator, and the >> MatMatMult is with a Mass Matrix. The whole thing is supposed to be a solve >> with a biharmonic-like operator. I can also run it with only the first >> KSPMatSolve (i.e. just a Laplacian-like operator). In that case, the KSP >> reportedly converges after 0 iterations (see the next line), but this >> causes problems in other parts of the code later on. 
>> >> I saw that sometimes the first KSPMatSolve "converges" after 0 iterations >> due to CONVERGED_RTOL. Then, the second KSPMatSolve produces a NaN/Inf. I >> tried setting ksp_min_it, but that didn't seem to do anything. >> >> I'll keep trying different options and also try to get the MWE made (this >> KSPMatSolve is pretty performance critical for us). >> >> Thanks for all your help, >> Sreeram >> >> On Fri, Dec 15, 2023 at 1:01?AM Pierre Jolivet wrote: >> >>> >>> On 14 Dec 2023, at 11:45?PM, Sreeram R Venkat >>> wrote: >>> >>> Thanks, I will try to create a minimal reproducible example. This may >>> take me some time though, as I need to figure out how to extract only the >>> relevant parts (the full program this solve is used in is getting quite >>> complex). >>> >>> >>> You could just do -ksp_view_mat binary:Amat.bin -ksp_view_pmat >>> binary:Pmat.bin -ksp_view_rhs binary:rhs.bin and send me those three files >>> (I?m guessing your are using double-precision scalars with 32-bit PetscInt). >>> >>> I'll also try out some of the BoomerAMG options to see if that helps. >>> >>> >>> These should work (this is where all ?PCMatApply()-ready? PC are being >>> tested): >>> https://petsc.org/release/src/ksp/ksp/tutorials/ex79.c.html#line215 >>> You can see it?s also testing PCHYPRE + KSPHPDDM on device (but not with >>> HIP). >>> I?m aware the performance should not be optimal (see your comment about >>> host/device copies), I?ve money to hire someone to work on this but: a) I >>> need to find the correct engineer/post-doc, b) I currently don?t have good >>> use cases (of course, I could generate a synthetic benchmark, for science). >>> So even if you send me the three Mat, a MWE would be appreciated if the >>> KSPMatSolve() is performance-critical for you (see point b) from above). >>> >>> Thanks, >>> Pierre >>> >>> Thanks, >>> Sreeram >>> >>> On Thu, Dec 14, 2023, 1:12?PM Pierre Jolivet wrote: >>> >>>> >>>> >>>> On 14 Dec 2023, at 8:02?PM, Sreeram R Venkat >>>> wrote: >>>> >>>> Hello Pierre, >>>> >>>> Thank you for your reply. I tried out the HPDDM CG as you said, and it >>>> seems to be doing the batched solves, but the KSP is not converging due to >>>> a NaN or Inf being generated. I also noticed there are a lot of >>>> host-to-device and device-to-host copies of the matrices (the non-batched >>>> KSP solve did not have any memcopies). I have attached dump.0 again. Could >>>> you please take a look? >>>> >>>> >>>> Yes, but you?d need to send me something I can run with your set of >>>> options (if you are more confident doing this in private, you can remove >>>> the list from c/c). >>>> Not all BoomerAMG smoothers handle blocks of right-hand sides, and >>>> there is not much error checking, so instead of erroring out, this may be >>>> the reason why you are getting garbage. >>>> >>>> Thanks, >>>> Pierre >>>> >>>> Thanks, >>>> Sreeram >>>> >>>> On Thu, Dec 14, 2023 at 12:42?AM Pierre Jolivet >>>> wrote: >>>> >>>>> Hello Sreeram, >>>>> KSPCG (PETSc implementation of CG) does not handle solves with >>>>> multiple columns at once. >>>>> There is only a single native PETSc KSP implementation which handles >>>>> solves with multiple columns at once: KSPPREONLY. >>>>> If you use --download-hpddm, you can use a CG (or GMRES, or more >>>>> advanced methods) implementation which handles solves with multiple columns >>>>> at once (via -ksp_type hpddm -ksp_hpddm_type cg or KSPSetType(ksp, >>>>> KSPHPDDM); KSPHPDDMSetType(ksp, KSP_HPDDM_TYPE_CG);). 
>>>>> I?m the main author of HPDDM, there is preliminary support for device >>>>> matrices, but if it?s not working as intended/not faster than column by >>>>> column, I?d be happy to have a deeper look (maybe in private), because most >>>>> (if not all) of my users interested in (pseudo-)block Krylov solvers (i.e., >>>>> solvers that treat right-hand sides in a single go) are using plain host >>>>> matrices. >>>>> >>>>> Thanks, >>>>> Pierre >>>>> >>>>> PS: you could have a look at >>>>> https://www.sciencedirect.com/science/article/abs/pii/S0898122121000055 to >>>>> understand the philosophy behind block iterative methods in PETSc (and in >>>>> HPDDM), src/mat/tests/ex237.c, the benchmark I mentioned earlier, was >>>>> developed in the context of this paper to produce Figures 2-3. Note that >>>>> this paper is now slightly outdated, since then, PCHYPRE and PCMG (among >>>>> others) have been made ?PCMatApply()-ready?. >>>>> >>>>> On 13 Dec 2023, at 11:05?PM, Sreeram R Venkat >>>>> wrote: >>>>> >>>>> Hello Pierre, >>>>> >>>>> I am trying out the KSPMatSolve with the BoomerAMG preconditioner. >>>>> However, I am noticing that it is still solving column by column (this is >>>>> stated explicitly in the info dump attached). I looked at the code for >>>>> KSPMatSolve_Private() and saw that as long as ksp->ops->matsolve is >>>>> true, it should do the batched solve, though I'm not sure where that gets >>>>> set. >>>>> >>>>> I am using the options -pc_type hypre -pc_hypre_type boomeramg when >>>>> running the code. >>>>> >>>>> Can you please help me with this? >>>>> >>>>> Thanks, >>>>> Sreeram >>>>> >>>>> >>>>> On Thu, Dec 7, 2023 at 4:04?PM Mark Adams wrote: >>>>> >>>>>> N.B., AMGX interface is a bit experimental. >>>>>> Mark >>>>>> >>>>>> On Thu, Dec 7, 2023 at 4:11?PM Sreeram R Venkat >>>>>> wrote: >>>>>> >>>>>>> Oh, in that case I will try out BoomerAMG. Getting AMGX to build >>>>>>> correctly was also tricky so hopefully the HYPRE build will be easier. >>>>>>> >>>>>>> Thanks, >>>>>>> Sreeram >>>>>>> >>>>>>> On Thu, Dec 7, 2023, 3:03 PM Pierre Jolivet wrote: >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On 7 Dec 2023, at 9:37?PM, Sreeram R Venkat >>>>>>>> wrote: >>>>>>>> >>>>>>>> Thank you Barry and Pierre; I will proceed with the first option. >>>>>>>> >>>>>>>> I want to use the AMGX preconditioner for the KSP. I will try it >>>>>>>> out and see how it performs. >>>>>>>> >>>>>>>> >>>>>>>> Just FYI, AMGX does not handle systems with multiple RHS, and thus >>>>>>>> has no PCMatApply() implementation. >>>>>>>> BoomerAMG does, and there is a PCMatApply_HYPRE_BoomerAMG() >>>>>>>> implementation. >>>>>>>> But let us know if you need assistance figuring things out. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Pierre >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Sreeram >>>>>>>> >>>>>>>> On Thu, Dec 7, 2023 at 2:02?PM Pierre Jolivet >>>>>>>> wrote: >>>>>>>> >>>>>>>>> To expand on Barry?s answer, we have observed repeatedly that >>>>>>>>> MatMatMult with MatAIJ performs better than MatMult with MatMAIJ, you can >>>>>>>>> reproduce this on your own with >>>>>>>>> https://petsc.org/release/src/mat/tests/ex237.c.html. >>>>>>>>> Also, I?m guessing you are using some sort of preconditioner >>>>>>>>> within your KSP. >>>>>>>>> Not all are ?KSPMatSolve-ready?, i.e., they may treat blocks of >>>>>>>>> right-hand sides column by column, which is very inefficient. 
>>>>>>>>> You could run your code with -info dump and send us dump.0 to see >>>>>>>>> what needs to be done on our end to make things more efficient, should you >>>>>>>>> not be satisfied with the current performance of the code. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Pierre >>>>>>>>> >>>>>>>>> On 7 Dec 2023, at 8:34?PM, Barry Smith wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Dec 7, 2023, at 1:17?PM, Sreeram R Venkat >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>> I have 2 sequential matrices M and R (both MATSEQAIJCUSPARSE of >>>>>>>>> size n x n) and a vector v of size n*m. v = [v_1 , v_2 ,... , v_m] where >>>>>>>>> v_i has size n. The data for v can be stored either in column-major or >>>>>>>>> row-major order. Now, I want to do 2 types of operations: >>>>>>>>> >>>>>>>>> 1. Matvecs of the form M*v_i = w_i, for i = 1..m. >>>>>>>>> 2. KSPSolves of the form R*x_i = v_i, for i = 1..m. >>>>>>>>> >>>>>>>>> From what I have read on the documentation, I can think of 2 >>>>>>>>> approaches. >>>>>>>>> >>>>>>>>> 1. Get the pointer to the data in v (column-major) and use it to >>>>>>>>> create a dense matrix V. Then do a MatMatMult with M*V = W, and take the >>>>>>>>> data pointer of W to create the vector w. For KSPSolves, use KSPMatSolve >>>>>>>>> with R and V. >>>>>>>>> >>>>>>>>> 2. Create a MATMAIJ using M/R and use that for matvecs directly >>>>>>>>> with the vector v. I don't know if KSPSolve with the MATMAIJ will know that >>>>>>>>> it is a multiple RHS system and act accordingly. >>>>>>>>> >>>>>>>>> Which would be the more efficient option? >>>>>>>>> >>>>>>>>> >>>>>>>>> Use 1. >>>>>>>>> >>>>>>>>> >>>>>>>>> As a side-note, I am also wondering if there is a way to use >>>>>>>>> row-major storage of the vector v. >>>>>>>>> >>>>>>>>> >>>>>>>>> No >>>>>>>>> >>>>>>>>> The reason is that this could allow for more coalesced memory >>>>>>>>> access when doing matvecs. >>>>>>>>> >>>>>>>>> >>>>>>>>> PETSc matrix-vector products use BLAS GMEV matrix-vector >>>>>>>>> products for the computation so in theory they should already be >>>>>>>>> well-optimized >>>>>>>>> >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Sreeram >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>> >>>>> >>>>> >>>> >>>> >>>> >>> >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Thu Dec 21 05:04:57 2023 From: jroman at dsic.upv.es (Jose E. Roman) Date: Thu, 21 Dec 2023 12:04:57 +0100 Subject: [petsc-users] SLEPc/NEP for shell matrice T(lambda) and T'(lambda) In-Reply-To: <94CFE0C0-7A64-4E9E-800B-B18CEAF83BFF@duke.edu> References: <89E53665-4C0D-4583-9C90-13C4C108A4EA@dsic.upv.es> <442B3841-B668-4185-9C6F-D03CA481CA26@dsic.upv.es> <19A0679D-2A50-4E64-A805-F26582562B9A@dsic.upv.es> <94CFE0C0-7A64-4E9E-800B-B18CEAF83BFF@duke.edu> Message-ID: <874379C7-D9DB-4CC2-94BF-198FEC9B8E49@dsic.upv.es> The errors are strange. The traceback points to harmless operations. Likely memory corruption, as the message says. Those tests are included in SLEPc pipelines, they are run with serveral Linux distributions, with several compilers. Also, on my macOS it runs cleanly, although my configuration is different from yours. I don't have access to an M1 computer. Also, using gcc instead of clang from xcode may have unexpected side effects, I don't know. I would try with less agressive optimization flags, e.g., --COPTFLAGS=-O --CXXOPTFLAGS=-O --FOPTFLAGS=-O (or even remove them completely). Maybe try also --with-debugging=0. 
Another thing you can try is changing the BLAS/LAPACK, e.g., removing --download-fblaslapack or replacing it with --download-netlib-lapack. See also the FAQ https://petsc.org/release/faq/#what-does-corrupt-argument-or-caught-signal-or-segv-or-segmentation-violation-or-bus-error-mean-can-i-use-valgrind-or-cuda-memcheck-to-debug-memory-corruption-issues Jose > On 20 Dec 2023, at 22:19, Kenneth C Hall wrote: > > Jose, > > I have been revisiting the issue of SLEPc/NEP for shell matrices T(lambda) and T'(lambda). > I am having problems running SLEPc/NEP with -nep_type nleigs. > > I have compiled two versions of PETSc/SLEPc: > > petsc-arch-real / slepc-arch-real > ./configure --with-cc=gcc-13 --with-cxx=g++-13 --with-fc=gfortran --COPTFLAGS='-O3 -fopenmp' --CXXOPTFLAGS='-O3 -fopenmp' > --FOPTFLAGS='-O3 -fopenmp' --with-debugging=1 --with-logging=1 --with-scalar-type=real --with-precision=double > --download-fblaslapack --with-openmp --with-mpi=0 > > > petsc-arch-complex / slepc-arch-complex > ./configure --with-cc=gcc-13 --with-cxx=g++-13 --with-fc=gfortran --COPTFLAGS='-O3 -fopenmp' --CXXOPTFLAGS='-O3 -fopenmp' > --FOPTFLAGS='-O3 -fopenmp' --with-debugging=1 --with-logging=1 --with-scalar-type=complex --with-precision=double > --download-fblaslapack --with-openmp --with-mpi=0 > > I use gfortran on an Apple Mac Mini M1. Both the PETSc and SLEPc versions are the latest development versions as of today (a6690fd8 and 267bd1cd, respectively). > > I ran the ex54f90 test cases: > > % main-arch-real -nep_type slp -nep_slp_ksp_type gmres -nep_slp_pc_type none > % main-arch-complex -nep_type slp -nep_slp_ksp_type gmres -nep_slp_pc_type none > % main-arch-real -nep_type nleigs -rg_interval_endpoints 0.2,1.1 -nep_nleigs_ksp_type gmres -nep_nleigs_pc_type none > % main-arch-complex -nep_type nleigs -rg_interval_endpoints 0.2,1.1,-.1,.1 -nep_nleigs_ksp_type gmres -nep_nleigs_pc_type none > > Both the slp cases ran as expected and gave the correct answer. > However, both the real and complex architectures failed for the nleigs case. > > For the complex case, none of the callback functions appear to have been called. > For the real case, only the MatMult_A routine appears to be called; it is called 100 times, returning each time, as lambda sweeps from 0.2 to 1.1. > > Any suggestions would be welcome. > > Best regards, > Kenneth Hall > From knepley at gmail.com Thu Dec 21 05:48:04 2023 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 21 Dec 2023 06:48:04 -0500 Subject: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code In-Reply-To: References: <17B6144E-4F3A-4173-9818-4B03669736C4@petsc.dev> Message-ID: On Wed, Dec 20, 2023 at 10:02 PM Shatanawi, Sawsan Muhammad < sawsan.shatanawi at wsu.edu> wrote: > Hello Matthew, > > Thank you for your help. I am sorry that I keep coming back with my error > messages, but I have reached a point where I don't know how to fix them, and I > don't understand them easily. > The list of errors is getting shorter; now I am getting the attached error > messages. > You are overwriting memory somewhere, but we cannot see the code, and thus cannot tell where. You can figure this out by running in the debugger, using -start_in_debugger, which launches a debugger window, and then 'cont' to run until the error, and then 'where' to print the stack trace.
Thanks, Matt > Thank you again, > > Sawsan > ------------------------------ > *From:* Matthew Knepley > *Sent:* Wednesday, December 20, 2023 6:54 PM > *To:* Shatanawi, Sawsan Muhammad > *Cc:* Barry Smith ; petsc-users at mcs.anl.gov < > petsc-users at mcs.anl.gov> > *Subject:* Re: [petsc-users] Help with Integrating PETSc into Fortran > Groundwater Flow Simulation Code > > > *[EXTERNAL EMAIL]* > On Wed, Dec 20, 2023 at 9:49?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: > > Hello Barry, > > Thank you a lot for your help, Now I am getting the attached error message. > > > Do not destroy the PC from KSPGetPC() > > THanks, > > Matt > > > Bests, > Sawsan > ------------------------------ > *From:* Barry Smith > *Sent:* Wednesday, December 20, 2023 6:32 PM > *To:* Shatanawi, Sawsan Muhammad > *Cc:* Mark Adams ; petsc-users at mcs.anl.gov < > petsc-users at mcs.anl.gov> > *Subject:* Re: [petsc-users] Help with Integrating PETSc into Fortran > Groundwater Flow Simulation Code > > > *[EXTERNAL EMAIL]* > > Instead of > > call PCCreate(PETSC_COMM_WORLD, pc, ierr) > call PCSetType(pc, PCILU,ierr) ! Choose a preconditioner type (ILU) > call KSPSetPC(ksp, pc,ierr) ! Associate the preconditioner with the > KSP solver > > do > > call KSPGetPC(ksp,pc,ierr) > call PCSetType(pc, PCILU,ierr) > > Do not call KSPSetUp(). It will be taken care of automatically during the > solve > > > > On Dec 20, 2023, at 8:52?PM, Shatanawi, Sawsan Muhammad via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Hello, > I don't think that I set preallocation values when I created the matrix, > would you please have look at my code. It is just the petsc related part > from my code. > I was able to fix some of the error messages. Now I have a new set of > error messages related to the KSP solver (attached) > > I appreciate your help > > Sawsan > ------------------------------ > *From:* Mark Adams > *Sent:* Wednesday, December 20, 2023 6:44 AM > *To:* Shatanawi, Sawsan Muhammad > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] Help with Integrating PETSc into Fortran > Groundwater Flow Simulation Code > > *[EXTERNAL EMAIL]* > Did you set preallocation values when you created the matrix? > Don't do that. > > On Wed, Dec 20, 2023 at 9:36?AM Shatanawi, Sawsan Muhammad < > sawsan.shatanawi at wsu.edu> wrote: > > Hello, > > I am trying to create a sparse matrix( which is as I believe a zero > matrix) then adding some nonzero elements to it over a loop, then > assembling it > > Get Outlook for iOS > > ------------------------------ > *From:* Mark Adams > *Sent:* Wednesday, December 20, 2023 2:48 AM > *To:* Shatanawi, Sawsan Muhammad > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] Help with Integrating PETSc into Fortran > Groundwater Flow Simulation Code > > *[EXTERNAL EMAIL]* > I am guessing that you are creating a matrix, adding to it, finalizing it > ("assembly"), and then adding to it again, which is fine, but you are > adding new non-zeros to the sparsity pattern. > If this is what you want then you can tell the matrix to let you do that. > Otherwise you have a bug. > > Mark > > On Tue, Dec 19, 2023 at 9:50?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: > > Hello everyone, > > I hope this email finds you well. > > My Name is Sawsan Shatanawi, and I am currently working on developing a > Fortran code for simulating groundwater flow in a 3D system. 
The code > involves solving a nonlinear system, and I have created the matrix to be > solved using the PCG solver and Picard iteration. However, when I tried > to assign it as a PETSc matrix I started getting a lot of error messages. > > I am kindly asking if someone can help me, I would be happy to share my > code with him/her. > > Please find the attached file contains a list of errors I have gotten > > Thank you in advance for your time and assistance. > > Best regards, > > Sawsan > > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Dec 21 05:53:43 2023 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 21 Dec 2023 06:53:43 -0500 Subject: [petsc-users] Matvecs and KSPSolves with multiple vectors In-Reply-To: References: <855B7267-E3F9-43CC-9EBE-96276E9A8E42@petsc.dev> <9F227CDE-815C-4446-A424-0BBDA38B1E79@joliv.et> <186F2922-CD1B-4778-BDB3-8DA6EAB8D976@joliv.et> <7921468C-F4B9-4BC0-ADD1-2F84619E8A7E@joliv.et> Message-ID: On Thu, Dec 21, 2023 at 6:46?AM Sreeram R Venkat wrote: > Ok, I think the error I'm getting has something to do with how the > multiple solves are being done in succession. I'll try to see if there's > anything I'm doing wrong there. > > One question about the -pc_type lu -ksp_type preonly method: do you know > which parts of the solve (factorization/triangular solves) are done on host > and which are done on device? > For SEQDENSE, I believe both the factorization and solve is on device. It is hard to see, but I believe the dispatch code is here: https://gitlab.com/petsc/petsc/-/blob/main/src/mat/impls/dense/seq/cupm/matseqdensecupm.hpp?ref_type=heads#L368 Thanks, Matt > Thanks, > Sreeram > > On Sat, Dec 16, 2023 at 10:56?PM Pierre Jolivet wrote: > >> Unfortunately, I am not able to reproduce such a failure with your input >> matrix. >> I?ve used ex79 that I linked previously and the system is properly solved. >> $ ./ex79 -pc_type hypre -ksp_type hpddm -ksp_hpddm_type cg >> -ksp_converged_reason -ksp_view_mat ascii::ascii_info -ksp_view_rhs >> ascii::ascii_info >> Linear solve converged due to CONVERGED_RTOL iterations 6 >> Mat Object: 1 MPI process >> type: seqaijcusparse >> rows=289, cols=289 >> total: nonzeros=2401, allocated nonzeros=2401 >> total number of mallocs used during MatSetValues calls=0 >> not using I-node routines >> Mat Object: 1 MPI process >> type: seqdensecuda >> rows=289, cols=10 >> total: nonzeros=2890, allocated nonzeros=2890 >> total number of mallocs used during MatSetValues calls=0 >> >> You mentioned in a subsequent email that you are interested in systems >> with at most 1E6 unknowns, and up to 1E4 right-hand sides. >> I?m not sure you can expect significant gains from using GPU for such >> systems. >> Probably, the fastest approach would indeed be -pc_type lu -ksp_type >> preonly -ksp_matsolve_batch_size 100 or something, depending on the memory >> available on your host. >> >> Thanks, >> Pierre >> >> On 15 Dec 2023, at 9:52?PM, Sreeram R Venkat wrote: >> >> Here are the ksp_view files. 
I set the options >> -ksp_error_if_not_converged to try to get the vectors that caused the >> error. I noticed that some of the KSPMatSolves converge while others don't. >> In the code, the solves are called as: >> >> input vector v --> insert data of v into a dense mat --> KSPMatSolve() >> --> MatMatMult() --> KSPMatSolve() --> insert data of dense mat into output >> vector w -- output w >> >> The operator used in the KSP is a Laplacian-like operator, and the >> MatMatMult is with a Mass Matrix. The whole thing is supposed to be a solve >> with a biharmonic-like operator. I can also run it with only the first >> KSPMatSolve (i.e. just a Laplacian-like operator). In that case, the KSP >> reportedly converges after 0 iterations (see the next line), but this >> causes problems in other parts of the code later on. >> >> I saw that sometimes the first KSPMatSolve "converges" after 0 iterations >> due to CONVERGED_RTOL. Then, the second KSPMatSolve produces a NaN/Inf. I >> tried setting ksp_min_it, but that didn't seem to do anything. >> >> I'll keep trying different options and also try to get the MWE made (this >> KSPMatSolve is pretty performance critical for us). >> >> Thanks for all your help, >> Sreeram >> >> On Fri, Dec 15, 2023 at 1:01?AM Pierre Jolivet wrote: >> >>> >>> On 14 Dec 2023, at 11:45?PM, Sreeram R Venkat >>> wrote: >>> >>> Thanks, I will try to create a minimal reproducible example. This may >>> take me some time though, as I need to figure out how to extract only the >>> relevant parts (the full program this solve is used in is getting quite >>> complex). >>> >>> >>> You could just do -ksp_view_mat binary:Amat.bin -ksp_view_pmat >>> binary:Pmat.bin -ksp_view_rhs binary:rhs.bin and send me those three files >>> (I?m guessing your are using double-precision scalars with 32-bit PetscInt). >>> >>> I'll also try out some of the BoomerAMG options to see if that helps. >>> >>> >>> These should work (this is where all ?PCMatApply()-ready? PC are being >>> tested): >>> https://petsc.org/release/src/ksp/ksp/tutorials/ex79.c.html#line215 >>> You can see it?s also testing PCHYPRE + KSPHPDDM on device (but not with >>> HIP). >>> I?m aware the performance should not be optimal (see your comment about >>> host/device copies), I?ve money to hire someone to work on this but: a) I >>> need to find the correct engineer/post-doc, b) I currently don?t have good >>> use cases (of course, I could generate a synthetic benchmark, for science). >>> So even if you send me the three Mat, a MWE would be appreciated if the >>> KSPMatSolve() is performance-critical for you (see point b) from above). >>> >>> Thanks, >>> Pierre >>> >>> Thanks, >>> Sreeram >>> >>> On Thu, Dec 14, 2023, 1:12?PM Pierre Jolivet wrote: >>> >>>> >>>> >>>> On 14 Dec 2023, at 8:02?PM, Sreeram R Venkat >>>> wrote: >>>> >>>> Hello Pierre, >>>> >>>> Thank you for your reply. I tried out the HPDDM CG as you said, and it >>>> seems to be doing the batched solves, but the KSP is not converging due to >>>> a NaN or Inf being generated. I also noticed there are a lot of >>>> host-to-device and device-to-host copies of the matrices (the non-batched >>>> KSP solve did not have any memcopies). I have attached dump.0 again. Could >>>> you please take a look? >>>> >>>> >>>> Yes, but you?d need to send me something I can run with your set of >>>> options (if you are more confident doing this in private, you can remove >>>> the list from c/c). 
>>>> Not all BoomerAMG smoothers handle blocks of right-hand sides, and >>>> there is not much error checking, so instead of erroring out, this may be >>>> the reason why you are getting garbage. >>>> >>>> Thanks, >>>> Pierre >>>> >>>> Thanks, >>>> Sreeram >>>> >>>> On Thu, Dec 14, 2023 at 12:42?AM Pierre Jolivet >>>> wrote: >>>> >>>>> Hello Sreeram, >>>>> KSPCG (PETSc implementation of CG) does not handle solves with >>>>> multiple columns at once. >>>>> There is only a single native PETSc KSP implementation which handles >>>>> solves with multiple columns at once: KSPPREONLY. >>>>> If you use --download-hpddm, you can use a CG (or GMRES, or more >>>>> advanced methods) implementation which handles solves with multiple columns >>>>> at once (via -ksp_type hpddm -ksp_hpddm_type cg or KSPSetType(ksp, >>>>> KSPHPDDM); KSPHPDDMSetType(ksp, KSP_HPDDM_TYPE_CG);). >>>>> I?m the main author of HPDDM, there is preliminary support for device >>>>> matrices, but if it?s not working as intended/not faster than column by >>>>> column, I?d be happy to have a deeper look (maybe in private), because most >>>>> (if not all) of my users interested in (pseudo-)block Krylov solvers (i.e., >>>>> solvers that treat right-hand sides in a single go) are using plain host >>>>> matrices. >>>>> >>>>> Thanks, >>>>> Pierre >>>>> >>>>> PS: you could have a look at >>>>> https://www.sciencedirect.com/science/article/abs/pii/S0898122121000055 to >>>>> understand the philosophy behind block iterative methods in PETSc (and in >>>>> HPDDM), src/mat/tests/ex237.c, the benchmark I mentioned earlier, was >>>>> developed in the context of this paper to produce Figures 2-3. Note that >>>>> this paper is now slightly outdated, since then, PCHYPRE and PCMG (among >>>>> others) have been made ?PCMatApply()-ready?. >>>>> >>>>> On 13 Dec 2023, at 11:05?PM, Sreeram R Venkat >>>>> wrote: >>>>> >>>>> Hello Pierre, >>>>> >>>>> I am trying out the KSPMatSolve with the BoomerAMG preconditioner. >>>>> However, I am noticing that it is still solving column by column (this is >>>>> stated explicitly in the info dump attached). I looked at the code for >>>>> KSPMatSolve_Private() and saw that as long as ksp->ops->matsolve is >>>>> true, it should do the batched solve, though I'm not sure where that gets >>>>> set. >>>>> >>>>> I am using the options -pc_type hypre -pc_hypre_type boomeramg when >>>>> running the code. >>>>> >>>>> Can you please help me with this? >>>>> >>>>> Thanks, >>>>> Sreeram >>>>> >>>>> >>>>> On Thu, Dec 7, 2023 at 4:04?PM Mark Adams wrote: >>>>> >>>>>> N.B., AMGX interface is a bit experimental. >>>>>> Mark >>>>>> >>>>>> On Thu, Dec 7, 2023 at 4:11?PM Sreeram R Venkat >>>>>> wrote: >>>>>> >>>>>>> Oh, in that case I will try out BoomerAMG. Getting AMGX to build >>>>>>> correctly was also tricky so hopefully the HYPRE build will be easier. >>>>>>> >>>>>>> Thanks, >>>>>>> Sreeram >>>>>>> >>>>>>> On Thu, Dec 7, 2023, 3:03 PM Pierre Jolivet wrote: >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On 7 Dec 2023, at 9:37?PM, Sreeram R Venkat >>>>>>>> wrote: >>>>>>>> >>>>>>>> Thank you Barry and Pierre; I will proceed with the first option. >>>>>>>> >>>>>>>> I want to use the AMGX preconditioner for the KSP. I will try it >>>>>>>> out and see how it performs. >>>>>>>> >>>>>>>> >>>>>>>> Just FYI, AMGX does not handle systems with multiple RHS, and thus >>>>>>>> has no PCMatApply() implementation. >>>>>>>> BoomerAMG does, and there is a PCMatApply_HYPRE_BoomerAMG() >>>>>>>> implementation. 
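For reference, PCMatApply() is the interface in question: a preconditioner that implements it is applied to a whole block of vectors in one call, which is what lets KSPMatSolve() avoid looping over columns. A small sketch of calling it directly, assuming a PETSc build with --download-hypre and using placeholder matrices A (operator), B (inputs), and X (outputs):

#include <petscksp.h>

/* Sketch: apply a PCMatApply()-capable preconditioner (BoomerAMG through PCHYPRE)
   to a block of vectors at once. All names are illustrative. */
static PetscErrorCode ApplyBlockPC(Mat A, Mat B, Mat X)
{
  PC pc;

  PetscFunctionBeginUser;
  PetscCall(PCCreate(PetscObjectComm((PetscObject)A), &pc));
  PetscCall(PCSetOperators(pc, A, A));
  PetscCall(PCSetType(pc, PCHYPRE));  /* together with -pc_hypre_type boomeramg */
  PetscCall(PCSetFromOptions(pc));
  PetscCall(PCSetUp(pc));
  PetscCall(PCMatApply(pc, B, X));    /* all columns of B treated in a single call */
  PetscCall(PCDestroy(&pc));
  PetscFunctionReturn(PETSC_SUCCESS);
}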
>>>>>>>> But let us know if you need assistance figuring things out. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Pierre >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Sreeram >>>>>>>> >>>>>>>> On Thu, Dec 7, 2023 at 2:02?PM Pierre Jolivet >>>>>>>> wrote: >>>>>>>> >>>>>>>>> To expand on Barry?s answer, we have observed repeatedly that >>>>>>>>> MatMatMult with MatAIJ performs better than MatMult with MatMAIJ, you can >>>>>>>>> reproduce this on your own with >>>>>>>>> https://petsc.org/release/src/mat/tests/ex237.c.html. >>>>>>>>> Also, I?m guessing you are using some sort of preconditioner >>>>>>>>> within your KSP. >>>>>>>>> Not all are ?KSPMatSolve-ready?, i.e., they may treat blocks of >>>>>>>>> right-hand sides column by column, which is very inefficient. >>>>>>>>> You could run your code with -info dump and send us dump.0 to see >>>>>>>>> what needs to be done on our end to make things more efficient, should you >>>>>>>>> not be satisfied with the current performance of the code. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Pierre >>>>>>>>> >>>>>>>>> On 7 Dec 2023, at 8:34?PM, Barry Smith wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Dec 7, 2023, at 1:17?PM, Sreeram R Venkat >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>> I have 2 sequential matrices M and R (both MATSEQAIJCUSPARSE of >>>>>>>>> size n x n) and a vector v of size n*m. v = [v_1 , v_2 ,... , v_m] where >>>>>>>>> v_i has size n. The data for v can be stored either in column-major or >>>>>>>>> row-major order. Now, I want to do 2 types of operations: >>>>>>>>> >>>>>>>>> 1. Matvecs of the form M*v_i = w_i, for i = 1..m. >>>>>>>>> 2. KSPSolves of the form R*x_i = v_i, for i = 1..m. >>>>>>>>> >>>>>>>>> From what I have read on the documentation, I can think of 2 >>>>>>>>> approaches. >>>>>>>>> >>>>>>>>> 1. Get the pointer to the data in v (column-major) and use it to >>>>>>>>> create a dense matrix V. Then do a MatMatMult with M*V = W, and take the >>>>>>>>> data pointer of W to create the vector w. For KSPSolves, use KSPMatSolve >>>>>>>>> with R and V. >>>>>>>>> >>>>>>>>> 2. Create a MATMAIJ using M/R and use that for matvecs directly >>>>>>>>> with the vector v. I don't know if KSPSolve with the MATMAIJ will know that >>>>>>>>> it is a multiple RHS system and act accordingly. >>>>>>>>> >>>>>>>>> Which would be the more efficient option? >>>>>>>>> >>>>>>>>> >>>>>>>>> Use 1. >>>>>>>>> >>>>>>>>> >>>>>>>>> As a side-note, I am also wondering if there is a way to use >>>>>>>>> row-major storage of the vector v. >>>>>>>>> >>>>>>>>> >>>>>>>>> No >>>>>>>>> >>>>>>>>> The reason is that this could allow for more coalesced memory >>>>>>>>> access when doing matvecs. >>>>>>>>> >>>>>>>>> >>>>>>>>> PETSc matrix-vector products use BLAS GMEV matrix-vector >>>>>>>>> products for the computation so in theory they should already be >>>>>>>>> well-optimized >>>>>>>>> >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Sreeram >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>> >>>>> >>>>> >>>> >>>> >>>> >>> >> >> >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ngoetting at itp.uni-bremen.de Thu Dec 21 07:35:37 2023 From: ngoetting at itp.uni-bremen.de (=?UTF-8?Q?Niclas_G=C3=B6tting?=) Date: Thu, 21 Dec 2023 14:35:37 +0100 Subject: [petsc-users] TS docs wrong URLs in Examples Message-ID: Hi all, I noticed that all links to the examples under https://petsc.org/release/manualpages/TS/TS/ point to the wrong URL. Instead of src/ts/**/*, they point to src/sys/**/*, which does not seem to be right. This definitely is a minor issue, but I couldn't see an obvious fix via "Edit this page", so here is the e-mail. Best regards Niclas From bsmith at petsc.dev Thu Dec 21 11:51:23 2023 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 21 Dec 2023 12:51:23 -0500 Subject: [petsc-users] TS docs wrong URLs in Examples In-Reply-To: References: Message-ID: Thanks for letting us know, we'll take a look at it. Barry > On Dec 21, 2023, at 8:35?AM, Niclas G?tting wrote: > > Hi all, > > I noticed that all links to the examples under https://petsc.org/release/manualpages/TS/TS/ point to the wrong URL. Instead of src/ts/**/*, they point to src/sys/**/*, which does not seem to be right. This definitely is a minor issue, but I couldn't see an obvious fix via "Edit this page", so here is the e-mail. > > Best regards > Niclas > From junchao.zhang at gmail.com Thu Dec 21 12:38:24 2023 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Thu, 21 Dec 2023 12:38:24 -0600 Subject: [petsc-users] Matvecs and KSPSolves with multiple vectors In-Reply-To: References: <855B7267-E3F9-43CC-9EBE-96276E9A8E42@petsc.dev> <9F227CDE-815C-4446-A424-0BBDA38B1E79@joliv.et> <186F2922-CD1B-4778-BDB3-8DA6EAB8D976@joliv.et> <7921468C-F4B9-4BC0-ADD1-2F84619E8A7E@joliv.et> Message-ID: On Thu, Dec 21, 2023 at 5:54?AM Matthew Knepley wrote: > On Thu, Dec 21, 2023 at 6:46?AM Sreeram R Venkat > wrote: > >> Ok, I think the error I'm getting has something to do with how the >> multiple solves are being done in succession. I'll try to see if there's >> anything I'm doing wrong there. >> >> One question about the -pc_type lu -ksp_type preonly method: do you know >> which parts of the solve (factorization/triangular solves) are done on host >> and which are done on device? >> > > For SEQDENSE, I believe both the factorization and solve is on device. It > is hard to see, but I believe the dispatch code is here: > Yes, it is correct. > > > https://gitlab.com/petsc/petsc/-/blob/main/src/mat/impls/dense/seq/cupm/matseqdensecupm.hpp?ref_type=heads#L368 > > Thanks, > > Matt > > >> Thanks, >> Sreeram >> >> On Sat, Dec 16, 2023 at 10:56?PM Pierre Jolivet wrote: >> >>> Unfortunately, I am not able to reproduce such a failure with your input >>> matrix. >>> I?ve used ex79 that I linked previously and the system is properly >>> solved. >>> $ ./ex79 -pc_type hypre -ksp_type hpddm -ksp_hpddm_type cg >>> -ksp_converged_reason -ksp_view_mat ascii::ascii_info -ksp_view_rhs >>> ascii::ascii_info >>> Linear solve converged due to CONVERGED_RTOL iterations 6 >>> Mat Object: 1 MPI process >>> type: seqaijcusparse >>> rows=289, cols=289 >>> total: nonzeros=2401, allocated nonzeros=2401 >>> total number of mallocs used during MatSetValues calls=0 >>> not using I-node routines >>> Mat Object: 1 MPI process >>> type: seqdensecuda >>> rows=289, cols=10 >>> total: nonzeros=2890, allocated nonzeros=2890 >>> total number of mallocs used during MatSetValues calls=0 >>> >>> You mentioned in a subsequent email that you are interested in systems >>> with at most 1E6 unknowns, and up to 1E4 right-hand sides. 
>>> I?m not sure you can expect significant gains from using GPU for such >>> systems. >>> Probably, the fastest approach would indeed be -pc_type lu -ksp_type >>> preonly -ksp_matsolve_batch_size 100 or something, depending on the memory >>> available on your host. >>> >>> Thanks, >>> Pierre >>> >>> On 15 Dec 2023, at 9:52?PM, Sreeram R Venkat >>> wrote: >>> >>> Here are the ksp_view files. I set the options >>> -ksp_error_if_not_converged to try to get the vectors that caused the >>> error. I noticed that some of the KSPMatSolves converge while others don't. >>> In the code, the solves are called as: >>> >>> input vector v --> insert data of v into a dense mat --> KSPMatSolve() >>> --> MatMatMult() --> KSPMatSolve() --> insert data of dense mat into output >>> vector w -- output w >>> >>> The operator used in the KSP is a Laplacian-like operator, and the >>> MatMatMult is with a Mass Matrix. The whole thing is supposed to be a solve >>> with a biharmonic-like operator. I can also run it with only the first >>> KSPMatSolve (i.e. just a Laplacian-like operator). In that case, the KSP >>> reportedly converges after 0 iterations (see the next line), but this >>> causes problems in other parts of the code later on. >>> >>> I saw that sometimes the first KSPMatSolve "converges" after 0 >>> iterations due to CONVERGED_RTOL. Then, the second KSPMatSolve produces a >>> NaN/Inf. I tried setting ksp_min_it, but that didn't seem to do anything. >>> >>> I'll keep trying different options and also try to get the MWE made >>> (this KSPMatSolve is pretty performance critical for us). >>> >>> Thanks for all your help, >>> Sreeram >>> >>> On Fri, Dec 15, 2023 at 1:01?AM Pierre Jolivet wrote: >>> >>>> >>>> On 14 Dec 2023, at 11:45?PM, Sreeram R Venkat >>>> wrote: >>>> >>>> Thanks, I will try to create a minimal reproducible example. This may >>>> take me some time though, as I need to figure out how to extract only the >>>> relevant parts (the full program this solve is used in is getting quite >>>> complex). >>>> >>>> >>>> You could just do -ksp_view_mat binary:Amat.bin -ksp_view_pmat >>>> binary:Pmat.bin -ksp_view_rhs binary:rhs.bin and send me those three files >>>> (I?m guessing your are using double-precision scalars with 32-bit PetscInt). >>>> >>>> I'll also try out some of the BoomerAMG options to see if that helps. >>>> >>>> >>>> These should work (this is where all ?PCMatApply()-ready? PC are being >>>> tested): >>>> https://petsc.org/release/src/ksp/ksp/tutorials/ex79.c.html#line215 >>>> You can see it?s also testing PCHYPRE + KSPHPDDM on device (but not >>>> with HIP). >>>> I?m aware the performance should not be optimal (see your comment about >>>> host/device copies), I?ve money to hire someone to work on this but: a) I >>>> need to find the correct engineer/post-doc, b) I currently don?t have good >>>> use cases (of course, I could generate a synthetic benchmark, for science). >>>> So even if you send me the three Mat, a MWE would be appreciated if the >>>> KSPMatSolve() is performance-critical for you (see point b) from above). >>>> >>>> Thanks, >>>> Pierre >>>> >>>> Thanks, >>>> Sreeram >>>> >>>> On Thu, Dec 14, 2023, 1:12?PM Pierre Jolivet wrote: >>>> >>>>> >>>>> >>>>> On 14 Dec 2023, at 8:02?PM, Sreeram R Venkat >>>>> wrote: >>>>> >>>>> Hello Pierre, >>>>> >>>>> Thank you for your reply. I tried out the HPDDM CG as you said, and it >>>>> seems to be doing the batched solves, but the KSP is not converging due to >>>>> a NaN or Inf being generated. 
I also noticed there are a lot of >>>>> host-to-device and device-to-host copies of the matrices (the non-batched >>>>> KSP solve did not have any memcopies). I have attached dump.0 again. Could >>>>> you please take a look? >>>>> >>>>> >>>>> Yes, but you?d need to send me something I can run with your set of >>>>> options (if you are more confident doing this in private, you can remove >>>>> the list from c/c). >>>>> Not all BoomerAMG smoothers handle blocks of right-hand sides, and >>>>> there is not much error checking, so instead of erroring out, this may be >>>>> the reason why you are getting garbage. >>>>> >>>>> Thanks, >>>>> Pierre >>>>> >>>>> Thanks, >>>>> Sreeram >>>>> >>>>> On Thu, Dec 14, 2023 at 12:42?AM Pierre Jolivet >>>>> wrote: >>>>> >>>>>> Hello Sreeram, >>>>>> KSPCG (PETSc implementation of CG) does not handle solves with >>>>>> multiple columns at once. >>>>>> There is only a single native PETSc KSP implementation which handles >>>>>> solves with multiple columns at once: KSPPREONLY. >>>>>> If you use --download-hpddm, you can use a CG (or GMRES, or more >>>>>> advanced methods) implementation which handles solves with multiple columns >>>>>> at once (via -ksp_type hpddm -ksp_hpddm_type cg or KSPSetType(ksp, >>>>>> KSPHPDDM); KSPHPDDMSetType(ksp, KSP_HPDDM_TYPE_CG);). >>>>>> I?m the main author of HPDDM, there is preliminary support for device >>>>>> matrices, but if it?s not working as intended/not faster than column by >>>>>> column, I?d be happy to have a deeper look (maybe in private), because most >>>>>> (if not all) of my users interested in (pseudo-)block Krylov solvers (i.e., >>>>>> solvers that treat right-hand sides in a single go) are using plain host >>>>>> matrices. >>>>>> >>>>>> Thanks, >>>>>> Pierre >>>>>> >>>>>> PS: you could have a look at >>>>>> https://www.sciencedirect.com/science/article/abs/pii/S0898122121000055 to >>>>>> understand the philosophy behind block iterative methods in PETSc (and in >>>>>> HPDDM), src/mat/tests/ex237.c, the benchmark I mentioned earlier, was >>>>>> developed in the context of this paper to produce Figures 2-3. Note that >>>>>> this paper is now slightly outdated, since then, PCHYPRE and PCMG (among >>>>>> others) have been made ?PCMatApply()-ready?. >>>>>> >>>>>> On 13 Dec 2023, at 11:05?PM, Sreeram R Venkat >>>>>> wrote: >>>>>> >>>>>> Hello Pierre, >>>>>> >>>>>> I am trying out the KSPMatSolve with the BoomerAMG preconditioner. >>>>>> However, I am noticing that it is still solving column by column (this is >>>>>> stated explicitly in the info dump attached). I looked at the code for >>>>>> KSPMatSolve_Private() and saw that as long as ksp->ops->matsolve is >>>>>> true, it should do the batched solve, though I'm not sure where that gets >>>>>> set. >>>>>> >>>>>> I am using the options -pc_type hypre -pc_hypre_type boomeramg when >>>>>> running the code. >>>>>> >>>>>> Can you please help me with this? >>>>>> >>>>>> Thanks, >>>>>> Sreeram >>>>>> >>>>>> >>>>>> On Thu, Dec 7, 2023 at 4:04?PM Mark Adams wrote: >>>>>> >>>>>>> N.B., AMGX interface is a bit experimental. >>>>>>> Mark >>>>>>> >>>>>>> On Thu, Dec 7, 2023 at 4:11?PM Sreeram R Venkat >>>>>>> wrote: >>>>>>> >>>>>>>> Oh, in that case I will try out BoomerAMG. Getting AMGX to build >>>>>>>> correctly was also tricky so hopefully the HYPRE build will be easier. 
>>>>>>>> >>>>>>>> Thanks, >>>>>>>> Sreeram >>>>>>>> >>>>>>>> On Thu, Dec 7, 2023, 3:03 PM Pierre Jolivet >>>>>>>> wrote: >>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On 7 Dec 2023, at 9:37?PM, Sreeram R Venkat >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>> Thank you Barry and Pierre; I will proceed with the first option. >>>>>>>>> >>>>>>>>> I want to use the AMGX preconditioner for the KSP. I will try it >>>>>>>>> out and see how it performs. >>>>>>>>> >>>>>>>>> >>>>>>>>> Just FYI, AMGX does not handle systems with multiple RHS, and thus >>>>>>>>> has no PCMatApply() implementation. >>>>>>>>> BoomerAMG does, and there is a PCMatApply_HYPRE_BoomerAMG() >>>>>>>>> implementation. >>>>>>>>> But let us know if you need assistance figuring things out. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Pierre >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Sreeram >>>>>>>>> >>>>>>>>> On Thu, Dec 7, 2023 at 2:02?PM Pierre Jolivet >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> To expand on Barry?s answer, we have observed repeatedly that >>>>>>>>>> MatMatMult with MatAIJ performs better than MatMult with MatMAIJ, you can >>>>>>>>>> reproduce this on your own with >>>>>>>>>> https://petsc.org/release/src/mat/tests/ex237.c.html. >>>>>>>>>> Also, I?m guessing you are using some sort of preconditioner >>>>>>>>>> within your KSP. >>>>>>>>>> Not all are ?KSPMatSolve-ready?, i.e., they may treat blocks of >>>>>>>>>> right-hand sides column by column, which is very inefficient. >>>>>>>>>> You could run your code with -info dump and send us dump.0 to see >>>>>>>>>> what needs to be done on our end to make things more efficient, should you >>>>>>>>>> not be satisfied with the current performance of the code. >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Pierre >>>>>>>>>> >>>>>>>>>> On 7 Dec 2023, at 8:34?PM, Barry Smith wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Dec 7, 2023, at 1:17?PM, Sreeram R Venkat >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> I have 2 sequential matrices M and R (both MATSEQAIJCUSPARSE of >>>>>>>>>> size n x n) and a vector v of size n*m. v = [v_1 , v_2 ,... , v_m] where >>>>>>>>>> v_i has size n. The data for v can be stored either in column-major or >>>>>>>>>> row-major order. Now, I want to do 2 types of operations: >>>>>>>>>> >>>>>>>>>> 1. Matvecs of the form M*v_i = w_i, for i = 1..m. >>>>>>>>>> 2. KSPSolves of the form R*x_i = v_i, for i = 1..m. >>>>>>>>>> >>>>>>>>>> From what I have read on the documentation, I can think of 2 >>>>>>>>>> approaches. >>>>>>>>>> >>>>>>>>>> 1. Get the pointer to the data in v (column-major) and use it to >>>>>>>>>> create a dense matrix V. Then do a MatMatMult with M*V = W, and take the >>>>>>>>>> data pointer of W to create the vector w. For KSPSolves, use KSPMatSolve >>>>>>>>>> with R and V. >>>>>>>>>> >>>>>>>>>> 2. Create a MATMAIJ using M/R and use that for matvecs directly >>>>>>>>>> with the vector v. I don't know if KSPSolve with the MATMAIJ will know that >>>>>>>>>> it is a multiple RHS system and act accordingly. >>>>>>>>>> >>>>>>>>>> Which would be the more efficient option? >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Use 1. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> As a side-note, I am also wondering if there is a way to use >>>>>>>>>> row-major storage of the vector v. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> No >>>>>>>>>> >>>>>>>>>> The reason is that this could allow for more coalesced memory >>>>>>>>>> access when doing matvecs. 
>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> PETSc matrix-vector products use BLAS GMEV matrix-vector >>>>>>>>>> products for the computation so in theory they should already be >>>>>>>>>> well-optimized >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Sreeram >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>> >>>>>> >>>>>> >>>>> >>>>> >>>>> >>>> >>> >>> >>> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre at joliv.et Thu Dec 21 14:25:45 2023 From: pierre at joliv.et (Pierre Jolivet) Date: Thu, 21 Dec 2023 21:25:45 +0100 Subject: [petsc-users] Matvecs and KSPSolves with multiple vectors In-Reply-To: References: <855B7267-E3F9-43CC-9EBE-96276E9A8E42@petsc.dev> <9F227CDE-815C-4446-A424-0BBDA38B1E79@joliv.et> <186F2922-CD1B-4778-BDB3-8DA6EAB8D976@joliv.et> <7921468C-F4B9-4BC0-ADD1-2F84619E8A7E@joliv.et> Message-ID: > On 21 Dec 2023, at 7:38?PM, Junchao Zhang wrote: > > > > > On Thu, Dec 21, 2023 at 5:54?AM Matthew Knepley > wrote: >> On Thu, Dec 21, 2023 at 6:46?AM Sreeram R Venkat > wrote: >>> Ok, I think the error I'm getting has something to do with how the multiple solves are being done in succession. I'll try to see if there's anything I'm doing wrong there. >>> >>> One question about the -pc_type lu -ksp_type preonly method: do you know which parts of the solve (factorization/triangular solves) are done on host and which are done on device? >> >> For SEQDENSE, I believe both the factorization and solve is on device. It is hard to see, but I believe the dispatch code is here: > Yes, it is correct. But Sreeram matrix is sparse, so this does not really matter. Sreeram, I don?t enough about the internals of CHOLMOD (and its interface in PETSc) to know which part is done on host and which part is done on device. By the way, you mentioned a very high number of right-hand sides (> 1E4) for a moderately-sized problem (~ 1E6). Is there a particular reason why you need so many of them? Have you considered doing some sort of deflation to reduce the number of solves? Thanks, Pierre >> >> https://gitlab.com/petsc/petsc/-/blob/main/src/mat/impls/dense/seq/cupm/matseqdensecupm.hpp?ref_type=heads#L368 >> >> Thanks, >> >> Matt >> >>> Thanks, >>> Sreeram >>> >>> On Sat, Dec 16, 2023 at 10:56?PM Pierre Jolivet > wrote: >>>> Unfortunately, I am not able to reproduce such a failure with your input matrix. >>>> I?ve used ex79 that I linked previously and the system is properly solved. >>>> $ ./ex79 -pc_type hypre -ksp_type hpddm -ksp_hpddm_type cg -ksp_converged_reason -ksp_view_mat ascii::ascii_info -ksp_view_rhs ascii::ascii_info >>>> Linear solve converged due to CONVERGED_RTOL iterations 6 >>>> Mat Object: 1 MPI process >>>> type: seqaijcusparse >>>> rows=289, cols=289 >>>> total: nonzeros=2401, allocated nonzeros=2401 >>>> total number of mallocs used during MatSetValues calls=0 >>>> not using I-node routines >>>> Mat Object: 1 MPI process >>>> type: seqdensecuda >>>> rows=289, cols=10 >>>> total: nonzeros=2890, allocated nonzeros=2890 >>>> total number of mallocs used during MatSetValues calls=0 >>>> >>>> You mentioned in a subsequent email that you are interested in systems with at most 1E6 unknowns, and up to 1E4 right-hand sides. >>>> I?m not sure you can expect significant gains from using GPU for such systems. 
>>>> Probably, the fastest approach would indeed be -pc_type lu -ksp_type preonly -ksp_matsolve_batch_size 100 or something, depending on the memory available on your host. >>>> >>>> Thanks, >>>> Pierre >>>> >>>>> On 15 Dec 2023, at 9:52?PM, Sreeram R Venkat > wrote: >>>>> >>>>> Here are the ksp_view files. I set the options -ksp_error_if_not_converged to try to get the vectors that caused the error. I noticed that some of the KSPMatSolves converge while others don't. In the code, the solves are called as: >>>>> >>>>> input vector v --> insert data of v into a dense mat --> KSPMatSolve() --> MatMatMult() --> KSPMatSolve() --> insert data of dense mat into output vector w -- output w >>>>> >>>>> The operator used in the KSP is a Laplacian-like operator, and the MatMatMult is with a Mass Matrix. The whole thing is supposed to be a solve with a biharmonic-like operator. I can also run it with only the first KSPMatSolve (i.e. just a Laplacian-like operator). In that case, the KSP reportedly converges after 0 iterations (see the next line), but this causes problems in other parts of the code later on. >>>>> >>>>> I saw that sometimes the first KSPMatSolve "converges" after 0 iterations due to CONVERGED_RTOL. Then, the second KSPMatSolve produces a NaN/Inf. I tried setting ksp_min_it, but that didn't seem to do anything. >>>>> >>>>> I'll keep trying different options and also try to get the MWE made (this KSPMatSolve is pretty performance critical for us). >>>>> >>>>> Thanks for all your help, >>>>> Sreeram >>>>> >>>>> On Fri, Dec 15, 2023 at 1:01?AM Pierre Jolivet > wrote: >>>>>> >>>>>>> On 14 Dec 2023, at 11:45?PM, Sreeram R Venkat > wrote: >>>>>>> >>>>>>> Thanks, I will try to create a minimal reproducible example. This may take me some time though, as I need to figure out how to extract only the relevant parts (the full program this solve is used in is getting quite complex). >>>>>> >>>>>> You could just do -ksp_view_mat binary:Amat.bin -ksp_view_pmat binary:Pmat.bin -ksp_view_rhs binary:rhs.bin and send me those three files (I?m guessing your are using double-precision scalars with 32-bit PetscInt). >>>>>> >>>>>>> I'll also try out some of the BoomerAMG options to see if that helps. >>>>>> >>>>>> These should work (this is where all ?PCMatApply()-ready? PC are being tested): https://petsc.org/release/src/ksp/ksp/tutorials/ex79.c.html#line215 >>>>>> You can see it?s also testing PCHYPRE + KSPHPDDM on device (but not with HIP). >>>>>> I?m aware the performance should not be optimal (see your comment about host/device copies), I?ve money to hire someone to work on this but: a) I need to find the correct engineer/post-doc, b) I currently don?t have good use cases (of course, I could generate a synthetic benchmark, for science). >>>>>> So even if you send me the three Mat, a MWE would be appreciated if the KSPMatSolve() is performance-critical for you (see point b) from above). >>>>>> >>>>>> Thanks, >>>>>> Pierre >>>>>> >>>>>>> Thanks, >>>>>>> Sreeram >>>>>>> >>>>>>> On Thu, Dec 14, 2023, 1:12?PM Pierre Jolivet > wrote: >>>>>>>> >>>>>>>> >>>>>>>>> On 14 Dec 2023, at 8:02?PM, Sreeram R Venkat > wrote: >>>>>>>>> >>>>>>>>> Hello Pierre, >>>>>>>>> >>>>>>>>> Thank you for your reply. I tried out the HPDDM CG as you said, and it seems to be doing the batched solves, but the KSP is not converging due to a NaN or Inf being generated. I also noticed there are a lot of host-to-device and device-to-host copies of the matrices (the non-batched KSP solve did not have any memcopies). 
I have attached dump.0 again. Could you please take a look? >>>>>>>> >>>>>>>> Yes, but you?d need to send me something I can run with your set of options (if you are more confident doing this in private, you can remove the list from c/c). >>>>>>>> Not all BoomerAMG smoothers handle blocks of right-hand sides, and there is not much error checking, so instead of erroring out, this may be the reason why you are getting garbage. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Pierre >>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Sreeram >>>>>>>>> >>>>>>>>> On Thu, Dec 14, 2023 at 12:42?AM Pierre Jolivet > wrote: >>>>>>>>>> Hello Sreeram, >>>>>>>>>> KSPCG (PETSc implementation of CG) does not handle solves with multiple columns at once. >>>>>>>>>> There is only a single native PETSc KSP implementation which handles solves with multiple columns at once: KSPPREONLY. >>>>>>>>>> If you use --download-hpddm, you can use a CG (or GMRES, or more advanced methods) implementation which handles solves with multiple columns at once (via -ksp_type hpddm -ksp_hpddm_type cg or KSPSetType(ksp, KSPHPDDM); KSPHPDDMSetType(ksp, KSP_HPDDM_TYPE_CG);). >>>>>>>>>> I?m the main author of HPDDM, there is preliminary support for device matrices, but if it?s not working as intended/not faster than column by column, I?d be happy to have a deeper look (maybe in private), because most (if not all) of my users interested in (pseudo-)block Krylov solvers (i.e., solvers that treat right-hand sides in a single go) are using plain host matrices. >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Pierre >>>>>>>>>> >>>>>>>>>> PS: you could have a look at https://www.sciencedirect.com/science/article/abs/pii/S0898122121000055 to understand the philosophy behind block iterative methods in PETSc (and in HPDDM), src/mat/tests/ex237.c, the benchmark I mentioned earlier, was developed in the context of this paper to produce Figures 2-3. Note that this paper is now slightly outdated, since then, PCHYPRE and PCMG (among others) have been made ?PCMatApply()-ready?. >>>>>>>>>> >>>>>>>>>>> On 13 Dec 2023, at 11:05?PM, Sreeram R Venkat > wrote: >>>>>>>>>>> >>>>>>>>>>> Hello Pierre, >>>>>>>>>>> >>>>>>>>>>> I am trying out the KSPMatSolve with the BoomerAMG preconditioner. However, I am noticing that it is still solving column by column (this is stated explicitly in the info dump attached). I looked at the code for KSPMatSolve_Private() and saw that as long as ksp->ops->matsolve is true, it should do the batched solve, though I'm not sure where that gets set. >>>>>>>>>>> >>>>>>>>>>> I am using the options -pc_type hypre -pc_hypre_type boomeramg when running the code. >>>>>>>>>>> >>>>>>>>>>> Can you please help me with this? >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Sreeram >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Thu, Dec 7, 2023 at 4:04?PM Mark Adams > wrote: >>>>>>>>>>>> N.B., AMGX interface is a bit experimental. >>>>>>>>>>>> Mark >>>>>>>>>>>> >>>>>>>>>>>> On Thu, Dec 7, 2023 at 4:11?PM Sreeram R Venkat > wrote: >>>>>>>>>>>>> Oh, in that case I will try out BoomerAMG. Getting AMGX to build correctly was also tricky so hopefully the HYPRE build will be easier. >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> Sreeram >>>>>>>>>>>>> >>>>>>>>>>>>> On Thu, Dec 7, 2023, 3:03 PM Pierre Jolivet > wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 7 Dec 2023, at 9:37?PM, Sreeram R Venkat > wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thank you Barry and Pierre; I will proceed with the first option. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I want to use the AMGX preconditioner for the KSP. 
I will try it out and see how it performs. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Just FYI, AMGX does not handle systems with multiple RHS, and thus has no PCMatApply() implementation. >>>>>>>>>>>>>> BoomerAMG does, and there is a PCMatApply_HYPRE_BoomerAMG() implementation. >>>>>>>>>>>>>> But let us know if you need assistance figuring things out. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> Pierre >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> Sreeram >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Thu, Dec 7, 2023 at 2:02?PM Pierre Jolivet > wrote: >>>>>>>>>>>>>>>> To expand on Barry?s answer, we have observed repeatedly that MatMatMult with MatAIJ performs better than MatMult with MatMAIJ, you can reproduce this on your own with https://petsc.org/release/src/mat/tests/ex237.c.html. >>>>>>>>>>>>>>>> Also, I?m guessing you are using some sort of preconditioner within your KSP. >>>>>>>>>>>>>>>> Not all are ?KSPMatSolve-ready?, i.e., they may treat blocks of right-hand sides column by column, which is very inefficient. >>>>>>>>>>>>>>>> You could run your code with -info dump and send us dump.0 to see what needs to be done on our end to make things more efficient, should you not be satisfied with the current performance of the code. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> Pierre >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On 7 Dec 2023, at 8:34?PM, Barry Smith > wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On Dec 7, 2023, at 1:17?PM, Sreeram R Venkat > wrote: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I have 2 sequential matrices M and R (both MATSEQAIJCUSPARSE of size n x n) and a vector v of size n*m. v = [v_1 , v_2 ,... , v_m] where v_i has size n. The data for v can be stored either in column-major or row-major order. Now, I want to do 2 types of operations: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> 1. Matvecs of the form M*v_i = w_i, for i = 1..m. >>>>>>>>>>>>>>>>>> 2. KSPSolves of the form R*x_i = v_i, for i = 1..m. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> From what I have read on the documentation, I can think of 2 approaches. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> 1. Get the pointer to the data in v (column-major) and use it to create a dense matrix V. Then do a MatMatMult with M*V = W, and take the data pointer of W to create the vector w. For KSPSolves, use KSPMatSolve with R and V. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> 2. Create a MATMAIJ using M/R and use that for matvecs directly with the vector v. I don't know if KSPSolve with the MATMAIJ will know that it is a multiple RHS system and act accordingly. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Which would be the more efficient option? >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Use 1. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> As a side-note, I am also wondering if there is a way to use row-major storage of the vector v. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> No >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> The reason is that this could allow for more coalesced memory access when doing matvecs. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> PETSc matrix-vector products use BLAS GMEV matrix-vector products for the computation so in theory they should already be well-optimized >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>> Sreeram >>>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>> >>>>> >>>> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
>> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Thu Dec 21 16:32:43 2023 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 21 Dec 2023 17:32:43 -0500 Subject: [petsc-users] TS docs wrong URLs in Examples In-Reply-To: References: Message-ID: <2AC654F7-1119-4696-BC92-D0BDF271661E@petsc.dev> https://gitlab.com/petsc/petsc/-/merge_requests/7135 Regex processing is not ideal for this task; I've modified the code to remove most false positive finds. Thanks for reporting the problem, Barry > On Dec 21, 2023, at 8:35?AM, Niclas G?tting wrote: > > Hi all, > > I noticed that all links to the examples under https://petsc.org/release/manualpages/TS/TS/ point to the wrong URL. Instead of src/ts/**/*, they point to src/sys/**/*, which does not seem to be right. This definitely is a minor issue, but I couldn't see an obvious fix via "Edit this page", so here is the e-mail. > > Best regards > Niclas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From srvenkat at utexas.edu Thu Dec 21 18:44:24 2023 From: srvenkat at utexas.edu (Sreeram R Venkat) Date: Fri, 22 Dec 2023 06:14:24 +0530 Subject: [petsc-users] Matvecs and KSPSolves with multiple vectors In-Reply-To: References: <855B7267-E3F9-43CC-9EBE-96276E9A8E42@petsc.dev> <9F227CDE-815C-4446-A424-0BBDA38B1E79@joliv.et> <186F2922-CD1B-4778-BDB3-8DA6EAB8D976@joliv.et> <7921468C-F4B9-4BC0-ADD1-2F84619E8A7E@joliv.et> Message-ID: The reason for the large number of RHS's is that the problem originally comes from having to do one solve with 1e10 size matrix. If we make some extra assumptions on our problem, that 1e10 matrix becomes block diagonal with blocks of size 1e6 and all the blocks are the same. The 1e6 is a spatial discretion dimension and the 1e4 is the number of time steps. So that's why we wanted to do the batched solve like this. Thanks, Sreeram On Fri, Dec 22, 2023, 1:56?AM Pierre Jolivet wrote: > > > On 21 Dec 2023, at 7:38?PM, Junchao Zhang wrote: > > > > > On Thu, Dec 21, 2023 at 5:54?AM Matthew Knepley wrote: > >> On Thu, Dec 21, 2023 at 6:46?AM Sreeram R Venkat >> wrote: >> >>> Ok, I think the error I'm getting has something to do with how the >>> multiple solves are being done in succession. I'll try to see if there's >>> anything I'm doing wrong there. >>> >>> One question about the -pc_type lu -ksp_type preonly method: do you know >>> which parts of the solve (factorization/triangular solves) are done on host >>> and which are done on device? >>> >> >> For SEQDENSE, I believe both the factorization and solve is on device. It >> is hard to see, but I believe the dispatch code is here: >> > Yes, it is correct. > > > But Sreeram matrix is sparse, so this does not really matter. > Sreeram, I don?t enough about the internals of CHOLMOD (and its interface > in PETSc) to know which part is done on host and which part is done on > device. > By the way, you mentioned a very high number of right-hand sides (> 1E4) > for a moderately-sized problem (~ 1E6). > Is there a particular reason why you need so many of them? > Have you considered doing some sort of deflation to reduce the number of > solves? 
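For completeness, the direct-solver route suggested earlier in the thread (-pc_type lu with -ksp_type preonly and -ksp_matsolve_batch_size) can be sketched in C as below; R stands for the repeated 1e6-size block and B/X for dense blocks of right-hand sides and solutions, all of which are assumptions for illustration rather than code from the original application:

#include <petscksp.h>

/* Sketch of the batched direct solve: factor R once, then reuse the factorization
   for every batch of columns passed to KSPMatSolve(). */
static PetscErrorCode BatchedDirectSolve(Mat R, Mat B, Mat X)
{
  KSP ksp;
  PC  pc;

  PetscFunctionBeginUser;
  PetscCall(KSPCreate(PetscObjectComm((PetscObject)R), &ksp));
  PetscCall(KSPSetOperators(ksp, R, R));
  PetscCall(KSPSetType(ksp, KSPPREONLY)); /* apply the preconditioner exactly once */
  PetscCall(KSPGetPC(ksp, &pc));
  PetscCall(PCSetType(pc, PCLU));         /* exact factorization, computed a single time */
  PetscCall(KSPSetFromOptions(ksp));      /* e.g. -ksp_matsolve_batch_size 100 */
  PetscCall(KSPMatSolve(ksp, B, X));      /* columns of X solve R x_i = b_i */
  PetscCall(KSPDestroy(&ksp));
  PetscFunctionReturn(PETSC_SUCCESS);
}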
> > Thanks, > Pierre > > >> >> https://gitlab.com/petsc/petsc/-/blob/main/src/mat/impls/dense/seq/cupm/matseqdensecupm.hpp?ref_type=heads#L368 >> >> Thanks, >> >> Matt >> >> >>> Thanks, >>> Sreeram >>> >>> On Sat, Dec 16, 2023 at 10:56?PM Pierre Jolivet wrote: >>> >>>> Unfortunately, I am not able to reproduce such a failure with your >>>> input matrix. >>>> I?ve used ex79 that I linked previously and the system is properly >>>> solved. >>>> $ ./ex79 -pc_type hypre -ksp_type hpddm -ksp_hpddm_type cg >>>> -ksp_converged_reason -ksp_view_mat ascii::ascii_info -ksp_view_rhs >>>> ascii::ascii_info >>>> Linear solve converged due to CONVERGED_RTOL iterations 6 >>>> Mat Object: 1 MPI process >>>> type: seqaijcusparse >>>> rows=289, cols=289 >>>> total: nonzeros=2401, allocated nonzeros=2401 >>>> total number of mallocs used during MatSetValues calls=0 >>>> not using I-node routines >>>> Mat Object: 1 MPI process >>>> type: seqdensecuda >>>> rows=289, cols=10 >>>> total: nonzeros=2890, allocated nonzeros=2890 >>>> total number of mallocs used during MatSetValues calls=0 >>>> >>>> You mentioned in a subsequent email that you are interested in systems >>>> with at most 1E6 unknowns, and up to 1E4 right-hand sides. >>>> I?m not sure you can expect significant gains from using GPU for such >>>> systems. >>>> Probably, the fastest approach would indeed be -pc_type lu -ksp_type >>>> preonly -ksp_matsolve_batch_size 100 or something, depending on the memory >>>> available on your host. >>>> >>>> Thanks, >>>> Pierre >>>> >>>> On 15 Dec 2023, at 9:52?PM, Sreeram R Venkat >>>> wrote: >>>> >>>> Here are the ksp_view files. I set the options >>>> -ksp_error_if_not_converged to try to get the vectors that caused the >>>> error. I noticed that some of the KSPMatSolves converge while others don't. >>>> In the code, the solves are called as: >>>> >>>> input vector v --> insert data of v into a dense mat --> KSPMatSolve() >>>> --> MatMatMult() --> KSPMatSolve() --> insert data of dense mat into output >>>> vector w -- output w >>>> >>>> The operator used in the KSP is a Laplacian-like operator, and the >>>> MatMatMult is with a Mass Matrix. The whole thing is supposed to be a solve >>>> with a biharmonic-like operator. I can also run it with only the first >>>> KSPMatSolve (i.e. just a Laplacian-like operator). In that case, the KSP >>>> reportedly converges after 0 iterations (see the next line), but this >>>> causes problems in other parts of the code later on. >>>> >>>> I saw that sometimes the first KSPMatSolve "converges" after 0 >>>> iterations due to CONVERGED_RTOL. Then, the second KSPMatSolve produces a >>>> NaN/Inf. I tried setting ksp_min_it, but that didn't seem to do anything. >>>> >>>> I'll keep trying different options and also try to get the MWE made >>>> (this KSPMatSolve is pretty performance critical for us). >>>> >>>> Thanks for all your help, >>>> Sreeram >>>> >>>> On Fri, Dec 15, 2023 at 1:01?AM Pierre Jolivet wrote: >>>> >>>>> >>>>> On 14 Dec 2023, at 11:45?PM, Sreeram R Venkat >>>>> wrote: >>>>> >>>>> Thanks, I will try to create a minimal reproducible example. This may >>>>> take me some time though, as I need to figure out how to extract only the >>>>> relevant parts (the full program this solve is used in is getting quite >>>>> complex). >>>>> >>>>> >>>>> You could just do -ksp_view_mat binary:Amat.bin -ksp_view_pmat >>>>> binary:Pmat.bin -ksp_view_rhs binary:rhs.bin and send me those three files >>>>> (I?m guessing your are using double-precision scalars with 32-bit PetscInt). 
>>>>> I'll also try out some of the BoomerAMG options to see if that helps.
>>>>>
>>>>> These should work (this is where all "PCMatApply()-ready" PCs are being tested):
>>>>> https://petsc.org/release/src/ksp/ksp/tutorials/ex79.c.html#line215
>>>>> You can see it's also testing PCHYPRE + KSPHPDDM on device (but not with HIP).
>>>>> I'm aware the performance should not be optimal (see your comment about host/device copies). I have money to hire someone to work on this, but: a) I need to find the right engineer/post-doc, and b) I currently don't have good use cases (of course, I could generate a synthetic benchmark, for science).
>>>>> So even if you send me the three Mat, an MWE would be appreciated if the KSPMatSolve() is performance-critical for you (see point b above).
>>>>>
>>>>> Thanks,
>>>>> Pierre
>>>>>
>>>>> Thanks,
>>>>> Sreeram
>>>>>
>>>>> On Thu, Dec 14, 2023, 1:12 PM Pierre Jolivet wrote:
>>>>>
>>>>>> On 14 Dec 2023, at 8:02 PM, Sreeram R Venkat wrote:
>>>>>>
>>>>>> Hello Pierre,
>>>>>>
>>>>>> Thank you for your reply. I tried out the HPDDM CG as you said, and it seems to be doing the batched solves, but the KSP is not converging due to a NaN or Inf being generated. I also noticed there are a lot of host-to-device and device-to-host copies of the matrices (the non-batched KSP solve did not have any memcopies). I have attached dump.0 again. Could you please take a look?
>>>>>>
>>>>>> Yes, but you'd need to send me something I can run with your set of options (if you are more comfortable doing this in private, you can remove the list from Cc).
>>>>>> Not all BoomerAMG smoothers handle blocks of right-hand sides, and there is not much error checking, so instead of erroring out, this may be the reason why you are getting garbage.
>>>>>>
>>>>>> Thanks,
>>>>>> Pierre
>>>>>>
>>>>>> Thanks,
>>>>>> Sreeram
>>>>>>
>>>>>> On Thu, Dec 14, 2023 at 12:42 AM Pierre Jolivet wrote:
>>>>>>>
>>>>>>> Hello Sreeram,
>>>>>>> KSPCG (the PETSc implementation of CG) does not handle solves with multiple columns at once.
>>>>>>> There is only a single native PETSc KSP implementation which handles solves with multiple columns at once: KSPPREONLY.
>>>>>>> If you use --download-hpddm, you can use a CG (or GMRES, or more advanced methods) implementation which handles solves with multiple columns at once (via -ksp_type hpddm -ksp_hpddm_type cg or KSPSetType(ksp, KSPHPDDM); KSPHPDDMSetType(ksp, KSP_HPDDM_TYPE_CG);).
>>>>>>> I'm the main author of HPDDM. There is preliminary support for device matrices, but if it's not working as intended/not faster than column by column, I'd be happy to take a deeper look (maybe in private), because most (if not all) of my users interested in (pseudo-)block Krylov solvers (i.e., solvers that treat right-hand sides in a single go) are using plain host matrices.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Pierre
>>>>>>>
>>>>>>> PS: you could have a look at https://www.sciencedirect.com/science/article/abs/pii/S0898122121000055 to understand the philosophy behind block iterative methods in PETSc (and in HPDDM); src/mat/tests/ex237.c, the benchmark I mentioned earlier, was developed in the context of this paper to produce Figures 2-3.
>>>>>>> Note that this paper is now slightly outdated; since then, PCHYPRE and PCMG (among others) have been made "PCMatApply()-ready".
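As a small illustration of the options Pierre mentions, a hedged sketch of setting this up in code (rather than on the command line) could look like the following. It assumes a PETSc build configured with --download-hpddm and --download-hypre; the function name SetUpPseudoBlockCG is ours.

#include <petscksp.h>

static PetscErrorCode SetUpPseudoBlockCG(MPI_Comm comm, Mat A, KSP *ksp)
{
  PC pc;

  PetscFunctionBeginUser;
  PetscCall(KSPCreate(comm, ksp));
  PetscCall(KSPSetOperators(*ksp, A, A));
  PetscCall(KSPSetType(*ksp, KSPHPDDM));               /* Krylov solvers that can treat all RHS in one go */
  PetscCall(KSPHPDDMSetType(*ksp, KSP_HPDDM_TYPE_CG)); /* pseudo-block CG */
  PetscCall(KSPGetPC(*ksp, &pc));
  PetscCall(PCSetType(pc, PCHYPRE));
  PetscCall(PCHYPRESetType(pc, "boomeramg"));          /* BoomerAMG has a PCMatApply() implementation */
  PetscCall(KSPSetFromOptions(*ksp));                  /* keep command-line overrides possible */
  PetscFunctionReturn(PETSC_SUCCESS);
}

The command-line equivalent is the option set quoted in the thread: -ksp_type hpddm -ksp_hpddm_type cg -pc_type hypre -pc_hypre_type boomeramg.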
>>>>>>> On 13 Dec 2023, at 11:05 PM, Sreeram R Venkat wrote:
>>>>>>>
>>>>>>> Hello Pierre,
>>>>>>>
>>>>>>> I am trying out the KSPMatSolve with the BoomerAMG preconditioner. However, I am noticing that it is still solving column by column (this is stated explicitly in the info dump attached). I looked at the code for KSPMatSolve_Private() and saw that as long as ksp->ops->matsolve is set, it should do the batched solve, though I'm not sure where that gets set.
>>>>>>>
>>>>>>> I am using the options -pc_type hypre -pc_hypre_type boomeramg when running the code.
>>>>>>>
>>>>>>> Can you please help me with this?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Sreeram
>>>>>>>
>>>>>>> On Thu, Dec 7, 2023 at 4:04 PM Mark Adams wrote:
>>>>>>>>
>>>>>>>> N.B., the AMGX interface is a bit experimental.
>>>>>>>> Mark
>>>>>>>>
>>>>>>>> On Thu, Dec 7, 2023 at 4:11 PM Sreeram R Venkat <srvenkat at utexas.edu> wrote:
>>>>>>>>>
>>>>>>>>> Oh, in that case I will try out BoomerAMG. Getting AMGX to build correctly was also tricky, so hopefully the HYPRE build will be easier.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Sreeram
>>>>>>>>>
>>>>>>>>> On Thu, Dec 7, 2023, 3:03 PM Pierre Jolivet wrote:
>>>>>>>>>>
>>>>>>>>>> On 7 Dec 2023, at 9:37 PM, Sreeram R Venkat wrote:
>>>>>>>>>>
>>>>>>>>>> Thank you, Barry and Pierre; I will proceed with the first option.
>>>>>>>>>>
>>>>>>>>>> I want to use the AMGX preconditioner for the KSP. I will try it out and see how it performs.
>>>>>>>>>>
>>>>>>>>>> Just FYI, AMGX does not handle systems with multiple RHS, and thus has no PCMatApply() implementation.
>>>>>>>>>> BoomerAMG does, and there is a PCMatApply_HYPRE_BoomerAMG() implementation.
>>>>>>>>>> But let us know if you need assistance figuring things out.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Pierre
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Sreeram
>>>>>>>>>>
>>>>>>>>>> On Thu, Dec 7, 2023 at 2:02 PM Pierre Jolivet wrote:
>>>>>>>>>>>
>>>>>>>>>>> To expand on Barry's answer, we have observed repeatedly that MatMatMult with MatAIJ performs better than MatMult with MatMAIJ; you can reproduce this on your own with https://petsc.org/release/src/mat/tests/ex237.c.html.
>>>>>>>>>>> Also, I'm guessing you are using some sort of preconditioner within your KSP.
>>>>>>>>>>> Not all are "KSPMatSolve-ready", i.e., they may treat blocks of right-hand sides column by column, which is very inefficient.
>>>>>>>>>>> You could run your code with -info dump and send us dump.0 to see what needs to be done on our end to make things more efficient, should you not be satisfied with the current performance of the code.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Pierre
>>>>>>>>>>>
>>>>>>>>>>> On 7 Dec 2023, at 8:34 PM, Barry Smith wrote:
>>>>>>>>>>>
>>>>>>>>>>> On Dec 7, 2023, at 1:17 PM, Sreeram R Venkat <srvenkat at utexas.edu> wrote:
>>>>>>>>>>>
>>>>>>>>>>> I have 2 sequential matrices M and R (both MATSEQAIJCUSPARSE of size n x n) and a vector v of size n*m, v = [v_1, v_2, ..., v_m], where each v_i has size n. The data for v can be stored either in column-major or row-major order.
>>>>>>>>>>> Now, I want to do 2 types of operations:
>>>>>>>>>>>
>>>>>>>>>>> 1. Matvecs of the form M*v_i = w_i, for i = 1..m.
>>>>>>>>>>> 2. KSPSolves of the form R*x_i = v_i, for i = 1..m.
>>>>>>>>>>>
>>>>>>>>>>> From what I have read in the documentation, I can think of 2 approaches.
>>>>>>>>>>>
>>>>>>>>>>> 1. Get the pointer to the data in v (column-major) and use it to create a dense matrix V. Then do a MatMatMult with M*V = W, and take the data pointer of W to create the vector w. For KSPSolves, use KSPMatSolve with R and V.
>>>>>>>>>>>
>>>>>>>>>>> 2. Create a MATMAIJ using M/R and use that for matvecs directly with the vector v. I don't know if KSPSolve with the MATMAIJ will know that it is a multiple-RHS system and act accordingly.
>>>>>>>>>>>
>>>>>>>>>>> Which would be the more efficient option?
>>>>>>>>>>>
>>>>>>>>>>> Use 1.
>>>>>>>>>>>
>>>>>>>>>>> As a side note, I am also wondering if there is a way to use row-major storage of the vector v.
>>>>>>>>>>>
>>>>>>>>>>> No
>>>>>>>>>>>
>>>>>>>>>>> The reason is that this could allow for more coalesced memory access when doing matvecs.
>>>>>>>>>>>
>>>>>>>>>>> PETSc matrix-vector products use BLAS GEMV matrix-vector products for the computation, so in theory they should already be well-optimized.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Sreeram
>>
>> --
>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
>> -- Norbert Wiener
>>
>> https://www.cse.buffalo.edu/~knepley/
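To close the loop on the approach Barry recommends above (option 1), here is a compact sketch under the assumption that v stores v_1, ..., v_m contiguously (column-major), so that column i of the dense view V is exactly v_i. Host dense storage and the names ApproachOne, kspR, n, m are illustrative; kspR is assumed to be a KSP already set up with R (e.g., -ksp_type preonly -pc_type lu).

#include <petscksp.h>

static PetscErrorCode ApproachOne(Mat M, KSP kspR, PetscInt n, PetscInt m, Vec v)
{
  const PetscScalar *vdata;
  Mat                V, W, X;

  PetscFunctionBeginUser;
  PetscCall(VecGetArrayRead(v, &vdata));
  /* no copy: V is a dense view of v's column-major data and is only read below */
  PetscCall(MatCreateSeqDense(PETSC_COMM_SELF, n, m, (PetscScalar *)vdata, &V));
  /* operation 1: all m matvecs at once, W = M * V, so column i of W is w_i */
  PetscCall(MatMatMult(M, V, MAT_INITIAL_MATRIX, PETSC_DEFAULT, &W));
  /* operation 2: all m solves at once, R X = V, so column i of X is x_i */
  PetscCall(MatCreateSeqDense(PETSC_COMM_SELF, n, m, NULL, &X));
  PetscCall(KSPMatSolve(kspR, V, X));
  /* ... use W and X, e.g., via MatDenseGetArrayRead(), then clean up ... */
  PetscCall(MatDestroy(&X));
  PetscCall(MatDestroy(&W));
  PetscCall(MatDestroy(&V));
  PetscCall(VecRestoreArrayRead(v, &vdata));
  PetscFunctionReturn(PETSC_SUCCESS);
}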