From jordi.manyer at monash.edu Mon Dec 4 08:44:41 2023
From: jordi.manyer at monash.edu (Jordi Manyer Fuertes)
Date: Tue, 5 Dec 2023 01:44:41 +1100
Subject: [petsc-users] Help for MatNullSpaceCreateRigidBody
Message-ID: <97c05c18-214b-49f6-85cf-c8f512a3b047@monash.edu>

Dear PETSc users/developers,

I am currently trying to use the method `MatNullSpaceCreateRigidBody` together with `PCGAMG` to efficiently precondition an elasticity solver in 2D/3D.

I have managed to make it work in serial (or with 1 MPI rank) with an h-independent number of iterations (which is great), but the solver diverges in parallel.

I assume it has to do with the coordinate vector I am building the null-space with not being correctly set up. The documentation is not that clear on which nodes exactly have to be set in each partition. Does it require nodes corresponding to owned dofs, or all dofs in each partition (owned+ghost)? What ghost layout should the `Vec` have?

Any other tips about what I might be doing wrong?

Thanks,

Jordi

From bsmith at petsc.dev Mon Dec 4 11:37:15 2023
From: bsmith at petsc.dev (Barry Smith)
Date: Mon, 4 Dec 2023 12:37:15 -0500
Subject: [petsc-users] Help for MatNullSpaceCreateRigidBody
In-Reply-To: <97c05c18-214b-49f6-85cf-c8f512a3b047@monash.edu>
References: <97c05c18-214b-49f6-85cf-c8f512a3b047@monash.edu>
Message-ID: <8CBF0F2E-3FE5-4E0D-8924-63CE130F7C6B@petsc.dev>

   To owned DOF. Any ghosting of the problem is not relevant since the null space created is purely a linear algebra thing that is related to the global vector (not local vectors).

   Barry

> On Dec 4, 2023, at 9:44 AM, Jordi Manyer Fuertes via petsc-users wrote:
>
> Dear PETSc users/developers,
>
> I am currently trying to use the method `MatNullSpaceCreateRigidBody` together with `PCGAMG` to efficiently precondition an elasticity solver in 2D/3D.
>
> I have managed to make it work in serial (or with 1 MPI rank) with an h-independent number of iterations (which is great), but the solver diverges in parallel.
>
> I assume it has to do with the coordinate vector I am building the null-space with not being correctly set up. The documentation is not that clear on which nodes exactly have to be set in each partition. Does it require nodes corresponding to owned dofs, or all dofs in each partition (owned+ghost)? What ghost layout should the `Vec` have?
>
> Any other tips about what I might be doing wrong?
>
> Thanks,
>
> Jordi
>

From knepley at gmail.com Mon Dec 4 11:46:52 2023
From: knepley at gmail.com (Matthew Knepley)
Date: Mon, 4 Dec 2023 12:46:52 -0500
Subject: [petsc-users] Help for MatNullSpaceCreateRigidBody
In-Reply-To: <97c05c18-214b-49f6-85cf-c8f512a3b047@monash.edu>
References: <97c05c18-214b-49f6-85cf-c8f512a3b047@monash.edu>
Message-ID:

On Mon, Dec 4, 2023 at 12:01 PM Jordi Manyer Fuertes via petsc-users <petsc-users at mcs.anl.gov> wrote:
> Dear PETSc users/developers,
>
> I am currently trying to use the method `MatNullSpaceCreateRigidBody` together with `PCGAMG` to efficiently precondition an elasticity solver in 2D/3D.
>
> I have managed to make it work in serial (or with 1 MPI rank) with an h-independent number of iterations (which is great), but the solver diverges in parallel.
>
> I assume it has to do with the coordinate vector I am building the null-space with not being correctly set up. The documentation is not that clear on which nodes exactly have to be set in each partition. Does it require nodes corresponding to owned dofs, or all dofs in each partition (owned+ghost)? What ghost layout should the `Vec` have?
>
> Any other tips about what I might be doing wrong?

What we assume is that you have some elastic problem formulated in primal unknowns (displacements) so that the solution vector looks like this:

  [ d^0_x d^0_y d^0_z d^1_x ..... ]

or whatever spatial dimension you have. We expect to get a global vector that looks like that, but instead of displacements, we get the coordinates that each displacement corresponds to. We make the generators of translations:

  [ 1 0 0 1 0 0 1 0 0 1 0 0... ]
  [ 0 1 0 0 1 0 0 1 0 0 1 0... ]
  [ 0 0 1 0 0 1 0 0 1 0 0 1... ]

for which we do not need the coordinates, and then the generators of rotations about each axis, for which we _do_ need the coordinates, since we need to know how much each point moves if you rotate about some center.

  Does that make sense?

    Thanks,

      Matt

> Thanks,
>
> Jordi

--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jordi.manyer at monash.edu Tue Dec 5 06:57:23 2023
From: jordi.manyer at monash.edu (Jordi Manyer Fuertes)
Date: Tue, 5 Dec 2023 23:57:23 +1100
Subject: [petsc-users] Help for MatNullSpaceCreateRigidBody
In-Reply-To: References: <97c05c18-214b-49f6-85cf-c8f512a3b047@monash.edu>
Message-ID: <95cc9291-9dc9-4499-a84b-7e4952a46b43@monash.edu>

Thanks for the prompt response. Both answers look like what I'm doing.

After playing a bit more with the solver, I managed to make it run in parallel with different boundary conditions (full Dirichlet BCs vs. mixed Neumann + Dirichlet). This raises two questions:

- How relevant are boundary conditions (eliminating Dirichlet rows/cols vs. weak Neumann BCs) to the solver? Should I modify something when changing boundary conditions?

- Also, the solver did well with the old BCs when run on a single processor (but not in parallel). This seems odd, since parallel and serial behavior should be consistent (or not?). Could it be the fault of the PCGAMG? I believe the default local solver is ILU; should I be changing it to LU or something else for this kind of problem?

Thank you both again,

Jordi

On 5/12/23 04:46, Matthew Knepley wrote:
> On Mon, Dec 4, 2023 at 12:01 PM Jordi Manyer Fuertes via petsc-users wrote:
>
> Dear PETSc users/developers,
>
> I am currently trying to use the method `MatNullSpaceCreateRigidBody` together with `PCGAMG` to efficiently precondition an elasticity solver in 2D/3D.
>
> I have managed to make it work in serial (or with 1 MPI rank) with an h-independent number of iterations (which is great), but the solver diverges in parallel.
>
> I assume it has to do with the coordinate vector I am building the null-space with not being correctly set up. The documentation is not that clear on which nodes exactly have to be set in each partition.
> Does it > require nodes corresponding to owned dofs, or all dofs in each > partition > (owned+ghost)? What ghost layout should the `Vec` have? > > Any other tips about what I might be doing wrong? > > > What we assume is that you have some elastic problem formulated in > primal unknowns (displacements) so that the solution vector looks like > this: > > ? [ d^0_x d^0_y d^0_z d^1_x ..... ] > > or whatever spatial dimension you have. We expect to get a global > vector that looks like that, but instead > of displacements, we get the coordinates that each displacement > corresponds to. We make the generators of translations: > > ? [ 1 0 0 1 0 0 1 0 0 1 0 0... ] > ? [ 0 1 0 0 1 0 0 1 0 0 1 0... ] > ? [ 0 0 1 0 0 1 0 0 1 0 0 1... ] > > for which we do not need the coordinates, and then the generators of > rotations about each axis, for which > we _do_ need the coordinates, since we need to know how much each > point moves if you rotate about some center. > > ? Does that make sense? > > ? ?Thanks, > > ? ? ? Matt > > Thanks, > > Jordi > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Dec 5 07:35:15 2023 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 5 Dec 2023 08:35:15 -0500 Subject: [petsc-users] Help for MatNullSpaceCreateRigidBody In-Reply-To: <95cc9291-9dc9-4499-a84b-7e4952a46b43@monash.edu> References: <97c05c18-214b-49f6-85cf-c8f512a3b047@monash.edu> <95cc9291-9dc9-4499-a84b-7e4952a46b43@monash.edu> Message-ID: On Tue, Dec 5, 2023 at 7:57?AM Jordi Manyer Fuertes wrote: > Thanks for the prompt response. Both answers look like what I'm doing. > > After playing a bit more with solver, I managed to make it run in parallel > with different boundary conditions (full dirichlet bcs, vs mixed newmann + > dirichlet). This raises two questions: > > - How relevant are boundary conditions (eliminating dirichlet rows/cols vs > weak newmann bcs) to the solver? Should I modify something when changing > boundary conditions? > > The rigid body kernel is independent of boundary conditions, and is only really important for the coarse grids. However, it is really easy to ruin a solve with inconsistent boundary conditions, or with conditions which cause a singularity at a change point. > - Also, the solver did well with the old bcs when run in a single > processor (but not in parallel). This seems odd, since parallel and serial > behavior should be consistent (or not?). Could it be fault of the PCGAMG? > This is unlikely. We have many parallel tests of elasticity (SNES ex17, ex56, ex77, etc). We do not see problems. It seems more likely that the system might not be assembled correctly in parallel. Did you check that the matrices match? > I believe the default local solver is ILU, shoud I be changing it to LU or > something else for these kind of problems? > Do you mean the smoother for AMG? No, the default is Chebyshev/Jacobi, which is the same in parallel. 
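Putting the answers in this thread together (the coordinate vector covers only locally owned dofs, as Barry says, and is laid out exactly like the displacement vector with the coordinates of each node interlaced, as Matt describes), a minimal sketch of the setup might look like the code below. This is an illustration rather than code posted in the thread; the routine name and the owned-coordinate array argument are assumptions, and only the PETSc calls themselves are real API.

#include <petscmat.h>

/* Sketch: attach a rigid-body near-null space to an assembled elasticity matrix A.
 * owned_node_coords holds dim coordinates per locally OWNED node (no ghosts),
 * ordered like the owned displacement dofs [x0 y0 (z0) x1 y1 (z1) ...]. */
static PetscErrorCode AttachRigidBodyNullSpace(Mat A, PetscInt dim, PetscInt n_owned_nodes, const PetscReal *owned_node_coords)
{
  Vec          coords;
  PetscScalar *c;
  MatNullSpace nsp;

  PetscFunctionBeginUser;
  PetscCall(VecCreate(PetscObjectComm((PetscObject)A), &coords));
  PetscCall(VecSetBlockSize(coords, dim)); /* MatNullSpaceCreateRigidBody reads the spatial dimension from the block size */
  PetscCall(VecSetSizes(coords, dim * n_owned_nodes, PETSC_DECIDE));
  PetscCall(VecSetUp(coords));
  PetscCall(VecGetArray(coords, &c));
  for (PetscInt i = 0; i < dim * n_owned_nodes; i++) c[i] = owned_node_coords[i]; /* owned entries only */
  PetscCall(VecRestoreArray(coords, &c));

  PetscCall(MatNullSpaceCreateRigidBody(coords, &nsp)); /* 3 modes in 2D, 6 in 3D */
  PetscCall(MatSetNearNullSpace(A, nsp));               /* picked up by -pc_type gamg when building coarse spaces */
  PetscCall(MatNullSpaceDestroy(&nsp));
  PetscCall(VecDestroy(&coords));
  PetscFunctionReturn(PETSC_SUCCESS);
}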
Thanks, Matt Thank you both again, > > Jordi > > > On 5/12/23 04:46, Matthew Knepley wrote: > > On Mon, Dec 4, 2023 at 12:01?PM Jordi Manyer Fuertes via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> Dear PETSc users/developpers, >> >> I am currently trying to use the method `MatNullSpaceCreateRigidBody` >> together with `PCGAMG` to efficiently precondition an elasticity solver >> in 2D/3D. >> >> I have managed to make it work in serial (or with 1 MPI rank) with >> h-independent number of iterations (which is great), but the solver >> diverges in parallel. >> >> I assume it has to do with the coordinate vector I am building the >> null-space with not being correctly setup. The documentation is not that >> clear on which nodes exactly have to be set in each partition. Does it >> require nodes corresponding to owned dofs, or all dofs in each partition >> (owned+ghost)? What ghost layout should the `Vec` have? >> >> Any other tips about what I might be doing wrong? >> > > What we assume is that you have some elastic problem formulated in primal > unknowns (displacements) so that the solution vector looks like this: > > [ d^0_x d^0_y d^0_z d^1_x ..... ] > > or whatever spatial dimension you have. We expect to get a global vector > that looks like that, but instead > of displacements, we get the coordinates that each displacement > corresponds to. We make the generators of translations: > > [ 1 0 0 1 0 0 1 0 0 1 0 0... ] > [ 0 1 0 0 1 0 0 1 0 0 1 0... ] > [ 0 0 1 0 0 1 0 0 1 0 0 1... ] > > for which we do not need the coordinates, and then the generators of > rotations about each axis, for which > we _do_ need the coordinates, since we need to know how much each point > moves if you rotate about some center. > > Does that make sense? > > Thanks, > > Matt > > > >> Thanks, >> >> Jordi >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Tue Dec 5 10:09:27 2023 From: mfadams at lbl.gov (Mark Adams) Date: Tue, 5 Dec 2023 11:09:27 -0500 Subject: [petsc-users] Help for MatNullSpaceCreateRigidBody In-Reply-To: References: <97c05c18-214b-49f6-85cf-c8f512a3b047@monash.edu> <95cc9291-9dc9-4499-a84b-7e4952a46b43@monash.edu> Message-ID: I would suggest (excuse me if I missed something): *** Test a simple Jacobi solver in serial and parallel and verify that the convergence history (ksp_monitor) are identical to round-off error *** Test GAMG, serial and parallel, without MatNullSpaceCreateRigidBody and verify that the convergence is close, say well within 20% in the number of iterations *** Next you can get the null space vectors (v_i) and compute q = A*v_i and verify that each q is zero except for the BCs. - You could remove the BCs from A, temporarily, and the q should have norm machine epsilon to make this test simpler. - No need to solve this no-BC A solve. Mark On Tue, Dec 5, 2023 at 8:46?AM Matthew Knepley wrote: > On Tue, Dec 5, 2023 at 7:57?AM Jordi Manyer Fuertes < > jordi.manyer at monash.edu> wrote: > >> Thanks for the prompt response. Both answers look like what I'm doing. 
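A small sketch of the third check above (apply the operator to the rigid-body vectors and look at the resulting norms), assuming the near-null space has already been attached to the matrix A as in the earlier sketch; illustrative code, not from the thread:

/* Sketch: verify that A * v_i is (nearly) zero away from the boundary conditions
 * for each vector v_i of the attached near-null space. */
MatNullSpace nsp;
PetscBool    has_const;
PetscInt     nvec;
const Vec   *vecs;
Vec          q;
PetscReal    norm;

PetscCall(MatGetNearNullSpace(A, &nsp));
PetscCall(MatNullSpaceGetVecs(nsp, &has_const, &nvec, &vecs));
PetscCall(MatCreateVecs(A, NULL, &q));
for (PetscInt i = 0; i < nvec; i++) {
  PetscCall(MatMult(A, vecs[i], q));
  PetscCall(VecNorm(q, NORM_2, &norm));
  PetscCall(PetscPrintf(PETSC_COMM_WORLD, "|| A * v_%" PetscInt_FMT " || = %g\n", i, (double)norm));
}
PetscCall(VecDestroy(&q));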
>> >> After playing a bit more with solver, I managed to make it run in >> parallel with different boundary conditions (full dirichlet bcs, vs mixed >> newmann + dirichlet). This raises two questions: >> >> - How relevant are boundary conditions (eliminating dirichlet rows/cols >> vs weak newmann bcs) to the solver? Should I modify something when changing >> boundary conditions? >> >> The rigid body kernel is independent of boundary conditions, and is only > really important for the coarse grids. However, it is really easy to ruin a > solve with inconsistent boundary conditions, or with conditions which cause > a singularity at a change point. > >> - Also, the solver did well with the old bcs when run in a single >> processor (but not in parallel). This seems odd, since parallel and serial >> behavior should be consistent (or not?). Could it be fault of the PCGAMG? >> > This is unlikely. We have many parallel tests of elasticity (SNES ex17, > ex56, ex77, etc). We do not see problems. It seems more likely that the > system might not be assembled correctly in parallel. Did you check that the > matrices match? > >> I believe the default local solver is ILU, shoud I be changing it to LU >> or something else for these kind of problems? >> > Do you mean the smoother for AMG? No, the default is Chebyshev/Jacobi, > which is the same in parallel. > > Thanks, > > Matt > > Thank you both again, >> >> Jordi >> >> >> On 5/12/23 04:46, Matthew Knepley wrote: >> >> On Mon, Dec 4, 2023 at 12:01?PM Jordi Manyer Fuertes via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> >>> Dear PETSc users/developpers, >>> >>> I am currently trying to use the method `MatNullSpaceCreateRigidBody` >>> together with `PCGAMG` to efficiently precondition an elasticity solver >>> in 2D/3D. >>> >>> I have managed to make it work in serial (or with 1 MPI rank) with >>> h-independent number of iterations (which is great), but the solver >>> diverges in parallel. >>> >>> I assume it has to do with the coordinate vector I am building the >>> null-space with not being correctly setup. The documentation is not that >>> clear on which nodes exactly have to be set in each partition. Does it >>> require nodes corresponding to owned dofs, or all dofs in each partition >>> (owned+ghost)? What ghost layout should the `Vec` have? >>> >>> Any other tips about what I might be doing wrong? >>> >> >> What we assume is that you have some elastic problem formulated in primal >> unknowns (displacements) so that the solution vector looks like this: >> >> [ d^0_x d^0_y d^0_z d^1_x ..... ] >> >> or whatever spatial dimension you have. We expect to get a global vector >> that looks like that, but instead >> of displacements, we get the coordinates that each displacement >> corresponds to. We make the generators of translations: >> >> [ 1 0 0 1 0 0 1 0 0 1 0 0... ] >> [ 0 1 0 0 1 0 0 1 0 0 1 0... ] >> [ 0 0 1 0 0 1 0 0 1 0 0 1... ] >> >> for which we do not need the coordinates, and then the generators of >> rotations about each axis, for which >> we _do_ need the coordinates, since we need to know how much each point >> moves if you rotate about some center. >> >> Does that make sense? >> >> Thanks, >> >> Matt >> >> >> >>> Thanks, >>> >>> Jordi >>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. 
>> -- Norbert Wiener
>>
>> https://www.cse.buffalo.edu/~knepley/
>
> --
> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From alexlindsay239 at gmail.com Tue Dec 5 12:58:54 2023
From: alexlindsay239 at gmail.com (Alexander Lindsay)
Date: Tue, 5 Dec 2023 10:58:54 -0800
Subject: [petsc-users] Pre-check before each line search function evaluation
In-Reply-To: References: <80C2752C-1202-4F86-B3A8-FEA0EBC3833B@petsc.dev>
Message-ID:

Thanks Matt. For the immediate present I will probably use a basic line search with a precheck, but if I want true line searches in the future I will pursue option 2.

On Thu, Nov 30, 2023 at 2:27 PM Matthew Knepley wrote:
> On Thu, Nov 30, 2023 at 5:08 PM Alexander Lindsay <alexlindsay239 at gmail.com> wrote:
>
>> Hi Matt, your derivation is spot on. However, the problem is not linear, which is why I am using SNES. So you need to replace
>>
>> U = A^{-1} f - A^{-1} B L
>>
>> with
>>
>> dU = A^{-1} f - A^{-1} B dL
>
> I see two cases:
>
> 1) There is an easy nonlinear elimination for U. In this case, you do this to get U_1.
>
> 2) There is only a linear elimination. In this case, I don't think the nonlinear system should be phrased only on L, but rather on (U, L) itself. The linear elimination can be used as an excellent preconditioner for the Newton system.
>
> Thanks,
>
> Matt
>
>> On Thu, Nov 30, 2023 at 1:47 PM Matthew Knepley wrote:
>>
>>> On Thu, Nov 30, 2023 at 4:23 PM Alexander Lindsay <alexlindsay239 at gmail.com> wrote:
>>>
>>>> If someone passes me just L, where L represents the "global" degrees of freedom, in this case they represent unknowns on the trace of the mesh, this is insufficient information for me to evaluate my function. Because in truth my degrees of freedom are the sum of the trace unknowns (the unknowns in the global solution vector) and the eliminated unknowns which are entirely local to each element. So I will say my dofs are L + U.
>>>
>>> I want to try and reduce this to the simplest possible thing so that I can understand. We have some system which has two parts to the solution, L and U. If this problem is linear, we have
>>>
>>> / A B \ / U \ = / f \
>>> \ C D / \ L /   \ g /
>>>
>>> and we assume that A is easily invertible, so that
>>>
>>> U + A^{-1} B L = A^{-1} f
>>> U = A^{-1} f - A^{-1} B L
>>>
>>> C U + D L = g
>>> C (A^{-1} f - A^{-1} B L) + D L = g
>>> (D - C A^{-1} B) L = g - C A^{-1} f
>>>
>>> where I have reproduced the Schur complement derivation. Here, given any L, I can construct the corresponding U by inverting A. I know your system may be different, but if you are only solving for L, it should have this property I think.
>>>
>>> Thus, if the line search generates a new L, say L_1, I should be able to get U_1 by just plugging in. If this is not so, can you write out the equations so we can see why this is not true?
>>>
>>> Thanks,
>>>
>>> Matt
>>>
>>>> I start with some initial guess L0 and U0. I perform a finite element assembly procedure on each element which gives me things like K_LL, K_UL, K_LU, K_UU, F_U, and F_L.
I can do some math: >>>> >>>> K_LL = -K_LU * K_UU^-1 * K_UL + K_LL >>>> F_L = -K_LU * K_UU^-1 * F_U + F_L >>>> >>>> And then I feed K_LL and F_L into the global system matrix and vector >>>> respectively. I do something (like a linear solve) which gives me an >>>> increment to L, I'll call it dL. I loop back through and do a finite >>>> element assembly again using **L0 and U0** (or one could in theory save off >>>> the previous assemblies) to once again obtain the same K_LL, K_UL, K_LU, >>>> K_UU, F_U, F_L. And now I can compute the increment for U, dU, according to >>>> >>>> dU = K_UU^-1 * (-F_U - K_UL * dL) >>>> >>>> Armed now with both dL and dU, I am ready to perform a new residual >>>> evaluation with (L0 + dL, U0 + dU) = (L1, U1). >>>> >>>> The key part is that I cannot get U1 (or more generally an arbitrary U) >>>> just given L1 (or more generally an arbitrary L). In order to get U1, I >>>> must know both L0 and dL (and U0 of course). This is because at its core U >>>> is not some auxiliary vector; it represents true degrees of freedom. >>>> >>>> On Thu, Nov 30, 2023 at 12:32?PM Barry Smith wrote: >>>> >>>>> >>>>> Why is this all not part of the function evaluation? >>>>> >>>>> >>>>> > On Nov 30, 2023, at 3:25?PM, Alexander Lindsay < >>>>> alexlindsay239 at gmail.com> wrote: >>>>> > >>>>> > Hi I'm looking at the sources, and I believe the answer is no, but >>>>> is there a dedicated callback that is akin to SNESLineSearchPrecheck but is >>>>> called before *each* function evaluation in a line search method? I am >>>>> using a Hybridized Discontinuous Galerkin method in which most of the >>>>> degrees of freedom are eliminated from the global system. However, an >>>>> accurate function evaluation requires that an update to the "global" dofs >>>>> also trigger an update to the eliminated dofs. >>>>> > >>>>> > I can almost certainly do this manually but I believe it would be >>>>> more prone to error than a dedicated callback. >>>>> >>>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jeremy.theler-ext at ansys.com Tue Dec 5 14:22:25 2023 From: jeremy.theler-ext at ansys.com (Jeremy Theler (External)) Date: Tue, 5 Dec 2023 20:22:25 +0000 Subject: [petsc-users] Help for MatNullSpaceCreateRigidBody In-Reply-To: <95cc9291-9dc9-4499-a84b-7e4952a46b43@monash.edu> References: <97c05c18-214b-49f6-85cf-c8f512a3b047@monash.edu> <95cc9291-9dc9-4499-a84b-7e4952a46b43@monash.edu> Message-ID: just in case it helps, here's one way I have to create the near nullspace: https://github.com/seamplex/feenox/blob/main/src/pdes/mechanical/init.c#L468 -- jeremy ________________________________ From: petsc-users on behalf of Jordi Manyer Fuertes via petsc-users Sent: Tuesday, December 5, 2023 9:57 AM To: Matthew Knepley ; bsmith at petsc.dev Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Help for MatNullSpaceCreateRigidBody [External Sender] Thanks for the prompt response. Both answers look like what I'm doing. 
After playing a bit more with solver, I managed to make it run in parallel with different boundary conditions (full dirichlet bcs, vs mixed newmann + dirichlet). This raises two questions: - How relevant are boundary conditions (eliminating dirichlet rows/cols vs weak newmann bcs) to the solver? Should I modify something when changing boundary conditions? - Also, the solver did well with the old bcs when run in a single processor (but not in parallel). This seems odd, since parallel and serial behavior should be consistent (or not?). Could it be fault of the PCGAMG? I believe the default local solver is ILU, shoud I be changing it to LU or something else for these kind of problems? Thank you both again, Jordi -------------- next part -------------- An HTML attachment was scrubbed... URL: From srvenkat at utexas.edu Tue Dec 5 17:15:37 2023 From: srvenkat at utexas.edu (Sreeram R Venkat) Date: Tue, 5 Dec 2023 17:15:37 -0600 Subject: [petsc-users] Scattering a vector to/from a subset of processors In-Reply-To: References: Message-ID: Hi, I have a follow up question on this. Now, I'm trying to do a scatter and permutation of the vector. Under the same setup as the original example, here are the new Start and Finish states I want to achieve: Start Finish Proc | Entries Proc | Entries 0 | 0,...,8 0 | 0, 12, 24 1 | 9,...,17 1 | 1, 13, 25 2 | 18,...,26 2 | 2, 14, 26 3 | 27,...,35 3 | 3, 15, 27 4 | None 4 | 4, 16, 28 5 | None 5 | 5, 17, 29 6 | None 6 | 6, 18, 30 7 | None 7 | 7, 19, 31 8 | None 8 | 8, 20, 32 9 | None 9 | 9, 21, 33 10 | None 10 | 10, 22, 34 11 | None 11 | 11, 23, 35 So far, I've tried to use ISCreateGeneral(), with each process giving an idx array corresponding to the indices it wants (i.e. idx on P0 looks like [0,12,24] P1 -> [1,13, 25], and so on). Then I use this to create the VecScatter with VecScatterCreate(x, is, y, NULL, &scatter). However, when I try to do the scatter, I get some illegal memory access errors. Is there something wrong with how I define the index sets? Thanks, Sreeram On Thu, Oct 5, 2023 at 12:57?PM Sreeram R Venkat wrote: > Thank you. This works for me. > > Sreeram > > On Wed, Oct 4, 2023 at 6:41?PM Junchao Zhang > wrote: > >> Hi, Sreeram, >> You can try this code. Since x, y are both MPI vectors, we just need to >> say we want to scatter x[0:N] to y[0:N]. The 12 index sets with your >> example on the 12 processes would be [0..8], [9..17], [18..26], [27..35], >> [], ..., []. Actually, you can do it arbitrarily, say, with 12 index sets >> [0..17], [18..35], .., []. PETSc will figure out how to do the >> communication. >> >> PetscInt rstart, rend, N; >> IS ix; >> VecScatter vscat; >> Vec y; >> MPI_Comm comm; >> VecType type; >> >> PetscObjectGetComm((PetscObject)x, &comm); >> VecGetType(x, &type); >> VecGetSize(x, &N); >> VecGetOwnershipRange(x, &rstart, &rend); >> >> VecCreate(comm, &y); >> VecSetSizes(y, PETSC_DECIDE, N); >> VecSetType(y, type); >> >> ISCreateStride(PetscObjectComm((PetscObject)x), rend - rstart, rstart, 1, >> &ix); >> VecScatterCreate(x, ix, y, ix, &vscat); >> >> --Junchao Zhang >> >> >> On Wed, Oct 4, 2023 at 6:03?PM Sreeram R Venkat >> wrote: >> >>> Suppose I am running on 12 processors, and I have a vector "v" of size >>> 36 partitioned over the first 4. v still uses the PETSC_COMM_WORLD, so it >>> has a layout of (9, 9, 9, 9, 0, 0, ..., 0). Now, I would like to >>> repartition it over all 12 processors, so that the layout becomes (3, 3, 3, >>> ..., 3). 
I've been trying to use VecScatter to do this, but I'm not sure >>> what IndexSets to use for the sender and receiver. >>> >>> The result I am trying to achieve is this: >>> >>> Assume the vector is v = <0, 1, 2, ..., 35> >>> >>> Start Finish >>> Proc | Entries Proc | Entries >>> 0 | 0,...,8 0 | 0, 1, 2 >>> 1 | 9,...,17 1 | 3, 4, 5 >>> 2 | 18,...,26 2 | 6, 7, 8 >>> 3 | 27,...,35 3 | 9, 10, 11 >>> 4 | None 4 | 12, 13, 14 >>> 5 | None 5 | 15, 16, 17 >>> 6 | None 6 | 18, 19, 20 >>> 7 | None 7 | 21, 22, 23 >>> 8 | None 8 | 24, 25, 26 >>> 9 | None 9 | 27, 28, 29 >>> 10 | None 10 | 30, 31, 32 >>> 11 | None 11 | 33, 34, 35 >>> >>> Appreciate any help you can provide on this. >>> >>> Thanks, >>> Sreeram >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Tue Dec 5 21:29:59 2023 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Tue, 5 Dec 2023 21:29:59 -0600 Subject: [petsc-users] Scattering a vector to/from a subset of processors In-Reply-To: References: Message-ID: I think your approach is correct. Do you have an example code? --Junchao Zhang On Tue, Dec 5, 2023 at 5:15?PM Sreeram R Venkat wrote: > Hi, I have a follow up question on this. > > Now, I'm trying to do a scatter and permutation of the vector. Under the > same setup as the original example, here are the new Start and Finish > states I want to achieve: > Start Finish > Proc | Entries Proc | Entries > 0 | 0,...,8 0 | 0, 12, 24 > 1 | 9,...,17 1 | 1, 13, 25 > 2 | 18,...,26 2 | 2, 14, 26 > 3 | 27,...,35 3 | 3, 15, 27 > 4 | None 4 | 4, 16, 28 > 5 | None 5 | 5, 17, 29 > 6 | None 6 | 6, 18, 30 > 7 | None 7 | 7, 19, 31 > 8 | None 8 | 8, 20, 32 > 9 | None 9 | 9, 21, 33 > 10 | None 10 | 10, 22, 34 > 11 | None 11 | 11, 23, 35 > > So far, I've tried to use ISCreateGeneral(), with each process giving an > idx array corresponding to the indices it wants (i.e. idx on P0 looks like > [0,12,24] P1 -> [1,13, 25], and so on). > Then I use this to create the VecScatter with VecScatterCreate(x, is, y, > NULL, &scatter). > > However, when I try to do the scatter, I get some illegal memory access > errors. > > Is there something wrong with how I define the index sets? > > Thanks, > Sreeram > > > > > > On Thu, Oct 5, 2023 at 12:57?PM Sreeram R Venkat > wrote: > >> Thank you. This works for me. >> >> Sreeram >> >> On Wed, Oct 4, 2023 at 6:41?PM Junchao Zhang >> wrote: >> >>> Hi, Sreeram, >>> You can try this code. Since x, y are both MPI vectors, we just need to >>> say we want to scatter x[0:N] to y[0:N]. The 12 index sets with your >>> example on the 12 processes would be [0..8], [9..17], [18..26], [27..35], >>> [], ..., []. Actually, you can do it arbitrarily, say, with 12 index sets >>> [0..17], [18..35], .., []. PETSc will figure out how to do the >>> communication. >>> >>> PetscInt rstart, rend, N; >>> IS ix; >>> VecScatter vscat; >>> Vec y; >>> MPI_Comm comm; >>> VecType type; >>> >>> PetscObjectGetComm((PetscObject)x, &comm); >>> VecGetType(x, &type); >>> VecGetSize(x, &N); >>> VecGetOwnershipRange(x, &rstart, &rend); >>> >>> VecCreate(comm, &y); >>> VecSetSizes(y, PETSC_DECIDE, N); >>> VecSetType(y, type); >>> >>> ISCreateStride(PetscObjectComm((PetscObject)x), rend - rstart, rstart, 1, >>> &ix); >>> VecScatterCreate(x, ix, y, ix, &vscat); >>> >>> --Junchao Zhang >>> >>> >>> On Wed, Oct 4, 2023 at 6:03?PM Sreeram R Venkat >>> wrote: >>> >>>> Suppose I am running on 12 processors, and I have a vector "v" of size >>>> 36 partitioned over the first 4. 
v still uses the PETSC_COMM_WORLD, so it >>>> has a layout of (9, 9, 9, 9, 0, 0, ..., 0). Now, I would like to >>>> repartition it over all 12 processors, so that the layout becomes (3, 3, 3, >>>> ..., 3). I've been trying to use VecScatter to do this, but I'm not sure >>>> what IndexSets to use for the sender and receiver. >>>> >>>> The result I am trying to achieve is this: >>>> >>>> Assume the vector is v = <0, 1, 2, ..., 35> >>>> >>>> Start Finish >>>> Proc | Entries Proc | Entries >>>> 0 | 0,...,8 0 | 0, 1, 2 >>>> 1 | 9,...,17 1 | 3, 4, 5 >>>> 2 | 18,...,26 2 | 6, 7, 8 >>>> 3 | 27,...,35 3 | 9, 10, 11 >>>> 4 | None 4 | 12, 13, 14 >>>> 5 | None 5 | 15, 16, 17 >>>> 6 | None 6 | 18, 19, 20 >>>> 7 | None 7 | 21, 22, 23 >>>> 8 | None 8 | 24, 25, 26 >>>> 9 | None 9 | 27, 28, 29 >>>> 10 | None 10 | 30, 31, 32 >>>> 11 | None 11 | 33, 34, 35 >>>> >>>> Appreciate any help you can provide on this. >>>> >>>> Thanks, >>>> Sreeram >>>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From srvenkat at utexas.edu Tue Dec 5 22:09:21 2023 From: srvenkat at utexas.edu (Sreeram R Venkat) Date: Tue, 5 Dec 2023 22:09:21 -0600 Subject: [petsc-users] Scattering a vector to/from a subset of processors In-Reply-To: References: Message-ID: Yes, I have an example code at github.com/s769/petsc-test. Only thing is, when I described the example before, I simplified the actual use case in the code to make things simpler. Here are the extra details relevant to this code: - We assume a 2D processor grid, given by the command-line args -proc_rows and -proc_cols - The total length of the vector is n_m*n_t given by command-line args -nm and -nt. n_m corresponds to a space index and n_t a time index. - In the "Start" phase, the vector is divided into n_m blocks each of size n_t (indexed space->time). The blocks are partitioned over the first row of processors. For example if -nm = 4 and -proc_cols = 4, each processor in the first row will get one block of size n_t. Each processor in the first row gets n_m_local blocks of size n_t, where the sum of all n_m_locals for the first row of processors is n_m. - In the "Finish" phase, the vector is divided into n_t blocks each of size n_m (indexed time->space; this is the reason for the permutation of indices). The blocks are partitioned over all processors. Each processor will get n_t_local blocks of size n_m, where the sum of all n_t_locals for all processors is n_t. I think the basic idea is similar to the previous example, but these details make things a bit more complicated. Please let me know if anything is unclear, and I can try to explain more. Thanks for your help, Sreeram On Tue, Dec 5, 2023 at 9:30?PM Junchao Zhang wrote: > I think your approach is correct. Do you have an example code? > > --Junchao Zhang > > > On Tue, Dec 5, 2023 at 5:15?PM Sreeram R Venkat > wrote: > >> Hi, I have a follow up question on this. >> >> Now, I'm trying to do a scatter and permutation of the vector. 
Under the >> same setup as the original example, here are the new Start and Finish >> states I want to achieve: >> Start Finish >> Proc | Entries Proc | Entries >> 0 | 0,...,8 0 | 0, 12, 24 >> 1 | 9,...,17 1 | 1, 13, 25 >> 2 | 18,...,26 2 | 2, 14, 26 >> 3 | 27,...,35 3 | 3, 15, 27 >> 4 | None 4 | 4, 16, 28 >> 5 | None 5 | 5, 17, 29 >> 6 | None 6 | 6, 18, 30 >> 7 | None 7 | 7, 19, 31 >> 8 | None 8 | 8, 20, 32 >> 9 | None 9 | 9, 21, 33 >> 10 | None 10 | 10, 22, 34 >> 11 | None 11 | 11, 23, 35 >> >> So far, I've tried to use ISCreateGeneral(), with each process giving an >> idx array corresponding to the indices it wants (i.e. idx on P0 looks like >> [0,12,24] P1 -> [1,13, 25], and so on). >> Then I use this to create the VecScatter with VecScatterCreate(x, is, y, >> NULL, &scatter). >> >> However, when I try to do the scatter, I get some illegal memory access >> errors. >> >> Is there something wrong with how I define the index sets? >> >> Thanks, >> Sreeram >> >> >> >> >> >> On Thu, Oct 5, 2023 at 12:57?PM Sreeram R Venkat >> wrote: >> >>> Thank you. This works for me. >>> >>> Sreeram >>> >>> On Wed, Oct 4, 2023 at 6:41?PM Junchao Zhang >>> wrote: >>> >>>> Hi, Sreeram, >>>> You can try this code. Since x, y are both MPI vectors, we just need to >>>> say we want to scatter x[0:N] to y[0:N]. The 12 index sets with your >>>> example on the 12 processes would be [0..8], [9..17], [18..26], [27..35], >>>> [], ..., []. Actually, you can do it arbitrarily, say, with 12 index sets >>>> [0..17], [18..35], .., []. PETSc will figure out how to do the >>>> communication. >>>> >>>> PetscInt rstart, rend, N; >>>> IS ix; >>>> VecScatter vscat; >>>> Vec y; >>>> MPI_Comm comm; >>>> VecType type; >>>> >>>> PetscObjectGetComm((PetscObject)x, &comm); >>>> VecGetType(x, &type); >>>> VecGetSize(x, &N); >>>> VecGetOwnershipRange(x, &rstart, &rend); >>>> >>>> VecCreate(comm, &y); >>>> VecSetSizes(y, PETSC_DECIDE, N); >>>> VecSetType(y, type); >>>> >>>> ISCreateStride(PetscObjectComm((PetscObject)x), rend - rstart, rstart, >>>> 1, &ix); >>>> VecScatterCreate(x, ix, y, ix, &vscat); >>>> >>>> --Junchao Zhang >>>> >>>> >>>> On Wed, Oct 4, 2023 at 6:03?PM Sreeram R Venkat >>>> wrote: >>>> >>>>> Suppose I am running on 12 processors, and I have a vector "v" of size >>>>> 36 partitioned over the first 4. v still uses the PETSC_COMM_WORLD, so it >>>>> has a layout of (9, 9, 9, 9, 0, 0, ..., 0). Now, I would like to >>>>> repartition it over all 12 processors, so that the layout becomes (3, 3, 3, >>>>> ..., 3). I've been trying to use VecScatter to do this, but I'm not sure >>>>> what IndexSets to use for the sender and receiver. >>>>> >>>>> The result I am trying to achieve is this: >>>>> >>>>> Assume the vector is v = <0, 1, 2, ..., 35> >>>>> >>>>> Start Finish >>>>> Proc | Entries Proc | Entries >>>>> 0 | 0,...,8 0 | 0, 1, 2 >>>>> 1 | 9,...,17 1 | 3, 4, 5 >>>>> 2 | 18,...,26 2 | 6, 7, 8 >>>>> 3 | 27,...,35 3 | 9, 10, 11 >>>>> 4 | None 4 | 12, 13, 14 >>>>> 5 | None 5 | 15, 16, 17 >>>>> 6 | None 6 | 18, 19, 20 >>>>> 7 | None 7 | 21, 22, 23 >>>>> 8 | None 8 | 24, 25, 26 >>>>> 9 | None 9 | 27, 28, 29 >>>>> 10 | None 10 | 30, 31, 32 >>>>> 11 | None 11 | 33, 34, 35 >>>>> >>>>> Appreciate any help you can provide on this. >>>>> >>>>> Thanks, >>>>> Sreeram >>>>> >>>> -------------- next part -------------- An HTML attachment was scrubbed... 
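For the space-time layout just described, the finish vector is time-major (global index t*n_m + m) while the start vector is space-major (global index m*n_t + t), so each rank can simply request, for every finish entry it owns, the corresponding source index. A sketch along the lines of the ISCreateGeneral()/VecScatterCreate() approach described above; the names n_m and n_t follow the description in the thread, everything else is illustrative:

/* Sketch: index set for scattering a space-major vector x (n_m blocks of length n_t)
 * into a time-major vector y (n_t blocks of length n_m), with y partitioned over all
 * ranks. Assumes x has been assembled and y created with the desired layout. */
PetscInt   rstart, rend, j = 0, *idx;
IS         ix;
VecScatter vscat;

PetscCall(VecGetOwnershipRange(y, &rstart, &rend));
PetscCall(PetscMalloc1(rend - rstart, &idx));
for (PetscInt g = rstart; g < rend; g++) { /* g = t*n_m + m in the finish (time-major) ordering */
  PetscInt t = g / n_m, m = g % n_m;
  idx[j++] = m * n_t + t;                  /* location of that entry in the start (space-major) vector x */
}
PetscCall(ISCreateGeneral(PETSC_COMM_WORLD, rend - rstart, idx, PETSC_OWN_POINTER, &ix));
PetscCall(VecScatterCreate(x, ix, y, NULL, &vscat));
PetscCall(VecScatterBegin(vscat, x, y, INSERT_VALUES, SCATTER_FORWARD));
PetscCall(VecScatterEnd(vscat, x, y, INSERT_VALUES, SCATTER_FORWARD));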
URL: From junchao.zhang at gmail.com Wed Dec 6 11:12:04 2023 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Wed, 6 Dec 2023 11:12:04 -0600 Subject: [petsc-users] Scattering a vector to/from a subset of processors In-Reply-To: References: Message-ID: Hi, Sreeram, I made an example with your approach. It worked fine as you see the output at the end #include int main(int argc, char **argv) { PetscInt i, j, rstart, rend, n, N, *indices; PetscMPIInt size, rank; IS ix; VecScatter vscat; Vec x, y; PetscFunctionBeginUser; PetscCall(PetscInitialize(&argc, &argv, (char *)0, NULL)); PetscCallMPI(MPI_Comm_size(PETSC_COMM_WORLD, &size)); PetscCallMPI(MPI_Comm_rank(PETSC_COMM_WORLD, &rank)); PetscCall(VecCreate(PETSC_COMM_WORLD, &x)); PetscCall(VecSetFromOptions(x)); PetscCall(PetscObjectSetName((PetscObject)x, "Vec X")); n = (rank < 4) ? 9 : 0; PetscCall(VecSetSizes(x, n, PETSC_DECIDE)); PetscCall(VecGetOwnershipRange(x, &rstart, &rend)); for (i = rstart; i < rend; i++) PetscCall(VecSetValue(x, i, (PetscScalar)i, INSERT_VALUES)); PetscCall(VecAssemblyBegin(x)); PetscCall(VecAssemblyEnd(x)); PetscCall(VecGetSize(x, &N)); PetscCall(VecCreate(PETSC_COMM_WORLD, &y)); PetscCall(VecSetFromOptions(y)); PetscCall(PetscObjectSetName((PetscObject)y, "Vec Y")); PetscCall(VecSetSizes(y, PETSC_DECIDE, N)); PetscCall(VecGetOwnershipRange(y, &rstart, &rend)); PetscCall(PetscMalloc1(rend - rstart, &indices)); for (i = rstart, j = 0; i < rend; i++, j++) indices[j] = rank + size * j; PetscCall(ISCreateGeneral(PETSC_COMM_WORLD, rend - rstart, indices, PETSC_OWN_POINTER, &ix)); PetscCall(VecScatterCreate(x, ix, y, NULL, &vscat)); PetscCall(VecScatterBegin(vscat, x, y, INSERT_VALUES, SCATTER_FORWARD)); PetscCall(VecScatterEnd(vscat, x, y, INSERT_VALUES, SCATTER_FORWARD)); PetscCall(ISView(ix, PETSC_VIEWER_STDOUT_WORLD)); PetscCall(VecView(x, PETSC_VIEWER_STDERR_WORLD)); PetscCall(VecView(y, PETSC_VIEWER_STDERR_WORLD)); PetscCall(VecScatterDestroy(&vscat)); PetscCall(ISDestroy(&ix)); PetscCall(VecDestroy(&x)); PetscCall(VecDestroy(&y)); PetscCall(PetscFinalize()); return 0; } $ mpirun -n 12 ./ex100 IS Object: 12 MPI processes type: general [0] Number of indices in set 3 [0] 0 0 [0] 1 12 [0] 2 24 [1] Number of indices in set 3 [1] 0 1 [1] 1 13 [1] 2 25 [2] Number of indices in set 3 [2] 0 2 [2] 1 14 [2] 2 26 [3] Number of indices in set 3 [3] 0 3 [3] 1 15 [3] 2 27 [4] Number of indices in set 3 [4] 0 4 [4] 1 16 [4] 2 28 [5] Number of indices in set 3 [5] 0 5 [5] 1 17 [5] 2 29 [6] Number of indices in set 3 [6] 0 6 [6] 1 18 [6] 2 30 [7] Number of indices in set 3 [7] 0 7 [7] 1 19 [7] 2 31 [8] Number of indices in set 3 [8] 0 8 [8] 1 20 [8] 2 32 [9] Number of indices in set 3 [9] 0 9 [9] 1 21 [9] 2 33 [10] Number of indices in set 3 [10] 0 10 [10] 1 22 [10] 2 34 [11] Number of indices in set 3 [11] 0 11 [11] 1 23 [11] 2 35 Vec Object: Vec X 12 MPI processes type: mpi Process [0] 0. 1. 2. 3. 4. 5. 6. 7. 8. Process [1] 9. 10. 11. 12. 13. 14. 15. 16. 17. Process [2] 18. 19. 20. 21. 22. 23. 24. 25. 26. Process [3] 27. 28. 29. 30. 31. 32. 33. 34. 35. Process [4] Process [5] Process [6] Process [7] Process [8] Process [9] Process [10] Process [11] Vec Object: Vec Y 12 MPI processes type: mpi Process [0] 0. 12. 24. Process [1] 1. 13. 25. Process [2] 2. 14. 26. Process [3] 3. 15. 27. Process [4] 4. 16. 28. Process [5] 5. 17. 29. Process [6] 6. 18. 30. Process [7] 7. 19. 31. Process [8] 8. 20. 32. Process [9] 9. 21. 33. Process [10] 10. 22. 34. Process [11] 11. 23. 35. 
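A usage note on the example above: the VecAssemblyBegin()/VecAssemblyEnd() pair before the scatter matters. Further down the thread, a missing assembly of x is exactly what caused the reported illegal memory access. The required ordering, reusing the names from the example (a sketch, not new API):

/* values inserted with VecSetValue()/VecSetValues() only become visible after assembly */
PetscCall(VecSetValue(x, i, (PetscScalar)i, INSERT_VALUES));
PetscCall(VecAssemblyBegin(x));
PetscCall(VecAssemblyEnd(x));
/* scatter only after x has been assembled */
PetscCall(VecScatterBegin(vscat, x, y, INSERT_VALUES, SCATTER_FORWARD));
PetscCall(VecScatterEnd(vscat, x, y, INSERT_VALUES, SCATTER_FORWARD));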
--Junchao Zhang On Tue, Dec 5, 2023 at 10:09?PM Sreeram R Venkat wrote: > Yes, I have an example code at github.com/s769/petsc-test. Only thing is, > when I described the example before, I simplified the actual use case in > the code to make things simpler. Here are the extra details relevant to > this code: > > - We assume a 2D processor grid, given by the command-line args > -proc_rows and -proc_cols > - The total length of the vector is n_m*n_t given by command-line args > -nm and -nt. n_m corresponds to a space index and n_t a time index. > - In the "Start" phase, the vector is divided into n_m blocks each of > size n_t (indexed space->time). The blocks are partitioned over the first > row of processors. For example if -nm = 4 and -proc_cols = 4, each > processor in the first row will get one block of size n_t. Each processor > in the first row gets n_m_local blocks of size n_t, where the sum of all > n_m_locals for the first row of processors is n_m. > - In the "Finish" phase, the vector is divided into n_t blocks each of > size n_m (indexed time->space; this is the reason for the permutation of > indices). The blocks are partitioned over all processors. Each processor > will get n_t_local blocks of size n_m, where the sum of all n_t_locals for > all processors is n_t. > > I think the basic idea is similar to the previous example, but these > details make things a bit more complicated. Please let me know if anything > is unclear, and I can try to explain more. > > Thanks for your help, > Sreeram > > On Tue, Dec 5, 2023 at 9:30?PM Junchao Zhang > wrote: > >> I think your approach is correct. Do you have an example code? >> >> --Junchao Zhang >> >> >> On Tue, Dec 5, 2023 at 5:15?PM Sreeram R Venkat >> wrote: >> >>> Hi, I have a follow up question on this. >>> >>> Now, I'm trying to do a scatter and permutation of the vector. Under the >>> same setup as the original example, here are the new Start and Finish >>> states I want to achieve: >>> Start Finish >>> Proc | Entries Proc | Entries >>> 0 | 0,...,8 0 | 0, 12, 24 >>> 1 | 9,...,17 1 | 1, 13, 25 >>> 2 | 18,...,26 2 | 2, 14, 26 >>> 3 | 27,...,35 3 | 3, 15, 27 >>> 4 | None 4 | 4, 16, 28 >>> 5 | None 5 | 5, 17, 29 >>> 6 | None 6 | 6, 18, 30 >>> 7 | None 7 | 7, 19, 31 >>> 8 | None 8 | 8, 20, 32 >>> 9 | None 9 | 9, 21, 33 >>> 10 | None 10 | 10, 22, 34 >>> 11 | None 11 | 11, 23, 35 >>> >>> So far, I've tried to use ISCreateGeneral(), with each process giving an >>> idx array corresponding to the indices it wants (i.e. idx on P0 looks like >>> [0,12,24] P1 -> [1,13, 25], and so on). >>> Then I use this to create the VecScatter with VecScatterCreate(x, is, y, >>> NULL, &scatter). >>> >>> However, when I try to do the scatter, I get some illegal memory access >>> errors. >>> >>> Is there something wrong with how I define the index sets? >>> >>> Thanks, >>> Sreeram >>> >>> >>> >>> >>> >>> On Thu, Oct 5, 2023 at 12:57?PM Sreeram R Venkat >>> wrote: >>> >>>> Thank you. This works for me. >>>> >>>> Sreeram >>>> >>>> On Wed, Oct 4, 2023 at 6:41?PM Junchao Zhang >>>> wrote: >>>> >>>>> Hi, Sreeram, >>>>> You can try this code. Since x, y are both MPI vectors, we just need >>>>> to say we want to scatter x[0:N] to y[0:N]. The 12 index sets with your >>>>> example on the 12 processes would be [0..8], [9..17], [18..26], [27..35], >>>>> [], ..., []. Actually, you can do it arbitrarily, say, with 12 index sets >>>>> [0..17], [18..35], .., []. PETSc will figure out how to do the >>>>> communication. 
>>>>> >>>>> PetscInt rstart, rend, N; >>>>> IS ix; >>>>> VecScatter vscat; >>>>> Vec y; >>>>> MPI_Comm comm; >>>>> VecType type; >>>>> >>>>> PetscObjectGetComm((PetscObject)x, &comm); >>>>> VecGetType(x, &type); >>>>> VecGetSize(x, &N); >>>>> VecGetOwnershipRange(x, &rstart, &rend); >>>>> >>>>> VecCreate(comm, &y); >>>>> VecSetSizes(y, PETSC_DECIDE, N); >>>>> VecSetType(y, type); >>>>> >>>>> ISCreateStride(PetscObjectComm((PetscObject)x), rend - rstart, rstart, >>>>> 1, &ix); >>>>> VecScatterCreate(x, ix, y, ix, &vscat); >>>>> >>>>> --Junchao Zhang >>>>> >>>>> >>>>> On Wed, Oct 4, 2023 at 6:03?PM Sreeram R Venkat >>>>> wrote: >>>>> >>>>>> Suppose I am running on 12 processors, and I have a vector "v" of >>>>>> size 36 partitioned over the first 4. v still uses the PETSC_COMM_WORLD, so >>>>>> it has a layout of (9, 9, 9, 9, 0, 0, ..., 0). Now, I would like to >>>>>> repartition it over all 12 processors, so that the layout becomes (3, 3, 3, >>>>>> ..., 3). I've been trying to use VecScatter to do this, but I'm not sure >>>>>> what IndexSets to use for the sender and receiver. >>>>>> >>>>>> The result I am trying to achieve is this: >>>>>> >>>>>> Assume the vector is v = <0, 1, 2, ..., 35> >>>>>> >>>>>> Start Finish >>>>>> Proc | Entries Proc | Entries >>>>>> 0 | 0,...,8 0 | 0, 1, 2 >>>>>> 1 | 9,...,17 1 | 3, 4, 5 >>>>>> 2 | 18,...,26 2 | 6, 7, 8 >>>>>> 3 | 27,...,35 3 | 9, 10, 11 >>>>>> 4 | None 4 | 12, 13, 14 >>>>>> 5 | None 5 | 15, 16, 17 >>>>>> 6 | None 6 | 18, 19, 20 >>>>>> 7 | None 7 | 21, 22, 23 >>>>>> 8 | None 8 | 24, 25, 26 >>>>>> 9 | None 9 | 27, 28, 29 >>>>>> 10 | None 10 | 30, 31, 32 >>>>>> 11 | None 11 | 33, 34, 35 >>>>>> >>>>>> Appreciate any help you can provide on this. >>>>>> >>>>>> Thanks, >>>>>> Sreeram >>>>>> >>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From srvenkat at utexas.edu Wed Dec 6 13:20:45 2023 From: srvenkat at utexas.edu (Sreeram R Venkat) Date: Wed, 6 Dec 2023 13:20:45 -0600 Subject: [petsc-users] Scattering a vector to/from a subset of processors In-Reply-To: References: Message-ID: Thank you for your help. It turned out the problem was that I forgot to assemble the "x" vector before doing the scatter. It seems to be working now. Thanks, Sreeram On Wed, Dec 6, 2023 at 11:12?AM Junchao Zhang wrote: > Hi, Sreeram, > I made an example with your approach. It worked fine as you see the > output at the end > > #include > int main(int argc, char **argv) > { > PetscInt i, j, rstart, rend, n, N, *indices; > PetscMPIInt size, rank; > IS ix; > VecScatter vscat; > Vec x, y; > > PetscFunctionBeginUser; > PetscCall(PetscInitialize(&argc, &argv, (char *)0, NULL)); > PetscCallMPI(MPI_Comm_size(PETSC_COMM_WORLD, &size)); > PetscCallMPI(MPI_Comm_rank(PETSC_COMM_WORLD, &rank)); > > PetscCall(VecCreate(PETSC_COMM_WORLD, &x)); > PetscCall(VecSetFromOptions(x)); > PetscCall(PetscObjectSetName((PetscObject)x, "Vec X")); > n = (rank < 4) ? 
9 : 0; > PetscCall(VecSetSizes(x, n, PETSC_DECIDE)); > > PetscCall(VecGetOwnershipRange(x, &rstart, &rend)); > for (i = rstart; i < rend; i++) PetscCall(VecSetValue(x, i, > (PetscScalar)i, INSERT_VALUES)); > PetscCall(VecAssemblyBegin(x)); > PetscCall(VecAssemblyEnd(x)); > PetscCall(VecGetSize(x, &N)); > > PetscCall(VecCreate(PETSC_COMM_WORLD, &y)); > PetscCall(VecSetFromOptions(y)); > PetscCall(PetscObjectSetName((PetscObject)y, "Vec Y")); > PetscCall(VecSetSizes(y, PETSC_DECIDE, N)); > > PetscCall(VecGetOwnershipRange(y, &rstart, &rend)); > PetscCall(PetscMalloc1(rend - rstart, &indices)); > for (i = rstart, j = 0; i < rend; i++, j++) indices[j] = rank + size * j; > > PetscCall(ISCreateGeneral(PETSC_COMM_WORLD, rend - rstart, indices, > PETSC_OWN_POINTER, &ix)); > PetscCall(VecScatterCreate(x, ix, y, NULL, &vscat)); > > PetscCall(VecScatterBegin(vscat, x, y, INSERT_VALUES, SCATTER_FORWARD)); > PetscCall(VecScatterEnd(vscat, x, y, INSERT_VALUES, SCATTER_FORWARD)); > > PetscCall(ISView(ix, PETSC_VIEWER_STDOUT_WORLD)); > PetscCall(VecView(x, PETSC_VIEWER_STDERR_WORLD)); > PetscCall(VecView(y, PETSC_VIEWER_STDERR_WORLD)); > > PetscCall(VecScatterDestroy(&vscat)); > PetscCall(ISDestroy(&ix)); > PetscCall(VecDestroy(&x)); > PetscCall(VecDestroy(&y)); > PetscCall(PetscFinalize()); > return 0; > } > > $ mpirun -n 12 ./ex100 > IS Object: 12 MPI processes > type: general > [0] Number of indices in set 3 > [0] 0 0 > [0] 1 12 > [0] 2 24 > [1] Number of indices in set 3 > [1] 0 1 > [1] 1 13 > [1] 2 25 > [2] Number of indices in set 3 > [2] 0 2 > [2] 1 14 > [2] 2 26 > [3] Number of indices in set 3 > [3] 0 3 > [3] 1 15 > [3] 2 27 > [4] Number of indices in set 3 > [4] 0 4 > [4] 1 16 > [4] 2 28 > [5] Number of indices in set 3 > [5] 0 5 > [5] 1 17 > [5] 2 29 > [6] Number of indices in set 3 > [6] 0 6 > [6] 1 18 > [6] 2 30 > [7] Number of indices in set 3 > [7] 0 7 > [7] 1 19 > [7] 2 31 > [8] Number of indices in set 3 > [8] 0 8 > [8] 1 20 > [8] 2 32 > [9] Number of indices in set 3 > [9] 0 9 > [9] 1 21 > [9] 2 33 > [10] Number of indices in set 3 > [10] 0 10 > [10] 1 22 > [10] 2 34 > [11] Number of indices in set 3 > [11] 0 11 > [11] 1 23 > [11] 2 35 > Vec Object: Vec X 12 MPI processes > type: mpi > Process [0] > 0. > 1. > 2. > 3. > 4. > 5. > 6. > 7. > 8. > Process [1] > 9. > 10. > 11. > 12. > 13. > 14. > 15. > 16. > 17. > Process [2] > 18. > 19. > 20. > 21. > 22. > 23. > 24. > 25. > 26. > Process [3] > 27. > 28. > 29. > 30. > 31. > 32. > 33. > 34. > 35. > Process [4] > Process [5] > Process [6] > Process [7] > Process [8] > Process [9] > Process [10] > Process [11] > Vec Object: Vec Y 12 MPI processes > type: mpi > Process [0] > 0. > 12. > 24. > Process [1] > 1. > 13. > 25. > Process [2] > 2. > 14. > 26. > Process [3] > 3. > 15. > 27. > Process [4] > 4. > 16. > 28. > Process [5] > 5. > 17. > 29. > Process [6] > 6. > 18. > 30. > Process [7] > 7. > 19. > 31. > Process [8] > 8. > 20. > 32. > Process [9] > 9. > 21. > 33. > Process [10] > 10. > 22. > 34. > Process [11] > 11. > 23. > 35. > > --Junchao Zhang > > > On Tue, Dec 5, 2023 at 10:09?PM Sreeram R Venkat > wrote: > >> Yes, I have an example code at github.com/s769/petsc-test. Only thing >> is, when I described the example before, I simplified the actual use case >> in the code to make things simpler. Here are the extra details relevant to >> this code: >> >> - We assume a 2D processor grid, given by the command-line args >> -proc_rows and -proc_cols >> - The total length of the vector is n_m*n_t given by command-line >> args -nm and -nt. 
n_m corresponds to a space index and n_t a time index. >> - In the "Start" phase, the vector is divided into n_m blocks each of >> size n_t (indexed space->time). The blocks are partitioned over the first >> row of processors. For example if -nm = 4 and -proc_cols = 4, each >> processor in the first row will get one block of size n_t. Each processor >> in the first row gets n_m_local blocks of size n_t, where the sum of all >> n_m_locals for the first row of processors is n_m. >> - In the "Finish" phase, the vector is divided into n_t blocks each >> of size n_m (indexed time->space; this is the reason for the permutation of >> indices). The blocks are partitioned over all processors. Each processor >> will get n_t_local blocks of size n_m, where the sum of all n_t_locals for >> all processors is n_t. >> >> I think the basic idea is similar to the previous example, but these >> details make things a bit more complicated. Please let me know if anything >> is unclear, and I can try to explain more. >> >> Thanks for your help, >> Sreeram >> >> On Tue, Dec 5, 2023 at 9:30?PM Junchao Zhang >> wrote: >> >>> I think your approach is correct. Do you have an example code? >>> >>> --Junchao Zhang >>> >>> >>> On Tue, Dec 5, 2023 at 5:15?PM Sreeram R Venkat >>> wrote: >>> >>>> Hi, I have a follow up question on this. >>>> >>>> Now, I'm trying to do a scatter and permutation of the vector. Under >>>> the same setup as the original example, here are the new Start and Finish >>>> states I want to achieve: >>>> Start Finish >>>> Proc | Entries Proc | Entries >>>> 0 | 0,...,8 0 | 0, 12, 24 >>>> 1 | 9,...,17 1 | 1, 13, 25 >>>> 2 | 18,...,26 2 | 2, 14, 26 >>>> 3 | 27,...,35 3 | 3, 15, 27 >>>> 4 | None 4 | 4, 16, 28 >>>> 5 | None 5 | 5, 17, 29 >>>> 6 | None 6 | 6, 18, 30 >>>> 7 | None 7 | 7, 19, 31 >>>> 8 | None 8 | 8, 20, 32 >>>> 9 | None 9 | 9, 21, 33 >>>> 10 | None 10 | 10, 22, 34 >>>> 11 | None 11 | 11, 23, 35 >>>> >>>> So far, I've tried to use ISCreateGeneral(), with each process giving >>>> an idx array corresponding to the indices it wants (i.e. idx on P0 looks >>>> like [0,12,24] P1 -> [1,13, 25], and so on). >>>> Then I use this to create the VecScatter with VecScatterCreate(x, is, >>>> y, NULL, &scatter). >>>> >>>> However, when I try to do the scatter, I get some illegal memory access >>>> errors. >>>> >>>> Is there something wrong with how I define the index sets? >>>> >>>> Thanks, >>>> Sreeram >>>> >>>> >>>> >>>> >>>> >>>> On Thu, Oct 5, 2023 at 12:57?PM Sreeram R Venkat >>>> wrote: >>>> >>>>> Thank you. This works for me. >>>>> >>>>> Sreeram >>>>> >>>>> On Wed, Oct 4, 2023 at 6:41?PM Junchao Zhang >>>>> wrote: >>>>> >>>>>> Hi, Sreeram, >>>>>> You can try this code. Since x, y are both MPI vectors, we just need >>>>>> to say we want to scatter x[0:N] to y[0:N]. The 12 index sets with your >>>>>> example on the 12 processes would be [0..8], [9..17], [18..26], [27..35], >>>>>> [], ..., []. Actually, you can do it arbitrarily, say, with 12 index sets >>>>>> [0..17], [18..35], .., []. PETSc will figure out how to do the >>>>>> communication. 
>>>>>> >>>>>> PetscInt rstart, rend, N; >>>>>> IS ix; >>>>>> VecScatter vscat; >>>>>> Vec y; >>>>>> MPI_Comm comm; >>>>>> VecType type; >>>>>> >>>>>> PetscObjectGetComm((PetscObject)x, &comm); >>>>>> VecGetType(x, &type); >>>>>> VecGetSize(x, &N); >>>>>> VecGetOwnershipRange(x, &rstart, &rend); >>>>>> >>>>>> VecCreate(comm, &y); >>>>>> VecSetSizes(y, PETSC_DECIDE, N); >>>>>> VecSetType(y, type); >>>>>> >>>>>> ISCreateStride(PetscObjectComm((PetscObject)x), rend - rstart, >>>>>> rstart, 1, &ix); >>>>>> VecScatterCreate(x, ix, y, ix, &vscat); >>>>>> >>>>>> --Junchao Zhang >>>>>> >>>>>> >>>>>> On Wed, Oct 4, 2023 at 6:03?PM Sreeram R Venkat >>>>>> wrote: >>>>>> >>>>>>> Suppose I am running on 12 processors, and I have a vector "v" of >>>>>>> size 36 partitioned over the first 4. v still uses the PETSC_COMM_WORLD, so >>>>>>> it has a layout of (9, 9, 9, 9, 0, 0, ..., 0). Now, I would like to >>>>>>> repartition it over all 12 processors, so that the layout becomes (3, 3, 3, >>>>>>> ..., 3). I've been trying to use VecScatter to do this, but I'm not sure >>>>>>> what IndexSets to use for the sender and receiver. >>>>>>> >>>>>>> The result I am trying to achieve is this: >>>>>>> >>>>>>> Assume the vector is v = <0, 1, 2, ..., 35> >>>>>>> >>>>>>> Start Finish >>>>>>> Proc | Entries Proc | Entries >>>>>>> 0 | 0,...,8 0 | 0, 1, 2 >>>>>>> 1 | 9,...,17 1 | 3, 4, 5 >>>>>>> 2 | 18,...,26 2 | 6, 7, 8 >>>>>>> 3 | 27,...,35 3 | 9, 10, 11 >>>>>>> 4 | None 4 | 12, 13, 14 >>>>>>> 5 | None 5 | 15, 16, 17 >>>>>>> 6 | None 6 | 18, 19, 20 >>>>>>> 7 | None 7 | 21, 22, 23 >>>>>>> 8 | None 8 | 24, 25, 26 >>>>>>> 9 | None 9 | 27, 28, 29 >>>>>>> 10 | None 10 | 30, 31, 32 >>>>>>> 11 | None 11 | 33, 34, 35 >>>>>>> >>>>>>> Appreciate any help you can provide on this. >>>>>>> >>>>>>> Thanks, >>>>>>> Sreeram >>>>>>> >>>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Wed Dec 6 14:36:59 2023 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Wed, 6 Dec 2023 14:36:59 -0600 Subject: [petsc-users] Scattering a vector to/from a subset of processors In-Reply-To: References: Message-ID: Glad it worked! --Junchao Zhang On Wed, Dec 6, 2023 at 1:20?PM Sreeram R Venkat wrote: > Thank you for your help. It turned out the problem was that I forgot to > assemble the "x" vector before doing the scatter. It seems to be working > now. > > Thanks, > Sreeram > > On Wed, Dec 6, 2023 at 11:12?AM Junchao Zhang > wrote: > >> Hi, Sreeram, >> I made an example with your approach. It worked fine as you see the >> output at the end >> >> #include >> int main(int argc, char **argv) >> { >> PetscInt i, j, rstart, rend, n, N, *indices; >> PetscMPIInt size, rank; >> IS ix; >> VecScatter vscat; >> Vec x, y; >> >> PetscFunctionBeginUser; >> PetscCall(PetscInitialize(&argc, &argv, (char *)0, NULL)); >> PetscCallMPI(MPI_Comm_size(PETSC_COMM_WORLD, &size)); >> PetscCallMPI(MPI_Comm_rank(PETSC_COMM_WORLD, &rank)); >> >> PetscCall(VecCreate(PETSC_COMM_WORLD, &x)); >> PetscCall(VecSetFromOptions(x)); >> PetscCall(PetscObjectSetName((PetscObject)x, "Vec X")); >> n = (rank < 4) ? 
9 : 0; >> PetscCall(VecSetSizes(x, n, PETSC_DECIDE)); >> >> PetscCall(VecGetOwnershipRange(x, &rstart, &rend)); >> for (i = rstart; i < rend; i++) PetscCall(VecSetValue(x, i, >> (PetscScalar)i, INSERT_VALUES)); >> PetscCall(VecAssemblyBegin(x)); >> PetscCall(VecAssemblyEnd(x)); >> PetscCall(VecGetSize(x, &N)); >> >> PetscCall(VecCreate(PETSC_COMM_WORLD, &y)); >> PetscCall(VecSetFromOptions(y)); >> PetscCall(PetscObjectSetName((PetscObject)y, "Vec Y")); >> PetscCall(VecSetSizes(y, PETSC_DECIDE, N)); >> >> PetscCall(VecGetOwnershipRange(y, &rstart, &rend)); >> PetscCall(PetscMalloc1(rend - rstart, &indices)); >> for (i = rstart, j = 0; i < rend; i++, j++) indices[j] = rank + size * j; >> >> PetscCall(ISCreateGeneral(PETSC_COMM_WORLD, rend - rstart, indices, >> PETSC_OWN_POINTER, &ix)); >> PetscCall(VecScatterCreate(x, ix, y, NULL, &vscat)); >> >> PetscCall(VecScatterBegin(vscat, x, y, INSERT_VALUES, SCATTER_FORWARD)); >> PetscCall(VecScatterEnd(vscat, x, y, INSERT_VALUES, SCATTER_FORWARD)); >> >> PetscCall(ISView(ix, PETSC_VIEWER_STDOUT_WORLD)); >> PetscCall(VecView(x, PETSC_VIEWER_STDERR_WORLD)); >> PetscCall(VecView(y, PETSC_VIEWER_STDERR_WORLD)); >> >> PetscCall(VecScatterDestroy(&vscat)); >> PetscCall(ISDestroy(&ix)); >> PetscCall(VecDestroy(&x)); >> PetscCall(VecDestroy(&y)); >> PetscCall(PetscFinalize()); >> return 0; >> } >> >> $ mpirun -n 12 ./ex100 >> IS Object: 12 MPI processes >> type: general >> [0] Number of indices in set 3 >> [0] 0 0 >> [0] 1 12 >> [0] 2 24 >> [1] Number of indices in set 3 >> [1] 0 1 >> [1] 1 13 >> [1] 2 25 >> [2] Number of indices in set 3 >> [2] 0 2 >> [2] 1 14 >> [2] 2 26 >> [3] Number of indices in set 3 >> [3] 0 3 >> [3] 1 15 >> [3] 2 27 >> [4] Number of indices in set 3 >> [4] 0 4 >> [4] 1 16 >> [4] 2 28 >> [5] Number of indices in set 3 >> [5] 0 5 >> [5] 1 17 >> [5] 2 29 >> [6] Number of indices in set 3 >> [6] 0 6 >> [6] 1 18 >> [6] 2 30 >> [7] Number of indices in set 3 >> [7] 0 7 >> [7] 1 19 >> [7] 2 31 >> [8] Number of indices in set 3 >> [8] 0 8 >> [8] 1 20 >> [8] 2 32 >> [9] Number of indices in set 3 >> [9] 0 9 >> [9] 1 21 >> [9] 2 33 >> [10] Number of indices in set 3 >> [10] 0 10 >> [10] 1 22 >> [10] 2 34 >> [11] Number of indices in set 3 >> [11] 0 11 >> [11] 1 23 >> [11] 2 35 >> Vec Object: Vec X 12 MPI processes >> type: mpi >> Process [0] >> 0. >> 1. >> 2. >> 3. >> 4. >> 5. >> 6. >> 7. >> 8. >> Process [1] >> 9. >> 10. >> 11. >> 12. >> 13. >> 14. >> 15. >> 16. >> 17. >> Process [2] >> 18. >> 19. >> 20. >> 21. >> 22. >> 23. >> 24. >> 25. >> 26. >> Process [3] >> 27. >> 28. >> 29. >> 30. >> 31. >> 32. >> 33. >> 34. >> 35. >> Process [4] >> Process [5] >> Process [6] >> Process [7] >> Process [8] >> Process [9] >> Process [10] >> Process [11] >> Vec Object: Vec Y 12 MPI processes >> type: mpi >> Process [0] >> 0. >> 12. >> 24. >> Process [1] >> 1. >> 13. >> 25. >> Process [2] >> 2. >> 14. >> 26. >> Process [3] >> 3. >> 15. >> 27. >> Process [4] >> 4. >> 16. >> 28. >> Process [5] >> 5. >> 17. >> 29. >> Process [6] >> 6. >> 18. >> 30. >> Process [7] >> 7. >> 19. >> 31. >> Process [8] >> 8. >> 20. >> 32. >> Process [9] >> 9. >> 21. >> 33. >> Process [10] >> 10. >> 22. >> 34. >> Process [11] >> 11. >> 23. >> 35. >> >> --Junchao Zhang >> >> >> On Tue, Dec 5, 2023 at 10:09?PM Sreeram R Venkat >> wrote: >> >>> Yes, I have an example code at github.com/s769/petsc-test. Only thing >>> is, when I described the example before, I simplified the actual use case >>> in the code to make things simpler. 
Here are the extra details relevant to >>> this code: >>> >>> - We assume a 2D processor grid, given by the command-line args >>> -proc_rows and -proc_cols >>> - The total length of the vector is n_m*n_t given by command-line >>> args -nm and -nt. n_m corresponds to a space index and n_t a time index. >>> - In the "Start" phase, the vector is divided into n_m blocks each >>> of size n_t (indexed space->time). The blocks are partitioned over the >>> first row of processors. For example if -nm = 4 and -proc_cols = 4, each >>> processor in the first row will get one block of size n_t. Each processor >>> in the first row gets n_m_local blocks of size n_t, where the sum of all >>> n_m_locals for the first row of processors is n_m. >>> - In the "Finish" phase, the vector is divided into n_t blocks each >>> of size n_m (indexed time->space; this is the reason for the permutation of >>> indices). The blocks are partitioned over all processors. Each processor >>> will get n_t_local blocks of size n_m, where the sum of all n_t_locals for >>> all processors is n_t. >>> >>> I think the basic idea is similar to the previous example, but these >>> details make things a bit more complicated. Please let me know if anything >>> is unclear, and I can try to explain more. >>> >>> Thanks for your help, >>> Sreeram >>> >>> On Tue, Dec 5, 2023 at 9:30?PM Junchao Zhang >>> wrote: >>> >>>> I think your approach is correct. Do you have an example code? >>>> >>>> --Junchao Zhang >>>> >>>> >>>> On Tue, Dec 5, 2023 at 5:15?PM Sreeram R Venkat >>>> wrote: >>>> >>>>> Hi, I have a follow up question on this. >>>>> >>>>> Now, I'm trying to do a scatter and permutation of the vector. Under >>>>> the same setup as the original example, here are the new Start and Finish >>>>> states I want to achieve: >>>>> Start Finish >>>>> Proc | Entries Proc | Entries >>>>> 0 | 0,...,8 0 | 0, 12, 24 >>>>> 1 | 9,...,17 1 | 1, 13, 25 >>>>> 2 | 18,...,26 2 | 2, 14, 26 >>>>> 3 | 27,...,35 3 | 3, 15, 27 >>>>> 4 | None 4 | 4, 16, 28 >>>>> 5 | None 5 | 5, 17, 29 >>>>> 6 | None 6 | 6, 18, 30 >>>>> 7 | None 7 | 7, 19, 31 >>>>> 8 | None 8 | 8, 20, 32 >>>>> 9 | None 9 | 9, 21, 33 >>>>> 10 | None 10 | 10, 22, 34 >>>>> 11 | None 11 | 11, 23, 35 >>>>> >>>>> So far, I've tried to use ISCreateGeneral(), with each process giving >>>>> an idx array corresponding to the indices it wants (i.e. idx on P0 looks >>>>> like [0,12,24] P1 -> [1,13, 25], and so on). >>>>> Then I use this to create the VecScatter with VecScatterCreate(x, is, >>>>> y, NULL, &scatter). >>>>> >>>>> However, when I try to do the scatter, I get some illegal memory >>>>> access errors. >>>>> >>>>> Is there something wrong with how I define the index sets? >>>>> >>>>> Thanks, >>>>> Sreeram >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> On Thu, Oct 5, 2023 at 12:57?PM Sreeram R Venkat >>>>> wrote: >>>>> >>>>>> Thank you. This works for me. >>>>>> >>>>>> Sreeram >>>>>> >>>>>> On Wed, Oct 4, 2023 at 6:41?PM Junchao Zhang >>>>>> wrote: >>>>>> >>>>>>> Hi, Sreeram, >>>>>>> You can try this code. Since x, y are both MPI vectors, we just need >>>>>>> to say we want to scatter x[0:N] to y[0:N]. The 12 index sets with your >>>>>>> example on the 12 processes would be [0..8], [9..17], [18..26], [27..35], >>>>>>> [], ..., []. Actually, you can do it arbitrarily, say, with 12 index sets >>>>>>> [0..17], [18..35], .., []. PETSc will figure out how to do the >>>>>>> communication. 
>>>>>>> >>>>>>> PetscInt rstart, rend, N; >>>>>>> IS ix; >>>>>>> VecScatter vscat; >>>>>>> Vec y; >>>>>>> MPI_Comm comm; >>>>>>> VecType type; >>>>>>> >>>>>>> PetscObjectGetComm((PetscObject)x, &comm); >>>>>>> VecGetType(x, &type); >>>>>>> VecGetSize(x, &N); >>>>>>> VecGetOwnershipRange(x, &rstart, &rend); >>>>>>> >>>>>>> VecCreate(comm, &y); >>>>>>> VecSetSizes(y, PETSC_DECIDE, N); >>>>>>> VecSetType(y, type); >>>>>>> >>>>>>> ISCreateStride(PetscObjectComm((PetscObject)x), rend - rstart, >>>>>>> rstart, 1, &ix); >>>>>>> VecScatterCreate(x, ix, y, ix, &vscat); >>>>>>> >>>>>>> --Junchao Zhang >>>>>>> >>>>>>> >>>>>>> On Wed, Oct 4, 2023 at 6:03?PM Sreeram R Venkat >>>>>>> wrote: >>>>>>> >>>>>>>> Suppose I am running on 12 processors, and I have a vector "v" of >>>>>>>> size 36 partitioned over the first 4. v still uses the PETSC_COMM_WORLD, so >>>>>>>> it has a layout of (9, 9, 9, 9, 0, 0, ..., 0). Now, I would like to >>>>>>>> repartition it over all 12 processors, so that the layout becomes (3, 3, 3, >>>>>>>> ..., 3). I've been trying to use VecScatter to do this, but I'm not sure >>>>>>>> what IndexSets to use for the sender and receiver. >>>>>>>> >>>>>>>> The result I am trying to achieve is this: >>>>>>>> >>>>>>>> Assume the vector is v = <0, 1, 2, ..., 35> >>>>>>>> >>>>>>>> Start Finish >>>>>>>> Proc | Entries Proc | Entries >>>>>>>> 0 | 0,...,8 0 | 0, 1, 2 >>>>>>>> 1 | 9,...,17 1 | 3, 4, 5 >>>>>>>> 2 | 18,...,26 2 | 6, 7, 8 >>>>>>>> 3 | 27,...,35 3 | 9, 10, 11 >>>>>>>> 4 | None 4 | 12, 13, 14 >>>>>>>> 5 | None 5 | 15, 16, 17 >>>>>>>> 6 | None 6 | 18, 19, 20 >>>>>>>> 7 | None 7 | 21, 22, 23 >>>>>>>> 8 | None 8 | 24, 25, 26 >>>>>>>> 9 | None 9 | 27, 28, 29 >>>>>>>> 10 | None 10 | 30, 31, 32 >>>>>>>> 11 | None 11 | 33, 34, 35 >>>>>>>> >>>>>>>> Appreciate any help you can provide on this. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Sreeram >>>>>>>> >>>>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From coltonbryant2021 at u.northwestern.edu Wed Dec 6 16:53:42 2023 From: coltonbryant2021 at u.northwestern.edu (Colton Bryant) Date: Wed, 6 Dec 2023 16:53:42 -0600 Subject: [petsc-users] DMSTAG Gathering Vector on single process Message-ID: Hello, I am working on a code in which a DMSTAG object is used to solve a fluid flow problem and I need to gather this flow data on a single process to interact with an existing (serial) library at each timestep of my simulation. After looking around the solution I've tried is: -use DMStagVecSplitToDMDA to extract vectors of each component of the flow -use DMDACreateNaturalVector and DMDAGlobalToNatural to get the components naturally ordered -use VecScatterCreateToZero to set up and then do the scatter to gather on the single process Unless I'm misunderstanding something this method results in a lot of memory allocation/freeing happening at each step of the evolution and I was wondering if there is a way to directly perform such a scatter from the DMSTAG object without splitting as I'm doing here. Any advice would be much appreciated! Best, Colton Bryant -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Wed Dec 6 17:18:28 2023 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 6 Dec 2023 18:18:28 -0500 Subject: [petsc-users] DMSTAG Gathering Vector on single process In-Reply-To: References: Message-ID: On Wed, Dec 6, 2023 at 5:54?PM Colton Bryant < coltonbryant2021 at u.northwestern.edu> wrote: > Hello, > > I am working on a code in which a DMSTAG object is used to solve a fluid > flow problem and I need to gather this flow data on a single process to > interact with an existing (serial) library at each timestep of my > simulation. After looking around the solution I've tried is: > > -use DMStagVecSplitToDMDA to extract vectors of each component of the flow > -use DMDACreateNaturalVector and DMDAGlobalToNatural to get the components > naturally ordered > -use VecScatterCreateToZero to set up and then do the scatter to gather on > the single process > > Unless I'm misunderstanding something this method results in a lot of > memory allocation/freeing happening at each step of the evolution and I was > wondering if there is a way to directly perform such a scatter from the > DMSTAG object without splitting as I'm doing here. > 1) You can see here: https://petsc.org/main/src/dm/impls/stag/stagda.c.html#DMStagVecSplitToDMDA that this function is small. You can do the DMDA creation manually, and then just call DMStagMigrateVecDMDA() each time, which will not create anything. 2) You can create the natural vector upfront, and just scatter each time. 3) You can create the serial vector upfront, and just scatter each time. This is some data movement. You can compress the g2n and 2zero scatters using https://petsc.org/main/manualpages/PetscSF/PetscSFCompose/ as an optimization. Thanks, Matt > Any advice would be much appreciated! > > Best, > Colton Bryant > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From coltonbryant2021 at u.northwestern.edu Wed Dec 6 17:37:59 2023 From: coltonbryant2021 at u.northwestern.edu (Colton Bryant) Date: Wed, 6 Dec 2023 17:37:59 -0600 Subject: [petsc-users] DMSTAG Gathering Vector on single process In-Reply-To: References: Message-ID: Ah excellent! I was not aware of the ability to preallocate the objects and migrate them each time. Thanks! -Colton On Wed, Dec 6, 2023 at 5:18?PM Matthew Knepley wrote: > On Wed, Dec 6, 2023 at 5:54?PM Colton Bryant < > coltonbryant2021 at u.northwestern.edu> wrote: > >> Hello, >> >> I am working on a code in which a DMSTAG object is used to solve a fluid >> flow problem and I need to gather this flow data on a single process to >> interact with an existing (serial) library at each timestep of my >> simulation. After looking around the solution I've tried is: >> >> -use DMStagVecSplitToDMDA to extract vectors of each component of the flow >> -use DMDACreateNaturalVector and DMDAGlobalToNatural to get the >> components naturally ordered >> -use VecScatterCreateToZero to set up and then do the scatter to gather >> on the single process >> >> Unless I'm misunderstanding something this method results in a lot of >> memory allocation/freeing happening at each step of the evolution and I was >> wondering if there is a way to directly perform such a scatter from the >> DMSTAG object without splitting as I'm doing here. 
>> > > 1) You can see here: > > https://petsc.org/main/src/dm/impls/stag/stagda.c.html#DMStagVecSplitToDMDA > > that this function is small. You can do the DMDA creation manually, and > then just call DMStagMigrateVecDMDA() each time, which will not create > anything. > > 2) You can create the natural vector upfront, and just scatter each time. > > 3) You can create the serial vector upfront, and just scatter each time. > > This is some data movement. You can compress the g2n and 2zero scatters > using > > https://petsc.org/main/manualpages/PetscSF/PetscSFCompose/ > > as an optimization. > > Thanks, > > Matt > > >> Any advice would be much appreciated! >> >> Best, >> Colton Bryant >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Dec 6 17:50:52 2023 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 6 Dec 2023 18:50:52 -0500 Subject: [petsc-users] DMSTAG Gathering Vector on single process In-Reply-To: References: Message-ID: <5A08B1BB-933D-4A46-8369-510D1C5AFDC6@petsc.dev> Depending on the serial library you may not need to split the vector into DMDA vectors with DMStagVecSplitToDMDA() for each component. Just global to natural and scatter to zero on the full vector, now the full vector is on the first rank and you can access what you need in that one vector if possible. > On Dec 6, 2023, at 6:37?PM, Colton Bryant wrote: > > Ah excellent! I was not aware of the ability to preallocate the objects and migrate them each time. > > Thanks! > -Colton > > On Wed, Dec 6, 2023 at 5:18?PM Matthew Knepley > wrote: >> On Wed, Dec 6, 2023 at 5:54?PM Colton Bryant > wrote: >>> Hello, >>> >>> I am working on a code in which a DMSTAG object is used to solve a fluid flow problem and I need to gather this flow data on a single process to interact with an existing (serial) library at each timestep of my simulation. After looking around the solution I've tried is: >>> >>> -use DMStagVecSplitToDMDA to extract vectors of each component of the flow >>> -use DMDACreateNaturalVector and DMDAGlobalToNatural to get the components naturally ordered >>> -use VecScatterCreateToZero to set up and then do the scatter to gather on the single process >>> >>> Unless I'm misunderstanding something this method results in a lot of memory allocation/freeing happening at each step of the evolution and I was wondering if there is a way to directly perform such a scatter from the DMSTAG object without splitting as I'm doing here. >> >> 1) You can see here: >> >> https://petsc.org/main/src/dm/impls/stag/stagda.c.html#DMStagVecSplitToDMDA >> >> that this function is small. You can do the DMDA creation manually, and then just call DMStagMigrateVecDMDA() each time, which will not create anything. >> >> 2) You can create the natural vector upfront, and just scatter each time. >> >> 3) You can create the serial vector upfront, and just scatter each time. >> >> This is some data movement. You can compress the g2n and 2zero scatters using >> >> https://petsc.org/main/manualpages/PetscSF/PetscSFCompose/ >> >> as an optimization. >> >> Thanks, >> >> Matt >> >>> Any advice would be much appreciated! 
>>> >>> Best, >>> Colton Bryant >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Dec 6 19:35:04 2023 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 6 Dec 2023 20:35:04 -0500 Subject: [petsc-users] DMSTAG Gathering Vector on single process In-Reply-To: <5A08B1BB-933D-4A46-8369-510D1C5AFDC6@petsc.dev> References: <5A08B1BB-933D-4A46-8369-510D1C5AFDC6@petsc.dev> Message-ID: On Wed, Dec 6, 2023 at 8:10?PM Barry Smith wrote: > > Depending on the serial library you may not need to split the vector > into DMDA vectors with DMStagVecSplitToDMDA() for each component. Just > global to natural and scatter to zero on the full vector, now the full > vector is on the first rank and you can access what you need in that one > vector if possible. > Does DMStag have a GlobalToNatural? Also, the serial code would have to have identical interleaving. Thanks, Matt > On Dec 6, 2023, at 6:37?PM, Colton Bryant < > coltonbryant2021 at u.northwestern.edu> wrote: > > Ah excellent! I was not aware of the ability to preallocate the objects > and migrate them each time. > > Thanks! > -Colton > > On Wed, Dec 6, 2023 at 5:18?PM Matthew Knepley wrote: > >> On Wed, Dec 6, 2023 at 5:54?PM Colton Bryant < >> coltonbryant2021 at u.northwestern.edu> wrote: >> >>> Hello, >>> >>> I am working on a code in which a DMSTAG object is used to solve a fluid >>> flow problem and I need to gather this flow data on a single process to >>> interact with an existing (serial) library at each timestep of my >>> simulation. After looking around the solution I've tried is: >>> >>> -use DMStagVecSplitToDMDA to extract vectors of each component of the >>> flow >>> -use DMDACreateNaturalVector and DMDAGlobalToNatural to get the >>> components naturally ordered >>> -use VecScatterCreateToZero to set up and then do the scatter to gather >>> on the single process >>> >>> Unless I'm misunderstanding something this method results in a lot of >>> memory allocation/freeing happening at each step of the evolution and I was >>> wondering if there is a way to directly perform such a scatter from the >>> DMSTAG object without splitting as I'm doing here. >>> >> >> 1) You can see here: >> >> >> https://petsc.org/main/src/dm/impls/stag/stagda.c.html#DMStagVecSplitToDMDA >> >> that this function is small. You can do the DMDA creation manually, and >> then just call DMStagMigrateVecDMDA() each time, which will not create >> anything. >> >> 2) You can create the natural vector upfront, and just scatter each time. >> >> 3) You can create the serial vector upfront, and just scatter each time. >> >> This is some data movement. You can compress the g2n and 2zero scatters >> using >> >> https://petsc.org/main/manualpages/PetscSF/PetscSFCompose/ >> >> as an optimization. >> >> Thanks, >> >> Matt >> >> >>> Any advice would be much appreciated! >>> >>> Best, >>> Colton Bryant >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. 
>> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Dec 6 20:17:51 2023 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 6 Dec 2023 21:17:51 -0500 Subject: [petsc-users] DMSTAG Gathering Vector on single process In-Reply-To: References: <5A08B1BB-933D-4A46-8369-510D1C5AFDC6@petsc.dev> Message-ID: <73C92F1E-2DB7-416D-A694-AD293027E295@petsc.dev> > On Dec 6, 2023, at 8:35?PM, Matthew Knepley wrote: > > On Wed, Dec 6, 2023 at 8:10?PM Barry Smith > wrote: >> >> Depending on the serial library you may not need to split the vector into DMDA vectors with DMStagVecSplitToDMDA() for each component. Just global to natural and scatter to zero on the full vector, now the full vector is on the first rank and you can access what you need in that one vector if possible. > > Does DMStag have a GlobalToNatural? Good point, it does not appear to have such a thing, though it could. > Also, the serial code would have to have identical interleaving. > > Thanks, > > Matt >>> On Dec 6, 2023, at 6:37?PM, Colton Bryant > wrote: >>> >>> Ah excellent! I was not aware of the ability to preallocate the objects and migrate them each time. >>> >>> Thanks! >>> -Colton >>> >>> On Wed, Dec 6, 2023 at 5:18?PM Matthew Knepley > wrote: >>>> On Wed, Dec 6, 2023 at 5:54?PM Colton Bryant > wrote: >>>>> Hello, >>>>> >>>>> I am working on a code in which a DMSTAG object is used to solve a fluid flow problem and I need to gather this flow data on a single process to interact with an existing (serial) library at each timestep of my simulation. After looking around the solution I've tried is: >>>>> >>>>> -use DMStagVecSplitToDMDA to extract vectors of each component of the flow >>>>> -use DMDACreateNaturalVector and DMDAGlobalToNatural to get the components naturally ordered >>>>> -use VecScatterCreateToZero to set up and then do the scatter to gather on the single process >>>>> >>>>> Unless I'm misunderstanding something this method results in a lot of memory allocation/freeing happening at each step of the evolution and I was wondering if there is a way to directly perform such a scatter from the DMSTAG object without splitting as I'm doing here. >>>> >>>> 1) You can see here: >>>> >>>> https://petsc.org/main/src/dm/impls/stag/stagda.c.html#DMStagVecSplitToDMDA >>>> >>>> that this function is small. You can do the DMDA creation manually, and then just call DMStagMigrateVecDMDA() each time, which will not create anything. >>>> >>>> 2) You can create the natural vector upfront, and just scatter each time. >>>> >>>> 3) You can create the serial vector upfront, and just scatter each time. >>>> >>>> This is some data movement. You can compress the g2n and 2zero scatters using >>>> >>>> https://petsc.org/main/manualpages/PetscSF/PetscSFCompose/ >>>> >>>> as an optimization. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>>> Any advice would be much appreciated! >>>>> >>>>> Best, >>>>> Colton Bryant >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
>>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >> > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From srvenkat at utexas.edu Thu Dec 7 12:17:02 2023 From: srvenkat at utexas.edu (Sreeram R Venkat) Date: Thu, 7 Dec 2023 12:17:02 -0600 Subject: [petsc-users] Matvecs and KSPSolves with multiple vectors Message-ID: I have 2 sequential matrices M and R (both MATSEQAIJCUSPARSE of size n x n) and a vector v of size n*m. v = [v_1 , v_2 ,... , v_m] where v_i has size n. The data for v can be stored either in column-major or row-major order. Now, I want to do 2 types of operations: 1. Matvecs of the form M*v_i = w_i, for i = 1..m. 2. KSPSolves of the form R*x_i = v_i, for i = 1..m. >From what I have read on the documentation, I can think of 2 approaches. 1. Get the pointer to the data in v (column-major) and use it to create a dense matrix V. Then do a MatMatMult with M*V = W, and take the data pointer of W to create the vector w. For KSPSolves, use KSPMatSolve with R and V. 2. Create a MATMAIJ using M/R and use that for matvecs directly with the vector v. I don't know if KSPSolve with the MATMAIJ will know that it is a multiple RHS system and act accordingly. Which would be the more efficient option? As a side-note, I am also wondering if there is a way to use row-major storage of the vector v. The reason is that this could allow for more coalesced memory access when doing matvecs. Thanks, Sreeram -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Thu Dec 7 13:34:15 2023 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 7 Dec 2023 14:34:15 -0500 Subject: [petsc-users] Matvecs and KSPSolves with multiple vectors In-Reply-To: References: Message-ID: <855B7267-E3F9-43CC-9EBE-96276E9A8E42@petsc.dev> > On Dec 7, 2023, at 1:17?PM, Sreeram R Venkat wrote: > > I have 2 sequential matrices M and R (both MATSEQAIJCUSPARSE of size n x n) and a vector v of size n*m. v = [v_1 , v_2 ,... , v_m] where v_i has size n. The data for v can be stored either in column-major or row-major order. Now, I want to do 2 types of operations: > > 1. Matvecs of the form M*v_i = w_i, for i = 1..m. > 2. KSPSolves of the form R*x_i = v_i, for i = 1..m. > > From what I have read on the documentation, I can think of 2 approaches. > > 1. Get the pointer to the data in v (column-major) and use it to create a dense matrix V. Then do a MatMatMult with M*V = W, and take the data pointer of W to create the vector w. For KSPSolves, use KSPMatSolve with R and V. > > 2. Create a MATMAIJ using M/R and use that for matvecs directly with the vector v. I don't know if KSPSolve with the MATMAIJ will know that it is a multiple RHS system and act accordingly. > > Which would be the more efficient option? Use 1. > > As a side-note, I am also wondering if there is a way to use row-major storage of the vector v. No > The reason is that this could allow for more coalesced memory access when doing matvecs. PETSc matrix-vector products use BLAS GMEV matrix-vector products for the computation so in theory they should already be well-optimized > > Thanks, > Sreeram -------------- next part -------------- An HTML attachment was scrubbed... 
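
For reference, a minimal sketch of the "option 1" recommended just above, written out with the PETSc calls it relies on. The helper name, the kspR argument (a KSP assumed to be already set up from R), and the sizes n and m are illustrative assumptions rather than code from this thread; on the GPU the same pattern would apply with MatCreateSeqDenseCUDA() and a device pointer in place of MatCreateSeqDense().

#include <petscksp.h>

/* Sketch of "option 1" (illustrative, assumed names): wrap the column-major
   storage of v in a dense n-by-m matrix without copying, then do all m
   matvecs with one MatMatMult and all m solves with one KSPMatSolve. */
static PetscErrorCode MultiColumnMultAndSolve(Mat M, KSP kspR, Vec v, PetscInt n, PetscInt m, Mat *W, Mat *X)
{
  const PetscScalar *varr;
  Mat                V;

  PetscFunctionBeginUser;
  PetscCall(VecGetArrayRead(v, &varr));
  /* V shares v's storage (no copy); column i of V is v_i, used read-only below */
  PetscCall(MatCreateSeqDense(PETSC_COMM_SELF, n, m, (PetscScalar *)varr, &V));

  /* all m matvecs at once: W = M * V */
  PetscCall(MatMatMult(M, V, MAT_INITIAL_MATRIX, PETSC_DEFAULT, W));

  /* all m solves at once: R * X = V, using the KSP built from R */
  PetscCall(MatDuplicate(V, MAT_DO_NOT_COPY_VALUES, X));
  PetscCall(KSPMatSolve(kspR, V, *X));

  PetscCall(MatDestroy(&V)); /* V only borrowed varr, so v's data is untouched */
  PetscCall(VecRestoreArrayRead(v, &varr));
  PetscFunctionReturn(PETSC_SUCCESS);
}

A column of W can then be viewed as a Vec without copying, e.g. via MatDenseGetArrayRead() together with VecCreateSeqWithArray().
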
URL: From pierre at joliv.et Thu Dec 7 14:02:19 2023 From: pierre at joliv.et (Pierre Jolivet) Date: Thu, 7 Dec 2023 21:02:19 +0100 Subject: [petsc-users] Matvecs and KSPSolves with multiple vectors In-Reply-To: <855B7267-E3F9-43CC-9EBE-96276E9A8E42@petsc.dev> References: <855B7267-E3F9-43CC-9EBE-96276E9A8E42@petsc.dev> Message-ID: To expand on Barry?s answer, we have observed repeatedly that MatMatMult with MatAIJ performs better than MatMult with MatMAIJ, you can reproduce this on your own with https://petsc.org/release/src/mat/tests/ex237.c.html. Also, I?m guessing you are using some sort of preconditioner within your KSP. Not all are ?KSPMatSolve-ready?, i.e., they may treat blocks of right-hand sides column by column, which is very inefficient. You could run your code with -info dump and send us dump.0 to see what needs to be done on our end to make things more efficient, should you not be satisfied with the current performance of the code. Thanks, Pierre > On 7 Dec 2023, at 8:34?PM, Barry Smith wrote: > > > >> On Dec 7, 2023, at 1:17?PM, Sreeram R Venkat wrote: >> >> I have 2 sequential matrices M and R (both MATSEQAIJCUSPARSE of size n x n) and a vector v of size n*m. v = [v_1 , v_2 ,... , v_m] where v_i has size n. The data for v can be stored either in column-major or row-major order. Now, I want to do 2 types of operations: >> >> 1. Matvecs of the form M*v_i = w_i, for i = 1..m. >> 2. KSPSolves of the form R*x_i = v_i, for i = 1..m. >> >> From what I have read on the documentation, I can think of 2 approaches. >> >> 1. Get the pointer to the data in v (column-major) and use it to create a dense matrix V. Then do a MatMatMult with M*V = W, and take the data pointer of W to create the vector w. For KSPSolves, use KSPMatSolve with R and V. >> >> 2. Create a MATMAIJ using M/R and use that for matvecs directly with the vector v. I don't know if KSPSolve with the MATMAIJ will know that it is a multiple RHS system and act accordingly. >> >> Which would be the more efficient option? > > Use 1. >> >> As a side-note, I am also wondering if there is a way to use row-major storage of the vector v. > > No > >> The reason is that this could allow for more coalesced memory access when doing matvecs. > > PETSc matrix-vector products use BLAS GMEV matrix-vector products for the computation so in theory they should already be well-optimized > >> >> Thanks, >> Sreeram -------------- next part -------------- An HTML attachment was scrubbed... URL: From srvenkat at utexas.edu Thu Dec 7 14:37:49 2023 From: srvenkat at utexas.edu (Sreeram R Venkat) Date: Thu, 7 Dec 2023 14:37:49 -0600 Subject: [petsc-users] Matvecs and KSPSolves with multiple vectors In-Reply-To: References: <855B7267-E3F9-43CC-9EBE-96276E9A8E42@petsc.dev> Message-ID: Thank you Barry and Pierre; I will proceed with the first option. I want to use the AMGX preconditioner for the KSP. I will try it out and see how it performs. Thanks, Sreeram On Thu, Dec 7, 2023 at 2:02?PM Pierre Jolivet wrote: > To expand on Barry?s answer, we have observed repeatedly that MatMatMult > with MatAIJ performs better than MatMult with MatMAIJ, you can reproduce > this on your own with https://petsc.org/release/src/mat/tests/ex237.c.html > . > Also, I?m guessing you are using some sort of preconditioner within your > KSP. > Not all are ?KSPMatSolve-ready?, i.e., they may treat blocks of right-hand > sides column by column, which is very inefficient. 
> You could run your code with -info dump and send us dump.0 to see what > needs to be done on our end to make things more efficient, should you not > be satisfied with the current performance of the code. > > Thanks, > Pierre > > On 7 Dec 2023, at 8:34?PM, Barry Smith wrote: > > > > On Dec 7, 2023, at 1:17?PM, Sreeram R Venkat wrote: > > I have 2 sequential matrices M and R (both MATSEQAIJCUSPARSE of size n x > n) and a vector v of size n*m. v = [v_1 , v_2 ,... , v_m] where v_i has > size n. The data for v can be stored either in column-major or row-major > order. Now, I want to do 2 types of operations: > > 1. Matvecs of the form M*v_i = w_i, for i = 1..m. > 2. KSPSolves of the form R*x_i = v_i, for i = 1..m. > > From what I have read on the documentation, I can think of 2 approaches. > > 1. Get the pointer to the data in v (column-major) and use it to create a > dense matrix V. Then do a MatMatMult with M*V = W, and take the data > pointer of W to create the vector w. For KSPSolves, use KSPMatSolve with R > and V. > > 2. Create a MATMAIJ using M/R and use that for matvecs directly with the > vector v. I don't know if KSPSolve with the MATMAIJ will know that it is a > multiple RHS system and act accordingly. > > Which would be the more efficient option? > > > Use 1. > > > As a side-note, I am also wondering if there is a way to use row-major > storage of the vector v. > > > No > > The reason is that this could allow for more coalesced memory access when > doing matvecs. > > > PETSc matrix-vector products use BLAS GMEV matrix-vector products for > the computation so in theory they should already be well-optimized > > > Thanks, > Sreeram > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre at joliv.et Thu Dec 7 15:02:54 2023 From: pierre at joliv.et (Pierre Jolivet) Date: Thu, 7 Dec 2023 22:02:54 +0100 Subject: [petsc-users] Matvecs and KSPSolves with multiple vectors In-Reply-To: References: <855B7267-E3F9-43CC-9EBE-96276E9A8E42@petsc.dev> Message-ID: > On 7 Dec 2023, at 9:37?PM, Sreeram R Venkat wrote: > > Thank you Barry and Pierre; I will proceed with the first option. > > I want to use the AMGX preconditioner for the KSP. I will try it out and see how it performs. Just FYI, AMGX does not handle systems with multiple RHS, and thus has no PCMatApply() implementation. BoomerAMG does, and there is a PCMatApply_HYPRE_BoomerAMG() implementation. But let us know if you need assistance figuring things out. Thanks, Pierre > Thanks, > Sreeram > > On Thu, Dec 7, 2023 at 2:02?PM Pierre Jolivet > wrote: >> To expand on Barry?s answer, we have observed repeatedly that MatMatMult with MatAIJ performs better than MatMult with MatMAIJ, you can reproduce this on your own with https://petsc.org/release/src/mat/tests/ex237.c.html. >> Also, I?m guessing you are using some sort of preconditioner within your KSP. >> Not all are ?KSPMatSolve-ready?, i.e., they may treat blocks of right-hand sides column by column, which is very inefficient. >> You could run your code with -info dump and send us dump.0 to see what needs to be done on our end to make things more efficient, should you not be satisfied with the current performance of the code. >> >> Thanks, >> Pierre >> >>> On 7 Dec 2023, at 8:34?PM, Barry Smith > wrote: >>> >>> >>> >>>> On Dec 7, 2023, at 1:17?PM, Sreeram R Venkat > wrote: >>>> >>>> I have 2 sequential matrices M and R (both MATSEQAIJCUSPARSE of size n x n) and a vector v of size n*m. v = [v_1 , v_2 ,... , v_m] where v_i has size n. 
The data for v can be stored either in column-major or row-major order. Now, I want to do 2 types of operations: >>>> >>>> 1. Matvecs of the form M*v_i = w_i, for i = 1..m. >>>> 2. KSPSolves of the form R*x_i = v_i, for i = 1..m. >>>> >>>> From what I have read on the documentation, I can think of 2 approaches. >>>> >>>> 1. Get the pointer to the data in v (column-major) and use it to create a dense matrix V. Then do a MatMatMult with M*V = W, and take the data pointer of W to create the vector w. For KSPSolves, use KSPMatSolve with R and V. >>>> >>>> 2. Create a MATMAIJ using M/R and use that for matvecs directly with the vector v. I don't know if KSPSolve with the MATMAIJ will know that it is a multiple RHS system and act accordingly. >>>> >>>> Which would be the more efficient option? >>> >>> Use 1. >>>> >>>> As a side-note, I am also wondering if there is a way to use row-major storage of the vector v. >>> >>> No >>> >>>> The reason is that this could allow for more coalesced memory access when doing matvecs. >>> >>> PETSc matrix-vector products use BLAS GMEV matrix-vector products for the computation so in theory they should already be well-optimized >>> >>>> >>>> Thanks, >>>> Sreeram >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From srvenkat at utexas.edu Thu Dec 7 15:10:58 2023 From: srvenkat at utexas.edu (Sreeram R Venkat) Date: Thu, 7 Dec 2023 15:10:58 -0600 Subject: [petsc-users] Matvecs and KSPSolves with multiple vectors In-Reply-To: References: <855B7267-E3F9-43CC-9EBE-96276E9A8E42@petsc.dev> Message-ID: Oh, in that case I will try out BoomerAMG. Getting AMGX to build correctly was also tricky so hopefully the HYPRE build will be easier. Thanks, Sreeram On Thu, Dec 7, 2023, 3:03 PM Pierre Jolivet wrote: > > > On 7 Dec 2023, at 9:37?PM, Sreeram R Venkat wrote: > > Thank you Barry and Pierre; I will proceed with the first option. > > I want to use the AMGX preconditioner for the KSP. I will try it out and > see how it performs. > > > Just FYI, AMGX does not handle systems with multiple RHS, and thus has no > PCMatApply() implementation. > BoomerAMG does, and there is a PCMatApply_HYPRE_BoomerAMG() implementation. > But let us know if you need assistance figuring things out. > > Thanks, > Pierre > > Thanks, > Sreeram > > On Thu, Dec 7, 2023 at 2:02?PM Pierre Jolivet wrote: > >> To expand on Barry?s answer, we have observed repeatedly that MatMatMult >> with MatAIJ performs better than MatMult with MatMAIJ, you can reproduce >> this on your own with >> https://petsc.org/release/src/mat/tests/ex237.c.html. >> Also, I?m guessing you are using some sort of preconditioner within your >> KSP. >> Not all are ?KSPMatSolve-ready?, i.e., they may treat blocks of >> right-hand sides column by column, which is very inefficient. >> You could run your code with -info dump and send us dump.0 to see what >> needs to be done on our end to make things more efficient, should you not >> be satisfied with the current performance of the code. >> >> Thanks, >> Pierre >> >> On 7 Dec 2023, at 8:34?PM, Barry Smith wrote: >> >> >> >> On Dec 7, 2023, at 1:17?PM, Sreeram R Venkat wrote: >> >> I have 2 sequential matrices M and R (both MATSEQAIJCUSPARSE of size n x >> n) and a vector v of size n*m. v = [v_1 , v_2 ,... , v_m] where v_i has >> size n. The data for v can be stored either in column-major or row-major >> order. Now, I want to do 2 types of operations: >> >> 1. Matvecs of the form M*v_i = w_i, for i = 1..m. >> 2. 
KSPSolves of the form R*x_i = v_i, for i = 1..m. >> >> From what I have read on the documentation, I can think of 2 approaches. >> >> 1. Get the pointer to the data in v (column-major) and use it to create a >> dense matrix V. Then do a MatMatMult with M*V = W, and take the data >> pointer of W to create the vector w. For KSPSolves, use KSPMatSolve with R >> and V. >> >> 2. Create a MATMAIJ using M/R and use that for matvecs directly with the >> vector v. I don't know if KSPSolve with the MATMAIJ will know that it is a >> multiple RHS system and act accordingly. >> >> Which would be the more efficient option? >> >> >> Use 1. >> >> >> As a side-note, I am also wondering if there is a way to use row-major >> storage of the vector v. >> >> >> No >> >> The reason is that this could allow for more coalesced memory access when >> doing matvecs. >> >> >> PETSc matrix-vector products use BLAS GMEV matrix-vector products for >> the computation so in theory they should already be well-optimized >> >> >> Thanks, >> Sreeram >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Thu Dec 7 16:03:58 2023 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 7 Dec 2023 17:03:58 -0500 Subject: [petsc-users] Matvecs and KSPSolves with multiple vectors In-Reply-To: References: <855B7267-E3F9-43CC-9EBE-96276E9A8E42@petsc.dev> Message-ID: N.B., AMGX interface is a bit experimental. Mark On Thu, Dec 7, 2023 at 4:11?PM Sreeram R Venkat wrote: > Oh, in that case I will try out BoomerAMG. Getting AMGX to build correctly > was also tricky so hopefully the HYPRE build will be easier. > > Thanks, > Sreeram > > On Thu, Dec 7, 2023, 3:03 PM Pierre Jolivet wrote: > >> >> >> On 7 Dec 2023, at 9:37?PM, Sreeram R Venkat wrote: >> >> Thank you Barry and Pierre; I will proceed with the first option. >> >> I want to use the AMGX preconditioner for the KSP. I will try it out and >> see how it performs. >> >> >> Just FYI, AMGX does not handle systems with multiple RHS, and thus has no >> PCMatApply() implementation. >> BoomerAMG does, and there is a PCMatApply_HYPRE_BoomerAMG() >> implementation. >> But let us know if you need assistance figuring things out. >> >> Thanks, >> Pierre >> >> Thanks, >> Sreeram >> >> On Thu, Dec 7, 2023 at 2:02?PM Pierre Jolivet wrote: >> >>> To expand on Barry?s answer, we have observed repeatedly that MatMatMult >>> with MatAIJ performs better than MatMult with MatMAIJ, you can reproduce >>> this on your own with >>> https://petsc.org/release/src/mat/tests/ex237.c.html. >>> Also, I?m guessing you are using some sort of preconditioner within your >>> KSP. >>> Not all are ?KSPMatSolve-ready?, i.e., they may treat blocks of >>> right-hand sides column by column, which is very inefficient. >>> You could run your code with -info dump and send us dump.0 to see what >>> needs to be done on our end to make things more efficient, should you not >>> be satisfied with the current performance of the code. >>> >>> Thanks, >>> Pierre >>> >>> On 7 Dec 2023, at 8:34?PM, Barry Smith wrote: >>> >>> >>> >>> On Dec 7, 2023, at 1:17?PM, Sreeram R Venkat >>> wrote: >>> >>> I have 2 sequential matrices M and R (both MATSEQAIJCUSPARSE of size n x >>> n) and a vector v of size n*m. v = [v_1 , v_2 ,... , v_m] where v_i has >>> size n. The data for v can be stored either in column-major or row-major >>> order. Now, I want to do 2 types of operations: >>> >>> 1. Matvecs of the form M*v_i = w_i, for i = 1..m. >>> 2. KSPSolves of the form R*x_i = v_i, for i = 1..m. 
>>> >>> From what I have read on the documentation, I can think of 2 approaches. >>> >>> 1. Get the pointer to the data in v (column-major) and use it to create >>> a dense matrix V. Then do a MatMatMult with M*V = W, and take the data >>> pointer of W to create the vector w. For KSPSolves, use KSPMatSolve with R >>> and V. >>> >>> 2. Create a MATMAIJ using M/R and use that for matvecs directly with the >>> vector v. I don't know if KSPSolve with the MATMAIJ will know that it is a >>> multiple RHS system and act accordingly. >>> >>> Which would be the more efficient option? >>> >>> >>> Use 1. >>> >>> >>> As a side-note, I am also wondering if there is a way to use row-major >>> storage of the vector v. >>> >>> >>> No >>> >>> The reason is that this could allow for more coalesced memory access >>> when doing matvecs. >>> >>> >>> PETSc matrix-vector products use BLAS GMEV matrix-vector products for >>> the computation so in theory they should already be well-optimized >>> >>> >>> Thanks, >>> Sreeram >>> >>> >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From srvenkat at utexas.edu Fri Dec 8 12:53:13 2023 From: srvenkat at utexas.edu (Sreeram R Venkat) Date: Fri, 8 Dec 2023 12:53:13 -0600 Subject: [petsc-users] Configure error while building PETSc with CUDA/MVAPICH2-GDR Message-ID: I am trying to build PETSc with CUDA using the CUDA-Aware MVAPICH2-GDR. Here is my configure command: ./configure PETSC_ARCH=linux-c-debug-mvapich2-gdr --download-hypre --with-cuda=true --cuda-dir=$TACC_CUDA_DIR --with-hdf5=true --with-hdf5-dir=$TACC_PHDF5_DIR --download-elemental --download-metis --download-parmetis --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 which errors with: UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): --------------------------------------------------------------------------------------------- CUDA compile failed with arch flags " -ccbin mpic++ -std=c++14 -Xcompiler -fPIC -Xcompiler -fvisibility=hidden -g -lineinfo -gencode arch=compute_80,code=sm_80" generated from "--with-cuda-arch=80" The same configure command works when I use the Intel MPI and I can build with CUDA. The full config.log file is attached. Please let me know if you need any other information. I appreciate your help with this. Thanks, Sreeram -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: application/octet-stream Size: 2306445 bytes Desc: not available URL: From knepley at gmail.com Fri Dec 8 13:00:33 2023 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 8 Dec 2023 14:00:33 -0500 Subject: [petsc-users] Configure error while building PETSc with CUDA/MVAPICH2-GDR In-Reply-To: References: Message-ID: On Fri, Dec 8, 2023 at 1:54?PM Sreeram R Venkat wrote: > I am trying to build PETSc with CUDA using the CUDA-Aware MVAPICH2-GDR. 
> > Here is my configure command: > > ./configure PETSC_ARCH=linux-c-debug-mvapich2-gdr --download-hypre > --with-cuda=true --cuda-dir=$TACC_CUDA_DIR --with-hdf5=true > --with-hdf5-dir=$TACC_PHDF5_DIR --download-elemental --download-metis > --download-parmetis --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 > > which errors with: > > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for > details): > > --------------------------------------------------------------------------------------------- > CUDA compile failed with arch flags " -ccbin mpic++ -std=c++14 > -Xcompiler -fPIC > -Xcompiler -fvisibility=hidden -g -lineinfo -gencode > arch=compute_80,code=sm_80" > generated from "--with-cuda-arch=80" > > > > The same configure command works when I use the Intel MPI and I can build > with CUDA. The full config.log file is attached. Please let me know if you > need any other information. I appreciate your help with this. > The proximate error is Executing: nvcc -c -o /tmp/petsc-kn3f29gl/config.packages.cuda/conftest.o -I/tmp/petsc-kn3f29gl/config.setCompilers -I/tmp/petsc-kn3f29gl/config.types -I/tmp/petsc-kn3f29gl/config.packages.cuda -ccbin mpic++ -std=c++14 -Xcompiler -fPIC -Xcompiler -fvisibility=hidden -g -lineinfo -gencode arch=compute_80,code=sm_80 /tmp/petsc-kn3f29gl/config.packages.cuda/ conftest.cu stdout: /opt/apps/cuda/11.4/include/crt/sm_80_rt.hpp(141): error: more than one instance of overloaded function "__nv_associate_access_property_impl" has "C" linkage 1 error detected in the compilation of "/tmp/petsc-kn3f29gl/config.packages.cuda/conftest.cu". Possible ERROR while running compiler: exit code 1 stderr: /opt/apps/cuda/11.4/include/crt/sm_80_rt.hpp(141): error: more than one instance of overloaded function "__nv_associate_access_property_impl" has "C" linkage 1 error detected in the compilation of "/tmp/petsc-kn3f29gl/config.packages.cuda This looks like screwed up headers to me, but I will let someone that understands CUDA compilation reply. Thanks, Matt Thanks, > Sreeram > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Fri Dec 8 13:14:58 2023 From: balay at mcs.anl.gov (Satish Balay) Date: Fri, 8 Dec 2023 13:14:58 -0600 (CST) Subject: [petsc-users] Configure error while building PETSc with CUDA/MVAPICH2-GDR In-Reply-To: References: Message-ID: <1da79702-c1eb-0ad8-6efc-64580e02bd07@mcs.anl.gov> Executing: mpicc -show stdout: icc -I/opt/apps/cuda/11.4/include -I/opt/apps/cuda/11.4/include -lcuda -L/opt/apps/cuda/11.4/lib64/stubs -L/opt/apps/cuda/11.4/lib64 -lcudart -lrt -Wl,-rpath,/opt/apps/cuda/11.4/lib64 -Wl,-rpath,XORIGIN/placeholder -Wl,--build-id -L/opt/apps/cuda/11.4/lib64/ -lm -I/opt/apps/intel19/mvapich2-gdr/2.3.7/include -L/opt/apps/intel19/mvapich2-gdr/2.3.7/lib64 -Wl,-rpath -Wl,/opt/apps/intel19/mvapich2-gdr/2.3.7/lib64 -Wl,--enable-new-dtags -lmpi Checking for program /opt/apps/cuda/12.0/bin/nvcc...found Looks like you are trying to mix in 2 different cuda versions in this build. Perhaps you need to use cuda-11.4 - with this install of mvapich.. Satish On Fri, 8 Dec 2023, Matthew Knepley wrote: > On Fri, Dec 8, 2023 at 1:54?PM Sreeram R Venkat wrote: > > > I am trying to build PETSc with CUDA using the CUDA-Aware MVAPICH2-GDR. 
> > > > Here is my configure command: > > > > ./configure PETSC_ARCH=linux-c-debug-mvapich2-gdr --download-hypre > > --with-cuda=true --cuda-dir=$TACC_CUDA_DIR --with-hdf5=true > > --with-hdf5-dir=$TACC_PHDF5_DIR --download-elemental --download-metis > > --download-parmetis --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 > > > > which errors with: > > > > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for > > details): > > > > --------------------------------------------------------------------------------------------- > > CUDA compile failed with arch flags " -ccbin mpic++ -std=c++14 > > -Xcompiler -fPIC > > -Xcompiler -fvisibility=hidden -g -lineinfo -gencode > > arch=compute_80,code=sm_80" > > generated from "--with-cuda-arch=80" > > > > > > > > The same configure command works when I use the Intel MPI and I can build > > with CUDA. The full config.log file is attached. Please let me know if you > > need any other information. I appreciate your help with this. > > > > The proximate error is > > Executing: nvcc -c -o /tmp/petsc-kn3f29gl/config.packages.cuda/conftest.o > -I/tmp/petsc-kn3f29gl/config.setCompilers > -I/tmp/petsc-kn3f29gl/config.types > -I/tmp/petsc-kn3f29gl/config.packages.cuda -ccbin mpic++ -std=c++14 > -Xcompiler -fPIC -Xcompiler -fvisibility=hidden -g -lineinfo -gencode > arch=compute_80,code=sm_80 /tmp/petsc-kn3f29gl/config.packages.cuda/ > conftest.cu > stdout: > /opt/apps/cuda/11.4/include/crt/sm_80_rt.hpp(141): error: more than one > instance of overloaded function "__nv_associate_access_property_impl" has > "C" linkage > 1 error detected in the compilation of > "/tmp/petsc-kn3f29gl/config.packages.cuda/conftest.cu". > Possible ERROR while running compiler: exit code 1 > stderr: > /opt/apps/cuda/11.4/include/crt/sm_80_rt.hpp(141): error: more than one > instance of overloaded function "__nv_associate_access_property_impl" has > "C" linkage > > 1 error detected in the compilation of > "/tmp/petsc-kn3f29gl/config.packages.cuda > > This looks like screwed up headers to me, but I will let someone that > understands CUDA compilation reply. > > Thanks, > > Matt > > Thanks, > > Sreeram > > > > > From srvenkat at utexas.edu Fri Dec 8 15:29:20 2023 From: srvenkat at utexas.edu (Sreeram R Venkat) Date: Fri, 8 Dec 2023 15:29:20 -0600 Subject: [petsc-users] Configure error while building PETSc with CUDA/MVAPICH2-GDR In-Reply-To: <1da79702-c1eb-0ad8-6efc-64580e02bd07@mcs.anl.gov> References: <1da79702-c1eb-0ad8-6efc-64580e02bd07@mcs.anl.gov> Message-ID: Thank you, changing to CUDA 11.4 fixed the issue. The mvapich2-gdr module didn't require CUDA 11.4 as a dependency, so I was using 12.0 On Fri, Dec 8, 2023 at 1:15?PM Satish Balay wrote: > Executing: mpicc -show > stdout: icc -I/opt/apps/cuda/11.4/include -I/opt/apps/cuda/11.4/include > -lcuda -L/opt/apps/cuda/11.4/lib64/stubs -L/opt/apps/cuda/11.4/lib64 > -lcudart -lrt -Wl,-rpath,/opt/apps/cuda/11.4/lib64 > -Wl,-rpath,XORIGIN/placeholder -Wl,--build-id -L/opt/apps/cuda/11.4/lib64/ > -lm -I/opt/apps/intel19/mvapich2-gdr/2.3.7/include > -L/opt/apps/intel19/mvapich2-gdr/2.3.7/lib64 -Wl,-rpath > -Wl,/opt/apps/intel19/mvapich2-gdr/2.3.7/lib64 -Wl,--enable-new-dtags -lmpi > > Checking for program /opt/apps/cuda/12.0/bin/nvcc...found > > Looks like you are trying to mix in 2 different cuda versions in this > build. > > Perhaps you need to use cuda-11.4 - with this install of mvapich.. 
> > Satish > > On Fri, 8 Dec 2023, Matthew Knepley wrote: > > > On Fri, Dec 8, 2023 at 1:54?PM Sreeram R Venkat > wrote: > > > > > I am trying to build PETSc with CUDA using the CUDA-Aware MVAPICH2-GDR. > > > > > > Here is my configure command: > > > > > > ./configure PETSC_ARCH=linux-c-debug-mvapich2-gdr --download-hypre > > > --with-cuda=true --cuda-dir=$TACC_CUDA_DIR --with-hdf5=true > > > --with-hdf5-dir=$TACC_PHDF5_DIR --download-elemental --download-metis > > > --download-parmetis --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 > > > > > > which errors with: > > > > > > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for > > > details): > > > > > > > --------------------------------------------------------------------------------------------- > > > CUDA compile failed with arch flags " -ccbin mpic++ -std=c++14 > > > -Xcompiler -fPIC > > > -Xcompiler -fvisibility=hidden -g -lineinfo -gencode > > > arch=compute_80,code=sm_80" > > > generated from "--with-cuda-arch=80" > > > > > > > > > > > > The same configure command works when I use the Intel MPI and I can > build > > > with CUDA. The full config.log file is attached. Please let me know if > you > > > need any other information. I appreciate your help with this. > > > > > > > The proximate error is > > > > Executing: nvcc -c -o /tmp/petsc-kn3f29gl/config.packages.cuda/conftest.o > > -I/tmp/petsc-kn3f29gl/config.setCompilers > > -I/tmp/petsc-kn3f29gl/config.types > > -I/tmp/petsc-kn3f29gl/config.packages.cuda -ccbin mpic++ -std=c++14 > > -Xcompiler -fPIC -Xcompiler -fvisibility=hidden -g -lineinfo -gencode > > arch=compute_80,code=sm_80 /tmp/petsc-kn3f29gl/config.packages.cuda/ > > conftest.cu > > stdout: > > /opt/apps/cuda/11.4/include/crt/sm_80_rt.hpp(141): error: more than one > > instance of overloaded function "__nv_associate_access_property_impl" has > > "C" linkage > > 1 error detected in the compilation of > > "/tmp/petsc-kn3f29gl/config.packages.cuda/conftest.cu". > > Possible ERROR while running compiler: exit code 1 > > stderr: > > /opt/apps/cuda/11.4/include/crt/sm_80_rt.hpp(141): error: more than one > > instance of overloaded function "__nv_associate_access_property_impl" has > > "C" linkage > > > > 1 error detected in the compilation of > > "/tmp/petsc-kn3f29gl/config.packages.cuda > > > > This looks like screwed up headers to me, but I will let someone that > > understands CUDA compilation reply. > > > > Thanks, > > > > Matt > > > > Thanks, > > > Sreeram > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From srvenkat at utexas.edu Fri Dec 8 16:16:54 2023 From: srvenkat at utexas.edu (Sreeram R Venkat) Date: Fri, 8 Dec 2023 16:16:54 -0600 Subject: [petsc-users] Configure error while building PETSc with CUDA/MVAPICH2-GDR In-Reply-To: References: <1da79702-c1eb-0ad8-6efc-64580e02bd07@mcs.anl.gov> Message-ID: Actually, when I compile my program with this build of PETSc and run, I still get the error: PETSC ERROR: PETSc is configured with GPU support, but your MPI is not GPU-aware. For better performance, please use a GPU-aware MPI. I have the mvapich2-gdr module loaded and MV2_USE_CUDA=1. Is there anything else I need to do? Thanks, Sreeram On Fri, Dec 8, 2023 at 3:29?PM Sreeram R Venkat wrote: > Thank you, changing to CUDA 11.4 fixed the issue. 
The mvapich2-gdr module > didn't require CUDA 11.4 as a dependency, so I was using 12.0 > > On Fri, Dec 8, 2023 at 1:15?PM Satish Balay wrote: > >> Executing: mpicc -show >> stdout: icc -I/opt/apps/cuda/11.4/include -I/opt/apps/cuda/11.4/include >> -lcuda -L/opt/apps/cuda/11.4/lib64/stubs -L/opt/apps/cuda/11.4/lib64 >> -lcudart -lrt -Wl,-rpath,/opt/apps/cuda/11.4/lib64 >> -Wl,-rpath,XORIGIN/placeholder -Wl,--build-id -L/opt/apps/cuda/11.4/lib64/ >> -lm -I/opt/apps/intel19/mvapich2-gdr/2.3.7/include >> -L/opt/apps/intel19/mvapich2-gdr/2.3.7/lib64 -Wl,-rpath >> -Wl,/opt/apps/intel19/mvapich2-gdr/2.3.7/lib64 -Wl,--enable-new-dtags -lmpi >> >> Checking for program /opt/apps/cuda/12.0/bin/nvcc...found >> >> Looks like you are trying to mix in 2 different cuda versions in this >> build. >> >> Perhaps you need to use cuda-11.4 - with this install of mvapich.. >> >> Satish >> >> On Fri, 8 Dec 2023, Matthew Knepley wrote: >> >> > On Fri, Dec 8, 2023 at 1:54?PM Sreeram R Venkat >> wrote: >> > >> > > I am trying to build PETSc with CUDA using the CUDA-Aware >> MVAPICH2-GDR. >> > > >> > > Here is my configure command: >> > > >> > > ./configure PETSC_ARCH=linux-c-debug-mvapich2-gdr --download-hypre >> > > --with-cuda=true --cuda-dir=$TACC_CUDA_DIR --with-hdf5=true >> > > --with-hdf5-dir=$TACC_PHDF5_DIR --download-elemental --download-metis >> > > --download-parmetis --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 >> > > >> > > which errors with: >> > > >> > > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log >> for >> > > details): >> > > >> > > >> --------------------------------------------------------------------------------------------- >> > > CUDA compile failed with arch flags " -ccbin mpic++ -std=c++14 >> > > -Xcompiler -fPIC >> > > -Xcompiler -fvisibility=hidden -g -lineinfo -gencode >> > > arch=compute_80,code=sm_80" >> > > generated from "--with-cuda-arch=80" >> > > >> > > >> > > >> > > The same configure command works when I use the Intel MPI and I can >> build >> > > with CUDA. The full config.log file is attached. Please let me know >> if you >> > > need any other information. I appreciate your help with this. >> > > >> > >> > The proximate error is >> > >> > Executing: nvcc -c -o >> /tmp/petsc-kn3f29gl/config.packages.cuda/conftest.o >> > -I/tmp/petsc-kn3f29gl/config.setCompilers >> > -I/tmp/petsc-kn3f29gl/config.types >> > -I/tmp/petsc-kn3f29gl/config.packages.cuda -ccbin mpic++ -std=c++14 >> > -Xcompiler -fPIC -Xcompiler -fvisibility=hidden -g -lineinfo -gencode >> > arch=compute_80,code=sm_80 /tmp/petsc-kn3f29gl/config.packages.cuda/ >> > conftest.cu >> > stdout: >> > /opt/apps/cuda/11.4/include/crt/sm_80_rt.hpp(141): error: more than one >> > instance of overloaded function "__nv_associate_access_property_impl" >> has >> > "C" linkage >> > 1 error detected in the compilation of >> > "/tmp/petsc-kn3f29gl/config.packages.cuda/conftest.cu". >> > Possible ERROR while running compiler: exit code 1 >> > stderr: >> > /opt/apps/cuda/11.4/include/crt/sm_80_rt.hpp(141): error: more than one >> > instance of overloaded function "__nv_associate_access_property_impl" >> has >> > "C" linkage >> > >> > 1 error detected in the compilation of >> > "/tmp/petsc-kn3f29gl/config.packages.cuda >> > >> > This looks like screwed up headers to me, but I will let someone that >> > understands CUDA compilation reply. >> > >> > Thanks, >> > >> > Matt >> > >> > Thanks, >> > > Sreeram >> > > >> > >> > >> > > > -------------- next part -------------- An HTML attachment was scrubbed... 
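A quick way to separate the two layers is to test the MPI library by itself, outside PETSc: allreduce a buffer that lives in device memory and see whether the result comes back correctly. The following is only a rough smoke-test sketch (untested here, error checking omitted), built with the same mpicc and linked against the CUDA runtime; it is not one of the attachments from this thread.

#include <mpi.h>
#include <cuda_runtime.h>
#include <stdio.h>

int main(int argc, char **argv)
{
  int     rank;
  double  one = 1.0, sum = 0.0;
  double *dsend, *drecv;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  cudaMalloc((void **)&dsend, sizeof(double));
  cudaMalloc((void **)&drecv, sizeof(double));
  cudaMemcpy(dsend, &one, sizeof(double), cudaMemcpyHostToDevice);
  /* Handing device pointers to MPI only works if the MPI is CUDA-aware */
  MPI_Allreduce(dsend, drecv, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
  cudaMemcpy(&sum, drecv, sizeof(double), cudaMemcpyDeviceToHost);
  if (rank == 0) printf("sum over device buffers = %g (expect number of ranks)\n", sum);
  cudaFree(dsend);
  cudaFree(drecv);
  MPI_Finalize();
  return 0;
}

Run it in the same environment used for the PETSc job (mvapich2-gdr module loaded, MV2_USE_CUDA=1). If this crashes or prints the wrong sum, the problem is in the MPI setup itself rather than in PETSc.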
URL: From mfadams at lbl.gov Fri Dec 8 17:30:34 2023 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 8 Dec 2023 18:30:34 -0500 Subject: [petsc-users] Configure error while building PETSc with CUDA/MVAPICH2-GDR In-Reply-To: References: <1da79702-c1eb-0ad8-6efc-64580e02bd07@mcs.anl.gov> Message-ID: You may need to set some env variables. This can be system specific so you might want to look at docs or ask TACC how to run with GPU-aware MPI. Mark On Fri, Dec 8, 2023 at 5:17?PM Sreeram R Venkat wrote: > Actually, when I compile my program with this build of PETSc and run, I > still get the error: > > PETSC ERROR: PETSc is configured with GPU support, but your MPI is not > GPU-aware. For better performance, please use a GPU-aware MPI. > > I have the mvapich2-gdr module loaded and MV2_USE_CUDA=1. > > Is there anything else I need to do? > > Thanks, > Sreeram > > On Fri, Dec 8, 2023 at 3:29?PM Sreeram R Venkat > wrote: > >> Thank you, changing to CUDA 11.4 fixed the issue. The mvapich2-gdr module >> didn't require CUDA 11.4 as a dependency, so I was using 12.0 >> >> On Fri, Dec 8, 2023 at 1:15?PM Satish Balay wrote: >> >>> Executing: mpicc -show >>> stdout: icc -I/opt/apps/cuda/11.4/include -I/opt/apps/cuda/11.4/include >>> -lcuda -L/opt/apps/cuda/11.4/lib64/stubs -L/opt/apps/cuda/11.4/lib64 >>> -lcudart -lrt -Wl,-rpath,/opt/apps/cuda/11.4/lib64 >>> -Wl,-rpath,XORIGIN/placeholder -Wl,--build-id -L/opt/apps/cuda/11.4/lib64/ >>> -lm -I/opt/apps/intel19/mvapich2-gdr/2.3.7/include >>> -L/opt/apps/intel19/mvapich2-gdr/2.3.7/lib64 -Wl,-rpath >>> -Wl,/opt/apps/intel19/mvapich2-gdr/2.3.7/lib64 -Wl,--enable-new-dtags -lmpi >>> >>> Checking for program /opt/apps/cuda/12.0/bin/nvcc...found >>> >>> Looks like you are trying to mix in 2 different cuda versions in this >>> build. >>> >>> Perhaps you need to use cuda-11.4 - with this install of mvapich.. >>> >>> Satish >>> >>> On Fri, 8 Dec 2023, Matthew Knepley wrote: >>> >>> > On Fri, Dec 8, 2023 at 1:54?PM Sreeram R Venkat >>> wrote: >>> > >>> > > I am trying to build PETSc with CUDA using the CUDA-Aware >>> MVAPICH2-GDR. >>> > > >>> > > Here is my configure command: >>> > > >>> > > ./configure PETSC_ARCH=linux-c-debug-mvapich2-gdr --download-hypre >>> > > --with-cuda=true --cuda-dir=$TACC_CUDA_DIR --with-hdf5=true >>> > > --with-hdf5-dir=$TACC_PHDF5_DIR --download-elemental --download-metis >>> > > --download-parmetis --with-cc=mpicc --with-cxx=mpicxx >>> --with-fc=mpif90 >>> > > >>> > > which errors with: >>> > > >>> > > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log >>> for >>> > > details): >>> > > >>> > > >>> --------------------------------------------------------------------------------------------- >>> > > CUDA compile failed with arch flags " -ccbin mpic++ -std=c++14 >>> > > -Xcompiler -fPIC >>> > > -Xcompiler -fvisibility=hidden -g -lineinfo -gencode >>> > > arch=compute_80,code=sm_80" >>> > > generated from "--with-cuda-arch=80" >>> > > >>> > > >>> > > >>> > > The same configure command works when I use the Intel MPI and I can >>> build >>> > > with CUDA. The full config.log file is attached. Please let me know >>> if you >>> > > need any other information. I appreciate your help with this. 
>>> > > >>> > >>> > The proximate error is >>> > >>> > Executing: nvcc -c -o >>> /tmp/petsc-kn3f29gl/config.packages.cuda/conftest.o >>> > -I/tmp/petsc-kn3f29gl/config.setCompilers >>> > -I/tmp/petsc-kn3f29gl/config.types >>> > -I/tmp/petsc-kn3f29gl/config.packages.cuda -ccbin mpic++ -std=c++14 >>> > -Xcompiler -fPIC -Xcompiler -fvisibility=hidden -g -lineinfo -gencode >>> > arch=compute_80,code=sm_80 /tmp/petsc-kn3f29gl/config.packages.cuda/ >>> > conftest.cu >>> > stdout: >>> > /opt/apps/cuda/11.4/include/crt/sm_80_rt.hpp(141): error: more than one >>> > instance of overloaded function "__nv_associate_access_property_impl" >>> has >>> > "C" linkage >>> > 1 error detected in the compilation of >>> > "/tmp/petsc-kn3f29gl/config.packages.cuda/conftest.cu". >>> > Possible ERROR while running compiler: exit code 1 >>> > stderr: >>> > /opt/apps/cuda/11.4/include/crt/sm_80_rt.hpp(141): error: more than one >>> > instance of overloaded function "__nv_associate_access_property_impl" >>> has >>> > "C" linkage >>> > >>> > 1 error detected in the compilation of >>> > "/tmp/petsc-kn3f29gl/config.packages.cuda >>> > >>> > This looks like screwed up headers to me, but I will let someone that >>> > understands CUDA compilation reply. >>> > >>> > Thanks, >>> > >>> > Matt >>> > >>> > Thanks, >>> > > Sreeram >>> > > >>> > >>> > >>> > >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From almaeder at student.ethz.ch Sat Dec 9 05:54:41 2023 From: almaeder at student.ethz.ch (Maeder Alexander) Date: Sat, 9 Dec 2023 11:54:41 +0000 Subject: [petsc-users] PETSc and MPI-3/RMA Message-ID: <8d365fe0be30429db2b7064412e49d2a@student.ethz.ch> I am a new user of PETSc and want to know more about the underlying implementation for matrix-vector multiplication (Ax=y). PETSc utilizes a 1D distribution and communicates only parts of the vector x utilized depending on the sparsity pattern of A. Is the communication of x done with MPI-3 RMA and utilizes cuda-aware mpi for RMA? Best regards, Alexander Maeder -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Sat Dec 9 18:13:38 2023 From: jed at jedbrown.org (Jed Brown) Date: Sat, 09 Dec 2023 17:13:38 -0700 Subject: [petsc-users] PETSc and MPI-3/RMA In-Reply-To: <8d365fe0be30429db2b7064412e49d2a@student.ethz.ch> References: <8d365fe0be30429db2b7064412e49d2a@student.ethz.ch> Message-ID: <871qbuq5hp.fsf@jedbrown.org> It uses nonblocking point-to-point by default since that tends to perform better and is less prone to MPI implementation bugs, but you can select `-sf_type window` to try it, or use other strategies here depending on the sort of problem you're working with. #define PETSCSFBASIC "basic" #define PETSCSFNEIGHBOR "neighbor" #define PETSCSFALLGATHERV "allgatherv" #define PETSCSFALLGATHER "allgather" #define PETSCSFGATHERV "gatherv" #define PETSCSFGATHER "gather" #define PETSCSFALLTOALL "alltoall" #define PETSCSFWINDOW "window" PETSc does try to use GPU-aware MPI, though implementation bugs are present on many machines and it often requires a delicate environment arrangement. "Maeder Alexander" writes: > I am a new user of PETSc > > and want to know more about the underlying implementation for matrix-vector multiplication (Ax=y). > > PETSc utilizes a 1D distribution and communicates only parts of the vector x utilized depending on the sparsity pattern of A. > > Is the communication of x done with MPI-3 RMA and utilizes cuda-aware mpi for RMA? 
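For concreteness, the product itself is just MatMult on a distributed matrix; the halo exchange of x behind it goes through a PetscSF, so the backends listed above can be compared from the command line, for example -sf_type window for the MPI-3 RMA path versus the default point-to-point. Below is a minimal sketch of such an experiment (untested, assuming a standard PETSc build; with a CUDA-enabled build one would additionally pass the usual -vec_type cuda / -mat_type aijcusparse options so that GPU-aware MPI comes into play):

#include <petscmat.h>

int main(int argc, char **argv)
{
  Mat      A;
  Vec      x, y;
  PetscInt i, rstart, rend, N = 1000;

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  PetscCall(MatCreate(PETSC_COMM_WORLD, &A));
  PetscCall(MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, N, N));
  PetscCall(MatSetFromOptions(A));
  PetscCall(MatSetUp(A));
  PetscCall(MatGetOwnershipRange(A, &rstart, &rend));
  for (i = rstart; i < rend; i++) {
    PetscCall(MatSetValue(A, i, i, 2.0, INSERT_VALUES));
    /* off-diagonal entries force communication of off-process parts of x */
    if (i + 1 < N) PetscCall(MatSetValue(A, i, i + 1, -1.0, INSERT_VALUES));
    if (i > 0) PetscCall(MatSetValue(A, i, i - 1, -1.0, INSERT_VALUES));
  }
  PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
  PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));
  PetscCall(MatCreateVecs(A, &x, &y));
  PetscCall(VecSet(x, 1.0));
  /* time this with -log_view while switching -sf_type basic / neighbor / window */
  PetscCall(MatMult(A, x, y));
  PetscCall(VecDestroy(&x));
  PetscCall(VecDestroy(&y));
  PetscCall(MatDestroy(&A));
  PetscCall(PetscFinalize());
  return 0;
}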
> > > Best regards, > > > Alexander Maeder From stephan.koehler at math.tu-freiberg.de Sun Dec 10 01:20:20 2023 From: stephan.koehler at math.tu-freiberg.de (=?UTF-8?Q?Stephan_K=C3=B6hler?=) Date: Sun, 10 Dec 2023 08:20:20 +0100 Subject: [petsc-users] Bug Report TaoALMM class In-Reply-To: References: Message-ID: Dear PETSc/Tao team, this is still an open issue andI haven't heard anything else so far that I'm wrong. Kind regards, Stephan K?hler Am 18.07.23 um 02:21 schrieb Matthew Knepley: > Toby and Hansol, > > Has anyone looked at this? > > Thanks, > > Matt > > On Mon, Jun 12, 2023 at 8:24?AM Stephan K?hler < > stephan.koehler at math.tu-freiberg.de> wrote: > >> Dear PETSc/Tao team, >> >> I think there might be a bug in the Tao ALMM class: In the function >> TaoALMMComputeAugLagAndGradient_Private(), see, eg. >> >> https://petsc.org/release/src/tao/constrained/impls/almm/almm.c.html#TAOALMM >> line 648 the gradient seems to be wrong. >> >> The given function and gradient computation is >> Lc = F + Ye^TCe + Yi^T(Ci - S) + 0.5*mu*[Ce^TCe + (Ci - S)^T(Ci - S)], >> dLc/dX = dF/dX + Ye^TAe + Yi^TAi + 0.5*mu*[Ce^TAe + (Ci - S)^TAi], >> >> but I think the gradient should be (without 0.5) >> >> dLc/dX = dF/dX + Ye^TAe + Yi^TAi + mu*[Ce^TAe + (Ci - S)^TAi]. >> >> Kind regards, >> Stephan K?hler >> >> -- >> Stephan K?hler >> TU Bergakademie Freiberg >> Institut f?r numerische Mathematik und Optimierung >> >> Akademiestra?e 6 >> 09599 Freiberg >> Geb?udeteil Mittelbau, Zimmer 2.07 >> >> Telefon: +49 (0)3731 39-3173 (B?ro) >> >> -- Stephan K?hler TU Bergakademie Freiberg Institut f?r numerische Mathematik und Optimierung Akademiestra?e 6 09599 Freiberg Geb?udeteil Mittelbau, Zimmer 2.07 Telefon: +49 (0)3731 39-3188 (B?ro) -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: OpenPGP_0xC9BF2C20DFE9F713.asc Type: application/pgp-keys Size: 758 bytes Desc: OpenPGP public key URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: OpenPGP_signature.asc Type: application/pgp-signature Size: 236 bytes Desc: OpenPGP digital signature URL: From stephan.koehler at math.tu-freiberg.de Sun Dec 10 01:40:56 2023 From: stephan.koehler at math.tu-freiberg.de (=?UTF-8?Q?Stephan_K=C3=B6hler?=) Date: Sun, 10 Dec 2023 08:40:56 +0100 Subject: [petsc-users] Bug report VecNorm Message-ID: <8dd5bc56-2b25-4ad0-ace7-2752b622b858@math.tu-freiberg.de> Dear PETSc/Tao team, there is a bug in the voector interface:? In the function VecNorm, see, eg. https://petsc.org/release/src/vec/vec/interface/rvector.c.html#VecNorm line 197 the check for consistency in line 214 is done on the wrong communicator.? The communicator should be PETSC_COMM_SELF. Otherwise the program may hang when PetscCheck is executed. Please find a minimal example attached. Kind regards, Stephan K?hler -- Stephan K?hler TU Bergakademie Freiberg Institut f?r numerische Mathematik und Optimierung Akademiestra?e 6 09599 Freiberg Geb?udeteil Mittelbau, Zimmer 2.07 Telefon: +49 (0)3731 39-3188 (B?ro) -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: minimal_ex_vec_norm.cpp Type: text/x-c++src Size: 1792 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: OpenPGP_0xC9BF2C20DFE9F713.asc Type: application/pgp-keys Size: 758 bytes Desc: OpenPGP public key URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: OpenPGP_signature.asc Type: application/pgp-signature Size: 236 bytes Desc: OpenPGP digital signature URL: From bsmith at petsc.dev Sun Dec 10 09:00:10 2023 From: bsmith at petsc.dev (Barry Smith) Date: Sun, 10 Dec 2023 10:00:10 -0500 Subject: [petsc-users] Bug report VecNorm In-Reply-To: <8dd5bc56-2b25-4ad0-ace7-2752b622b858@math.tu-freiberg.de> References: <8dd5bc56-2b25-4ad0-ace7-2752b622b858@math.tu-freiberg.de> Message-ID: <394A4E7D-302C-4B51-931D-DE1CBEBA4A61@petsc.dev> I don't fully understand your code and what it is trying to demonstrate, but VecGetArrayWrite is Logically Collective. Having if(rank == 0) { PetscCall(VecGetArrayWrite(vec, &xx)); PetscCall(VecRestoreArrayWrite(vec, &xx)); } is not allowed. The reason is that VecRestoreArrayWrite() changes the PetscObjectState of the vector, and this state must be changed consistently across all MPI processes that share the vector. > On Dec 10, 2023, at 2:40?AM, Stephan K?hler wrote: > > Dear PETSc/Tao team, > > there is a bug in the voector interface: In the function > VecNorm, see, eg. https://petsc.org/release/src/vec/vec/interface/rvector.c.html#VecNorm line 197 the check for consistency in line 214 is done on the wrong communicator. The communicator should be PETSC_COMM_SELF. > Otherwise the program may hang when PetscCheck is executed. > > Please find a minimal example attached. > > > Kind regards, > Stephan K?hler > -- > Stephan K?hler > TU Bergakademie Freiberg > Institut f?r numerische Mathematik und Optimierung > > Akademiestra?e 6 > 09599 Freiberg > Geb?udeteil Mittelbau, Zimmer 2.07 > > Telefon: +49 (0)3731 39-3188 (B?ro) > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sun Dec 10 11:54:02 2023 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 10 Dec 2023 12:54:02 -0500 Subject: [petsc-users] Bug report VecNorm In-Reply-To: <8dd5bc56-2b25-4ad0-ace7-2752b622b858@math.tu-freiberg.de> References: <8dd5bc56-2b25-4ad0-ace7-2752b622b858@math.tu-freiberg.de> Message-ID: On Sun, Dec 10, 2023 at 2:41?AM Stephan K?hler < stephan.koehler at math.tu-freiberg.de> wrote: > Dear PETSc/Tao team, > > there is a bug in the voector interface: In the function > VecNorm, see, eg. > https://petsc.org/release/src/vec/vec/interface/rvector.c.html#VecNorm > line 197 the check for consistency in line 214 is done on the wrong > communicator. The communicator should be PETSC_COMM_SELF. > Otherwise the program may hang when PetscCheck is executed. > > Please find a minimal example attached. > This is entirely right. I will fix it. Thanks, Matt > > > Kind regards, > Stephan K?hler > > -- > Stephan K?hler > TU Bergakademie Freiberg > Institut f?r numerische Mathematik und Optimierung > > Akademiestra?e 6 > 09599 Freiberg > Geb?udeteil Mittelbau, Zimmer 2.07 > > Telefon: +49 (0)3731 39-3188 (B?ro) > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
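Spelled out, the allowed version of that pattern keeps the Get/Restore pair collective and only restricts which entries are modified; each rank touches at most its own locally owned part. A minimal sketch (untested; it uses VecGetArray rather than VecGetArrayWrite, since ranks other than 0 leave their values alone):

#include <petscvec.h>

int main(int argc, char **argv)
{
  Vec          v;
  PetscScalar *a;
  PetscInt     i, rstart, rend;
  PetscMPIInt  rank;
  PetscReal    nrm;

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  PetscCallMPI(MPI_Comm_rank(PETSC_COMM_WORLD, &rank));
  PetscCall(VecCreate(PETSC_COMM_WORLD, &v));
  PetscCall(VecSetSizes(v, PETSC_DECIDE, 100));
  PetscCall(VecSetFromOptions(v));
  PetscCall(VecSet(v, 0.0));

  /* Logically collective: every rank makes the Get/Restore pair, so the
     PetscObjectState is bumped consistently on all processes, even though
     only rank 0 modifies (its own, locally owned) entries. */
  PetscCall(VecGetOwnershipRange(v, &rstart, &rend));
  PetscCall(VecGetArray(v, &a));
  if (rank == 0) {
    for (i = 0; i < rend - rstart; i++) a[i] = (PetscScalar)(rstart + i);
  }
  PetscCall(VecRestoreArray(v, &a));

  PetscCall(VecNorm(v, NORM_2, &nrm));
  PetscCall(PetscPrintf(PETSC_COMM_WORLD, "||v|| = %g\n", (double)nrm));
  PetscCall(VecDestroy(&v));
  PetscCall(PetscFinalize());
  return 0;
}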
URL: From knepley at gmail.com Sun Dec 10 11:57:28 2023 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 10 Dec 2023 12:57:28 -0500 Subject: [petsc-users] Bug report VecNorm In-Reply-To: References: <8dd5bc56-2b25-4ad0-ace7-2752b622b858@math.tu-freiberg.de> Message-ID: On Sun, Dec 10, 2023 at 12:54?PM Matthew Knepley wrote: > On Sun, Dec 10, 2023 at 2:41?AM Stephan K?hler < > stephan.koehler at math.tu-freiberg.de> wrote: > >> Dear PETSc/Tao team, >> >> there is a bug in the voector interface: In the function >> VecNorm, see, eg. >> https://petsc.org/release/src/vec/vec/interface/rvector.c.html#VecNorm >> line 197 the check for consistency in line 214 is done on the wrong >> communicator. The communicator should be PETSC_COMM_SELF. >> Otherwise the program may hang when PetscCheck is executed. >> >> Please find a minimal example attached. >> > > This is entirely right. I will fix it. > Here is the MR. Thanks, Matt > Thanks, > > Matt > > >> >> >> Kind regards, >> Stephan K?hler >> >> -- >> Stephan K?hler >> TU Bergakademie Freiberg >> Institut f?r numerische Mathematik und Optimierung >> >> Akademiestra?e 6 >> 09599 Freiberg >> Geb?udeteil Mittelbau, Zimmer 2.07 >> >> Telefon: +49 (0)3731 39-3188 (B?ro) >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sun Dec 10 11:57:41 2023 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 10 Dec 2023 12:57:41 -0500 Subject: [petsc-users] Bug report VecNorm In-Reply-To: References: <8dd5bc56-2b25-4ad0-ace7-2752b622b858@math.tu-freiberg.de> Message-ID: On Sun, Dec 10, 2023 at 12:57?PM Matthew Knepley wrote: > On Sun, Dec 10, 2023 at 12:54?PM Matthew Knepley > wrote: > >> On Sun, Dec 10, 2023 at 2:41?AM Stephan K?hler < >> stephan.koehler at math.tu-freiberg.de> wrote: >> >>> Dear PETSc/Tao team, >>> >>> there is a bug in the voector interface: In the function >>> VecNorm, see, eg. >>> https://petsc.org/release/src/vec/vec/interface/rvector.c.html#VecNorm >>> line 197 the check for consistency in line 214 is done on the wrong >>> communicator. The communicator should be PETSC_COMM_SELF. >>> Otherwise the program may hang when PetscCheck is executed. >>> >>> Please find a minimal example attached. >>> >> >> This is entirely right. I will fix it. >> > > Here is the MR. > https://gitlab.com/petsc/petsc/-/merge_requests/7102 Thanks, Matt > Thanks, > > Matt > > >> Thanks, >> >> Matt >> >> >>> >>> >>> Kind regards, >>> Stephan K?hler >>> >>> -- >>> Stephan K?hler >>> TU Bergakademie Freiberg >>> Institut f?r numerische Mathematik und Optimierung >>> >>> Akademiestra?e 6 >>> 09599 Freiberg >>> Geb?udeteil Mittelbau, Zimmer 2.07 >>> >>> Telefon: +49 (0)3731 39-3188 (B?ro) >>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. 
>> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre at joliv.et Sun Dec 10 12:47:43 2023 From: pierre at joliv.et (Pierre Jolivet) Date: Sun, 10 Dec 2023 19:47:43 +0100 Subject: [petsc-users] Bug report VecNorm In-Reply-To: <8dd5bc56-2b25-4ad0-ace7-2752b622b858@math.tu-freiberg.de> References: <8dd5bc56-2b25-4ad0-ace7-2752b622b858@math.tu-freiberg.de> Message-ID: > On 10 Dec 2023, at 8:40?AM, Stephan K?hler wrote: > > Dear PETSc/Tao team, > > there is a bug in the voector interface: In the function > VecNorm, see, eg. https://petsc.org/release/src/vec/vec/interface/rvector.c.html#VecNorm line 197 the check for consistency in line 214 is done on the wrong communicator. The communicator should be PETSC_COMM_SELF. > Otherwise the program may hang when PetscCheck is executed. I think the communicator should not be changed, but instead, the check/conditional should be changed, ? la PetscValidLogicalCollectiveBool(). Thanks, Pierre > Please find a minimal example attached. > > > Kind regards, > Stephan K?hler > -- > Stephan K?hler > TU Bergakademie Freiberg > Institut f?r numerische Mathematik und Optimierung > > Akademiestra?e 6 > 09599 Freiberg > Geb?udeteil Mittelbau, Zimmer 2.07 > > Telefon: +49 (0)3731 39-3188 (B?ro) > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Sun Dec 10 19:57:21 2023 From: jed at jedbrown.org (Jed Brown) Date: Sun, 10 Dec 2023 18:57:21 -0700 Subject: [petsc-users] Bug report VecNorm In-Reply-To: References: <8dd5bc56-2b25-4ad0-ace7-2752b622b858@math.tu-freiberg.de> Message-ID: <87v895o60u.fsf@jedbrown.org> Pierre Jolivet writes: >> On 10 Dec 2023, at 8:40?AM, Stephan K?hler wrote: >> >> Dear PETSc/Tao team, >> >> there is a bug in the voector interface: In the function >> VecNorm, see, eg. https://petsc.org/release/src/vec/vec/interface/rvector.c.html#VecNorm line 197 the check for consistency in line 214 is done on the wrong communicator. The communicator should be PETSC_COMM_SELF. >> Otherwise the program may hang when PetscCheck is executed. > > I think the communicator should not be changed, but instead, the check/conditional should be changed, ? la PetscValidLogicalCollectiveBool(). I agree -- it's no extra cost to discover collectively whether all, none, or some have the norm. In this case, it could be a MPI_SUM, in which case the error message could report how many processes took each path. From yc17470 at connect.um.edu.mo Mon Dec 11 03:32:23 2023 From: yc17470 at connect.um.edu.mo (Gong Yujie) Date: Mon, 11 Dec 2023 09:32:23 +0000 Subject: [petsc-users] Question on output vector in vtk file Message-ID: Dear PETSc developers, I have a DMPlex DM with 1 field 1dof. I'd like to output a vector with block size equals to 3. It seems that there is no response using command line option or using some code about PetscViewer. The DM is generated with (note that I'm not using PetscFE for discretization, just for allocate dof.) 
PetscCall(DMPlexCreateExodusFromFile(PETSC_COMM_WORLD,"tube.exo",interpolate,&dm)); PetscCall(PetscFECreateLagrange(PETSC_COMM_SELF,dim,1,PETSC_TRUE,1,PETSC_DETERMINE,&fe)); PetscCall(PetscObjectSetName((PetscObject)fe,"potential_field")); PetscCall(DMSetField(dm,0,NULL,(PetscObject)fe)); PetscCall(DMPlexDistribute(dm,0,&sf,&dmParallel)); The Vector is created using PetscCall(DMCreateGlobalVector(dm,&phi_1)); PetscCall(VecSetLocalToGlobalMapping(phi_1,Itog)); PetscCall(VecGetLocalSize(phi_1,&vec_local_size_test)); PetscCall(VecCreateMPI(PETSC_COMM_WORLD, vec_local_size_test*3, PETSC_DETERMINE, &u_grad_psi)); PetscCall(VecSetBlockSize(u_grad_psi, 3)); PetscCall(VecSetLocalToGlobalMapping(u_grad_psi,Itog)); The output command line option is just --vec_view vtk:test.vtk. The PETSc version I'm using is 3.19.5. Could you please give me some advice? Best Regards, Yujie -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Dec 11 07:03:51 2023 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 11 Dec 2023 08:03:51 -0500 Subject: [petsc-users] Bug report VecNorm In-Reply-To: <394A4E7D-302C-4B51-931D-DE1CBEBA4A61@petsc.dev> References: <8dd5bc56-2b25-4ad0-ace7-2752b622b858@math.tu-freiberg.de> <394A4E7D-302C-4B51-931D-DE1CBEBA4A61@petsc.dev> Message-ID: We already merged the fix. Thanks, Matt On Mon, Dec 11, 2023 at 6:00?AM Barry Smith wrote: > > I don't fully understand your code and what it is trying to > demonstrate, but VecGetArrayWrite is Logically Collective. Having > > if(rank == 0) > { > PetscCall(VecGetArrayWrite(vec, &xx)); > PetscCall(VecRestoreArrayWrite(vec, &xx)); > } > > is not allowed. The reason is that VecRestoreArrayWrite() changes the > PetscObjectState of the vector, and this state must be changed consistently > across all MPI processes that share the vector. > > > > On Dec 10, 2023, at 2:40?AM, Stephan K?hler < > stephan.koehler at math.tu-freiberg.de> wrote: > > Dear PETSc/Tao team, > > there is a bug in the voector interface: In the function > VecNorm, see, eg. > https://petsc.org/release/src/vec/vec/interface/rvector.c.html#VecNorm > line 197 the check for consistency in line 214 is done on the wrong > communicator. The communicator should be PETSC_COMM_SELF. > Otherwise the program may hang when PetscCheck is executed. > > Please find a minimal example attached. > > > Kind regards, > Stephan K?hler > > -- > Stephan K?hler > TU Bergakademie Freiberg > Institut f?r numerische Mathematik und Optimierung > > Akademiestra?e 6 > 09599 Freiberg > Geb?udeteil Mittelbau, Zimmer 2.07 > > Telefon: +49 (0)3731 39-3188 (B?ro) > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Dec 11 07:07:21 2023 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 11 Dec 2023 08:07:21 -0500 Subject: [petsc-users] Question on output vector in vtk file In-Reply-To: References: Message-ID: On Mon, Dec 11, 2023 at 4:32?AM Gong Yujie wrote: > Dear PETSc developers, > > I have a DMPlex DM with 1 field 1dof. I'd like to output a vector with > block size equals to 3. It seems that there is no response using command > line option or using some code about PetscViewer. > I am not sure how we can do this. 
If you only have 1 dof per cell (I assume), how can we have a blocksize of 3? Thanks, Matt > The DM is generated with (note that I'm not using PetscFE for > discretization, just for allocate dof.) > > *PetscCall(DMPlexCreateExodusFromFile(PETSC_COMM_WORLD,"tube.exo",interpolate,&dm));* > > *PetscCall(PetscFECreateLagrange(PETSC_COMM_SELF,dim,1,PETSC_TRUE,1,PETSC_DETERMINE,&fe));* > *PetscCall(PetscObjectSetName((PetscObject)fe,"potential_field"));* > *PetscCall(DMSetField(dm,0,NULL,(PetscObject)fe));* > *PetscCall(DMPlexDistribute(dm,0,&sf,&dmParallel));* > > The Vector is created using > *PetscCall(DMCreateGlobalVector(dm,&phi_1));* > *PetscCall(VecSetLocalToGlobalMapping(phi_1,Itog));* > *PetscCall(VecGetLocalSize(phi_1,&vec_local_size_test));* > *PetscCall(VecCreateMPI(PETSC_COMM_WORLD, vec_local_size_test*3, > PETSC_DETERMINE, &u_grad_psi));* > *PetscCall(VecSetBlockSize(u_grad_psi, 3));* > *PetscCall(VecSetLocalToGlobalMapping(u_grad_psi,Itog));* > > The output command line option is just --vec_view vtk:test.vtk. The PETSc > version I'm using is 3.19.5. > > Could you please give me some advice? > > Best Regards, > Yujie > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From 1807580692 at qq.com Mon Dec 11 01:51:54 2023 From: 1807580692 at qq.com (=?gb18030?B?MTgwNzU4MDY5Mg==?=) Date: Mon, 11 Dec 2023 15:51:54 +0800 Subject: [petsc-users] (no subject) Message-ID: Hello, I have encountered some problems. Here are some of my configurations. OS Version and Type:  Linux daihuanhe-Aspire-A315-55G 5.15.0-89-generic #99~20.04.1-Ubuntu SMP Thu Nov 2 15:16:47 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux PETSc Version: #define PETSC_VERSION_RELEASE    1 #define PETSC_VERSION_MAJOR      3 #define PETSC_VERSION_MINOR      19 #define PETSC_VERSION_SUBMINOR   0 #define PETSC_RELEASE_DATE       "Mar 30, 2023" #define PETSC_VERSION_DATE       "unknown" MPI implementation: MPICH Compiler and version: Gnu C The problem is when I type  ?mpiexec -n 4 ./ex19 -lidvelocity 100 -prandtl 0.72 -grashof 10000 -da_grid_x 64 -da_grid_y 64 -snes_type newtonls -ksp_type gmres -pc_type fieldsplit -pc_fieldsplit_type symmetric_multiplicative -pc_fieldsplit_block_size 4 -pc_fieldsplit_0_fields 0,1,2,3 -pc_fieldsplit_1_fields 0,1,2,3 -fieldsplit_0_pc_type asm -fieldsplit_0_pc_asm_type restrict -fieldsplit_0_pc_asm_overlap 5 -fieldsplit_0_sub_pc_type lu -fieldsplit_1_pc_type asm -fieldsplit_1_pc_asm_type restrict -fieldsplit_1_pc_asm_overlap 5 -fieldsplit_1_sub_pc_type lu  -snes_monitor -snes_converged_reason -fieldsplit_0_ksp_atol 1e-10  -fieldsplit_1_ksp_atol 1e-10  -fieldsplit_0_ksp_rtol 1e-6  -fieldsplit_1_ksp_rtol 1e-6 -fieldsplit_0_snes_atol 1e-10  -fieldsplit_1_snes_atol 1e-10  -fieldsplit_0_snes_rtol 1e-6  -fieldsplit_1_snes_rtol 1e-6? in the command line, where my path is /petsc/src/snes/tutorials. It returns  ?WARNING! There are options you set that were not used! WARNING! could be spelling mistake, etc! There are 8 unused database options. 
They are: Option left: name:-fieldsplit_0_ksp_atol value: 1e-10 source: command line Option left: name:-fieldsplit_0_ksp_rtol value: 1e-6 source: command line Option left: name:-fieldsplit_0_snes_atol value: 1e-10 source: command line Option left: name:-fieldsplit_0_snes_rtol value: 1e-6 source: command line Option left: name:-fieldsplit_1_ksp_atol value: 1e-10 source: command line Option left: name:-fieldsplit_1_ksp_rtol value: 1e-6 source: command line Option left: name:-fieldsplit_1_snes_atol value: 1e-10 source: command line Option left: name:-fieldsplit_1_snes_rtol value: 1e-6 source: command line?. Please tell me what should I do?Thank you very much. 1807580692 1807580692 at qq.com   -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Mon Dec 11 11:00:23 2023 From: bsmith at petsc.dev (Barry Smith) Date: Mon, 11 Dec 2023 12:00:23 -0500 Subject: [petsc-users] (no subject) In-Reply-To: References: Message-ID: The snes options are not relevant since the parts of a PCFIELDSPLIT are always linear problems. By default PCFIELDSPLIT uses a KSP type of preonly on each split (that is it applies the preconditioner exactly once inside the PCApply_FieldSplit() hence the -fieldsplit_*_ksp_ options are not relevent. You can use -fieldsplit_ksp_type gmres for example to have it use gmres on each of the splits, but note that then you should use -ksp_type fgmres since using gmres inside a preconditioner results in a nonlinear preconditioner. You can always run with -ksp_view to see the solver being used and the prefixes that currently make sense. Barry > On Dec 11, 2023, at 2:51?AM, 1807580692 <1807580692 at qq.com> wrote: > > Hello, I have encountered some problems. Here are some of my configurations. > OS Version and Type: Linux daihuanhe-Aspire-A315-55G 5.15.0-89-generic #99~20.04.1-Ubuntu SMP Thu Nov 2 15:16:47 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux > PETSc Version: #define PETSC_VERSION_RELEASE 1 > #define PETSC_VERSION_MAJOR 3 > #define PETSC_VERSION_MINOR 19 > #define PETSC_VERSION_SUBMINOR 0 > #define PETSC_RELEASE_DATE "Mar 30, 2023" > #define PETSC_VERSION_DATE "unknown" > MPI implementation: MPICH > Compiler and version: Gnu C > The problem is when I type > ?mpiexec -n 4 ./ex19 -lidvelocity 100 -prandtl 0.72 -grashof 10000 -da_grid_x 64 -da_grid_y 64 -snes_type newtonls -ksp_type gmres -pc_type fieldsplit -pc_fieldsplit_type symmetric_multiplicative -pc_fieldsplit_block_size 4 -pc_fieldsplit_0_fields 0,1,2,3 -pc_fieldsplit_1_fields 0,1,2,3 -fieldsplit_0_pc_type asm -fieldsplit_0_pc_asm_type restrict -fieldsplit_0_pc_asm_overlap 5 -fieldsplit_0_sub_pc_type lu -fieldsplit_1_pc_type asm -fieldsplit_1_pc_asm_type restrict -fieldsplit_1_pc_asm_overlap 5 -fieldsplit_1_sub_pc_type lu -snes_monitor -snes_converged_reason -fieldsplit_0_ksp_atol 1e-10 -fieldsplit_1_ksp_atol 1e-10 -fieldsplit_0_ksp_rtol 1e-6 -fieldsplit_1_ksp_rtol 1e-6 -fieldsplit_0_snes_atol 1e-10 -fieldsplit_1_snes_atol 1e-10 -fieldsplit_0_snes_rtol 1e-6 -fieldsplit_1_snes_rtol 1e-6? > in the command line, where my path is /petsc/src/snes/tutorials. > > It returns > ?WARNING! There are options you set that were not used! > WARNING! could be spelling mistake, etc! > There are 8 unused database options. 
They are: > Option left: name:-fieldsplit_0_ksp_atol value: 1e-10 source: command line > Option left: name:-fieldsplit_0_ksp_rtol value: 1e-6 source: command line > Option left: name:-fieldsplit_0_snes_atol value: 1e-10 source: command line > Option left: name:-fieldsplit_0_snes_rtol value: 1e-6 source: command line > Option left: name:-fieldsplit_1_ksp_atol value: 1e-10 source: command line > Option left: name:-fieldsplit_1_ksp_rtol value: 1e-6 source: command line > Option left: name:-fieldsplit_1_snes_atol value: 1e-10 source: command line > Option left: name:-fieldsplit_1_snes_rtol value: 1e-6 source: command line?. > Please tell me what should I do?Thank you very much. > > 1807580692 > 1807580692 at qq.com > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Bruce.Palmer at pnnl.gov Tue Dec 12 10:27:42 2023 From: Bruce.Palmer at pnnl.gov (Palmer, Bruce J) Date: Tue, 12 Dec 2023 16:27:42 +0000 Subject: [petsc-users] Fortran Interface Message-ID: Does documentation for the PETSc fortran interface still exist? I looked at the web pages for 3.20 (petsc.org/release) but if you go under the tab C/Fortran API, only descriptions for the C interface are there. Bruce Palmer -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Dec 12 10:31:55 2023 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 12 Dec 2023 11:31:55 -0500 Subject: [petsc-users] Fortran Interface In-Reply-To: References: Message-ID: On Tue, Dec 12, 2023 at 11:27?AM Palmer, Bruce J via petsc-users < petsc-users at mcs.anl.gov> wrote: > Does documentation for the PETSc fortran interface still exist? I looked > at the web pages for 3.20 (petsc.org/release) but if you go under the tab > C/Fortran API, only descriptions for the C interface are there. > I think after the most recent changes, the interface was supposed to be very close to C, so we just document the differences on specific pages, and put the general stuff here: https://petsc.org/release/manual/fortran/ Thanks, Matt > Bruce Palmer > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From Bruce.Palmer at pnnl.gov Tue Dec 12 10:40:12 2023 From: Bruce.Palmer at pnnl.gov (Palmer, Bruce J) Date: Tue, 12 Dec 2023 16:40:12 +0000 Subject: [petsc-users] Fortran Interface In-Reply-To: References: Message-ID: Thanks! It might be useful if there were a link to this page near the top of the C/Fortran API page. Bruce From: Matthew Knepley Date: Tuesday, December 12, 2023 at 8:33 AM To: Palmer, Bruce J Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Fortran Interface Check twice before you click! This email originated from outside PNNL. On Tue, Dec 12, 2023 at 11:27?AM Palmer, Bruce J via petsc-users > wrote: Does documentation for the PETSc fortran interface still exist? I looked at the web pages for 3.20 (petsc.org/release) but if you go under the tab C/Fortran API, only descriptions for the C interface are there. 
I think after the most recent changes, the interface was supposed to be very close to C, so we just document the differences on specific pages, and put the general stuff here: https://petsc.org/release/manual/fortran/ Thanks, Matt Bruce Palmer -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Tue Dec 12 11:07:32 2023 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 12 Dec 2023 12:07:32 -0500 Subject: [petsc-users] Fortran Interface In-Reply-To: References: Message-ID: It is unlikely we will ever be able to maintain full manual pages for Fortran for all routines. But yes, the current pages are C-centric. Do you have any suggestions on what we could add to the current manual pages or how to format them etc that would make them better for Fortran users who are not used to C? A Fortran synopsis as well as the C one, or a single synopsis that is easier for both Fortran and C users to follow? Barry I am not sure it is trivial to automatically generate the Fortran synposis with appropriate use and include information but one could argue that we should. > On Dec 12, 2023, at 11:40?AM, Palmer, Bruce J via petsc-users wrote: > > Thanks! It might be useful if there were a link to this page near the top of the C/Fortran API page. > > Bruce > > From: Matthew Knepley > > Date: Tuesday, December 12, 2023 at 8:33 AM > To: Palmer, Bruce J > > Cc: petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] Fortran Interface > > Check twice before you click! This email originated from outside PNNL. > > On Tue, Dec 12, 2023 at 11:27?AM Palmer, Bruce J via petsc-users > wrote: > Does documentation for the PETSc fortran interface still exist? I looked at the web pages for 3.20 (petsc.org/release ) but if you go under the tab C/Fortran API, only descriptions for the C interface are there. > > I think after the most recent changes, the interface was supposed to be very close to C, so we just document the differences on specific pages, and put the general stuff here: > > https://petsc.org/release/manual/fortran/ > > Thanks, > > Matt > > Bruce Palmer > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From s_g at berkeley.edu Tue Dec 12 11:17:14 2023 From: s_g at berkeley.edu (Sanjay Govindjee) Date: Tue, 12 Dec 2023 09:17:14 -0800 Subject: [petsc-users] Fortran Interface In-Reply-To: References: Message-ID: <0085ad64-2045-44ad-a686-86b5ae5c88a9@berkeley.edu> I agree with Bruce that having a link to https://petsc.org/release/manual/fortran/ at the top of the C/Fortran API page (https://petsc.org/release/manualpages/) would be helpful.?? The C descriptions themselves are 98% of the way there for Fortran users (like myself).? The only time that more information would be help on the manual pages themselves is when there is a strong variance between the C and Fortran usage, but that can not be easily automated. -sanjay On 12/12/23 9:07 AM, Barry Smith wrote: > > ? It is unlikely we will ever be able to maintain full manual pages > for Fortran for all routines. 
But yes, the current pages are C-centric. > > ? Do you have any suggestions on what we could add to the current > manual pages or how to format them etc that would make them better for > Fortran users who are not used to C? ?A Fortran synopsis as well as > the C one, or a single synopsis that is easier for both Fortran and C > users to follow? > > ? Barry > > I am not sure it is trivial to automatically generate the Fortran > synposis with appropriate use and include information but one could > argue that we should. > > > >> On Dec 12, 2023, at 11:40?AM, Palmer, Bruce J via petsc-users >> wrote: >> >> Thanks! It might be useful if there were a link to this page near the >> top of the C/Fortran API page. >> Bruce >> >> *From:*Matthew Knepley >> *Date:*Tuesday, December 12, 2023 at 8:33 AM >> *To:*Palmer, Bruce J >> *Cc:*petsc-users at mcs.anl.gov >> *Subject:*Re: [petsc-users] Fortran Interface >> >> Check twice before you click! This email originated from outside PNNL. >> On Tue, Dec 12, 2023 at 11:27?AM Palmer, Bruce J via petsc-users >> wrote: >> >> Does documentation for the PETSc fortran interface still exist? I >> looked at the web pages for 3.20 (petsc.org/release >> ) but if you go under the tab C/Fortran >> API, only descriptions for the C interface are there. >> >> I think after the most recent changes, the interface was supposed to >> be very close to C, so we just document the differences on specific >> pages, and put the general stuff here: >> https://petsc.org/release/manual/fortran/ >> ? ?Thanks, >> ? ? ?Matt >> >> Bruce Palmer >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which >> their experiments lead. >> -- Norbert Wiener >> https://www.cse.buffalo.edu/~knepley/ >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From onur.notonur at proton.me Tue Dec 12 11:21:31 2023 From: onur.notonur at proton.me (onur.notonur) Date: Tue, 12 Dec 2023 17:21:31 +0000 Subject: [petsc-users] DMPlex "Could not find orientation for quadrilateral" Message-ID: Hi, I hope this email finds you well. I am currently working on importing an OpenFOAM PolyMesh into DMPlex, and I've encountered an issue. The PolyMesh format includes face owner cells/neighbor cells and face-to-vertex connectivity. I was using the "DMPlexCreateFromCellListPetsc()" function, which required cell-to-vertex connectivity. However, when attempting to create the cell connectivity using an edge loop [p_0, p_1, ..., p_7] (p_n and p_(n+1) are valid edges in my mesh), I encountered an error stating, "Could not find orientation for quadrilateral." (Actually at first, I generated the connectivity list by simply creating a cell-to-face list and then using that to create a cell-to-vertex list. (just map over the list and remove duplicates) This created a DMPlex successfully, however, resulted in a mesh that was incorrect when looking with ParaView. I think that was because of I stated wrong edge loop to create cells) I understand that I may need to follow a different format for connectivity, but I'm not sure what that format is. My current mesh is hexahedral, consisting of 8 corner elements(if important). I would appreciate any guidance on a general approach to address this issue. Thank you for your time and assistance. Best, Onur Sent with Proton Mail secure email. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bldenton at buffalo.edu Tue Dec 12 11:22:37 2023 From: bldenton at buffalo.edu (Brandon Denton) Date: Tue, 12 Dec 2023 17:22:37 +0000 Subject: [petsc-users] Applying Natural Boundary Conditions using PETSc FEM Technology Message-ID: Good Afternoon, I am currently working on an Inviscid Navier-Stokes problem and would like to apply DM_BC_NATURAL boundary conditions to my domain. Looking through the example files on petsc.org, I noticed that in almost all cases there are the following series of calls. PetscCall(DMAddBoundary(dm, DM_BC_NATURAL, "wall", label, 1, &id, 0, 0, NULL, NULL, NULL, user, &bd)); PetscCall(PetscDSGetBoundary(ds, bd, &wf, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL)); PetscCall(PetscWeakFormSetIndexBdResidual(wf, label, id, 0, 0, 0, f0_bd_u, 0, NULL)); Is this the standard way of applying Natural boundary conditions in PETSc for FEM? Also, I noticed in the signature for the f0_bd_u function, there is a const PetscReal n[] array. What is this array and what information does it hold. Is it the normal vector at the point? static void f0_bd_u(PetscInt dim, PetscInt Nf, PetscInt NfAux, const PetscInt uOff[], const PetscInt uOff_x[], const PetscScalar u[], const PetscScalar u_t[], const PetscScalar u_x[], const PetscInt aOff[], const PetscInt aOff_x[], const PetscScalar a[], const PetscScalar a_t[], const PetscScalar a_x[], PetscReal t, const PetscReal x[], const PetscReal n[], PetscInt numConstants, const PetscScalar constants[], PetscScalar f0[]) Thank you in advance for your time. Brandon -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmolinos at us.es Tue Dec 12 13:36:01 2023 From: mmolinos at us.es (MIGUEL MOLINOS PEREZ) Date: Tue, 12 Dec 2023 19:36:01 +0000 Subject: [petsc-users] Domain decomposition in PETSc for Molecular Dynamics In-Reply-To: <2BEA961D-00D5-4880-A162-7262E398C048@us.es> References: <2BEA961D-00D5-4880-A162-7262E398C048@us.es> Message-ID: Dear Matthew and Mark, Thank you for four useful guidance. I have taken as a starting point the example in "dm/tutorials/swarm_ex3.c" to build a first approximation for domain decomposition in my molecular dynamics code (diffusive molecular dynamic to be more precise :-) ). And I must say that I am very happy with the result. However, in my journey integrating domain decomposition into my code, I am facing some new (and expected) problems. The first is in the implementation of the nearest neighbor algorithm (list of atoms closest to each atom). My current approach to the problem is a brute force algorithm (double loop through the list of atoms and calculate the distance). However, it seems that if I call the "neighbours" function after the "DMSwarmMigrate" function the search algorithm does not work correctly. My thoughts / hints are: * The two nested for loops should be done on the global indexing of the atoms instead of the local one (I don't know how to get this number). * If I print the mean position of atom #0 (for example) each range prints a different value of the average position. One of them is the correct position corresponding to site #0, the others are different (but identically labeled) atomic sites. Which means that the site_i index is not bijective. I believe that solving this problem will increase my understanding of the domain decomposition approach and may allow me to fix the remaining parts of my code. Any additional comments are greatly appreciated. 
For instance, I will be happy to be pointed to any piece of code (petsc examples for example) with solves a similar problem in order to self-learn learn by example. Many thanks in advance. Best, Miguel This is the piece of code (simplified) which computes the list of neighbours for each atomic site. DMD is a structure which contains the atomistic information (DMSWARM), and the background mesh and bounding cell (DMDA and DMShell) int neighbours(DMD* Simulation) { PetscFunctionBegin; PetscCallMPI(MPI_Comm_rank(PETSC_COMM_WORLD, &rank)); PetscCall(DMSwarmGetLocalSize(Simulation->atomistic_data, &n_atoms_local)); //! Get array with the mean position of the atoms DMSwarmGetField(Simulation->atomistic_data, DMSwarmPICField_coor, &blocksize, NULL, (void**)&mean_q_ptr); Eigen::Map mean_q(mean_q_ptr, n_atoms_local, dim); int* neigh = Simulation->neigh; int* numneigh = Simulation->numneigh; for (unsigned int site_i = 0; site_i < n_atoms_local; site_i++) { //! Get mean position of site i Eigen::Vector3d mean_q_i = mean_q.block<1, 3>(site_i, 0); //! Search neighbourhs in the main cell (+ periodic cells) for (unsigned site_j = 0; site_j < n_atoms_local; site_j++) { if (site_i != site_j) { //! Get mean position of site j in the periodic box Eigen::Vector3d mean_q_j = mean_q.block<1, 3>(site_j, 0); //! Check is site j is the neibourhood of the site i double norm_r_ij = (mean_q_i - mean_q_j).norm(); if ((norm_r_ij <= r_cutoff_ADP) && (numneigh[site_i] < maxneigh)) { neigh[site_i * maxneigh + numneigh[site_i]] = site_j; numneigh[site_i] += 1; } } } } // MPI for loop (site_i) DMSwarmRestoreField(Simulation->atomistic_data, DMSwarmPICField_coor, &blocksize, NULL, (void**)&mean_q_ptr); return EXIT_SUCCESS; } This is the piece of code that I use to read the atomic positions (mean_q) from a file: //! @brief mean_q: Mean value of each atomic position double* mean_q; PetscCall(DMSwarmGetField(atomistic_data, DMSwarmPICField_coor, &blocksize, NULL, (void**)&mean_q)); cnt = 0; for (PetscInt site_i = 0; site_i < n_atoms_local; site_i++) { if (cnt < n_atoms) { mean_q[blocksize * cnt + 0] = Simulation_file.mean_q[cnt * dim + 0]; mean_q[blocksize * cnt + 1] = Simulation_file.mean_q[cnt * dim + 1]; mean_q[blocksize * cnt + 2] = Simulation_file.mean_q[cnt * dim + 2]; cnt++; } } PetscCall(DMSwarmRestoreField(atomistic_data, DMSwarmPICField_coor, &blocksize, NULL, (void**)&mean_q)); [Screenshot 2023-12-12 at 19.42.13.png] On 4 Nov 2023, at 15:50, MIGUEL MOLINOS PEREZ wrote: ?Thank you Mark! I will have a look to it. Best, Miguel On 4 Nov 2023, at 13:54, Matthew Knepley wrote: ? On Sat, Nov 4, 2023 at 8:40?AM Mark Adams > wrote: Hi MIGUEL, This might be a good place to start: https://petsc.org/main/manual/vec/ Feel free to ask more specific questions, but the docs are a good place to start. Thanks, Mark On Fri, Nov 3, 2023 at 5:19?AM MIGUEL MOLINOS PEREZ > wrote: Dear all, I am currently working on the development of a in-house molecular dynamics code using PETSc and C++. So far the code works great, however it is a little bit slow since I am not exploiting MPI for PETSc vectors. I was wondering if there is a way to perform the domain decomposition efficiently using some PETSc functionality. Any feedback is highly appreciated. It sounds like you mean "is there a way to specify a communication construct that can send my particle information automatically". We use PetscSF for that. You can see how this works with the DMSwarm class, which represents a particle discretization. 
You can either use that, or if it does not work for you, do the same things with your class. Thanks, Matt Best regards, Miguel -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Screenshot 2023-12-12 at 19.42.13.png Type: image/png Size: 1252024 bytes Desc: Screenshot 2023-12-12 at 19.42.13.png URL: From Bruce.Palmer at pnnl.gov Tue Dec 12 13:54:57 2023 From: Bruce.Palmer at pnnl.gov (Palmer, Bruce J) Date: Tue, 12 Dec 2023 19:54:57 +0000 Subject: [petsc-users] Fortran Interface In-Reply-To: References: Message-ID: I think having a link to your fortran interface page on the C/Fortran API tab is probably sufficient, particularly if the interfaces are similar. If functions have significant differences between C and Fortran, it would be helpful if the notes about it are on the page describing the function. I?m the project lead for Global Arrays and we wrote our API documentation in LaTeX. Each function has C and Fortran-specific documentation as well as some generic documentation that can apply to either interface. We run the tex files through a preprocessor that filters out just the C or Fortran-specific text to build the documentation for the C or Fortran API. It sorta works, but it is a fair amount of effort to keep everything synched up and we have a lot fewer functions in our API than you do. The one advantage is that everything about a particular function is located in one spot, so it makes it relatively easy to fix everything up if you make changes. Bruce From: Barry Smith Date: Tuesday, December 12, 2023 at 9:07 AM To: Palmer, Bruce J Cc: Matthew Knepley , petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Fortran Interface It is unlikely we will ever be able to maintain full manual pages for Fortran for all routines. But yes, the current pages are C-centric. Do you have any suggestions on what we could add to the current manual pages or how to format them etc that would make them better for Fortran users who are not used to C? A Fortran synopsis as well as the C one, or a single synopsis that is easier for both Fortran and C users to follow? Barry I am not sure it is trivial to automatically generate the Fortran synposis with appropriate use and include information but one could argue that we should. On Dec 12, 2023, at 11:40?AM, Palmer, Bruce J via petsc-users wrote: Thanks! It might be useful if there were a link to this page near the top of the C/Fortran API page. Bruce From: Matthew Knepley > Date: Tuesday, December 12, 2023 at 8:33 AM To: Palmer, Bruce J > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Fortran Interface Check twice before you click! This email originated from outside PNNL. On Tue, Dec 12, 2023 at 11:27?AM Palmer, Bruce J via petsc-users > wrote: Does documentation for the PETSc fortran interface still exist? I looked at the web pages for 3.20 (petsc.org/release) but if you go under the tab C/Fortran API, only descriptions for the C interface are there. 
I think after the most recent changes, the interface was supposed to be very close to C, so we just document the differences on specific pages, and put the general stuff here: https://petsc.org/release/manual/fortran/ Thanks, Matt Bruce Palmer -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From Bruce.Palmer at pnnl.gov Tue Dec 12 14:22:59 2023 From: Bruce.Palmer at pnnl.gov (Palmer, Bruce J) Date: Tue, 12 Dec 2023 20:22:59 +0000 Subject: [petsc-users] Fortran Interface In-Reply-To: References: Message-ID: What do you do with something like a void pointer? I?m looking at the TaoSetObjectiveAndGradient function and it wants to pass a void *ctx pointer. You can set this to null, but apparently you have to specify the type. What type should I use? Is there something called PETSC_NULL_VOID or PETSC_NULL_CONTEXT or do I use something else? From: Matthew Knepley Date: Tuesday, December 12, 2023 at 8:33 AM To: Palmer, Bruce J Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Fortran Interface Check twice before you click! This email originated from outside PNNL. On Tue, Dec 12, 2023 at 11:27?AM Palmer, Bruce J via petsc-users > wrote: Does documentation for the PETSc fortran interface still exist? I looked at the web pages for 3.20 (petsc.org/release) but if you go under the tab C/Fortran API, only descriptions for the C interface are there. I think after the most recent changes, the interface was supposed to be very close to C, so we just document the differences on specific pages, and put the general stuff here: https://petsc.org/release/manual/fortran/ Thanks, Matt Bruce Palmer -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From rlmackie862 at gmail.com Tue Dec 12 16:06:10 2023 From: rlmackie862 at gmail.com (Randall Mackie) Date: Tue, 12 Dec 2023 14:06:10 -0800 Subject: [petsc-users] valgrind errors Message-ID: It now seems to me that petsc+mpich is no longer valgrind clean, or I am doing something wrong. 
A simple program: Program test #include "petsc/finclude/petscsys.h" use petscsys PetscInt :: ierr call PetscInitialize(PETSC_NULL_CHARACTER,ierr) call PetscFinalize(ierr) end program test PETSc compiled in debug mode, complex scalars, and download-mpich, when run with valgrind generates errors like these: ==3997== Syscall param writev(vector[...]) points to uninitialised byte(s) ==3997== at 0x8C31867: writev (writev.c:26) ==3997== by 0x9C20DE4: MPL_large_writev (mpl_sock.c:31) ==3997== by 0x9BF1050: MPIDI_CH3I_Sock_writev (sock.c:2689) ==3997== by 0x9BF9812: MPIDI_CH3_iStartMsgv (ch3_istartmsgv.c:92) ==3997== by 0x9BA7790: MPIDI_CH3_EagerContigSend (ch3u_eager.c:191) ==3997== by 0x9BCA7EC: MPID_Send (mpid_send.c:132) ==3997== by 0x9BCAC64: MPID_Send_coll (mpid_send.c:206) ==3997== by 0x9A2AC7C: MPIC_Send (helper_fns.c:126) ==3997== by 0x993A645: MPIR_Bcast_intra_binomial (bcast_intra_binomial.c:146) ==3997== by 0x99FF64A: MPIR_Bcast_allcomm_auto (mpir_coll.c:323) ==3997== by 0x99FFC06: MPIR_Bcast_impl (mpir_coll.c:420) ==3997== by 0x99FCF86: MPID_Bcast (mpid_coll.h:30) ==3997== by 0x99FFE13: MPIR_Bcast (mpir_coll.c:465) ==3997== by 0x974A513: internal_Bcast (bcast.c:93) ==3997== by 0x974A72B: PMPI_Bcast (bcast.c:143) ==3997== by 0x4B8D6DB: PETScParseFortranArgs_Private (zstart.c:182) ==3997== by 0x4B8DDFA: PetscInitFortran_Private (zstart.c:200) ==3997== by 0x4B34931: PetscInitialize_Common (pinit.c:974) ==3997== by 0x4B8E8C7: petscinitializef_ (zstart.c:284) ==3997== by 0x4959434: __petscsys_MOD_petscinitializenohelp (petscsysmod.F90:374) ==3997== Address 0x1ffeffcac0 is on thread 1's stack ==3997== in frame #4, created by MPIDI_CH3_EagerContigSend (ch3u_eager.c:160) ==3997== Uninitialised value was created by a stack allocation ==3997== at 0x9BA7601: MPIDI_CH3_EagerContigSend (ch3u_eager.c:160) ==3997== ==3997== Syscall param write(buf) points to uninitialised byte(s) ==3997== at 0x8C2B697: write (write.c:26) ==3997== by 0x9BF0F1D: MPIDI_CH3I_Sock_write (sock.c:2614) ==3997== by 0x9BF7AAE: MPIDI_CH3_iStartMsg (ch3_istartmsg.c:68) ==3997== by 0x9BA7A27: MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:262) ==3997== by 0x9BCA766: MPID_Send (mpid_send.c:119) ==3997== by 0x9BCAC64: MPID_Send_coll (mpid_send.c:206) ==3997== by 0x9A2AC7C: MPIC_Send (helper_fns.c:126) ==3997== by 0x993A645: MPIR_Bcast_intra_binomial (bcast_intra_binomial.c:146) ==3997== by 0x99FF64A: MPIR_Bcast_allcomm_auto (mpir_coll.c:323) ==3997== by 0x99FFC06: MPIR_Bcast_impl (mpir_coll.c:420) ==3997== by 0x99FCF86: MPID_Bcast (mpid_coll.h:30) ==3997== by 0x99FFE13: MPIR_Bcast (mpir_coll.c:465) ==3997== by 0x974A513: internal_Bcast (bcast.c:93) ==3997== by 0x974A72B: PMPI_Bcast (bcast.c:143) ==3997== by 0x4DB95A2: PetscOptionsGetenv (pdisplay.c:61) ==3997== by 0x4E0D745: PetscStrreplace (str.c:572) ==3997== by 0x4AC8DEA: PetscOptionsFilename (options.c:416) ==3997== by 0x4ACF0B5: PetscOptionsInsertFile (options.c:632) ==3997== by 0x4AD3CB5: PetscOptionsInsert (options.c:861) ==3997== by 0x4B8E0EF: PetscInitFortran_Private (zstart.c:206) ==3997== Address 0x1ffeff7998 is on thread 1's stack ==3997== in frame #3, created by MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:223) ==3997== Uninitialised value was created by a stack allocation ==3997== at 0x9BA788F: MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:223) ==3997== Is this a known issue or am I doing something wrong? Thanks, Randy -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From s.roongta at mpie.de Tue Dec 12 16:49:48 2023 From: s.roongta at mpie.de (Sharan Roongta) Date: Tue, 12 Dec 2023 23:49:48 +0100 Subject: [petsc-users] difference in Face Sets in latest petsc release Message-ID: <3518806234-6560@xmail1.mpie.de> Hello, I see discrepancy in the size/value of the 'Face Sets' printed in the current release v3.20.2 , and v3.18.6 Attached is the .msh file -dm_view with v3.18.6 DM Object: Generated Mesh 1 MPI process ? type: plex Generated Mesh in 3 dimensions: ? Number of 0-cells per rank: 14 ? Number of 1-cells per rank: 49 ? Number of 2-cells per rank: 60 ? Number of 3-cells per rank: 24 Labels: ? celltype: 4 strata with value/size (0 (14), 6 (24), 3 (60), 1 (49)) ? depth: 4 strata with value/size (0 (14), 1 (49), 2 (60), 3 (24)) ? Cell Sets: 1 strata with value/size (1 (24)) ? Face Sets: 5 strata with value/size (1 (4), 2 (4), 3 (4), 4 (4), 5 (4)) -dm_view with the current release (commit?4b9a870af96) DM Object: Generated Mesh 1 MPI process ? type: plex Generated Mesh in 3 dimensions: ? Number of 0-cells per rank: 14 ? Number of 1-cells per rank: 49 ? Number of 2-cells per rank: 60 ? Number of 3-cells per rank: 24 Labels: ? celltype: 4 strata with value/size (0 (14), 6 (24), 3 (60), 1 (49)) ? depth: 4 strata with value/size (0 (14), 1 (49), 2 (60), 3 (24)) ? Cell Sets: 1 strata with value/size (1 (24)) ? Face Sets: 12 strata with value/size (1 (5), 2 (5), 3 (5), 4 (5), 5 (5), 6 (1), 7 (1), 8 (1), 9 (1), 10 (1), 11 (1), 12 (1)) I believe the older version printed the correct thing??Has something changed in the interpretation of Face Sets? Thanks, Sharan Group - Theory & Simulation Department of Microstructure Physics & Alloy Design ------------------------------------------------- Stay up to date and follow us on LinkedIn, Twitter and YouTube. Max-Planck-Institut f?r Eisenforschung GmbH Max-Planck-Stra?e 1 D-40237 D?sseldorf Handelsregister B 2533 Amtsgericht D?sseldorf Gesch?ftsf?hrung Prof. Dr. Gerhard Dehm Prof. Dr. J?rg Neugebauer Prof. Dr. Dierk Raabe Dr. Kai de Weldige Ust.-Id.-Nr.: DE 11 93 58 514 Steuernummer: 105 5891 1000 Please consider that invitations and e-mails of our institute are only valid if they end with ?@mpie.de. If you are not sure of the validity please contact rco at mpie.de Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de ------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: m1.tar.xz Type: application/x-xz Size: 844 bytes Desc: not available URL: From knepley at gmail.com Tue Dec 12 17:02:14 2023 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 12 Dec 2023 18:02:14 -0500 Subject: [petsc-users] Applying Natural Boundary Conditions using PETSc FEM Technology In-Reply-To: References: Message-ID: On Tue, Dec 12, 2023 at 12:23?PM Brandon Denton via petsc-users < petsc-users at mcs.anl.gov> wrote: > Good Afternoon, > > I am currently working on an Inviscid Navier-Stokes problem and would like > to apply DM_BC_NATURAL boundary conditions to my domain. Looking through > the example files on petsc.org, I noticed that in almost all cases there > are the following series of calls. 
> > PetscCall(DMAddBoundary(dm, DM_BC_NATURAL, "wall", label, 1, &id, 0, 0, > NULL, NULL, NULL, user, &bd)); > PetscCall(PetscDSGetBoundary(ds, bd, &wf, NULL, NULL, NULL, NULL, NULL, > NULL, NULL, NULL, NULL, NULL, NULL)); > PetscCall(PetscWeakFormSetIndexBdResidual(wf, label, id, 0, 0, 0, f0_bd_u, > 0, NULL)); > > Is this the standard way of applying Natural boundary conditions in PETSc > for FEM? > Yes. The problem is that AddBoundary was designed just to deliver boundary values, but inhomogeneous Neumann conditions really want weak forms, and the weak form interface came later. It is a little clunky. > Also, I noticed in the signature for the f0_bd_u function, there is a > const PetscReal n[] array. What is this array and what information does it > hold. Is it the normal vector at the point? > That is the normal at the evaluation point. Thanks, Matt > static void f0_bd_u(PetscInt dim, PetscInt Nf, PetscInt NfAux, const > PetscInt uOff[], const PetscInt uOff_x[], const PetscScalar u[], const > PetscScalar u_t[], const PetscScalar u_x[], const PetscInt aOff[], const > PetscInt aOff_x[], const PetscScalar a[], const PetscScalar a_t[], const > PetscScalar a_x[], PetscReal t, const PetscReal x[], const PetscReal n[], > PetscInt numConstants, const PetscScalar constants[], PetscScalar f0[]) > > Thank you in advance for your time. > Brandon > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Dec 12 17:16:40 2023 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 12 Dec 2023 18:16:40 -0500 Subject: [petsc-users] DMPlex "Could not find orientation for quadrilateral" In-Reply-To: References: Message-ID: On Tue, Dec 12, 2023 at 12:22?PM onur.notonur via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hi, > > I hope this email finds you well. I am currently working on importing an > OpenFOAM PolyMesh into DMPlex, and I've encountered an issue. The PolyMesh > format includes face owner cells/neighbor cells and face-to-vertex > connectivity. I was using the "DMPlexCreateFromCellListPetsc()" function, > which required cell-to-vertex connectivity. However, when attempting to > create the cell connectivity using an edge loop [p_0, p_1, ..., p_7] (p_n > and p_(n+1) are valid edges in my mesh), I encountered an error stating, > "Could not find orientation for quadrilateral." > > (Actually at first, I generated the connectivity list by simply creating a > cell-to-face list and then using that to create a cell-to-vertex list. > (just map over the list and remove duplicates) This created a DMPlex > successfully, however, resulted in a mesh that was incorrect when looking > with ParaView. I think that was because of I stated wrong edge loop to > create cells) > > I understand that I may need to follow a different format for > connectivity, but I'm not sure what that format is. My current mesh is > hexahedral, consisting of 8 corner elements(if important). I would > appreciate any guidance on a general approach to address this issue. > Can you start by giving the PolyMesh format, or some URL with it documented? Thanks, Matt > Thank you for your time and assistance. > Best, > Onur > Sent with Proton Mail secure email. 
> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Dec 12 17:51:46 2023 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 12 Dec 2023 18:51:46 -0500 Subject: [petsc-users] difference in Face Sets in latest petsc release In-Reply-To: <3518806234-6560@xmail1.mpie.de> References: <3518806234-6560@xmail1.mpie.de> Message-ID: On Tue, Dec 12, 2023 at 5:50?PM Sharan Roongta wrote: > Hello, > > I see discrepancy in the size/value of the 'Face Sets' printed in the > current release v3.20.2 , and v3.18.6 > > Attached is the .msh file > > -dm_view with v3.18.6 > DM Object: Generated Mesh 1 MPI process > type: plex > Generated Mesh in 3 dimensions: > Number of 0-cells per rank: 14 > Number of 1-cells per rank: 49 > Number of 2-cells per rank: 60 > Number of 3-cells per rank: 24 > Labels: > celltype: 4 strata with value/size (0 (14), 6 (24), 3 (60), 1 (49)) > depth: 4 strata with value/size (0 (14), 1 (49), 2 (60), 3 (24)) > Cell Sets: 1 strata with value/size (1 (24)) > Face Sets: 5 strata with value/size (1 (4), 2 (4), 3 (4), 4 (4), 5 (4)) > > > -dm_view with the current release (commit 4b9a870af96) > > DM Object: Generated Mesh 1 MPI process > type: plex > Generated Mesh in 3 dimensions: > Number of 0-cells per rank: 14 > Number of 1-cells per rank: 49 > Number of 2-cells per rank: 60 > Number of 3-cells per rank: 24 > Labels: > celltype: 4 strata with value/size (0 (14), 6 (24), 3 (60), 1 (49)) > depth: 4 strata with value/size (0 (14), 1 (49), 2 (60), 3 (24)) > Cell Sets: 1 strata with value/size (1 (24)) > Face Sets: 12 strata with value/size (1 (5), 2 (5), 3 (5), 4 (5), 5 (5), > 6 (1), 7 (1), 8 (1), 9 (1), 10 (1), 11 (1), 12 (1)) > > I believe the older version printed the correct thing? Has something > changed in the interpretation of Face Sets? > Yes. In the older version, I was only labeling cells, faces, and vertices. There were complaints, so I put in the edge labels. If you check, all the additional labels are on edges, and checking your .msh file, those edges clearly have those labels. Thanks, Matt > Thanks, > Sharan > > *Group - Theory & Simulation* > *Department of Microstructure Physics & Alloy Design* > > > ------------------------------ > ------------------------------------------------- > Stay up to date and follow us on LinkedIn, Twitter and YouTube. > > Max-Planck-Institut f?r Eisenforschung GmbH > Max-Planck-Stra?e 1 > D-40237 D?sseldorf > > Handelsregister B 2533 > Amtsgericht D?sseldorf > > Gesch?ftsf?hrung > Prof. Dr. Gerhard Dehm > Prof. Dr. J?rg Neugebauer > Prof. Dr. Dierk Raabe > Dr. Kai de Weldige > > Ust.-Id.-Nr.: DE 11 93 58 514 > Steuernummer: 105 5891 1000 > > > Please consider that invitations and e-mails of our institute are > only valid if they end with ?@mpie.de. > If you are not sure of the validity please contact rco at mpie.de > > Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails > aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. > In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de > ------------------------------------------------- > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Dec 12 18:01:13 2023 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 12 Dec 2023 19:01:13 -0500 Subject: [petsc-users] Domain decomposition in PETSc for Molecular Dynamics In-Reply-To: References: <2BEA961D-00D5-4880-A162-7262E398C048@us.es> Message-ID: On Tue, Dec 12, 2023 at 2:36?PM MIGUEL MOLINOS PEREZ wrote: > Dear Matthew and Mark, > > Thank you for four useful guidance. I have taken as a starting point the > example in "dm/tutorials/swarm_ex3.c" to build a first approximation for > domain decomposition in my molecular dynamics code (diffusive molecular > dynamic to be more precise :-) ). And I must say that I am very happy with > the result. However, in my journey integrating domain decomposition into my > code, I am facing some new (and expected) problems. The first is in the > implementation of the nearest neighbor algorithm (list of atoms closest to > each atom). > Can you help me understand this? For a given atom, there should be a single "closest" atom (barring degeneracies in distance). What do you mean by the list of closest atoms? Thanks, Matt > My current approach to the problem is a brute force algorithm (double loop > through the list of atoms and calculate the distance). However, it seems > that if I call the "neighbours" function after the "DMSwarmMigrate" > function the search algorithm does not work correctly. My thoughts / hints > are: > > - The two nested for loops should be done on the global indexing of > the atoms instead of the local one (I don't know how to get this number). > - If I print the mean position of atom #0 (for example) each range > prints a different value of the average position. One of them is the > correct position corresponding to site #0, the others are different (but > identically labeled) atomic sites. Which means that the site_i index is not > bijective. > > > I believe that solving this problem will increase my understanding of the > domain decomposition approach and may allow me to fix the remaining parts > of my code. > > Any additional comments are greatly appreciated. For instance, I will be > happy to be pointed to any piece of code (petsc examples for example) with > solves a similar problem in order to self-learn learn by example. > > Many thanks in advance. > > Best, > Miguel > > This is the piece of code (simplified) which computes the list of > neighbours for each atomic site. DMD is a structure which contains the > atomistic information (DMSWARM), and the background mesh and bounding > cell (DMDA and DMShell) > > int neighbours(DMD* Simulation) { > > PetscFunctionBegin; > PetscCallMPI(MPI_Comm_rank(PETSC_COMM_WORLD, &rank)); > > PetscCall(DMSwarmGetLocalSize(Simulation->atomistic_data, &n_atoms_local > )); > > //! Get array with the mean position of the atoms > DMSwarmGetField(Simulation->atomistic_data, DMSwarmPICField_coor, & > blocksize, NULL, > (void**)&mean_q_ptr); > Eigen::Map mean_q(mean_q_ptr, n_atoms_local, dim); > > int* neigh = Simulation->neigh; > int* numneigh = Simulation->numneigh; > > for (unsigned int site_i = 0; site_i < n_atoms_local; site_i++) { > > //! Get mean position of site i > Eigen::Vector3d mean_q_i = mean_q.block<1, 3>(site_i, 0); > > //! Search neighbourhs in the main cell (+ periodic cells) > for (unsigned site_j = 0; site_j < n_atoms_local; site_j++) { > if (site_i != site_j) { > //! 
Get mean position of site j in the periodic box > Eigen::Vector3d mean_q_j = mean_q.block<1, 3>(site_j, 0); > > //! Check is site j is the neibourhood of the site i > double norm_r_ij = (mean_q_i - mean_q_j).norm(); > if ((norm_r_ij <= r_cutoff_ADP) && (numneigh[site_i] < maxneigh)) { > neigh[site_i * maxneigh + numneigh[site_i]] = site_j; > numneigh[site_i] += 1; > } > } > } > > } // MPI for loop (site_i) > > DMSwarmRestoreField(Simulation->atomistic_data, DMSwarmPICField_coor, & > blocksize, > NULL, (void**)&mean_q_ptr); > > return EXIT_SUCCESS; > } > > > This is the piece of code that I use to read the atomic positions (mean_q) > from a file: > //! @brief mean_q: Mean value of each atomic position > double* mean_q; > PetscCall(DMSwarmGetField(atomistic_data, DMSwarmPICField_coor, &blocksize > , > NULL, (void**)&mean_q)); > > cnt = 0; > for (PetscInt site_i = 0; site_i < n_atoms_local; site_i++) { > if (cnt < n_atoms) { > mean_q[blocksize * cnt + 0] = Simulation_file.mean_q[cnt * dim + 0]; > mean_q[blocksize * cnt + 1] = Simulation_file.mean_q[cnt * dim + 1]; > mean_q[blocksize * cnt + 2] = Simulation_file.mean_q[cnt * dim + 2]; > > cnt++; > } > } > PetscCall(DMSwarmRestoreField(atomistic_data, DMSwarmPICField_coor, > &blocksize, NULL, (void**)&mean_q)); > > > > [image: Screenshot 2023-12-12 at 19.42.13.png] > > On 4 Nov 2023, at 15:50, MIGUEL MOLINOS PEREZ wrote: > > ?Thank you Mark! I will have a look to it. > > Best, > Miguel > > > On 4 Nov 2023, at 13:54, Matthew Knepley wrote: > > ? > On Sat, Nov 4, 2023 at 8:40?AM Mark Adams wrote: > >> Hi MIGUEL, >> >> This might be a good place to start: https://petsc.org/main/manual/vec/ >> Feel free to ask more specific questions, but the docs are a good place >> to start. >> >> Thanks, >> Mark >> >> On Fri, Nov 3, 2023 at 5:19?AM MIGUEL MOLINOS PEREZ >> wrote: >> >>> Dear all, >>> >>> I am currently working on the development of a in-house molecular >>> dynamics code using PETSc and C++. So far the code works great, however it >>> is a little bit slow since I am not exploiting MPI for PETSc vectors. I was >>> wondering if there is a way to perform the domain decomposition efficiently >>> using some PETSc functionality. Any feedback is highly appreciated. >>> >> > It sounds like you mean "is there a way to specify a communication > construct that can send my particle > information automatically". We use PetscSF for that. You can see how this > works with the DMSwarm class, which represents a particle discretization. > You can either use that, or if it does not work for you, do the same things > with your class. > > Thanks, > > Matt > > >> Best regards, >>> Miguel >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: Screenshot 2023-12-12 at 19.42.13.png Type: image/png Size: 1252024 bytes Desc: not available URL: From s.roongta at mpie.de Tue Dec 12 18:13:02 2023 From: s.roongta at mpie.de (Sharan Roongta) Date: Wed, 13 Dec 2023 01:13:02 +0100 Subject: [petsc-users] difference in Face Sets in latest petsc release In-Reply-To: Message-ID: <3523697218-10216@xmail1.mpie.de> Thanks for the clarification. However, would it be more consistent to differentiate between 0D (vertex sets), 1D (edge sets), 2d (faces) and 3D (cell sets)?? If I want to now apply boundary condition on a face with tag 1, it would contain the 4 edges making up that face and an additional edge with the same physical tag?? Basically, I can?t differentiate between the two entities Thanks, Sharan? Group - Theory & Simulation Department of Microstructure Physics & Alloy Design From: Matthew Knepley To: Sharan Roongta Cc: "petsc-users at mcs.anl.gov" Sent: 13/12/2023 12:51 AM Subject: Re: [petsc-users] difference in Face Sets in latest petsc release On Tue, Dec 12, 2023 at 5:50?PM Sharan Roongta wrote: Hello, I see discrepancy in the size/value of the 'Face Sets' printed in the current release v3.20.2 , and v3.18.6 Attached is the .msh file -dm_view with v3.18.6 DM Object: Generated Mesh 1 MPI process ? type: plex Generated Mesh in 3 dimensions: ? Number of 0-cells per rank: 14 ? Number of 1-cells per rank: 49 ? Number of 2-cells per rank: 60 ? Number of 3-cells per rank: 24 Labels: ? celltype: 4 strata with value/size (0 (14), 6 (24), 3 (60), 1 (49)) ? depth: 4 strata with value/size (0 (14), 1 (49), 2 (60), 3 (24)) ? Cell Sets: 1 strata with value/size (1 (24)) ? Face Sets: 5 strata with value/size (1 (4), 2 (4), 3 (4), 4 (4), 5 (4)) -dm_view with the current release (commit?4b9a870af96) DM Object: Generated Mesh 1 MPI process ? type: plex Generated Mesh in 3 dimensions: ? Number of 0-cells per rank: 14 ? Number of 1-cells per rank: 49 ? Number of 2-cells per rank: 60 ? Number of 3-cells per rank: 24 Labels: ? celltype: 4 strata with value/size (0 (14), 6 (24), 3 (60), 1 (49)) ? depth: 4 strata with value/size (0 (14), 1 (49), 2 (60), 3 (24)) ? Cell Sets: 1 strata with value/size (1 (24)) ? Face Sets: 12 strata with value/size (1 (5), 2 (5), 3 (5), 4 (5), 5 (5), 6 (1), 7 (1), 8 (1), 9 (1), 10 (1), 11 (1), 12 (1)) I believe the older version printed the correct thing??Has something changed in the interpretation of Face Sets? Yes. In the older version, I was only labeling cells, faces, and vertices. There were complaints, so I put in the edge labels. If you check, all the additional labels are on edges, and checking your .msh file, those edges clearly have those labels. ? Thanks, ? ? ?Matt ? Thanks, Sharan Group - Theory & Simulation Department of Microstructure Physics & Alloy Design ---------------- ------------------------------------------------- Stay?up?to?date?and?follow?us?on?LinkedIn,?Twitter?and?YouTube. Max-Planck-Institut?f?r?Eisenforschung?GmbH Max-Planck-Stra?e?1 D-40237?D?sseldorf ? Handelsregister?B?2533? Amtsgericht?D?sseldorf ? Gesch?ftsf?hrung Prof.?Dr.?Gerhard?Dehm Prof.?Dr.?J?rg?Neugebauer Prof.?Dr.?Dierk?Raabe Dr.?Kai?de?Weldige ? Ust.-Id.-Nr.:?DE?11?93?58?514? Steuernummer:?105?5891?1000 Please?consider?that?invitations?and?e-mails?of?our?institute?are? only?valid?if?they?end?with??@mpie.de.? If?you?are?not?sure?of?the?validity?please?contact?rco at mpie.de Bitte?beachten?Sie,?dass?Einladungen?zu?Veranstaltungen?und?E-Mails aus?unserem?Haus?nur?mit?der?Endung??@mpie.de?g?ltig?sind.? 
In?Zweifelsf?llen?wenden?Sie?sich?bitte?an?rco at mpie.de ------------------------------------------------- -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ ------------------------------------------------- Stay up to date and follow us on LinkedIn, Twitter and YouTube. Max-Planck-Institut f?r Eisenforschung GmbH Max-Planck-Stra?e 1 D-40237 D?sseldorf Handelsregister B 2533 Amtsgericht D?sseldorf Gesch?ftsf?hrung Prof. Dr. Gerhard Dehm Prof. Dr. J?rg Neugebauer Prof. Dr. Dierk Raabe Dr. Kai de Weldige Ust.-Id.-Nr.: DE 11 93 58 514 Steuernummer: 105 5891 1000 Please consider that invitations and e-mails of our institute are only valid if they end with ?@mpie.de. If you are not sure of the validity please contact rco at mpie.de Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de ------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Dec 12 18:20:22 2023 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 12 Dec 2023 19:20:22 -0500 Subject: [petsc-users] difference in Face Sets in latest petsc release In-Reply-To: <3523697218-10216@xmail1.mpie.de> References: <3523697218-10216@xmail1.mpie.de> Message-ID: On Tue, Dec 12, 2023 at 7:13?PM Sharan Roongta wrote: > Thanks for the clarification. > However, would it be more consistent to differentiate between 0D (vertex > sets), 1D (edge sets), 2d (faces) and 3D (cell sets)? > If I want to now apply boundary condition on a face with tag 1, it would > contain the 4 edges making up that face and an additional edge with the > same physical tag? > Basically, I can?t differentiate between the two entities > When we do this in PyLith, we only tag the things we want tagged. In this mesh, absolutely everything is tagged, and with overlapping tag ranges. That seems counterproductive. 
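If it helps, here is a hedged sketch (illustrative only; the label name and the value 1 come from the mesh above, everything else is made up) of how the genuine faces can be picked out of such a stratum by keeping only the height-1 points and skipping the tagged edges:

static PetscErrorCode KeepOnlyFaces(DM dm)
{
  DMLabel         label;
  IS              is;
  const PetscInt *points;
  PetscInt        n, fStart, fEnd, p;

  PetscFunctionBeginUser;
  PetscCall(DMGetLabel(dm, "Face Sets", &label));
  PetscCall(DMPlexGetHeightStratum(dm, 1, &fStart, &fEnd)); /* faces = height-1 points */
  PetscCall(DMLabelGetStratumIS(label, 1, &is));            /* all points tagged with value 1 */
  if (is) {
    PetscCall(ISGetLocalSize(is, &n));
    PetscCall(ISGetIndices(is, &points));
    for (p = 0; p < n; ++p) {
      if (points[p] >= fStart && points[p] < fEnd) {
        /* points[p] is a genuine face; tagged edges and vertices fall outside [fStart, fEnd) */
      }
    }
    PetscCall(ISRestoreIndices(is, &points));
    PetscCall(ISDestroy(&is));
  }
  PetscFunctionReturn(PETSC_SUCCESS);
}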
Thanks, Matt > Thanks, > Sharan > > *Group - Theory & Simulation* > *Department of Microstructure Physics & Alloy Design* > > > > * From: * Matthew Knepley > * To: * Sharan Roongta > * Cc: * "petsc-users at mcs.anl.gov" > * Sent: * 13/12/2023 12:51 AM > * Subject: * Re: [petsc-users] difference in Face Sets in latest petsc > release > > On Tue, Dec 12, 2023 at 5:50?PM Sharan Roongta wrote: > > Hello, > > I see discrepancy in the size/value of the 'Face Sets' printed in the > current release v3.20.2 , and v3.18.6 > > Attached is the .msh file > > -dm_view with v3.18.6 > DM Object: Generated Mesh 1 MPI process > type: plex > Generated Mesh in 3 dimensions: > Number of 0-cells per rank: 14 > Number of 1-cells per rank: 49 > Number of 2-cells per rank: 60 > Number of 3-cells per rank: 24 > Labels: > celltype: 4 strata with value/size (0 (14), 6 (24), 3 (60), 1 (49)) > depth: 4 strata with value/size (0 (14), 1 (49), 2 (60), 3 (24)) > Cell Sets: 1 strata with value/size (1 (24)) > Face Sets: 5 strata with value/size (1 (4), 2 (4), 3 (4), 4 (4), 5 (4)) > > > -dm_view with the current release (commit 4b9a870af96) > > DM Object: Generated Mesh 1 MPI process > type: plex > Generated Mesh in 3 dimensions: > Number of 0-cells per rank: 14 > Number of 1-cells per rank: 49 > Number of 2-cells per rank: 60 > Number of 3-cells per rank: 24 > Labels: > celltype: 4 strata with value/size (0 (14), 6 (24), 3 (60), 1 (49)) > depth: 4 strata with value/size (0 (14), 1 (49), 2 (60), 3 (24)) > Cell Sets: 1 strata with value/size (1 (24)) > Face Sets: 12 strata with value/size (1 (5), 2 (5), 3 (5), 4 (5), 5 (5), > 6 (1), 7 (1), 8 (1), 9 (1), 10 (1), 11 (1), 12 (1)) > > I believe the older version printed the correct thing? Has something > changed in the interpretation of Face Sets? > > > Yes. In the older version, I was only labeling cells, faces, and vertices. > There were complaints, so I put in the edge labels. If you check, all the > additional labels are on edges, and checking your .msh file, those edges > clearly have those labels. > > Thanks, > > Matt > > > Thanks, > Sharan > > *Group - Theory & Simulation* > *Department of Microstructure Physics & Alloy Design* > > > ------------------------------ > ------------------------------------------------- > Stay up to date and follow us on LinkedIn, Twitter and YouTube. > > Max-Planck-Institut f?r Eisenforschung GmbH > Max-Planck-Stra?e 1 > D-40237 D?sseldorf > > Handelsregister B 2533 > Amtsgericht D?sseldorf > > Gesch?ftsf?hrung > Prof. Dr. Gerhard Dehm > Prof. Dr. J?rg Neugebauer > Prof. Dr. Dierk Raabe > Dr. Kai de Weldige > > Ust.-Id.-Nr.: DE 11 93 58 514 > Steuernummer: 105 5891 1000 > > > Please consider that invitations and e-mails of our institute are > only valid if they end with ?@mpie.de. > If you are not sure of the validity please contact rco at mpie.de > > Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails > aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. > In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de > ------------------------------------------------- > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > > > ------------------------------ > ------------------------------------------------- > Stay up to date and follow us on LinkedIn, Twitter and YouTube. 
> > Max-Planck-Institut f?r Eisenforschung GmbH > Max-Planck-Stra?e 1 > D-40237 D?sseldorf > > Handelsregister B 2533 > Amtsgericht D?sseldorf > > Gesch?ftsf?hrung > Prof. Dr. Gerhard Dehm > Prof. Dr. J?rg Neugebauer > Prof. Dr. Dierk Raabe > Dr. Kai de Weldige > > Ust.-Id.-Nr.: DE 11 93 58 514 > Steuernummer: 105 5891 1000 > > > Please consider that invitations and e-mails of our institute are > only valid if they end with ?@mpie.de. > If you are not sure of the validity please contact rco at mpie.de > > Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails > aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. > In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de > ------------------------------------------------- > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmolinos at us.es Tue Dec 12 18:28:17 2023 From: mmolinos at us.es (MIGUEL MOLINOS PEREZ) Date: Wed, 13 Dec 2023 00:28:17 +0000 Subject: [petsc-users] Domain decomposition in PETSc for Molecular Dynamics In-Reply-To: References: <2BEA961D-00D5-4880-A162-7262E398C048@us.es> Message-ID: <9072F31D-93C8-43B6-8D96-5DB4CFDECFF9@us.es> I meant the list of atoms which lies inside of a sphere of radius R_cutoff centered at the mean position of a given atom. Best, Miguel On 13 Dec 2023, at 01:14, Matthew Knepley wrote: ? On Tue, Dec 12, 2023 at 2:36?PM MIGUEL MOLINOS PEREZ > wrote: Dear Matthew and Mark, Thank you for four useful guidance. I have taken as a starting point the example in "dm/tutorials/swarm_ex3.c" to build a first approximation for domain decomposition in my molecular dynamics code (diffusive molecular dynamic to be more precise :-) ). And I must say that I am very happy with the result. However, in my journey integrating domain decomposition into my code, I am facing some new (and expected) problems. The first is in the implementation of the nearest neighbor algorithm (list of atoms closest to each atom). Can you help me understand this? For a given atom, there should be a single "closest" atom (barring degeneracies in distance). What do you mean by the list of closest atoms? Thanks, Matt My current approach to the problem is a brute force algorithm (double loop through the list of atoms and calculate the distance). However, it seems that if I call the "neighbours" function after the "DMSwarmMigrate" function the search algorithm does not work correctly. My thoughts / hints are: * The two nested for loops should be done on the global indexing of the atoms instead of the local one (I don't know how to get this number). * If I print the mean position of atom #0 (for example) each range prints a different value of the average position. One of them is the correct position corresponding to site #0, the others are different (but identically labeled) atomic sites. Which means that the site_i index is not bijective. I believe that solving this problem will increase my understanding of the domain decomposition approach and may allow me to fix the remaining parts of my code. Any additional comments are greatly appreciated. For instance, I will be happy to be pointed to any piece of code (petsc examples for example) with solves a similar problem in order to self-learn learn by example. Many thanks in advance. 
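For the search itself, a linked-cell (binning) pass with bin width r_cutoff is the standard way to avoid the all-pairs double loop. The sketch below is only illustrative plain C with invented names; it ignores periodic images and parallel ghost atoms, and it is not code from this thread:

#include <stdlib.h>
#include <math.h>

/* Hedged sketch: bin the n local points (coordinates q[3*i+d]) into cells of
   width >= r_cut inside the box [lo, hi], then compare each point only
   against the 27 surrounding bins instead of all other points. */
static void cell_list_neighbours(int n, const double *q, const double lo[3],
                                 const double hi[3], double r_cut,
                                 int maxneigh, int *neigh, int *numneigh)
{
  int nb[3], *head, *next, i, d;
  for (d = 0; d < 3; ++d) {
    nb[d] = (int)floor((hi[d] - lo[d]) / r_cut);
    if (nb[d] < 1) nb[d] = 1;
  }
  head = malloc(sizeof(int) * nb[0] * nb[1] * nb[2]);
  next = malloc(sizeof(int) * n);
  for (i = 0; i < nb[0] * nb[1] * nb[2]; ++i) head[i] = -1;

  /* bin every point into a singly linked list per cell */
  for (i = 0; i < n; ++i) {
    int c[3], b;
    for (d = 0; d < 3; ++d) {
      c[d] = (int)((q[3 * i + d] - lo[d]) / r_cut);
      if (c[d] < 0) c[d] = 0;
      if (c[d] >= nb[d]) c[d] = nb[d] - 1;
    }
    b = (c[2] * nb[1] + c[1]) * nb[0] + c[0];
    next[i] = head[b];
    head[b] = i;
    numneigh[i] = 0;
  }

  /* visit only the 27 neighbouring bins of each point */
  for (i = 0; i < n; ++i) {
    int c[3];
    for (d = 0; d < 3; ++d) {
      c[d] = (int)((q[3 * i + d] - lo[d]) / r_cut);
      if (c[d] < 0) c[d] = 0;
      if (c[d] >= nb[d]) c[d] = nb[d] - 1;
    }
    for (int dz = -1; dz <= 1; ++dz)
      for (int dy = -1; dy <= 1; ++dy)
        for (int dx = -1; dx <= 1; ++dx) {
          int cx = c[0] + dx, cy = c[1] + dy, cz = c[2] + dz, j;
          if (cx < 0 || cy < 0 || cz < 0 || cx >= nb[0] || cy >= nb[1] || cz >= nb[2]) continue;
          for (j = head[(cz * nb[1] + cy) * nb[0] + cx]; j >= 0; j = next[j]) {
            double r2 = 0.0;
            if (j == i) continue;
            for (d = 0; d < 3; ++d) {
              double dq = q[3 * i + d] - q[3 * j + d];
              r2 += dq * dq;
            }
            if (r2 <= r_cut * r_cut && numneigh[i] < maxneigh) {
              neigh[i * maxneigh + numneigh[i]] = j;
              numneigh[i] += 1;
            }
          }
        }
  }
  free(head);
  free(next);
}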
Best, Miguel This is the piece of code (simplified) which computes the list of neighbours for each atomic site. DMD is a structure which contains the atomistic information (DMSWARM), and the background mesh and bounding cell (DMDA and DMShell) int neighbours(DMD* Simulation) { PetscFunctionBegin; PetscCallMPI(MPI_Comm_rank(PETSC_COMM_WORLD, &rank)); PetscCall(DMSwarmGetLocalSize(Simulation->atomistic_data, &n_atoms_local)); //! Get array with the mean position of the atoms DMSwarmGetField(Simulation->atomistic_data, DMSwarmPICField_coor, &blocksize, NULL, (void**)&mean_q_ptr); Eigen::Map mean_q(mean_q_ptr, n_atoms_local, dim); int* neigh = Simulation->neigh; int* numneigh = Simulation->numneigh; for (unsigned int site_i = 0; site_i < n_atoms_local; site_i++) { //! Get mean position of site i Eigen::Vector3d mean_q_i = mean_q.block<1, 3>(site_i, 0); //! Search neighbourhs in the main cell (+ periodic cells) for (unsigned site_j = 0; site_j < n_atoms_local; site_j++) { if (site_i != site_j) { //! Get mean position of site j in the periodic box Eigen::Vector3d mean_q_j = mean_q.block<1, 3>(site_j, 0); //! Check is site j is the neibourhood of the site i double norm_r_ij = (mean_q_i - mean_q_j).norm(); if ((norm_r_ij <= r_cutoff_ADP) && (numneigh[site_i] < maxneigh)) { neigh[site_i * maxneigh + numneigh[site_i]] = site_j; numneigh[site_i] += 1; } } } } // MPI for loop (site_i) DMSwarmRestoreField(Simulation->atomistic_data, DMSwarmPICField_coor, &blocksize, NULL, (void**)&mean_q_ptr); return EXIT_SUCCESS; } This is the piece of code that I use to read the atomic positions (mean_q) from a file: //! @brief mean_q: Mean value of each atomic position double* mean_q; PetscCall(DMSwarmGetField(atomistic_data, DMSwarmPICField_coor, &blocksize, NULL, (void**)&mean_q)); cnt = 0; for (PetscInt site_i = 0; site_i < n_atoms_local; site_i++) { if (cnt < n_atoms) { mean_q[blocksize * cnt + 0] = Simulation_file.mean_q[cnt * dim + 0]; mean_q[blocksize * cnt + 1] = Simulation_file.mean_q[cnt * dim + 1]; mean_q[blocksize * cnt + 2] = Simulation_file.mean_q[cnt * dim + 2]; cnt++; } } PetscCall(DMSwarmRestoreField(atomistic_data, DMSwarmPICField_coor, &blocksize, NULL, (void**)&mean_q)); On 4 Nov 2023, at 15:50, MIGUEL MOLINOS PEREZ > wrote: ?Thank you Mark! I will have a look to it. Best, Miguel On 4 Nov 2023, at 13:54, Matthew Knepley > wrote: ? On Sat, Nov 4, 2023 at 8:40?AM Mark Adams > wrote: Hi MIGUEL, This might be a good place to start: https://petsc.org/main/manual/vec/ Feel free to ask more specific questions, but the docs are a good place to start. Thanks, Mark On Fri, Nov 3, 2023 at 5:19?AM MIGUEL MOLINOS PEREZ > wrote: Dear all, I am currently working on the development of a in-house molecular dynamics code using PETSc and C++. So far the code works great, however it is a little bit slow since I am not exploiting MPI for PETSc vectors. I was wondering if there is a way to perform the domain decomposition efficiently using some PETSc functionality. Any feedback is highly appreciated. It sounds like you mean "is there a way to specify a communication construct that can send my particle information automatically". We use PetscSF for that. You can see how this works with the DMSwarm class, which represents a particle discretization. You can either use that, or if it does not work for you, do the same things with your class. 
Thanks, Matt Best regards, Miguel -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Screenshot 2023-12-12 at 19.42.13.png Type: image/png Size: 1252024 bytes Desc: Screenshot 2023-12-12 at 19.42.13.png URL: From knepley at gmail.com Tue Dec 12 18:43:29 2023 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 12 Dec 2023 19:43:29 -0500 Subject: [petsc-users] Domain decomposition in PETSc for Molecular Dynamics In-Reply-To: <9072F31D-93C8-43B6-8D96-5DB4CFDECFF9@us.es> References: <2BEA961D-00D5-4880-A162-7262E398C048@us.es> <9072F31D-93C8-43B6-8D96-5DB4CFDECFF9@us.es> Message-ID: On Tue, Dec 12, 2023 at 7:28?PM MIGUEL MOLINOS PEREZ wrote: > I meant the list of atoms which lies inside of a sphere of radius R_cutoff > centered at the mean position of a given atom. > Okay, this is possible in parallel, but would require hard work and I have not done it yet, although I think all the tools are coded. In serial, there are at least two ways to do it. First, you can use a k-d tree implementation, since they usually have the radius query. I have not put one of these in, because I did not like any implementation, but Underworld and Firedrake have and it works fine. Second, you can choose a grid size of R_cutoff for the background grid, and then check neighbors. This is probably how I would start. Thanks, Matt > Best, > Miguel > > On 13 Dec 2023, at 01:14, Matthew Knepley wrote: > > ? > On Tue, Dec 12, 2023 at 2:36?PM MIGUEL MOLINOS PEREZ > wrote: > >> Dear Matthew and Mark, >> >> Thank you for four useful guidance. I have taken as a starting point the >> example in "dm/tutorials/swarm_ex3.c" to build a first approximation for >> domain decomposition in my molecular dynamics code (diffusive molecular >> dynamic to be more precise :-) ). And I must say that I am very happy with >> the result. However, in my journey integrating domain decomposition into my >> code, I am facing some new (and expected) problems. The first is in the >> implementation of the nearest neighbor algorithm (list of atoms closest to >> each atom). >> > > Can you help me understand this? For a given atom, there should be a > single "closest" atom (barring > degeneracies in distance). What do you mean by the list of closest atoms? > > Thanks, > > Matt > > >> My current approach to the problem is a brute force algorithm (double >> loop through the list of atoms and calculate the distance). However, it >> seems that if I call the "neighbours" function after the "DMSwarmMigrate" >> function the search algorithm does not work correctly. My thoughts / hints >> are: >> >> - The two nested for loops should be done on the global indexing of >> the atoms instead of the local one (I don't know how to get this number). >> - If I print the mean position of atom #0 (for example) each range >> prints a different value of the average position. One of them is the >> correct position corresponding to site #0, the others are different (but >> identically labeled) atomic sites. 
Which means that the site_i index is not >> bijective. >> >> >> I believe that solving this problem will increase my understanding of the >> domain decomposition approach and may allow me to fix the remaining parts >> of my code. >> >> Any additional comments are greatly appreciated. For instance, I will be >> happy to be pointed to any piece of code (petsc examples for example) with >> solves a similar problem in order to self-learn learn by example. >> >> Many thanks in advance. >> >> Best, >> Miguel >> >> This is the piece of code (simplified) which computes the list of >> neighbours for each atomic site. DMD is a structure which contains the >> atomistic information (DMSWARM), and the background mesh and bounding >> cell (DMDA and DMShell) >> >> int neighbours(DMD* Simulation) { >> >> PetscFunctionBegin; >> PetscCallMPI(MPI_Comm_rank(PETSC_COMM_WORLD, &rank)); >> >> PetscCall(DMSwarmGetLocalSize(Simulation->atomistic_data, &n_atoms_local >> )); >> >> //! Get array with the mean position of the atoms >> DMSwarmGetField(Simulation->atomistic_data, DMSwarmPICField_coor, & >> blocksize, NULL, >> (void**)&mean_q_ptr); >> Eigen::Map mean_q(mean_q_ptr, n_atoms_local, dim); >> >> int* neigh = Simulation->neigh; >> int* numneigh = Simulation->numneigh; >> >> for (unsigned int site_i = 0; site_i < n_atoms_local; site_i++) { >> >> //! Get mean position of site i >> Eigen::Vector3d mean_q_i = mean_q.block<1, 3>(site_i, 0); >> >> //! Search neighbourhs in the main cell (+ periodic cells) >> for (unsigned site_j = 0; site_j < n_atoms_local; site_j++) { >> if (site_i != site_j) { >> //! Get mean position of site j in the periodic box >> Eigen::Vector3d mean_q_j = mean_q.block<1, 3>(site_j, 0); >> >> //! Check is site j is the neibourhood of the site i >> double norm_r_ij = (mean_q_i - mean_q_j).norm(); >> if ((norm_r_ij <= r_cutoff_ADP) && (numneigh[site_i] < maxneigh)) { >> neigh[site_i * maxneigh + numneigh[site_i]] = site_j; >> numneigh[site_i] += 1; >> } >> } >> } >> >> } // MPI for loop (site_i) >> >> DMSwarmRestoreField(Simulation->atomistic_data, DMSwarmPICField_coor, & >> blocksize, >> NULL, (void**)&mean_q_ptr); >> >> return EXIT_SUCCESS; >> } >> >> >> This is the piece of code that I use to read the atomic positions >> (mean_q) from a file: >> //! @brief mean_q: Mean value of each atomic position >> double* mean_q; >> PetscCall(DMSwarmGetField(atomistic_data, DMSwarmPICField_coor, & >> blocksize, >> NULL, (void**)&mean_q)); >> >> cnt = 0; >> for (PetscInt site_i = 0; site_i < n_atoms_local; site_i++) { >> if (cnt < n_atoms) { >> mean_q[blocksize * cnt + 0] = Simulation_file.mean_q[cnt * dim + 0]; >> mean_q[blocksize * cnt + 1] = Simulation_file.mean_q[cnt * dim + 1]; >> mean_q[blocksize * cnt + 2] = Simulation_file.mean_q[cnt * dim + 2]; >> >> cnt++; >> } >> } >> PetscCall(DMSwarmRestoreField(atomistic_data, DMSwarmPICField_coor, >> &blocksize, NULL, (void**)&mean_q)); >> >> >> >> >> >> On 4 Nov 2023, at 15:50, MIGUEL MOLINOS PEREZ wrote: >> >> ?Thank you Mark! I will have a look to it. >> >> Best, >> Miguel >> >> >> On 4 Nov 2023, at 13:54, Matthew Knepley wrote: >> >> ? >> On Sat, Nov 4, 2023 at 8:40?AM Mark Adams wrote: >> >>> Hi MIGUEL, >>> >>> This might be a good place to start: https://petsc.org/main/manual/vec/ >>> Feel free to ask more specific questions, but the docs are a good place >>> to start. 
>>> >>> Thanks, >>> Mark >>> >>> On Fri, Nov 3, 2023 at 5:19?AM MIGUEL MOLINOS PEREZ >>> wrote: >>> >>>> Dear all, >>>> >>>> I am currently working on the development of a in-house molecular >>>> dynamics code using PETSc and C++. So far the code works great, however it >>>> is a little bit slow since I am not exploiting MPI for PETSc vectors. I was >>>> wondering if there is a way to perform the domain decomposition efficiently >>>> using some PETSc functionality. Any feedback is highly appreciated. >>>> >>> >> It sounds like you mean "is there a way to specify a communication >> construct that can send my particle >> information automatically". We use PetscSF for that. You can see how this >> works with the DMSwarm class, which represents a particle discretization. >> You can either use that, or if it does not work for you, do the same things >> with your class. >> >> Thanks, >> >> Matt >> >> >>> Best regards, >>>> Miguel >>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Tue Dec 12 20:53:36 2023 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Tue, 12 Dec 2023 19:53:36 -0700 Subject: [petsc-users] valgrind errors In-Reply-To: References: Message-ID: I was able to reproduce it. Let me ask MPICH developers. --Junchao Zhang On Tue, Dec 12, 2023 at 3:06?PM Randall Mackie wrote: > It now seems to me that petsc+mpich is no longer valgrind clean, or I am > doing something wrong. 
> > A simple program: > > > Program test > > #include "petsc/finclude/petscsys.h" > use petscsys > > PetscInt :: ierr > > call PetscInitialize(PETSC_NULL_CHARACTER,ierr) > call PetscFinalize(ierr) > > end program test > > > PETSc compiled in debug mode, complex scalars, and download-mpich, when > run with valgrind generates errors like these: > > ==3997== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==3997== at 0x8C31867: writev (writev.c:26) > ==3997== by 0x9C20DE4: MPL_large_writev (mpl_sock.c:31) > ==3997== by 0x9BF1050: MPIDI_CH3I_Sock_writev (sock.c:2689) > ==3997== by 0x9BF9812: MPIDI_CH3_iStartMsgv (ch3_istartmsgv.c:92) > ==3997== by 0x9BA7790: MPIDI_CH3_EagerContigSend (ch3u_eager.c:191) > ==3997== by 0x9BCA7EC: MPID_Send (mpid_send.c:132) > ==3997== by 0x9BCAC64: MPID_Send_coll (mpid_send.c:206) > ==3997== by 0x9A2AC7C: MPIC_Send (helper_fns.c:126) > ==3997== by 0x993A645: MPIR_Bcast_intra_binomial > (bcast_intra_binomial.c:146) > ==3997== by 0x99FF64A: MPIR_Bcast_allcomm_auto (mpir_coll.c:323) > ==3997== by 0x99FFC06: MPIR_Bcast_impl (mpir_coll.c:420) > ==3997== by 0x99FCF86: MPID_Bcast (mpid_coll.h:30) > ==3997== by 0x99FFE13: MPIR_Bcast (mpir_coll.c:465) > ==3997== by 0x974A513: internal_Bcast (bcast.c:93) > ==3997== by 0x974A72B: PMPI_Bcast (bcast.c:143) > ==3997== by 0x4B8D6DB: PETScParseFortranArgs_Private (zstart.c:182) > ==3997== by 0x4B8DDFA: PetscInitFortran_Private (zstart.c:200) > ==3997== by 0x4B34931: PetscInitialize_Common (pinit.c:974) > ==3997== by 0x4B8E8C7: petscinitializef_ (zstart.c:284) > ==3997== by 0x4959434: __petscsys_MOD_petscinitializenohelp > (petscsysmod.F90:374) > ==3997== Address 0x1ffeffcac0 is on thread 1's stack > ==3997== in frame #4, created by MPIDI_CH3_EagerContigSend > (ch3u_eager.c:160) > ==3997== Uninitialised value was created by a stack allocation > ==3997== at 0x9BA7601: MPIDI_CH3_EagerContigSend (ch3u_eager.c:160) > ==3997== > > ==3997== Syscall param write(buf) points to uninitialised byte(s) > ==3997== at 0x8C2B697: write (write.c:26) > ==3997== by 0x9BF0F1D: MPIDI_CH3I_Sock_write (sock.c:2614) > ==3997== by 0x9BF7AAE: MPIDI_CH3_iStartMsg (ch3_istartmsg.c:68) > ==3997== by 0x9BA7A27: MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:262) > ==3997== by 0x9BCA766: MPID_Send (mpid_send.c:119) > ==3997== by 0x9BCAC64: MPID_Send_coll (mpid_send.c:206) > ==3997== by 0x9A2AC7C: MPIC_Send (helper_fns.c:126) > ==3997== by 0x993A645: MPIR_Bcast_intra_binomial > (bcast_intra_binomial.c:146) > ==3997== by 0x99FF64A: MPIR_Bcast_allcomm_auto (mpir_coll.c:323) > ==3997== by 0x99FFC06: MPIR_Bcast_impl (mpir_coll.c:420) > ==3997== by 0x99FCF86: MPID_Bcast (mpid_coll.h:30) > ==3997== by 0x99FFE13: MPIR_Bcast (mpir_coll.c:465) > ==3997== by 0x974A513: internal_Bcast (bcast.c:93) > ==3997== by 0x974A72B: PMPI_Bcast (bcast.c:143) > ==3997== by 0x4DB95A2: PetscOptionsGetenv (pdisplay.c:61) > ==3997== by 0x4E0D745: PetscStrreplace (str.c:572) > ==3997== by 0x4AC8DEA: PetscOptionsFilename (options.c:416) > ==3997== by 0x4ACF0B5: PetscOptionsInsertFile (options.c:632) > ==3997== by 0x4AD3CB5: PetscOptionsInsert (options.c:861) > ==3997== by 0x4B8E0EF: PetscInitFortran_Private (zstart.c:206) > ==3997== Address 0x1ffeff7998 is on thread 1's stack > ==3997== in frame #3, created by MPIDI_CH3_EagerContigShortSend > (ch3u_eager.c:223) > ==3997== Uninitialised value was created by a stack allocation > ==3997== at 0x9BA788F: MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:223) > ==3997== > > Is this a known issue or am I doing something wrong? 
> > Thanks, Randy > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Tue Dec 12 21:10:40 2023 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 12 Dec 2023 22:10:40 -0500 Subject: [petsc-users] Fortran Interface In-Reply-To: References: Message-ID: <94202A46-2221-4F08-8B35-1BE16C7937AC@petsc.dev> See https://petsc.gitlab.io/-/petsc/-/jobs/5739238224/artifacts/public/html/manual/fortran.html#ch-fortran and https://gitlab.com/petsc/petsc/-/merge_requests/7114 > On Dec 12, 2023, at 3:22?PM, Palmer, Bruce J via petsc-users wrote: > > What do you do with something like a void pointer? I?m looking at the TaoSetObjectiveAndGradient function and it wants to pass a void *ctx pointer. You can set this to null, but apparently you have to specify the type. What type should I use? Is there something called PETSC_NULL_VOID or PETSC_NULL_CONTEXT or do I use something else? > > From: Matthew Knepley > > Date: Tuesday, December 12, 2023 at 8:33 AM > To: Palmer, Bruce J > > Cc: petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] Fortran Interface > > Check twice before you click! This email originated from outside PNNL. > > On Tue, Dec 12, 2023 at 11:27?AM Palmer, Bruce J via petsc-users > wrote: > Does documentation for the PETSc fortran interface still exist? I looked at the web pages for 3.20 (petsc.org/release ) but if you go under the tab C/Fortran API, only descriptions for the C interface are there. > > I think after the most recent changes, the interface was supposed to be very close to C, so we just document the differences on specific pages, and put the general stuff here: > > https://petsc.org/release/manual/fortran/ > > Thanks, > > Matt > > Bruce Palmer > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From s_g at berkeley.edu Tue Dec 12 22:17:06 2023 From: s_g at berkeley.edu (Sanjay Govindjee) Date: Tue, 12 Dec 2023 20:17:06 -0800 Subject: [petsc-users] Fortran Interface In-Reply-To: <94202A46-2221-4F08-8B35-1BE16C7937AC@petsc.dev> References: <94202A46-2221-4F08-8B35-1BE16C7937AC@petsc.dev> Message-ID: did you mean to write type (userctx) ctx in this example? subroutine func(snes, x, f, ctx, ierr) SNES snes Vec x,f type (userctx) user PetscErrorCode ierr ... external func SNESSetFunction(snes, r, func, ctx, ierr) SNES snes Vec r PetscErrorCode ierr type (userctx) user On Tue, Dec 12, 2023 at 7:10?PM Barry Smith wrote: > > See > https://petsc.gitlab.io/-/petsc/-/jobs/5739238224/artifacts/public/html/manual/fortran.html#ch-fortran > and https://gitlab.com/petsc/petsc/-/merge_requests/7114 > > > > On Dec 12, 2023, at 3:22?PM, Palmer, Bruce J via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > What do you do with something like a void pointer? I?m looking at the > TaoSetObjectiveAndGradient function and it wants to pass a void *ctx > pointer. You can set this to null, but apparently you have to specify the > type. What type should I use? Is there something called PETSC_NULL_VOID or > PETSC_NULL_CONTEXT or do I use something else? > > > *From: *Matthew Knepley > *Date: *Tuesday, December 12, 2023 at 8:33 AM > *To: *Palmer, Bruce J > *Cc: *petsc-users at mcs.anl.gov > *Subject: *Re: [petsc-users] Fortran Interface > Check twice before you click! 
This email originated from outside PNNL. > > On Tue, Dec 12, 2023 at 11:27?AM Palmer, Bruce J via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Does documentation for the PETSc fortran interface still exist? I looked > at the web pages for 3.20 (petsc.org/release) but if you go under the tab > C/Fortran API, only descriptions for the C interface are there. > > > I think after the most recent changes, the interface was supposed to be > very close to C, so we just document the differences on specific pages, and > put the general stuff here: > > https://petsc.org/release/manual/fortran/ > > Thanks, > > Matt > > > Bruce Palmer > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Tue Dec 12 22:22:03 2023 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Tue, 12 Dec 2023 21:22:03 -0700 Subject: [petsc-users] valgrind errors In-Reply-To: References: Message-ID: MPICH folks confirmed it's an MPICH problem and an issue is created at https://github.com/pmodels/mpich/issues/6843 --Junchao Zhang On Tue, Dec 12, 2023 at 7:53?PM Junchao Zhang wrote: > I was able to reproduce it. Let me ask MPICH developers. > > --Junchao Zhang > > > On Tue, Dec 12, 2023 at 3:06?PM Randall Mackie > wrote: > >> It now seems to me that petsc+mpich is no longer valgrind clean, or I am >> doing something wrong. >> >> A simple program: >> >> >> Program test >> >> #include "petsc/finclude/petscsys.h" >> use petscsys >> >> PetscInt :: ierr >> >> call PetscInitialize(PETSC_NULL_CHARACTER,ierr) >> call PetscFinalize(ierr) >> >> end program test >> >> >> PETSc compiled in debug mode, complex scalars, and download-mpich, when >> run with valgrind generates errors like these: >> >> ==3997== Syscall param writev(vector[...]) points to uninitialised byte(s) >> ==3997== at 0x8C31867: writev (writev.c:26) >> ==3997== by 0x9C20DE4: MPL_large_writev (mpl_sock.c:31) >> ==3997== by 0x9BF1050: MPIDI_CH3I_Sock_writev (sock.c:2689) >> ==3997== by 0x9BF9812: MPIDI_CH3_iStartMsgv (ch3_istartmsgv.c:92) >> ==3997== by 0x9BA7790: MPIDI_CH3_EagerContigSend (ch3u_eager.c:191) >> ==3997== by 0x9BCA7EC: MPID_Send (mpid_send.c:132) >> ==3997== by 0x9BCAC64: MPID_Send_coll (mpid_send.c:206) >> ==3997== by 0x9A2AC7C: MPIC_Send (helper_fns.c:126) >> ==3997== by 0x993A645: MPIR_Bcast_intra_binomial >> (bcast_intra_binomial.c:146) >> ==3997== by 0x99FF64A: MPIR_Bcast_allcomm_auto (mpir_coll.c:323) >> ==3997== by 0x99FFC06: MPIR_Bcast_impl (mpir_coll.c:420) >> ==3997== by 0x99FCF86: MPID_Bcast (mpid_coll.h:30) >> ==3997== by 0x99FFE13: MPIR_Bcast (mpir_coll.c:465) >> ==3997== by 0x974A513: internal_Bcast (bcast.c:93) >> ==3997== by 0x974A72B: PMPI_Bcast (bcast.c:143) >> ==3997== by 0x4B8D6DB: PETScParseFortranArgs_Private (zstart.c:182) >> ==3997== by 0x4B8DDFA: PetscInitFortran_Private (zstart.c:200) >> ==3997== by 0x4B34931: PetscInitialize_Common (pinit.c:974) >> ==3997== by 0x4B8E8C7: petscinitializef_ (zstart.c:284) >> ==3997== by 0x4959434: __petscsys_MOD_petscinitializenohelp >> (petscsysmod.F90:374) >> ==3997== Address 0x1ffeffcac0 is on thread 1's stack >> ==3997== in frame #4, created by MPIDI_CH3_EagerContigSend >> (ch3u_eager.c:160) >> ==3997== Uninitialised value was created by a stack allocation >> ==3997== at 0x9BA7601: 
MPIDI_CH3_EagerContigSend (ch3u_eager.c:160) >> ==3997== >> >> ==3997== Syscall param write(buf) points to uninitialised byte(s) >> ==3997== at 0x8C2B697: write (write.c:26) >> ==3997== by 0x9BF0F1D: MPIDI_CH3I_Sock_write (sock.c:2614) >> ==3997== by 0x9BF7AAE: MPIDI_CH3_iStartMsg (ch3_istartmsg.c:68) >> ==3997== by 0x9BA7A27: MPIDI_CH3_EagerContigShortSend >> (ch3u_eager.c:262) >> ==3997== by 0x9BCA766: MPID_Send (mpid_send.c:119) >> ==3997== by 0x9BCAC64: MPID_Send_coll (mpid_send.c:206) >> ==3997== by 0x9A2AC7C: MPIC_Send (helper_fns.c:126) >> ==3997== by 0x993A645: MPIR_Bcast_intra_binomial >> (bcast_intra_binomial.c:146) >> ==3997== by 0x99FF64A: MPIR_Bcast_allcomm_auto (mpir_coll.c:323) >> ==3997== by 0x99FFC06: MPIR_Bcast_impl (mpir_coll.c:420) >> ==3997== by 0x99FCF86: MPID_Bcast (mpid_coll.h:30) >> ==3997== by 0x99FFE13: MPIR_Bcast (mpir_coll.c:465) >> ==3997== by 0x974A513: internal_Bcast (bcast.c:93) >> ==3997== by 0x974A72B: PMPI_Bcast (bcast.c:143) >> ==3997== by 0x4DB95A2: PetscOptionsGetenv (pdisplay.c:61) >> ==3997== by 0x4E0D745: PetscStrreplace (str.c:572) >> ==3997== by 0x4AC8DEA: PetscOptionsFilename (options.c:416) >> ==3997== by 0x4ACF0B5: PetscOptionsInsertFile (options.c:632) >> ==3997== by 0x4AD3CB5: PetscOptionsInsert (options.c:861) >> ==3997== by 0x4B8E0EF: PetscInitFortran_Private (zstart.c:206) >> ==3997== Address 0x1ffeff7998 is on thread 1's stack >> ==3997== in frame #3, created by MPIDI_CH3_EagerContigShortSend >> (ch3u_eager.c:223) >> ==3997== Uninitialised value was created by a stack allocation >> ==3997== at 0x9BA788F: MPIDI_CH3_EagerContigShortSend >> (ch3u_eager.c:223) >> ==3997== >> >> Is this a known issue or am I doing something wrong? >> >> Thanks, Randy >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From onur.notonur at proton.me Wed Dec 13 02:12:51 2023 From: onur.notonur at proton.me (onur.notonur) Date: Wed, 13 Dec 2023 08:12:51 +0000 Subject: [petsc-users] DMPlex "Could not find orientation for quadrilateral" In-Reply-To: References: Message-ID: Hi, This page explains polyMesh in the section 4.1.2: https://www.openfoam.com/documentation/user-guide/4-mesh-generation-and-conversion/4.1-mesh-description (Also I added my polyMesh files as attachment.) After my first email I generated valid mesh by finding 2 of the 6 faces which have no common vertex, then arranging them to look like this: 1-2-3-4 is my first face, 5-6-7-8 is my second face. But this approach isn't a general solution (but currently it works because my mesh consists of only hexahedral elements), If I could learn the true solution I'd be happy! Thank you again 1 2 +------+. |`. | `. | 4+--+---+3 | | | | 5+---+--+8 | `. | `.| 6+------+7 Sent with [Proton Mail](https://proton.me/) secure email. On Wednesday, December 13th, 2023 at 2:16 AM, Matthew Knepley wrote: > On Tue, Dec 12, 2023 at 12:22?PM onur.notonur via petsc-users wrote: > >> Hi, >> >> I hope this email finds you well. I am currently working on importing an OpenFOAM PolyMesh into DMPlex, and I've encountered an issue. The PolyMesh format includes face owner cells/neighbor cells and face-to-vertex connectivity. I was using the "DMPlexCreateFromCellListPetsc()" function, which required cell-to-vertex connectivity. However, when attempting to create the cell connectivity using an edge loop [p_0, p_1, ..., p_7] (p_n and p_(n+1) are valid edges in my mesh), I encountered an error stating, "Could not find orientation for quadrilateral." 
>> >> (Actually at first, I generated the connectivity list by simply creating a cell-to-face list and then using that to create a cell-to-vertex list. (just map over the list and remove duplicates) This created a DMPlex successfully, however, resulted in a mesh that was incorrect when looking with ParaView. I think that was because of I stated wrong edge loop to create cells) >> >> I understand that I may need to follow a different format for connectivity, but I'm not sure what that format is. My current mesh is hexahedral, consisting of 8 corner elements(if important). I would appreciate any guidance on a general approach to address this issue. > > Can you start by giving the PolyMesh format, or some URL with it documented? > > Thanks, > > Matt > >> Thank you for your time and assistance. >> >> Best, >> Onur >> >> Sent with Proton Mail secure email. > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > [https://www.cse.buffalo.edu/~knepley/](http://www.cse.buffalo.edu/~knepley/) -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: polyMesh.zip Type: application/zip Size: 426688 bytes Desc: not available URL: From 1807580692 at qq.com Wed Dec 13 02:51:09 2023 From: 1807580692 at qq.com (=?gb18030?B?MTgwNzU4MDY5Mg==?=) Date: Wed, 13 Dec 2023 16:51:09 +0800 Subject: [petsc-users] (no subject) Message-ID: Hello, I have encountered some problems. Here are some of my configurations. OS Version and Type:  Linux daihuanhe-Aspire-A315-55G 5.15.0-89-generic #99~20.04.1-Ubuntu SMP Thu Nov 2 15:16:47 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux PETSc Version: #define PETSC_VERSION_RELEASE    1 #define PETSC_VERSION_MAJOR      3 #define PETSC_VERSION_MINOR      19 #define PETSC_VERSION_SUBMINOR   0 #define PETSC_RELEASE_DATE       "Mar 30, 2023" #define PETSC_VERSION_DATE       "unknown" MPI implementation: MPICH Compiler and version: Gnu C The problem is when I type ?mpiexec -n 4 ./ex19_1 -lidvelocity 100 -prandtl 0.72 -grashof 10000 -da_grid_x 128 -da_grid_y 128 -snes_type newtonls -pc_type fieldsplit -pc_fieldsplit_type multiplicative -pc_fieldsplit_block_size 4 -pc_fieldsplit_0_fields 0,2 -pc_fieldsplit_1_fields 1,3 -fieldsplit_0_pc_type asm -fieldsplit_0_pc_asm_type restrict -fieldsplit_0_pc_asm_overlap 5 -fieldsplit_0_sub_pc_type lu -fieldsplit_1_pc_type asm -fieldsplit_1_pc_asm_type restrict -fieldsplit_1_pc_asm_overlap 5 -fieldsplit_1_sub_pc_type lu -snes_monitor -snes_converged_reason  -fieldsplit_ksp_type gmres -fieldsplit_0_ksp_atol 1e-10 -fieldsplit_0_ksp_rtol 1e-8 -fieldsplit_1_ksp_atol 1e-10 -fieldsplit_1_ksp_rtol 1e-8? in the command line, where my path is /petsc/src/snes/tutorials. It returns ?lid velocity = 100., prandtl # = 0.72, grashof # = 10000.   0 SNES Function norm 1.125212317214e+03 [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Invalid argument [0]PETSC ERROR: Block size 2 is incompatible with the indices: non consecutive indices 0 2 [0]PETSC ERROR: WARNING! There are unused option(s) set! Could be the program crashed before usage or a spelling mistake, etc! 
[0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [1]PETSC ERROR: Invalid argument [1]PETSC ERROR: Block size 2 is incompatible with the indices: non consecutive indices 16384 16386 [1]PETSC ERROR: WARNING! There are unused option(s) set! Could be the program crashed before usage or a spelling mistake, etc! [1]PETSC ERROR:   Option left: name:-fieldsplit_0_ksp_atol value: 1e-10 source: command line [2]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [2]PETSC ERROR: Invalid argument [2]PETSC ERROR: Block size 2 is incompatible with the indices: non consecutive indices 32768 32770 [2]PETSC ERROR: WARNING! There are unused option(s) set! Could be the program crashed before usage or a spelling mistake, etc! [2]PETSC ERROR:   Option left: name:-fieldsplit_0_ksp_atol value: 1e-10 source: command line [2]PETSC ERROR:   Option left: name:-fieldsplit_0_ksp_rtol value: 1e-8 source: command line [2]PETSC ERROR: [3]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [3]PETSC ERROR: Invalid argument [3]PETSC ERROR: Block size 2 is incompatible with the indices: non consecutive indices 49152 49154 [3]PETSC ERROR: WARNING! There are unused option(s) set! Could be the program crashed before usage or a spelling mistake, etc! [3]PETSC ERROR:   Option left: name:-fieldsplit_0_ksp_atol value: 1e-10 source: command line [3]PETSC ERROR:   Option left: name:-fieldsplit_0_ksp_rtol value: 1e-8 source: command line [3]PETSC ERROR:   Option left: name:-fieldsplit_0_ksp_atol value: 1e-10 source: command line [0]PETSC ERROR:   Option left: name:-fieldsplit_0_ksp_rtol value: 1e-8 source: command line [0]PETSC ERROR:   Option left: name:-fieldsplit_0_pc_asm_overlap value: 5 source: command line [0]PETSC ERROR:   Option left: name:-fieldsplit_0_pc_asm_type value: restrict source: command line [0]PETSC ERROR:   Option left: name:-fieldsplit_0_pc_type value: asm source: command line [0]PETSC ERROR:   Option left: name:-fieldsplit_0_sub_pc_type value: lu source: command line [0]PETSC ERROR:   Option left: name:-fieldsplit_1_ksp_atol value: 1e-10 source: command line [0]PETSC ERROR:   Option left: name:-fieldsplit_1_ksp_rtol value: 1e-8 source: command line [0]PETSC ERROR:   Option left: name:-fieldsplit_1_pc_asm_overlap value: 5 source: command line [0]PETSC ERROR: [1]PETSC ERROR:   Option left: name:-fieldsplit_0_ksp_rtol value: 1e-8 source: command line [1]PETSC ERROR:   Option left: name:-fieldsplit_0_pc_asm_overlap value: 5 source: command line [1]PETSC ERROR:   Option left: name:-fieldsplit_0_pc_asm_type value: restrict source: command line [1]PETSC ERROR:   Option left: name:-fieldsplit_0_pc_type value: asm source: command line   Option left: name:-fieldsplit_0_pc_asm_overlap value: 5 source: command line [2]PETSC ERROR:   Option left: name:-fieldsplit_0_pc_asm_type value: restrict source: command line [2]PETSC ERROR:   Option left: name:-fieldsplit_0_pc_type value: asm source: command line [2]PETSC ERROR:   Option left: name:-fieldsplit_0_sub_pc_type value: lu source: command line [2]PETSC ERROR:   Option left: name:-fieldsplit_1_ksp_atol value: 1e-10 source: command line [2]PETSC ERROR:   Option left: name:-fieldsplit_0_pc_asm_overlap value: 5 source: command line [3]PETSC ERROR:   Option left: name:-fieldsplit_0_pc_asm_type value: restrict source: command line [3]PETSC ERROR:   Option left: 
name:-fieldsplit_0_pc_type value: asm source: command line [3]PETSC ERROR:   Option left: name:-fieldsplit_0_sub_pc_type value: lu source: command line   Option left: name:-fieldsplit_1_pc_asm_type value: restrict source: command line [0]PETSC ERROR:   Option left: name:-fieldsplit_1_pc_type value: asm source: command line [0]PETSC ERROR:   Option left: name:-fieldsplit_1_sub_pc_type value: lu source: command line [0]PETSC ERROR:   Option left: name:-fieldsplit_ksp_type value: gmres source: command line [0]PETSC ERROR:   Option left: name:-snes_converged_reason (no value) source: command line [1]PETSC ERROR:   Option left: name:-fieldsplit_0_sub_pc_type value: lu source: command line [1]PETSC ERROR:   Option left: name:-fieldsplit_1_ksp_atol value: 1e-10 source: command line [1]PETSC ERROR:   Option left: name:-fieldsplit_1_ksp_rtol value: 1e-8 source: command line [2]PETSC ERROR:   Option left: name:-fieldsplit_1_pc_asm_overlap value: 5 source: command line [2]PETSC ERROR:   Option left: name:-fieldsplit_1_pc_asm_type value: restrict source: command line [2]PETSC ERROR:   Option left: name:-fieldsplit_1_pc_type value: asm source: command line [3]PETSC ERROR:   Option left: name:-fieldsplit_1_ksp_atol value: 1e-10 source: command line [3]PETSC ERROR:   Option left: name:-fieldsplit_1_ksp_rtol value: 1e-8 source: command line [3]PETSC ERROR: [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.20.2, unknown [0]PETSC ERROR: ./ex19_1 on a arch-linux-c-debug named daihuanhe-Aspire-A315-55G by daihuanhe Wed Dec 13 16:47:08 2023   Option left: name:-fieldsplit_1_ksp_rtol value: 1e-8 source: command line [1]PETSC ERROR:   Option left: name:-fieldsplit_1_pc_asm_overlap value: 5 source: command line [1]PETSC ERROR:   Option left: name:-fieldsplit_1_pc_asm_type value: restrict source: command line [1]PETSC ERROR: [2]PETSC ERROR:   Option left: name:-fieldsplit_1_sub_pc_type value: lu source: command line [2]PETSC ERROR:   Option left: name:-fieldsplit_ksp_type value: gmres source: command line [2]PETSC ERROR:   Option left: name:-snes_converged_reason (no value) source: command line   Option left: name:-fieldsplit_1_pc_asm_overlap value: 5 source: command line [3]PETSC ERROR:   Option left: name:-fieldsplit_1_pc_asm_type value: restrict source: command line [3]PETSC ERROR:   Option left: name:-fieldsplit_1_pc_type value: asm source: command line [3]PETSC ERROR: [0]PETSC ERROR: Configure options   Option left: name:-fieldsplit_1_pc_type value: asm source: command line [1]PETSC ERROR:   Option left: name:-fieldsplit_1_sub_pc_type value: lu source: command line [1]PETSC ERROR:   Option left: name:-fieldsplit_ksp_type value: gmres source: command line [2]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. [2]PETSC ERROR: Petsc Release Version 3.20.2, unknown [2]PETSC ERROR:   Option left: name:-fieldsplit_1_sub_pc_type value: lu source: command line [3]PETSC ERROR:   Option left: name:-fieldsplit_ksp_type value: gmres source: command line [3]PETSC ERROR:   Option left: name:-snes_converged_reason (no value) source: command line [0]PETSC ERROR: #1 ISSetBlockSize() at /home/daihuanhe/petsc-v3.20.2/src/vec/is/is/interface/index.c:1933 [0]PETSC ERROR: [1]PETSC ERROR:   Option left: name:-snes_converged_reason (no value) source: command line [1]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
[1]PETSC ERROR: Petsc Release Version 3.20.2, unknown ./ex19_1 on a arch-linux-c-debug named daihuanhe-Aspire-A315-55G by daihuanhe Wed Dec 13 16:47:08 2023 [2]PETSC ERROR: Configure options [2]PETSC ERROR: [3]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. [3]PETSC ERROR: Petsc Release Version 3.20.2, unknown [3]PETSC ERROR: ./ex19_1 on a arch-linux-c-debug named daihuanhe-Aspire-A315-55G by daihuanhe Wed Dec 13 16:47:08 2023 #2 PCSetUp_FieldSplit() at /home/daihuanhe/petsc-v3.20.2/src/ksp/pc/impls/fieldsplit/fieldsplit.c:633 [0]PETSC ERROR: #3 PCSetUp() at /home/daihuanhe/petsc-v3.20.2/src/ksp/pc/interface/precon.c:1080 [1]PETSC ERROR: ./ex19_1 on a arch-linux-c-debug named daihuanhe-Aspire-A315-55G by daihuanhe Wed Dec 13 16:47:08 2023 [1]PETSC ERROR: Configure options #1 ISSetBlockSize() at /home/daihuanhe/petsc-v3.20.2/src/vec/is/is/interface/index.c:1933 [2]PETSC ERROR: #2 PCSetUp_FieldSplit() at /home/daihuanhe/petsc-v3.20.2/src/ksp/pc/impls/fieldsplit/fieldsplit.c:633 [2]PETSC ERROR: [3]PETSC ERROR: Configure options [3]PETSC ERROR: #1 ISSetBlockSize() at /home/daihuanhe/petsc-v3.20.2/src/vec/is/is/interface/index.c:1933 [0]PETSC ERROR: #4 KSPSetUp() at /home/daihuanhe/petsc-v3.20.2/src/ksp/ksp/interface/itfunc.c:415 [0]PETSC ERROR: #5 KSPSolve_Private() at /home/daihuanhe/petsc-v3.20.2/src/ksp/ksp/interface/itfunc.c:836 [1]PETSC ERROR: #1 ISSetBlockSize() at /home/daihuanhe/petsc-v3.20.2/src/vec/is/is/interface/index.c:1933 [1]PETSC ERROR: #2 PCSetUp_FieldSplit() at /home/daihuanhe/petsc-v3.20.2/src/ksp/pc/impls/fieldsplit/fieldsplit.c:633 #3 PCSetUp() at /home/daihuanhe/petsc-v3.20.2/src/ksp/pc/interface/precon.c:1080 [2]PETSC ERROR: #4 KSPSetUp() at /home/daihuanhe/petsc-v3.20.2/src/ksp/ksp/interface/itfunc.c:415 [2]PETSC ERROR: #5 KSPSolve_Private() at /home/daihuanhe/petsc-v3.20.2/src/ksp/ksp/interface/itfunc.c:836 [3]PETSC ERROR: #2 PCSetUp_FieldSplit() at /home/daihuanhe/petsc-v3.20.2/src/ksp/pc/impls/fieldsplit/fieldsplit.c:633 [3]PETSC ERROR: [0]PETSC ERROR: #6 KSPSolve() at /home/daihuanhe/petsc-v3.20.2/src/ksp/ksp/interface/itfunc.c:1083 [0]PETSC ERROR: #7 SNESSolve_NEWTONLS() at /home/daihuanhe/petsc-v3.20.2/src/snes/impls/ls/ls.c:215 [0]PETSC ERROR: [1]PETSC ERROR: #3 PCSetUp() at /home/daihuanhe/petsc-v3.20.2/src/ksp/pc/interface/precon.c:1080 [1]PETSC ERROR: #4 KSPSetUp() at /home/daihuanhe/petsc-v3.20.2/src/ksp/ksp/interface/itfunc.c:415 [1]PETSC ERROR: [2]PETSC ERROR: #6 KSPSolve() at /home/daihuanhe/petsc-v3.20.2/src/ksp/ksp/interface/itfunc.c:1083 [2]PETSC ERROR: #7 SNESSolve_NEWTONLS() at /home/daihuanhe/petsc-v3.20.2/src/snes/impls/ls/ls.c:215 [2]PETSC ERROR: #3 PCSetUp() at /home/daihuanhe/petsc-v3.20.2/src/ksp/pc/interface/precon.c:1080 [3]PETSC ERROR: #4 KSPSetUp() at /home/daihuanhe/petsc-v3.20.2/src/ksp/ksp/interface/itfunc.c:415 [3]PETSC ERROR: #5 KSPSolve_Private() at /home/daihuanhe/petsc-v3.20.2/src/ksp/ksp/interface/itfunc.c:836 [3]PETSC ERROR: #6 KSPSolve() at /home/daihuanhe/petsc-v3.20.2/src/ksp/ksp/interface/itfunc.c:1083 #8 SNESSolve() at /home/daihuanhe/petsc-v3.20.2/src/snes/interface/snes.c:4659 [0]PETSC ERROR: #9 main() at ex19_1.c:159 [0]PETSC ERROR: PETSc Option Table entries: [0]PETSC ERROR: -da_grid_x 128 (source: command line) [0]PETSC ERROR: #5 KSPSolve_Private() at /home/daihuanhe/petsc-v3.20.2/src/ksp/ksp/interface/itfunc.c:836 [1]PETSC ERROR: #6 KSPSolve() at /home/daihuanhe/petsc-v3.20.2/src/ksp/ksp/interface/itfunc.c:1083 [1]PETSC ERROR: #7 SNESSolve_NEWTONLS() at 
/home/daihuanhe/petsc-v3.20.2/src/snes/impls/ls/ls.c:215 [1]PETSC ERROR: #8 SNESSolve() at /home/daihuanhe/petsc-v3.20.2/src/snes/interface/snes.c:4659 #8 SNESSolve() at /home/daihuanhe/petsc-v3.20.2/src/snes/interface/snes.c:4659 [2]PETSC ERROR: #9 main() at ex19_1.c:159 [2]PETSC ERROR: PETSc Option Table entries: [2]PETSC ERROR: -da_grid_x 128 (source: command line) [2]PETSC ERROR: -da_grid_y 128 (source: command line) [3]PETSC ERROR: #7 SNESSolve_NEWTONLS() at /home/daihuanhe/petsc-v3.20.2/src/snes/impls/ls/ls.c:215 [3]PETSC ERROR: #8 SNESSolve() at /home/daihuanhe/petsc-v3.20.2/src/snes/interface/snes.c:4659 [3]PETSC ERROR: -da_grid_y 128 (source: command line) [0]PETSC ERROR: -fieldsplit_0_ksp_atol 1e-10 (source: command line) [0]PETSC ERROR: -fieldsplit_0_ksp_rtol 1e-8 (source: command line) [0]PETSC ERROR: -fieldsplit_0_pc_asm_overlap 5 (source: command line) [1]PETSC ERROR: #9 main() at ex19_1.c:159 [1]PETSC ERROR: PETSc Option Table entries: [1]PETSC ERROR: -da_grid_x 128 (source: command line) [2]PETSC ERROR: -fieldsplit_0_ksp_atol 1e-10 (source: command line) [2]PETSC ERROR: -fieldsplit_0_ksp_rtol 1e-8 (source: command line) [2]PETSC ERROR: -fieldsplit_0_pc_asm_overlap 5 (source: command line) [2]PETSC ERROR: #9 main() at ex19_1.c:159 [3]PETSC ERROR: PETSc Option Table entries: [3]PETSC ERROR: -da_grid_x 128 (source: command line) [3]PETSC ERROR: [0]PETSC ERROR: -fieldsplit_0_pc_asm_type restrict (source: command line) [0]PETSC ERROR: -fieldsplit_0_pc_type asm (source: command line) [0]PETSC ERROR: -fieldsplit_0_sub_pc_type lu (source: command line) [0]PETSC ERROR: [1]PETSC ERROR: -da_grid_y 128 (source: command line) [1]PETSC ERROR: -fieldsplit_0_ksp_atol 1e-10 (source: command line) [1]PETSC ERROR: -fieldsplit_0_ksp_rtol 1e-8 (source: command line) -fieldsplit_0_pc_asm_type restrict (source: command line) [2]PETSC ERROR: -fieldsplit_0_pc_type asm (source: command line) [2]PETSC ERROR: -fieldsplit_0_sub_pc_type lu (source: command line) [2]PETSC ERROR: -fieldsplit_1_ksp_atol 1e-10 (source: command line) -da_grid_y 128 (source: command line) [3]PETSC ERROR: -fieldsplit_0_ksp_atol 1e-10 (source: command line) [3]PETSC ERROR: -fieldsplit_0_ksp_rtol 1e-8 (source: command line) [3]PETSC ERROR: -fieldsplit_1_ksp_atol 1e-10 (source: command line) [0]PETSC ERROR: -fieldsplit_1_ksp_rtol 1e-8 (source: command line) [0]PETSC ERROR: -fieldsplit_1_pc_asm_overlap 5 (source: command line) [0]PETSC ERROR: [1]PETSC ERROR: -fieldsplit_0_pc_asm_overlap 5 (source: command line) [1]PETSC ERROR: -fieldsplit_0_pc_asm_type restrict (source: command line) [1]PETSC ERROR: -fieldsplit_0_pc_type asm (source: command line) [2]PETSC ERROR: -fieldsplit_1_ksp_rtol 1e-8 (source: command line) [2]PETSC ERROR: -fieldsplit_1_pc_asm_overlap 5 (source: command line) [2]PETSC ERROR: -fieldsplit_1_pc_asm_type restrict (source: command line) [2]PETSC ERROR: -fieldsplit_0_pc_asm_overlap 5 (source: command line) [3]PETSC ERROR: -fieldsplit_0_pc_asm_type restrict (source: command line) [3]PETSC ERROR: -fieldsplit_0_pc_type asm (source: command line) [3]PETSC ERROR: -fieldsplit_1_pc_asm_type restrict (source: command line) [0]PETSC ERROR: -fieldsplit_1_pc_type asm (source: command line) [0]PETSC ERROR: -fieldsplit_1_sub_pc_type lu (source: command line) [0]PETSC ERROR: [1]PETSC ERROR: -fieldsplit_0_sub_pc_type lu (source: command line) [1]PETSC ERROR: -fieldsplit_1_ksp_atol 1e-10 (source: command line) [1]PETSC ERROR: -fieldsplit_1_ksp_rtol 1e-8 (source: command line) -fieldsplit_1_pc_type asm (source: command line) 
[2]PETSC ERROR: -fieldsplit_1_sub_pc_type lu (source: command line) [2]PETSC ERROR: -fieldsplit_ksp_type gmres (source: command line) [2]PETSC ERROR: -grashof 10000 (source: command line) -fieldsplit_0_sub_pc_type lu (source: command line) [3]PETSC ERROR: -fieldsplit_1_ksp_atol 1e-10 (source: command line) [3]PETSC ERROR: -fieldsplit_1_ksp_rtol 1e-8 (source: command line) [3]PETSC ERROR: -fieldsplit_ksp_type gmres (source: command line) [0]PETSC ERROR: -grashof 10000 (source: command line) [0]PETSC ERROR: -lidvelocity 100 (source: command line) [0]PETSC ERROR: -pc_fieldsplit_0_fields 0,2 (source: command line) [1]PETSC ERROR: -fieldsplit_1_pc_asm_overlap 5 (source: command line) [1]PETSC ERROR: -fieldsplit_1_pc_asm_type restrict (source: command line) [1]PETSC ERROR: -fieldsplit_1_pc_type asm (source: command line) [2]PETSC ERROR: -lidvelocity 100 (source: command line) [2]PETSC ERROR: -pc_fieldsplit_0_fields 0,2 (source: command line) [2]PETSC ERROR: -pc_fieldsplit_1_fields 1,3 (source: command line) [2]PETSC ERROR: -fieldsplit_1_pc_asm_overlap 5 (source: command line) [3]PETSC ERROR: -fieldsplit_1_pc_asm_type restrict (source: command line) [3]PETSC ERROR: -fieldsplit_1_pc_type asm (source: command line) [3]PETSC ERROR: [0]PETSC ERROR: -pc_fieldsplit_1_fields 1,3 (source: command line) [0]PETSC ERROR: -pc_fieldsplit_block_size 4 (source: command line) [0]PETSC ERROR: -pc_fieldsplit_type multiplicative (source: command line) [1]PETSC ERROR: -fieldsplit_1_sub_pc_type lu (source: command line) [1]PETSC ERROR: -fieldsplit_ksp_type gmres (source: command line) [1]PETSC ERROR: -grashof 10000 (source: command line) -pc_fieldsplit_block_size 4 (source: command line) [2]PETSC ERROR: -pc_fieldsplit_type multiplicative (source: command line) [2]PETSC ERROR: -pc_type fieldsplit (source: command line) [2]PETSC ERROR: -prandtl 0.72 (source: command line) -fieldsplit_1_sub_pc_type lu (source: command line) [3]PETSC ERROR: -fieldsplit_ksp_type gmres (source: command line) [3]PETSC ERROR: -grashof 10000 (source: command line) [3]PETSC ERROR: [0]PETSC ERROR: -pc_type fieldsplit (source: command line) [0]PETSC ERROR: -prandtl 0.72 (source: command line) [0]PETSC ERROR: -snes_converged_reason (source: command line) [1]PETSC ERROR: -lidvelocity 100 (source: command line) [1]PETSC ERROR: -pc_fieldsplit_0_fields 0,2 (source: command line) [1]PETSC ERROR: -pc_fieldsplit_1_fields 1,3 (source: command line) [2]PETSC ERROR: -snes_converged_reason (source: command line) [2]PETSC ERROR: -snes_monitor (source: command line) [2]PETSC ERROR: -snes_type newtonls (source: command line) -lidvelocity 100 (source: command line) [3]PETSC ERROR: -pc_fieldsplit_0_fields 0,2 (source: command line) [3]PETSC ERROR: -pc_fieldsplit_1_fields 1,3 (source: command line) [3]PETSC ERROR: [0]PETSC ERROR: -snes_monitor (source: command line) [0]PETSC ERROR: -snes_type newtonls (source: command line) [0]PETSC ERROR: [1]PETSC ERROR: -pc_fieldsplit_block_size 4 (source: command line) [1]PETSC ERROR: -pc_fieldsplit_type multiplicative (source: command line) [1]PETSC ERROR: -pc_type fieldsplit (source: command line) [2]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- Abort(62) on node 2 (rank 0 in comm 16): application called MPI_Abort(MPI_COMM_SELF, 62) - process 0 -pc_fieldsplit_block_size 4 (source: command line) [3]PETSC ERROR: -pc_fieldsplit_type multiplicative (source: command line) [3]PETSC ERROR: -pc_type fieldsplit (source: command line) [3]PETSC ERROR: 
----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- Abort(62) on node 0 (rank 0 in comm 16): application called MPI_Abort(MPI_COMM_SELF, 62) - process 0 [1]PETSC ERROR: -prandtl 0.72 (source: command line) [1]PETSC ERROR: -snes_converged_reason (source: command line) [1]PETSC ERROR: -snes_monitor (source: command line) -prandtl 0.72 (source: command line) [3]PETSC ERROR: -snes_converged_reason (source: command line) [3]PETSC ERROR: -snes_monitor (source: command line) [1]PETSC ERROR: -snes_type newtonls (source: command line) [1]PETSC ERROR: [3]PETSC ERROR: -snes_type newtonls (source: command line) [3]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- Abort(62) on node 1 (rank 0 in comm 16): application called MPI_Abort(MPI_COMM_SELF, 62) - process 0 Abort(62) on node 3 (rank 0 in comm 16): application called MPI_Abort(MPI_COMM_SELF, 62) - process 0?. Please tell me what should I do? Thank you very much. 1807580692 1807580692 at qq.com -------------- next part -------------- An HTML attachment was scrubbed... URL: 
From mmolinos at us.es Wed Dec 13 04:18:50 2023 From: mmolinos at us.es (MIGUEL MOLINOS PEREZ) Date: Wed, 13 Dec 2023 10:18:50 +0000 Subject: [petsc-users] Domain decomposition in PETSc for Molecular Dynamics In-Reply-To: References: <2BEA961D-00D5-4880-A162-7262E398C048@us.es> <9072F31D-93C8-43B6-8D96-5DB4CFDECFF9@us.es> Message-ID: Thank you for the feedback. Seems like the second option is the right one (according to MD literature). Best, Miguel On 13 Dec 2023, at 01:43, Matthew Knepley wrote: On Tue, Dec 12, 2023 at 7:28 PM MIGUEL MOLINOS PEREZ > wrote: I meant the list of atoms which lies inside of a sphere of radius R_cutoff centered at the mean position of a given atom. Okay, this is possible in parallel, but would require hard work and I have not done it yet, although I think all the tools are coded. In serial, there are at least two ways to do it. First, you can use a k-d tree implementation, since they usually have the radius query. I have not put one of these in, because I did not like any implementation, but Underworld and Firedrake have and it works fine. Second, you can choose a grid size of R_cutoff for the background grid, and then check neighbors. This is probably how I would start. Thanks, Matt Best, Miguel On 13 Dec 2023, at 01:14, Matthew Knepley > wrote: On Tue, Dec 12, 2023 at 2:36 PM MIGUEL MOLINOS PEREZ > wrote: Dear Matthew and Mark, Thank you for your useful guidance. I have taken as a starting point the example in "dm/tutorials/swarm_ex3.c" to build a first approximation for domain decomposition in my molecular dynamics code (diffusive molecular dynamics to be more precise :-) ). And I must say that I am very happy with the result. However, in my journey integrating domain decomposition into my code, I am facing some new (and expected) problems. The first is in the implementation of the nearest neighbor algorithm (list of atoms closest to each atom). Can you help me understand this? For a given atom, there should be a single "closest" atom (barring degeneracies in distance). What do you mean by the list of closest atoms? Thanks, Matt My current approach to the problem is a brute force algorithm (double loop through the list of atoms and calculate the distance).
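[For reference, a minimal sketch of the background-grid (cell-list) search that Matt suggests above -- this is not code from the thread, and all names in it are made up. Atoms are binned into cubic cells of edge R_cutoff, so each atom is compared only against atoms in its own and the 26 surrounding cells instead of against every atom; periodic images and the ghost atoms needed across MPI ranks are deliberately left out of the sketch.]

#include <array>
#include <cmath>
#include <unordered_map>
#include <vector>

using Vec3 = std::array<double, 3>;

// Pack three (possibly negative) cell indices into one hashable key.
static long long cellKey(int cx, int cy, int cz) {
  const long long B = 1 << 19;  // supports cell indices in (-2^19, 2^19)
  return ((cx + B) * 2 * B + (cy + B)) * 2 * B + (cz + B);
}

// Build, for every atom, the list of atoms within r_cutoff of it.
std::vector<std::vector<int>> neighbourLists(const std::vector<Vec3>& q, double r_cutoff) {
  std::unordered_map<long long, std::vector<int>> cells;
  auto cellOf = [&](double x) { return (int)std::floor(x / r_cutoff); };

  // Bin every atom into a cubic cell of edge r_cutoff.
  for (int i = 0; i < (int)q.size(); ++i)
    cells[cellKey(cellOf(q[i][0]), cellOf(q[i][1]), cellOf(q[i][2]))].push_back(i);

  std::vector<std::vector<int>> neigh(q.size());
  const double r2 = r_cutoff * r_cutoff;
  for (int i = 0; i < (int)q.size(); ++i) {
    const int cx = cellOf(q[i][0]), cy = cellOf(q[i][1]), cz = cellOf(q[i][2]);
    // Only the 27 cells around atom i can contain neighbours within r_cutoff.
    for (int dx = -1; dx <= 1; ++dx)
      for (int dy = -1; dy <= 1; ++dy)
        for (int dz = -1; dz <= 1; ++dz) {
          auto it = cells.find(cellKey(cx + dx, cy + dy, cz + dz));
          if (it == cells.end()) continue;
          for (int j : it->second) {
            if (j == i) continue;
            const double ex = q[i][0] - q[j][0], ey = q[i][1] - q[j][1], ez = q[i][2] - q[j][2];
            if (ex * ex + ey * ey + ez * ez <= r2) neigh[i].push_back(j);
          }
        }
  }
  return neigh;
}

[The same binning also gives a natural criterion for which atoms have to be exchanged between neighbouring ranks before the search, which is the part that still needs the DMSwarm machinery discussed in this thread.]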
However, it seems that if I call the "neighbours" function after the "DMSwarmMigrate" function the search algorithm does not work correctly. My thoughts / hints are: * The two nested for loops should be done on the global indexing of the atoms instead of the local one (I don't know how to get this number). * If I print the mean position of atom #0 (for example) each range prints a different value of the average position. One of them is the correct position corresponding to site #0, the others are different (but identically labeled) atomic sites. Which means that the site_i index is not bijective. I believe that solving this problem will increase my understanding of the domain decomposition approach and may allow me to fix the remaining parts of my code. Any additional comments are greatly appreciated. For instance, I will be happy to be pointed to any piece of code (petsc examples for example) with solves a similar problem in order to self-learn learn by example. Many thanks in advance. Best, Miguel This is the piece of code (simplified) which computes the list of neighbours for each atomic site. DMD is a structure which contains the atomistic information (DMSWARM), and the background mesh and bounding cell (DMDA and DMShell) int neighbours(DMD* Simulation) { PetscFunctionBegin; PetscCallMPI(MPI_Comm_rank(PETSC_COMM_WORLD, &rank)); PetscCall(DMSwarmGetLocalSize(Simulation->atomistic_data, &n_atoms_local)); //! Get array with the mean position of the atoms DMSwarmGetField(Simulation->atomistic_data, DMSwarmPICField_coor, &blocksize, NULL, (void**)&mean_q_ptr); Eigen::Map mean_q(mean_q_ptr, n_atoms_local, dim); int* neigh = Simulation->neigh; int* numneigh = Simulation->numneigh; for (unsigned int site_i = 0; site_i < n_atoms_local; site_i++) { //! Get mean position of site i Eigen::Vector3d mean_q_i = mean_q.block<1, 3>(site_i, 0); //! Search neighbourhs in the main cell (+ periodic cells) for (unsigned site_j = 0; site_j < n_atoms_local; site_j++) { if (site_i != site_j) { //! Get mean position of site j in the periodic box Eigen::Vector3d mean_q_j = mean_q.block<1, 3>(site_j, 0); //! Check is site j is the neibourhood of the site i double norm_r_ij = (mean_q_i - mean_q_j).norm(); if ((norm_r_ij <= r_cutoff_ADP) && (numneigh[site_i] < maxneigh)) { neigh[site_i * maxneigh + numneigh[site_i]] = site_j; numneigh[site_i] += 1; } } } } // MPI for loop (site_i) DMSwarmRestoreField(Simulation->atomistic_data, DMSwarmPICField_coor, &blocksize, NULL, (void**)&mean_q_ptr); return EXIT_SUCCESS; } This is the piece of code that I use to read the atomic positions (mean_q) from a file: //! @brief mean_q: Mean value of each atomic position double* mean_q; PetscCall(DMSwarmGetField(atomistic_data, DMSwarmPICField_coor, &blocksize, NULL, (void**)&mean_q)); cnt = 0; for (PetscInt site_i = 0; site_i < n_atoms_local; site_i++) { if (cnt < n_atoms) { mean_q[blocksize * cnt + 0] = Simulation_file.mean_q[cnt * dim + 0]; mean_q[blocksize * cnt + 1] = Simulation_file.mean_q[cnt * dim + 1]; mean_q[blocksize * cnt + 2] = Simulation_file.mean_q[cnt * dim + 2]; cnt++; } } PetscCall(DMSwarmRestoreField(atomistic_data, DMSwarmPICField_coor, &blocksize, NULL, (void**)&mean_q)); On 4 Nov 2023, at 15:50, MIGUEL MOLINOS PEREZ > wrote: ?Thank you Mark! I will have a look to it. Best, Miguel On 4 Nov 2023, at 13:54, Matthew Knepley > wrote: ? 
On Sat, Nov 4, 2023 at 8:40?AM Mark Adams > wrote: Hi MIGUEL, This might be a good place to start: https://petsc.org/main/manual/vec/ Feel free to ask more specific questions, but the docs are a good place to start. Thanks, Mark On Fri, Nov 3, 2023 at 5:19?AM MIGUEL MOLINOS PEREZ > wrote: Dear all, I am currently working on the development of a in-house molecular dynamics code using PETSc and C++. So far the code works great, however it is a little bit slow since I am not exploiting MPI for PETSc vectors. I was wondering if there is a way to perform the domain decomposition efficiently using some PETSc functionality. Any feedback is highly appreciated. It sounds like you mean "is there a way to specify a communication construct that can send my particle information automatically". We use PetscSF for that. You can see how this works with the DMSwarm class, which represents a particle discretization. You can either use that, or if it does not work for you, do the same things with your class. Thanks, Matt Best regards, Miguel -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Wed Dec 13 05:38:15 2023 From: mfadams at lbl.gov (Mark Adams) Date: Wed, 13 Dec 2023 06:38:15 -0500 Subject: [petsc-users] Fortran Interface In-Reply-To: References: <94202A46-2221-4F08-8B35-1BE16C7937AC@petsc.dev> Message-ID: yep, that's a cut and paste bug. "user" is a common name for a user context. On Tue, Dec 12, 2023 at 11:17?PM Sanjay Govindjee via petsc-users < petsc-users at mcs.anl.gov> wrote: > did you mean to write > > type (userctx) ctx > > in this example? > > > subroutine func(snes, x, f, ctx, ierr) > SNES snes > Vec x,f > type (userctx) user > PetscErrorCode ierr > ... > > external func > SNESSetFunction(snes, r, func, ctx, ierr) > SNES snes > Vec r > PetscErrorCode ierr > type (userctx) user > > > > > On Tue, Dec 12, 2023 at 7:10?PM Barry Smith wrote: > >> >> See >> https://petsc.gitlab.io/-/petsc/-/jobs/5739238224/artifacts/public/html/manual/fortran.html#ch-fortran >> and https://gitlab.com/petsc/petsc/-/merge_requests/7114 >> >> >> >> On Dec 12, 2023, at 3:22?PM, Palmer, Bruce J via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> >> What do you do with something like a void pointer? I?m looking at the >> TaoSetObjectiveAndGradient function and it wants to pass a void *ctx >> pointer. You can set this to null, but apparently you have to specify the >> type. What type should I use? Is there something called PETSC_NULL_VOID or >> PETSC_NULL_CONTEXT or do I use something else? >> >> >> *From: *Matthew Knepley >> *Date: *Tuesday, December 12, 2023 at 8:33 AM >> *To: *Palmer, Bruce J >> *Cc: *petsc-users at mcs.anl.gov >> *Subject: *Re: [petsc-users] Fortran Interface >> Check twice before you click! This email originated from outside PNNL. 
>> >> On Tue, Dec 12, 2023 at 11:27?AM Palmer, Bruce J via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> >> Does documentation for the PETSc fortran interface still exist? I looked >> at the web pages for 3.20 (petsc.org/release) but if you go under the >> tab C/Fortran API, only descriptions for the C interface are there. >> >> >> I think after the most recent changes, the interface was supposed to be >> very close to C, so we just document the differences on specific pages, and >> put the general stuff here: >> >> https://petsc.org/release/manual/fortran/ >> >> Thanks, >> >> Matt >> >> >> Bruce Palmer >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Dec 13 08:27:50 2023 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 13 Dec 2023 09:27:50 -0500 Subject: [petsc-users] Fortran Interface In-Reply-To: References: <94202A46-2221-4F08-8B35-1BE16C7937AC@petsc.dev> Message-ID: <68FF47E6-F54A-4321-9C37-E256C66E21EF@petsc.dev> fixed > On Dec 12, 2023, at 11:17?PM, Sanjay Govindjee wrote: > > did you mean to write > type (userctx) ctx > in this example? > > subroutine func(snes, x, f, ctx, ierr) > SNES snes > Vec x,f > type (userctx) user > PetscErrorCode ierr > ... > > external func > SNESSetFunction(snes, r, func, ctx, ierr) > SNES snes > Vec r > PetscErrorCode ierr > type (userctx) user > > > > On Tue, Dec 12, 2023 at 7:10?PM Barry Smith > wrote: >> >> See https://petsc.gitlab.io/-/petsc/-/jobs/5739238224/artifacts/public/html/manual/fortran.html#ch-fortran and https://gitlab.com/petsc/petsc/-/merge_requests/7114 >> >> >> >>> On Dec 12, 2023, at 3:22?PM, Palmer, Bruce J via petsc-users > wrote: >>> >>> What do you do with something like a void pointer? I?m looking at the TaoSetObjectiveAndGradient function and it wants to pass a void *ctx pointer. You can set this to null, but apparently you have to specify the type. What type should I use? Is there something called PETSC_NULL_VOID or PETSC_NULL_CONTEXT or do I use something else? >>> >>> From: Matthew Knepley > >>> Date: Tuesday, December 12, 2023 at 8:33 AM >>> To: Palmer, Bruce J > >>> Cc: petsc-users at mcs.anl.gov > >>> Subject: Re: [petsc-users] Fortran Interface >>> >>> Check twice before you click! This email originated from outside PNNL. >>> >>> On Tue, Dec 12, 2023 at 11:27?AM Palmer, Bruce J via petsc-users > wrote: >>> Does documentation for the PETSc fortran interface still exist? I looked at the web pages for 3.20 (petsc.org/release ) but if you go under the tab C/Fortran API, only descriptions for the C interface are there. >>> >>> I think after the most recent changes, the interface was supposed to be very close to C, so we just document the differences on specific pages, and put the general stuff here: >>> >>> https://petsc.org/release/manual/fortran/ >>> >>> Thanks, >>> >>> Matt >>> >>> Bruce Palmer >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From coltonbryant2021 at u.northwestern.edu Wed Dec 13 10:21:43 2023 From: coltonbryant2021 at u.northwestern.edu (Colton Bryant) Date: Wed, 13 Dec 2023 10:21:43 -0600 Subject: [petsc-users] DMSTAG Gathering Vector on single process In-Reply-To: <73C92F1E-2DB7-416D-A694-AD293027E295@petsc.dev> References: <5A08B1BB-933D-4A46-8369-510D1C5AFDC6@petsc.dev> <73C92F1E-2DB7-416D-A694-AD293027E295@petsc.dev> Message-ID: Hi, Thanks for the help last week. The suggestions made the implementation I had much cleaner. I had one follow up question. Is there a way to sort of undo this operation? I know the vec scatter can be done backwards to distribute the arrays but I didn't see an easy way to migrate the DMDA vectors back into the DMStag object. Thanks for any advice. -Colton On Wed, Dec 6, 2023 at 8:18?PM Barry Smith wrote: > > > On Dec 6, 2023, at 8:35?PM, Matthew Knepley wrote: > > On Wed, Dec 6, 2023 at 8:10?PM Barry Smith wrote: > >> >> Depending on the serial library you may not need to split the vector >> into DMDA vectors with DMStagVecSplitToDMDA() for each component. Just >> global to natural and scatter to zero on the full vector, now the full >> vector is on the first rank and you can access what you need in that one >> vector if possible. >> > > Does DMStag have a GlobalToNatural? > > > Good point, it does not appear to have such a thing, though it could. > > Also, the serial code would have to have identical interleaving. > > Thanks, > > Matt > >> On Dec 6, 2023, at 6:37?PM, Colton Bryant < >> coltonbryant2021 at u.northwestern.edu> wrote: >> >> Ah excellent! I was not aware of the ability to preallocate the objects >> and migrate them each time. >> >> Thanks! >> -Colton >> >> On Wed, Dec 6, 2023 at 5:18?PM Matthew Knepley wrote: >> >>> On Wed, Dec 6, 2023 at 5:54?PM Colton Bryant < >>> coltonbryant2021 at u.northwestern.edu> wrote: >>> >>>> Hello, >>>> >>>> I am working on a code in which a DMSTAG object is used to solve a >>>> fluid flow problem and I need to gather this flow data on a single process >>>> to interact with an existing (serial) library at each timestep of my >>>> simulation. After looking around the solution I've tried is: >>>> >>>> -use DMStagVecSplitToDMDA to extract vectors of each component of the >>>> flow >>>> -use DMDACreateNaturalVector and DMDAGlobalToNatural to get the >>>> components naturally ordered >>>> -use VecScatterCreateToZero to set up and then do the scatter to gather >>>> on the single process >>>> >>>> Unless I'm misunderstanding something this method results in a lot of >>>> memory allocation/freeing happening at each step of the evolution and I was >>>> wondering if there is a way to directly perform such a scatter from the >>>> DMSTAG object without splitting as I'm doing here. >>>> >>> >>> 1) You can see here: >>> >>> >>> https://petsc.org/main/src/dm/impls/stag/stagda.c.html#DMStagVecSplitToDMDA >>> >>> that this function is small. You can do the DMDA creation manually, and >>> then just call DMStagMigrateVecDMDA() each time, which will not create >>> anything. >>> >>> 2) You can create the natural vector upfront, and just scatter each time. >>> >>> 3) You can create the serial vector upfront, and just scatter each time. >>> >>> This is some data movement. You can compress the g2n and 2zero scatters >>> using >>> >>> https://petsc.org/main/manualpages/PetscSF/PetscSFCompose/ >>> >>> as an optimization. >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Any advice would be much appreciated! 
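[A compact, illustrative sketch of the gather/scatter round trip described in the steps above, with the objects created once and reused each time step as suggested later in the thread. Here 'da' and 'vda' are assumed to be the DMDA and its global vector produced by DMStagVecSplitToDMDA(); note that moving data from the DMDA vectors back into the DMStag vector is exactly the missing piece discussed in this thread, so only the DMDA side is shown.]

/* created once, reused every time step */
Vec        natural, on_rank0;
VecScatter to_zero;

PetscCall(DMDACreateNaturalVector(da, &natural));
PetscCall(VecScatterCreateToZero(natural, &to_zero, &on_rank0));

/* each time step: gather the DMDA global vector vda onto rank 0 ... */
PetscCall(DMDAGlobalToNaturalBegin(da, vda, INSERT_VALUES, natural));
PetscCall(DMDAGlobalToNaturalEnd(da, vda, INSERT_VALUES, natural));
PetscCall(VecScatterBegin(to_zero, natural, on_rank0, INSERT_VALUES, SCATTER_FORWARD));
PetscCall(VecScatterEnd(to_zero, natural, on_rank0, INSERT_VALUES, SCATTER_FORWARD));
/* ... rank 0 hands the array of on_rank0 to the serial library ... */

/* ... then push the (possibly modified) data back out again */
PetscCall(VecScatterBegin(to_zero, on_rank0, natural, INSERT_VALUES, SCATTER_REVERSE));
PetscCall(VecScatterEnd(to_zero, on_rank0, natural, INSERT_VALUES, SCATTER_REVERSE));
PetscCall(DMDANaturalToGlobalBegin(da, natural, INSERT_VALUES, vda));
PetscCall(DMDANaturalToGlobalEnd(da, natural, INSERT_VALUES, vda));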
>>>> >>>> Best, >>>> Colton Bryant >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Bruce.Palmer at pnnl.gov Wed Dec 13 10:42:16 2023 From: Bruce.Palmer at pnnl.gov (Palmer, Bruce J) Date: Wed, 13 Dec 2023 16:42:16 +0000 Subject: [petsc-users] Fortran Interface In-Reply-To: <68FF47E6-F54A-4321-9C37-E256C66E21EF@petsc.dev> References: <94202A46-2221-4F08-8B35-1BE16C7937AC@petsc.dev> <68FF47E6-F54A-4321-9C37-E256C66E21EF@petsc.dev> Message-ID: Thanks, that clears things up nicely. Bruce From: Barry Smith Date: Wednesday, December 13, 2023 at 6:28 AM To: Sanjay Govindjee Cc: Palmer, Bruce J , petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Fortran Interface fixed On Dec 12, 2023, at 11:17?PM, Sanjay Govindjee wrote: did you mean to write type (userctx) ctx in this example? subroutine func(snes, x, f, ctx, ierr) SNES snes Vec x,f type (userctx) user PetscErrorCode ierr ... external func SNESSetFunction(snes, r, func, ctx, ierr) SNES snes Vec r PetscErrorCode ierr type (userctx) user On Tue, Dec 12, 2023 at 7:10?PM Barry Smith > wrote: See https://petsc.gitlab.io/-/petsc/-/jobs/5739238224/artifacts/public/html/manual/fortran.html#ch-fortran and https://gitlab.com/petsc/petsc/-/merge_requests/7114 On Dec 12, 2023, at 3:22?PM, Palmer, Bruce J via petsc-users > wrote: What do you do with something like a void pointer? I?m looking at the TaoSetObjectiveAndGradient function and it wants to pass a void *ctx pointer. You can set this to null, but apparently you have to specify the type. What type should I use? Is there something called PETSC_NULL_VOID or PETSC_NULL_CONTEXT or do I use something else? From: Matthew Knepley > Date: Tuesday, December 12, 2023 at 8:33 AM To: Palmer, Bruce J > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Fortran Interface Check twice before you click! This email originated from outside PNNL. On Tue, Dec 12, 2023 at 11:27?AM Palmer, Bruce J via petsc-users > wrote: Does documentation for the PETSc fortran interface still exist? I looked at the web pages for 3.20 (petsc.org/release) but if you go under the tab C/Fortran API, only descriptions for the C interface are there. I think after the most recent changes, the interface was supposed to be very close to C, so we just document the differences on specific pages, and put the general stuff here: https://petsc.org/release/manual/fortran/ Thanks, Matt Bruce Palmer -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Wed Dec 13 10:48:23 2023 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 13 Dec 2023 11:48:23 -0500 Subject: [petsc-users] DMSTAG Gathering Vector on single process In-Reply-To: References: <5A08B1BB-933D-4A46-8369-510D1C5AFDC6@petsc.dev> <73C92F1E-2DB7-416D-A694-AD293027E295@petsc.dev> Message-ID: On Wed, Dec 13, 2023 at 11:22?AM Colton Bryant < coltonbryant2021 at u.northwestern.edu> wrote: > Hi, > > Thanks for the help last week. The suggestions made the implementation I > had much cleaner. I had one follow up question. Is there a way to sort of > undo this operation? I know the vec scatter can be done backwards to > distribute the arrays but I didn't see an easy way to migrate the DMDA > vectors back into the DMStag object. > It is not there. However, writing it would be straightforward. I would 1) Expose DMStagMigrateVecDMDA(), which is not currently public 2) Add a ScatterMode argument 3) Put in code that calls DMStagSetValuesStencil(), rather than GetValuesStencil() We would be happy to take on MR on this, and could help in the review. Thanks, MAtt > Thanks for any advice. > > -Colton > > On Wed, Dec 6, 2023 at 8:18?PM Barry Smith wrote: > >> >> >> On Dec 6, 2023, at 8:35?PM, Matthew Knepley wrote: >> >> On Wed, Dec 6, 2023 at 8:10?PM Barry Smith wrote: >> >>> >>> Depending on the serial library you may not need to split the vector >>> into DMDA vectors with DMStagVecSplitToDMDA() for each component. Just >>> global to natural and scatter to zero on the full vector, now the full >>> vector is on the first rank and you can access what you need in that one >>> vector if possible. >>> >> >> Does DMStag have a GlobalToNatural? >> >> >> Good point, it does not appear to have such a thing, though it could. >> >> Also, the serial code would have to have identical interleaving. >> >> Thanks, >> >> Matt >> >>> On Dec 6, 2023, at 6:37?PM, Colton Bryant < >>> coltonbryant2021 at u.northwestern.edu> wrote: >>> >>> Ah excellent! I was not aware of the ability to preallocate the objects >>> and migrate them each time. >>> >>> Thanks! >>> -Colton >>> >>> On Wed, Dec 6, 2023 at 5:18?PM Matthew Knepley >>> wrote: >>> >>>> On Wed, Dec 6, 2023 at 5:54?PM Colton Bryant < >>>> coltonbryant2021 at u.northwestern.edu> wrote: >>>> >>>>> Hello, >>>>> >>>>> I am working on a code in which a DMSTAG object is used to solve a >>>>> fluid flow problem and I need to gather this flow data on a single process >>>>> to interact with an existing (serial) library at each timestep of my >>>>> simulation. After looking around the solution I've tried is: >>>>> >>>>> -use DMStagVecSplitToDMDA to extract vectors of each component of the >>>>> flow >>>>> -use DMDACreateNaturalVector and DMDAGlobalToNatural to get the >>>>> components naturally ordered >>>>> -use VecScatterCreateToZero to set up and then do the scatter to >>>>> gather on the single process >>>>> >>>>> Unless I'm misunderstanding something this method results in a lot of >>>>> memory allocation/freeing happening at each step of the evolution and I was >>>>> wondering if there is a way to directly perform such a scatter from the >>>>> DMSTAG object without splitting as I'm doing here. >>>>> >>>> >>>> 1) You can see here: >>>> >>>> >>>> https://petsc.org/main/src/dm/impls/stag/stagda.c.html#DMStagVecSplitToDMDA >>>> >>>> that this function is small. You can do the DMDA creation manually, and >>>> then just call DMStagMigrateVecDMDA() each time, which will not create >>>> anything. 
>>>> >>>> 2) You can create the natural vector upfront, and just scatter each >>>> time. >>>> >>>> 3) You can create the serial vector upfront, and just scatter each time. >>>> >>>> This is some data movement. You can compress the g2n and 2zero scatters >>>> using >>>> >>>> https://petsc.org/main/manualpages/PetscSF/PetscSFCompose/ >>>> >>>> as an optimization. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Any advice would be much appreciated! >>>>> >>>>> Best, >>>>> Colton Bryant >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> >> >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From coltonbryant2021 at u.northwestern.edu Wed Dec 13 11:02:21 2023 From: coltonbryant2021 at u.northwestern.edu (Colton Bryant) Date: Wed, 13 Dec 2023 11:02:21 -0600 Subject: [petsc-users] DMSTAG Gathering Vector on single process In-Reply-To: References: <5A08B1BB-933D-4A46-8369-510D1C5AFDC6@petsc.dev> <73C92F1E-2DB7-416D-A694-AD293027E295@petsc.dev> Message-ID: Ok! Thanks for the reply. I'll take a look when I get a chance. -Colton On Wed, Dec 13, 2023 at 10:48?AM Matthew Knepley wrote: > On Wed, Dec 13, 2023 at 11:22?AM Colton Bryant < > coltonbryant2021 at u.northwestern.edu> wrote: > >> Hi, >> >> Thanks for the help last week. The suggestions made the implementation I >> had much cleaner. I had one follow up question. Is there a way to sort of >> undo this operation? I know the vec scatter can be done backwards to >> distribute the arrays but I didn't see an easy way to migrate the DMDA >> vectors back into the DMStag object. >> > > It is not there. However, writing it would be straightforward. I would > > 1) Expose DMStagMigrateVecDMDA(), which is not currently public > > 2) Add a ScatterMode argument > > 3) Put in code that calls DMStagSetValuesStencil(), rather than > GetValuesStencil() > > We would be happy to take on MR on this, and could help in the review. > > Thanks, > > MAtt > > >> Thanks for any advice. >> >> -Colton >> >> On Wed, Dec 6, 2023 at 8:18?PM Barry Smith wrote: >> >>> >>> >>> On Dec 6, 2023, at 8:35?PM, Matthew Knepley wrote: >>> >>> On Wed, Dec 6, 2023 at 8:10?PM Barry Smith wrote: >>> >>>> >>>> Depending on the serial library you may not need to split the vector >>>> into DMDA vectors with DMStagVecSplitToDMDA() for each component. Just >>>> global to natural and scatter to zero on the full vector, now the full >>>> vector is on the first rank and you can access what you need in that one >>>> vector if possible. >>>> >>> >>> Does DMStag have a GlobalToNatural? >>> >>> >>> Good point, it does not appear to have such a thing, though it could. >>> >>> Also, the serial code would have to have identical interleaving. 
>>> >>> Thanks, >>> >>> Matt >>> >>>> On Dec 6, 2023, at 6:37?PM, Colton Bryant < >>>> coltonbryant2021 at u.northwestern.edu> wrote: >>>> >>>> Ah excellent! I was not aware of the ability to preallocate the objects >>>> and migrate them each time. >>>> >>>> Thanks! >>>> -Colton >>>> >>>> On Wed, Dec 6, 2023 at 5:18?PM Matthew Knepley >>>> wrote: >>>> >>>>> On Wed, Dec 6, 2023 at 5:54?PM Colton Bryant < >>>>> coltonbryant2021 at u.northwestern.edu> wrote: >>>>> >>>>>> Hello, >>>>>> >>>>>> I am working on a code in which a DMSTAG object is used to solve a >>>>>> fluid flow problem and I need to gather this flow data on a single process >>>>>> to interact with an existing (serial) library at each timestep of my >>>>>> simulation. After looking around the solution I've tried is: >>>>>> >>>>>> -use DMStagVecSplitToDMDA to extract vectors of each component of the >>>>>> flow >>>>>> -use DMDACreateNaturalVector and DMDAGlobalToNatural to get the >>>>>> components naturally ordered >>>>>> -use VecScatterCreateToZero to set up and then do the scatter to >>>>>> gather on the single process >>>>>> >>>>>> Unless I'm misunderstanding something this method results in a lot of >>>>>> memory allocation/freeing happening at each step of the evolution and I was >>>>>> wondering if there is a way to directly perform such a scatter from the >>>>>> DMSTAG object without splitting as I'm doing here. >>>>>> >>>>> >>>>> 1) You can see here: >>>>> >>>>> >>>>> https://petsc.org/main/src/dm/impls/stag/stagda.c.html#DMStagVecSplitToDMDA >>>>> >>>>> that this function is small. You can do the DMDA creation manually, >>>>> and then just call DMStagMigrateVecDMDA() each time, which will not >>>>> create anything. >>>>> >>>>> 2) You can create the natural vector upfront, and just scatter each >>>>> time. >>>>> >>>>> 3) You can create the serial vector upfront, and just scatter each >>>>> time. >>>>> >>>>> This is some data movement. You can compress the g2n and 2zero >>>>> scatters using >>>>> >>>>> https://petsc.org/main/manualpages/PetscSF/PetscSFCompose/ >>>>> >>>>> as an optimization. >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> Any advice would be much appreciated! >>>>>> >>>>>> Best, >>>>>> Colton Bryant >>>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>> >>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >>> >>> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 
From mfadams at lbl.gov Wed Dec 13 13:54:17 2023 From: mfadams at lbl.gov (Mark Adams) Date: Wed, 13 Dec 2023 14:54:17 -0500 Subject: [petsc-users] PETSc 3.14 to PETSc 3.20: Different (slower) convergence for classical AMG (sequential and especially in parallel) In-Reply-To: <4c9f02898f324fcd8be1fe5dcc9f0416@cea.fr> References: <4c9f02898f324fcd8be1fe5dcc9f0416@cea.fr> Message-ID: Hi Pierre, Sorry I missed this post and your issues were brought to my attention today. First, the classical version is not supported well. The postdoc that wrote the code is long gone and I don't know the code at all. It is really a reference implementation that someone could build on and is not meant for production. In 10 years you are the first user that has contacted us. The hypre package is a very good AMG solver and it uses classical AMG as the main solver. I wrote GAMG ("agg"), which is a smoothed aggregation AMG solver and is very different from classical. I would suggest you move to hypre or '-pc_gamg_type agg'. The coarsening was developed in this time frame and there was a lot of churn, as a new strategy for aggressive coarsening did not work well for some users and I had to add the old method back in and then made it the default (again). This change missed v3.20, but you can get the old aggressive strategy with '-pc_gamg_aggressive_square_graph'. Run with -options_left to check that it is being used. As far as your output goes (nice formatting, thank you), the coarse grid is smaller in the new code (rows=41, cols=41 | rows=30, cols=30); "square graph" should fix this. You can also try not using aggressive coarsening at all with '-pc_gamg_aggressive_coarsening 0'. Let me know how it goes and let's try to get you into a more sustainable state ... I really try not to change this code but sometimes need to. Thanks, Mark On Mon, Oct 9, 2023 at 10:43 AM LEDAC Pierre wrote: > Hello all, > > > I am struggling to find the same convergence in iterations when using > classical algebraic multigrid in my code with PETSc 3.20 compared to PETSc > 3.14. > > > I am using, in order to solve a Poisson system: > > *-ksp_type cg -pc_type gamg -pc_gamg_type classical* > > > I read the different release notes between 3.15 and 3.20: > > https://petsc.org/release/changes/317 > > https://petsc.org/main/manualpages/PC/PCGAMGSetThreshold/ > > > and had a look at the mailing list archive (especially this one: > https://www.mail-archive.com/petsc-users at mcs.anl.gov/msg46688.html), > > so I added some other options to try to get the same behaviour as PETSc > 3.14: > > > *-ksp_type cg -pc_type gamg -pc_gamg_type classical -mg_levels_pc_type > sor -pc_gamg_threshold 0.* > > > It improves the convergence but there is still a difference in convergence > (26 vs 18 iterations). > > On another of my test cases, the number of levels is different (e.g. 6 vs > 4) as well, and here it is the same, but with a different coarsening according > to the output from the -ksp_view option. > > The main point is that the convergence dramatically degrades in parallel > on a third test case, so unfortunately I can't upgrade to PETSc 3.20 for now. > > I am sending you the partial report (petsc_314_vs_petsc_320.ksp_view) with > -ksp_view (left PETSc 3.14, right PETSc 3.20) and the configure/command > line options used (in petsc_XXX_petsc.TU files). > > > Could my issue be related to the following 3.18 changes? I have not tried the > first one. > > > - > > Remove PCGAMGSetSymGraph() and -pc_gamg_sym_graph.
The user should now > indicate symmetry and structural symmetry using MatSetOption > () and GAMG > will symmetrize the graph if a symmetric options is not set > - > > Change -pc_gamg_reuse_interpolation default from false to true. > > > Any advice would be greatly appreciated, > > > Pierre LEDAC > Commissariat ? l??nergie atomique et aux ?nergies alternatives > Centre de SACLAY > DES/ISAS/DM2S/SGLS/LCAN > B?timent 451 ? point courrier n?43 > F-91191 Gif-sur-Yvette > +33 1 69 08 04 03 > +33 6 83 42 05 79 > -------------- next part -------------- An HTML attachment was scrubbed... URL: From srvenkat at utexas.edu Wed Dec 13 16:05:41 2023 From: srvenkat at utexas.edu (Sreeram R Venkat) Date: Wed, 13 Dec 2023 16:05:41 -0600 Subject: [petsc-users] Matvecs and KSPSolves with multiple vectors In-Reply-To: References: <855B7267-E3F9-43CC-9EBE-96276E9A8E42@petsc.dev> Message-ID: Hello Pierre, I am trying out the KSPMatSolve with the BoomerAMG preconditioner. However, I am noticing that it is still solving column by column (this is stated explicitly in the info dump attached). I looked at the code for KSPMatSolve_Private() and saw that as long as ksp->ops->matsolve is true, it should do the batched solve, though I'm not sure where that gets set. I am using the options -pc_type hypre -pc_hypre_type boomeramg when running the code. Can you please help me with this? Thanks, Sreeram On Thu, Dec 7, 2023 at 4:04?PM Mark Adams wrote: > N.B., AMGX interface is a bit experimental. > Mark > > On Thu, Dec 7, 2023 at 4:11?PM Sreeram R Venkat > wrote: > >> Oh, in that case I will try out BoomerAMG. Getting AMGX to build >> correctly was also tricky so hopefully the HYPRE build will be easier. >> >> Thanks, >> Sreeram >> >> On Thu, Dec 7, 2023, 3:03 PM Pierre Jolivet wrote: >> >>> >>> >>> On 7 Dec 2023, at 9:37?PM, Sreeram R Venkat wrote: >>> >>> Thank you Barry and Pierre; I will proceed with the first option. >>> >>> I want to use the AMGX preconditioner for the KSP. I will try it out and >>> see how it performs. >>> >>> >>> Just FYI, AMGX does not handle systems with multiple RHS, and thus has >>> no PCMatApply() implementation. >>> BoomerAMG does, and there is a PCMatApply_HYPRE_BoomerAMG() >>> implementation. >>> But let us know if you need assistance figuring things out. >>> >>> Thanks, >>> Pierre >>> >>> Thanks, >>> Sreeram >>> >>> On Thu, Dec 7, 2023 at 2:02?PM Pierre Jolivet wrote: >>> >>>> To expand on Barry?s answer, we have observed repeatedly that >>>> MatMatMult with MatAIJ performs better than MatMult with MatMAIJ, you can >>>> reproduce this on your own with >>>> https://petsc.org/release/src/mat/tests/ex237.c.html. >>>> Also, I?m guessing you are using some sort of preconditioner within >>>> your KSP. >>>> Not all are ?KSPMatSolve-ready?, i.e., they may treat blocks of >>>> right-hand sides column by column, which is very inefficient. >>>> You could run your code with -info dump and send us dump.0 to see what >>>> needs to be done on our end to make things more efficient, should you not >>>> be satisfied with the current performance of the code. >>>> >>>> Thanks, >>>> Pierre >>>> >>>> On 7 Dec 2023, at 8:34?PM, Barry Smith wrote: >>>> >>>> >>>> >>>> On Dec 7, 2023, at 1:17?PM, Sreeram R Venkat >>>> wrote: >>>> >>>> I have 2 sequential matrices M and R (both MATSEQAIJCUSPARSE of size n >>>> x n) and a vector v of size n*m. v = [v_1 , v_2 ,... , v_m] where v_i has >>>> size n. The data for v can be stored either in column-major or row-major >>>> order. 
Now, I want to do 2 types of operations: >>>> >>>> 1. Matvecs of the form M*v_i = w_i, for i = 1..m. >>>> 2. KSPSolves of the form R*x_i = v_i, for i = 1..m. >>>> >>>> From what I have read on the documentation, I can think of 2 >>>> approaches. >>>> >>>> 1. Get the pointer to the data in v (column-major) and use it to create >>>> a dense matrix V. Then do a MatMatMult with M*V = W, and take the data >>>> pointer of W to create the vector w. For KSPSolves, use KSPMatSolve with R >>>> and V. >>>> >>>> 2. Create a MATMAIJ using M/R and use that for matvecs directly with >>>> the vector v. I don't know if KSPSolve with the MATMAIJ will know that it >>>> is a multiple RHS system and act accordingly. >>>> >>>> Which would be the more efficient option? >>>> >>>> >>>> Use 1. >>>> >>>> >>>> As a side-note, I am also wondering if there is a way to use row-major >>>> storage of the vector v. >>>> >>>> >>>> No >>>> >>>> The reason is that this could allow for more coalesced memory access >>>> when doing matvecs. >>>> >>>> >>>> PETSc matrix-vector products use BLAS GMEV matrix-vector products for >>>> the computation so in theory they should already be well-optimized >>>> >>>> >>>> Thanks, >>>> Sreeram >>>> >>>> >>>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: dump.0 Type: application/octet-stream Size: 1940230 bytes Desc: not available URL: From 2111191 at tongji.edu.cn Wed Dec 13 05:52:48 2023 From: 2111191 at tongji.edu.cn (2111191 at tongji.edu.cn) Date: Wed, 13 Dec 2023 19:52:48 +0800 (GMT+08:00) Subject: [petsc-users] =?utf-8?b?5o2V6I63?= Message-ID: <376cb39c.11543.18c6305b919.Coremail.2111191@tongji.edu.cn> Dear SLEPc Developers, I a am student from Tongji University. Recently I am trying to write a c++ program for matrix solving, which requires importing the PETSc library that you have developed. However a lot of errors occur in the cpp file when I use #include directly. I also try to use extern "C" but it gives me the error in the picture below. Is there a good way to use the PETSc library in a c++ program? (I compiled using cmake and my compiler is g++ (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44)). My cmakelists.txt is: cmake_minimum_required(VERSION 3.1.0) set(CMAKE_INSTALL_RPATH_USE_LINK_PATH TRUE) set(PETSC $ENV{PETSC_DIR}/$ENV{PETSC_ARCH}) set(SLEPC $ENV{SLEPC_DIR}/$ENV{PETSC_ARCH}) set(ENV{PKG_CONFIG_PATH} ${PETSC}/lib/pkgconfig:${SLEPC}/lib/pkgconfig) set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++11") set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -std=c99") project(test) add_executable(${PROJECT_NAME} eigen_test2.cpp) find_package(PkgConfig REQUIRED) pkg_search_module(PETSc REQUIRED IMPORTED_TARGET PETSc) target_link_libraries(${PROJECT_NAME} PkgConfig::PETSc) The testing code is:eigen_test2.cpp extern "C"{ //#include #include #include #include #include } int main(int argc,char **argv) { return 0; } Best regards Weijie Xu -------------- next part -------------- A non-text attachment was scrubbed... Name: ??.PNG Type: image/png Size: 63554 bytes Desc: not available URL: From 2111191 at tongji.edu.cn Wed Dec 13 21:13:01 2023 From: 2111191 at tongji.edu.cn (2111191 at tongji.edu.cn) Date: Thu, 14 Dec 2023 11:13:01 +0800 (GMT+08:00) Subject: [petsc-users] Some question about compiling c++ program including PETSc using cmake Message-ID: <771d0fcf.3be16.18c6650324c.Coremail.2111191@tongji.edu.cn> Dear SLEPc Developers, I a am student from Tongji University. 
Recently I am trying to write a c++ program for matrix solving, which requires importing the PETSc library that you have developed. However a lot of errors occur in the cpp file when I use #include directly. I also try to use extern "C" but it gives me the error in the picture below. Is there a good way to use the PETSc library in a c++ program? (I compiled using cmake and my compiler is g++ (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44)). My cmakelists.txt is: cmake_minimum_required(VERSION 3.1.0) set(CMAKE_INSTALL_RPATH_USE_LINK_PATH TRUE) set(PETSC $ENV{PETSC_DIR}/$ENV{PETSC_ARCH}) set(SLEPC $ENV{SLEPC_DIR}/$ENV{PETSC_ARCH}) set(ENV{PKG_CONFIG_PATH} ${PETSC}/lib/pkgconfig:${SLEPC}/lib/pkgconfig) set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++11") set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -std=c99") project(test) add_executable(${PROJECT_NAME} eigen_test2.cpp) find_package(PkgConfig REQUIRED) pkg_search_module(PETSc REQUIRED IMPORTED_TARGET PETSc) target_link_libraries(${PROJECT_NAME} PkgConfig::PETSc) The testing code is:eigen_test2.cpp extern "C"{ //#include #include #include #include #include } int main(int argc,char **argv) { return 0; } Best regards Weijie Xu From pierre at joliv.et Thu Dec 14 00:41:24 2023 From: pierre at joliv.et (Pierre Jolivet) Date: Thu, 14 Dec 2023 07:41:24 +0100 Subject: [petsc-users] Matvecs and KSPSolves with multiple vectors In-Reply-To: References: <855B7267-E3F9-43CC-9EBE-96276E9A8E42@petsc.dev> Message-ID: <9F227CDE-815C-4446-A424-0BBDA38B1E79@joliv.et> Hello Sreeram, KSPCG (PETSc implementation of CG) does not handle solves with multiple columns at once. There is only a single native PETSc KSP implementation which handles solves with multiple columns at once: KSPPREONLY. If you use --download-hpddm, you can use a CG (or GMRES, or more advanced methods) implementation which handles solves with multiple columns at once (via -ksp_type hpddm -ksp_hpddm_type cg or KSPSetType(ksp, KSPHPDDM); KSPHPDDMSetType(ksp, KSP_HPDDM_TYPE_CG);). I?m the main author of HPDDM, there is preliminary support for device matrices, but if it?s not working as intended/not faster than column by column, I?d be happy to have a deeper look (maybe in private), because most (if not all) of my users interested in (pseudo-)block Krylov solvers (i.e., solvers that treat right-hand sides in a single go) are using plain host matrices. Thanks, Pierre PS: you could have a look at https://www.sciencedirect.com/science/article/abs/pii/S0898122121000055 to understand the philosophy behind block iterative methods in PETSc (and in HPDDM), src/mat/tests/ex237.c, the benchmark I mentioned earlier, was developed in the context of this paper to produce Figures 2-3. Note that this paper is now slightly outdated, since then, PCHYPRE and PCMG (among others) have been made ?PCMatApply()-ready?. > On 13 Dec 2023, at 11:05?PM, Sreeram R Venkat wrote: > > Hello Pierre, > > I am trying out the KSPMatSolve with the BoomerAMG preconditioner. However, I am noticing that it is still solving column by column (this is stated explicitly in the info dump attached). I looked at the code for KSPMatSolve_Private() and saw that as long as ksp->ops->matsolve is true, it should do the batched solve, though I'm not sure where that gets set. > > I am using the options -pc_type hypre -pc_hypre_type boomeramg when running the code. > > Can you please help me with this? > > Thanks, > Sreeram > > > On Thu, Dec 7, 2023 at 4:04?PM Mark Adams > wrote: >> N.B., AMGX interface is a bit experimental. 
>> Mark >> >> On Thu, Dec 7, 2023 at 4:11?PM Sreeram R Venkat > wrote: >>> Oh, in that case I will try out BoomerAMG. Getting AMGX to build correctly was also tricky so hopefully the HYPRE build will be easier. >>> >>> Thanks, >>> Sreeram >>> >>> On Thu, Dec 7, 2023, 3:03 PM Pierre Jolivet > wrote: >>>> >>>> >>>>> On 7 Dec 2023, at 9:37?PM, Sreeram R Venkat > wrote: >>>>> >>>>> Thank you Barry and Pierre; I will proceed with the first option. >>>>> >>>>> I want to use the AMGX preconditioner for the KSP. I will try it out and see how it performs. >>>> >>>> Just FYI, AMGX does not handle systems with multiple RHS, and thus has no PCMatApply() implementation. >>>> BoomerAMG does, and there is a PCMatApply_HYPRE_BoomerAMG() implementation. >>>> But let us know if you need assistance figuring things out. >>>> >>>> Thanks, >>>> Pierre >>>> >>>>> Thanks, >>>>> Sreeram >>>>> >>>>> On Thu, Dec 7, 2023 at 2:02?PM Pierre Jolivet > wrote: >>>>>> To expand on Barry?s answer, we have observed repeatedly that MatMatMult with MatAIJ performs better than MatMult with MatMAIJ, you can reproduce this on your own with https://petsc.org/release/src/mat/tests/ex237.c.html. >>>>>> Also, I?m guessing you are using some sort of preconditioner within your KSP. >>>>>> Not all are ?KSPMatSolve-ready?, i.e., they may treat blocks of right-hand sides column by column, which is very inefficient. >>>>>> You could run your code with -info dump and send us dump.0 to see what needs to be done on our end to make things more efficient, should you not be satisfied with the current performance of the code. >>>>>> >>>>>> Thanks, >>>>>> Pierre >>>>>> >>>>>>> On 7 Dec 2023, at 8:34?PM, Barry Smith > wrote: >>>>>>> >>>>>>> >>>>>>> >>>>>>>> On Dec 7, 2023, at 1:17?PM, Sreeram R Venkat > wrote: >>>>>>>> >>>>>>>> I have 2 sequential matrices M and R (both MATSEQAIJCUSPARSE of size n x n) and a vector v of size n*m. v = [v_1 , v_2 ,... , v_m] where v_i has size n. The data for v can be stored either in column-major or row-major order. Now, I want to do 2 types of operations: >>>>>>>> >>>>>>>> 1. Matvecs of the form M*v_i = w_i, for i = 1..m. >>>>>>>> 2. KSPSolves of the form R*x_i = v_i, for i = 1..m. >>>>>>>> >>>>>>>> From what I have read on the documentation, I can think of 2 approaches. >>>>>>>> >>>>>>>> 1. Get the pointer to the data in v (column-major) and use it to create a dense matrix V. Then do a MatMatMult with M*V = W, and take the data pointer of W to create the vector w. For KSPSolves, use KSPMatSolve with R and V. >>>>>>>> >>>>>>>> 2. Create a MATMAIJ using M/R and use that for matvecs directly with the vector v. I don't know if KSPSolve with the MATMAIJ will know that it is a multiple RHS system and act accordingly. >>>>>>>> >>>>>>>> Which would be the more efficient option? >>>>>>> >>>>>>> Use 1. >>>>>>>> >>>>>>>> As a side-note, I am also wondering if there is a way to use row-major storage of the vector v. >>>>>>> >>>>>>> No >>>>>>> >>>>>>>> The reason is that this could allow for more coalesced memory access when doing matvecs. >>>>>>> >>>>>>> PETSc matrix-vector products use BLAS GMEV matrix-vector products for the computation so in theory they should already be well-optimized >>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Sreeram >>>>>> >>>> > -------------- next part -------------- An HTML attachment was scrubbed... 
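To make the advice above concrete, here is a hedged sketch of approach 1 combined with KSPHPDDM: the m right-hand sides are wrapped in a dense matrix and solved in one KSPMatSolve() call. It assumes PETSc was configured with --download-hpddm, that R is the assembled operator with n local rows, and that vdata points to the existing column-major n-by-m right-hand-side storage; all of these names are placeholders.

  Mat B, X;
  KSP ksp;
  PetscCall(MatCreateDense(PETSC_COMM_WORLD, n, PETSC_DECIDE, PETSC_DETERMINE, m, vdata, &B)); /* wraps existing storage, no copy */
  PetscCall(MatCreateDense(PETSC_COMM_WORLD, n, PETSC_DECIDE, PETSC_DETERMINE, m, NULL, &X));  /* solution block */
  PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
  PetscCall(KSPSetOperators(ksp, R, R));
  PetscCall(KSPSetType(ksp, KSPHPDDM));
  PetscCall(KSPHPDDMSetType(ksp, KSP_HPDDM_TYPE_CG)); /* treats all m columns in a single block solve */
  PetscCall(KSPSetFromOptions(ksp));                  /* e.g. -pc_type hypre -pc_hypre_type boomeramg */
  PetscCall(KSPMatSolve(ksp, B, X));

Running with -info dump, as discussed earlier in this thread, shows whether the columns are actually treated in one go or one by one.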
URL: From pierre at joliv.et Thu Dec 14 00:45:38 2023 From: pierre at joliv.et (Pierre Jolivet) Date: Thu, 14 Dec 2023 07:45:38 +0100 Subject: [petsc-users] Some question about compiling c++ program including PETSc using cmake In-Reply-To: <771d0fcf.3be16.18c6650324c.Coremail.2111191@tongji.edu.cn> References: <771d0fcf.3be16.18c6650324c.Coremail.2111191@tongji.edu.cn> Message-ID: > On 14 Dec 2023, at 4:13?AM, 2111191--- via petsc-users wrote: > > Dear SLEPc Developers, > > I a am student from Tongji University. Recently I am trying to write a c++ program for matrix solving, which requires importing the PETSc library that you have developed. However a lot of errors occur in the cpp file when I use #include directly. I also try to use extern "C" but it gives me the error in the picture below. Is there a good way to use the PETSc library in a c++ program? (I compiled using cmake and my compiler is g++ (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44)). This compiler (gcc 4.8.5) is known to not be C++11 compliant, but you are using the -std=c++11 flag. Furthermore, since version 3.18 (or maybe slightly later), PETSc requires a C++11-compliant compiler if using C++. Could you switch to a newer compiler, or try to reconfigure? Also, you should not put all the include inside an extern { }. In any case, you?ll need to send the compilation error log and configure.log to petsc-maint at mcs.anl.gov if you want further help, as we won?t be able to give a better diagnosis with just the currently provided information. Thanks, Pierre > My cmakelists.txt is: > > cmake_minimum_required(VERSION 3.1.0) > > set(CMAKE_INSTALL_RPATH_USE_LINK_PATH TRUE) > > set(PETSC $ENV{PETSC_DIR}/$ENV{PETSC_ARCH}) > set(SLEPC $ENV{SLEPC_DIR}/$ENV{PETSC_ARCH}) > set(ENV{PKG_CONFIG_PATH} ${PETSC}/lib/pkgconfig:${SLEPC}/lib/pkgconfig) > > set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++11") > set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -std=c99") > > project(test) > > add_executable(${PROJECT_NAME} eigen_test2.cpp) > find_package(PkgConfig REQUIRED) > > pkg_search_module(PETSc REQUIRED IMPORTED_TARGET PETSc) > target_link_libraries(${PROJECT_NAME} PkgConfig::PETSc) > > The testing code is:eigen_test2.cpp > extern "C"{ > //#include > #include > #include > #include > #include > } > > int main(int argc,char **argv) > { > return 0; > } > > > > Best regards > > Weijie Xu -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Dec 14 06:52:58 2023 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 14 Dec 2023 07:52:58 -0500 Subject: [petsc-users] =?utf-8?b?5o2V6I63?= In-Reply-To: <376cb39c.11543.18c6305b919.Coremail.2111191@tongji.edu.cn> References: <376cb39c.11543.18c6305b919.Coremail.2111191@tongji.edu.cn> Message-ID: On Thu, Dec 14, 2023 at 1:27?AM 2111191--- via petsc-users < petsc-users at mcs.anl.gov> wrote: > Dear SLEPc Developers, > > I a am student from Tongji University. Recently I am trying to write a c++ > program for matrix solving, which requires importing the PETSc library that > you have developed. However a lot of errors occur in the cpp file when I > use #include directly. I also try to use extern "C" but it > gives me the error in the picture below. Is there a good way to use the > PETSc library in a c++ program? (I compiled using cmake and my compiler is > g++ (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44)). 
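Concretely, the test file could be reduced to something like the sketch below once the extern "C" wrapper is removed (the PETSc/SLEPc headers are meant to be included directly from C++ and handle linkage themselves). The header names here are guesses, since the archive stripped the originals; include whichever PETSc/SLEPc interfaces are actually needed, and note that a C++11-capable compiler is still required, as pointed out above.

  // eigen_test2.cpp -- include the headers directly, without extern "C"
  #include <petscksp.h>
  #include <slepceps.h>

  int main(int argc, char **argv)
  {
    PetscCall(SlepcInitialize(&argc, &argv, NULL, NULL)); // also initializes PETSc
    /* ... build matrices and solve here ... */
    PetscCall(SlepcFinalize());
    return 0;
  }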
> > My cmakelists.txt is: > > cmake_minimum_required(VERSION 3.1.0) > > set(CMAKE_INSTALL_RPATH_USE_LINK_PATH TRUE) > > set(PETSC $ENV{PETSC_DIR}/$ENV{PETSC_ARCH}) > set(SLEPC $ENV{SLEPC_DIR}/$ENV{PETSC_ARCH}) > set(ENV{PKG_CONFIG_PATH} ${PETSC}/lib/pkgconfig:${SLEPC}/lib/pkgconfig) > > set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++11") > set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -std=c99") > > project(test) > > add_executable(${PROJECT_NAME} eigen_test2.cpp) > find_package(PkgConfig REQUIRED) > > pkg_search_module(PETSc REQUIRED IMPORTED_TARGET PETSc) > target_link_libraries(${PROJECT_NAME} PkgConfig::PETSc) > > The testing code is:eigen_test2.cpp > First, get rid of the "extern C" in front of the headers. Thanks, Matt > extern "C"{ > //#include > #include > #include > #include > #include > } > > int main(int argc,char **argv) > { > return 0; > } > > > > Best regards > > Weijie Xu > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From Pierre.LEDAC at cea.fr Thu Dec 14 09:15:22 2023 From: Pierre.LEDAC at cea.fr (LEDAC Pierre) Date: Thu, 14 Dec 2023 15:15:22 +0000 Subject: [petsc-users] PETSc 3.14 to PETSc 3.20: Different (slower) convergence for classical AMG (sequential and especially in parallel) In-Reply-To: References: <4c9f02898f324fcd8be1fe5dcc9f0416@cea.fr>, Message-ID: <6fee51e3277d470f9f2d4d51f0bd453a@cea.fr> Hello Mark, Thanks for your answer. Indeed, I didn't see the information that classical AMG was not really supported: -solver2_pc_gamg_type : Type of AMG method (only 'agg' supported and useful) (one of) classical geo agg (PCGAMGSetType) We switched very recently from GAMG("agg") to GAMG("classical") for a weak scaling test up to 32000 cores, where we saw very good scalability with GAMG("classical") compared to GAMG("agg"). But it was with PETSc 3.14... So today, we are going to upgrade to 3.20 and focus on GAMG("agg") or Hypre Classical AMG. We will see how it compares. May I ask you what is your point of view of the current state of the GPU versions of GAMG("agg") versus Hypre AMG Classical ? In fact, the reason of our move from 3.14 to 3.20 is to take advantage of all the progress in PETSc and Hypre on accelerated solvers/preconditioners during the last 2 years. Greatly appreciate your help, Pierre LEDAC Commissariat ? l??nergie atomique et aux ?nergies alternatives Centre de SACLAY DES/ISAS/DM2S/SGLS/LCAN B?timent 451 ? point courrier n?43 F-91191 Gif-sur-Yvette +33 1 69 08 04 03 +33 6 83 42 05 79 ________________________________ De : Mark Adams Envoy? : mercredi 13 d?cembre 2023 20:54:17 ? : LEDAC Pierre Cc : petsc-users at mcs.anl.gov; BRUNETON Adrien Objet : Re: [petsc-users] PETSc 3.14 to PETSc 3.20: Different (slower) convergence for classical AMG (sequential and especially in parallel) Hi Pierre, Sorry I missed this post and your issues were brought to my attention today. First, the classic version is not supported well. The postdoc that wrote the code is long gone and I don't know the code at all. It is really a reference implementation that someone could build on and is not meant for production. In 10 years you are the first user that has connected us. The hypre package is a very good AMG solver and it uses classical AMG as the main solver. 
I wrote GAMG ("agg") which is a smoothed aggregation AMG solver and is very different from classical. I would suggest you move to hypre or '-pc_gamg_type agg'. The coarsening was developed in this time frame and there was a lot of churn as a new strategy for aggressive coarsening did not work well for some users and I had to add the old method in and then made it the default (again). This change missed v3.20, but you can get the old aggressive strategy with '-pc_gamg_aggressive_square_graph'. Check with -options_left to check that it is being used. As far as your output (nice formatting, thank you), the coarse grid is smaller in the new code. rows=41, cols=41 | rows=30, cols=30 "square graph" should fix this. You can also try not using aggressive coarsening with: You could try '-pc_gamg_aggressive_coarsening 0' Let me know how it goes and let's try to get you into a more sustainable state ... I really try not to change this code but sometimes need to. Thanks, Mark On Mon, Oct 9, 2023 at 10:43?AM LEDAC Pierre > wrote: Hello all, I am struggling to find the same convergence in iterations when using classical algebric multigrid in my code with PETSc 3.20 compared to PETSc 3.14. I am using in order to solve a Poisson system: -ksp_type cg -pc_type gamg -pc_gamg_type classical I read the different releases notes between 3.15 and 3.20: https://petsc.org/release/changes/317 https://petsc.org/main/manualpages/PC/PCGAMGSetThreshold/ And have a look at the archive mailing list (especially this one: https://www.mail-archive.com/petsc-users at mcs.anl.gov/msg46688.html) so I added some other options to try to have the same behaviour than PETSc 3.14: -ksp_type cg -pc_type gamg -pc_gamg_type classical -mg_levels_pc_type sor -pc_gamg_threshold 0. It improves the convergence but there still a different convergence though (26 vs 18 iterations). On another of my test case, the number of levels is different (e.g. 6 vs 4) also, and here it is the same, but with a different coarsening according to the output from the -ksp_view option The main point is that the convergence dramatically degrades in parallel on a third test case, so I can't upgrade to PETSc 3.20 for now unhappily. I send you the partial report (petsc_314_vs_petsc_320.ksp_view) with -ksp_view (left PETSc 3.14, right PETSc 3.20) and the configure/command line options used (in petsc_XXX_petsc.TU files). Could my issue related to the following 3.18 change ? I have not tried the first one. * Remove PCGAMGSetSymGraph() and -pc_gamg_sym_graph. The user should now indicate symmetry and structural symmetry using MatSetOption() and GAMG will symmetrize the graph if a symmetric options is not set * Change -pc_gamg_reuse_interpolation default from false to true. Any advice would be greatly appreciated, Pierre LEDAC Commissariat ? l??nergie atomique et aux ?nergies alternatives Centre de SACLAY DES/ISAS/DM2S/SGLS/LCAN B?timent 451 ? point courrier n?43 F-91191 Gif-sur-Yvette +33 1 69 08 04 03 +33 6 83 42 05 79 -------------- next part -------------- An HTML attachment was scrubbed... 
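For reference, a small sketch of selecting CG plus GAMG's smoothed-aggregation ("agg") flavor in code, while leaving the coarsening knobs discussed above (-pc_gamg_aggressive_square_graph, -pc_gamg_aggressive_coarsening 0, -pc_gamg_threshold ...) to the options database. A, b and x stand for the already assembled Poisson operator and vectors.

  KSP ksp;
  PC  pc;
  PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
  PetscCall(KSPSetOperators(ksp, A, A));
  PetscCall(KSPSetType(ksp, KSPCG));
  PetscCall(KSPGetPC(ksp, &pc));
  PetscCall(PCSetType(pc, PCGAMG));
  PetscCall(PCGAMGSetType(pc, PCGAMGAGG)); /* "agg"; PCGAMGCLASSICAL is the unsupported reference path */
  PetscCall(KSPSetFromOptions(ksp));       /* picks up -pc_gamg_* and -mg_levels_* overrides */
  PetscCall(KSPSolve(ksp, b, x));

Swapping PCGAMG for PCHYPRE (with -pc_hypre_type boomeramg) gives the classical-AMG alternative mentioned above without further code changes.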
URL: From mfadams at lbl.gov Thu Dec 14 10:50:59 2023 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 14 Dec 2023 11:50:59 -0500 Subject: [petsc-users] PETSc 3.14 to PETSc 3.20: Different (slower) convergence for classical AMG (sequential and especially in parallel) In-Reply-To: <6fee51e3277d470f9f2d4d51f0bd453a@cea.fr> References: <4c9f02898f324fcd8be1fe5dcc9f0416@cea.fr> <6fee51e3277d470f9f2d4d51f0bd453a@cea.fr> Message-ID: On Thu, Dec 14, 2023 at 10:15?AM LEDAC Pierre wrote: > Hello Mark, > > > Thanks for your answer. Indeed, I didn't see the information that > classical AMG was not really supported: > > > -solver2_pc_gamg_type : Type of AMG method (only > 'agg' supported and useful) (one of) classical geo agg (PCGAMGSetType) > > We switched very recently from GAMG("agg") to GAMG("classical") for a weak > scaling test up to 32000 cores, where we saw very good scalability with GAMG("classical") > compared to GAMG("agg"). But it was with PETSc 3.14... > AMG is sensitive to parameters. What PDE and discretization are you solving? For example, I recently optimized the Q2 Laplacian benchmark and found good scaling with -pc_gamg_threshold 0.04 -pc_gamg_threshold_scale .25 Hypre scaled well without optimization (see below), > > So today, we are going to upgrade to 3.20 and focus on GAMG("agg") or > Hypre Classical AMG. We will see how it compares. > You might want to update to v3.20.2 That has some of my recent GAMG updates. > May I ask you what is your point of view of the current state of the GPU > versions of GAMG("agg") versus Hypre AMG Classical ? > Hypre is well supported with several developers over decades, whereas I really just maintain GAMG + I add some things like anisotropy support recently/currently. But, I build on the PETSc sparse linear algebra that is well supported in PETSc and hypre, and we have several good people doing that. TL;DR Both run the solve and matrix setup phase on the GPU. Hypre puts the graph setup phase on the GPU, but this phase is 1) not well suited to GPUs and 2) is amortized in most applications (just done once). GAMG is easier to deal with because it is built-in and the interface to hypre can be fragile with respect to GPUs (eg, if you use '-mat_type hypre') in my experience. If performance is critical and you have the time to put into it, hypre will be a good option, and GAMG can be a backup. > > In fact, the reason of our move from 3.14 to 3.20 is to take advantage of > all the progress in PETSc and Hypre on accelerated solvers/preconditioners > during the last 2 years. > > And I can give you advice on GAMG parameters, if you send me the output with '-info :pc' (and 'grep GAMG'). Thanks, Mark > Greatly appreciate your help, > > Pierre LEDAC > Commissariat ? l??nergie atomique et aux ?nergies alternatives > Centre de SACLAY > DES/ISAS/DM2S/SGLS/LCAN > B?timent 451 ? point courrier n?43 > F-91191 Gif-sur-Yvette > +33 1 69 08 04 03 > +33 6 83 42 05 79 > ------------------------------ > *De :* Mark Adams > *Envoy? :* mercredi 13 d?cembre 2023 20:54:17 > *? :* LEDAC Pierre > *Cc :* petsc-users at mcs.anl.gov; BRUNETON Adrien > *Objet :* Re: [petsc-users] PETSc 3.14 to PETSc 3.20: Different (slower) > convergence for classical AMG (sequential and especially in parallel) > > Hi Pierre, > > Sorry I missed this post and your issues were brought to my attention > today. > > First, the classic version is not supported well. The postdoc that wrote > the code is long gone and I don't know the code at all. 
> It is really a reference implementation that someone could build on and is > not meant for production. > In 10 years you are the first user that has connected us. > > The hypre package is a very good AMG solver and it uses classical AMG as > the main solver. > I wrote GAMG ("agg") which is a smoothed aggregation AMG solver and is > very different from classical. > I would suggest you move to hypre or '-pc_gamg_type agg'. > > The coarsening was developed in this time frame and there was a lot of > churn as a new strategy for aggressive coarsening did not work well for > some users and I had to add the old method in and then made it the default > (again). > This change missed v3.20, but you can get the old aggressive strategy with > '-pc_gamg_aggressive_square_graph'. > Check with -options_left to check that it is being used. > > As far as your output (nice formatting, thank you), the coarse grid is > smaller in the new code. > rows=41, cols=41 | rows=30, cols=30 > "square graph" should fix this. > > You can also try not using aggressive coarsening with: > You could try '-pc_gamg_aggressive_coarsening 0' > > Let me know how it goes and let's try to get you into a more sustainable > state ... I really try not to change this code but sometimes need to. > > Thanks, > Mark > > > > > > On Mon, Oct 9, 2023 at 10:43?AM LEDAC Pierre wrote: > >> Hello all, >> >> >> I am struggling to find the same convergence in iterations when using >> classical algebric multigrid in my code with PETSc 3.20 compared to PETSc >> 3.14. >> >> >> I am using in order to solve a Poisson system: >> >> *-ksp_type cg -pc_type gamg -pc_gamg_type classical* >> >> >> I read the different releases notes between 3.15 and 3.20: >> >> https://petsc.org/release/changes/317 >> >> https://petsc.org/main/manualpages/PC/PCGAMGSetThreshold/ >> >> >> And have a look at the archive mailing list (especially this one: >> https://www.mail-archive.com/petsc-users at mcs.anl.gov/msg46688.html) >> >> so I added some other options to try to have the same behaviour than >> PETSc 3.14: >> >> >> *-ksp_type cg -pc_type gamg -pc_gamg_type classical *-mg_levels_pc_type >> sor -pc_gamg_threshold 0. >> >> >> It improves the convergence but there still a different convergence >> though (26 vs 18 iterations). >> >> On another of my test case, the number of levels is different (e.g. 6 vs >> 4) also, and here it is the same, but with a different coarsening according >> to the output from the -ksp_view option >> >> The main point is that the convergence dramatically degrades in parallel >> on a third test case, so I can't upgrade to PETSc 3.20 for now unhappily. >> >> I send you the partial report (petsc_314_vs_petsc_320.ksp_view) with >> -ksp_view (left PETSc 3.14, right PETSc 3.20) and the configure/command >> line options used (in petsc_XXX_petsc.TU files). >> >> >> Could my issue related to the following 3.18 change ? I have not tried >> the first one. >> >> >> - >> >> Remove PCGAMGSetSymGraph() and -pc_gamg_sym_graph. The user should >> now indicate symmetry and structural symmetry using MatSetOption >> () and GAMG >> will symmetrize the graph if a symmetric options is not set >> - >> >> Change -pc_gamg_reuse_interpolation default from false to true. >> >> >> Any advice would be greatly appreciated, >> >> >> Pierre LEDAC >> Commissariat ? l??nergie atomique et aux ?nergies alternatives >> Centre de SACLAY >> DES/ISAS/DM2S/SGLS/LCAN >> B?timent 451 ? 
point courrier n?43 >> F-91191 Gif-sur-Yvette >> +33 1 69 08 04 03 >> +33 6 83 42 05 79 >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From srvenkat at utexas.edu Thu Dec 14 13:02:04 2023 From: srvenkat at utexas.edu (Sreeram R Venkat) Date: Thu, 14 Dec 2023 13:02:04 -0600 Subject: [petsc-users] Matvecs and KSPSolves with multiple vectors In-Reply-To: <9F227CDE-815C-4446-A424-0BBDA38B1E79@joliv.et> References: <855B7267-E3F9-43CC-9EBE-96276E9A8E42@petsc.dev> <9F227CDE-815C-4446-A424-0BBDA38B1E79@joliv.et> Message-ID: Hello Pierre, Thank you for your reply. I tried out the HPDDM CG as you said, and it seems to be doing the batched solves, but the KSP is not converging due to a NaN or Inf being generated. I also noticed there are a lot of host-to-device and device-to-host copies of the matrices (the non-batched KSP solve did not have any memcopies). I have attached dump.0 again. Could you please take a look? Thanks, Sreeram On Thu, Dec 14, 2023 at 12:42?AM Pierre Jolivet wrote: > Hello Sreeram, > KSPCG (PETSc implementation of CG) does not handle solves with multiple > columns at once. > There is only a single native PETSc KSP implementation which handles > solves with multiple columns at once: KSPPREONLY. > If you use --download-hpddm, you can use a CG (or GMRES, or more advanced > methods) implementation which handles solves with multiple columns at once > (via -ksp_type hpddm -ksp_hpddm_type cg or KSPSetType(ksp, KSPHPDDM); > KSPHPDDMSetType(ksp, KSP_HPDDM_TYPE_CG);). > I?m the main author of HPDDM, there is preliminary support for device > matrices, but if it?s not working as intended/not faster than column by > column, I?d be happy to have a deeper look (maybe in private), because most > (if not all) of my users interested in (pseudo-)block Krylov solvers (i.e., > solvers that treat right-hand sides in a single go) are using plain host > matrices. > > Thanks, > Pierre > > PS: you could have a look at > https://www.sciencedirect.com/science/article/abs/pii/S0898122121000055 to > understand the philosophy behind block iterative methods in PETSc (and in > HPDDM), src/mat/tests/ex237.c, the benchmark I mentioned earlier, was > developed in the context of this paper to produce Figures 2-3. Note that > this paper is now slightly outdated, since then, PCHYPRE and PCMG (among > others) have been made ?PCMatApply()-ready?. > > On 13 Dec 2023, at 11:05?PM, Sreeram R Venkat wrote: > > Hello Pierre, > > I am trying out the KSPMatSolve with the BoomerAMG preconditioner. > However, I am noticing that it is still solving column by column (this is > stated explicitly in the info dump attached). I looked at the code for > KSPMatSolve_Private() and saw that as long as ksp->ops->matsolve is true, > it should do the batched solve, though I'm not sure where that gets set. > > I am using the options -pc_type hypre -pc_hypre_type boomeramg when > running the code. > > Can you please help me with this? > > Thanks, > Sreeram > > > On Thu, Dec 7, 2023 at 4:04?PM Mark Adams wrote: > >> N.B., AMGX interface is a bit experimental. >> Mark >> >> On Thu, Dec 7, 2023 at 4:11?PM Sreeram R Venkat >> wrote: >> >>> Oh, in that case I will try out BoomerAMG. Getting AMGX to build >>> correctly was also tricky so hopefully the HYPRE build will be easier. 
>>> >>> Thanks, >>> Sreeram >>> >>> On Thu, Dec 7, 2023, 3:03 PM Pierre Jolivet wrote: >>> >>>> >>>> >>>> On 7 Dec 2023, at 9:37?PM, Sreeram R Venkat >>>> wrote: >>>> >>>> Thank you Barry and Pierre; I will proceed with the first option. >>>> >>>> I want to use the AMGX preconditioner for the KSP. I will try it out >>>> and see how it performs. >>>> >>>> >>>> Just FYI, AMGX does not handle systems with multiple RHS, and thus has >>>> no PCMatApply() implementation. >>>> BoomerAMG does, and there is a PCMatApply_HYPRE_BoomerAMG() >>>> implementation. >>>> But let us know if you need assistance figuring things out. >>>> >>>> Thanks, >>>> Pierre >>>> >>>> Thanks, >>>> Sreeram >>>> >>>> On Thu, Dec 7, 2023 at 2:02?PM Pierre Jolivet wrote: >>>> >>>>> To expand on Barry?s answer, we have observed repeatedly that >>>>> MatMatMult with MatAIJ performs better than MatMult with MatMAIJ, you can >>>>> reproduce this on your own with >>>>> https://petsc.org/release/src/mat/tests/ex237.c.html. >>>>> Also, I?m guessing you are using some sort of preconditioner within >>>>> your KSP. >>>>> Not all are ?KSPMatSolve-ready?, i.e., they may treat blocks of >>>>> right-hand sides column by column, which is very inefficient. >>>>> You could run your code with -info dump and send us dump.0 to see what >>>>> needs to be done on our end to make things more efficient, should you not >>>>> be satisfied with the current performance of the code. >>>>> >>>>> Thanks, >>>>> Pierre >>>>> >>>>> On 7 Dec 2023, at 8:34?PM, Barry Smith wrote: >>>>> >>>>> >>>>> >>>>> On Dec 7, 2023, at 1:17?PM, Sreeram R Venkat >>>>> wrote: >>>>> >>>>> I have 2 sequential matrices M and R (both MATSEQAIJCUSPARSE of size n >>>>> x n) and a vector v of size n*m. v = [v_1 , v_2 ,... , v_m] where v_i has >>>>> size n. The data for v can be stored either in column-major or row-major >>>>> order. Now, I want to do 2 types of operations: >>>>> >>>>> 1. Matvecs of the form M*v_i = w_i, for i = 1..m. >>>>> 2. KSPSolves of the form R*x_i = v_i, for i = 1..m. >>>>> >>>>> From what I have read on the documentation, I can think of 2 >>>>> approaches. >>>>> >>>>> 1. Get the pointer to the data in v (column-major) and use it to >>>>> create a dense matrix V. Then do a MatMatMult with M*V = W, and take the >>>>> data pointer of W to create the vector w. For KSPSolves, use KSPMatSolve >>>>> with R and V. >>>>> >>>>> 2. Create a MATMAIJ using M/R and use that for matvecs directly with >>>>> the vector v. I don't know if KSPSolve with the MATMAIJ will know that it >>>>> is a multiple RHS system and act accordingly. >>>>> >>>>> Which would be the more efficient option? >>>>> >>>>> >>>>> Use 1. >>>>> >>>>> >>>>> As a side-note, I am also wondering if there is a way to use row-major >>>>> storage of the vector v. >>>>> >>>>> >>>>> No >>>>> >>>>> The reason is that this could allow for more coalesced memory access >>>>> when doing matvecs. >>>>> >>>>> >>>>> PETSc matrix-vector products use BLAS GMEV matrix-vector products >>>>> for the computation so in theory they should already be well-optimized >>>>> >>>>> >>>>> Thanks, >>>>> Sreeram >>>>> >>>>> >>>>> >>>> > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
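One small, hedged addition that can help pin down the NaN/Inf reported above: query the converged reason right after the blocked solve (ksp, B and X stand for the objects already used in the code being discussed).

  KSPConvergedReason reason;
  PetscCall(KSPMatSolve(ksp, B, X));
  PetscCall(KSPGetConvergedReason(ksp, &reason));
  if (reason < 0) PetscCall(PetscPrintf(PETSC_COMM_WORLD, "Solve diverged: %s\n", KSPConvergedReasons[reason]));
  /* KSP_DIVERGED_NANORINF here typically means the operator or the
     preconditioner produced invalid values for the blocked right-hand sides */

The same information is printed automatically when running with -ksp_converged_reason.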
Name: dump.0 Type: application/octet-stream Size: 1565937 bytes Desc: not available URL: From facklerpw at ornl.gov Thu Dec 14 13:05:33 2023 From: facklerpw at ornl.gov (Fackler, Philip) Date: Thu, 14 Dec 2023 19:05:33 +0000 Subject: [petsc-users] Call to DMSetMatrixPreallocateSkip not changing allocation behavior Message-ID: I'm using the following sequence of functions related to the Jacobian matrix: DMDACreate1d(..., &da); DMSetFromOptions(da); DMSetUp(da); DMSetMatType(da, MATAIJKOKKOS); DMSetMatrixPreallocateSkip(da, PETSC_TRUE); Mat J; DMCreateMatrix(da, &J); MatSetPreallocationCOO(J, ...); I recently added the call to DMSetMatrixPreallocateSkip, hoping the allocation would be delayed to MatSetPreallocationCOO, and that it would require less memory. The documentation says that the data structures will not be preallocated. The following data from heaptrack shows that the allocation is still happening in the call to DMCreateMatrix. [cid:bda9ef12-a46f-47b2-9b9b-a4b2808b6b13] Can someone help me understand this? Thanks, Philip Fackler Research Software Engineer, Application Engineering Group Advanced Computing Systems Research Section Computer Science and Mathematics Division Oak Ridge National Laboratory -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 54775 bytes Desc: image.png URL: From pierre at joliv.et Thu Dec 14 13:12:28 2023 From: pierre at joliv.et (Pierre Jolivet) Date: Thu, 14 Dec 2023 20:12:28 +0100 Subject: [petsc-users] Matvecs and KSPSolves with multiple vectors In-Reply-To: References: <855B7267-E3F9-43CC-9EBE-96276E9A8E42@petsc.dev> <9F227CDE-815C-4446-A424-0BBDA38B1E79@joliv.et> Message-ID: > On 14 Dec 2023, at 8:02?PM, Sreeram R Venkat wrote: > > Hello Pierre, > > Thank you for your reply. I tried out the HPDDM CG as you said, and it seems to be doing the batched solves, but the KSP is not converging due to a NaN or Inf being generated. I also noticed there are a lot of host-to-device and device-to-host copies of the matrices (the non-batched KSP solve did not have any memcopies). I have attached dump.0 again. Could you please take a look? Yes, but you?d need to send me something I can run with your set of options (if you are more confident doing this in private, you can remove the list from c/c). Not all BoomerAMG smoothers handle blocks of right-hand sides, and there is not much error checking, so instead of erroring out, this may be the reason why you are getting garbage. Thanks, Pierre > Thanks, > Sreeram > > On Thu, Dec 14, 2023 at 12:42?AM Pierre Jolivet > wrote: >> Hello Sreeram, >> KSPCG (PETSc implementation of CG) does not handle solves with multiple columns at once. >> There is only a single native PETSc KSP implementation which handles solves with multiple columns at once: KSPPREONLY. >> If you use --download-hpddm, you can use a CG (or GMRES, or more advanced methods) implementation which handles solves with multiple columns at once (via -ksp_type hpddm -ksp_hpddm_type cg or KSPSetType(ksp, KSPHPDDM); KSPHPDDMSetType(ksp, KSP_HPDDM_TYPE_CG);). 
>> I?m the main author of HPDDM, there is preliminary support for device matrices, but if it?s not working as intended/not faster than column by column, I?d be happy to have a deeper look (maybe in private), because most (if not all) of my users interested in (pseudo-)block Krylov solvers (i.e., solvers that treat right-hand sides in a single go) are using plain host matrices. >> >> Thanks, >> Pierre >> >> PS: you could have a look at https://www.sciencedirect.com/science/article/abs/pii/S0898122121000055 to understand the philosophy behind block iterative methods in PETSc (and in HPDDM), src/mat/tests/ex237.c, the benchmark I mentioned earlier, was developed in the context of this paper to produce Figures 2-3. Note that this paper is now slightly outdated, since then, PCHYPRE and PCMG (among others) have been made ?PCMatApply()-ready?. >> >>> On 13 Dec 2023, at 11:05?PM, Sreeram R Venkat > wrote: >>> >>> Hello Pierre, >>> >>> I am trying out the KSPMatSolve with the BoomerAMG preconditioner. However, I am noticing that it is still solving column by column (this is stated explicitly in the info dump attached). I looked at the code for KSPMatSolve_Private() and saw that as long as ksp->ops->matsolve is true, it should do the batched solve, though I'm not sure where that gets set. >>> >>> I am using the options -pc_type hypre -pc_hypre_type boomeramg when running the code. >>> >>> Can you please help me with this? >>> >>> Thanks, >>> Sreeram >>> >>> >>> On Thu, Dec 7, 2023 at 4:04?PM Mark Adams > wrote: >>>> N.B., AMGX interface is a bit experimental. >>>> Mark >>>> >>>> On Thu, Dec 7, 2023 at 4:11?PM Sreeram R Venkat > wrote: >>>>> Oh, in that case I will try out BoomerAMG. Getting AMGX to build correctly was also tricky so hopefully the HYPRE build will be easier. >>>>> >>>>> Thanks, >>>>> Sreeram >>>>> >>>>> On Thu, Dec 7, 2023, 3:03 PM Pierre Jolivet > wrote: >>>>>> >>>>>> >>>>>>> On 7 Dec 2023, at 9:37?PM, Sreeram R Venkat > wrote: >>>>>>> >>>>>>> Thank you Barry and Pierre; I will proceed with the first option. >>>>>>> >>>>>>> I want to use the AMGX preconditioner for the KSP. I will try it out and see how it performs. >>>>>> >>>>>> Just FYI, AMGX does not handle systems with multiple RHS, and thus has no PCMatApply() implementation. >>>>>> BoomerAMG does, and there is a PCMatApply_HYPRE_BoomerAMG() implementation. >>>>>> But let us know if you need assistance figuring things out. >>>>>> >>>>>> Thanks, >>>>>> Pierre >>>>>> >>>>>>> Thanks, >>>>>>> Sreeram >>>>>>> >>>>>>> On Thu, Dec 7, 2023 at 2:02?PM Pierre Jolivet > wrote: >>>>>>>> To expand on Barry?s answer, we have observed repeatedly that MatMatMult with MatAIJ performs better than MatMult with MatMAIJ, you can reproduce this on your own with https://petsc.org/release/src/mat/tests/ex237.c.html. >>>>>>>> Also, I?m guessing you are using some sort of preconditioner within your KSP. >>>>>>>> Not all are ?KSPMatSolve-ready?, i.e., they may treat blocks of right-hand sides column by column, which is very inefficient. >>>>>>>> You could run your code with -info dump and send us dump.0 to see what needs to be done on our end to make things more efficient, should you not be satisfied with the current performance of the code. 
>>>>>>>> >>>>>>>> Thanks, >>>>>>>> Pierre >>>>>>>> >>>>>>>>> On 7 Dec 2023, at 8:34?PM, Barry Smith > wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>>> On Dec 7, 2023, at 1:17?PM, Sreeram R Venkat > wrote: >>>>>>>>>> >>>>>>>>>> I have 2 sequential matrices M and R (both MATSEQAIJCUSPARSE of size n x n) and a vector v of size n*m. v = [v_1 , v_2 ,... , v_m] where v_i has size n. The data for v can be stored either in column-major or row-major order. Now, I want to do 2 types of operations: >>>>>>>>>> >>>>>>>>>> 1. Matvecs of the form M*v_i = w_i, for i = 1..m. >>>>>>>>>> 2. KSPSolves of the form R*x_i = v_i, for i = 1..m. >>>>>>>>>> >>>>>>>>>> From what I have read on the documentation, I can think of 2 approaches. >>>>>>>>>> >>>>>>>>>> 1. Get the pointer to the data in v (column-major) and use it to create a dense matrix V. Then do a MatMatMult with M*V = W, and take the data pointer of W to create the vector w. For KSPSolves, use KSPMatSolve with R and V. >>>>>>>>>> >>>>>>>>>> 2. Create a MATMAIJ using M/R and use that for matvecs directly with the vector v. I don't know if KSPSolve with the MATMAIJ will know that it is a multiple RHS system and act accordingly. >>>>>>>>>> >>>>>>>>>> Which would be the more efficient option? >>>>>>>>> >>>>>>>>> Use 1. >>>>>>>>>> >>>>>>>>>> As a side-note, I am also wondering if there is a way to use row-major storage of the vector v. >>>>>>>>> >>>>>>>>> No >>>>>>>>> >>>>>>>>>> The reason is that this could allow for more coalesced memory access when doing matvecs. >>>>>>>>> >>>>>>>>> PETSc matrix-vector products use BLAS GMEV matrix-vector products for the computation so in theory they should already be well-optimized >>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Sreeram >>>>>>>> >>>>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Thu Dec 14 14:49:25 2023 From: jed at jedbrown.org (Jed Brown) Date: Thu, 14 Dec 2023 13:49:25 -0700 Subject: [petsc-users] Call to DMSetMatrixPreallocateSkip not changing allocation behavior In-Reply-To: References: Message-ID: <871qboech6.fsf@jedbrown.org> 17 GB for a 1D DMDA, wow. :-) Could you try applying this diff to make it work for DMDA (it's currently handled by DMPlex)? diff --git i/src/dm/impls/da/fdda.c w/src/dm/impls/da/fdda.c index cad4d926504..bd2a3bda635 100644 --- i/src/dm/impls/da/fdda.c +++ w/src/dm/impls/da/fdda.c @@ -675,19 +675,21 @@ PetscErrorCode DMCreateMatrix_DA(DM da, Mat *J) specialized setting routines depend only on the particular preallocation details of the matrix, not the type itself. 
*/ - PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPIAIJSetPreallocation_C", &aij)); - if (!aij) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqAIJSetPreallocation_C", &aij)); - if (!aij) { - PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPIBAIJSetPreallocation_C", &baij)); - if (!baij) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqBAIJSetPreallocation_C", &baij)); - if (!baij) { - PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPISBAIJSetPreallocation_C", &sbaij)); - if (!sbaij) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqSBAIJSetPreallocation_C", &sbaij)); - if (!sbaij) { - PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPISELLSetPreallocation_C", &sell)); - if (!sell) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqSELLSetPreallocation_C", &sell)); + if (!dm->prealloc_skip) { // Flag is likely set when user intends to use MatSetPreallocationCOO() + PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPIAIJSetPreallocation_C", &aij)); + if (!aij) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqAIJSetPreallocation_C", &aij)); + if (!aij) { + PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPIBAIJSetPreallocation_C", &baij)); + if (!baij) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqBAIJSetPreallocation_C", &baij)); + if (!baij) { + PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPISBAIJSetPreallocation_C", &sbaij)); + if (!sbaij) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqSBAIJSetPreallocation_C", &sbaij)); + if (!sbaij) { + PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPISELLSetPreallocation_C", &sell)); + if (!sell) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqSELLSetPreallocation_C", &sell)); + } + if (!sell) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatISSetPreallocation_C", &is)); } - if (!sell) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatISSetPreallocation_C", &is)); } } if (aij) { "Fackler, Philip via petsc-users" writes: > I'm using the following sequence of functions related to the Jacobian matrix: > > DMDACreate1d(..., &da); > DMSetFromOptions(da); > DMSetUp(da); > DMSetMatType(da, MATAIJKOKKOS); > DMSetMatrixPreallocateSkip(da, PETSC_TRUE); > Mat J; > DMCreateMatrix(da, &J); > MatSetPreallocationCOO(J, ...); > > I recently added the call to DMSetMatrixPreallocateSkip, hoping the allocation would be delayed to MatSetPreallocationCOO, and that it would require less memory. The documentation says that the data structures will not be preallocated. The following data from heaptrack shows that the allocation is still happening in the call to DMCreateMatrix. > > [cid:bda9ef12-a46f-47b2-9b9b-a4b2808b6b13] > > Can someone help me understand this? 
> > Thanks, > > Philip Fackler > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > Oak Ridge National Laboratory From knepley at gmail.com Thu Dec 14 15:19:01 2023 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 14 Dec 2023 16:19:01 -0500 Subject: [petsc-users] Call to DMSetMatrixPreallocateSkip not changing allocation behavior In-Reply-To: References: Message-ID: On Thu, Dec 14, 2023 at 2:06?PM Fackler, Philip via petsc-users < petsc-users at mcs.anl.gov> wrote: > I'm using the following sequence of functions related to the Jacobian > matrix: > > DMDACreate1d(..., &da); > DMSetFromOptions(da); > DMSetUp(da); > DMSetMatType(da, MATAIJKOKKOS); > DMSetMatrixPreallocateSkip(da, PETSC_TRUE); > Mat J; > DMCreateMatrix(da, &J); > MatSetPreallocationCOO(J, ...); > > I recently added the call to DMSetMatrixPreallocateSkip, hoping the > allocation would be delayed to MatSetPreallocationCOO, and that it would > require less memory. The documentation > says > that the data structures will not be preallocated. > You are completely correct. DMDA is just ignoring this flag. We will fix it. Thanks for catching this. Matt > The following data from heaptrack shows that the allocation is still > happening in the call to DMCreateMatrix. > > > Can someone help me understand this? > > Thanks, > > > *Philip Fackler * > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > *Oak Ridge National Laboratory* > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 54775 bytes Desc: not available URL: From jed at jedbrown.org Thu Dec 14 15:27:53 2023 From: jed at jedbrown.org (Jed Brown) Date: Thu, 14 Dec 2023 14:27:53 -0700 Subject: [petsc-users] Call to DMSetMatrixPreallocateSkip not changing allocation behavior In-Reply-To: <871qboech6.fsf@jedbrown.org> References: <871qboech6.fsf@jedbrown.org> Message-ID: <87wmtgcw4m.fsf@jedbrown.org> I had a one-character typo in the diff above. This MR to release should work now. https://gitlab.com/petsc/petsc/-/merge_requests/7120 Jed Brown writes: > 17 GB for a 1D DMDA, wow. :-) > > Could you try applying this diff to make it work for DMDA (it's currently handled by DMPlex)? > > diff --git i/src/dm/impls/da/fdda.c w/src/dm/impls/da/fdda.c > index cad4d926504..bd2a3bda635 100644 > --- i/src/dm/impls/da/fdda.c > +++ w/src/dm/impls/da/fdda.c > @@ -675,19 +675,21 @@ PetscErrorCode DMCreateMatrix_DA(DM da, Mat *J) > specialized setting routines depend only on the particular preallocation > details of the matrix, not the type itself. 
> */ > - PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPIAIJSetPreallocation_C", &aij)); > - if (!aij) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqAIJSetPreallocation_C", &aij)); > - if (!aij) { > - PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPIBAIJSetPreallocation_C", &baij)); > - if (!baij) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqBAIJSetPreallocation_C", &baij)); > - if (!baij) { > - PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPISBAIJSetPreallocation_C", &sbaij)); > - if (!sbaij) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqSBAIJSetPreallocation_C", &sbaij)); > - if (!sbaij) { > - PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPISELLSetPreallocation_C", &sell)); > - if (!sell) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqSELLSetPreallocation_C", &sell)); > + if (!dm->prealloc_skip) { // Flag is likely set when user intends to use MatSetPreallocationCOO() > + PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPIAIJSetPreallocation_C", &aij)); > + if (!aij) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqAIJSetPreallocation_C", &aij)); > + if (!aij) { > + PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPIBAIJSetPreallocation_C", &baij)); > + if (!baij) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqBAIJSetPreallocation_C", &baij)); > + if (!baij) { > + PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPISBAIJSetPreallocation_C", &sbaij)); > + if (!sbaij) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqSBAIJSetPreallocation_C", &sbaij)); > + if (!sbaij) { > + PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPISELLSetPreallocation_C", &sell)); > + if (!sell) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqSELLSetPreallocation_C", &sell)); > + } > + if (!sell) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatISSetPreallocation_C", &is)); > } > - if (!sell) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatISSetPreallocation_C", &is)); > } > } > if (aij) { > > > "Fackler, Philip via petsc-users" writes: > >> I'm using the following sequence of functions related to the Jacobian matrix: >> >> DMDACreate1d(..., &da); >> DMSetFromOptions(da); >> DMSetUp(da); >> DMSetMatType(da, MATAIJKOKKOS); >> DMSetMatrixPreallocateSkip(da, PETSC_TRUE); >> Mat J; >> DMCreateMatrix(da, &J); >> MatSetPreallocationCOO(J, ...); >> >> I recently added the call to DMSetMatrixPreallocateSkip, hoping the allocation would be delayed to MatSetPreallocationCOO, and that it would require less memory. The documentation says that the data structures will not be preallocated. The following data from heaptrack shows that the allocation is still happening in the call to DMCreateMatrix. >> >> [cid:bda9ef12-a46f-47b2-9b9b-a4b2808b6b13] >> >> Can someone help me understand this? >> >> Thanks, >> >> Philip Fackler >> Research Software Engineer, Application Engineering Group >> Advanced Computing Systems Research Section >> Computer Science and Mathematics Division >> Oak Ridge National Laboratory From srvenkat at utexas.edu Thu Dec 14 16:45:00 2023 From: srvenkat at utexas.edu (Sreeram R Venkat) Date: Thu, 14 Dec 2023 16:45:00 -0600 Subject: [petsc-users] Matvecs and KSPSolves with multiple vectors In-Reply-To: References: <855B7267-E3F9-43CC-9EBE-96276E9A8E42@petsc.dev> <9F227CDE-815C-4446-A424-0BBDA38B1E79@joliv.et> Message-ID: Thanks, I will try to create a minimal reproducible example. 
This may take me some time though, as I need to figure out how to extract only the relevant parts (the full program this solve is used in is getting quite complex). I'll also try out some of the BoomerAMG options to see if that helps. Thanks, Sreeram On Thu, Dec 14, 2023, 1:12?PM Pierre Jolivet wrote: > > > On 14 Dec 2023, at 8:02?PM, Sreeram R Venkat wrote: > > Hello Pierre, > > Thank you for your reply. I tried out the HPDDM CG as you said, and it > seems to be doing the batched solves, but the KSP is not converging due to > a NaN or Inf being generated. I also noticed there are a lot of > host-to-device and device-to-host copies of the matrices (the non-batched > KSP solve did not have any memcopies). I have attached dump.0 again. Could > you please take a look? > > > Yes, but you?d need to send me something I can run with your set of > options (if you are more confident doing this in private, you can remove > the list from c/c). > Not all BoomerAMG smoothers handle blocks of right-hand sides, and there > is not much error checking, so instead of erroring out, this may be the > reason why you are getting garbage. > > Thanks, > Pierre > > Thanks, > Sreeram > > On Thu, Dec 14, 2023 at 12:42?AM Pierre Jolivet wrote: > >> Hello Sreeram, >> KSPCG (PETSc implementation of CG) does not handle solves with multiple >> columns at once. >> There is only a single native PETSc KSP implementation which handles >> solves with multiple columns at once: KSPPREONLY. >> If you use --download-hpddm, you can use a CG (or GMRES, or more advanced >> methods) implementation which handles solves with multiple columns at once >> (via -ksp_type hpddm -ksp_hpddm_type cg or KSPSetType(ksp, KSPHPDDM); >> KSPHPDDMSetType(ksp, KSP_HPDDM_TYPE_CG);). >> I?m the main author of HPDDM, there is preliminary support for device >> matrices, but if it?s not working as intended/not faster than column by >> column, I?d be happy to have a deeper look (maybe in private), because most >> (if not all) of my users interested in (pseudo-)block Krylov solvers (i.e., >> solvers that treat right-hand sides in a single go) are using plain host >> matrices. >> >> Thanks, >> Pierre >> >> PS: you could have a look at >> https://www.sciencedirect.com/science/article/abs/pii/S0898122121000055 to >> understand the philosophy behind block iterative methods in PETSc (and in >> HPDDM), src/mat/tests/ex237.c, the benchmark I mentioned earlier, was >> developed in the context of this paper to produce Figures 2-3. Note that >> this paper is now slightly outdated, since then, PCHYPRE and PCMG (among >> others) have been made ?PCMatApply()-ready?. >> >> On 13 Dec 2023, at 11:05?PM, Sreeram R Venkat >> wrote: >> >> Hello Pierre, >> >> I am trying out the KSPMatSolve with the BoomerAMG preconditioner. >> However, I am noticing that it is still solving column by column (this is >> stated explicitly in the info dump attached). I looked at the code for >> KSPMatSolve_Private() and saw that as long as ksp->ops->matsolve is >> true, it should do the batched solve, though I'm not sure where that gets >> set. >> >> I am using the options -pc_type hypre -pc_hypre_type boomeramg when >> running the code. >> >> Can you please help me with this? >> >> Thanks, >> Sreeram >> >> >> On Thu, Dec 7, 2023 at 4:04?PM Mark Adams wrote: >> >>> N.B., AMGX interface is a bit experimental. >>> Mark >>> >>> On Thu, Dec 7, 2023 at 4:11?PM Sreeram R Venkat >>> wrote: >>> >>>> Oh, in that case I will try out BoomerAMG. 
Getting AMGX to build >>>> correctly was also tricky so hopefully the HYPRE build will be easier. >>>> >>>> Thanks, >>>> Sreeram >>>> >>>> On Thu, Dec 7, 2023, 3:03 PM Pierre Jolivet wrote: >>>> >>>>> >>>>> >>>>> On 7 Dec 2023, at 9:37?PM, Sreeram R Venkat >>>>> wrote: >>>>> >>>>> Thank you Barry and Pierre; I will proceed with the first option. >>>>> >>>>> I want to use the AMGX preconditioner for the KSP. I will try it out >>>>> and see how it performs. >>>>> >>>>> >>>>> Just FYI, AMGX does not handle systems with multiple RHS, and thus has >>>>> no PCMatApply() implementation. >>>>> BoomerAMG does, and there is a PCMatApply_HYPRE_BoomerAMG() >>>>> implementation. >>>>> But let us know if you need assistance figuring things out. >>>>> >>>>> Thanks, >>>>> Pierre >>>>> >>>>> Thanks, >>>>> Sreeram >>>>> >>>>> On Thu, Dec 7, 2023 at 2:02?PM Pierre Jolivet wrote: >>>>> >>>>>> To expand on Barry?s answer, we have observed repeatedly that >>>>>> MatMatMult with MatAIJ performs better than MatMult with MatMAIJ, you can >>>>>> reproduce this on your own with >>>>>> https://petsc.org/release/src/mat/tests/ex237.c.html. >>>>>> Also, I?m guessing you are using some sort of preconditioner within >>>>>> your KSP. >>>>>> Not all are ?KSPMatSolve-ready?, i.e., they may treat blocks of >>>>>> right-hand sides column by column, which is very inefficient. >>>>>> You could run your code with -info dump and send us dump.0 to see >>>>>> what needs to be done on our end to make things more efficient, should you >>>>>> not be satisfied with the current performance of the code. >>>>>> >>>>>> Thanks, >>>>>> Pierre >>>>>> >>>>>> On 7 Dec 2023, at 8:34?PM, Barry Smith wrote: >>>>>> >>>>>> >>>>>> >>>>>> On Dec 7, 2023, at 1:17?PM, Sreeram R Venkat >>>>>> wrote: >>>>>> >>>>>> I have 2 sequential matrices M and R (both MATSEQAIJCUSPARSE of size >>>>>> n x n) and a vector v of size n*m. v = [v_1 , v_2 ,... , v_m] where v_i has >>>>>> size n. The data for v can be stored either in column-major or row-major >>>>>> order. Now, I want to do 2 types of operations: >>>>>> >>>>>> 1. Matvecs of the form M*v_i = w_i, for i = 1..m. >>>>>> 2. KSPSolves of the form R*x_i = v_i, for i = 1..m. >>>>>> >>>>>> From what I have read on the documentation, I can think of 2 >>>>>> approaches. >>>>>> >>>>>> 1. Get the pointer to the data in v (column-major) and use it to >>>>>> create a dense matrix V. Then do a MatMatMult with M*V = W, and take the >>>>>> data pointer of W to create the vector w. For KSPSolves, use KSPMatSolve >>>>>> with R and V. >>>>>> >>>>>> 2. Create a MATMAIJ using M/R and use that for matvecs directly with >>>>>> the vector v. I don't know if KSPSolve with the MATMAIJ will know that it >>>>>> is a multiple RHS system and act accordingly. >>>>>> >>>>>> Which would be the more efficient option? >>>>>> >>>>>> >>>>>> Use 1. >>>>>> >>>>>> >>>>>> As a side-note, I am also wondering if there is a way to use >>>>>> row-major storage of the vector v. >>>>>> >>>>>> >>>>>> No >>>>>> >>>>>> The reason is that this could allow for more coalesced memory access >>>>>> when doing matvecs. >>>>>> >>>>>> >>>>>> PETSc matrix-vector products use BLAS GMEV matrix-vector products >>>>>> for the computation so in theory they should already be well-optimized >>>>>> >>>>>> >>>>>> Thanks, >>>>>> Sreeram >>>>>> >>>>>> >>>>>> >>>>> >> >> >> > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From vittorio.sciortino at uniba.it Thu Dec 14 22:20:10 2023 From: vittorio.sciortino at uniba.it (Vittorio Sciortino) Date: Fri, 15 Dec 2023 05:20:10 +0100 Subject: [petsc-users] PETSc configuration to solve Poisson equation on a 2D cartesian grid of points with nVidia GPUs (CUDA) Message-ID: <5b01e23b0de85bc1c5f09b58ee29ef79@uniba.it> Dear PETSc developers, My name is Vittorio Sciortion, I am a PhD student in Italy and I am really curious about the applications and possibilities of your library. I would ask you two questions about PETSc. My study case consists in the development of a 2D electrostatic Particle In Cell code which simulates a plasma interacting with the shaped surface of adjacent divertor mono-blocks. This type of scenario requires to solve the electro-static Poisson equation on the whole set of grid nodes (a cartesian grid) applying some boundary conditions. Currently, we are using the KSPSolve subroutine set to apply the gmres iterative method in conjunction with hypre (used as pre-conditioner). Some boundary conditons are necessary for our specific problem (Dirichlet and Neumann conditions on specific line of points). I have two small curiosity about the possibilities offered by your library, which is very interesting: 1. are we using the best possible pair to solve our problem? 2. currently, PETSc is compiled with openMP parallelization and the iterative method is executed on the CPU. Is it possible to configure the compilation of our library to execute these iterations on a nVidia GPU? Which are the best compilation options that you suggest for your library? thank you in advance Greetings Vittorio Sciortino PhD student in Physics Bari, Italy Recently, I sent a subscribe request to the users mailing list using another e-mail, because this one could be deactivated in two/three months. private email: vsciortino.phdcourse at gmail.com -- Vittorio Sciortino ________________________________________________________________________________________________ Sostieni la formazione e la ricerca universitaria con il tuo 5 per mille all'Universit? di Bari. Firma la casella "Finanziamento della ricerca scientifica e della Universit?" indicando il codice fiscale 80002170720. Il tuo contributo pu? fare la differenza: oggi pi? che mai! ________________________________________________________________________________________________ From pierre at joliv.et Fri Dec 15 01:01:10 2023 From: pierre at joliv.et (Pierre Jolivet) Date: Fri, 15 Dec 2023 08:01:10 +0100 Subject: [petsc-users] Matvecs and KSPSolves with multiple vectors In-Reply-To: References: <855B7267-E3F9-43CC-9EBE-96276E9A8E42@petsc.dev> <9F227CDE-815C-4446-A424-0BBDA38B1E79@joliv.et> Message-ID: <186F2922-CD1B-4778-BDB3-8DA6EAB8D976@joliv.et> > On 14 Dec 2023, at 11:45?PM, Sreeram R Venkat wrote: > > Thanks, I will try to create a minimal reproducible example. This may take me some time though, as I need to figure out how to extract only the relevant parts (the full program this solve is used in is getting quite complex). You could just do -ksp_view_mat binary:Amat.bin -ksp_view_pmat binary:Pmat.bin -ksp_view_rhs binary:rhs.bin and send me those three files (I?m guessing your are using double-precision scalars with 32-bit PetscInt). > I'll also try out some of the BoomerAMG options to see if that helps. These should work (this is where all ?PCMatApply()-ready? 
PC are being tested): https://petsc.org/release/src/ksp/ksp/tutorials/ex79.c.html#line215 You can see it?s also testing PCHYPRE + KSPHPDDM on device (but not with HIP). I?m aware the performance should not be optimal (see your comment about host/device copies), I?ve money to hire someone to work on this but: a) I need to find the correct engineer/post-doc, b) I currently don?t have good use cases (of course, I could generate a synthetic benchmark, for science). So even if you send me the three Mat, a MWE would be appreciated if the KSPMatSolve() is performance-critical for you (see point b) from above). Thanks, Pierre > Thanks, > Sreeram > > On Thu, Dec 14, 2023, 1:12?PM Pierre Jolivet > wrote: >> >> >>> On 14 Dec 2023, at 8:02?PM, Sreeram R Venkat > wrote: >>> >>> Hello Pierre, >>> >>> Thank you for your reply. I tried out the HPDDM CG as you said, and it seems to be doing the batched solves, but the KSP is not converging due to a NaN or Inf being generated. I also noticed there are a lot of host-to-device and device-to-host copies of the matrices (the non-batched KSP solve did not have any memcopies). I have attached dump.0 again. Could you please take a look? >> >> Yes, but you?d need to send me something I can run with your set of options (if you are more confident doing this in private, you can remove the list from c/c). >> Not all BoomerAMG smoothers handle blocks of right-hand sides, and there is not much error checking, so instead of erroring out, this may be the reason why you are getting garbage. >> >> Thanks, >> Pierre >> >>> Thanks, >>> Sreeram >>> >>> On Thu, Dec 14, 2023 at 12:42?AM Pierre Jolivet > wrote: >>>> Hello Sreeram, >>>> KSPCG (PETSc implementation of CG) does not handle solves with multiple columns at once. >>>> There is only a single native PETSc KSP implementation which handles solves with multiple columns at once: KSPPREONLY. >>>> If you use --download-hpddm, you can use a CG (or GMRES, or more advanced methods) implementation which handles solves with multiple columns at once (via -ksp_type hpddm -ksp_hpddm_type cg or KSPSetType(ksp, KSPHPDDM); KSPHPDDMSetType(ksp, KSP_HPDDM_TYPE_CG);). >>>> I?m the main author of HPDDM, there is preliminary support for device matrices, but if it?s not working as intended/not faster than column by column, I?d be happy to have a deeper look (maybe in private), because most (if not all) of my users interested in (pseudo-)block Krylov solvers (i.e., solvers that treat right-hand sides in a single go) are using plain host matrices. >>>> >>>> Thanks, >>>> Pierre >>>> >>>> PS: you could have a look at https://www.sciencedirect.com/science/article/abs/pii/S0898122121000055 to understand the philosophy behind block iterative methods in PETSc (and in HPDDM), src/mat/tests/ex237.c, the benchmark I mentioned earlier, was developed in the context of this paper to produce Figures 2-3. Note that this paper is now slightly outdated, since then, PCHYPRE and PCMG (among others) have been made ?PCMatApply()-ready?. >>>> >>>>> On 13 Dec 2023, at 11:05?PM, Sreeram R Venkat > wrote: >>>>> >>>>> Hello Pierre, >>>>> >>>>> I am trying out the KSPMatSolve with the BoomerAMG preconditioner. However, I am noticing that it is still solving column by column (this is stated explicitly in the info dump attached). I looked at the code for KSPMatSolve_Private() and saw that as long as ksp->ops->matsolve is true, it should do the batched solve, though I'm not sure where that gets set. 
>>>>> >>>>> I am using the options -pc_type hypre -pc_hypre_type boomeramg when running the code. >>>>> >>>>> Can you please help me with this? >>>>> >>>>> Thanks, >>>>> Sreeram >>>>> >>>>> >>>>> On Thu, Dec 7, 2023 at 4:04?PM Mark Adams > wrote: >>>>>> N.B., AMGX interface is a bit experimental. >>>>>> Mark >>>>>> >>>>>> On Thu, Dec 7, 2023 at 4:11?PM Sreeram R Venkat > wrote: >>>>>>> Oh, in that case I will try out BoomerAMG. Getting AMGX to build correctly was also tricky so hopefully the HYPRE build will be easier. >>>>>>> >>>>>>> Thanks, >>>>>>> Sreeram >>>>>>> >>>>>>> On Thu, Dec 7, 2023, 3:03 PM Pierre Jolivet > wrote: >>>>>>>> >>>>>>>> >>>>>>>>> On 7 Dec 2023, at 9:37?PM, Sreeram R Venkat > wrote: >>>>>>>>> >>>>>>>>> Thank you Barry and Pierre; I will proceed with the first option. >>>>>>>>> >>>>>>>>> I want to use the AMGX preconditioner for the KSP. I will try it out and see how it performs. >>>>>>>> >>>>>>>> Just FYI, AMGX does not handle systems with multiple RHS, and thus has no PCMatApply() implementation. >>>>>>>> BoomerAMG does, and there is a PCMatApply_HYPRE_BoomerAMG() implementation. >>>>>>>> But let us know if you need assistance figuring things out. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Pierre >>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Sreeram >>>>>>>>> >>>>>>>>> On Thu, Dec 7, 2023 at 2:02?PM Pierre Jolivet > wrote: >>>>>>>>>> To expand on Barry?s answer, we have observed repeatedly that MatMatMult with MatAIJ performs better than MatMult with MatMAIJ, you can reproduce this on your own with https://petsc.org/release/src/mat/tests/ex237.c.html. >>>>>>>>>> Also, I?m guessing you are using some sort of preconditioner within your KSP. >>>>>>>>>> Not all are ?KSPMatSolve-ready?, i.e., they may treat blocks of right-hand sides column by column, which is very inefficient. >>>>>>>>>> You could run your code with -info dump and send us dump.0 to see what needs to be done on our end to make things more efficient, should you not be satisfied with the current performance of the code. >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Pierre >>>>>>>>>> >>>>>>>>>>> On 7 Dec 2023, at 8:34?PM, Barry Smith > wrote: >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> On Dec 7, 2023, at 1:17?PM, Sreeram R Venkat > wrote: >>>>>>>>>>>> >>>>>>>>>>>> I have 2 sequential matrices M and R (both MATSEQAIJCUSPARSE of size n x n) and a vector v of size n*m. v = [v_1 , v_2 ,... , v_m] where v_i has size n. The data for v can be stored either in column-major or row-major order. Now, I want to do 2 types of operations: >>>>>>>>>>>> >>>>>>>>>>>> 1. Matvecs of the form M*v_i = w_i, for i = 1..m. >>>>>>>>>>>> 2. KSPSolves of the form R*x_i = v_i, for i = 1..m. >>>>>>>>>>>> >>>>>>>>>>>> From what I have read on the documentation, I can think of 2 approaches. >>>>>>>>>>>> >>>>>>>>>>>> 1. Get the pointer to the data in v (column-major) and use it to create a dense matrix V. Then do a MatMatMult with M*V = W, and take the data pointer of W to create the vector w. For KSPSolves, use KSPMatSolve with R and V. >>>>>>>>>>>> >>>>>>>>>>>> 2. Create a MATMAIJ using M/R and use that for matvecs directly with the vector v. I don't know if KSPSolve with the MATMAIJ will know that it is a multiple RHS system and act accordingly. >>>>>>>>>>>> >>>>>>>>>>>> Which would be the more efficient option? >>>>>>>>>>> >>>>>>>>>>> Use 1. >>>>>>>>>>>> >>>>>>>>>>>> As a side-note, I am also wondering if there is a way to use row-major storage of the vector v. 
>>>>>>>>>>> >>>>>>>>>>> No >>>>>>>>>>> >>>>>>>>>>>> The reason is that this could allow for more coalesced memory access when doing matvecs. >>>>>>>>>>> >>>>>>>>>>> PETSc matrix-vector products use BLAS GMEV matrix-vector products for the computation so in theory they should already be well-optimized >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Sreeram >>>>>>>>>> >>>>>>>> >>>>> >>>> >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Fri Dec 15 08:17:55 2023 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 15 Dec 2023 09:17:55 -0500 Subject: [petsc-users] PETSc configuration to solve Poisson equation on a 2D cartesian grid of points with nVidia GPUs (CUDA) In-Reply-To: <5b01e23b0de85bc1c5f09b58ee29ef79@uniba.it> References: <5b01e23b0de85bc1c5f09b58ee29ef79@uniba.it> Message-ID: Hi Vittorio, PETSc does provide support for your application and some of us (eg, me and Matt) work with fusion PIC applications. 1) I am not sure how you handle boundary conditions with a Cartesian grid so let me give two responses: 1.1) With Cartesian grids, geometric multigrid may be usable and that can be fast and easier to use. PETSc supports geometric and algebraic multigrid, including interfaces to third party libraries like hypre. Hypre is an excellent solver, but you can probably use CG as your KSP method instead of GMRES. 1.2) PETSc provides support for unstructured mesh management and discretizations and you switch to an unstructured grid, but I understand we all have priorities. Unstructured grids are probably a better long term solution for you. 2) PETSc is portable with linear algebra back-ends that execute on any "device". Our OpenMP support is only through the Kokkos back-end and we have custom CUDA and HIP backends that are built on vendor libraries. The Kokkos back-end also supports CUDA, HIP and SYCL and we rely on Kokkos any other architectures at this point. BTW, the Kokkos back-end also has an option to use vendor back-ends or Kokkos Kernels for linear algebra and they are often better than the vendors libraries. Hope this helps, Mark On Fri, Dec 15, 2023 at 12:41?AM Vittorio Sciortino < vittorio.sciortino at uniba.it> wrote: > > Dear PETSc developers, > > My name is Vittorio Sciortion, I am a PhD student in Italy and I am > really curious about the applications and possibilities of your > library. I would ask you two questions about PETSc. > > My study case consists in the development of a 2D electrostatic Particle > In Cell code which simulates a plasma interacting with the shaped > surface of adjacent divertor mono-blocks. > This type of scenario requires to solve the electro-static Poisson > equation on the whole set of grid nodes (a cartesian grid) applying some > boundary conditions. > Currently, we are using the KSPSolve subroutine set to apply the gmres > iterative method in conjunction with hypre (used as pre-conditioner). > Some boundary conditons are necessary for our specific problem > (Dirichlet and Neumann conditions on specific line of points). > I have two small curiosity about the possibilities offered by your > library, which is very interesting: > > 1. are we using the best possible pair to solve our problem? > > 2. currently, PETSc is compiled with openMP parallelization and the > iterative method is executed on the CPU. > Is it possible to configure the compilation of our library to execute > these iterations on a nVidia GPU? Which are the best compilation options > that you suggest for your library? 
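As a rough sketch of what Mark describes (hedged: the exact flags depend on the local CUDA toolchain, the MPI stack, and on how hypre itself was built; the executable name and process count below are placeholders), a CUDA-enabled build is typically configured with something like

   ./configure --with-cuda --download-hypre

and, when the grid and solver objects take their types from the options database (for example through a DMDA), the Poisson solve can be moved to the GPU at run time with

   mpiexec -n 4 ./my_pic_code -dm_vec_type cuda -dm_mat_type aijcusparse -ksp_type cg -pc_type gamg -log_view

-pc_type hypre -pc_hypre_type boomeramg remains an option in place of gamg, and -log_view reports how much of the solve actually executed on the device.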
> > thank you in advance > Greetings > Vittorio Sciortino > PhD student in Physics > Bari, Italy > > Recently, I sent a subscribe request to the users mailing list using > another e-mail, because this one could be deactivated in two/three > months. private email: vsciortino.phdcourse at gmail.com > -- > Vittorio Sciortino > > ________________________________________________________________________________________________ > Sostieni la formazione e la ricerca universitaria con il tuo 5 per mille > all'Universit? di Bari. > Firma la casella "Finanziamento della ricerca scientifica e della > Universit?" > indicando il codice fiscale 80002170720. > > Il tuo contributo pu? fare la differenza: oggi pi? che mai! > > ________________________________________________________________________________________________ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre at joliv.et Sat Dec 16 11:25:54 2023 From: pierre at joliv.et (Pierre Jolivet) Date: Sat, 16 Dec 2023 18:25:54 +0100 Subject: [petsc-users] Matvecs and KSPSolves with multiple vectors In-Reply-To: References: <855B7267-E3F9-43CC-9EBE-96276E9A8E42@petsc.dev> <9F227CDE-815C-4446-A424-0BBDA38B1E79@joliv.et> <186F2922-CD1B-4778-BDB3-8DA6EAB8D976@joliv.et> Message-ID: <7921468C-F4B9-4BC0-ADD1-2F84619E8A7E@joliv.et> Unfortunately, I am not able to reproduce such a failure with your input matrix. I?ve used ex79 that I linked previously and the system is properly solved. $ ./ex79 -pc_type hypre -ksp_type hpddm -ksp_hpddm_type cg -ksp_converged_reason -ksp_view_mat ascii::ascii_info -ksp_view_rhs ascii::ascii_info Linear solve converged due to CONVERGED_RTOL iterations 6 Mat Object: 1 MPI process type: seqaijcusparse rows=289, cols=289 total: nonzeros=2401, allocated nonzeros=2401 total number of mallocs used during MatSetValues calls=0 not using I-node routines Mat Object: 1 MPI process type: seqdensecuda rows=289, cols=10 total: nonzeros=2890, allocated nonzeros=2890 total number of mallocs used during MatSetValues calls=0 You mentioned in a subsequent email that you are interested in systems with at most 1E6 unknowns, and up to 1E4 right-hand sides. I?m not sure you can expect significant gains from using GPU for such systems. Probably, the fastest approach would indeed be -pc_type lu -ksp_type preonly -ksp_matsolve_batch_size 100 or something, depending on the memory available on your host. Thanks, Pierre > On 15 Dec 2023, at 9:52?PM, Sreeram R Venkat wrote: > > Here are the ksp_view files. I set the options -ksp_error_if_not_converged to try to get the vectors that caused the error. I noticed that some of the KSPMatSolves converge while others don't. In the code, the solves are called as: > > input vector v --> insert data of v into a dense mat --> KSPMatSolve() --> MatMatMult() --> KSPMatSolve() --> insert data of dense mat into output vector w -- output w > > The operator used in the KSP is a Laplacian-like operator, and the MatMatMult is with a Mass Matrix. The whole thing is supposed to be a solve with a biharmonic-like operator. I can also run it with only the first KSPMatSolve (i.e. just a Laplacian-like operator). In that case, the KSP reportedly converges after 0 iterations (see the next line), but this causes problems in other parts of the code later on. > > I saw that sometimes the first KSPMatSolve "converges" after 0 iterations due to CONVERGED_RTOL. Then, the second KSPMatSolve produces a NaN/Inf. I tried setting ksp_min_it, but that didn't seem to do anything. 
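As a host-side illustration of the "insert data of v into a dense mat --> KSPMatSolve()" step described above (a sketch only: n, m, v, and ksp are taken from the surrounding discussion, everything else is illustrative, and a device-resident dense type such as MATDENSECUDA may be preferable when the operator is MATSEQAIJCUSPARSE):

/* v holds m right-hand sides of length n, stored column-major (see above) */
PetscScalar *varray;
Mat          V, X;

PetscCall(VecGetArray(v, &varray));
PetscCall(MatCreateDense(PETSC_COMM_SELF, n, m, n, m, varray, &V)); /* wraps varray, no copy */
PetscCall(MatCreateDense(PETSC_COMM_SELF, n, m, n, m, NULL, &X));   /* block of solutions */
PetscCall(KSPSetType(ksp, KSPHPDDM));                /* pseudo-block solver, needs --download-hpddm */
PetscCall(KSPHPDDMSetType(ksp, KSP_HPDDM_TYPE_CG));  /* same as -ksp_type hpddm -ksp_hpddm_type cg */
PetscCall(KSPMatSolve(ksp, V, X));                   /* all m columns solved in one call */
PetscCall(MatDestroy(&V));
PetscCall(MatDestroy(&X));
PetscCall(VecRestoreArray(v, &varray));

The resulting X can then go through MatMatMult() with the mass matrix and a second KSPMatSolve() for the biharmonic-like chain, and the -ksp_matsolve_batch_size option mentioned above limits how many columns are treated per block.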
> > I'll keep trying different options and also try to get the MWE made (this KSPMatSolve is pretty performance critical for us). > > Thanks for all your help, > Sreeram > > On Fri, Dec 15, 2023 at 1:01?AM Pierre Jolivet > wrote: >> >>> On 14 Dec 2023, at 11:45?PM, Sreeram R Venkat > wrote: >>> >>> Thanks, I will try to create a minimal reproducible example. This may take me some time though, as I need to figure out how to extract only the relevant parts (the full program this solve is used in is getting quite complex). >> >> You could just do -ksp_view_mat binary:Amat.bin -ksp_view_pmat binary:Pmat.bin -ksp_view_rhs binary:rhs.bin and send me those three files (I?m guessing your are using double-precision scalars with 32-bit PetscInt). >> >>> I'll also try out some of the BoomerAMG options to see if that helps. >> >> These should work (this is where all ?PCMatApply()-ready? PC are being tested): https://petsc.org/release/src/ksp/ksp/tutorials/ex79.c.html#line215 >> You can see it?s also testing PCHYPRE + KSPHPDDM on device (but not with HIP). >> I?m aware the performance should not be optimal (see your comment about host/device copies), I?ve money to hire someone to work on this but: a) I need to find the correct engineer/post-doc, b) I currently don?t have good use cases (of course, I could generate a synthetic benchmark, for science). >> So even if you send me the three Mat, a MWE would be appreciated if the KSPMatSolve() is performance-critical for you (see point b) from above). >> >> Thanks, >> Pierre >> >>> Thanks, >>> Sreeram >>> >>> On Thu, Dec 14, 2023, 1:12?PM Pierre Jolivet > wrote: >>>> >>>> >>>>> On 14 Dec 2023, at 8:02?PM, Sreeram R Venkat > wrote: >>>>> >>>>> Hello Pierre, >>>>> >>>>> Thank you for your reply. I tried out the HPDDM CG as you said, and it seems to be doing the batched solves, but the KSP is not converging due to a NaN or Inf being generated. I also noticed there are a lot of host-to-device and device-to-host copies of the matrices (the non-batched KSP solve did not have any memcopies). I have attached dump.0 again. Could you please take a look? >>>> >>>> Yes, but you?d need to send me something I can run with your set of options (if you are more confident doing this in private, you can remove the list from c/c). >>>> Not all BoomerAMG smoothers handle blocks of right-hand sides, and there is not much error checking, so instead of erroring out, this may be the reason why you are getting garbage. >>>> >>>> Thanks, >>>> Pierre >>>> >>>>> Thanks, >>>>> Sreeram >>>>> >>>>> On Thu, Dec 14, 2023 at 12:42?AM Pierre Jolivet > wrote: >>>>>> Hello Sreeram, >>>>>> KSPCG (PETSc implementation of CG) does not handle solves with multiple columns at once. >>>>>> There is only a single native PETSc KSP implementation which handles solves with multiple columns at once: KSPPREONLY. >>>>>> If you use --download-hpddm, you can use a CG (or GMRES, or more advanced methods) implementation which handles solves with multiple columns at once (via -ksp_type hpddm -ksp_hpddm_type cg or KSPSetType(ksp, KSPHPDDM); KSPHPDDMSetType(ksp, KSP_HPDDM_TYPE_CG);). >>>>>> I?m the main author of HPDDM, there is preliminary support for device matrices, but if it?s not working as intended/not faster than column by column, I?d be happy to have a deeper look (maybe in private), because most (if not all) of my users interested in (pseudo-)block Krylov solvers (i.e., solvers that treat right-hand sides in a single go) are using plain host matrices. 
>>>>>> >>>>>> Thanks, >>>>>> Pierre >>>>>> >>>>>> PS: you could have a look at https://www.sciencedirect.com/science/article/abs/pii/S0898122121000055 to understand the philosophy behind block iterative methods in PETSc (and in HPDDM), src/mat/tests/ex237.c, the benchmark I mentioned earlier, was developed in the context of this paper to produce Figures 2-3. Note that this paper is now slightly outdated, since then, PCHYPRE and PCMG (among others) have been made ?PCMatApply()-ready?. >>>>>> >>>>>>> On 13 Dec 2023, at 11:05?PM, Sreeram R Venkat > wrote: >>>>>>> >>>>>>> Hello Pierre, >>>>>>> >>>>>>> I am trying out the KSPMatSolve with the BoomerAMG preconditioner. However, I am noticing that it is still solving column by column (this is stated explicitly in the info dump attached). I looked at the code for KSPMatSolve_Private() and saw that as long as ksp->ops->matsolve is true, it should do the batched solve, though I'm not sure where that gets set. >>>>>>> >>>>>>> I am using the options -pc_type hypre -pc_hypre_type boomeramg when running the code. >>>>>>> >>>>>>> Can you please help me with this? >>>>>>> >>>>>>> Thanks, >>>>>>> Sreeram >>>>>>> >>>>>>> >>>>>>> On Thu, Dec 7, 2023 at 4:04?PM Mark Adams > wrote: >>>>>>>> N.B., AMGX interface is a bit experimental. >>>>>>>> Mark >>>>>>>> >>>>>>>> On Thu, Dec 7, 2023 at 4:11?PM Sreeram R Venkat > wrote: >>>>>>>>> Oh, in that case I will try out BoomerAMG. Getting AMGX to build correctly was also tricky so hopefully the HYPRE build will be easier. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Sreeram >>>>>>>>> >>>>>>>>> On Thu, Dec 7, 2023, 3:03 PM Pierre Jolivet > wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> On 7 Dec 2023, at 9:37?PM, Sreeram R Venkat > wrote: >>>>>>>>>>> >>>>>>>>>>> Thank you Barry and Pierre; I will proceed with the first option. >>>>>>>>>>> >>>>>>>>>>> I want to use the AMGX preconditioner for the KSP. I will try it out and see how it performs. >>>>>>>>>> >>>>>>>>>> Just FYI, AMGX does not handle systems with multiple RHS, and thus has no PCMatApply() implementation. >>>>>>>>>> BoomerAMG does, and there is a PCMatApply_HYPRE_BoomerAMG() implementation. >>>>>>>>>> But let us know if you need assistance figuring things out. >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Pierre >>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Sreeram >>>>>>>>>>> >>>>>>>>>>> On Thu, Dec 7, 2023 at 2:02?PM Pierre Jolivet > wrote: >>>>>>>>>>>> To expand on Barry?s answer, we have observed repeatedly that MatMatMult with MatAIJ performs better than MatMult with MatMAIJ, you can reproduce this on your own with https://petsc.org/release/src/mat/tests/ex237.c.html. >>>>>>>>>>>> Also, I?m guessing you are using some sort of preconditioner within your KSP. >>>>>>>>>>>> Not all are ?KSPMatSolve-ready?, i.e., they may treat blocks of right-hand sides column by column, which is very inefficient. >>>>>>>>>>>> You could run your code with -info dump and send us dump.0 to see what needs to be done on our end to make things more efficient, should you not be satisfied with the current performance of the code. >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Pierre >>>>>>>>>>>> >>>>>>>>>>>>> On 7 Dec 2023, at 8:34?PM, Barry Smith > wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> On Dec 7, 2023, at 1:17?PM, Sreeram R Venkat > wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> I have 2 sequential matrices M and R (both MATSEQAIJCUSPARSE of size n x n) and a vector v of size n*m. v = [v_1 , v_2 ,... , v_m] where v_i has size n. 
The data for v can be stored either in column-major or row-major order. Now, I want to do 2 types of operations: >>>>>>>>>>>>>> >>>>>>>>>>>>>> 1. Matvecs of the form M*v_i = w_i, for i = 1..m. >>>>>>>>>>>>>> 2. KSPSolves of the form R*x_i = v_i, for i = 1..m. >>>>>>>>>>>>>> >>>>>>>>>>>>>> From what I have read on the documentation, I can think of 2 approaches. >>>>>>>>>>>>>> >>>>>>>>>>>>>> 1. Get the pointer to the data in v (column-major) and use it to create a dense matrix V. Then do a MatMatMult with M*V = W, and take the data pointer of W to create the vector w. For KSPSolves, use KSPMatSolve with R and V. >>>>>>>>>>>>>> >>>>>>>>>>>>>> 2. Create a MATMAIJ using M/R and use that for matvecs directly with the vector v. I don't know if KSPSolve with the MATMAIJ will know that it is a multiple RHS system and act accordingly. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Which would be the more efficient option? >>>>>>>>>>>>> >>>>>>>>>>>>> Use 1. >>>>>>>>>>>>>> >>>>>>>>>>>>>> As a side-note, I am also wondering if there is a way to use row-major storage of the vector v. >>>>>>>>>>>>> >>>>>>>>>>>>> No >>>>>>>>>>>>> >>>>>>>>>>>>>> The reason is that this could allow for more coalesced memory access when doing matvecs. >>>>>>>>>>>>> >>>>>>>>>>>>> PETSc matrix-vector products use BLAS GMEV matrix-vector products for the computation so in theory they should already be well-optimized >>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> Sreeram >>>>>>>>>>>> >>>>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From joauma.marichal at uclouvain.be Mon Dec 18 04:09:36 2023 From: joauma.marichal at uclouvain.be (Joauma Marichal) Date: Mon, 18 Dec 2023 10:09:36 +0000 Subject: [petsc-users] [petsc-maint] DMSwarm on multiple processors In-Reply-To: References: Message-ID: Hello, Sorry for the delay. I attach the file that I obtain when running the code with the debug mode. Thanks for your help. Best regards, Joauma De : Matthew Knepley Date : jeudi, 23 novembre 2023 ? 15:32 ? : Joauma Marichal Cc : petsc-maint at mcs.anl.gov , petsc-users at mcs.anl.gov Objet : Re: [petsc-maint] DMSwarm on multiple processors On Thu, Nov 23, 2023 at 9:01?AM Joauma Marichal > wrote: Hello, My problem persists? Is there anything I could try? Yes. It appears to be failing from a call inside PetscSFSetUpRanks(). It does allocation, and the failure is in libc, and it only happens on larger examples, so I suspect some allocation problem. Can you rebuild with debugging and run this example? Then we can see if the allocation fails. Thanks, Matt Thanks a lot. Best regards, Joauma De : Matthew Knepley > Date : mercredi, 25 octobre 2023 ? 14:45 ? : Joauma Marichal > Cc : petsc-maint at mcs.anl.gov >, petsc-users at mcs.anl.gov > Objet : Re: [petsc-maint] DMSwarm on multiple processors On Wed, Oct 25, 2023 at 8:32?AM Joauma Marichal via petsc-maint > wrote: Hello, I am using the DMSwarm library in some Eulerian-Lagrangian approach to have vapor bubbles in water. I have obtained nice results recently and wanted to perform bigger simulations. 
Unfortunately, when I increase the number of processors used to run the simulation, I get the following error: free(): invalid size [cns136:590327] *** Process received signal *** [cns136:590327] Signal: Aborted (6) [cns136:590327] Signal code: (-6) [cns136:590327] [ 0] /lib64/libc.so.6(+0x4eb20)[0x7f56cd4c9b20] [cns136:590327] [ 1] /lib64/libc.so.6(gsignal+0x10f)[0x7f56cd4c9a9f] [cns136:590327] [ 2] /lib64/libc.so.6(abort+0x127)[0x7f56cd49ce05] [cns136:590327] [ 3] /lib64/libc.so.6(+0x91037)[0x7f56cd50c037] [cns136:590327] [ 4] /lib64/libc.so.6(+0x9819c)[0x7f56cd51319c] [cns136:590327] [ 5] /lib64/libc.so.6(+0x99aac)[0x7f56cd514aac] [cns136:590327] [ 6] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscSFSetUpRanks+0x4c4)[0x7f56cea71e64] [cns136:590327] [ 7] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(+0x841642)[0x7f56cea83642] [cns136:590327] [ 8] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscSFSetUp+0x9e)[0x7f56cea7043e] [cns136:590327] [ 9] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(VecScatterCreate+0x164e)[0x7f56cea7bbde] [cns136:590327] [10] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMSetUp_DA_3D+0x3e38)[0x7f56cee84dd8] [cns136:590327] [11] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMSetUp_DA+0xd8)[0x7f56cee9b448] [cns136:590327] [12] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMSetUp+0x20)[0x7f56cededa20] [cns136:590327] [13] ./cobpor[0x4418dc] [cns136:590327] [14] ./cobpor[0x408b63] [cns136:590327] [15] /lib64/libc.so.6(__libc_start_main+0xf3)[0x7f56cd4b5cf3] [cns136:590327] [16] ./cobpor[0x40bdee] [cns136:590327] *** End of error message *** -------------------------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code. Per user-direction, the job has been aborted. -------------------------------------------------------------------------- -------------------------------------------------------------------------- mpiexec noticed that process rank 84 with PID 590327 on node cns136 exited on signal 6 (Aborted). -------------------------------------------------------------------------- When I reduce the number of processors the error disappears and when I run my code without the vapor bubbles it also works. The problem seems to take place at this moment: DMCreate(PETSC_COMM_WORLD,swarm); DMSetType(*swarm,DMSWARM); DMSetDimension(*swarm,3); DMSwarmSetType(*swarm,DMSWARM_PIC); DMSwarmSetCellDM(*swarm,*dmcell); Thanks a lot for your help. Things that would help us track this down: 1) The smallest example where it fails 2) The smallest number of processes where it fails 3) A stack trace of the failure 4) A simple example that we can run that also fails Thanks, Matt Best regards, Joauma -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: slurm-3184479.out Type: application/octet-stream Size: 55415 bytes Desc: slurm-3184479.out URL: From knepley at gmail.com Mon Dec 18 05:00:02 2023 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 18 Dec 2023 06:00:02 -0500 Subject: [petsc-users] [petsc-maint] DMSwarm on multiple processors In-Reply-To: References: Message-ID: On Mon, Dec 18, 2023 at 5:09?AM Joauma Marichal < joauma.marichal at uclouvain.be> wrote: > Hello, > > > > Sorry for the delay. I attach the file that I obtain when running the code > with the debug mode. > Okay, we can now see where this is happening: malloc_consolidate(): invalid chunk size [cns263:3265170] *** Process received signal *** [cns263:3265170] Signal: Aborted (6) [cns263:3265170] Signal code: (-6) [cns263:3265170] [ 0] /lib64/libc.so.6(+0x4eb20)[0x7f3bd9148b20] [cns263:3265170] [ 1] /lib64/libc.so.6(gsignal+0x10f)[0x7f3bd9148a9f] [cns263:3265170] [ 2] /lib64/libc.so.6(abort+0x127)[0x7f3bd911be05] [cns263:3265170] [ 3] /lib64/libc.so.6(+0x91037)[0x7f3bd918b037] [cns263:3265170] [ 4] /lib64/libc.so.6(+0x9819c)[0x7f3bd919219c] [cns263:3265170] [ 5] /lib64/libc.so.6(+0x98b68)[0x7f3bd9192b68] [cns263:3265170] [ 6] /lib64/libc.so.6(+0x9af18)[0x7f3bd9194f18] [cns263:3265170] [ 7] /lib64/libc.so.6(__libc_malloc+0x1e2)[0x7f3bd9196822] [cns263:3265170] [ 8] /lib64/libc.so.6(posix_memalign+0x3c)[0x7f3bd91980fc] [cns263:3265170] [ 9] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscMallocAlign+0x45)[0x7f3bda5f1625] [cns263:3265170] [10] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscMallocA+0x297)[0x7f3bda5f1b07] [cns263:3265170] [11] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMCreate+0x5b)[0x7f3bdaa73c1b] [cns263:3265170] [12] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMDACreate+0x9)[0x7f3bdab0a2f9] [cns263:3265170] [13] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMDACreate3d+0x9a)[0x7f3bdab07dea] [cns263:3265170] [14] ./cobpor[0x402de8] [cns263:3265170] [15] /lib64/libc.so.6(__libc_start_main+0xf3)[0x7f3bd9134cf3] [cns263:3265170] [16] ./cobpor[0x40304e] [cns263:3265170] *** End of error message *** However, this is not great. First, the amount of memory being allocated is quite small, and this does not appear to be an Out of Memory error. Second, the error occurs in libc: malloc_consolidate(): invalid chunk size which means something is wrong internally. I agree with this analysis ( https://stackoverflow.com/questions/18760999/sample-example-program-to-get-the-malloc-consolidate-error) that says you have probably overwritten memory somewhere in your code. I recommend running under valgrind, or using Address Sanitizer from clang. Thanks, Matt Thanks for your help. > > > > Best regards, > > > > Joauma > > > > *De : *Matthew Knepley > *Date : *jeudi, 23 novembre 2023 ? 15:32 > *? : *Joauma Marichal > *Cc : *petsc-maint at mcs.anl.gov , > petsc-users at mcs.anl.gov > *Objet : *Re: [petsc-maint] DMSwarm on multiple processors > > On Thu, Nov 23, 2023 at 9:01?AM Joauma Marichal < > joauma.marichal at uclouvain.be> wrote: > > Hello, > > > > My problem persists? Is there anything I could try? > > > > Yes. It appears to be failing from a call inside PetscSFSetUpRanks(). It > does allocation, and the failure > > is in libc, and it only happens on larger examples, so I suspect some > allocation problem. Can you rebuild with debugging and run this example? > Then we can see if the allocation fails. 
> > > > Thanks, > > Matt > > > > Thanks a lot. > > > > Best regards, > > > > Joauma > > > > *De : *Matthew Knepley > *Date : *mercredi, 25 octobre 2023 ? 14:45 > *? : *Joauma Marichal > *Cc : *petsc-maint at mcs.anl.gov , > petsc-users at mcs.anl.gov > *Objet : *Re: [petsc-maint] DMSwarm on multiple processors > > On Wed, Oct 25, 2023 at 8:32?AM Joauma Marichal via petsc-maint < > petsc-maint at mcs.anl.gov> wrote: > > Hello, > > > > I am using the DMSwarm library in some Eulerian-Lagrangian approach to > have vapor bubbles in water. > > I have obtained nice results recently and wanted to perform bigger > simulations. Unfortunately, when I increase the number of processors used > to run the simulation, I get the following error: > > > > free(): invalid size > > [cns136:590327] *** Process received signal *** > > [cns136:590327] Signal: Aborted (6) > > [cns136:590327] Signal code: (-6) > > [cns136:590327] [ 0] /lib64/libc.so.6(+0x4eb20)[0x7f56cd4c9b20] > > [cns136:590327] [ 1] /lib64/libc.so.6(gsignal+0x10f)[0x7f56cd4c9a9f] > > [cns136:590327] [ 2] /lib64/libc.so.6(abort+0x127)[0x7f56cd49ce05] > > [cns136:590327] [ 3] /lib64/libc.so.6(+0x91037)[0x7f56cd50c037] > > [cns136:590327] [ 4] /lib64/libc.so.6(+0x9819c)[0x7f56cd51319c] > > [cns136:590327] [ 5] /lib64/libc.so.6(+0x99aac)[0x7f56cd514aac] > > [cns136:590327] [ 6] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscSFSetUpRanks+0x4c4)[0x7f56cea71e64] > > [cns136:590327] [ 7] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(+0x841642)[0x7f56cea83642] > > [cns136:590327] [ 8] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscSFSetUp+0x9e)[0x7f56cea7043e] > > [cns136:590327] [ 9] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(VecScatterCreate+0x164e)[0x7f56cea7bbde] > > [cns136:590327] [10] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMSetUp_DA_3D+0x3e38)[0x7f56cee84dd8] > > [cns136:590327] [11] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMSetUp_DA+0xd8)[0x7f56cee9b448] > > [cns136:590327] [12] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMSetUp+0x20)[0x7f56cededa20] > > [cns136:590327] [13] ./cobpor[0x4418dc] > > [cns136:590327] [14] ./cobpor[0x408b63] > > [cns136:590327] [15] > /lib64/libc.so.6(__libc_start_main+0xf3)[0x7f56cd4b5cf3] > > [cns136:590327] [16] ./cobpor[0x40bdee] > > [cns136:590327] *** End of error message *** > > -------------------------------------------------------------------------- > > Primary job terminated normally, but 1 process returned > > a non-zero exit code. Per user-direction, the job has been aborted. > > -------------------------------------------------------------------------- > > -------------------------------------------------------------------------- > > mpiexec noticed that process rank 84 with PID 590327 on node cns136 exited > on signal 6 (Aborted). > > -------------------------------------------------------------------------- > > > > When I reduce the number of processors the error disappears and when I run > my code without the vapor bubbles it also works. > > The problem seems to take place at this moment: > > > > DMCreate(PETSC_COMM_WORLD,swarm); > > DMSetType(*swarm,DMSWARM); > > DMSetDimension(*swarm,3); > > DMSwarmSetType(*swarm,DMSWARM_PIC); > > DMSwarmSetCellDM(*swarm,*dmcell); > > > > > > Thanks a lot for your help. 
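For context, the creation sequence quoted just above is normally completed along the following lines before particles are used; this is only a sketch, with the field name "mass", the sizes, and the comments being illustrative rather than taken from the actual code:

/* continuation of the quoted setup; swarm is still a DM* and dmcell the background DM */
PetscCall(DMSwarmRegisterPetscDatatypeField(*swarm, "mass", 1, PETSC_REAL)); /* per-particle field */
PetscCall(DMSwarmFinalizeFieldRegister(*swarm));        /* no further fields after this call */
PetscCall(DMSwarmSetLocalSizes(*swarm, 1000, 100));     /* local particle count + resize buffer */
/* fill the built-in DMSwarmPICField_coor field with particle coordinates, then */
PetscCall(DMSwarmMigrate(*swarm, PETSC_TRUE));          /* send particles to the ranks owning their cells */

Note that both stack traces quoted in this thread abort inside the background DMDA calls (DMSetUp_DA_3D, DMDACreate3d) rather than inside DMSwarm itself.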
> > > > Things that would help us track this down: > > > > 1) The smallest example where it fails > > > > 2) The smallest number of processes where it fails > > > > 3) A stack trace of the failure > > > > 4) A simple example that we can run that also fails > > > > Thanks, > > > > Matt > > > > Best regards, > > > > Joauma > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From facklerpw at ornl.gov Mon Dec 18 12:54:27 2023 From: facklerpw at ornl.gov (Fackler, Philip) Date: Mon, 18 Dec 2023 18:54:27 +0000 Subject: [petsc-users] [EXTERNAL] Re: Call to DMSetMatrixPreallocateSkip not changing allocation behavior In-Reply-To: <87wmtgcw4m.fsf@jedbrown.org> References: <871qboech6.fsf@jedbrown.org> <87wmtgcw4m.fsf@jedbrown.org> Message-ID: Jed, That seems to have worked (ridiculously well). It's now 55MB, and it's happening in the call to MatSetPreallocationCOO. Thank you, Philip Fackler Research Software Engineer, Application Engineering Group Advanced Computing Systems Research Section Computer Science and Mathematics Division Oak Ridge National Laboratory ________________________________ From: Jed Brown Sent: Thursday, December 14, 2023 16:27 To: Fackler, Philip ; petsc-users at mcs.anl.gov ; xolotl-psi-development at lists.sourceforge.net Subject: [EXTERNAL] Re: [petsc-users] Call to DMSetMatrixPreallocateSkip not changing allocation behavior I had a one-character typo in the diff above. This MR to release should work now. https://urldefense.us/v2/url?u=https-3A__gitlab.com_petsc_petsc_-2D_merge-5Frequests_7120&d=DwIBAg&c=v4IIwRuZAmwupIjowmMWUmLasxPEgYsgNI-O7C4ViYc&r=DAkLCjn8leYU-uJ-kfNEQMhPZWx9lzc4d5KgIR-RZWQ&m=v9sHqomCGBRWotign4NcwYwOpszOJehUGs_EO3eGn4SSZqxnfK7Iv15-X8nO1lii&s=h_jIP-6WcIjR6LssfGrV6Z2DojlN_w7Me4-a4rBE074&e= Jed Brown writes: > 17 GB for a 1D DMDA, wow. :-) > > Could you try applying this diff to make it work for DMDA (it's currently handled by DMPlex)? > > diff --git i/src/dm/impls/da/fdda.c w/src/dm/impls/da/fdda.c > index cad4d926504..bd2a3bda635 100644 > --- i/src/dm/impls/da/fdda.c > +++ w/src/dm/impls/da/fdda.c > @@ -675,19 +675,21 @@ PetscErrorCode DMCreateMatrix_DA(DM da, Mat *J) > specialized setting routines depend only on the particular preallocation > details of the matrix, not the type itself. 
> */ > - PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPIAIJSetPreallocation_C", &aij)); > - if (!aij) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqAIJSetPreallocation_C", &aij)); > - if (!aij) { > - PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPIBAIJSetPreallocation_C", &baij)); > - if (!baij) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqBAIJSetPreallocation_C", &baij)); > - if (!baij) { > - PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPISBAIJSetPreallocation_C", &sbaij)); > - if (!sbaij) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqSBAIJSetPreallocation_C", &sbaij)); > - if (!sbaij) { > - PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPISELLSetPreallocation_C", &sell)); > - if (!sell) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqSELLSetPreallocation_C", &sell)); > + if (!dm->prealloc_skip) { // Flag is likely set when user intends to use MatSetPreallocationCOO() > + PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPIAIJSetPreallocation_C", &aij)); > + if (!aij) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqAIJSetPreallocation_C", &aij)); > + if (!aij) { > + PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPIBAIJSetPreallocation_C", &baij)); > + if (!baij) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqBAIJSetPreallocation_C", &baij)); > + if (!baij) { > + PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPISBAIJSetPreallocation_C", &sbaij)); > + if (!sbaij) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqSBAIJSetPreallocation_C", &sbaij)); > + if (!sbaij) { > + PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPISELLSetPreallocation_C", &sell)); > + if (!sell) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqSELLSetPreallocation_C", &sell)); > + } > + if (!sell) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatISSetPreallocation_C", &is)); > } > - if (!sell) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatISSetPreallocation_C", &is)); > } > } > if (aij) { > > > "Fackler, Philip via petsc-users" writes: > >> I'm using the following sequence of functions related to the Jacobian matrix: >> >> DMDACreate1d(..., &da); >> DMSetFromOptions(da); >> DMSetUp(da); >> DMSetMatType(da, MATAIJKOKKOS); >> DMSetMatrixPreallocateSkip(da, PETSC_TRUE); >> Mat J; >> DMCreateMatrix(da, &J); >> MatSetPreallocationCOO(J, ...); >> >> I recently added the call to DMSetMatrixPreallocateSkip, hoping the allocation would be delayed to MatSetPreallocationCOO, and that it would require less memory. The documentation says that the data structures will not be preallocated. The following data from heaptrack shows that the allocation is still happening in the call to DMCreateMatrix. >> >> [cid:bda9ef12-a46f-47b2-9b9b-a4b2808b6b13] >> >> Can someone help me understand this? >> >> Thanks, >> >> Philip Fackler >> Research Software Engineer, Application Engineering Group >> Advanced Computing Systems Research Section >> Computer Science and Mathematics Division >> Oak Ridge National Laboratory -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jed at jedbrown.org Mon Dec 18 13:47:06 2023 From: jed at jedbrown.org (Jed Brown) Date: Mon, 18 Dec 2023 12:47:06 -0700 Subject: [petsc-users] [EXTERNAL] Re: Call to DMSetMatrixPreallocateSkip not changing allocation behavior In-Reply-To: References: <871qboech6.fsf@jedbrown.org> <87wmtgcw4m.fsf@jedbrown.org> Message-ID: <874jgfb8ed.fsf@jedbrown.org> Great, thanks for letting us know. It'll merge to release shortly and thus be in petsc >= 3.20.3. "Fackler, Philip" writes: > Jed, > > That seems to have worked (ridiculously well). It's now 55MB, and it's happening in the call to MatSetPreallocationCOO. > > Thank you, > > Philip Fackler > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > Oak Ridge National Laboratory > ________________________________ > From: Jed Brown > Sent: Thursday, December 14, 2023 16:27 > To: Fackler, Philip ; petsc-users at mcs.anl.gov ; xolotl-psi-development at lists.sourceforge.net > Subject: [EXTERNAL] Re: [petsc-users] Call to DMSetMatrixPreallocateSkip not changing allocation behavior > > I had a one-character typo in the diff above. This MR to release should work now. > > https://urldefense.us/v2/url?u=https-3A__gitlab.com_petsc_petsc_-2D_merge-5Frequests_7120&d=DwIBAg&c=v4IIwRuZAmwupIjowmMWUmLasxPEgYsgNI-O7C4ViYc&r=DAkLCjn8leYU-uJ-kfNEQMhPZWx9lzc4d5KgIR-RZWQ&m=v9sHqomCGBRWotign4NcwYwOpszOJehUGs_EO3eGn4SSZqxnfK7Iv15-X8nO1lii&s=h_jIP-6WcIjR6LssfGrV6Z2DojlN_w7Me4-a4rBE074&e= > > Jed Brown writes: > >> 17 GB for a 1D DMDA, wow. :-) >> >> Could you try applying this diff to make it work for DMDA (it's currently handled by DMPlex)? >> >> diff --git i/src/dm/impls/da/fdda.c w/src/dm/impls/da/fdda.c >> index cad4d926504..bd2a3bda635 100644 >> --- i/src/dm/impls/da/fdda.c >> +++ w/src/dm/impls/da/fdda.c >> @@ -675,19 +675,21 @@ PetscErrorCode DMCreateMatrix_DA(DM da, Mat *J) >> specialized setting routines depend only on the particular preallocation >> details of the matrix, not the type itself. 
>> */ >> - PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPIAIJSetPreallocation_C", &aij)); >> - if (!aij) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqAIJSetPreallocation_C", &aij)); >> - if (!aij) { >> - PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPIBAIJSetPreallocation_C", &baij)); >> - if (!baij) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqBAIJSetPreallocation_C", &baij)); >> - if (!baij) { >> - PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPISBAIJSetPreallocation_C", &sbaij)); >> - if (!sbaij) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqSBAIJSetPreallocation_C", &sbaij)); >> - if (!sbaij) { >> - PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPISELLSetPreallocation_C", &sell)); >> - if (!sell) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqSELLSetPreallocation_C", &sell)); >> + if (!dm->prealloc_skip) { // Flag is likely set when user intends to use MatSetPreallocationCOO() >> + PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPIAIJSetPreallocation_C", &aij)); >> + if (!aij) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqAIJSetPreallocation_C", &aij)); >> + if (!aij) { >> + PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPIBAIJSetPreallocation_C", &baij)); >> + if (!baij) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqBAIJSetPreallocation_C", &baij)); >> + if (!baij) { >> + PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPISBAIJSetPreallocation_C", &sbaij)); >> + if (!sbaij) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqSBAIJSetPreallocation_C", &sbaij)); >> + if (!sbaij) { >> + PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatMPISELLSetPreallocation_C", &sell)); >> + if (!sell) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatSeqSELLSetPreallocation_C", &sell)); >> + } >> + if (!sell) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatISSetPreallocation_C", &is)); >> } >> - if (!sell) PetscCall(PetscObjectQueryFunction((PetscObject)A, "MatISSetPreallocation_C", &is)); >> } >> } >> if (aij) { >> >> >> "Fackler, Philip via petsc-users" writes: >> >>> I'm using the following sequence of functions related to the Jacobian matrix: >>> >>> DMDACreate1d(..., &da); >>> DMSetFromOptions(da); >>> DMSetUp(da); >>> DMSetMatType(da, MATAIJKOKKOS); >>> DMSetMatrixPreallocateSkip(da, PETSC_TRUE); >>> Mat J; >>> DMCreateMatrix(da, &J); >>> MatSetPreallocationCOO(J, ...); >>> >>> I recently added the call to DMSetMatrixPreallocateSkip, hoping the allocation would be delayed to MatSetPreallocationCOO, and that it would require less memory. The documentation says that the data structures will not be preallocated. The following data from heaptrack shows that the allocation is still happening in the call to DMCreateMatrix. >>> >>> [cid:bda9ef12-a46f-47b2-9b9b-a4b2808b6b13] >>> >>> Can someone help me understand this? >>> >>> Thanks, >>> >>> Philip Fackler >>> Research Software Engineer, Application Engineering Group >>> Advanced Computing Systems Research Section >>> Computer Science and Mathematics Division >>> Oak Ridge National Laboratory From joauma.marichal at uclouvain.be Tue Dec 19 04:10:56 2023 From: joauma.marichal at uclouvain.be (Joauma Marichal) Date: Tue, 19 Dec 2023 10:10:56 +0000 Subject: [petsc-users] [petsc-maint] DMSwarm on multiple processors In-Reply-To: References: Message-ID: Hello, I have used Address Sanitizer to check any memory errors. On my computer, no errors are found. 
Unfortunately, on the supercomputer that I am using, I get lots of errors? I attach my log files (running on 1 and 70 procs). Do you have any idea of what I could do? Thanks a lot for your help. Best regards, Joauma De : Matthew Knepley Date : lundi, 18 d?cembre 2023 ? 12:00 ? : Joauma Marichal Cc : petsc-maint at mcs.anl.gov , petsc-users at mcs.anl.gov Objet : Re: [petsc-maint] DMSwarm on multiple processors On Mon, Dec 18, 2023 at 5:09?AM Joauma Marichal > wrote: Hello, Sorry for the delay. I attach the file that I obtain when running the code with the debug mode. Okay, we can now see where this is happening: malloc_consolidate(): invalid chunk size [cns263:3265170] *** Process received signal *** [cns263:3265170] Signal: Aborted (6) [cns263:3265170] Signal code: (-6) [cns263:3265170] [ 0] /lib64/libc.so.6(+0x4eb20)[0x7f3bd9148b20] [cns263:3265170] [ 1] /lib64/libc.so.6(gsignal+0x10f)[0x7f3bd9148a9f] [cns263:3265170] [ 2] /lib64/libc.so.6(abort+0x127)[0x7f3bd911be05] [cns263:3265170] [ 3] /lib64/libc.so.6(+0x91037)[0x7f3bd918b037] [cns263:3265170] [ 4] /lib64/libc.so.6(+0x9819c)[0x7f3bd919219c] [cns263:3265170] [ 5] /lib64/libc.so.6(+0x98b68)[0x7f3bd9192b68] [cns263:3265170] [ 6] /lib64/libc.so.6(+0x9af18)[0x7f3bd9194f18] [cns263:3265170] [ 7] /lib64/libc.so.6(__libc_malloc+0x1e2)[0x7f3bd9196822] [cns263:3265170] [ 8] /lib64/libc.so.6(posix_memalign+0x3c)[0x7f3bd91980fc] [cns263:3265170] [ 9] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscMallocAlign+0x45)[0x7f3bda5f1625] [cns263:3265170] [10] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscMallocA+0x297)[0x7f3bda5f1b07] [cns263:3265170] [11] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMCreate+0x5b)[0x7f3bdaa73c1b] [cns263:3265170] [12] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMDACreate+0x9)[0x7f3bdab0a2f9] [cns263:3265170] [13] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMDACreate3d+0x9a)[0x7f3bdab07dea] [cns263:3265170] [14] ./cobpor[0x402de8] [cns263:3265170] [15] /lib64/libc.so.6(__libc_start_main+0xf3)[0x7f3bd9134cf3] [cns263:3265170] [16] ./cobpor[0x40304e] [cns263:3265170] *** End of error message *** However, this is not great. First, the amount of memory being allocated is quite small, and this does not appear to be an Out of Memory error. Second, the error occurs in libc: malloc_consolidate(): invalid chunk size which means something is wrong internally. I agree with this analysis (https://stackoverflow.com/questions/18760999/sample-example-program-to-get-the-malloc-consolidate-error) that says you have probably overwritten memory somewhere in your code. I recommend running under valgrind, or using Address Sanitizer from clang. Thanks, Matt Thanks for your help. Best regards, Joauma De : Matthew Knepley > Date : jeudi, 23 novembre 2023 ? 15:32 ? : Joauma Marichal > Cc : petsc-maint at mcs.anl.gov >, petsc-users at mcs.anl.gov > Objet : Re: [petsc-maint] DMSwarm on multiple processors On Thu, Nov 23, 2023 at 9:01?AM Joauma Marichal > wrote: Hello, My problem persists? Is there anything I could try? Yes. It appears to be failing from a call inside PetscSFSetUpRanks(). It does allocation, and the failure is in libc, and it only happens on larger examples, so I suspect some allocation problem. Can you rebuild with debugging and run this example? Then we can see if the allocation fails. Thanks, Matt Thanks a lot. Best regards, Joauma De : Matthew Knepley > Date : mercredi, 25 octobre 2023 ? 
14:45 ? : Joauma Marichal > Cc : petsc-maint at mcs.anl.gov >, petsc-users at mcs.anl.gov > Objet : Re: [petsc-maint] DMSwarm on multiple processors On Wed, Oct 25, 2023 at 8:32?AM Joauma Marichal via petsc-maint > wrote: Hello, I am using the DMSwarm library in some Eulerian-Lagrangian approach to have vapor bubbles in water. I have obtained nice results recently and wanted to perform bigger simulations. Unfortunately, when I increase the number of processors used to run the simulation, I get the following error: free(): invalid size [cns136:590327] *** Process received signal *** [cns136:590327] Signal: Aborted (6) [cns136:590327] Signal code: (-6) [cns136:590327] [ 0] /lib64/libc.so.6(+0x4eb20)[0x7f56cd4c9b20] [cns136:590327] [ 1] /lib64/libc.so.6(gsignal+0x10f)[0x7f56cd4c9a9f] [cns136:590327] [ 2] /lib64/libc.so.6(abort+0x127)[0x7f56cd49ce05] [cns136:590327] [ 3] /lib64/libc.so.6(+0x91037)[0x7f56cd50c037] [cns136:590327] [ 4] /lib64/libc.so.6(+0x9819c)[0x7f56cd51319c] [cns136:590327] [ 5] /lib64/libc.so.6(+0x99aac)[0x7f56cd514aac] [cns136:590327] [ 6] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscSFSetUpRanks+0x4c4)[0x7f56cea71e64] [cns136:590327] [ 7] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(+0x841642)[0x7f56cea83642] [cns136:590327] [ 8] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscSFSetUp+0x9e)[0x7f56cea7043e] [cns136:590327] [ 9] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(VecScatterCreate+0x164e)[0x7f56cea7bbde] [cns136:590327] [10] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMSetUp_DA_3D+0x3e38)[0x7f56cee84dd8] [cns136:590327] [11] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMSetUp_DA+0xd8)[0x7f56cee9b448] [cns136:590327] [12] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMSetUp+0x20)[0x7f56cededa20] [cns136:590327] [13] ./cobpor[0x4418dc] [cns136:590327] [14] ./cobpor[0x408b63] [cns136:590327] [15] /lib64/libc.so.6(__libc_start_main+0xf3)[0x7f56cd4b5cf3] [cns136:590327] [16] ./cobpor[0x40bdee] [cns136:590327] *** End of error message *** -------------------------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code. Per user-direction, the job has been aborted. -------------------------------------------------------------------------- -------------------------------------------------------------------------- mpiexec noticed that process rank 84 with PID 590327 on node cns136 exited on signal 6 (Aborted). -------------------------------------------------------------------------- When I reduce the number of processors the error disappears and when I run my code without the vapor bubbles it also works. The problem seems to take place at this moment: DMCreate(PETSC_COMM_WORLD,swarm); DMSetType(*swarm,DMSWARM); DMSetDimension(*swarm,3); DMSwarmSetType(*swarm,DMSWARM_PIC); DMSwarmSetCellDM(*swarm,*dmcell); Thanks a lot for your help. Things that would help us track this down: 1) The smallest example where it fails 2) The smallest number of processes where it fails 3) A stack trace of the failure 4) A simple example that we can run that also fails Thanks, Matt Best regards, Joauma -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: log_1proc Type: application/octet-stream Size: 14935 bytes Desc: log_1proc URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: log_70proc Type: application/octet-stream Size: 174297 bytes Desc: log_70proc URL: From knepley at gmail.com Tue Dec 19 07:29:58 2023 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 19 Dec 2023 08:29:58 -0500 Subject: [petsc-users] [petsc-maint] DMSwarm on multiple processors In-Reply-To: References: Message-ID: On Tue, Dec 19, 2023 at 5:11?AM Joauma Marichal < joauma.marichal at uclouvain.be> wrote: > Hello, > > > > I have used Address Sanitizer to check any memory errors. On my computer, > no errors are found. Unfortunately, on the supercomputer that I am using, I > get lots of errors? I attach my log files (running on 1 and 70 procs). > > Do you have any idea of what I could do? > Run the same parallel configuration as you do on the supercomputer. If that is fine, I would suggest Address Sanitizer there. Something is corrupting the stack, and it appears that it is connected to that machine, rather than the library. Do you have access to a second parallel machine? Thanks, Matt > Thanks a lot for your help. > > > > Best regards, > > > > Joauma > > > > *De : *Matthew Knepley > *Date : *lundi, 18 d?cembre 2023 ? 12:00 > *? : *Joauma Marichal > *Cc : *petsc-maint at mcs.anl.gov , > petsc-users at mcs.anl.gov > *Objet : *Re: [petsc-maint] DMSwarm on multiple processors > > On Mon, Dec 18, 2023 at 5:09?AM Joauma Marichal < > joauma.marichal at uclouvain.be> wrote: > > Hello, > > > > Sorry for the delay. I attach the file that I obtain when running the code > with the debug mode. 
> > > > Okay, we can now see where this is happening: > > > > malloc_consolidate(): invalid chunk size > [cns263:3265170] *** Process received signal *** > [cns263:3265170] Signal: Aborted (6) > [cns263:3265170] Signal code: (-6) > [cns263:3265170] [ 0] /lib64/libc.so.6(+0x4eb20)[0x7f3bd9148b20] > [cns263:3265170] [ 1] /lib64/libc.so.6(gsignal+0x10f)[0x7f3bd9148a9f] > [cns263:3265170] [ 2] /lib64/libc.so.6(abort+0x127)[0x7f3bd911be05] > [cns263:3265170] [ 3] /lib64/libc.so.6(+0x91037)[0x7f3bd918b037] > [cns263:3265170] [ 4] /lib64/libc.so.6(+0x9819c)[0x7f3bd919219c] > [cns263:3265170] [ 5] /lib64/libc.so.6(+0x98b68)[0x7f3bd9192b68] > [cns263:3265170] [ 6] /lib64/libc.so.6(+0x9af18)[0x7f3bd9194f18] > [cns263:3265170] [ 7] /lib64/libc.so.6(__libc_malloc+0x1e2)[0x7f3bd9196822] > [cns263:3265170] [ 8] /lib64/libc.so.6(posix_memalign+0x3c)[0x7f3bd91980fc] > [cns263:3265170] [ 9] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscMallocAlign+0x45)[0x7f3bda5f1625] > [cns263:3265170] [10] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscMallocA+0x297)[0x7f3bda5f1b07] > [cns263:3265170] [11] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMCreate+0x5b)[0x7f3bdaa73c1b] > [cns263:3265170] [12] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMDACreate+0x9)[0x7f3bdab0a2f9] > [cns263:3265170] [13] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMDACreate3d+0x9a)[0x7f3bdab07dea] > [cns263:3265170] [14] ./cobpor[0x402de8] > [cns263:3265170] [15] > /lib64/libc.so.6(__libc_start_main+0xf3)[0x7f3bd9134cf3] > [cns263:3265170] [16] ./cobpor[0x40304e] > [cns263:3265170] *** End of error message *** > > > > However, this is not great. First, the amount of memory being allocated is > quite small, and this does not appear to be an Out of Memory error. Second, > the error occurs in libc: > > > > malloc_consolidate(): invalid chunk size > > > > which means something is wrong internally. I agree with this analysis ( > https://stackoverflow.com/questions/18760999/sample-example-program-to-get-the-malloc-consolidate-error) > that says you have probably overwritten memory somewhere in your code. I > recommend running under valgrind, or using Address Sanitizer from clang. > > > > Thanks, > > > > Matt > > > > Thanks for your help. > > > > Best regards, > > > > Joauma > > > > *De : *Matthew Knepley > *Date : *jeudi, 23 novembre 2023 ? 15:32 > *? : *Joauma Marichal > *Cc : *petsc-maint at mcs.anl.gov , > petsc-users at mcs.anl.gov > *Objet : *Re: [petsc-maint] DMSwarm on multiple processors > > On Thu, Nov 23, 2023 at 9:01?AM Joauma Marichal < > joauma.marichal at uclouvain.be> wrote: > > Hello, > > > > My problem persists? Is there anything I could try? > > > > Yes. It appears to be failing from a call inside PetscSFSetUpRanks(). It > does allocation, and the failure > > is in libc, and it only happens on larger examples, so I suspect some > allocation problem. Can you rebuild with debugging and run this example? > Then we can see if the allocation fails. > > > > Thanks, > > Matt > > > > Thanks a lot. > > > > Best regards, > > > > Joauma > > > > *De : *Matthew Knepley > *Date : *mercredi, 25 octobre 2023 ? 14:45 > *? 
: *Joauma Marichal > *Cc : *petsc-maint at mcs.anl.gov , > petsc-users at mcs.anl.gov > *Objet : *Re: [petsc-maint] DMSwarm on multiple processors > > On Wed, Oct 25, 2023 at 8:32?AM Joauma Marichal via petsc-maint < > petsc-maint at mcs.anl.gov> wrote: > > Hello, > > > > I am using the DMSwarm library in some Eulerian-Lagrangian approach to > have vapor bubbles in water. > > I have obtained nice results recently and wanted to perform bigger > simulations. Unfortunately, when I increase the number of processors used > to run the simulation, I get the following error: > > > > free(): invalid size > > [cns136:590327] *** Process received signal *** > > [cns136:590327] Signal: Aborted (6) > > [cns136:590327] Signal code: (-6) > > [cns136:590327] [ 0] /lib64/libc.so.6(+0x4eb20)[0x7f56cd4c9b20] > > [cns136:590327] [ 1] /lib64/libc.so.6(gsignal+0x10f)[0x7f56cd4c9a9f] > > [cns136:590327] [ 2] /lib64/libc.so.6(abort+0x127)[0x7f56cd49ce05] > > [cns136:590327] [ 3] /lib64/libc.so.6(+0x91037)[0x7f56cd50c037] > > [cns136:590327] [ 4] /lib64/libc.so.6(+0x9819c)[0x7f56cd51319c] > > [cns136:590327] [ 5] /lib64/libc.so.6(+0x99aac)[0x7f56cd514aac] > > [cns136:590327] [ 6] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscSFSetUpRanks+0x4c4)[0x7f56cea71e64] > > [cns136:590327] [ 7] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(+0x841642)[0x7f56cea83642] > > [cns136:590327] [ 8] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscSFSetUp+0x9e)[0x7f56cea7043e] > > [cns136:590327] [ 9] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(VecScatterCreate+0x164e)[0x7f56cea7bbde] > > [cns136:590327] [10] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMSetUp_DA_3D+0x3e38)[0x7f56cee84dd8] > > [cns136:590327] [11] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMSetUp_DA+0xd8)[0x7f56cee9b448] > > [cns136:590327] [12] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMSetUp+0x20)[0x7f56cededa20] > > [cns136:590327] [13] ./cobpor[0x4418dc] > > [cns136:590327] [14] ./cobpor[0x408b63] > > [cns136:590327] [15] > /lib64/libc.so.6(__libc_start_main+0xf3)[0x7f56cd4b5cf3] > > [cns136:590327] [16] ./cobpor[0x40bdee] > > [cns136:590327] *** End of error message *** > > -------------------------------------------------------------------------- > > Primary job terminated normally, but 1 process returned > > a non-zero exit code. Per user-direction, the job has been aborted. > > -------------------------------------------------------------------------- > > -------------------------------------------------------------------------- > > mpiexec noticed that process rank 84 with PID 590327 on node cns136 exited > on signal 6 (Aborted). > > -------------------------------------------------------------------------- > > > > When I reduce the number of processors the error disappears and when I run > my code without the vapor bubbles it also works. > > The problem seems to take place at this moment: > > > > DMCreate(PETSC_COMM_WORLD,swarm); > > DMSetType(*swarm,DMSWARM); > > DMSetDimension(*swarm,3); > > DMSwarmSetType(*swarm,DMSWARM_PIC); > > DMSwarmSetCellDM(*swarm,*dmcell); > > > > > > Thanks a lot for your help. 
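For reference, a minimal error-checked version of the swarm setup quoted just above could look like the following sketch in C. It assumes dmcell is a DMDA that has already been created and passed through DMSetUp(); the "radius" field, the buffer size, and the helper name CreateBubbleSwarm are illustrative only and not taken from the original code.

    #include <petscdmda.h>
    #include <petscdmswarm.h>

    /* Sketch: swarm setup with error checking. Assumes "dmcell" is an
       already created and set-up DMDA for the background Eulerian grid. */
    static PetscErrorCode CreateBubbleSwarm(DM dmcell, DM *swarm)
    {
      PetscFunctionBeginUser;
      PetscCall(DMCreate(PETSC_COMM_WORLD, swarm));
      PetscCall(DMSetType(*swarm, DMSWARM));
      PetscCall(DMSetDimension(*swarm, 3));
      PetscCall(DMSwarmSetType(*swarm, DMSWARM_PIC));
      PetscCall(DMSwarmSetCellDM(*swarm, dmcell));
      /* Illustrative field registration: one scalar per particle. */
      PetscCall(DMSwarmRegisterPetscDatatypeField(*swarm, "radius", 1, PETSC_REAL));
      PetscCall(DMSwarmFinalizeFieldRegister(*swarm));
      PetscCall(DMSwarmSetLocalSizes(*swarm, 0, 4)); /* start empty, small resize buffer */
      PetscFunctionReturn(0);
    }

Wrapping each call in PetscCall() does not by itself fix memory corruption, but it ensures any PETSc-detected failure is reported with a full traceback at the offending call instead of surfacing later as a bare libc abort like the ones in the logs.
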
> > > > Things that would help us track this down: > > > > 1) The smallest example where it fails > > > > 2) The smallest number of processes where it fails > > > > 3) A stack trace of the failure > > > > 4) A simple example that we can run that also fails > > > > Thanks, > > > > Matt > > > > Best regards, > > > > Joauma > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From sawsan.shatanawi at wsu.edu Tue Dec 19 19:28:52 2023 From: sawsan.shatanawi at wsu.edu (Shatanawi, Sawsan Muhammad) Date: Wed, 20 Dec 2023 01:28:52 +0000 Subject: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code Message-ID: Hello everyone, I hope this email finds you well. My Name is Sawsan Shatanawi, and I am currently working on developing a Fortran code for simulating groundwater flow in a 3D system. The code involves solving a nonlinear system, and I have created the matrix to be solved using the PCG solver and Picard iteration. However, when I tried to assign it as a PETSc matrix I started getting a lot of error messages. I am kindly asking if someone can help me, I would be happy to share my code with him/her. Please find the attached file contains a list of errors I have gotten Thank you in advance for your time and assistance. Best regards, Sawsan -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: out.txt URL: From srvenkat at utexas.edu Wed Dec 20 01:42:27 2023 From: srvenkat at utexas.edu (Sreeram R Venkat) Date: Wed, 20 Dec 2023 13:12:27 +0530 Subject: [petsc-users] Matvecs and KSPSolves with multiple vectors In-Reply-To: <7921468C-F4B9-4BC0-ADD1-2F84619E8A7E@joliv.et> References: <855B7267-E3F9-43CC-9EBE-96276E9A8E42@petsc.dev> <9F227CDE-815C-4446-A424-0BBDA38B1E79@joliv.et> <186F2922-CD1B-4778-BDB3-8DA6EAB8D976@joliv.et> <7921468C-F4B9-4BC0-ADD1-2F84619E8A7E@joliv.et> Message-ID: Ok, I think the error I'm getting has something to do with how the multiple solves are being done in succession. I'll try to see if there's anything I'm doing wrong there. One question about the -pc_type lu -ksp_type preonly method: do you know which parts of the solve (factorization/triangular solves) are done on host and which are done on device? Thanks, Sreeram On Sat, Dec 16, 2023 at 10:56?PM Pierre Jolivet wrote: > Unfortunately, I am not able to reproduce such a failure with your input > matrix. > I?ve used ex79 that I linked previously and the system is properly solved. 
> $ ./ex79 -pc_type hypre -ksp_type hpddm -ksp_hpddm_type cg > -ksp_converged_reason -ksp_view_mat ascii::ascii_info -ksp_view_rhs > ascii::ascii_info > Linear solve converged due to CONVERGED_RTOL iterations 6 > Mat Object: 1 MPI process > type: seqaijcusparse > rows=289, cols=289 > total: nonzeros=2401, allocated nonzeros=2401 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > Mat Object: 1 MPI process > type: seqdensecuda > rows=289, cols=10 > total: nonzeros=2890, allocated nonzeros=2890 > total number of mallocs used during MatSetValues calls=0 > > You mentioned in a subsequent email that you are interested in systems > with at most 1E6 unknowns, and up to 1E4 right-hand sides. > I?m not sure you can expect significant gains from using GPU for such > systems. > Probably, the fastest approach would indeed be -pc_type lu -ksp_type > preonly -ksp_matsolve_batch_size 100 or something, depending on the memory > available on your host. > > Thanks, > Pierre > > On 15 Dec 2023, at 9:52?PM, Sreeram R Venkat wrote: > > Here are the ksp_view files. I set the options > -ksp_error_if_not_converged to try to get the vectors that caused the > error. I noticed that some of the KSPMatSolves converge while others don't. > In the code, the solves are called as: > > input vector v --> insert data of v into a dense mat --> KSPMatSolve() --> > MatMatMult() --> KSPMatSolve() --> insert data of dense mat into output > vector w -- output w > > The operator used in the KSP is a Laplacian-like operator, and the > MatMatMult is with a Mass Matrix. The whole thing is supposed to be a solve > with a biharmonic-like operator. I can also run it with only the first > KSPMatSolve (i.e. just a Laplacian-like operator). In that case, the KSP > reportedly converges after 0 iterations (see the next line), but this > causes problems in other parts of the code later on. > > I saw that sometimes the first KSPMatSolve "converges" after 0 iterations > due to CONVERGED_RTOL. Then, the second KSPMatSolve produces a NaN/Inf. I > tried setting ksp_min_it, but that didn't seem to do anything. > > I'll keep trying different options and also try to get the MWE made (this > KSPMatSolve is pretty performance critical for us). > > Thanks for all your help, > Sreeram > > On Fri, Dec 15, 2023 at 1:01?AM Pierre Jolivet wrote: > >> >> On 14 Dec 2023, at 11:45?PM, Sreeram R Venkat >> wrote: >> >> Thanks, I will try to create a minimal reproducible example. This may >> take me some time though, as I need to figure out how to extract only the >> relevant parts (the full program this solve is used in is getting quite >> complex). >> >> >> You could just do -ksp_view_mat binary:Amat.bin -ksp_view_pmat >> binary:Pmat.bin -ksp_view_rhs binary:rhs.bin and send me those three files >> (I?m guessing your are using double-precision scalars with 32-bit PetscInt). >> >> I'll also try out some of the BoomerAMG options to see if that helps. >> >> >> These should work (this is where all ?PCMatApply()-ready? PC are being >> tested): >> https://petsc.org/release/src/ksp/ksp/tutorials/ex79.c.html#line215 >> You can see it?s also testing PCHYPRE + KSPHPDDM on device (but not with >> HIP). >> I?m aware the performance should not be optimal (see your comment about >> host/device copies), I?ve money to hire someone to work on this but: a) I >> need to find the correct engineer/post-doc, b) I currently don?t have good >> use cases (of course, I could generate a synthetic benchmark, for science). 
>> So even if you send me the three Mat, a MWE would be appreciated if the >> KSPMatSolve() is performance-critical for you (see point b) from above). >> >> Thanks, >> Pierre >> >> Thanks, >> Sreeram >> >> On Thu, Dec 14, 2023, 1:12?PM Pierre Jolivet wrote: >> >>> >>> >>> On 14 Dec 2023, at 8:02?PM, Sreeram R Venkat >>> wrote: >>> >>> Hello Pierre, >>> >>> Thank you for your reply. I tried out the HPDDM CG as you said, and it >>> seems to be doing the batched solves, but the KSP is not converging due to >>> a NaN or Inf being generated. I also noticed there are a lot of >>> host-to-device and device-to-host copies of the matrices (the non-batched >>> KSP solve did not have any memcopies). I have attached dump.0 again. Could >>> you please take a look? >>> >>> >>> Yes, but you?d need to send me something I can run with your set of >>> options (if you are more confident doing this in private, you can remove >>> the list from c/c). >>> Not all BoomerAMG smoothers handle blocks of right-hand sides, and there >>> is not much error checking, so instead of erroring out, this may be the >>> reason why you are getting garbage. >>> >>> Thanks, >>> Pierre >>> >>> Thanks, >>> Sreeram >>> >>> On Thu, Dec 14, 2023 at 12:42?AM Pierre Jolivet wrote: >>> >>>> Hello Sreeram, >>>> KSPCG (PETSc implementation of CG) does not handle solves with multiple >>>> columns at once. >>>> There is only a single native PETSc KSP implementation which handles >>>> solves with multiple columns at once: KSPPREONLY. >>>> If you use --download-hpddm, you can use a CG (or GMRES, or more >>>> advanced methods) implementation which handles solves with multiple columns >>>> at once (via -ksp_type hpddm -ksp_hpddm_type cg or KSPSetType(ksp, >>>> KSPHPDDM); KSPHPDDMSetType(ksp, KSP_HPDDM_TYPE_CG);). >>>> I?m the main author of HPDDM, there is preliminary support for device >>>> matrices, but if it?s not working as intended/not faster than column by >>>> column, I?d be happy to have a deeper look (maybe in private), because most >>>> (if not all) of my users interested in (pseudo-)block Krylov solvers (i.e., >>>> solvers that treat right-hand sides in a single go) are using plain host >>>> matrices. >>>> >>>> Thanks, >>>> Pierre >>>> >>>> PS: you could have a look at >>>> https://www.sciencedirect.com/science/article/abs/pii/S0898122121000055 to >>>> understand the philosophy behind block iterative methods in PETSc (and in >>>> HPDDM), src/mat/tests/ex237.c, the benchmark I mentioned earlier, was >>>> developed in the context of this paper to produce Figures 2-3. Note that >>>> this paper is now slightly outdated, since then, PCHYPRE and PCMG (among >>>> others) have been made ?PCMatApply()-ready?. >>>> >>>> On 13 Dec 2023, at 11:05?PM, Sreeram R Venkat >>>> wrote: >>>> >>>> Hello Pierre, >>>> >>>> I am trying out the KSPMatSolve with the BoomerAMG preconditioner. >>>> However, I am noticing that it is still solving column by column (this is >>>> stated explicitly in the info dump attached). I looked at the code for >>>> KSPMatSolve_Private() and saw that as long as ksp->ops->matsolve is >>>> true, it should do the batched solve, though I'm not sure where that gets >>>> set. >>>> >>>> I am using the options -pc_type hypre -pc_hypre_type boomeramg when >>>> running the code. >>>> >>>> Can you please help me with this? >>>> >>>> Thanks, >>>> Sreeram >>>> >>>> >>>> On Thu, Dec 7, 2023 at 4:04?PM Mark Adams wrote: >>>> >>>>> N.B., AMGX interface is a bit experimental. 
>>>>> Mark >>>>> >>>>> On Thu, Dec 7, 2023 at 4:11?PM Sreeram R Venkat >>>>> wrote: >>>>> >>>>>> Oh, in that case I will try out BoomerAMG. Getting AMGX to build >>>>>> correctly was also tricky so hopefully the HYPRE build will be easier. >>>>>> >>>>>> Thanks, >>>>>> Sreeram >>>>>> >>>>>> On Thu, Dec 7, 2023, 3:03 PM Pierre Jolivet wrote: >>>>>> >>>>>>> >>>>>>> >>>>>>> On 7 Dec 2023, at 9:37?PM, Sreeram R Venkat >>>>>>> wrote: >>>>>>> >>>>>>> Thank you Barry and Pierre; I will proceed with the first option. >>>>>>> >>>>>>> I want to use the AMGX preconditioner for the KSP. I will try it out >>>>>>> and see how it performs. >>>>>>> >>>>>>> >>>>>>> Just FYI, AMGX does not handle systems with multiple RHS, and thus >>>>>>> has no PCMatApply() implementation. >>>>>>> BoomerAMG does, and there is a PCMatApply_HYPRE_BoomerAMG() >>>>>>> implementation. >>>>>>> But let us know if you need assistance figuring things out. >>>>>>> >>>>>>> Thanks, >>>>>>> Pierre >>>>>>> >>>>>>> Thanks, >>>>>>> Sreeram >>>>>>> >>>>>>> On Thu, Dec 7, 2023 at 2:02?PM Pierre Jolivet >>>>>>> wrote: >>>>>>> >>>>>>>> To expand on Barry?s answer, we have observed repeatedly that >>>>>>>> MatMatMult with MatAIJ performs better than MatMult with MatMAIJ, you can >>>>>>>> reproduce this on your own with >>>>>>>> https://petsc.org/release/src/mat/tests/ex237.c.html. >>>>>>>> Also, I?m guessing you are using some sort of preconditioner within >>>>>>>> your KSP. >>>>>>>> Not all are ?KSPMatSolve-ready?, i.e., they may treat blocks of >>>>>>>> right-hand sides column by column, which is very inefficient. >>>>>>>> You could run your code with -info dump and send us dump.0 to see >>>>>>>> what needs to be done on our end to make things more efficient, should you >>>>>>>> not be satisfied with the current performance of the code. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Pierre >>>>>>>> >>>>>>>> On 7 Dec 2023, at 8:34?PM, Barry Smith wrote: >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Dec 7, 2023, at 1:17?PM, Sreeram R Venkat >>>>>>>> wrote: >>>>>>>> >>>>>>>> I have 2 sequential matrices M and R (both MATSEQAIJCUSPARSE of >>>>>>>> size n x n) and a vector v of size n*m. v = [v_1 , v_2 ,... , v_m] where >>>>>>>> v_i has size n. The data for v can be stored either in column-major or >>>>>>>> row-major order. Now, I want to do 2 types of operations: >>>>>>>> >>>>>>>> 1. Matvecs of the form M*v_i = w_i, for i = 1..m. >>>>>>>> 2. KSPSolves of the form R*x_i = v_i, for i = 1..m. >>>>>>>> >>>>>>>> From what I have read on the documentation, I can think of 2 >>>>>>>> approaches. >>>>>>>> >>>>>>>> 1. Get the pointer to the data in v (column-major) and use it to >>>>>>>> create a dense matrix V. Then do a MatMatMult with M*V = W, and take the >>>>>>>> data pointer of W to create the vector w. For KSPSolves, use KSPMatSolve >>>>>>>> with R and V. >>>>>>>> >>>>>>>> 2. Create a MATMAIJ using M/R and use that for matvecs directly >>>>>>>> with the vector v. I don't know if KSPSolve with the MATMAIJ will know that >>>>>>>> it is a multiple RHS system and act accordingly. >>>>>>>> >>>>>>>> Which would be the more efficient option? >>>>>>>> >>>>>>>> >>>>>>>> Use 1. >>>>>>>> >>>>>>>> >>>>>>>> As a side-note, I am also wondering if there is a way to use >>>>>>>> row-major storage of the vector v. >>>>>>>> >>>>>>>> >>>>>>>> No >>>>>>>> >>>>>>>> The reason is that this could allow for more coalesced memory >>>>>>>> access when doing matvecs. 
>>>>>>>> >>>>>>>> >>>>>>>> PETSc matrix-vector products use BLAS GMEV matrix-vector products >>>>>>>> for the computation so in theory they should already be well-optimized >>>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Sreeram >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> >>>> >>>> >>>> >>> >>> >>> >> > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre at joliv.et Wed Dec 20 01:51:21 2023 From: pierre at joliv.et (Pierre Jolivet) Date: Wed, 20 Dec 2023 08:51:21 +0100 Subject: [petsc-users] Matvecs and KSPSolves with multiple vectors In-Reply-To: References: <855B7267-E3F9-43CC-9EBE-96276E9A8E42@petsc.dev> <9F227CDE-815C-4446-A424-0BBDA38B1E79@joliv.et> <186F2922-CD1B-4778-BDB3-8DA6EAB8D976@joliv.et> <7921468C-F4B9-4BC0-ADD1-2F84619E8A7E@joliv.et> Message-ID: > On 20 Dec 2023, at 8:42?AM, Sreeram R Venkat wrote: > > Ok, I think the error I'm getting has something to do with how the multiple solves are being done in succession. I'll try to see if there's anything I'm doing wrong there. > > One question about the -pc_type lu -ksp_type preonly method: do you know which parts of the solve (factorization/triangular solves) are done on host and which are done on device? I think only the triangular solves can be done on device. Since you have many right-hand sides, it may not be that bad. GPU people will hopefully give you a more insightful answer. Thanks, Pierre > Thanks, > Sreeram > > On Sat, Dec 16, 2023 at 10:56?PM Pierre Jolivet > wrote: >> Unfortunately, I am not able to reproduce such a failure with your input matrix. >> I?ve used ex79 that I linked previously and the system is properly solved. >> $ ./ex79 -pc_type hypre -ksp_type hpddm -ksp_hpddm_type cg -ksp_converged_reason -ksp_view_mat ascii::ascii_info -ksp_view_rhs ascii::ascii_info >> Linear solve converged due to CONVERGED_RTOL iterations 6 >> Mat Object: 1 MPI process >> type: seqaijcusparse >> rows=289, cols=289 >> total: nonzeros=2401, allocated nonzeros=2401 >> total number of mallocs used during MatSetValues calls=0 >> not using I-node routines >> Mat Object: 1 MPI process >> type: seqdensecuda >> rows=289, cols=10 >> total: nonzeros=2890, allocated nonzeros=2890 >> total number of mallocs used during MatSetValues calls=0 >> >> You mentioned in a subsequent email that you are interested in systems with at most 1E6 unknowns, and up to 1E4 right-hand sides. >> I?m not sure you can expect significant gains from using GPU for such systems. >> Probably, the fastest approach would indeed be -pc_type lu -ksp_type preonly -ksp_matsolve_batch_size 100 or something, depending on the memory available on your host. >> >> Thanks, >> Pierre >> >>> On 15 Dec 2023, at 9:52?PM, Sreeram R Venkat > wrote: >>> >>> Here are the ksp_view files. I set the options -ksp_error_if_not_converged to try to get the vectors that caused the error. I noticed that some of the KSPMatSolves converge while others don't. In the code, the solves are called as: >>> >>> input vector v --> insert data of v into a dense mat --> KSPMatSolve() --> MatMatMult() --> KSPMatSolve() --> insert data of dense mat into output vector w -- output w >>> >>> The operator used in the KSP is a Laplacian-like operator, and the MatMatMult is with a Mass Matrix. The whole thing is supposed to be a solve with a biharmonic-like operator. I can also run it with only the first KSPMatSolve (i.e. just a Laplacian-like operator). 
In that case, the KSP reportedly converges after 0 iterations (see the next line), but this causes problems in other parts of the code later on. >>> >>> I saw that sometimes the first KSPMatSolve "converges" after 0 iterations due to CONVERGED_RTOL. Then, the second KSPMatSolve produces a NaN/Inf. I tried setting ksp_min_it, but that didn't seem to do anything. >>> >>> I'll keep trying different options and also try to get the MWE made (this KSPMatSolve is pretty performance critical for us). >>> >>> Thanks for all your help, >>> Sreeram >>> >>> On Fri, Dec 15, 2023 at 1:01?AM Pierre Jolivet > wrote: >>>> >>>>> On 14 Dec 2023, at 11:45?PM, Sreeram R Venkat > wrote: >>>>> >>>>> Thanks, I will try to create a minimal reproducible example. This may take me some time though, as I need to figure out how to extract only the relevant parts (the full program this solve is used in is getting quite complex). >>>> >>>> You could just do -ksp_view_mat binary:Amat.bin -ksp_view_pmat binary:Pmat.bin -ksp_view_rhs binary:rhs.bin and send me those three files (I?m guessing your are using double-precision scalars with 32-bit PetscInt). >>>> >>>>> I'll also try out some of the BoomerAMG options to see if that helps. >>>> >>>> These should work (this is where all ?PCMatApply()-ready? PC are being tested): https://petsc.org/release/src/ksp/ksp/tutorials/ex79.c.html#line215 >>>> You can see it?s also testing PCHYPRE + KSPHPDDM on device (but not with HIP). >>>> I?m aware the performance should not be optimal (see your comment about host/device copies), I?ve money to hire someone to work on this but: a) I need to find the correct engineer/post-doc, b) I currently don?t have good use cases (of course, I could generate a synthetic benchmark, for science). >>>> So even if you send me the three Mat, a MWE would be appreciated if the KSPMatSolve() is performance-critical for you (see point b) from above). >>>> >>>> Thanks, >>>> Pierre >>>> >>>>> Thanks, >>>>> Sreeram >>>>> >>>>> On Thu, Dec 14, 2023, 1:12?PM Pierre Jolivet > wrote: >>>>>> >>>>>> >>>>>>> On 14 Dec 2023, at 8:02?PM, Sreeram R Venkat > wrote: >>>>>>> >>>>>>> Hello Pierre, >>>>>>> >>>>>>> Thank you for your reply. I tried out the HPDDM CG as you said, and it seems to be doing the batched solves, but the KSP is not converging due to a NaN or Inf being generated. I also noticed there are a lot of host-to-device and device-to-host copies of the matrices (the non-batched KSP solve did not have any memcopies). I have attached dump.0 again. Could you please take a look? >>>>>> >>>>>> Yes, but you?d need to send me something I can run with your set of options (if you are more confident doing this in private, you can remove the list from c/c). >>>>>> Not all BoomerAMG smoothers handle blocks of right-hand sides, and there is not much error checking, so instead of erroring out, this may be the reason why you are getting garbage. >>>>>> >>>>>> Thanks, >>>>>> Pierre >>>>>> >>>>>>> Thanks, >>>>>>> Sreeram >>>>>>> >>>>>>> On Thu, Dec 14, 2023 at 12:42?AM Pierre Jolivet > wrote: >>>>>>>> Hello Sreeram, >>>>>>>> KSPCG (PETSc implementation of CG) does not handle solves with multiple columns at once. >>>>>>>> There is only a single native PETSc KSP implementation which handles solves with multiple columns at once: KSPPREONLY. 
>>>>>>>> If you use --download-hpddm, you can use a CG (or GMRES, or more advanced methods) implementation which handles solves with multiple columns at once (via -ksp_type hpddm -ksp_hpddm_type cg or KSPSetType(ksp, KSPHPDDM); KSPHPDDMSetType(ksp, KSP_HPDDM_TYPE_CG);). >>>>>>>> I?m the main author of HPDDM, there is preliminary support for device matrices, but if it?s not working as intended/not faster than column by column, I?d be happy to have a deeper look (maybe in private), because most (if not all) of my users interested in (pseudo-)block Krylov solvers (i.e., solvers that treat right-hand sides in a single go) are using plain host matrices. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Pierre >>>>>>>> >>>>>>>> PS: you could have a look at https://www.sciencedirect.com/science/article/abs/pii/S0898122121000055 to understand the philosophy behind block iterative methods in PETSc (and in HPDDM), src/mat/tests/ex237.c, the benchmark I mentioned earlier, was developed in the context of this paper to produce Figures 2-3. Note that this paper is now slightly outdated, since then, PCHYPRE and PCMG (among others) have been made ?PCMatApply()-ready?. >>>>>>>> >>>>>>>>> On 13 Dec 2023, at 11:05?PM, Sreeram R Venkat > wrote: >>>>>>>>> >>>>>>>>> Hello Pierre, >>>>>>>>> >>>>>>>>> I am trying out the KSPMatSolve with the BoomerAMG preconditioner. However, I am noticing that it is still solving column by column (this is stated explicitly in the info dump attached). I looked at the code for KSPMatSolve_Private() and saw that as long as ksp->ops->matsolve is true, it should do the batched solve, though I'm not sure where that gets set. >>>>>>>>> >>>>>>>>> I am using the options -pc_type hypre -pc_hypre_type boomeramg when running the code. >>>>>>>>> >>>>>>>>> Can you please help me with this? >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Sreeram >>>>>>>>> >>>>>>>>> >>>>>>>>> On Thu, Dec 7, 2023 at 4:04?PM Mark Adams > wrote: >>>>>>>>>> N.B., AMGX interface is a bit experimental. >>>>>>>>>> Mark >>>>>>>>>> >>>>>>>>>> On Thu, Dec 7, 2023 at 4:11?PM Sreeram R Venkat > wrote: >>>>>>>>>>> Oh, in that case I will try out BoomerAMG. Getting AMGX to build correctly was also tricky so hopefully the HYPRE build will be easier. >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Sreeram >>>>>>>>>>> >>>>>>>>>>> On Thu, Dec 7, 2023, 3:03 PM Pierre Jolivet > wrote: >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> On 7 Dec 2023, at 9:37?PM, Sreeram R Venkat > wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> Thank you Barry and Pierre; I will proceed with the first option. >>>>>>>>>>>>> >>>>>>>>>>>>> I want to use the AMGX preconditioner for the KSP. I will try it out and see how it performs. >>>>>>>>>>>> >>>>>>>>>>>> Just FYI, AMGX does not handle systems with multiple RHS, and thus has no PCMatApply() implementation. >>>>>>>>>>>> BoomerAMG does, and there is a PCMatApply_HYPRE_BoomerAMG() implementation. >>>>>>>>>>>> But let us know if you need assistance figuring things out. >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Pierre >>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> Sreeram >>>>>>>>>>>>> >>>>>>>>>>>>> On Thu, Dec 7, 2023 at 2:02?PM Pierre Jolivet > wrote: >>>>>>>>>>>>>> To expand on Barry?s answer, we have observed repeatedly that MatMatMult with MatAIJ performs better than MatMult with MatMAIJ, you can reproduce this on your own with https://petsc.org/release/src/mat/tests/ex237.c.html. >>>>>>>>>>>>>> Also, I?m guessing you are using some sort of preconditioner within your KSP. 
>>>>>>>>>>>>>> Not all are ?KSPMatSolve-ready?, i.e., they may treat blocks of right-hand sides column by column, which is very inefficient. >>>>>>>>>>>>>> You could run your code with -info dump and send us dump.0 to see what needs to be done on our end to make things more efficient, should you not be satisfied with the current performance of the code. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> Pierre >>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 7 Dec 2023, at 8:34?PM, Barry Smith > wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Dec 7, 2023, at 1:17?PM, Sreeram R Venkat > wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I have 2 sequential matrices M and R (both MATSEQAIJCUSPARSE of size n x n) and a vector v of size n*m. v = [v_1 , v_2 ,... , v_m] where v_i has size n. The data for v can be stored either in column-major or row-major order. Now, I want to do 2 types of operations: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 1. Matvecs of the form M*v_i = w_i, for i = 1..m. >>>>>>>>>>>>>>>> 2. KSPSolves of the form R*x_i = v_i, for i = 1..m. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> From what I have read on the documentation, I can think of 2 approaches. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 1. Get the pointer to the data in v (column-major) and use it to create a dense matrix V. Then do a MatMatMult with M*V = W, and take the data pointer of W to create the vector w. For KSPSolves, use KSPMatSolve with R and V. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 2. Create a MATMAIJ using M/R and use that for matvecs directly with the vector v. I don't know if KSPSolve with the MATMAIJ will know that it is a multiple RHS system and act accordingly. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Which would be the more efficient option? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Use 1. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> As a side-note, I am also wondering if there is a way to use row-major storage of the vector v. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> No >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> The reason is that this could allow for more coalesced memory access when doing matvecs. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> PETSc matrix-vector products use BLAS GMEV matrix-vector products for the computation so in theory they should already be well-optimized >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> Sreeram >>>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>> >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From joauma.marichal at uclouvain.be Wed Dec 20 03:12:43 2023 From: joauma.marichal at uclouvain.be (Joauma Marichal) Date: Wed, 20 Dec 2023 09:12:43 +0000 Subject: [petsc-users] [petsc-maint] DMSwarm on multiple processors In-Reply-To: References: Message-ID: Hello, I used Address Sanitizer on my laptop and I have no leaks. I do have access to another machine (managed by the same people as the previous one) but I obtain similar errors? Thanks again for your help. Best regards, Joauma De : Matthew Knepley Date : mardi, 19 d?cembre 2023 ? 14:30 ? : Joauma Marichal Cc : petsc-maint at mcs.anl.gov , petsc-users at mcs.anl.gov Objet : Re: [petsc-maint] DMSwarm on multiple processors On Tue, Dec 19, 2023 at 5:11?AM Joauma Marichal > wrote: Hello, I have used Address Sanitizer to check any memory errors. On my computer, no errors are found. Unfortunately, on the supercomputer that I am using, I get lots of errors? I attach my log files (running on 1 and 70 procs). Do you have any idea of what I could do? Run the same parallel configuration as you do on the supercomputer. 
If that is fine, I would suggest Address Sanitizer there. Something is corrupting the stack, and it appears that it is connected to that machine, rather than the library. Do you have access to a second parallel machine? Thanks, Matt Thanks a lot for your help. Best regards, Joauma De : Matthew Knepley > Date : lundi, 18 d?cembre 2023 ? 12:00 ? : Joauma Marichal > Cc : petsc-maint at mcs.anl.gov >, petsc-users at mcs.anl.gov > Objet : Re: [petsc-maint] DMSwarm on multiple processors On Mon, Dec 18, 2023 at 5:09?AM Joauma Marichal > wrote: Hello, Sorry for the delay. I attach the file that I obtain when running the code with the debug mode. Okay, we can now see where this is happening: malloc_consolidate(): invalid chunk size [cns263:3265170] *** Process received signal *** [cns263:3265170] Signal: Aborted (6) [cns263:3265170] Signal code: (-6) [cns263:3265170] [ 0] /lib64/libc.so.6(+0x4eb20)[0x7f3bd9148b20] [cns263:3265170] [ 1] /lib64/libc.so.6(gsignal+0x10f)[0x7f3bd9148a9f] [cns263:3265170] [ 2] /lib64/libc.so.6(abort+0x127)[0x7f3bd911be05] [cns263:3265170] [ 3] /lib64/libc.so.6(+0x91037)[0x7f3bd918b037] [cns263:3265170] [ 4] /lib64/libc.so.6(+0x9819c)[0x7f3bd919219c] [cns263:3265170] [ 5] /lib64/libc.so.6(+0x98b68)[0x7f3bd9192b68] [cns263:3265170] [ 6] /lib64/libc.so.6(+0x9af18)[0x7f3bd9194f18] [cns263:3265170] [ 7] /lib64/libc.so.6(__libc_malloc+0x1e2)[0x7f3bd9196822] [cns263:3265170] [ 8] /lib64/libc.so.6(posix_memalign+0x3c)[0x7f3bd91980fc] [cns263:3265170] [ 9] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscMallocAlign+0x45)[0x7f3bda5f1625] [cns263:3265170] [10] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscMallocA+0x297)[0x7f3bda5f1b07] [cns263:3265170] [11] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMCreate+0x5b)[0x7f3bdaa73c1b] [cns263:3265170] [12] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMDACreate+0x9)[0x7f3bdab0a2f9] [cns263:3265170] [13] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMDACreate3d+0x9a)[0x7f3bdab07dea] [cns263:3265170] [14] ./cobpor[0x402de8] [cns263:3265170] [15] /lib64/libc.so.6(__libc_start_main+0xf3)[0x7f3bd9134cf3] [cns263:3265170] [16] ./cobpor[0x40304e] [cns263:3265170] *** End of error message *** However, this is not great. First, the amount of memory being allocated is quite small, and this does not appear to be an Out of Memory error. Second, the error occurs in libc: malloc_consolidate(): invalid chunk size which means something is wrong internally. I agree with this analysis (https://stackoverflow.com/questions/18760999/sample-example-program-to-get-the-malloc-consolidate-error) that says you have probably overwritten memory somewhere in your code. I recommend running under valgrind, or using Address Sanitizer from clang. Thanks, Matt Thanks for your help. Best regards, Joauma De : Matthew Knepley > Date : jeudi, 23 novembre 2023 ? 15:32 ? : Joauma Marichal > Cc : petsc-maint at mcs.anl.gov >, petsc-users at mcs.anl.gov > Objet : Re: [petsc-maint] DMSwarm on multiple processors On Thu, Nov 23, 2023 at 9:01?AM Joauma Marichal > wrote: Hello, My problem persists? Is there anything I could try? Yes. It appears to be failing from a call inside PetscSFSetUpRanks(). It does allocation, and the failure is in libc, and it only happens on larger examples, so I suspect some allocation problem. Can you rebuild with debugging and run this example? Then we can see if the allocation fails. Thanks, Matt Thanks a lot. 
Best regards, Joauma De : Matthew Knepley > Date : mercredi, 25 octobre 2023 ? 14:45 ? : Joauma Marichal > Cc : petsc-maint at mcs.anl.gov >, petsc-users at mcs.anl.gov > Objet : Re: [petsc-maint] DMSwarm on multiple processors On Wed, Oct 25, 2023 at 8:32?AM Joauma Marichal via petsc-maint > wrote: Hello, I am using the DMSwarm library in some Eulerian-Lagrangian approach to have vapor bubbles in water. I have obtained nice results recently and wanted to perform bigger simulations. Unfortunately, when I increase the number of processors used to run the simulation, I get the following error: free(): invalid size [cns136:590327] *** Process received signal *** [cns136:590327] Signal: Aborted (6) [cns136:590327] Signal code: (-6) [cns136:590327] [ 0] /lib64/libc.so.6(+0x4eb20)[0x7f56cd4c9b20] [cns136:590327] [ 1] /lib64/libc.so.6(gsignal+0x10f)[0x7f56cd4c9a9f] [cns136:590327] [ 2] /lib64/libc.so.6(abort+0x127)[0x7f56cd49ce05] [cns136:590327] [ 3] /lib64/libc.so.6(+0x91037)[0x7f56cd50c037] [cns136:590327] [ 4] /lib64/libc.so.6(+0x9819c)[0x7f56cd51319c] [cns136:590327] [ 5] /lib64/libc.so.6(+0x99aac)[0x7f56cd514aac] [cns136:590327] [ 6] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscSFSetUpRanks+0x4c4)[0x7f56cea71e64] [cns136:590327] [ 7] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(+0x841642)[0x7f56cea83642] [cns136:590327] [ 8] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscSFSetUp+0x9e)[0x7f56cea7043e] [cns136:590327] [ 9] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(VecScatterCreate+0x164e)[0x7f56cea7bbde] [cns136:590327] [10] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMSetUp_DA_3D+0x3e38)[0x7f56cee84dd8] [cns136:590327] [11] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMSetUp_DA+0xd8)[0x7f56cee9b448] [cns136:590327] [12] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMSetUp+0x20)[0x7f56cededa20] [cns136:590327] [13] ./cobpor[0x4418dc] [cns136:590327] [14] ./cobpor[0x408b63] [cns136:590327] [15] /lib64/libc.so.6(__libc_start_main+0xf3)[0x7f56cd4b5cf3] [cns136:590327] [16] ./cobpor[0x40bdee] [cns136:590327] *** End of error message *** -------------------------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code. Per user-direction, the job has been aborted. -------------------------------------------------------------------------- -------------------------------------------------------------------------- mpiexec noticed that process rank 84 with PID 590327 on node cns136 exited on signal 6 (Aborted). -------------------------------------------------------------------------- When I reduce the number of processors the error disappears and when I run my code without the vapor bubbles it also works. The problem seems to take place at this moment: DMCreate(PETSC_COMM_WORLD,swarm); DMSetType(*swarm,DMSWARM); DMSetDimension(*swarm,3); DMSwarmSetType(*swarm,DMSWARM_PIC); DMSwarmSetCellDM(*swarm,*dmcell); Thanks a lot for your help. Things that would help us track this down: 1) The smallest example where it fails 2) The smallest number of processes where it fails 3) A stack trace of the failure 4) A simple example that we can run that also fails Thanks, Matt Best regards, Joauma -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: log_1proc_SC2 Type: application/octet-stream Size: 89034 bytes Desc: log_1proc_SC2 URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: log_70proc_SC2 Type: application/octet-stream Size: 172731 bytes Desc: log_70proc_SC2 URL: From mfadams at lbl.gov Wed Dec 20 04:48:07 2023 From: mfadams at lbl.gov (Mark Adams) Date: Wed, 20 Dec 2023 05:48:07 -0500 Subject: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code In-Reply-To: References: Message-ID: I am guessing that you are creating a matrix, adding to it, finalizing it ("assembly"), and then adding to it again, which is fine, but you are adding new non-zeros to the sparsity pattern. If this is what you want then you can tell the matrix to let you do that. Otherwise you have a bug. Mark On Tue, Dec 19, 2023 at 9:50?PM Shatanawi, Sawsan Muhammad via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hello everyone, > > I hope this email finds you well. > > My Name is Sawsan Shatanawi, and I am currently working on developing a > Fortran code for simulating groundwater flow in a 3D system. The code > involves solving a nonlinear system, and I have created the matrix to be > solved using the PCG solver and Picard iteration. However, when I tried > to assign it as a PETSc matrix I started getting a lot of error messages. > > I am kindly asking if someone can help me, I would be happy to share my > code with him/her. > > Please find the attached file contains a list of errors I have gotten > > Thank you in advance for your time and assistance. > > Best regards, > > Sawsan > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Dec 20 06:58:43 2023 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 20 Dec 2023 07:58:43 -0500 Subject: [petsc-users] [petsc-maint] DMSwarm on multiple processors In-Reply-To: References: Message-ID: On Wed, Dec 20, 2023 at 4:12?AM Joauma Marichal < joauma.marichal at uclouvain.be> wrote: > Hello, > > > > I used Address Sanitizer on my laptop and I have no leaks. > > I do have access to another machine (managed by the same people as the > previous one) but I obtain similar errors? > Let me understand: 1) You have run the exact same problem on two different parallel machines, and gotten the same error, meaning on the second machine, it printed malloc_consolidate(): invalid chunk size Is this true? 2) You have run the exact same problem on the same number of processes on your own machine under Address Sanitizer with no errors? Thanks, Matt > Thanks again for your help. 
> > > > Best regards, > > > > Joauma > > > > *De : *Matthew Knepley > *Date : *mardi, 19 d?cembre 2023 ? 14:30 > *? : *Joauma Marichal > *Cc : *petsc-maint at mcs.anl.gov , > petsc-users at mcs.anl.gov > *Objet : *Re: [petsc-maint] DMSwarm on multiple processors > > On Tue, Dec 19, 2023 at 5:11?AM Joauma Marichal < > joauma.marichal at uclouvain.be> wrote: > > Hello, > > > > I have used Address Sanitizer to check any memory errors. On my computer, > no errors are found. Unfortunately, on the supercomputer that I am using, I > get lots of errors? I attach my log files (running on 1 and 70 procs). > > Do you have any idea of what I could do? > > > > Run the same parallel configuration as you do on the supercomputer. If > that is fine, I would suggest Address Sanitizer there. Something is > corrupting the stack, and it appears that it is connected to that machine, > rather than the library. Do you have access to a second parallel machine? > > > > Thanks, > > > > Matt > > > > Thanks a lot for your help. > > > > Best regards, > > > > Joauma > > > > *De : *Matthew Knepley > *Date : *lundi, 18 d?cembre 2023 ? 12:00 > *? : *Joauma Marichal > *Cc : *petsc-maint at mcs.anl.gov , > petsc-users at mcs.anl.gov > *Objet : *Re: [petsc-maint] DMSwarm on multiple processors > > On Mon, Dec 18, 2023 at 5:09?AM Joauma Marichal < > joauma.marichal at uclouvain.be> wrote: > > Hello, > > > > Sorry for the delay. I attach the file that I obtain when running the code > with the debug mode. > > > > Okay, we can now see where this is happening: > > > > malloc_consolidate(): invalid chunk size > [cns263:3265170] *** Process received signal *** > [cns263:3265170] Signal: Aborted (6) > [cns263:3265170] Signal code: (-6) > [cns263:3265170] [ 0] /lib64/libc.so.6(+0x4eb20)[0x7f3bd9148b20] > [cns263:3265170] [ 1] /lib64/libc.so.6(gsignal+0x10f)[0x7f3bd9148a9f] > [cns263:3265170] [ 2] /lib64/libc.so.6(abort+0x127)[0x7f3bd911be05] > [cns263:3265170] [ 3] /lib64/libc.so.6(+0x91037)[0x7f3bd918b037] > [cns263:3265170] [ 4] /lib64/libc.so.6(+0x9819c)[0x7f3bd919219c] > [cns263:3265170] [ 5] /lib64/libc.so.6(+0x98b68)[0x7f3bd9192b68] > [cns263:3265170] [ 6] /lib64/libc.so.6(+0x9af18)[0x7f3bd9194f18] > [cns263:3265170] [ 7] /lib64/libc.so.6(__libc_malloc+0x1e2)[0x7f3bd9196822] > [cns263:3265170] [ 8] /lib64/libc.so.6(posix_memalign+0x3c)[0x7f3bd91980fc] > [cns263:3265170] [ 9] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscMallocAlign+0x45)[0x7f3bda5f1625] > [cns263:3265170] [10] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscMallocA+0x297)[0x7f3bda5f1b07] > [cns263:3265170] [11] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMCreate+0x5b)[0x7f3bdaa73c1b] > [cns263:3265170] [12] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMDACreate+0x9)[0x7f3bdab0a2f9] > [cns263:3265170] [13] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMDACreate3d+0x9a)[0x7f3bdab07dea] > [cns263:3265170] [14] ./cobpor[0x402de8] > [cns263:3265170] [15] > /lib64/libc.so.6(__libc_start_main+0xf3)[0x7f3bd9134cf3] > [cns263:3265170] [16] ./cobpor[0x40304e] > [cns263:3265170] *** End of error message *** > > > > However, this is not great. First, the amount of memory being allocated is > quite small, and this does not appear to be an Out of Memory error. Second, > the error occurs in libc: > > > > malloc_consolidate(): invalid chunk size > > > > which means something is wrong internally. 
I agree with this analysis ( > https://stackoverflow.com/questions/18760999/sample-example-program-to-get-the-malloc-consolidate-error) > that says you have probably overwritten memory somewhere in your code. I > recommend running under valgrind, or using Address Sanitizer from clang. > > > > Thanks, > > > > Matt > > > > Thanks for your help. > > > > Best regards, > > > > Joauma > > > > *De : *Matthew Knepley > *Date : *jeudi, 23 novembre 2023 ? 15:32 > *? : *Joauma Marichal > *Cc : *petsc-maint at mcs.anl.gov , > petsc-users at mcs.anl.gov > *Objet : *Re: [petsc-maint] DMSwarm on multiple processors > > On Thu, Nov 23, 2023 at 9:01?AM Joauma Marichal < > joauma.marichal at uclouvain.be> wrote: > > Hello, > > > > My problem persists? Is there anything I could try? > > > > Yes. It appears to be failing from a call inside PetscSFSetUpRanks(). It > does allocation, and the failure > > is in libc, and it only happens on larger examples, so I suspect some > allocation problem. Can you rebuild with debugging and run this example? > Then we can see if the allocation fails. > > > > Thanks, > > Matt > > > > Thanks a lot. > > > > Best regards, > > > > Joauma > > > > *De : *Matthew Knepley > *Date : *mercredi, 25 octobre 2023 ? 14:45 > *? : *Joauma Marichal > *Cc : *petsc-maint at mcs.anl.gov , > petsc-users at mcs.anl.gov > *Objet : *Re: [petsc-maint] DMSwarm on multiple processors > > On Wed, Oct 25, 2023 at 8:32?AM Joauma Marichal via petsc-maint < > petsc-maint at mcs.anl.gov> wrote: > > Hello, > > > > I am using the DMSwarm library in some Eulerian-Lagrangian approach to > have vapor bubbles in water. > > I have obtained nice results recently and wanted to perform bigger > simulations. Unfortunately, when I increase the number of processors used > to run the simulation, I get the following error: > > > > free(): invalid size > > [cns136:590327] *** Process received signal *** > > [cns136:590327] Signal: Aborted (6) > > [cns136:590327] Signal code: (-6) > > [cns136:590327] [ 0] /lib64/libc.so.6(+0x4eb20)[0x7f56cd4c9b20] > > [cns136:590327] [ 1] /lib64/libc.so.6(gsignal+0x10f)[0x7f56cd4c9a9f] > > [cns136:590327] [ 2] /lib64/libc.so.6(abort+0x127)[0x7f56cd49ce05] > > [cns136:590327] [ 3] /lib64/libc.so.6(+0x91037)[0x7f56cd50c037] > > [cns136:590327] [ 4] /lib64/libc.so.6(+0x9819c)[0x7f56cd51319c] > > [cns136:590327] [ 5] /lib64/libc.so.6(+0x99aac)[0x7f56cd514aac] > > [cns136:590327] [ 6] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscSFSetUpRanks+0x4c4)[0x7f56cea71e64] > > [cns136:590327] [ 7] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(+0x841642)[0x7f56cea83642] > > [cns136:590327] [ 8] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscSFSetUp+0x9e)[0x7f56cea7043e] > > [cns136:590327] [ 9] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(VecScatterCreate+0x164e)[0x7f56cea7bbde] > > [cns136:590327] [10] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMSetUp_DA_3D+0x3e38)[0x7f56cee84dd8] > > [cns136:590327] [11] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMSetUp_DA+0xd8)[0x7f56cee9b448] > > [cns136:590327] [12] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMSetUp+0x20)[0x7f56cededa20] > > [cns136:590327] [13] ./cobpor[0x4418dc] > > [cns136:590327] [14] ./cobpor[0x408b63] > > [cns136:590327] [15] > /lib64/libc.so.6(__libc_start_main+0xf3)[0x7f56cd4b5cf3] > > [cns136:590327] [16] ./cobpor[0x40bdee] > > 
[cns136:590327] *** End of error message *** > > -------------------------------------------------------------------------- > > Primary job terminated normally, but 1 process returned > > a non-zero exit code. Per user-direction, the job has been aborted. > > -------------------------------------------------------------------------- > > -------------------------------------------------------------------------- > > mpiexec noticed that process rank 84 with PID 590327 on node cns136 exited > on signal 6 (Aborted). > > -------------------------------------------------------------------------- > > > > When I reduce the number of processors the error disappears and when I run > my code without the vapor bubbles it also works. > > The problem seems to take place at this moment: > > > > DMCreate(PETSC_COMM_WORLD,swarm); > > DMSetType(*swarm,DMSWARM); > > DMSetDimension(*swarm,3); > > DMSwarmSetType(*swarm,DMSWARM_PIC); > > DMSwarmSetCellDM(*swarm,*dmcell); > > > > > > Thanks a lot for your help. > > > > Things that would help us track this down: > > > > 1) The smallest example where it fails > > > > 2) The smallest number of processes where it fails > > > > 3) A stack trace of the failure > > > > 4) A simple example that we can run that also fails > > > > Thanks, > > > > Matt > > > > Best regards, > > > > Joauma > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Dec 20 07:58:01 2023 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 20 Dec 2023 08:58:01 -0500 Subject: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code In-Reply-To: References: Message-ID: On Tue, Dec 19, 2023 at 9:50?PM Shatanawi, Sawsan Muhammad via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hello everyone, > > I hope this email finds you well. > > My Name is Sawsan Shatanawi, and I am currently working on developing a > Fortran code for simulating groundwater flow in a 3D system. The code > involves solving a nonlinear system, and I have created the matrix to be > solved using the PCG solver and Picard iteration. However, when I tried > to assign it as a PETSc matrix I started getting a lot of error messages. > > I am kindly asking if someone can help me, I would be happy to share my > code with him/her. 
> > Please find the attached file contains a list of errors I have gotten > This error indicates that your preallocation is not sufficient for the values you want to insert. Now in PETSc you can just remove your preallocation, and PETSc will automatically allocate correctly. Thanks, Matt > Thank you in advance for your time and assistance. > > Best regards, > > Sawsan > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From sawsan.shatanawi at wsu.edu Wed Dec 20 08:36:35 2023 From: sawsan.shatanawi at wsu.edu (Shatanawi, Sawsan Muhammad) Date: Wed, 20 Dec 2023 14:36:35 +0000 Subject: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code In-Reply-To: References: Message-ID: Hello, I am trying to create a sparse matrix( which is as I believe a zero matrix) then adding some nonzero elements to it over a loop, then assembling it Get Outlook for iOS ________________________________ From: Mark Adams Sent: Wednesday, December 20, 2023 2:48 AM To: Shatanawi, Sawsan Muhammad Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code [EXTERNAL EMAIL] I am guessing that you are creating a matrix, adding to it, finalizing it ("assembly"), and then adding to it again, which is fine, but you are adding new non-zeros to the sparsity pattern. If this is what you want then you can tell the matrix to let you do that. Otherwise you have a bug. Mark On Tue, Dec 19, 2023 at 9:50?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: Hello everyone, I hope this email finds you well. My Name is Sawsan Shatanawi, and I am currently working on developing a Fortran code for simulating groundwater flow in a 3D system. The code involves solving a nonlinear system, and I have created the matrix to be solved using the PCG solver and Picard iteration. However, when I tried to assign it as a PETSc matrix I started getting a lot of error messages. I am kindly asking if someone can help me, I would be happy to share my code with him/her. Please find the attached file contains a list of errors I have gotten Thank you in advance for your time and assistance. Best regards, Sawsan -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Wed Dec 20 08:44:47 2023 From: mfadams at lbl.gov (Mark Adams) Date: Wed, 20 Dec 2023 09:44:47 -0500 Subject: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code In-Reply-To: References: Message-ID: Did you set preallocation values when you created the matrix? Don't do that. 
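A minimal Fortran sketch of the assembly pattern being recommended here (the size n, the diagonal fill, and the loop are illustrative assumptions, not taken from the attached code): let MatSetUp() handle the allocation, insert the entries in a loop, assemble once, and only relax the check at the end if entries at genuinely new locations have to go in after that first assembly.

      program assembly_sketch
#include <petsc/finclude/petscmat.h>
      use petscmat
      implicit none
      Mat            :: A
      PetscInt       :: n, i
      PetscScalar    :: val
      PetscErrorCode :: ierr

      call PetscInitialize(PETSC_NULL_CHARACTER, ierr)
      n = 10                                   ! illustrative global size
      call MatCreate(PETSC_COMM_WORLD, A, ierr)
      call MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n, ierr)
      call MatSetType(A, MATAIJ, ierr)
      call MatSetFromOptions(A, ierr)
      call MatSetUp(A, ierr)                   ! no manual preallocation

      do i = 0, n-1                            ! global indices are 0-based
         val = 2.0
         call MatSetValue(A, i, i, val, INSERT_VALUES, ierr)
      end do
      call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr)
      call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr)

      ! only if entries at brand-new locations must go in after this first
      ! assembly; depending on the exact error message, the check to relax
      ! is MAT_NEW_NONZERO_ALLOCATION_ERR or MAT_NEW_NONZERO_LOCATION_ERR
      call MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE, ierr)

      call MatDestroy(A, ierr)
      call PetscFinalize(ierr)
      end program assembly_sketch

With this pattern PETSc sizes the storage itself, which is what the preallocation advice above amounts to.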
On Wed, Dec 20, 2023 at 9:36?AM Shatanawi, Sawsan Muhammad < sawsan.shatanawi at wsu.edu> wrote: > Hello, > > I am trying to create a sparse matrix( which is as I believe a zero > matrix) then adding some nonzero elements to it over a loop, then > assembling it > > Get Outlook for iOS > ------------------------------ > *From:* Mark Adams > *Sent:* Wednesday, December 20, 2023 2:48 AM > *To:* Shatanawi, Sawsan Muhammad > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] Help with Integrating PETSc into Fortran > Groundwater Flow Simulation Code > > > *[EXTERNAL EMAIL]* > I am guessing that you are creating a matrix, adding to it, finalizing it > ("assembly"), and then adding to it again, which is fine, but you are > adding new non-zeros to the sparsity pattern. > If this is what you want then you can tell the matrix to let you do that. > Otherwise you have a bug. > > Mark > > On Tue, Dec 19, 2023 at 9:50?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: > >> Hello everyone, >> >> I hope this email finds you well. >> >> My Name is Sawsan Shatanawi, and I am currently working on developing a >> Fortran code for simulating groundwater flow in a 3D system. The code >> involves solving a nonlinear system, and I have created the matrix to be >> solved using the PCG solver and Picard iteration. However, when I tried >> to assign it as a PETSc matrix I started getting a lot of error messages. >> >> I am kindly asking if someone can help me, I would be happy to share my >> code with him/her. >> >> Please find the attached file contains a list of errors I have gotten >> >> Thank you in advance for your time and assistance. >> >> Best regards, >> >> Sawsan >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From y.hu at mpie.de Wed Dec 20 10:11:03 2023 From: y.hu at mpie.de (Yi Hu) Date: Wed, 20 Dec 2023 17:11:03 +0100 Subject: [petsc-users] fortran interface to snes matrix-free jacobian Message-ID: Dear PETSc team, My ?solution scheme relies on a matrix-free jacobian in the SNES solver. I saw the useful C interface like MatCreateSNESMF(), DMSNESCreateJacobianMF(). I am wondering if you have the fortran equivalence? I think for my problem in the main program I need to do DMDASNESsetJacobianLocal(DM, INSERT_VALUES, myJacobian, ctx, err_petsc). Then in myJacobian() subroutine I have to create the operator from DMSNESCreateJacobianMF(), and register my own MATOP_MULT from MatShellSetOperation(). Am I correct? Are these fortran subroutines available? I saw an example in ts module as ex22f_mf.F90 which behaves similar as what I would like to do. Because I would like to use ngmres, I then need to stay in the SNES. ? Thanks for your help. Best wishes, Yi ------------------------------------------------- Stay up to date and follow us on LinkedIn, Twitter and YouTube. Max-Planck-Institut f?r Eisenforschung GmbH Max-Planck-Stra?e 1 D-40237 D?sseldorf Handelsregister B 2533 Amtsgericht D?sseldorf Gesch?ftsf?hrung Prof. Dr. Gerhard Dehm Prof. Dr. J?rg Neugebauer Prof. Dr. Dierk Raabe Dr. Kai de Weldige Ust.-Id.-Nr.: DE 11 93 58 514 Steuernummer: 105 5891 1000 Please consider that invitations and e-mails of our institute are only valid if they end with ?@mpie.de. If you are not sure of the validity please contact rco at mpie.de Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. 
In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de ------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Wed Dec 20 10:40:27 2023 From: jed at jedbrown.org (Jed Brown) Date: Wed, 20 Dec 2023 09:40:27 -0700 Subject: [petsc-users] fortran interface to snes matrix-free jacobian In-Reply-To: References: Message-ID: <87h6kcakuc.fsf@jedbrown.org> Are you wanting an analytic matrix-free operator or one created for you based on finite differencing? If the latter, just use -snes_mf or -snes_mf_operator. https://petsc.org/release/manual/snes/#jacobian-evaluation Yi Hu writes: > Dear PETSc team, > > My ?solution scheme relies on a matrix-free jacobian in the SNES solver. I saw the useful C interface like MatCreateSNESMF(), DMSNESCreateJacobianMF(). I am wondering if you have the fortran equivalence? > > I think for my problem in the main program I need to do DMDASNESsetJacobianLocal(DM, INSERT_VALUES, myJacobian, ctx, err_petsc). Then in myJacobian() subroutine I have to create the operator from DMSNESCreateJacobianMF(), and register my own MATOP_MULT from MatShellSetOperation(). Am I correct? > > Are these fortran subroutines available? I saw an example in ts module as ex22f_mf.F90 which behaves similar as what I would like to do. Because I would like to use ngmres, I then need to stay in the SNES. ? > > Thanks for your help. > > Best wishes, > Yi > > ------------------------------------------------- > Stay up to date and follow us on LinkedIn, Twitter and YouTube. > > Max-Planck-Institut f?r Eisenforschung GmbH > Max-Planck-Stra?e 1 > D-40237 D?sseldorf > > Handelsregister B 2533 > Amtsgericht D?sseldorf > > Gesch?ftsf?hrung > Prof. Dr. Gerhard Dehm > Prof. Dr. J?rg Neugebauer > Prof. Dr. Dierk Raabe > Dr. Kai de Weldige > > Ust.-Id.-Nr.: DE 11 93 58 514 > Steuernummer: 105 5891 1000 > > > Please consider that invitations and e-mails of our institute are > only valid if they end with ?@mpie.de. > If you are not sure of the validity please contact rco at mpie.de > > Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails > aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. > In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de > ------------------------------------------------- From y.hu at mpie.de Wed Dec 20 10:44:24 2023 From: y.hu at mpie.de (Yi Hu) Date: Wed, 20 Dec 2023 17:44:24 +0100 Subject: [petsc-users] fortran interface to snes matrix-free jacobian In-Reply-To: <87h6kcakuc.fsf@jedbrown.org> References: <87h6kcakuc.fsf@jedbrown.org> Message-ID: Dear Jed, Thanks for your reply. I have an analytical one to implement. Best, Yi -----Original Message----- From: Jed Brown Sent: Wednesday, December 20, 2023 5:40 PM To: Yi Hu ; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] fortran interface to snes matrix-free jacobian Are you wanting an analytic matrix-free operator or one created for you based on finite differencing? If the latter, just use -snes_mf or -snes_mf_operator. https://petsc.org/release/manual/snes/#jacobian-evaluation Yi Hu writes: > Dear PETSc team, > > My ?solution scheme relies on a matrix-free jacobian in the SNES solver. I saw the useful C interface like MatCreateSNESMF(), DMSNESCreateJacobianMF(). I am wondering if you have the fortran equivalence? > > I think for my problem in the main program I need to do DMDASNESsetJacobianLocal(DM, INSERT_VALUES, myJacobian, ctx, err_petsc). 
Then in myJacobian() subroutine I have to create the operator from DMSNESCreateJacobianMF(), and register my own MATOP_MULT from MatShellSetOperation(). Am I correct? > > Are these fortran subroutines available? I saw an example in ts module > as ex22f_mf.F90 which behaves similar as what I would like to do. Because I would like to use ngmres, I then need to stay in the SNES. > > Thanks for your help. > > Best wishes, > Yi > > ------------------------------------------------- > Stay up to date and follow us on LinkedIn, Twitter and YouTube. > > Max-Planck-Institut f?r Eisenforschung GmbH Max-Planck-Stra?e 1 > D-40237 D?sseldorf > > Handelsregister B 2533 > Amtsgericht D?sseldorf > > Gesch?ftsf?hrung > Prof. Dr. Gerhard Dehm > Prof. Dr. J?rg Neugebauer > Prof. Dr. Dierk Raabe > Dr. Kai de Weldige > > Ust.-Id.-Nr.: DE 11 93 58 514 > Steuernummer: 105 5891 1000 > > > Please consider that invitations and e-mails of our institute are only > valid if they end with ?@mpie.de. > If you are not sure of the validity please contact rco at mpie.de > > Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails > aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. > In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de > ------------------------------------------------- ------------------------------------------------- Stay up to date and follow us on LinkedIn, Twitter and YouTube. Max-Planck-Institut f?r Eisenforschung GmbH Max-Planck-Stra?e 1 D-40237 D?sseldorf Handelsregister B 2533 Amtsgericht D?sseldorf Gesch?ftsf?hrung Prof. Dr. Gerhard Dehm Prof. Dr. J?rg Neugebauer Prof. Dr. Dierk Raabe Dr. Kai de Weldige Ust.-Id.-Nr.: DE 11 93 58 514 Steuernummer: 105 5891 1000 Please consider that invitations and e-mails of our institute are only valid if they end with ?@mpie.de. If you are not sure of the validity please contact rco at mpie.de Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de ------------------------------------------------- From jed at jedbrown.org Wed Dec 20 10:52:16 2023 From: jed at jedbrown.org (Jed Brown) Date: Wed, 20 Dec 2023 09:52:16 -0700 Subject: [petsc-users] fortran interface to snes matrix-free jacobian In-Reply-To: References: <87h6kcakuc.fsf@jedbrown.org> Message-ID: <87a5q4akan.fsf@jedbrown.org> Then just use MatShell. I see the docs need some work to clarify this, but MatCreateSNESMF is to specify matrix-free finite differencing from code (perhaps where one wants to customize parameters). Yi Hu writes: > Dear Jed, > > Thanks for your reply. I have an analytical one to implement. > > Best, Yi > > -----Original Message----- > From: Jed Brown > Sent: Wednesday, December 20, 2023 5:40 PM > To: Yi Hu ; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] fortran interface to snes matrix-free jacobian > > Are you wanting an analytic matrix-free operator or one created for you based on finite differencing? If the latter, just use -snes_mf or -snes_mf_operator. > > https://petsc.org/release/manual/snes/#jacobian-evaluation > > Yi Hu writes: > >> Dear PETSc team, >> >> My ?solution scheme relies on a matrix-free jacobian in the SNES solver. I saw the useful C interface like MatCreateSNESMF(), DMSNESCreateJacobianMF(). I am wondering if you have the fortran equivalence? >> >> I think for my problem in the main program I need to do DMDASNESsetJacobianLocal(DM, INSERT_VALUES, myJacobian, ctx, err_petsc). 
Then in myJacobian() subroutine I have to create the operator from DMSNESCreateJacobianMF(), and register my own MATOP_MULT from MatShellSetOperation(). Am I correct? >> >> Are these fortran subroutines available? I saw an example in ts module >> as ex22f_mf.F90 which behaves similar as what I would like to do. Because I would like to use ngmres, I then need to stay in the SNES. >> >> Thanks for your help. >> >> Best wishes, >> Yi >> >> ------------------------------------------------- >> Stay up to date and follow us on LinkedIn, Twitter and YouTube. >> >> Max-Planck-Institut f?r Eisenforschung GmbH Max-Planck-Stra?e 1 >> D-40237 D?sseldorf >> >> Handelsregister B 2533 >> Amtsgericht D?sseldorf >> >> Gesch?ftsf?hrung >> Prof. Dr. Gerhard Dehm >> Prof. Dr. J?rg Neugebauer >> Prof. Dr. Dierk Raabe >> Dr. Kai de Weldige >> >> Ust.-Id.-Nr.: DE 11 93 58 514 >> Steuernummer: 105 5891 1000 >> >> >> Please consider that invitations and e-mails of our institute are only >> valid if they end with ?@mpie.de. >> If you are not sure of the validity please contact rco at mpie.de >> >> Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails >> aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. >> In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de >> ------------------------------------------------- > > > ------------------------------------------------- > Stay up to date and follow us on LinkedIn, Twitter and YouTube. > > Max-Planck-Institut f?r Eisenforschung GmbH > Max-Planck-Stra?e 1 > D-40237 D?sseldorf > > Handelsregister B 2533 > Amtsgericht D?sseldorf > > Gesch?ftsf?hrung > Prof. Dr. Gerhard Dehm > Prof. Dr. J?rg Neugebauer > Prof. Dr. Dierk Raabe > Dr. Kai de Weldige > > Ust.-Id.-Nr.: DE 11 93 58 514 > Steuernummer: 105 5891 1000 > > > Please consider that invitations and e-mails of our institute are > only valid if they end with ?@mpie.de. > If you are not sure of the validity please contact rco at mpie.de > > Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails > aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. > In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de > ------------------------------------------------- From bsmith at petsc.dev Wed Dec 20 13:14:53 2023 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 20 Dec 2023 14:14:53 -0500 Subject: [petsc-users] fortran interface to snes matrix-free jacobian In-Reply-To: References: <87h6kcakuc.fsf@jedbrown.org> Message-ID: <45B09E03-4B23-4DE2-B4BC-7DD44629E0FD@petsc.dev> > On Dec 20, 2023, at 11:44?AM, Yi Hu wrote: > > Dear Jed, > > Thanks for your reply. I have an analytical one to implement. > > Best, Yi > > -----Original Message----- > From: Jed Brown > Sent: Wednesday, December 20, 2023 5:40 PM > To: Yi Hu ; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] fortran interface to snes matrix-free jacobian > > Are you wanting an analytic matrix-free operator or one created for you based on finite differencing? If the latter, just use -snes_mf or -snes_mf_operator. > > https://petsc.org/release/manual/snes/#jacobian-evaluation > > Yi Hu writes: > >> Dear PETSc team, >> >> My solution scheme relies on a matrix-free jacobian in the SNES solver. I saw the useful C interface like MatCreateSNESMF(), DMSNESCreateJacobianMF(). I am wondering if you have the fortran equivalence? You can use DMSNESCreateJacobianMF() (MatCreateSNESMF is not appropriate when you are providing the operation). 
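(Barry's follow-up further down corrects this to the MatCreateShell() route that Jed suggested.) A minimal Fortran sketch of that MatShell approach, in the spirit of the ts ex22f_mf.F90 example mentioned earlier; the subroutine names, the ndof argument, and the null context arguments are illustrative assumptions, and a real code would attach its own context and implement the analytic action.

      ! --- sketch only; all routines assumed to live in one source file ---
#include <petsc/finclude/petscsnes.h>

      subroutine SetupShellJacobian(snes, ndof, ierr)
      use petscsnes
      implicit none
      SNES           :: snes
      PetscInt       :: ndof           ! local number of unknowns
      PetscErrorCode :: ierr
      Mat            :: J
      external MyFormJacobian, MyJacobianMult

      call MatCreateShell(PETSC_COMM_WORLD, ndof, ndof,               &
                          PETSC_DETERMINE, PETSC_DETERMINE,           &
                          PETSC_NULL_INTEGER, J, ierr)
      call MatShellSetOperation(J, MATOP_MULT, MyJacobianMult, ierr)
      ! the shell matrix is passed as both Amat and Pmat
      call SNESSetJacobian(snes, J, J, MyFormJacobian,                &
                           PETSC_NULL_INTEGER, ierr)
      end subroutine SetupShellJacobian

      ! called by SNES at each new iterate; refresh whatever state the
      ! mult routine below needs (nothing is stored in this sketch)
      subroutine MyFormJacobian(snes, x, Amat, Pmat, dummy, ierr)
      use petscsnes
      implicit none
      SNES           :: snes
      Vec            :: x
      Mat            :: Amat, Pmat
      PetscInt       :: dummy
      PetscErrorCode :: ierr
      ierr = 0
      end subroutine MyFormJacobian

      ! y = J(x_k)*x : the analytic Jacobian action goes here
      subroutine MyJacobianMult(Amat, x, y, ierr)
      use petscsnes
      implicit none
      Mat            :: Amat
      Vec            :: x, y
      PetscErrorCode :: ierr
      call VecCopy(x, y, ierr)         ! placeholder: identity action
      end subroutine MyJacobianMult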
>> >> I think for my problem in the main program I need to do DMDASNESsetJacobianLocal(DM, INSERT_VALUES, myJacobian, ctx, err_petsc). Then in myJacobian() subroutine I have to create the operator from DMSNESCreateJacobianMF(), and register my own MATOP_MULT from MatShellSetOperation(). Am I correct? Not exactly. Do not use DMDASNESsetJacobianLocal() use DMSNESCreateJacobianMF() to create a Mat J where you create the SNES and use SNESSetJacobian() and pass the J matrix in along with myJacobian(). >> >> Are these fortran subroutines available? I saw an example in ts module >> as ex22f_mf.F90 which behaves similar as what I would like to do. Because I would like to use ngmres, I then need to stay in the SNES. >> >> Thanks for your help. >> >> Best wishes, >> Yi >> >> ------------------------------------------------- >> Stay up to date and follow us on LinkedIn, Twitter and YouTube. >> >> Max-Planck-Institut f?r Eisenforschung GmbH Max-Planck-Stra?e 1 >> D-40237 D?sseldorf >> >> Handelsregister B 2533 >> Amtsgericht D?sseldorf >> >> Gesch?ftsf?hrung >> Prof. Dr. Gerhard Dehm >> Prof. Dr. J?rg Neugebauer >> Prof. Dr. Dierk Raabe >> Dr. Kai de Weldige >> >> Ust.-Id.-Nr.: DE 11 93 58 514 >> Steuernummer: 105 5891 1000 >> >> >> Please consider that invitations and e-mails of our institute are only >> valid if they end with ?@mpie.de. >> If you are not sure of the validity please contact rco at mpie.de >> >> Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails >> aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. >> In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de >> ------------------------------------------------- > > > ------------------------------------------------- > Stay up to date and follow us on LinkedIn, Twitter and YouTube. > > Max-Planck-Institut f?r Eisenforschung GmbH > Max-Planck-Stra?e 1 > D-40237 D?sseldorf > > Handelsregister B 2533 > Amtsgericht D?sseldorf > > Gesch?ftsf?hrung > Prof. Dr. Gerhard Dehm > Prof. Dr. J?rg Neugebauer > Prof. Dr. Dierk Raabe > Dr. Kai de Weldige > > Ust.-Id.-Nr.: DE 11 93 58 514 > Steuernummer: 105 5891 1000 > > > Please consider that invitations and e-mails of our institute are > only valid if they end with ?@mpie.de. > If you are not sure of the validity please contact rco at mpie.de > > Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails > aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. > In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de > ------------------------------------------------- > From bsmith at petsc.dev Wed Dec 20 13:34:02 2023 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 20 Dec 2023 14:34:02 -0500 Subject: [petsc-users] fortran interface to snes matrix-free jacobian In-Reply-To: <45B09E03-4B23-4DE2-B4BC-7DD44629E0FD@petsc.dev> References: <87h6kcakuc.fsf@jedbrown.org> <45B09E03-4B23-4DE2-B4BC-7DD44629E0FD@petsc.dev> Message-ID: <8A145074-7057-4329-8A2A-C37510287BAD@petsc.dev> I apologize; please ignore my answer below. Use MatCreateShell() as indicated by Jed. > On Dec 20, 2023, at 2:14?PM, Barry Smith wrote: > > > >> On Dec 20, 2023, at 11:44?AM, Yi Hu wrote: >> >> Dear Jed, >> >> Thanks for your reply. I have an analytical one to implement. 
>> >> Best, Yi >> >> -----Original Message----- >> From: Jed Brown >> Sent: Wednesday, December 20, 2023 5:40 PM >> To: Yi Hu ; petsc-users at mcs.anl.gov >> Subject: Re: [petsc-users] fortran interface to snes matrix-free jacobian >> >> Are you wanting an analytic matrix-free operator or one created for you based on finite differencing? If the latter, just use -snes_mf or -snes_mf_operator. >> >> https://petsc.org/release/manual/snes/#jacobian-evaluation >> >> Yi Hu writes: >> >>> Dear PETSc team, >>> >>> My solution scheme relies on a matrix-free jacobian in the SNES solver. I saw the useful C interface like MatCreateSNESMF(), DMSNESCreateJacobianMF(). I am wondering if you have the fortran equivalence? > > You can use DMSNESCreateJacobianMF() (MatCreateSNESMF is not appropriate when you are providing the operation). > > >>> >>> I think for my problem in the main program I need to do DMDASNESsetJacobianLocal(DM, INSERT_VALUES, myJacobian, ctx, err_petsc). Then in myJacobian() subroutine I have to create the operator from DMSNESCreateJacobianMF(), and register my own MATOP_MULT from MatShellSetOperation(). Am I correct? > > Not exactly. Do not use DMDASNESsetJacobianLocal() use DMSNESCreateJacobianMF() to create a Mat J where you create the SNES and use SNESSetJacobian() and pass the J matrix in along with myJacobian(). > >>> >>> Are these fortran subroutines available? I saw an example in ts module >>> as ex22f_mf.F90 which behaves similar as what I would like to do. Because I would like to use ngmres, I then need to stay in the SNES. >>> >>> Thanks for your help. >>> >>> Best wishes, >>> Yi >>> >>> ------------------------------------------------- >>> Stay up to date and follow us on LinkedIn, Twitter and YouTube. >>> >>> Max-Planck-Institut f?r Eisenforschung GmbH Max-Planck-Stra?e 1 >>> D-40237 D?sseldorf >>> >>> Handelsregister B 2533 >>> Amtsgericht D?sseldorf >>> >>> Gesch?ftsf?hrung >>> Prof. Dr. Gerhard Dehm >>> Prof. Dr. J?rg Neugebauer >>> Prof. Dr. Dierk Raabe >>> Dr. Kai de Weldige >>> >>> Ust.-Id.-Nr.: DE 11 93 58 514 >>> Steuernummer: 105 5891 1000 >>> >>> >>> Please consider that invitations and e-mails of our institute are only >>> valid if they end with ?@mpie.de. >>> If you are not sure of the validity please contact rco at mpie.de >>> >>> Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails >>> aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. >>> In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de >>> ------------------------------------------------- >> >> >> ------------------------------------------------- >> Stay up to date and follow us on LinkedIn, Twitter and YouTube. >> >> Max-Planck-Institut f?r Eisenforschung GmbH >> Max-Planck-Stra?e 1 >> D-40237 D?sseldorf >> >> Handelsregister B 2533 >> Amtsgericht D?sseldorf >> >> Gesch?ftsf?hrung >> Prof. Dr. Gerhard Dehm >> Prof. Dr. J?rg Neugebauer >> Prof. Dr. Dierk Raabe >> Dr. Kai de Weldige >> >> Ust.-Id.-Nr.: DE 11 93 58 514 >> Steuernummer: 105 5891 1000 >> >> >> Please consider that invitations and e-mails of our institute are >> only valid if they end with ?@mpie.de. >> If you are not sure of the validity please contact rco at mpie.de >> >> Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails >> aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. >> In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de >> ------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From kenneth.c.hall at duke.edu Wed Dec 20 15:19:54 2023 From: kenneth.c.hall at duke.edu (Kenneth C Hall) Date: Wed, 20 Dec 2023 21:19:54 +0000 Subject: [petsc-users] SLEPc/NEP for shell matrice T(lambda) and T'(lambda) In-Reply-To: <19A0679D-2A50-4E64-A805-F26582562B9A@dsic.upv.es> References: <89E53665-4C0D-4583-9C90-13C4C108A4EA@dsic.upv.es> <442B3841-B668-4185-9C6F-D03CA481CA26@dsic.upv.es> <19A0679D-2A50-4E64-A805-F26582562B9A@dsic.upv.es> Message-ID: <94CFE0C0-7A64-4E9E-800B-B18CEAF83BFF@duke.edu> Jose, I have been revisiting the issue of SLEPc/NEP for shell matrices T(lambda) and T'(lambda). I am having problems running SLEPc/NEP with -nep_type nleigs. I have compiled two versions of PETSc/SLEPc: petsc-arch-real / slepc-arch-real ./configure --with-cc=gcc-13 --with-cxx=g++-13 --with-fc=gfortran --COPTFLAGS='-O3 -fopenmp' --CXXOPTFLAGS='-O3 -fopenmp' --FOPTFLAGS='-O3 -fopenmp' --with-debugging=1 --with-logging=1 --with-scalar-type=real --with-precision=double --download-fblaslapack --with-openmp --with-mpi=0 petsc-arch-complex / slepc-arch-complex ./configure --with-cc=gcc-13 --with-cxx=g++-13 --with-fc=gfortran --COPTFLAGS='-O3 -fopenmp' --CXXOPTFLAGS='-O3 -fopenmp' --FOPTFLAGS='-O3 -fopenmp' --with-debugging=1 --with-logging=1 --with-scalar-type=complex --with-precision=double --download-fblaslapack --with-openmp --with-mpi=0 I use gfortran on an Apple Mac Mini M1. Both the PETSc and SLEPc versions are the latest development versions as of today (a6690fd8 and 267bd1cd, respectively). I ran the ex54f90 test cases: % main-arch-real -nep_type slp -nep_slp_ksp_type gmres -nep_slp_pc_type none % main-arch-complex -nep_type slp -nep_slp_ksp_type gmres -nep_slp_pc_type none % main-arch-real -nep_type nleigs -rg_interval_endpoints 0.2,1.1 -nep_nleigs_ksp_type gmres -nep_nleigs_pc_type none % main-arch-complex -nep_type nleigs -rg_interval_endpoints 0.2,1.1,-.1,.1 -nep_nleigs_ksp_type gmres -nep_nleigs_pc_type none Both the slp cases ran as expected and gave the correct answer. However, both the real and complex architectures failed for the nleigs case. For the complex case, none of the callback functions appear to have been called. For the real case, only the MatMult_A routine appears to be called, 100 times and returns each time, sweeping over lambda from 0.2 to 1.1. Any suggestions would be welcome. Best regards, Kenneth Hall ?On 10/18/23, 9:16 AM, "Jose E. Roman" > wrote: By the way, the MATOP_DESTROY stuff produced segmentation fault in some compilers (in gfortran it worked well). The reason was having the callback functions inside CONTAINS, that is why we have removed it and used regular subroutines instead. Jose > El 18 oct 2023, a las 15:11, Kenneth C Hall > escribi?: > > Jose, > > Thank you. I have downloaded and will take a look. I will try the new example and then implement in my actual problem. I will keep you posted as to my results. > > Thank you and best regards, > Kenneth > > From: Jose E. Roman > > Sent: Tuesday, October 17, 2023 2:31 PM > To: Kenneth C Hall > > Cc: petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] SLEPc/NEP for shell matrice T(lambda) and T'(lambda) > > Kenneth, > > I have worked a bit more on your example and put it in SLEPc https://urldefense.com/v3/__https://gitlab.com/slepc/slepc/-/merge_requests/596__;!!OToaGQ!oSqCpmczx5VDi5025aO5T3WqW-MwGnKUSzxKVkdyXTHo9vuxP4GYnDfMoYxavvWRAA0WdcwX3tiVaiXWT0dh2-o$ > This version also has MATOP_DESTROY to avoid memory leaks. > > Thanks. 
> Jose > > > > El 12 oct 2023, a las 20:59, Kenneth C Hall > escribi?: > > > > Jose, > > > > Thanks very much for this. I will give it a try and let you know how it works. > > > > Best regards, > > Kenneth > > > > From: Jose E. Roman > > > Date: Thursday, October 12, 2023 at 2:12 PM > > To: Kenneth C Hall > > > Cc: petsc-users at mcs.anl.gov > > > Subject: Re: [petsc-users] SLEPc/NEP for shell matrice T(lambda) and T'(lambda) > > > > I am attaching your example modified with the context stuff. > > > > With the PETSc branch that I indicated, now it works with NLEIGS, for instance: > > > > $ ./test_nep -nep_nleigs_ksp_type gmres -nep_nleigs_pc_type none -rg_interval_endpoints 0.2,1.1 -nep_target 0.8 -nep_nev 5 -n 400 -nep_monitor -nep_view -nep_error_relative ::ascii_info_detail > > > > And also other solvers such as SLP: > > > > $ ./test_nep -nep_type slp -nep_slp_ksp_type gmres -nep_slp_pc_type none -nep_target 0.8 -nep_nev 5 -n 400 -nep_monitor -nep_error_relative ::ascii_info_detail > > > > I will clean the example code an add it as a SLEPc example. > > > > Regards, > > Jose > > > > > > > El 11 oct 2023, a las 17:27, Kenneth C Hall > escribi?: > > > > > > Jose, > > > > > > Thanks very much for your help with this. Greatly appreciated. I will look at the MR. Please let me know if you do get the Fortran example working. > > > > > > Thanks, and best regards, > > > Kenneth > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: nep_transript.txt URL: From sawsan.shatanawi at wsu.edu Wed Dec 20 19:52:07 2023 From: sawsan.shatanawi at wsu.edu (Shatanawi, Sawsan Muhammad) Date: Thu, 21 Dec 2023 01:52:07 +0000 Subject: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code In-Reply-To: References: Message-ID: Hello, I don't think that I set preallocation values when I created the matrix, would you please have look at my code. It is just the petsc related part from my code. I was able to fix some of the error messages. Now I have a new set of error messages related to the KSP solver (attached) I appreciate your help? Sawsan ________________________________ From: Mark Adams Sent: Wednesday, December 20, 2023 6:44 AM To: Shatanawi, Sawsan Muhammad Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code [EXTERNAL EMAIL] Did you set preallocation values when you created the matrix? Don't do that. On Wed, Dec 20, 2023 at 9:36?AM Shatanawi, Sawsan Muhammad > wrote: Hello, I am trying to create a sparse matrix( which is as I believe a zero matrix) then adding some nonzero elements to it over a loop, then assembling it Get Outlook for iOS ________________________________ From: Mark Adams > Sent: Wednesday, December 20, 2023 2:48 AM To: Shatanawi, Sawsan Muhammad > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code [EXTERNAL EMAIL] I am guessing that you are creating a matrix, adding to it, finalizing it ("assembly"), and then adding to it again, which is fine, but you are adding new non-zeros to the sparsity pattern. If this is what you want then you can tell the matrix to let you do that. Otherwise you have a bug. Mark On Tue, Dec 19, 2023 at 9:50?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: Hello everyone, I hope this email finds you well. 
My Name is Sawsan Shatanawi, and I am currently working on developing a Fortran code for simulating groundwater flow in a 3D system. The code involves solving a nonlinear system, and I have created the matrix to be solved using the PCG solver and Picard iteration. However, when I tried to assign it as a PETSc matrix I started getting a lot of error messages. I am kindly asking if someone can help me, I would be happy to share my code with him/her. Please find the attached file contains a list of errors I have gotten Thank you in advance for your time and assistance. Best regards, Sawsan -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Matrix_RHS.F90 Type: application/octet-stream Size: 7250 bytes Desc: Matrix_RHS.F90 URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: out.txt URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: solver.F90 Type: application/octet-stream Size: 6717 bytes Desc: solver.F90 URL: From bsmith at petsc.dev Wed Dec 20 20:32:10 2023 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 20 Dec 2023 21:32:10 -0500 Subject: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code In-Reply-To: References: Message-ID: <17B6144E-4F3A-4173-9818-4B03669736C4@petsc.dev> Instead of call PCCreate(PETSC_COMM_WORLD, pc, ierr) call PCSetType(pc, PCILU,ierr) ! Choose a preconditioner type (ILU) call KSPSetPC(ksp, pc,ierr) ! Associate the preconditioner with the KSP solver do call KSPGetPC(ksp,pc,ierr) call PCSetType(pc, PCILU,ierr) Do not call KSPSetUp(). It will be taken care of automatically during the solve > On Dec 20, 2023, at 8:52?PM, Shatanawi, Sawsan Muhammad via petsc-users wrote: > > Hello, > I don't think that I set preallocation values when I created the matrix, would you please have look at my code. It is just the petsc related part from my code. > I was able to fix some of the error messages. Now I have a new set of error messages related to the KSP solver (attached) > > I appreciate your help? > > Sawsan > From: Mark Adams > > Sent: Wednesday, December 20, 2023 6:44 AM > To: Shatanawi, Sawsan Muhammad > > Cc: petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code > > [EXTERNAL EMAIL] > Did you set preallocation values when you created the matrix? > Don't do that. > > On Wed, Dec 20, 2023 at 9:36?AM Shatanawi, Sawsan Muhammad > wrote: > Hello, > > I am trying to create a sparse matrix( which is as I believe a zero matrix) then adding some nonzero elements to it over a loop, then assembling it > > Get Outlook for iOS > From: Mark Adams > > Sent: Wednesday, December 20, 2023 2:48 AM > To: Shatanawi, Sawsan Muhammad > > Cc: petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code > > [EXTERNAL EMAIL] > I am guessing that you are creating a matrix, adding to it, finalizing it ("assembly"), and then adding to it again, which is fine, but you are adding new non-zeros to the sparsity pattern. > If this is what you want then you can tell the matrix to let you do that. > Otherwise you have a bug. > > Mark > > On Tue, Dec 19, 2023 at 9:50?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: > Hello everyone, > > I hope this email finds you well. 
> > My Name is Sawsan Shatanawi, and I am currently working on developing a Fortran code for simulating groundwater flow in a 3D system. The code involves solving a nonlinear system, and I have created the matrix to be solved using the PCG solver and Picard iteration. However, when I tried to assign it as a PETSc matrix I started getting a lot of error messages. > > I am kindly asking if someone can help me, I would be happy to share my code with him/her. > > Please find the attached file contains a list of errors I have gotten > > Thank you in advance for your time and assistance. > Best regards, > > Sawsan > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sawsan.shatanawi at wsu.edu Wed Dec 20 20:49:13 2023 From: sawsan.shatanawi at wsu.edu (Shatanawi, Sawsan Muhammad) Date: Thu, 21 Dec 2023 02:49:13 +0000 Subject: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code In-Reply-To: <17B6144E-4F3A-4173-9818-4B03669736C4@petsc.dev> References: <17B6144E-4F3A-4173-9818-4B03669736C4@petsc.dev> Message-ID: Hello Barry, Thank you a lot for your help, Now I am getting the attached error message. Bests, Sawsan ________________________________ From: Barry Smith Sent: Wednesday, December 20, 2023 6:32 PM To: Shatanawi, Sawsan Muhammad Cc: Mark Adams ; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code [EXTERNAL EMAIL] Instead of call PCCreate(PETSC_COMM_WORLD, pc, ierr) call PCSetType(pc, PCILU,ierr) ! Choose a preconditioner type (ILU) call KSPSetPC(ksp, pc,ierr) ! Associate the preconditioner with the KSP solver do call KSPGetPC(ksp,pc,ierr) call PCSetType(pc, PCILU,ierr) Do not call KSPSetUp(). It will be taken care of automatically during the solve On Dec 20, 2023, at 8:52?PM, Shatanawi, Sawsan Muhammad via petsc-users wrote: Hello, I don't think that I set preallocation values when I created the matrix, would you please have look at my code. It is just the petsc related part from my code. I was able to fix some of the error messages. Now I have a new set of error messages related to the KSP solver (attached) I appreciate your help? Sawsan ________________________________ From: Mark Adams > Sent: Wednesday, December 20, 2023 6:44 AM To: Shatanawi, Sawsan Muhammad > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code [EXTERNAL EMAIL] Did you set preallocation values when you created the matrix? Don't do that. On Wed, Dec 20, 2023 at 9:36?AM Shatanawi, Sawsan Muhammad > wrote: Hello, I am trying to create a sparse matrix( which is as I believe a zero matrix) then adding some nonzero elements to it over a loop, then assembling it Get Outlook for iOS ________________________________ From: Mark Adams > Sent: Wednesday, December 20, 2023 2:48 AM To: Shatanawi, Sawsan Muhammad > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code [EXTERNAL EMAIL] I am guessing that you are creating a matrix, adding to it, finalizing it ("assembly"), and then adding to it again, which is fine, but you are adding new non-zeros to the sparsity pattern. If this is what you want then you can tell the matrix to let you do that. Otherwise you have a bug. Mark On Tue, Dec 19, 2023 at 9:50?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: Hello everyone, I hope this email finds you well. 
My Name is Sawsan Shatanawi, and I am currently working on developing a Fortran code for simulating groundwater flow in a 3D system. The code involves solving a nonlinear system, and I have created the matrix to be solved using the PCG solver and Picard iteration. However, when I tried to assign it as a PETSc matrix I started getting a lot of error messages. I am kindly asking if someone can help me, I would be happy to share my code with him/her. Please find the attached file contains a list of errors I have gotten Thank you in advance for your time and assistance. Best regards, Sawsan -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: out2.txt URL: From knepley at gmail.com Wed Dec 20 20:54:33 2023 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 20 Dec 2023 21:54:33 -0500 Subject: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code In-Reply-To: References: <17B6144E-4F3A-4173-9818-4B03669736C4@petsc.dev> Message-ID: On Wed, Dec 20, 2023 at 9:49?PM Shatanawi, Sawsan Muhammad via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hello Barry, > > Thank you a lot for your help, Now I am getting the attached error message. > Do not destroy the PC from KSPGetPC() THanks, Matt > Bests, > Sawsan > ------------------------------ > *From:* Barry Smith > *Sent:* Wednesday, December 20, 2023 6:32 PM > *To:* Shatanawi, Sawsan Muhammad > *Cc:* Mark Adams ; petsc-users at mcs.anl.gov < > petsc-users at mcs.anl.gov> > *Subject:* Re: [petsc-users] Help with Integrating PETSc into Fortran > Groundwater Flow Simulation Code > > > *[EXTERNAL EMAIL]* > > Instead of > > call PCCreate(PETSC_COMM_WORLD, pc, ierr) > call PCSetType(pc, PCILU,ierr) ! Choose a preconditioner type (ILU) > call KSPSetPC(ksp, pc,ierr) ! Associate the preconditioner with the > KSP solver > > do > > call KSPGetPC(ksp,pc,ierr) > call PCSetType(pc, PCILU,ierr) > > Do not call KSPSetUp(). It will be taken care of automatically during the > solve > > > > On Dec 20, 2023, at 8:52?PM, Shatanawi, Sawsan Muhammad via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Hello, > I don't think that I set preallocation values when I created the matrix, > would you please have look at my code. It is just the petsc related part > from my code. > I was able to fix some of the error messages. Now I have a new set of > error messages related to the KSP solver (attached) > > I appreciate your help > > Sawsan > ------------------------------ > *From:* Mark Adams > *Sent:* Wednesday, December 20, 2023 6:44 AM > *To:* Shatanawi, Sawsan Muhammad > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] Help with Integrating PETSc into Fortran > Groundwater Flow Simulation Code > > *[EXTERNAL EMAIL]* > Did you set preallocation values when you created the matrix? > Don't do that. 
> > On Wed, Dec 20, 2023 at 9:36?AM Shatanawi, Sawsan Muhammad < > sawsan.shatanawi at wsu.edu> wrote: > > Hello, > > I am trying to create a sparse matrix( which is as I believe a zero > matrix) then adding some nonzero elements to it over a loop, then > assembling it > > Get Outlook for iOS > > ------------------------------ > *From:* Mark Adams > *Sent:* Wednesday, December 20, 2023 2:48 AM > *To:* Shatanawi, Sawsan Muhammad > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] Help with Integrating PETSc into Fortran > Groundwater Flow Simulation Code > > *[EXTERNAL EMAIL]* > I am guessing that you are creating a matrix, adding to it, finalizing it > ("assembly"), and then adding to it again, which is fine, but you are > adding new non-zeros to the sparsity pattern. > If this is what you want then you can tell the matrix to let you do that. > Otherwise you have a bug. > > Mark > > On Tue, Dec 19, 2023 at 9:50?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: > > Hello everyone, > > I hope this email finds you well. > > My Name is Sawsan Shatanawi, and I am currently working on developing a > Fortran code for simulating groundwater flow in a 3D system. The code > involves solving a nonlinear system, and I have created the matrix to be > solved using the PCG solver and Picard iteration. However, when I tried > to assign it as a PETSc matrix I started getting a lot of error messages. > > I am kindly asking if someone can help me, I would be happy to share my > code with him/her. > > Please find the attached file contains a list of errors I have gotten > > Thank you in advance for your time and assistance. > > Best regards, > > Sawsan > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From sawsan.shatanawi at wsu.edu Wed Dec 20 21:02:17 2023 From: sawsan.shatanawi at wsu.edu (Shatanawi, Sawsan Muhammad) Date: Thu, 21 Dec 2023 03:02:17 +0000 Subject: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code In-Reply-To: References: <17B6144E-4F3A-4173-9818-4B03669736C4@petsc.dev> Message-ID: Hello Matthew, Thank you for your help. I am sorry that I keep coming back with my error messages, but I reached a point that I don't know how to fix them, and I don't understand them easily. The list of errors is getting shorter, now I am getting the attached error messages Thank you again, Sawsan ________________________________ From: Matthew Knepley Sent: Wednesday, December 20, 2023 6:54 PM To: Shatanawi, Sawsan Muhammad Cc: Barry Smith ; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code [EXTERNAL EMAIL] On Wed, Dec 20, 2023 at 9:49?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: Hello Barry, Thank you a lot for your help, Now I am getting the attached error message. Do not destroy the PC from KSPGetPC() THanks, Matt Bests, Sawsan ________________________________ From: Barry Smith > Sent: Wednesday, December 20, 2023 6:32 PM To: Shatanawi, Sawsan Muhammad > Cc: Mark Adams >; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code [EXTERNAL EMAIL] Instead of call PCCreate(PETSC_COMM_WORLD, pc, ierr) call PCSetType(pc, PCILU,ierr) ! 
Choose a preconditioner type (ILU) call KSPSetPC(ksp, pc,ierr) ! Associate the preconditioner with the KSP solver do call KSPGetPC(ksp,pc,ierr) call PCSetType(pc, PCILU,ierr) Do not call KSPSetUp(). It will be taken care of automatically during the solve On Dec 20, 2023, at 8:52?PM, Shatanawi, Sawsan Muhammad via petsc-users > wrote: Hello, I don't think that I set preallocation values when I created the matrix, would you please have look at my code. It is just the petsc related part from my code. I was able to fix some of the error messages. Now I have a new set of error messages related to the KSP solver (attached) I appreciate your help Sawsan ________________________________ From: Mark Adams > Sent: Wednesday, December 20, 2023 6:44 AM To: Shatanawi, Sawsan Muhammad > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code [EXTERNAL EMAIL] Did you set preallocation values when you created the matrix? Don't do that. On Wed, Dec 20, 2023 at 9:36?AM Shatanawi, Sawsan Muhammad > wrote: Hello, I am trying to create a sparse matrix( which is as I believe a zero matrix) then adding some nonzero elements to it over a loop, then assembling it Get Outlook for iOS ________________________________ From: Mark Adams > Sent: Wednesday, December 20, 2023 2:48 AM To: Shatanawi, Sawsan Muhammad > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code [EXTERNAL EMAIL] I am guessing that you are creating a matrix, adding to it, finalizing it ("assembly"), and then adding to it again, which is fine, but you are adding new non-zeros to the sparsity pattern. If this is what you want then you can tell the matrix to let you do that. Otherwise you have a bug. Mark On Tue, Dec 19, 2023 at 9:50?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: Hello everyone, I hope this email finds you well. My Name is Sawsan Shatanawi, and I am currently working on developing a Fortran code for simulating groundwater flow in a 3D system. The code involves solving a nonlinear system, and I have created the matrix to be solved using the PCG solver and Picard iteration. However, when I tried to assign it as a PETSc matrix I started getting a lot of error messages. I am kindly asking if someone can help me, I would be happy to share my code with him/her. Please find the attached file contains a list of errors I have gotten Thank you in advance for your time and assistance. Best regards, Sawsan -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
Name: out3.txt URL: From srvenkat at utexas.edu Wed Dec 20 22:04:09 2023 From: srvenkat at utexas.edu (Sreeram R Venkat) Date: Thu, 21 Dec 2023 09:34:09 +0530 Subject: [petsc-users] Matvecs and KSPSolves with multiple vectors In-Reply-To: References: <855B7267-E3F9-43CC-9EBE-96276E9A8E42@petsc.dev> <9F227CDE-815C-4446-A424-0BBDA38B1E79@joliv.et> <186F2922-CD1B-4778-BDB3-8DA6EAB8D976@joliv.et> <7921468C-F4B9-4BC0-ADD1-2F84619E8A7E@joliv.et> Message-ID: Would using the CHOLMOD Cholesky factorization ( https://petsc.org/release/manualpages/Mat/MATSOLVERCHOLMOD/) let us do the factorization on device as well? On Wed, Dec 20, 2023 at 1:21?PM Pierre Jolivet wrote: > > > On 20 Dec 2023, at 8:42?AM, Sreeram R Venkat wrote: > > Ok, I think the error I'm getting has something to do with how the > multiple solves are being done in succession. I'll try to see if there's > anything I'm doing wrong there. > > One question about the -pc_type lu -ksp_type preonly method: do you know > which parts of the solve (factorization/triangular solves) are done on host > and which are done on device? > > > I think only the triangular solves can be done on device. > Since you have many right-hand sides, it may not be that bad. > GPU people will hopefully give you a more insightful answer. > > Thanks, > Pierre > > Thanks, > Sreeram > > On Sat, Dec 16, 2023 at 10:56?PM Pierre Jolivet wrote: > >> Unfortunately, I am not able to reproduce such a failure with your input >> matrix. >> I?ve used ex79 that I linked previously and the system is properly solved. >> $ ./ex79 -pc_type hypre -ksp_type hpddm -ksp_hpddm_type cg >> -ksp_converged_reason -ksp_view_mat ascii::ascii_info -ksp_view_rhs >> ascii::ascii_info >> Linear solve converged due to CONVERGED_RTOL iterations 6 >> Mat Object: 1 MPI process >> type: seqaijcusparse >> rows=289, cols=289 >> total: nonzeros=2401, allocated nonzeros=2401 >> total number of mallocs used during MatSetValues calls=0 >> not using I-node routines >> Mat Object: 1 MPI process >> type: seqdensecuda >> rows=289, cols=10 >> total: nonzeros=2890, allocated nonzeros=2890 >> total number of mallocs used during MatSetValues calls=0 >> >> You mentioned in a subsequent email that you are interested in systems >> with at most 1E6 unknowns, and up to 1E4 right-hand sides. >> I?m not sure you can expect significant gains from using GPU for such >> systems. >> Probably, the fastest approach would indeed be -pc_type lu -ksp_type >> preonly -ksp_matsolve_batch_size 100 or something, depending on the memory >> available on your host. >> >> Thanks, >> Pierre >> >> On 15 Dec 2023, at 9:52?PM, Sreeram R Venkat wrote: >> >> Here are the ksp_view files. I set the options >> -ksp_error_if_not_converged to try to get the vectors that caused the >> error. I noticed that some of the KSPMatSolves converge while others don't. >> In the code, the solves are called as: >> >> input vector v --> insert data of v into a dense mat --> KSPMatSolve() >> --> MatMatMult() --> KSPMatSolve() --> insert data of dense mat into output >> vector w -- output w >> >> The operator used in the KSP is a Laplacian-like operator, and the >> MatMatMult is with a Mass Matrix. The whole thing is supposed to be a solve >> with a biharmonic-like operator. I can also run it with only the first >> KSPMatSolve (i.e. just a Laplacian-like operator). In that case, the KSP >> reportedly converges after 0 iterations (see the next line), but this >> causes problems in other parts of the code later on. 
>> >> I saw that sometimes the first KSPMatSolve "converges" after 0 iterations >> due to CONVERGED_RTOL. Then, the second KSPMatSolve produces a NaN/Inf. I >> tried setting ksp_min_it, but that didn't seem to do anything. >> >> I'll keep trying different options and also try to get the MWE made (this >> KSPMatSolve is pretty performance critical for us). >> >> Thanks for all your help, >> Sreeram >> >> On Fri, Dec 15, 2023 at 1:01?AM Pierre Jolivet wrote: >> >>> >>> On 14 Dec 2023, at 11:45?PM, Sreeram R Venkat >>> wrote: >>> >>> Thanks, I will try to create a minimal reproducible example. This may >>> take me some time though, as I need to figure out how to extract only the >>> relevant parts (the full program this solve is used in is getting quite >>> complex). >>> >>> >>> You could just do -ksp_view_mat binary:Amat.bin -ksp_view_pmat >>> binary:Pmat.bin -ksp_view_rhs binary:rhs.bin and send me those three files >>> (I?m guessing your are using double-precision scalars with 32-bit PetscInt). >>> >>> I'll also try out some of the BoomerAMG options to see if that helps. >>> >>> >>> These should work (this is where all ?PCMatApply()-ready? PC are being >>> tested): >>> https://petsc.org/release/src/ksp/ksp/tutorials/ex79.c.html#line215 >>> You can see it?s also testing PCHYPRE + KSPHPDDM on device (but not with >>> HIP). >>> I?m aware the performance should not be optimal (see your comment about >>> host/device copies), I?ve money to hire someone to work on this but: a) I >>> need to find the correct engineer/post-doc, b) I currently don?t have good >>> use cases (of course, I could generate a synthetic benchmark, for science). >>> So even if you send me the three Mat, a MWE would be appreciated if the >>> KSPMatSolve() is performance-critical for you (see point b) from above). >>> >>> Thanks, >>> Pierre >>> >>> Thanks, >>> Sreeram >>> >>> On Thu, Dec 14, 2023, 1:12?PM Pierre Jolivet wrote: >>> >>>> >>>> >>>> On 14 Dec 2023, at 8:02?PM, Sreeram R Venkat >>>> wrote: >>>> >>>> Hello Pierre, >>>> >>>> Thank you for your reply. I tried out the HPDDM CG as you said, and it >>>> seems to be doing the batched solves, but the KSP is not converging due to >>>> a NaN or Inf being generated. I also noticed there are a lot of >>>> host-to-device and device-to-host copies of the matrices (the non-batched >>>> KSP solve did not have any memcopies). I have attached dump.0 again. Could >>>> you please take a look? >>>> >>>> >>>> Yes, but you?d need to send me something I can run with your set of >>>> options (if you are more confident doing this in private, you can remove >>>> the list from c/c). >>>> Not all BoomerAMG smoothers handle blocks of right-hand sides, and >>>> there is not much error checking, so instead of erroring out, this may be >>>> the reason why you are getting garbage. >>>> >>>> Thanks, >>>> Pierre >>>> >>>> Thanks, >>>> Sreeram >>>> >>>> On Thu, Dec 14, 2023 at 12:42?AM Pierre Jolivet >>>> wrote: >>>> >>>>> Hello Sreeram, >>>>> KSPCG (PETSc implementation of CG) does not handle solves with >>>>> multiple columns at once. >>>>> There is only a single native PETSc KSP implementation which handles >>>>> solves with multiple columns at once: KSPPREONLY. >>>>> If you use --download-hpddm, you can use a CG (or GMRES, or more >>>>> advanced methods) implementation which handles solves with multiple columns >>>>> at once (via -ksp_type hpddm -ksp_hpddm_type cg or KSPSetType(ksp, >>>>> KSPHPDDM); KSPHPDDMSetType(ksp, KSP_HPDDM_TYPE_CG);). 
>>>>> I?m the main author of HPDDM, there is preliminary support for device >>>>> matrices, but if it?s not working as intended/not faster than column by >>>>> column, I?d be happy to have a deeper look (maybe in private), because most >>>>> (if not all) of my users interested in (pseudo-)block Krylov solvers (i.e., >>>>> solvers that treat right-hand sides in a single go) are using plain host >>>>> matrices. >>>>> >>>>> Thanks, >>>>> Pierre >>>>> >>>>> PS: you could have a look at >>>>> https://www.sciencedirect.com/science/article/abs/pii/S0898122121000055 to >>>>> understand the philosophy behind block iterative methods in PETSc (and in >>>>> HPDDM), src/mat/tests/ex237.c, the benchmark I mentioned earlier, was >>>>> developed in the context of this paper to produce Figures 2-3. Note that >>>>> this paper is now slightly outdated, since then, PCHYPRE and PCMG (among >>>>> others) have been made ?PCMatApply()-ready?. >>>>> >>>>> On 13 Dec 2023, at 11:05?PM, Sreeram R Venkat >>>>> wrote: >>>>> >>>>> Hello Pierre, >>>>> >>>>> I am trying out the KSPMatSolve with the BoomerAMG preconditioner. >>>>> However, I am noticing that it is still solving column by column (this is >>>>> stated explicitly in the info dump attached). I looked at the code for >>>>> KSPMatSolve_Private() and saw that as long as ksp->ops->matsolve is >>>>> true, it should do the batched solve, though I'm not sure where that gets >>>>> set. >>>>> >>>>> I am using the options -pc_type hypre -pc_hypre_type boomeramg when >>>>> running the code. >>>>> >>>>> Can you please help me with this? >>>>> >>>>> Thanks, >>>>> Sreeram >>>>> >>>>> >>>>> On Thu, Dec 7, 2023 at 4:04?PM Mark Adams wrote: >>>>> >>>>>> N.B., AMGX interface is a bit experimental. >>>>>> Mark >>>>>> >>>>>> On Thu, Dec 7, 2023 at 4:11?PM Sreeram R Venkat >>>>>> wrote: >>>>>> >>>>>>> Oh, in that case I will try out BoomerAMG. Getting AMGX to build >>>>>>> correctly was also tricky so hopefully the HYPRE build will be easier. >>>>>>> >>>>>>> Thanks, >>>>>>> Sreeram >>>>>>> >>>>>>> On Thu, Dec 7, 2023, 3:03 PM Pierre Jolivet wrote: >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On 7 Dec 2023, at 9:37?PM, Sreeram R Venkat >>>>>>>> wrote: >>>>>>>> >>>>>>>> Thank you Barry and Pierre; I will proceed with the first option. >>>>>>>> >>>>>>>> I want to use the AMGX preconditioner for the KSP. I will try it >>>>>>>> out and see how it performs. >>>>>>>> >>>>>>>> >>>>>>>> Just FYI, AMGX does not handle systems with multiple RHS, and thus >>>>>>>> has no PCMatApply() implementation. >>>>>>>> BoomerAMG does, and there is a PCMatApply_HYPRE_BoomerAMG() >>>>>>>> implementation. >>>>>>>> But let us know if you need assistance figuring things out. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Pierre >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Sreeram >>>>>>>> >>>>>>>> On Thu, Dec 7, 2023 at 2:02?PM Pierre Jolivet >>>>>>>> wrote: >>>>>>>> >>>>>>>>> To expand on Barry?s answer, we have observed repeatedly that >>>>>>>>> MatMatMult with MatAIJ performs better than MatMult with MatMAIJ, you can >>>>>>>>> reproduce this on your own with >>>>>>>>> https://petsc.org/release/src/mat/tests/ex237.c.html. >>>>>>>>> Also, I?m guessing you are using some sort of preconditioner >>>>>>>>> within your KSP. >>>>>>>>> Not all are ?KSPMatSolve-ready?, i.e., they may treat blocks of >>>>>>>>> right-hand sides column by column, which is very inefficient. 
>>>>>>>>> You could run your code with -info dump and send us dump.0 to see >>>>>>>>> what needs to be done on our end to make things more efficient, should you >>>>>>>>> not be satisfied with the current performance of the code. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Pierre >>>>>>>>> >>>>>>>>> On 7 Dec 2023, at 8:34?PM, Barry Smith wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Dec 7, 2023, at 1:17?PM, Sreeram R Venkat >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>> I have 2 sequential matrices M and R (both MATSEQAIJCUSPARSE of >>>>>>>>> size n x n) and a vector v of size n*m. v = [v_1 , v_2 ,... , v_m] where >>>>>>>>> v_i has size n. The data for v can be stored either in column-major or >>>>>>>>> row-major order. Now, I want to do 2 types of operations: >>>>>>>>> >>>>>>>>> 1. Matvecs of the form M*v_i = w_i, for i = 1..m. >>>>>>>>> 2. KSPSolves of the form R*x_i = v_i, for i = 1..m. >>>>>>>>> >>>>>>>>> From what I have read on the documentation, I can think of 2 >>>>>>>>> approaches. >>>>>>>>> >>>>>>>>> 1. Get the pointer to the data in v (column-major) and use it to >>>>>>>>> create a dense matrix V. Then do a MatMatMult with M*V = W, and take the >>>>>>>>> data pointer of W to create the vector w. For KSPSolves, use KSPMatSolve >>>>>>>>> with R and V. >>>>>>>>> >>>>>>>>> 2. Create a MATMAIJ using M/R and use that for matvecs directly >>>>>>>>> with the vector v. I don't know if KSPSolve with the MATMAIJ will know that >>>>>>>>> it is a multiple RHS system and act accordingly. >>>>>>>>> >>>>>>>>> Which would be the more efficient option? >>>>>>>>> >>>>>>>>> >>>>>>>>> Use 1. >>>>>>>>> >>>>>>>>> >>>>>>>>> As a side-note, I am also wondering if there is a way to use >>>>>>>>> row-major storage of the vector v. >>>>>>>>> >>>>>>>>> >>>>>>>>> No >>>>>>>>> >>>>>>>>> The reason is that this could allow for more coalesced memory >>>>>>>>> access when doing matvecs. >>>>>>>>> >>>>>>>>> >>>>>>>>> PETSc matrix-vector products use BLAS GMEV matrix-vector >>>>>>>>> products for the computation so in theory they should already be >>>>>>>>> well-optimized >>>>>>>>> >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Sreeram >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>> >>>>> >>>>> >>>> >>>> >>>> >>> >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Thu Dec 21 05:04:57 2023 From: jroman at dsic.upv.es (Jose E. Roman) Date: Thu, 21 Dec 2023 12:04:57 +0100 Subject: [petsc-users] SLEPc/NEP for shell matrice T(lambda) and T'(lambda) In-Reply-To: <94CFE0C0-7A64-4E9E-800B-B18CEAF83BFF@duke.edu> References: <89E53665-4C0D-4583-9C90-13C4C108A4EA@dsic.upv.es> <442B3841-B668-4185-9C6F-D03CA481CA26@dsic.upv.es> <19A0679D-2A50-4E64-A805-F26582562B9A@dsic.upv.es> <94CFE0C0-7A64-4E9E-800B-B18CEAF83BFF@duke.edu> Message-ID: <874379C7-D9DB-4CC2-94BF-198FEC9B8E49@dsic.upv.es> The errors are strange. The traceback points to harmless operations. Likely memory corruption, as the message says. Those tests are included in SLEPc pipelines, they are run with serveral Linux distributions, with several compilers. Also, on my macOS it runs cleanly, although my configuration is different from yours. I don't have access to an M1 computer. Also, using gcc instead of clang from xcode may have unexpected side effects, I don't know. I would try with less agressive optimization flags, e.g., --COPTFLAGS=-O --CXXOPTFLAGS=-O --FOPTFLAGS=-O (or even remove them completely). Maybe try also --with-debugging=0. 
Another thing you can try is changing the BLAS/LAPACK, e.g., removing --download-fblaslapack or replacing it with --download-netlib-lapack. See also the FAQ https://petsc.org/release/faq/#what-does-corrupt-argument-or-caught-signal-or-segv-or-segmentation-violation-or-bus-error-mean-can-i-use-valgrind-or-cuda-memcheck-to-debug-memory-corruption-issues Jose > On 20 Dec 2023, at 22:19, Kenneth C Hall wrote: > > Jose, > > I have been revisiting the issue of SLEPc/NEP for shell matrices T(lambda) and T'(lambda). > I am having problems running SLEPc/NEP with -nep_type nleigs. > > I have compiled two versions of PETSc/SLEPc: > > petsc-arch-real / slepc-arch-real > ./configure --with-cc=gcc-13 --with-cxx=g++-13 --with-fc=gfortran --COPTFLAGS='-O3 -fopenmp' --CXXOPTFLAGS='-O3 -fopenmp' > --FOPTFLAGS='-O3 -fopenmp' --with-debugging=1 --with-logging=1 --with-scalar-type=real --with-precision=double > --download-fblaslapack --with-openmp --with-mpi=0 > > > petsc-arch-complex / slepc-arch-complex > ./configure --with-cc=gcc-13 --with-cxx=g++-13 --with-fc=gfortran --COPTFLAGS='-O3 -fopenmp' --CXXOPTFLAGS='-O3 -fopenmp' > --FOPTFLAGS='-O3 -fopenmp' --with-debugging=1 --with-logging=1 --with-scalar-type=complex --with-precision=double > --download-fblaslapack --with-openmp --with-mpi=0 > > I use gfortran on an Apple Mac Mini M1. Both the PETSc and SLEPc versions are the latest development versions as of today (a6690fd8 and 267bd1cd, respectively). > > I ran the ex54f90 test cases: > > % main-arch-real -nep_type slp -nep_slp_ksp_type gmres -nep_slp_pc_type none > % main-arch-complex -nep_type slp -nep_slp_ksp_type gmres -nep_slp_pc_type none > % main-arch-real -nep_type nleigs -rg_interval_endpoints 0.2,1.1 -nep_nleigs_ksp_type gmres -nep_nleigs_pc_type none > % main-arch-complex -nep_type nleigs -rg_interval_endpoints 0.2,1.1,-.1,.1 -nep_nleigs_ksp_type gmres -nep_nleigs_pc_type none > > Both the slp cases ran as expected and gave the correct answer. > However, both the real and complex architectures failed for the nleigs case. > > For the complex case, none of the callback functions appear to have been called. > For the real case, only the MatMult_A routine appears to be called; it is called 100 times, returning each time, as lambda sweeps from 0.2 to 1.1. > > Any suggestions would be welcome. > > Best regards, > Kenneth Hall > From knepley at gmail.com Thu Dec 21 05:48:04 2023 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 21 Dec 2023 06:48:04 -0500 Subject: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code In-Reply-To: References: <17B6144E-4F3A-4173-9818-4B03669736C4@petsc.dev> Message-ID: On Wed, Dec 20, 2023 at 10:02 PM Shatanawi, Sawsan Muhammad < sawsan.shatanawi at wsu.edu> wrote: > Hello Matthew, > > Thank you for your help. I am sorry that I keep coming back with my error > messages, but I have reached a point where I don't know how to fix them, and I > don't understand them easily. > The list of errors is getting shorter; now I am getting the attached error > messages. > You are overwriting memory somewhere, but we cannot see the code, and thus cannot tell where. You can figure this out by running in the debugger, using -start_in_debugger, which launches a debugger window, and then 'cont' to run until the error, and then 'where' to print the stack trace.
Thanks, Matt > Thank you again, > > Sawsan > ------------------------------ > *From:* Matthew Knepley > *Sent:* Wednesday, December 20, 2023 6:54 PM > *To:* Shatanawi, Sawsan Muhammad > *Cc:* Barry Smith ; petsc-users at mcs.anl.gov < > petsc-users at mcs.anl.gov> > *Subject:* Re: [petsc-users] Help with Integrating PETSc into Fortran > Groundwater Flow Simulation Code > > > *[EXTERNAL EMAIL]* > On Wed, Dec 20, 2023 at 9:49?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: > > Hello Barry, > > Thank you a lot for your help, Now I am getting the attached error message. > > > Do not destroy the PC from KSPGetPC() > > THanks, > > Matt > > > Bests, > Sawsan > ------------------------------ > *From:* Barry Smith > *Sent:* Wednesday, December 20, 2023 6:32 PM > *To:* Shatanawi, Sawsan Muhammad > *Cc:* Mark Adams ; petsc-users at mcs.anl.gov < > petsc-users at mcs.anl.gov> > *Subject:* Re: [petsc-users] Help with Integrating PETSc into Fortran > Groundwater Flow Simulation Code > > > *[EXTERNAL EMAIL]* > > Instead of > > call PCCreate(PETSC_COMM_WORLD, pc, ierr) > call PCSetType(pc, PCILU,ierr) ! Choose a preconditioner type (ILU) > call KSPSetPC(ksp, pc,ierr) ! Associate the preconditioner with the > KSP solver > > do > > call KSPGetPC(ksp,pc,ierr) > call PCSetType(pc, PCILU,ierr) > > Do not call KSPSetUp(). It will be taken care of automatically during the > solve > > > > On Dec 20, 2023, at 8:52?PM, Shatanawi, Sawsan Muhammad via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Hello, > I don't think that I set preallocation values when I created the matrix, > would you please have look at my code. It is just the petsc related part > from my code. > I was able to fix some of the error messages. Now I have a new set of > error messages related to the KSP solver (attached) > > I appreciate your help > > Sawsan > ------------------------------ > *From:* Mark Adams > *Sent:* Wednesday, December 20, 2023 6:44 AM > *To:* Shatanawi, Sawsan Muhammad > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] Help with Integrating PETSc into Fortran > Groundwater Flow Simulation Code > > *[EXTERNAL EMAIL]* > Did you set preallocation values when you created the matrix? > Don't do that. > > On Wed, Dec 20, 2023 at 9:36?AM Shatanawi, Sawsan Muhammad < > sawsan.shatanawi at wsu.edu> wrote: > > Hello, > > I am trying to create a sparse matrix( which is as I believe a zero > matrix) then adding some nonzero elements to it over a loop, then > assembling it > > Get Outlook for iOS > > ------------------------------ > *From:* Mark Adams > *Sent:* Wednesday, December 20, 2023 2:48 AM > *To:* Shatanawi, Sawsan Muhammad > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] Help with Integrating PETSc into Fortran > Groundwater Flow Simulation Code > > *[EXTERNAL EMAIL]* > I am guessing that you are creating a matrix, adding to it, finalizing it > ("assembly"), and then adding to it again, which is fine, but you are > adding new non-zeros to the sparsity pattern. > If this is what you want then you can tell the matrix to let you do that. > Otherwise you have a bug. > > Mark > > On Tue, Dec 19, 2023 at 9:50?PM Shatanawi, Sawsan Muhammad via petsc-users > wrote: > > Hello everyone, > > I hope this email finds you well. > > My Name is Sawsan Shatanawi, and I am currently working on developing a > Fortran code for simulating groundwater flow in a 3D system. 
The code > involves solving a nonlinear system, and I have created the matrix to be > solved using the PCG solver and Picard iteration. However, when I tried > to assign it as a PETSc matrix I started getting a lot of error messages. > > I am kindly asking if someone can help me, I would be happy to share my > code with him/her. > > Please find the attached file contains a list of errors I have gotten > > Thank you in advance for your time and assistance. > > Best regards, > > Sawsan > > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Dec 21 05:53:43 2023 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 21 Dec 2023 06:53:43 -0500 Subject: [petsc-users] Matvecs and KSPSolves with multiple vectors In-Reply-To: References: <855B7267-E3F9-43CC-9EBE-96276E9A8E42@petsc.dev> <9F227CDE-815C-4446-A424-0BBDA38B1E79@joliv.et> <186F2922-CD1B-4778-BDB3-8DA6EAB8D976@joliv.et> <7921468C-F4B9-4BC0-ADD1-2F84619E8A7E@joliv.et> Message-ID: On Thu, Dec 21, 2023 at 6:46?AM Sreeram R Venkat wrote: > Ok, I think the error I'm getting has something to do with how the > multiple solves are being done in succession. I'll try to see if there's > anything I'm doing wrong there. > > One question about the -pc_type lu -ksp_type preonly method: do you know > which parts of the solve (factorization/triangular solves) are done on host > and which are done on device? > For SEQDENSE, I believe both the factorization and solve is on device. It is hard to see, but I believe the dispatch code is here: https://gitlab.com/petsc/petsc/-/blob/main/src/mat/impls/dense/seq/cupm/matseqdensecupm.hpp?ref_type=heads#L368 Thanks, Matt > Thanks, > Sreeram > > On Sat, Dec 16, 2023 at 10:56?PM Pierre Jolivet wrote: > >> Unfortunately, I am not able to reproduce such a failure with your input >> matrix. >> I?ve used ex79 that I linked previously and the system is properly solved. >> $ ./ex79 -pc_type hypre -ksp_type hpddm -ksp_hpddm_type cg >> -ksp_converged_reason -ksp_view_mat ascii::ascii_info -ksp_view_rhs >> ascii::ascii_info >> Linear solve converged due to CONVERGED_RTOL iterations 6 >> Mat Object: 1 MPI process >> type: seqaijcusparse >> rows=289, cols=289 >> total: nonzeros=2401, allocated nonzeros=2401 >> total number of mallocs used during MatSetValues calls=0 >> not using I-node routines >> Mat Object: 1 MPI process >> type: seqdensecuda >> rows=289, cols=10 >> total: nonzeros=2890, allocated nonzeros=2890 >> total number of mallocs used during MatSetValues calls=0 >> >> You mentioned in a subsequent email that you are interested in systems >> with at most 1E6 unknowns, and up to 1E4 right-hand sides. >> I?m not sure you can expect significant gains from using GPU for such >> systems. >> Probably, the fastest approach would indeed be -pc_type lu -ksp_type >> preonly -ksp_matsolve_batch_size 100 or something, depending on the memory >> available on your host. >> >> Thanks, >> Pierre >> >> On 15 Dec 2023, at 9:52?PM, Sreeram R Venkat wrote: >> >> Here are the ksp_view files. 
I set the options >> -ksp_error_if_not_converged to try to get the vectors that caused the >> error. I noticed that some of the KSPMatSolves converge while others don't. >> In the code, the solves are called as: >> >> input vector v --> insert data of v into a dense mat --> KSPMatSolve() >> --> MatMatMult() --> KSPMatSolve() --> insert data of dense mat into output >> vector w -- output w >> >> The operator used in the KSP is a Laplacian-like operator, and the >> MatMatMult is with a Mass Matrix. The whole thing is supposed to be a solve >> with a biharmonic-like operator. I can also run it with only the first >> KSPMatSolve (i.e. just a Laplacian-like operator). In that case, the KSP >> reportedly converges after 0 iterations (see the next line), but this >> causes problems in other parts of the code later on. >> >> I saw that sometimes the first KSPMatSolve "converges" after 0 iterations >> due to CONVERGED_RTOL. Then, the second KSPMatSolve produces a NaN/Inf. I >> tried setting ksp_min_it, but that didn't seem to do anything. >> >> I'll keep trying different options and also try to get the MWE made (this >> KSPMatSolve is pretty performance critical for us). >> >> Thanks for all your help, >> Sreeram >> >> On Fri, Dec 15, 2023 at 1:01?AM Pierre Jolivet wrote: >> >>> >>> On 14 Dec 2023, at 11:45?PM, Sreeram R Venkat >>> wrote: >>> >>> Thanks, I will try to create a minimal reproducible example. This may >>> take me some time though, as I need to figure out how to extract only the >>> relevant parts (the full program this solve is used in is getting quite >>> complex). >>> >>> >>> You could just do -ksp_view_mat binary:Amat.bin -ksp_view_pmat >>> binary:Pmat.bin -ksp_view_rhs binary:rhs.bin and send me those three files >>> (I?m guessing your are using double-precision scalars with 32-bit PetscInt). >>> >>> I'll also try out some of the BoomerAMG options to see if that helps. >>> >>> >>> These should work (this is where all ?PCMatApply()-ready? PC are being >>> tested): >>> https://petsc.org/release/src/ksp/ksp/tutorials/ex79.c.html#line215 >>> You can see it?s also testing PCHYPRE + KSPHPDDM on device (but not with >>> HIP). >>> I?m aware the performance should not be optimal (see your comment about >>> host/device copies), I?ve money to hire someone to work on this but: a) I >>> need to find the correct engineer/post-doc, b) I currently don?t have good >>> use cases (of course, I could generate a synthetic benchmark, for science). >>> So even if you send me the three Mat, a MWE would be appreciated if the >>> KSPMatSolve() is performance-critical for you (see point b) from above). >>> >>> Thanks, >>> Pierre >>> >>> Thanks, >>> Sreeram >>> >>> On Thu, Dec 14, 2023, 1:12?PM Pierre Jolivet wrote: >>> >>>> >>>> >>>> On 14 Dec 2023, at 8:02?PM, Sreeram R Venkat >>>> wrote: >>>> >>>> Hello Pierre, >>>> >>>> Thank you for your reply. I tried out the HPDDM CG as you said, and it >>>> seems to be doing the batched solves, but the KSP is not converging due to >>>> a NaN or Inf being generated. I also noticed there are a lot of >>>> host-to-device and device-to-host copies of the matrices (the non-batched >>>> KSP solve did not have any memcopies). I have attached dump.0 again. Could >>>> you please take a look? >>>> >>>> >>>> Yes, but you?d need to send me something I can run with your set of >>>> options (if you are more confident doing this in private, you can remove >>>> the list from c/c). 
>>>> Not all BoomerAMG smoothers handle blocks of right-hand sides, and >>>> there is not much error checking, so instead of erroring out, this may be >>>> the reason why you are getting garbage. >>>> >>>> Thanks, >>>> Pierre >>>> >>>> Thanks, >>>> Sreeram >>>> >>>> On Thu, Dec 14, 2023 at 12:42?AM Pierre Jolivet >>>> wrote: >>>> >>>>> Hello Sreeram, >>>>> KSPCG (PETSc implementation of CG) does not handle solves with >>>>> multiple columns at once. >>>>> There is only a single native PETSc KSP implementation which handles >>>>> solves with multiple columns at once: KSPPREONLY. >>>>> If you use --download-hpddm, you can use a CG (or GMRES, or more >>>>> advanced methods) implementation which handles solves with multiple columns >>>>> at once (via -ksp_type hpddm -ksp_hpddm_type cg or KSPSetType(ksp, >>>>> KSPHPDDM); KSPHPDDMSetType(ksp, KSP_HPDDM_TYPE_CG);). >>>>> I?m the main author of HPDDM, there is preliminary support for device >>>>> matrices, but if it?s not working as intended/not faster than column by >>>>> column, I?d be happy to have a deeper look (maybe in private), because most >>>>> (if not all) of my users interested in (pseudo-)block Krylov solvers (i.e., >>>>> solvers that treat right-hand sides in a single go) are using plain host >>>>> matrices. >>>>> >>>>> Thanks, >>>>> Pierre >>>>> >>>>> PS: you could have a look at >>>>> https://www.sciencedirect.com/science/article/abs/pii/S0898122121000055 to >>>>> understand the philosophy behind block iterative methods in PETSc (and in >>>>> HPDDM), src/mat/tests/ex237.c, the benchmark I mentioned earlier, was >>>>> developed in the context of this paper to produce Figures 2-3. Note that >>>>> this paper is now slightly outdated, since then, PCHYPRE and PCMG (among >>>>> others) have been made ?PCMatApply()-ready?. >>>>> >>>>> On 13 Dec 2023, at 11:05?PM, Sreeram R Venkat >>>>> wrote: >>>>> >>>>> Hello Pierre, >>>>> >>>>> I am trying out the KSPMatSolve with the BoomerAMG preconditioner. >>>>> However, I am noticing that it is still solving column by column (this is >>>>> stated explicitly in the info dump attached). I looked at the code for >>>>> KSPMatSolve_Private() and saw that as long as ksp->ops->matsolve is >>>>> true, it should do the batched solve, though I'm not sure where that gets >>>>> set. >>>>> >>>>> I am using the options -pc_type hypre -pc_hypre_type boomeramg when >>>>> running the code. >>>>> >>>>> Can you please help me with this? >>>>> >>>>> Thanks, >>>>> Sreeram >>>>> >>>>> >>>>> On Thu, Dec 7, 2023 at 4:04?PM Mark Adams wrote: >>>>> >>>>>> N.B., AMGX interface is a bit experimental. >>>>>> Mark >>>>>> >>>>>> On Thu, Dec 7, 2023 at 4:11?PM Sreeram R Venkat >>>>>> wrote: >>>>>> >>>>>>> Oh, in that case I will try out BoomerAMG. Getting AMGX to build >>>>>>> correctly was also tricky so hopefully the HYPRE build will be easier. >>>>>>> >>>>>>> Thanks, >>>>>>> Sreeram >>>>>>> >>>>>>> On Thu, Dec 7, 2023, 3:03 PM Pierre Jolivet wrote: >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On 7 Dec 2023, at 9:37?PM, Sreeram R Venkat >>>>>>>> wrote: >>>>>>>> >>>>>>>> Thank you Barry and Pierre; I will proceed with the first option. >>>>>>>> >>>>>>>> I want to use the AMGX preconditioner for the KSP. I will try it >>>>>>>> out and see how it performs. >>>>>>>> >>>>>>>> >>>>>>>> Just FYI, AMGX does not handle systems with multiple RHS, and thus >>>>>>>> has no PCMatApply() implementation. >>>>>>>> BoomerAMG does, and there is a PCMatApply_HYPRE_BoomerAMG() >>>>>>>> implementation. 
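For reference, PCMatApply() is the interface in question: a preconditioner that implements it is applied to a whole block of vectors in one call, which is what lets KSPMatSolve() avoid looping over columns. A small sketch of calling it directly, assuming a PETSc build with --download-hypre and using placeholder matrices A (operator), B (inputs), and X (outputs):

#include <petscksp.h>

/* Sketch: apply a PCMatApply()-capable preconditioner (BoomerAMG through PCHYPRE)
   to a block of vectors at once. All names are illustrative. */
static PetscErrorCode ApplyBlockPC(Mat A, Mat B, Mat X)
{
  PC pc;

  PetscFunctionBeginUser;
  PetscCall(PCCreate(PetscObjectComm((PetscObject)A), &pc));
  PetscCall(PCSetOperators(pc, A, A));
  PetscCall(PCSetType(pc, PCHYPRE));  /* together with -pc_hypre_type boomeramg */
  PetscCall(PCSetFromOptions(pc));
  PetscCall(PCSetUp(pc));
  PetscCall(PCMatApply(pc, B, X));    /* all columns of B treated in a single call */
  PetscCall(PCDestroy(&pc));
  PetscFunctionReturn(PETSC_SUCCESS);
}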
>>>>>>>> But let us know if you need assistance figuring things out. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Pierre >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Sreeram >>>>>>>> >>>>>>>> On Thu, Dec 7, 2023 at 2:02?PM Pierre Jolivet >>>>>>>> wrote: >>>>>>>> >>>>>>>>> To expand on Barry?s answer, we have observed repeatedly that >>>>>>>>> MatMatMult with MatAIJ performs better than MatMult with MatMAIJ, you can >>>>>>>>> reproduce this on your own with >>>>>>>>> https://petsc.org/release/src/mat/tests/ex237.c.html. >>>>>>>>> Also, I?m guessing you are using some sort of preconditioner >>>>>>>>> within your KSP. >>>>>>>>> Not all are ?KSPMatSolve-ready?, i.e., they may treat blocks of >>>>>>>>> right-hand sides column by column, which is very inefficient. >>>>>>>>> You could run your code with -info dump and send us dump.0 to see >>>>>>>>> what needs to be done on our end to make things more efficient, should you >>>>>>>>> not be satisfied with the current performance of the code. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Pierre >>>>>>>>> >>>>>>>>> On 7 Dec 2023, at 8:34?PM, Barry Smith wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Dec 7, 2023, at 1:17?PM, Sreeram R Venkat >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>> I have 2 sequential matrices M and R (both MATSEQAIJCUSPARSE of >>>>>>>>> size n x n) and a vector v of size n*m. v = [v_1 , v_2 ,... , v_m] where >>>>>>>>> v_i has size n. The data for v can be stored either in column-major or >>>>>>>>> row-major order. Now, I want to do 2 types of operations: >>>>>>>>> >>>>>>>>> 1. Matvecs of the form M*v_i = w_i, for i = 1..m. >>>>>>>>> 2. KSPSolves of the form R*x_i = v_i, for i = 1..m. >>>>>>>>> >>>>>>>>> From what I have read on the documentation, I can think of 2 >>>>>>>>> approaches. >>>>>>>>> >>>>>>>>> 1. Get the pointer to the data in v (column-major) and use it to >>>>>>>>> create a dense matrix V. Then do a MatMatMult with M*V = W, and take the >>>>>>>>> data pointer of W to create the vector w. For KSPSolves, use KSPMatSolve >>>>>>>>> with R and V. >>>>>>>>> >>>>>>>>> 2. Create a MATMAIJ using M/R and use that for matvecs directly >>>>>>>>> with the vector v. I don't know if KSPSolve with the MATMAIJ will know that >>>>>>>>> it is a multiple RHS system and act accordingly. >>>>>>>>> >>>>>>>>> Which would be the more efficient option? >>>>>>>>> >>>>>>>>> >>>>>>>>> Use 1. >>>>>>>>> >>>>>>>>> >>>>>>>>> As a side-note, I am also wondering if there is a way to use >>>>>>>>> row-major storage of the vector v. >>>>>>>>> >>>>>>>>> >>>>>>>>> No >>>>>>>>> >>>>>>>>> The reason is that this could allow for more coalesced memory >>>>>>>>> access when doing matvecs. >>>>>>>>> >>>>>>>>> >>>>>>>>> PETSc matrix-vector products use BLAS GMEV matrix-vector >>>>>>>>> products for the computation so in theory they should already be >>>>>>>>> well-optimized >>>>>>>>> >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Sreeram >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>> >>>>> >>>>> >>>> >>>> >>>> >>> >> >> >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ngoetting at itp.uni-bremen.de Thu Dec 21 07:35:37 2023 From: ngoetting at itp.uni-bremen.de (=?UTF-8?Q?Niclas_G=C3=B6tting?=) Date: Thu, 21 Dec 2023 14:35:37 +0100 Subject: [petsc-users] TS docs wrong URLs in Examples Message-ID: Hi all, I noticed that all links to the examples under https://petsc.org/release/manualpages/TS/TS/ point to the wrong URL. Instead of src/ts/**/*, they point to src/sys/**/*, which does not seem to be right. This definitely is a minor issue, but I couldn't see an obvious fix via "Edit this page", so here is the e-mail. Best regards Niclas From bsmith at petsc.dev Thu Dec 21 11:51:23 2023 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 21 Dec 2023 12:51:23 -0500 Subject: [petsc-users] TS docs wrong URLs in Examples In-Reply-To: References: Message-ID: Thanks for letting us know, we'll take a look at it. Barry > On Dec 21, 2023, at 8:35?AM, Niclas G?tting wrote: > > Hi all, > > I noticed that all links to the examples under https://petsc.org/release/manualpages/TS/TS/ point to the wrong URL. Instead of src/ts/**/*, they point to src/sys/**/*, which does not seem to be right. This definitely is a minor issue, but I couldn't see an obvious fix via "Edit this page", so here is the e-mail. > > Best regards > Niclas > From junchao.zhang at gmail.com Thu Dec 21 12:38:24 2023 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Thu, 21 Dec 2023 12:38:24 -0600 Subject: [petsc-users] Matvecs and KSPSolves with multiple vectors In-Reply-To: References: <855B7267-E3F9-43CC-9EBE-96276E9A8E42@petsc.dev> <9F227CDE-815C-4446-A424-0BBDA38B1E79@joliv.et> <186F2922-CD1B-4778-BDB3-8DA6EAB8D976@joliv.et> <7921468C-F4B9-4BC0-ADD1-2F84619E8A7E@joliv.et> Message-ID: On Thu, Dec 21, 2023 at 5:54?AM Matthew Knepley wrote: > On Thu, Dec 21, 2023 at 6:46?AM Sreeram R Venkat > wrote: > >> Ok, I think the error I'm getting has something to do with how the >> multiple solves are being done in succession. I'll try to see if there's >> anything I'm doing wrong there. >> >> One question about the -pc_type lu -ksp_type preonly method: do you know >> which parts of the solve (factorization/triangular solves) are done on host >> and which are done on device? >> > > For SEQDENSE, I believe both the factorization and solve is on device. It > is hard to see, but I believe the dispatch code is here: > Yes, it is correct. > > > https://gitlab.com/petsc/petsc/-/blob/main/src/mat/impls/dense/seq/cupm/matseqdensecupm.hpp?ref_type=heads#L368 > > Thanks, > > Matt > > >> Thanks, >> Sreeram >> >> On Sat, Dec 16, 2023 at 10:56?PM Pierre Jolivet wrote: >> >>> Unfortunately, I am not able to reproduce such a failure with your input >>> matrix. >>> I?ve used ex79 that I linked previously and the system is properly >>> solved. >>> $ ./ex79 -pc_type hypre -ksp_type hpddm -ksp_hpddm_type cg >>> -ksp_converged_reason -ksp_view_mat ascii::ascii_info -ksp_view_rhs >>> ascii::ascii_info >>> Linear solve converged due to CONVERGED_RTOL iterations 6 >>> Mat Object: 1 MPI process >>> type: seqaijcusparse >>> rows=289, cols=289 >>> total: nonzeros=2401, allocated nonzeros=2401 >>> total number of mallocs used during MatSetValues calls=0 >>> not using I-node routines >>> Mat Object: 1 MPI process >>> type: seqdensecuda >>> rows=289, cols=10 >>> total: nonzeros=2890, allocated nonzeros=2890 >>> total number of mallocs used during MatSetValues calls=0 >>> >>> You mentioned in a subsequent email that you are interested in systems >>> with at most 1E6 unknowns, and up to 1E4 right-hand sides. 
>>> I?m not sure you can expect significant gains from using GPU for such >>> systems. >>> Probably, the fastest approach would indeed be -pc_type lu -ksp_type >>> preonly -ksp_matsolve_batch_size 100 or something, depending on the memory >>> available on your host. >>> >>> Thanks, >>> Pierre >>> >>> On 15 Dec 2023, at 9:52?PM, Sreeram R Venkat >>> wrote: >>> >>> Here are the ksp_view files. I set the options >>> -ksp_error_if_not_converged to try to get the vectors that caused the >>> error. I noticed that some of the KSPMatSolves converge while others don't. >>> In the code, the solves are called as: >>> >>> input vector v --> insert data of v into a dense mat --> KSPMatSolve() >>> --> MatMatMult() --> KSPMatSolve() --> insert data of dense mat into output >>> vector w -- output w >>> >>> The operator used in the KSP is a Laplacian-like operator, and the >>> MatMatMult is with a Mass Matrix. The whole thing is supposed to be a solve >>> with a biharmonic-like operator. I can also run it with only the first >>> KSPMatSolve (i.e. just a Laplacian-like operator). In that case, the KSP >>> reportedly converges after 0 iterations (see the next line), but this >>> causes problems in other parts of the code later on. >>> >>> I saw that sometimes the first KSPMatSolve "converges" after 0 >>> iterations due to CONVERGED_RTOL. Then, the second KSPMatSolve produces a >>> NaN/Inf. I tried setting ksp_min_it, but that didn't seem to do anything. >>> >>> I'll keep trying different options and also try to get the MWE made >>> (this KSPMatSolve is pretty performance critical for us). >>> >>> Thanks for all your help, >>> Sreeram >>> >>> On Fri, Dec 15, 2023 at 1:01?AM Pierre Jolivet wrote: >>> >>>> >>>> On 14 Dec 2023, at 11:45?PM, Sreeram R Venkat >>>> wrote: >>>> >>>> Thanks, I will try to create a minimal reproducible example. This may >>>> take me some time though, as I need to figure out how to extract only the >>>> relevant parts (the full program this solve is used in is getting quite >>>> complex). >>>> >>>> >>>> You could just do -ksp_view_mat binary:Amat.bin -ksp_view_pmat >>>> binary:Pmat.bin -ksp_view_rhs binary:rhs.bin and send me those three files >>>> (I?m guessing your are using double-precision scalars with 32-bit PetscInt). >>>> >>>> I'll also try out some of the BoomerAMG options to see if that helps. >>>> >>>> >>>> These should work (this is where all ?PCMatApply()-ready? PC are being >>>> tested): >>>> https://petsc.org/release/src/ksp/ksp/tutorials/ex79.c.html#line215 >>>> You can see it?s also testing PCHYPRE + KSPHPDDM on device (but not >>>> with HIP). >>>> I?m aware the performance should not be optimal (see your comment about >>>> host/device copies), I?ve money to hire someone to work on this but: a) I >>>> need to find the correct engineer/post-doc, b) I currently don?t have good >>>> use cases (of course, I could generate a synthetic benchmark, for science). >>>> So even if you send me the three Mat, a MWE would be appreciated if the >>>> KSPMatSolve() is performance-critical for you (see point b) from above). >>>> >>>> Thanks, >>>> Pierre >>>> >>>> Thanks, >>>> Sreeram >>>> >>>> On Thu, Dec 14, 2023, 1:12?PM Pierre Jolivet wrote: >>>> >>>>> >>>>> >>>>> On 14 Dec 2023, at 8:02?PM, Sreeram R Venkat >>>>> wrote: >>>>> >>>>> Hello Pierre, >>>>> >>>>> Thank you for your reply. I tried out the HPDDM CG as you said, and it >>>>> seems to be doing the batched solves, but the KSP is not converging due to >>>>> a NaN or Inf being generated. 
I also noticed there are a lot of >>>>> host-to-device and device-to-host copies of the matrices (the non-batched >>>>> KSP solve did not have any memcopies). I have attached dump.0 again. Could >>>>> you please take a look? >>>>> >>>>> >>>>> Yes, but you?d need to send me something I can run with your set of >>>>> options (if you are more confident doing this in private, you can remove >>>>> the list from c/c). >>>>> Not all BoomerAMG smoothers handle blocks of right-hand sides, and >>>>> there is not much error checking, so instead of erroring out, this may be >>>>> the reason why you are getting garbage. >>>>> >>>>> Thanks, >>>>> Pierre >>>>> >>>>> Thanks, >>>>> Sreeram >>>>> >>>>> On Thu, Dec 14, 2023 at 12:42?AM Pierre Jolivet >>>>> wrote: >>>>> >>>>>> Hello Sreeram, >>>>>> KSPCG (PETSc implementation of CG) does not handle solves with >>>>>> multiple columns at once. >>>>>> There is only a single native PETSc KSP implementation which handles >>>>>> solves with multiple columns at once: KSPPREONLY. >>>>>> If you use --download-hpddm, you can use a CG (or GMRES, or more >>>>>> advanced methods) implementation which handles solves with multiple columns >>>>>> at once (via -ksp_type hpddm -ksp_hpddm_type cg or KSPSetType(ksp, >>>>>> KSPHPDDM); KSPHPDDMSetType(ksp, KSP_HPDDM_TYPE_CG);). >>>>>> I?m the main author of HPDDM, there is preliminary support for device >>>>>> matrices, but if it?s not working as intended/not faster than column by >>>>>> column, I?d be happy to have a deeper look (maybe in private), because most >>>>>> (if not all) of my users interested in (pseudo-)block Krylov solvers (i.e., >>>>>> solvers that treat right-hand sides in a single go) are using plain host >>>>>> matrices. >>>>>> >>>>>> Thanks, >>>>>> Pierre >>>>>> >>>>>> PS: you could have a look at >>>>>> https://www.sciencedirect.com/science/article/abs/pii/S0898122121000055 to >>>>>> understand the philosophy behind block iterative methods in PETSc (and in >>>>>> HPDDM), src/mat/tests/ex237.c, the benchmark I mentioned earlier, was >>>>>> developed in the context of this paper to produce Figures 2-3. Note that >>>>>> this paper is now slightly outdated, since then, PCHYPRE and PCMG (among >>>>>> others) have been made ?PCMatApply()-ready?. >>>>>> >>>>>> On 13 Dec 2023, at 11:05?PM, Sreeram R Venkat >>>>>> wrote: >>>>>> >>>>>> Hello Pierre, >>>>>> >>>>>> I am trying out the KSPMatSolve with the BoomerAMG preconditioner. >>>>>> However, I am noticing that it is still solving column by column (this is >>>>>> stated explicitly in the info dump attached). I looked at the code for >>>>>> KSPMatSolve_Private() and saw that as long as ksp->ops->matsolve is >>>>>> true, it should do the batched solve, though I'm not sure where that gets >>>>>> set. >>>>>> >>>>>> I am using the options -pc_type hypre -pc_hypre_type boomeramg when >>>>>> running the code. >>>>>> >>>>>> Can you please help me with this? >>>>>> >>>>>> Thanks, >>>>>> Sreeram >>>>>> >>>>>> >>>>>> On Thu, Dec 7, 2023 at 4:04?PM Mark Adams wrote: >>>>>> >>>>>>> N.B., AMGX interface is a bit experimental. >>>>>>> Mark >>>>>>> >>>>>>> On Thu, Dec 7, 2023 at 4:11?PM Sreeram R Venkat >>>>>>> wrote: >>>>>>> >>>>>>>> Oh, in that case I will try out BoomerAMG. Getting AMGX to build >>>>>>>> correctly was also tricky so hopefully the HYPRE build will be easier. 
>>>>>>>> >>>>>>>> Thanks, >>>>>>>> Sreeram >>>>>>>> >>>>>>>> On Thu, Dec 7, 2023, 3:03 PM Pierre Jolivet >>>>>>>> wrote: >>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On 7 Dec 2023, at 9:37?PM, Sreeram R Venkat >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>> Thank you Barry and Pierre; I will proceed with the first option. >>>>>>>>> >>>>>>>>> I want to use the AMGX preconditioner for the KSP. I will try it >>>>>>>>> out and see how it performs. >>>>>>>>> >>>>>>>>> >>>>>>>>> Just FYI, AMGX does not handle systems with multiple RHS, and thus >>>>>>>>> has no PCMatApply() implementation. >>>>>>>>> BoomerAMG does, and there is a PCMatApply_HYPRE_BoomerAMG() >>>>>>>>> implementation. >>>>>>>>> But let us know if you need assistance figuring things out. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Pierre >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Sreeram >>>>>>>>> >>>>>>>>> On Thu, Dec 7, 2023 at 2:02?PM Pierre Jolivet >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> To expand on Barry?s answer, we have observed repeatedly that >>>>>>>>>> MatMatMult with MatAIJ performs better than MatMult with MatMAIJ, you can >>>>>>>>>> reproduce this on your own with >>>>>>>>>> https://petsc.org/release/src/mat/tests/ex237.c.html. >>>>>>>>>> Also, I?m guessing you are using some sort of preconditioner >>>>>>>>>> within your KSP. >>>>>>>>>> Not all are ?KSPMatSolve-ready?, i.e., they may treat blocks of >>>>>>>>>> right-hand sides column by column, which is very inefficient. >>>>>>>>>> You could run your code with -info dump and send us dump.0 to see >>>>>>>>>> what needs to be done on our end to make things more efficient, should you >>>>>>>>>> not be satisfied with the current performance of the code. >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Pierre >>>>>>>>>> >>>>>>>>>> On 7 Dec 2023, at 8:34?PM, Barry Smith wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Dec 7, 2023, at 1:17?PM, Sreeram R Venkat >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> I have 2 sequential matrices M and R (both MATSEQAIJCUSPARSE of >>>>>>>>>> size n x n) and a vector v of size n*m. v = [v_1 , v_2 ,... , v_m] where >>>>>>>>>> v_i has size n. The data for v can be stored either in column-major or >>>>>>>>>> row-major order. Now, I want to do 2 types of operations: >>>>>>>>>> >>>>>>>>>> 1. Matvecs of the form M*v_i = w_i, for i = 1..m. >>>>>>>>>> 2. KSPSolves of the form R*x_i = v_i, for i = 1..m. >>>>>>>>>> >>>>>>>>>> From what I have read on the documentation, I can think of 2 >>>>>>>>>> approaches. >>>>>>>>>> >>>>>>>>>> 1. Get the pointer to the data in v (column-major) and use it to >>>>>>>>>> create a dense matrix V. Then do a MatMatMult with M*V = W, and take the >>>>>>>>>> data pointer of W to create the vector w. For KSPSolves, use KSPMatSolve >>>>>>>>>> with R and V. >>>>>>>>>> >>>>>>>>>> 2. Create a MATMAIJ using M/R and use that for matvecs directly >>>>>>>>>> with the vector v. I don't know if KSPSolve with the MATMAIJ will know that >>>>>>>>>> it is a multiple RHS system and act accordingly. >>>>>>>>>> >>>>>>>>>> Which would be the more efficient option? >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Use 1. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> As a side-note, I am also wondering if there is a way to use >>>>>>>>>> row-major storage of the vector v. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> No >>>>>>>>>> >>>>>>>>>> The reason is that this could allow for more coalesced memory >>>>>>>>>> access when doing matvecs. 
>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> PETSc matrix-vector products use BLAS GMEV matrix-vector >>>>>>>>>> products for the computation so in theory they should already be >>>>>>>>>> well-optimized >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Sreeram >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>> >>>>>> >>>>>> >>>>> >>>>> >>>>> >>>> >>> >>> >>> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre at joliv.et Thu Dec 21 14:25:45 2023 From: pierre at joliv.et (Pierre Jolivet) Date: Thu, 21 Dec 2023 21:25:45 +0100 Subject: [petsc-users] Matvecs and KSPSolves with multiple vectors In-Reply-To: References: <855B7267-E3F9-43CC-9EBE-96276E9A8E42@petsc.dev> <9F227CDE-815C-4446-A424-0BBDA38B1E79@joliv.et> <186F2922-CD1B-4778-BDB3-8DA6EAB8D976@joliv.et> <7921468C-F4B9-4BC0-ADD1-2F84619E8A7E@joliv.et> Message-ID: > On 21 Dec 2023, at 7:38?PM, Junchao Zhang wrote: > > > > > On Thu, Dec 21, 2023 at 5:54?AM Matthew Knepley > wrote: >> On Thu, Dec 21, 2023 at 6:46?AM Sreeram R Venkat > wrote: >>> Ok, I think the error I'm getting has something to do with how the multiple solves are being done in succession. I'll try to see if there's anything I'm doing wrong there. >>> >>> One question about the -pc_type lu -ksp_type preonly method: do you know which parts of the solve (factorization/triangular solves) are done on host and which are done on device? >> >> For SEQDENSE, I believe both the factorization and solve is on device. It is hard to see, but I believe the dispatch code is here: > Yes, it is correct. But Sreeram matrix is sparse, so this does not really matter. Sreeram, I don?t enough about the internals of CHOLMOD (and its interface in PETSc) to know which part is done on host and which part is done on device. By the way, you mentioned a very high number of right-hand sides (> 1E4) for a moderately-sized problem (~ 1E6). Is there a particular reason why you need so many of them? Have you considered doing some sort of deflation to reduce the number of solves? Thanks, Pierre >> >> https://gitlab.com/petsc/petsc/-/blob/main/src/mat/impls/dense/seq/cupm/matseqdensecupm.hpp?ref_type=heads#L368 >> >> Thanks, >> >> Matt >> >>> Thanks, >>> Sreeram >>> >>> On Sat, Dec 16, 2023 at 10:56?PM Pierre Jolivet > wrote: >>>> Unfortunately, I am not able to reproduce such a failure with your input matrix. >>>> I?ve used ex79 that I linked previously and the system is properly solved. >>>> $ ./ex79 -pc_type hypre -ksp_type hpddm -ksp_hpddm_type cg -ksp_converged_reason -ksp_view_mat ascii::ascii_info -ksp_view_rhs ascii::ascii_info >>>> Linear solve converged due to CONVERGED_RTOL iterations 6 >>>> Mat Object: 1 MPI process >>>> type: seqaijcusparse >>>> rows=289, cols=289 >>>> total: nonzeros=2401, allocated nonzeros=2401 >>>> total number of mallocs used during MatSetValues calls=0 >>>> not using I-node routines >>>> Mat Object: 1 MPI process >>>> type: seqdensecuda >>>> rows=289, cols=10 >>>> total: nonzeros=2890, allocated nonzeros=2890 >>>> total number of mallocs used during MatSetValues calls=0 >>>> >>>> You mentioned in a subsequent email that you are interested in systems with at most 1E6 unknowns, and up to 1E4 right-hand sides. >>>> I?m not sure you can expect significant gains from using GPU for such systems. 
>>>> Probably, the fastest approach would indeed be -pc_type lu -ksp_type preonly -ksp_matsolve_batch_size 100 or something, depending on the memory available on your host. >>>> >>>> Thanks, >>>> Pierre >>>> >>>>> On 15 Dec 2023, at 9:52?PM, Sreeram R Venkat > wrote: >>>>> >>>>> Here are the ksp_view files. I set the options -ksp_error_if_not_converged to try to get the vectors that caused the error. I noticed that some of the KSPMatSolves converge while others don't. In the code, the solves are called as: >>>>> >>>>> input vector v --> insert data of v into a dense mat --> KSPMatSolve() --> MatMatMult() --> KSPMatSolve() --> insert data of dense mat into output vector w -- output w >>>>> >>>>> The operator used in the KSP is a Laplacian-like operator, and the MatMatMult is with a Mass Matrix. The whole thing is supposed to be a solve with a biharmonic-like operator. I can also run it with only the first KSPMatSolve (i.e. just a Laplacian-like operator). In that case, the KSP reportedly converges after 0 iterations (see the next line), but this causes problems in other parts of the code later on. >>>>> >>>>> I saw that sometimes the first KSPMatSolve "converges" after 0 iterations due to CONVERGED_RTOL. Then, the second KSPMatSolve produces a NaN/Inf. I tried setting ksp_min_it, but that didn't seem to do anything. >>>>> >>>>> I'll keep trying different options and also try to get the MWE made (this KSPMatSolve is pretty performance critical for us). >>>>> >>>>> Thanks for all your help, >>>>> Sreeram >>>>> >>>>> On Fri, Dec 15, 2023 at 1:01?AM Pierre Jolivet > wrote: >>>>>> >>>>>>> On 14 Dec 2023, at 11:45?PM, Sreeram R Venkat > wrote: >>>>>>> >>>>>>> Thanks, I will try to create a minimal reproducible example. This may take me some time though, as I need to figure out how to extract only the relevant parts (the full program this solve is used in is getting quite complex). >>>>>> >>>>>> You could just do -ksp_view_mat binary:Amat.bin -ksp_view_pmat binary:Pmat.bin -ksp_view_rhs binary:rhs.bin and send me those three files (I?m guessing your are using double-precision scalars with 32-bit PetscInt). >>>>>> >>>>>>> I'll also try out some of the BoomerAMG options to see if that helps. >>>>>> >>>>>> These should work (this is where all ?PCMatApply()-ready? PC are being tested): https://petsc.org/release/src/ksp/ksp/tutorials/ex79.c.html#line215 >>>>>> You can see it?s also testing PCHYPRE + KSPHPDDM on device (but not with HIP). >>>>>> I?m aware the performance should not be optimal (see your comment about host/device copies), I?ve money to hire someone to work on this but: a) I need to find the correct engineer/post-doc, b) I currently don?t have good use cases (of course, I could generate a synthetic benchmark, for science). >>>>>> So even if you send me the three Mat, a MWE would be appreciated if the KSPMatSolve() is performance-critical for you (see point b) from above). >>>>>> >>>>>> Thanks, >>>>>> Pierre >>>>>> >>>>>>> Thanks, >>>>>>> Sreeram >>>>>>> >>>>>>> On Thu, Dec 14, 2023, 1:12?PM Pierre Jolivet > wrote: >>>>>>>> >>>>>>>> >>>>>>>>> On 14 Dec 2023, at 8:02?PM, Sreeram R Venkat > wrote: >>>>>>>>> >>>>>>>>> Hello Pierre, >>>>>>>>> >>>>>>>>> Thank you for your reply. I tried out the HPDDM CG as you said, and it seems to be doing the batched solves, but the KSP is not converging due to a NaN or Inf being generated. I also noticed there are a lot of host-to-device and device-to-host copies of the matrices (the non-batched KSP solve did not have any memcopies). 
I have attached dump.0 again. Could you please take a look? >>>>>>>> >>>>>>>> Yes, but you?d need to send me something I can run with your set of options (if you are more confident doing this in private, you can remove the list from c/c). >>>>>>>> Not all BoomerAMG smoothers handle blocks of right-hand sides, and there is not much error checking, so instead of erroring out, this may be the reason why you are getting garbage. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Pierre >>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Sreeram >>>>>>>>> >>>>>>>>> On Thu, Dec 14, 2023 at 12:42?AM Pierre Jolivet > wrote: >>>>>>>>>> Hello Sreeram, >>>>>>>>>> KSPCG (PETSc implementation of CG) does not handle solves with multiple columns at once. >>>>>>>>>> There is only a single native PETSc KSP implementation which handles solves with multiple columns at once: KSPPREONLY. >>>>>>>>>> If you use --download-hpddm, you can use a CG (or GMRES, or more advanced methods) implementation which handles solves with multiple columns at once (via -ksp_type hpddm -ksp_hpddm_type cg or KSPSetType(ksp, KSPHPDDM); KSPHPDDMSetType(ksp, KSP_HPDDM_TYPE_CG);). >>>>>>>>>> I?m the main author of HPDDM, there is preliminary support for device matrices, but if it?s not working as intended/not faster than column by column, I?d be happy to have a deeper look (maybe in private), because most (if not all) of my users interested in (pseudo-)block Krylov solvers (i.e., solvers that treat right-hand sides in a single go) are using plain host matrices. >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Pierre >>>>>>>>>> >>>>>>>>>> PS: you could have a look at https://www.sciencedirect.com/science/article/abs/pii/S0898122121000055 to understand the philosophy behind block iterative methods in PETSc (and in HPDDM), src/mat/tests/ex237.c, the benchmark I mentioned earlier, was developed in the context of this paper to produce Figures 2-3. Note that this paper is now slightly outdated, since then, PCHYPRE and PCMG (among others) have been made ?PCMatApply()-ready?. >>>>>>>>>> >>>>>>>>>>> On 13 Dec 2023, at 11:05?PM, Sreeram R Venkat > wrote: >>>>>>>>>>> >>>>>>>>>>> Hello Pierre, >>>>>>>>>>> >>>>>>>>>>> I am trying out the KSPMatSolve with the BoomerAMG preconditioner. However, I am noticing that it is still solving column by column (this is stated explicitly in the info dump attached). I looked at the code for KSPMatSolve_Private() and saw that as long as ksp->ops->matsolve is true, it should do the batched solve, though I'm not sure where that gets set. >>>>>>>>>>> >>>>>>>>>>> I am using the options -pc_type hypre -pc_hypre_type boomeramg when running the code. >>>>>>>>>>> >>>>>>>>>>> Can you please help me with this? >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Sreeram >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Thu, Dec 7, 2023 at 4:04?PM Mark Adams > wrote: >>>>>>>>>>>> N.B., AMGX interface is a bit experimental. >>>>>>>>>>>> Mark >>>>>>>>>>>> >>>>>>>>>>>> On Thu, Dec 7, 2023 at 4:11?PM Sreeram R Venkat > wrote: >>>>>>>>>>>>> Oh, in that case I will try out BoomerAMG. Getting AMGX to build correctly was also tricky so hopefully the HYPRE build will be easier. >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> Sreeram >>>>>>>>>>>>> >>>>>>>>>>>>> On Thu, Dec 7, 2023, 3:03 PM Pierre Jolivet > wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 7 Dec 2023, at 9:37?PM, Sreeram R Venkat > wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thank you Barry and Pierre; I will proceed with the first option. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I want to use the AMGX preconditioner for the KSP. 
I will try it out and see how it performs. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Just FYI, AMGX does not handle systems with multiple RHS, and thus has no PCMatApply() implementation. >>>>>>>>>>>>>> BoomerAMG does, and there is a PCMatApply_HYPRE_BoomerAMG() implementation. >>>>>>>>>>>>>> But let us know if you need assistance figuring things out. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> Pierre >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> Sreeram >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Thu, Dec 7, 2023 at 2:02?PM Pierre Jolivet > wrote: >>>>>>>>>>>>>>>> To expand on Barry?s answer, we have observed repeatedly that MatMatMult with MatAIJ performs better than MatMult with MatMAIJ, you can reproduce this on your own with https://petsc.org/release/src/mat/tests/ex237.c.html. >>>>>>>>>>>>>>>> Also, I?m guessing you are using some sort of preconditioner within your KSP. >>>>>>>>>>>>>>>> Not all are ?KSPMatSolve-ready?, i.e., they may treat blocks of right-hand sides column by column, which is very inefficient. >>>>>>>>>>>>>>>> You could run your code with -info dump and send us dump.0 to see what needs to be done on our end to make things more efficient, should you not be satisfied with the current performance of the code. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> Pierre >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On 7 Dec 2023, at 8:34?PM, Barry Smith > wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On Dec 7, 2023, at 1:17?PM, Sreeram R Venkat > wrote: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I have 2 sequential matrices M and R (both MATSEQAIJCUSPARSE of size n x n) and a vector v of size n*m. v = [v_1 , v_2 ,... , v_m] where v_i has size n. The data for v can be stored either in column-major or row-major order. Now, I want to do 2 types of operations: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> 1. Matvecs of the form M*v_i = w_i, for i = 1..m. >>>>>>>>>>>>>>>>>> 2. KSPSolves of the form R*x_i = v_i, for i = 1..m. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> From what I have read on the documentation, I can think of 2 approaches. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> 1. Get the pointer to the data in v (column-major) and use it to create a dense matrix V. Then do a MatMatMult with M*V = W, and take the data pointer of W to create the vector w. For KSPSolves, use KSPMatSolve with R and V. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> 2. Create a MATMAIJ using M/R and use that for matvecs directly with the vector v. I don't know if KSPSolve with the MATMAIJ will know that it is a multiple RHS system and act accordingly. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Which would be the more efficient option? >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Use 1. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> As a side-note, I am also wondering if there is a way to use row-major storage of the vector v. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> No >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> The reason is that this could allow for more coalesced memory access when doing matvecs. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> PETSc matrix-vector products use BLAS GMEV matrix-vector products for the computation so in theory they should already be well-optimized >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>> Sreeram >>>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>> >>>>> >>>> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
>> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Thu Dec 21 16:32:43 2023 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 21 Dec 2023 17:32:43 -0500 Subject: [petsc-users] TS docs wrong URLs in Examples In-Reply-To: References: Message-ID: <2AC654F7-1119-4696-BC92-D0BDF271661E@petsc.dev> https://gitlab.com/petsc/petsc/-/merge_requests/7135 Regex processing is not ideal for this task; I've modified the code to remove most false positive finds. Thanks for reporting the problem, Barry > On Dec 21, 2023, at 8:35?AM, Niclas G?tting wrote: > > Hi all, > > I noticed that all links to the examples under https://petsc.org/release/manualpages/TS/TS/ point to the wrong URL. Instead of src/ts/**/*, they point to src/sys/**/*, which does not seem to be right. This definitely is a minor issue, but I couldn't see an obvious fix via "Edit this page", so here is the e-mail. > > Best regards > Niclas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From srvenkat at utexas.edu Thu Dec 21 18:44:24 2023 From: srvenkat at utexas.edu (Sreeram R Venkat) Date: Fri, 22 Dec 2023 06:14:24 +0530 Subject: [petsc-users] Matvecs and KSPSolves with multiple vectors In-Reply-To: References: <855B7267-E3F9-43CC-9EBE-96276E9A8E42@petsc.dev> <9F227CDE-815C-4446-A424-0BBDA38B1E79@joliv.et> <186F2922-CD1B-4778-BDB3-8DA6EAB8D976@joliv.et> <7921468C-F4B9-4BC0-ADD1-2F84619E8A7E@joliv.et> Message-ID: The reason for the large number of RHS's is that the problem originally comes from having to do one solve with 1e10 size matrix. If we make some extra assumptions on our problem, that 1e10 matrix becomes block diagonal with blocks of size 1e6 and all the blocks are the same. The 1e6 is a spatial discretion dimension and the 1e4 is the number of time steps. So that's why we wanted to do the batched solve like this. Thanks, Sreeram On Fri, Dec 22, 2023, 1:56?AM Pierre Jolivet wrote: > > > On 21 Dec 2023, at 7:38?PM, Junchao Zhang wrote: > > > > > On Thu, Dec 21, 2023 at 5:54?AM Matthew Knepley wrote: > >> On Thu, Dec 21, 2023 at 6:46?AM Sreeram R Venkat >> wrote: >> >>> Ok, I think the error I'm getting has something to do with how the >>> multiple solves are being done in succession. I'll try to see if there's >>> anything I'm doing wrong there. >>> >>> One question about the -pc_type lu -ksp_type preonly method: do you know >>> which parts of the solve (factorization/triangular solves) are done on host >>> and which are done on device? >>> >> >> For SEQDENSE, I believe both the factorization and solve is on device. It >> is hard to see, but I believe the dispatch code is here: >> > Yes, it is correct. > > > But Sreeram matrix is sparse, so this does not really matter. > Sreeram, I don?t enough about the internals of CHOLMOD (and its interface > in PETSc) to know which part is done on host and which part is done on > device. > By the way, you mentioned a very high number of right-hand sides (> 1E4) > for a moderately-sized problem (~ 1E6). > Is there a particular reason why you need so many of them? > Have you considered doing some sort of deflation to reduce the number of > solves? 
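For completeness, the direct-solver route suggested earlier in the thread (-pc_type lu with -ksp_type preonly and -ksp_matsolve_batch_size) can be sketched in C as below; R stands for the repeated 1e6-size block and B/X for dense blocks of right-hand sides and solutions, all of which are assumptions for illustration rather than code from the original application:

#include <petscksp.h>

/* Sketch of the batched direct solve: factor R once, then reuse the factorization
   for every batch of columns passed to KSPMatSolve(). */
static PetscErrorCode BatchedDirectSolve(Mat R, Mat B, Mat X)
{
  KSP ksp;
  PC  pc;

  PetscFunctionBeginUser;
  PetscCall(KSPCreate(PetscObjectComm((PetscObject)R), &ksp));
  PetscCall(KSPSetOperators(ksp, R, R));
  PetscCall(KSPSetType(ksp, KSPPREONLY)); /* apply the preconditioner exactly once */
  PetscCall(KSPGetPC(ksp, &pc));
  PetscCall(PCSetType(pc, PCLU));         /* exact factorization, computed a single time */
  PetscCall(KSPSetFromOptions(ksp));      /* e.g. -ksp_matsolve_batch_size 100 */
  PetscCall(KSPMatSolve(ksp, B, X));      /* columns of X solve R x_i = b_i */
  PetscCall(KSPDestroy(&ksp));
  PetscFunctionReturn(PETSC_SUCCESS);
}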
> > Thanks, > Pierre > > >> >> https://gitlab.com/petsc/petsc/-/blob/main/src/mat/impls/dense/seq/cupm/matseqdensecupm.hpp?ref_type=heads#L368 >> >> Thanks, >> >> Matt >> >> >>> Thanks, >>> Sreeram >>> >>> On Sat, Dec 16, 2023 at 10:56?PM Pierre Jolivet wrote: >>> >>>> Unfortunately, I am not able to reproduce such a failure with your >>>> input matrix. >>>> I?ve used ex79 that I linked previously and the system is properly >>>> solved. >>>> $ ./ex79 -pc_type hypre -ksp_type hpddm -ksp_hpddm_type cg >>>> -ksp_converged_reason -ksp_view_mat ascii::ascii_info -ksp_view_rhs >>>> ascii::ascii_info >>>> Linear solve converged due to CONVERGED_RTOL iterations 6 >>>> Mat Object: 1 MPI process >>>> type: seqaijcusparse >>>> rows=289, cols=289 >>>> total: nonzeros=2401, allocated nonzeros=2401 >>>> total number of mallocs used during MatSetValues calls=0 >>>> not using I-node routines >>>> Mat Object: 1 MPI process >>>> type: seqdensecuda >>>> rows=289, cols=10 >>>> total: nonzeros=2890, allocated nonzeros=2890 >>>> total number of mallocs used during MatSetValues calls=0 >>>> >>>> You mentioned in a subsequent email that you are interested in systems >>>> with at most 1E6 unknowns, and up to 1E4 right-hand sides. >>>> I?m not sure you can expect significant gains from using GPU for such >>>> systems. >>>> Probably, the fastest approach would indeed be -pc_type lu -ksp_type >>>> preonly -ksp_matsolve_batch_size 100 or something, depending on the memory >>>> available on your host. >>>> >>>> Thanks, >>>> Pierre >>>> >>>> On 15 Dec 2023, at 9:52?PM, Sreeram R Venkat >>>> wrote: >>>> >>>> Here are the ksp_view files. I set the options >>>> -ksp_error_if_not_converged to try to get the vectors that caused the >>>> error. I noticed that some of the KSPMatSolves converge while others don't. >>>> In the code, the solves are called as: >>>> >>>> input vector v --> insert data of v into a dense mat --> KSPMatSolve() >>>> --> MatMatMult() --> KSPMatSolve() --> insert data of dense mat into output >>>> vector w -- output w >>>> >>>> The operator used in the KSP is a Laplacian-like operator, and the >>>> MatMatMult is with a Mass Matrix. The whole thing is supposed to be a solve >>>> with a biharmonic-like operator. I can also run it with only the first >>>> KSPMatSolve (i.e. just a Laplacian-like operator). In that case, the KSP >>>> reportedly converges after 0 iterations (see the next line), but this >>>> causes problems in other parts of the code later on. >>>> >>>> I saw that sometimes the first KSPMatSolve "converges" after 0 >>>> iterations due to CONVERGED_RTOL. Then, the second KSPMatSolve produces a >>>> NaN/Inf. I tried setting ksp_min_it, but that didn't seem to do anything. >>>> >>>> I'll keep trying different options and also try to get the MWE made >>>> (this KSPMatSolve is pretty performance critical for us). >>>> >>>> Thanks for all your help, >>>> Sreeram >>>> >>>> On Fri, Dec 15, 2023 at 1:01?AM Pierre Jolivet wrote: >>>> >>>>> >>>>> On 14 Dec 2023, at 11:45?PM, Sreeram R Venkat >>>>> wrote: >>>>> >>>>> Thanks, I will try to create a minimal reproducible example. This may >>>>> take me some time though, as I need to figure out how to extract only the >>>>> relevant parts (the full program this solve is used in is getting quite >>>>> complex). >>>>> >>>>> >>>>> You could just do -ksp_view_mat binary:Amat.bin -ksp_view_pmat >>>>> binary:Pmat.bin -ksp_view_rhs binary:rhs.bin and send me those three files >>>>> (I?m guessing your are using double-precision scalars with 32-bit PetscInt). 
>>>>> I'll also try out some of the BoomerAMG options to see if that helps.
>>>>>
>>>>> These should work (this is where all "PCMatApply()-ready" PCs are being tested):
>>>>> https://petsc.org/release/src/ksp/ksp/tutorials/ex79.c.html#line215
>>>>> You can see it's also testing PCHYPRE + KSPHPDDM on device (but not with HIP).
>>>>> I'm aware the performance should not be optimal (see your comment about host/device copies). I have money to hire someone to work on this, but: a) I need to find the right engineer/post-doc, and b) I currently don't have good use cases (of course, I could generate a synthetic benchmark, for science).
>>>>> So even if you send me the three Mat, an MWE would be appreciated if the KSPMatSolve() is performance-critical for you (see point b above).
>>>>>
>>>>> Thanks,
>>>>> Pierre
>>>>>
>>>>> Thanks,
>>>>> Sreeram
>>>>>
>>>>> On Thu, Dec 14, 2023, 1:12 PM Pierre Jolivet wrote:
>>>>>
>>>>>> On 14 Dec 2023, at 8:02 PM, Sreeram R Venkat wrote:
>>>>>>
>>>>>> Hello Pierre,
>>>>>>
>>>>>> Thank you for your reply. I tried out the HPDDM CG as you said, and it seems to be doing the batched solves, but the KSP is not converging due to a NaN or Inf being generated. I also noticed there are a lot of host-to-device and device-to-host copies of the matrices (the non-batched KSP solve did not have any memcopies). I have attached dump.0 again. Could you please take a look?
>>>>>>
>>>>>> Yes, but you'd need to send me something I can run with your set of options (if you are more comfortable doing this in private, you can remove the list from Cc).
>>>>>> Not all BoomerAMG smoothers handle blocks of right-hand sides, and there is not much error checking, so instead of erroring out, this may be the reason why you are getting garbage.
>>>>>>
>>>>>> Thanks,
>>>>>> Pierre
>>>>>>
>>>>>> Thanks,
>>>>>> Sreeram
>>>>>>
>>>>>> On Thu, Dec 14, 2023 at 12:42 AM Pierre Jolivet wrote:
>>>>>>>
>>>>>>> Hello Sreeram,
>>>>>>> KSPCG (the PETSc implementation of CG) does not handle solves with multiple columns at once.
>>>>>>> There is only a single native PETSc KSP implementation which handles solves with multiple columns at once: KSPPREONLY.
>>>>>>> If you use --download-hpddm, you can use a CG (or GMRES, or more advanced methods) implementation which handles solves with multiple columns at once (via -ksp_type hpddm -ksp_hpddm_type cg or KSPSetType(ksp, KSPHPDDM); KSPHPDDMSetType(ksp, KSP_HPDDM_TYPE_CG);).
>>>>>>> I'm the main author of HPDDM. There is preliminary support for device matrices, but if it's not working as intended/not faster than column by column, I'd be happy to take a deeper look (maybe in private), because most (if not all) of my users interested in (pseudo-)block Krylov solvers (i.e., solvers that treat right-hand sides in a single go) are using plain host matrices.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Pierre
>>>>>>>
>>>>>>> PS: you could have a look at https://www.sciencedirect.com/science/article/abs/pii/S0898122121000055 to understand the philosophy behind block iterative methods in PETSc (and in HPDDM); src/mat/tests/ex237.c, the benchmark I mentioned earlier, was developed in the context of this paper to produce Figures 2-3.
>>>>>>> Note that this paper is now slightly outdated; since then, PCHYPRE and PCMG (among others) have been made "PCMatApply()-ready".
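As a small illustration of the options Pierre mentions, a hedged sketch of setting this up in code (rather than on the command line) could look like the following. It assumes a PETSc build configured with --download-hpddm and --download-hypre; the function name SetUpPseudoBlockCG is ours.

#include <petscksp.h>

static PetscErrorCode SetUpPseudoBlockCG(MPI_Comm comm, Mat A, KSP *ksp)
{
  PC pc;

  PetscFunctionBeginUser;
  PetscCall(KSPCreate(comm, ksp));
  PetscCall(KSPSetOperators(*ksp, A, A));
  PetscCall(KSPSetType(*ksp, KSPHPDDM));               /* Krylov solvers that can treat all RHS in one go */
  PetscCall(KSPHPDDMSetType(*ksp, KSP_HPDDM_TYPE_CG)); /* pseudo-block CG */
  PetscCall(KSPGetPC(*ksp, &pc));
  PetscCall(PCSetType(pc, PCHYPRE));
  PetscCall(PCHYPRESetType(pc, "boomeramg"));          /* BoomerAMG has a PCMatApply() implementation */
  PetscCall(KSPSetFromOptions(*ksp));                  /* keep command-line overrides possible */
  PetscFunctionReturn(PETSC_SUCCESS);
}

The command-line equivalent is the option set quoted in the thread: -ksp_type hpddm -ksp_hpddm_type cg -pc_type hypre -pc_hypre_type boomeramg.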
>>>>>>> On 13 Dec 2023, at 11:05 PM, Sreeram R Venkat wrote:
>>>>>>>
>>>>>>> Hello Pierre,
>>>>>>>
>>>>>>> I am trying out the KSPMatSolve with the BoomerAMG preconditioner. However, I am noticing that it is still solving column by column (this is stated explicitly in the info dump attached). I looked at the code for KSPMatSolve_Private() and saw that as long as ksp->ops->matsolve is set, it should do the batched solve, though I'm not sure where that gets set.
>>>>>>>
>>>>>>> I am using the options -pc_type hypre -pc_hypre_type boomeramg when running the code.
>>>>>>>
>>>>>>> Can you please help me with this?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Sreeram
>>>>>>>
>>>>>>> On Thu, Dec 7, 2023 at 4:04 PM Mark Adams wrote:
>>>>>>>>
>>>>>>>> N.B., the AMGX interface is a bit experimental.
>>>>>>>> Mark
>>>>>>>>
>>>>>>>> On Thu, Dec 7, 2023 at 4:11 PM Sreeram R Venkat <srvenkat at utexas.edu> wrote:
>>>>>>>>>
>>>>>>>>> Oh, in that case I will try out BoomerAMG. Getting AMGX to build correctly was also tricky, so hopefully the HYPRE build will be easier.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Sreeram
>>>>>>>>>
>>>>>>>>> On Thu, Dec 7, 2023, 3:03 PM Pierre Jolivet wrote:
>>>>>>>>>>
>>>>>>>>>> On 7 Dec 2023, at 9:37 PM, Sreeram R Venkat wrote:
>>>>>>>>>>
>>>>>>>>>> Thank you, Barry and Pierre; I will proceed with the first option.
>>>>>>>>>>
>>>>>>>>>> I want to use the AMGX preconditioner for the KSP. I will try it out and see how it performs.
>>>>>>>>>>
>>>>>>>>>> Just FYI, AMGX does not handle systems with multiple RHS, and thus has no PCMatApply() implementation.
>>>>>>>>>> BoomerAMG does, and there is a PCMatApply_HYPRE_BoomerAMG() implementation.
>>>>>>>>>> But let us know if you need assistance figuring things out.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Pierre
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Sreeram
>>>>>>>>>>
>>>>>>>>>> On Thu, Dec 7, 2023 at 2:02 PM Pierre Jolivet wrote:
>>>>>>>>>>>
>>>>>>>>>>> To expand on Barry's answer, we have observed repeatedly that MatMatMult with MatAIJ performs better than MatMult with MatMAIJ; you can reproduce this on your own with https://petsc.org/release/src/mat/tests/ex237.c.html.
>>>>>>>>>>> Also, I'm guessing you are using some sort of preconditioner within your KSP.
>>>>>>>>>>> Not all are "KSPMatSolve-ready", i.e., they may treat blocks of right-hand sides column by column, which is very inefficient.
>>>>>>>>>>> You could run your code with -info dump and send us dump.0 to see what needs to be done on our end to make things more efficient, should you not be satisfied with the current performance of the code.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Pierre
>>>>>>>>>>>
>>>>>>>>>>> On 7 Dec 2023, at 8:34 PM, Barry Smith wrote:
>>>>>>>>>>>
>>>>>>>>>>> On Dec 7, 2023, at 1:17 PM, Sreeram R Venkat <srvenkat at utexas.edu> wrote:
>>>>>>>>>>>
>>>>>>>>>>> I have 2 sequential matrices M and R (both MATSEQAIJCUSPARSE of size n x n) and a vector v of size n*m, v = [v_1, v_2, ..., v_m], where each v_i has size n. The data for v can be stored either in column-major or row-major order.
>>>>>>>>>>> Now, I want to do 2 types of operations:
>>>>>>>>>>>
>>>>>>>>>>> 1. Matvecs of the form M*v_i = w_i, for i = 1..m.
>>>>>>>>>>> 2. KSPSolves of the form R*x_i = v_i, for i = 1..m.
>>>>>>>>>>>
>>>>>>>>>>> From what I have read in the documentation, I can think of 2 approaches.
>>>>>>>>>>>
>>>>>>>>>>> 1. Get the pointer to the data in v (column-major) and use it to create a dense matrix V. Then do a MatMatMult with M*V = W, and take the data pointer of W to create the vector w. For KSPSolves, use KSPMatSolve with R and V.
>>>>>>>>>>>
>>>>>>>>>>> 2. Create a MATMAIJ using M/R and use that for matvecs directly with the vector v. I don't know if KSPSolve with the MATMAIJ will know that it is a multiple-RHS system and act accordingly.
>>>>>>>>>>>
>>>>>>>>>>> Which would be the more efficient option?
>>>>>>>>>>>
>>>>>>>>>>> Use 1.
>>>>>>>>>>>
>>>>>>>>>>> As a side note, I am also wondering if there is a way to use row-major storage of the vector v.
>>>>>>>>>>>
>>>>>>>>>>> No
>>>>>>>>>>>
>>>>>>>>>>> The reason is that this could allow for more coalesced memory access when doing matvecs.
>>>>>>>>>>>
>>>>>>>>>>> PETSc matrix-vector products use BLAS GEMV matrix-vector products for the computation, so in theory they should already be well-optimized.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Sreeram
>>
>> --
>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
>> -- Norbert Wiener
>>
>> https://www.cse.buffalo.edu/~knepley/
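To close the loop on the approach Barry recommends above (option 1), here is a compact sketch under the assumption that v stores v_1, ..., v_m contiguously (column-major), so that column i of the dense view V is exactly v_i. Host dense storage and the names ApproachOne, kspR, n, m are illustrative; kspR is assumed to be a KSP already set up with R (e.g., -ksp_type preonly -pc_type lu).

#include <petscksp.h>

static PetscErrorCode ApproachOne(Mat M, KSP kspR, PetscInt n, PetscInt m, Vec v)
{
  const PetscScalar *vdata;
  Mat                V, W, X;

  PetscFunctionBeginUser;
  PetscCall(VecGetArrayRead(v, &vdata));
  /* no copy: V is a dense view of v's column-major data and is only read below */
  PetscCall(MatCreateSeqDense(PETSC_COMM_SELF, n, m, (PetscScalar *)vdata, &V));
  /* operation 1: all m matvecs at once, W = M * V, so column i of W is w_i */
  PetscCall(MatMatMult(M, V, MAT_INITIAL_MATRIX, PETSC_DEFAULT, &W));
  /* operation 2: all m solves at once, R X = V, so column i of X is x_i */
  PetscCall(MatCreateSeqDense(PETSC_COMM_SELF, n, m, NULL, &X));
  PetscCall(KSPMatSolve(kspR, V, X));
  /* ... use W and X, e.g., via MatDenseGetArrayRead(), then clean up ... */
  PetscCall(MatDestroy(&X));
  PetscCall(MatDestroy(&W));
  PetscCall(MatDestroy(&V));
  PetscCall(VecRestoreArrayRead(v, &vdata));
  PetscFunctionReturn(PETSC_SUCCESS);
}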