[petsc-users] about repeat of expensive functions using VecScatterCreateToAll

Venugopal, Vysakh (venugovh) venugovh at mail.uc.edu
Tue Jan 17 16:27:51 CST 2023


Sure, I will try this. I will update this thread once I get it working using the suggested method. Thank you!

Vysakh

From: Blaise Bourdin <bourdin at mcmaster.ca>
Sent: Tuesday, January 17, 2023 5:13 PM
To: Venugopal, Vysakh (venugovh) <venugovh at mail.uc.edu>
Cc: Barry Smith <bsmith at petsc.dev>; petsc-users at mcs.anl.gov
Subject: Re: [petsc-users] about repeat of expensive functions using VecScatterCreateToAll


Got it. Can you partition your mesh with only one processor in the z-direction? (Trivial if using DMDA)
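For reference, a minimal sketch of what a one-process-in-z layout could look like with DMDACreate3d, assuming a 3-D grid with one degree of freedom per node; nx, ny, nz and the stencil width are illustrative placeholders:

#include <petscdmda.h>

/* Sketch: create a 3-D DMDA whose process grid uses exactly one rank in z,
   so each rank owns the full z-extent of its (x,y) patch.
   nx, ny, nz and the stencil width are illustrative placeholders. */
static PetscErrorCode CreateLayeredDMDA(MPI_Comm comm, PetscInt nx, PetscInt ny, PetscInt nz, DM *da)
{
  PetscFunctionBeginUser;
  PetscCall(DMDACreate3d(comm, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE,
                         DMDA_STENCIL_BOX,
                         nx, ny, nz,                    /* global grid size              */
                         PETSC_DECIDE, PETSC_DECIDE, 1, /* ranks in x, y; exactly 1 in z */
                         1, 1,                          /* dof per node, stencil width   */
                         NULL, NULL, NULL, da));
  PetscCall(DMSetFromOptions(*da));
  PetscCall(DMSetUp(*da));
  PetscFunctionReturn(0);
}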
Blaise



On Jan 17, 2023, at 4:49 PM, Venugopal, Vysakh (venugovh) <venugovh at mail.uc.edu> wrote:

This is a support-structure minimization filter. I need to go layer by layer from the bottommost slice of the array and update it as I move up; every slice needs the updated values of the slices below it.
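As an illustration only, a rough sketch of what such a bottom-up layer sweep could look like on a 3-D DMDA partitioned with a single rank in z (so each rank holds every layer of its own (x,y) patch). UpdateFromLayerBelow() is a hypothetical stand-in for the actual filter rule, real scalars are assumed, and any in-plane neighbor dependence would additionally need a ghost update per layer:

#include <petscdmda.h>

/* Hypothetical filter rule, for illustration only (assumes real scalars). */
static PetscScalar UpdateFromLayerBelow(PetscScalar here, PetscScalar below)
{
  return PetscMin(here, below);
}

/* Sketch: bottom-up sweep over the layers owned by this rank.
   Assumes a 3-D DMDA with dof = 1 and a single rank in the z-direction. */
static PetscErrorCode LayerSweep(DM da, Vec V)
{
  PetscScalar ***a;
  PetscInt       xs, ys, zs, xm, ym, zm, i, j, k;

  PetscFunctionBeginUser;
  PetscCall(DMDAGetCorners(da, &xs, &ys, &zs, &xm, &ym, &zm));
  PetscCall(DMDAVecGetArray(da, V, &a));
  for (k = zs + 1; k < zs + zm; k++) {      /* bottom layer k = zs is left unchanged */
    for (j = ys; j < ys + ym; j++) {
      for (i = xs; i < xs + xm; i++) a[k][j][i] = UpdateFromLayerBelow(a[k][j][i], a[k - 1][j][i]);
    }
  }
  PetscCall(DMDAVecRestoreArray(da, V, &a));
  PetscFunctionReturn(0);
}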

Vysakh

From: Blaise Bourdin <bourdin at mcmaster.ca>
Sent: Tuesday, January 17, 2023 4:47 PM
To: Venugopal, Vysakh (venugovh) <venugovh at mail.uc.edu>
Cc: Barry Smith <bsmith at petsc.dev>; petsc-users at mcs.anl.gov
Subject: Re: [petsc-users] about repeat of expensive functions using VecScatterCreateToAll


What type of filter are you implementing?
Convolution filters are expensive to parallelize since you need an overlap of the size of the support of the filter, but it may still not be worse than doing it sequentially (typically the filter size is only one or two element diameters). Or you may be able to apply the filter in Fourier space.
PDE filters are typically elliptic and can be parallelized.
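As a rough sketch of the overlap approach, assuming the DMDA was created with DMDA_STENCIL_BOX and a stencil width at least the filter radius, the halo each rank needs can be obtained with a global-to-local update before applying the convolution locally; GetGhostedCopy is just an illustrative helper name:

#include <petscdmda.h>

/* Sketch: obtain a ghosted local copy of V whose overlap covers the filter
   support, assuming the DMDA stencil width >= the filter radius. */
static PetscErrorCode GetGhostedCopy(DM da, Vec V, Vec *Vloc)
{
  PetscFunctionBeginUser;
  PetscCall(DMGetLocalVector(da, Vloc));
  PetscCall(DMGlobalToLocalBegin(da, V, INSERT_VALUES, *Vloc));
  PetscCall(DMGlobalToLocalEnd(da, V, INSERT_VALUES, *Vloc));
  /* ... the caller applies the convolution reading from *Vloc (ghosted),
     writes results into a separate global output vector, and finally
     returns *Vloc with DMRestoreLocalVector ... */
  PetscFunctionReturn(0);
}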

Blaise



On Jan 17, 2023, at 4:38 PM, Venugopal, Vysakh (venugovh) via petsc-users <petsc-users at mcs.anl.gov> wrote:

Thank you! I am doing a structural optimization filter that inherently cannot be parallelized.

Vysakh

From: Barry Smith <bsmith at petsc.dev>
Sent: Tuesday, January 17, 2023 3:28 PM
To: Venugopal, Vysakh (venugovh) <venugovh at mail.uc.edu>
Cc: petsc-users at mcs.anl.gov
Subject: Re: [petsc-users] about repeat of expensive functions using VecScatterCreateToAll


On Jan 17, 2023, at 3:12 PM, Venugopal, Vysakh (venugovh) via petsc-users <petsc-users at mcs.anl.gov> wrote:

Hi,

I am doing the following thing.

Step 1. Create a DM object and get a global vector 'V' using DMGetGlobalVector.
Step 2. Perform some parallel operations on V.
Step 3. Use VecScatterCreateToAll on V to create a sequential vector 'V_SEQ', filled via VecScatterBegin/End with SCATTER_FORWARD.
Step 4. Perform an expensive operation on V_SEQ and output the updated V_SEQ.
Step 5. Use VecScatterBegin/End with SCATTER_REVERSE (global and sequential arguments flipped) to get a V updated with the new values from V_SEQ.
Step 6. Continue using this new V in the rest of the parallelized program.
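For concreteness, a minimal sketch of how Steps 1, 3, and 5 above are typically written, assuming 'dm' is an existing DM; the work in Steps 2 and 4 is left as comments:

#include <petscdm.h>
#include <petscvec.h>

/* Sketch of Steps 1-6 above; the operations in Steps 2 and 4 are elided. */
static PetscErrorCode GatherUpdateScatter(DM dm)
{
  Vec        V, V_SEQ;
  VecScatter ctx;

  PetscFunctionBeginUser;
  PetscCall(DMGetGlobalVector(dm, &V));               /* Step 1 */
  /* Step 2: ... parallel operations on V ... */
  PetscCall(VecScatterCreateToAll(V, &ctx, &V_SEQ));  /* Step 3 */
  PetscCall(VecScatterBegin(ctx, V, V_SEQ, INSERT_VALUES, SCATTER_FORWARD));
  PetscCall(VecScatterEnd(ctx, V, V_SEQ, INSERT_VALUES, SCATTER_FORWARD));
  /* Step 4: ... expensive operation on V_SEQ, executed redundantly on every rank ... */
  /* Step 5: reverse scatter; the vector arguments are passed in flipped order */
  PetscCall(VecScatterBegin(ctx, V_SEQ, V, INSERT_VALUES, SCATTER_REVERSE));
  PetscCall(VecScatterEnd(ctx, V_SEQ, V, INSERT_VALUES, SCATTER_REVERSE));
  /* Step 6: ... continue using the updated V ... */
  PetscCall(VecScatterDestroy(&ctx));
  PetscCall(VecDestroy(&V_SEQ));
  PetscCall(DMRestoreGlobalVector(dm, &V));
  PetscFunctionReturn(0);
}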

Question: Suppose I have n MPI processes: is the expensive operation in Step 4 repeated n times? If yes, is there a workaround so that the operation in Step 4 is performed only once? I would like to keep the same structure as Steps 1 to 6, with Step 4 performed only once.

  Each MPI rank is doing the same operations on its own copy of the sequential vector. Since the ranks are running in parallel, it probably does not matter much that each one repeats the same computation. Step 5 does not require any MPI communication, since each rank already has the entire sequential vector and only needs the part that belongs in its piece of the parallel vector.

  You could use VecScatterCreateToZero() instead; step 3 would then require less communication, but step 5 would require communication to get parts of the solution from rank 0 to the other ranks. The time for step 4 would be roughly the same.
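A minimal sketch of that variant, assuming 'V' is the existing global vector; UpdateOnRankZero is just an illustrative wrapper name. The gathered vector has full length only on rank 0 (and length zero elsewhere), so the expensive step runs once and the reverse scatter redistributes the result:

#include <petscvec.h>

/* Sketch: gather V onto rank 0 only, update it there, scatter back to all ranks.
   V_ZERO has the full length on rank 0 and length 0 on every other rank. */
static PetscErrorCode UpdateOnRankZero(Vec V)
{
  Vec        V_ZERO;
  VecScatter ctx0;

  PetscFunctionBeginUser;
  PetscCall(VecScatterCreateToZero(V, &ctx0, &V_ZERO));
  PetscCall(VecScatterBegin(ctx0, V, V_ZERO, INSERT_VALUES, SCATTER_FORWARD));
  PetscCall(VecScatterEnd(ctx0, V, V_ZERO, INSERT_VALUES, SCATTER_FORWARD));
  /* ... rank 0 performs the expensive update on V_ZERO; every other rank holds a length-0 vector ... */
  PetscCall(VecScatterBegin(ctx0, V_ZERO, V, INSERT_VALUES, SCATTER_REVERSE));
  PetscCall(VecScatterEnd(ctx0, V_ZERO, V, INSERT_VALUES, SCATTER_REVERSE));
  PetscCall(VecScatterDestroy(&ctx0));
  PetscCall(VecDestroy(&V_ZERO));
  PetscFunctionReturn(0);
}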

  You will likely only see a worthwhile improvement in performance if you can parallelize the computation in step 4. What are you doing that is computationally intensive and requires all the data on a single rank?

Barry





Thanks,

Vysakh Venugopal
---
Vysakh Venugopal
Ph.D. Candidate
Department of Mechanical Engineering
University of Cincinnati, Cincinnati, OH 45221-0072

-
Canada Research Chair in Mathematical and Computational Aspects of Solid Mechanics (Tier 1)
Professor, Department of Mathematics & Statistics
Hamilton Hall room 409A, McMaster University
1280 Main Street West, Hamilton, Ontario L8S 4K1, Canada
https://www.math.mcmaster.ca/bourdin | +1 (905) 525 9140 ext. 27243
