[petsc-users] VecNorm causes program to hang
Sreeram R Venkat
srvenkat at utexas.edu
Thu Nov 16 17:19:02 CST 2023
I have a program that reads a vector from a file into an array, and then
uses that array to create a PETSc Vec object. The Vec is defined on the
global communicator, but not all processes actually contain entries of it.
For example, suppose we have 4 processes, and the vector is of size 10.
Rank 0 will contain entries 0-4 and Rank 1 will contain entries 5-9. Ranks
2 and 3 will not have any entries of the Vec.
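To make the layout concrete, here is a minimal sketch in plain Python. It does not use PETSc or MPI; the helper names `ownership_ranges` and `two_norm_from_partials` and the sample values 0..9 are made up for illustration. It only models how per-rank local sizes map to ownership ranges and how a 2-norm is assembled from one partial sum per rank, including ranks that own nothing.

```python
import math

def ownership_ranges(local_sizes):
    """Return the half-open [start, end) index range owned by each rank."""
    ranges = []
    start = 0
    for n in local_sizes:
        ranges.append((start, start + n))
        start += n
    return ranges

def two_norm_from_partials(local_chunks):
    """Combine per-rank sums of squares into a global 2-norm.

    Every rank contributes a partial sum, including ranks whose chunk is
    empty (their partial is 0.0). The outer sum stands in for the
    all-reduce step of a real parallel norm.
    """
    partials = [sum(x * x for x in chunk) for chunk in local_chunks]
    return math.sqrt(sum(partials))

# Layout from the example: 4 ranks, global size 10, ranks 2 and 3 empty.
local_sizes = [5, 5, 0, 0]
ranges = ownership_ranges(local_sizes)

# Hypothetical vector entries, used only to have something to take a norm of.
values = [float(i) for i in range(10)]
chunks = [values[start:end] for start, end in ranges]

norm = two_norm_from_partials(chunks)
print(ranges)
print(norm)
```

In this model, ranks 2 and 3 own the empty range (10, 10) and contribute 0.0, but they still take part in the final sum. The same holds in PETSc, where VecNorm() is collective on the Vec's communicator: every rank, including those with no local entries, has to make the call.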
This Vec is then used as an input to other parts of the code, and those
work fine. However, if I try to take the norm of the Vec with VecNorm(), I
get the error
`MPI_Allreduce() called in different locations (code lines) on different processors`
The stack trace shows that ranks 0 and 1 (from the above example) are still
in the VecNorm() function while ranks 2 and 3 have moved on to a later part
of the code. If I add a PetscBarrier() after the VecNorm(), I find that the
program hangs.
The funny thing is that part of the code duplicates the Vec with
VecDuplicate() and assigns to the duplicated vector the result of some
computations. The duplicated Vec has the same layout as the original Vec,
but taking VecNorm() on the duplicated Vec works fine. If I use VecCopy(),
however, the copied Vec also causes VecNorm() to hang. I've printed out the
original Vec, and there are no corrupted/NaN entries.
I have a temporary workaround where I perturb the original Vec slightly
before copying it to another Vec. With that change, the program runs to
completion.
Any advice on how to get VecNorm() working with the original Vec?
Thanks,
Sreeram