[petsc-users] VecNorm causes program to hang

Sreeram R Venkat srvenkat at utexas.edu
Thu Nov 16 19:38:02 CST 2023


OK, will do. It may take me a few days to put together a minimal
reproducible example, though, since the rest of the program has grown
quite large.

Thanks,
Sreeram

On Thu, Nov 16, 2023 at 8:27 PM Matthew Knepley <knepley at gmail.com> wrote:

> On Thu, Nov 16, 2023 at 6:19 PM Sreeram R Venkat <srvenkat at utexas.edu>
> wrote:
>
>> I have a program which reads a vector from file into an array, and then
>> uses that array to create a PETSc Vec object. The Vec is defined on the
>> global communicator, but not all processes actually contain entries of it.
>> For example, suppose we have 4 processors, and the vector is of size 10.
>> Rank 0 will contain entries 0-4 and Rank 1 will contain entries 5-9. Ranks
>> 2 and 3 will not have any entries of the Vec.
>>
>> This Vec is then used as an input to other parts of the code, and those
>> work fine. However, if I try to take the norm of the Vec with VecNorm(), I
>> get the error
>>
>> `MPI_Allreduce() called in different locations (code lines) on different
>> processors`
>>
>> The stack trace shows that ranks 0 and 1 (from the above example) are
>> still in the VecNorm() function while ranks 2 and 3 have moved on to a
>> later part of the code. If I add a PetscBarrier() after the VecNorm(), I
>> find that the program hangs.
>>
>> The funny thing is that part of the code duplicates the Vec with
>> VecDuplicate() and assigns to the duplicated vector the result of some
>> computations. The duplicated Vec has the same layout as the original Vec,
>> but taking VecNorm() on the duplicated Vec works fine. If I use VecCopy(),
>> however, the copied Vec also causes VecNorm() to hang. I've printed out the
>> original Vec, and there are no corrupted/NaN entries.
>>
>> I have a temporary workaround where I perturb the original Vec slightly
>> before copying it to another Vec. With that change, the program
>> terminates successfully.
>>
>> Any advice on how to get VecNorm() working with the original Vec?
>>
>
> Vecs with empty layouts work fine, so it must be something else about how
> it is created.
>
> In order to track it down, I would first make a short program that just
> creates the Vec as you say and see if it hangs. If so, just send it and we
> will debug it. If not, I would systematically cut down your program until
> you get something that hangs that you can send to us.
>
>   Thanks,
>
>      Matt
>
>
>> Thanks,
>> Sreeram
>>
>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
>
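
For readers following the thread, the layout described in the quoted
question (4 ranks, a Vec of size 10, ranks 2 and 3 owning nothing) can be
modelled in a few lines. This is only a serial sketch in plain Python, not
PETSc code and not the poster's program: the helper names
ownership_ranges and collective_norm2 are made up for illustration, and
the entries 0..9 are assumed values. It shows the half-open ownership
ranges (the same low/high convention VecGetOwnershipRange() uses) and why
a 2-norm is a reduction to which every rank, including the empty ones,
contributes a partial sum.

```python
import math


def ownership_ranges(local_sizes):
    """Return a half-open (start, end) range per rank from its local size."""
    ranges = []
    start = 0
    for n in local_sizes:
        ranges.append((start, start + n))
        start += n
    return ranges


def collective_norm2(global_values, local_sizes):
    """Serial model of a 2-norm computed as a reduction over all ranks.

    Each rank squares and sums only the entries it owns. A rank with an
    empty range contributes 0 but still takes part in the reduction.
    """
    partial = []
    for start, end in ownership_ranges(local_sizes):
        partial.append(sum(v * v for v in global_values[start:end]))
    return math.sqrt(sum(partial)), partial


# Layout from the question: ranks 0 and 1 own 5 entries each, 2 and 3 none.
local_sizes = [5, 5, 0, 0]
values = [float(i) for i in range(10)]  # assumed entries, for illustration

ranges = ownership_ranges(local_sizes)
norm, partial = collective_norm2(values, local_sizes)

for rank, (r, p) in enumerate(zip(ranges, partial)):
    print(f"rank {rank}: owns [{r[0]}, {r[1]}), partial sum of squares = {p}")
print(f"2-norm = {norm}")
```

In real PETSc the reduction is a collective call, so in a short reproducer
every rank on the Vec's communicator has to reach VecNorm(), even the ranks
whose local size is zero. The model above only illustrates the arithmetic
and the layout; it cannot reproduce the hang itself.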