[petsc-users] PETSc initialization error

Sam Guo sam.guo at cd-adapco.com
Fri Jun 26 17:43:42 CDT 2020


Hi Junchao,
   I am not ready to upgrade petsc yet (due to the lengthy technical and
legal approval process required by our internal policy). Can you send me the
diff file so I can apply it to petsc 3.11.3?

Thanks,
Sam

On Fri, Jun 26, 2020 at 3:33 PM Junchao Zhang <junchao.zhang at gmail.com>
wrote:

> Sam,
>   Please discard the original patch I sent you. A better fix is already in
> maint/master. A test is at src/sys/tests/ex53.c
>   I modified that test at the end with:
>
>   for (i=0; i<500; i++) {
>     ierr = PetscInitializeNoPointers(argc,argv,NULL,help);if (ierr) return ierr;
>     ierr = SlepcInitialize(&argc,&argv,NULL,help);if (ierr) return ierr;
>     ierr = SlepcFinalize();if (ierr) return ierr;
>     ierr = PetscFinalize();if (ierr) return ierr;
>   }
>
>
>  Then I ran it with multiple MPI ranks and it ran correctly. So try your
> program with petsc master first. If that does not work, see if you can come
> up with a test example for us.
>
>  Thanks.
> --Junchao Zhang
>
>
> On Fri, Jun 26, 2020 at 3:37 PM Sam Guo <sam.guo at cd-adapco.com> wrote:
>
>> One workaround for me is to call PetscInitialize once for my entire
>> program and skip PetscFinalize (since I don't have a good place to call
>> PetscFinalize before ending the program).
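A minimal sketch of this "initialize once, never finalize" workaround, assuming a standalone main() rather than Sam's actual application code:

#include <petscsys.h>

int main(int argc, char **argv)
{
  PetscErrorCode ierr;

  /* bring PETSc up exactly once for the whole run */
  ierr = PetscInitialize(&argc,&argv,NULL,NULL);if (ierr) return ierr;

  /* ... the rest of the application, including any SLEPc/PETSc calls ... */

  /* PetscFinalize() is deliberately skipped in this workaround; the host
     program remains responsible for shutting down MPI at exit */
  return 0;
}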
>>
>> On Fri, Jun 26, 2020 at 1:33 PM Sam Guo <sam.guo at cd-adapco.com> wrote:
>>
>>> I get the crash after calling Initialize/Finalize multiple times.
>>> Junchao fixed the bug for serial but parallel still crashes.
>>>
>>> On Fri, Jun 26, 2020 at 1:28 PM Barry Smith <bsmith at petsc.dev> wrote:
>>>
>>>>
>>>>   Ah, so you get the crash the second time you call PetscInitialize()?
>>>> That is a problem because we do intend to support that capability (but you
>>>> must call PetscFinalize() each time also).
>>>>
>>>>   Barry
>>>>
>>>>
>>>> On Jun 26, 2020, at 3:25 PM, Sam Guo <sam.guo at cd-adapco.com> wrote:
>>>>
>>>> Hi Barry,
>>>>    Thanks for the quick response.
>>>>    I will call PetscInitialize once and skip the PetscFinalize for now
>>>> to avoid the crash. The crash is actually in PetscInitialize, not
>>>> PetscFinalize.
>>>>
>>>> Thanks,
>>>> Sam
>>>>
>>>> On Fri, Jun 26, 2020 at 1:21 PM Barry Smith <bsmith at petsc.dev> wrote:
>>>>
>>>>>
>>>>>   Sam,
>>>>>
>>>>>   You can skip PetscFinalize() so long as you only call
>>>>> PetscInitialize() once. Skipping the finalize is not desirable in general,
>>>>> because PETSc cannot free all its data structures and you cannot see the
>>>>> PETSc logging information with -log_view, but in terms of the code running
>>>>> correctly you do not need to call PetscFinalize().
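A hedged sketch of one way a host application could guarantee that single PetscInitialize() call; the helper name ensure_petsc_initialized is hypothetical, while PetscInitialized() is the standard query for whether PETSc is already up:

#include <petscsys.h>

/* hypothetical helper: call before any PETSc/SLEPc work */
static PetscErrorCode ensure_petsc_initialized(int *argc, char ***argv)
{
  PetscBool      initialized;
  PetscErrorCode ierr;

  ierr = PetscInitialized(&initialized);if (ierr) return ierr;
  if (!initialized) {
    ierr = PetscInitialize(argc,argv,NULL,NULL);if (ierr) return ierr;
  }
  return 0;
}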
>>>>>
>>>>>    If your code crashes in PetscFinalize() please send the full error
>>>>> output and we can try to help you debug it.
>>>>>
>>>>>
>>>>>    Barry
>>>>>
>>>>> On Jun 26, 2020, at 3:14 PM, Sam Guo <sam.guo at cd-adapco.com> wrote:
>>>>>
>>>>> To clarify, we have an MPI wrapper (so we can switch to a different MPI
>>>>> at runtime), and I compile petsc using our MPI wrapper.
>>>>> If I just call PETSc initialize once without calling finalize, it is
>>>>> fine. My question is: can I skip finalize?
>>>>> Our program calls MPI_Finalize at the end anyway.
>>>>>
>>>>> On Fri, Jun 26, 2020 at 1:09 PM Sam Guo <sam.guo at cd-adapco.com> wrote:
>>>>>
>>>>>> Hi Junchao,
>>>>>>    Attached please find the configure.log.
>>>>>>    I also attach the pinit.c which contains your patch (I am
>>>>>> currently using 3.11.3 and have applied your patch to it). Your patch
>>>>>> fixes the serial version; the error now is in the parallel case.
>>>>>>    Here is the error log:
>>>>>>
>>>>>> [1]PETSC ERROR: #1 PetscInitialize() line 969 in
>>>>>> ../../../petsc/src/sys/objects/pinit.c
>>>>>> [1]PETSC ERROR: #2 checkError() line 56 in
>>>>>> ../../../physics/src/eigensolver/SLEPc.cpp
>>>>>> [1]PETSC ERROR: #3 PetscInitialize() line 966 in
>>>>>> ../../../petsc/src/sys/objects/pinit.c
>>>>>> [1]PETSC ERROR: #4 SlepcInitialize() line 262 in
>>>>>> ../../../slepc/src/sys/slepcinit.c
>>>>>> [0]PETSC ERROR: #1 PetscInitialize() line 969 in
>>>>>> ../../../petsc/src/sys/objects/pinit.c
>>>>>> [0]PETSC ERROR: #2 checkError() line 56 in
>>>>>> ../../../physics/src/eigensolver/SLEPc.cpp
>>>>>> [0]PETSC ERROR: #3 PetscInitialize() line 966 in
>>>>>> ../../../petsc/src/sys/objects/pinit.c
>>>>>> [0]PETSC ERROR: #4 SlepcInitialize() line 262 in
>>>>>> ../../../slepc/src/sys/slepcinit.c
>>>>>> PETSC ERROR: Logging has not been enabled.
>>>>>> You might have forgotten to call PetscInitialize().
>>>>>> PETSC ERROR: Logging has not been enabled.
>>>>>> You might have forgotten to call PetscInitialize().
>>>>>>
>>>>>> --------------------------------------------------------------------------
>>>>>> MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
>>>>>> with errorcode 56.
>>>>>>
>>>>>> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
>>>>>> You may or may not see output from other processes, depending on
>>>>>> exactly when Open MPI kills them.
>>>>>>
>>>>>> Thanks,
>>>>>> Sam
>>>>>>
>>>>>> On Thu, Jun 25, 2020 at 7:37 PM Junchao Zhang <
>>>>>> junchao.zhang at gmail.com> wrote:
>>>>>>
>>>>>>> Sam,
>>>>>>>    The MPI_Comm_create_keyval() error was fixed in maint/master.
>>>>>>> From the error message, it seems you need to configure with --with-log=1.
>>>>>>>    Otherwise, please send your full error stack trace and
>>>>>>> configure.log.
>>>>>>>   Thanks.
>>>>>>> --Junchao Zhang
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Jun 25, 2020 at 2:18 PM Sam Guo <sam.guo at cd-adapco.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Junchao,
>>>>>>>>    I now encountered the same error in parallel. I am wondering
>>>>>>>> if there is a need for a parallel fix as well.
>>>>>>>> [1]PETSC ERROR: #1 PetscInitialize() line 969 in
>>>>>>>> ../../../petsc/src/sys/objects/pinit.c
>>>>>>>> PETSC ERROR: Logging has not been enabled.
>>>>>>>> You might have forgotten to call PetscInitialize().
>>>>>>>> PETSC ERROR: Logging has not been enabled.
>>>>>>>> You might have forgotten to call PetscInitialize().
>>>>>>>>
>>>>>>>> On Sat, Jun 20, 2020 at 7:35 PM Sam Guo <sam.guo at cd-adapco.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi Junchao,
>>>>>>>>>    Your patch works.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Sam
>>>>>>>>>
>>>>>>>>> On Sat, Jun 20, 2020 at 4:23 PM Junchao Zhang <
>>>>>>>>> junchao.zhang at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Sat, Jun 20, 2020 at 12:24 PM Barry Smith <bsmith at petsc.dev>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>    Junchao,
>>>>>>>>>>>
>>>>>>>>>>>      This is a good bug fix. It solves the problem when PETSc
>>>>>>>>>>> initialize is called many times.
>>>>>>>>>>>
>>>>>>>>>>>      There is another fix you can do to limit PETSc mpiuni
>>>>>>>>>>> running out of attributes inside a single PETSc run:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> int MPI_Comm_create_keyval(MPI_Copy_function *copy_fn,MPI_Delete_function *delete_fn,int *keyval,void *extra_state)
>>>>>>>>>>> {
>>>>>>>>>>>   int i;
>>>>>>>>>>>
>>>>>>>>>>>   if (num_attr >= MAX_ATTR) {
>>>>>>>>>>>     for (i=0; i<num_attr; i++) {
>>>>>>>>>>>       if (!attr_keyval[i].extra_state) {
>>>>>>>>>>>
>>>>>>>>>> attr_keyval[i].extra_state is provided by the user (it could be NULL),
>>>>>>>>>> so we cannot rely on it.
>>>>>>>>>>
>>>>>>>>>>>         /* reuse this slot */
>>>>>>>>>>>         attr_keyval[i].extra_state = extra_state;
>>>>>>>>>>>         attr_keyval[i].del         = delete_fn;
>>>>>>>>>>>         *keyval = i;
>>>>>>>>>>>         return MPI_SUCCESS;
>>>>>>>>>>>       }
>>>>>>>>>>>     }
>>>>>>>>>>>     return MPIUni_Abort(MPI_COMM_WORLD,1);
>>>>>>>>>>>   }
>>>>>>>>>>>   attr_keyval[num_attr].extra_state = extra_state;
>>>>>>>>>>>   attr_keyval[num_attr].del         = delete_fn;
>>>>>>>>>>>   *keyval                           = num_attr++;
>>>>>>>>>>>   return MPI_SUCCESS;
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>>   This will work if the user creates tons of attributes but is
>>>>>>>>>>> constantly deleting some as they create new ones, so long as the number
>>>>>>>>>>> outstanding at any one time is < MAX_ATTR.
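A short sketch of the usage pattern this reuse scheme is meant to support, written against plain MPI (the names churn_keyvals and delete_noop are made up for illustration): keyvals are freed about as fast as they are created, so the number outstanding stays below MAX_ATTR.

#include <mpi.h>

static int delete_noop(MPI_Comm comm, int keyval, void *attr_val, void *extra_state)
{
  return MPI_SUCCESS;              /* nothing to clean up in this sketch */
}

static void churn_keyvals(void)
{
  int n, keyval;
  for (n = 0; n < 100000; n++) {
    MPI_Comm_create_keyval(MPI_COMM_NULL_COPY_FN, delete_noop, &keyval, NULL);
    /* ... set and use the attribute on some communicator ... */
    MPI_Comm_free_keyval(&keyval); /* releases the slot before the next one is created */
  }
}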
>>>>>>>>>>>
>>>>>>>>>>> Barry
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Jun 20, 2020, at 10:54 AM, Junchao Zhang <
>>>>>>>>>>> junchao.zhang at gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> I don't understand what you mean by "session". Let's try this patch:
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/src/sys/mpiuni/mpi.c b/src/sys/mpiuni/mpi.c
>>>>>>>>>>> index d559a513..c058265d 100644
>>>>>>>>>>> --- a/src/sys/mpiuni/mpi.c
>>>>>>>>>>> +++ b/src/sys/mpiuni/mpi.c
>>>>>>>>>>> @@ -283,6 +283,7 @@ int MPI_Finalize(void)
>>>>>>>>>>>    MPI_Comm_free(&comm);
>>>>>>>>>>>    comm = MPI_COMM_SELF;
>>>>>>>>>>>    MPI_Comm_free(&comm);
>>>>>>>>>>> +  num_attr = 1; /* reset the counter */
>>>>>>>>>>>    MPI_was_finalized = 1;
>>>>>>>>>>>    return MPI_SUCCESS;
>>>>>>>>>>>  }
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --Junchao Zhang
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Sat, Jun 20, 2020 at 10:48 AM Sam Guo <sam.guo at cd-adapco.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Typo: I mean “Assuming initializer is only needed once for
>>>>>>>>>>>> entire session”
>>>>>>>>>>>>
>>>>>>>>>>>> On Saturday, June 20, 2020, Sam Guo <sam.guo at cd-adapco.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Assuming finalizer is only needed once for entire session(?),
>>>>>>>>>>>>> I can put the initializer into the static block to call it once, but where do I
>>>>>>>>>>>>> call the finalizer?
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Saturday, June 20, 2020, Junchao Zhang <
>>>>>>>>>>>>> junchao.zhang at gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> The counter num_attr should be recycled. But first try to
>>>>>>>>>>>>>> call PETSc Initialize/Finalize only once to see if it fixes the error.
>>>>>>>>>>>>>> --Junchao Zhang
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sat, Jun 20, 2020 at 12:48 AM Sam Guo <
>>>>>>>>>>>>>> sam.guo at cd-adapco.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> To clarify, I call PETSc initialize and PETSc finalize
>>>>>>>>>>>>>>> every time I call SLEPc:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>   PetscInitializeNoPointers(argc,args,nullptr,nullptr);
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>   SlepcInitialize(&argc,&args,static_cast<char*>(nullptr),help);
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>   //calling slepc
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>   SlepcFinalize();
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>    PetscFinalize();
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, Jun 19, 2020 at 10:32 PM Sam Guo <
>>>>>>>>>>>>>>> sam.guo at cd-adapco.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Dear PETSc team,
>>>>>>>>>>>>>>>>    When I called SLEPc multiple times, I eventually got the
>>>>>>>>>>>>>>>> following error:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> MPI operation not supported by PETSc's sequential MPI
>>>>>>>>>>>>>>>> wrappers
>>>>>>>>>>>>>>>> [0]PETSC ERROR: #1 PetscInitialize() line 967 in
>>>>>>>>>>>>>>>> ../../../petsc/src/sys/objects/pinit.c
>>>>>>>>>>>>>>>> [0]PETSC ERROR: #2 SlepcInitialize() line 262 in
>>>>>>>>>>>>>>>> ../../../slepc/src/sys/slepcinit.c
>>>>>>>>>>>>>>>> [0]PETSC ERROR: #3 SlepcInitializeNoPointers() line 359 in
>>>>>>>>>>>>>>>> ../../../slepc/src/sys/slepcinit.c
>>>>>>>>>>>>>>>> PETSC ERROR: Logging has not been enabled.
>>>>>>>>>>>>>>>> You might have forgotten to call PetscInitialize().
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>   I debugged it: it is because of the following check in
>>>>>>>>>>>>>>>> petsc/src/sys/mpiuni/mpi.c
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> if (num_attr >= MAX_ATTR)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> in function int MPI_Comm_create_keyval(MPI_Copy_function
>>>>>>>>>>>>>>>> *copy_fn,MPI_Delete_function *delete_fn,int *keyval,void *extra_state)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> num_attr is declared static and keeps increasing every
>>>>>>>>>>>>>>>> time MPI_Comm_create_keyval is called.
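A simplified stand-alone model of that counter, not the actual mpiuni source (the MAX_ATTR value and the number of keyvals per cycle are only illustrative):

#include <stdio.h>

#define MAX_ATTR 128            /* illustrative bound only */

static int num_attr = 1;        /* static: survives every "finalize" */

static int create_keyval_model(void)
{
  if (num_attr >= MAX_ATTR) return 1;   /* the condition that triggers the abort */
  num_attr++;
  return 0;
}

int main(void)
{
  int session, k;
  for (session = 0; ; session++) {      /* each session = one Initialize/Finalize cycle */
    for (k = 0; k < 4; k++) {           /* a few keyvals are created per cycle */
      if (create_keyval_model()) {
        printf("out of attribute slots after %d sessions\n", session);
        return 1;
      }
    }
  }
}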
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I am using petsc 3.11.3 but found 3.13.2 has the same logic.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Is this a bug, or am I not using it correctly?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Sam
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>
>>>>