[petsc-users] PETSc initialization error

Junchao Zhang junchao.zhang at gmail.com
Fri Jun 26 17:33:32 CDT 2020


Sam,
  Please discard the original patch I sent you. A better fix is already in
maint/master. A test is at src/sys/tests/ex53.c.
  I modified the end of that test to:

  for (i=0; i<500; i++) {
    ierr = PetscInitializeNoPointers(argc,argv,NULL,help);if (ierr) return ierr;
    ierr = SlepcInitialize(&argc,&argv,NULL,help);if (ierr) return ierr;
    ierr = SlepcFinalize();if (ierr) return ierr;
    ierr = PetscFinalize();if (ierr) return ierr;
  }


 Then I ran it with multiple MPI ranks and it ran correctly. So try your
program with petsc master first. If that does not work, see if you can come
up with a test example for us.

 Thanks.
--Junchao Zhang


On Fri, Jun 26, 2020 at 3:37 PM Sam Guo <sam.guo at cd-adapco.com> wrote:

> One workaround for me is to call PetscInitialize once for my entire
> program and skip PetscFinalize (since I don't have a good place to call
> PetscFinalize before ending the program).
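>
> A minimal sketch of that pattern, assuming a C++ driver; the helper name is
> illustrative, and it relies on PetscFinalize still being legal at normal
> program exit (i.e. before our MPI wrapper finalizes MPI):
>
> #include <petscsys.h>
>
> namespace {
> struct PetscSession {
>   PetscSession(int *argc, char ***argv) { PetscInitialize(argc, argv, nullptr, nullptr); }
>   ~PetscSession() { PetscFinalize(); } // runs once, at program exit
> };
> }
>
> // Call before each SLEPc use; PETSc is initialized only on the first call.
> void ensurePetscInitialized(int *argc, char ***argv)
> {
>   static PetscSession session(argc, argv);
>   (void)session;
> }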
>
> On Fri, Jun 26, 2020 at 1:33 PM Sam Guo <sam.guo at cd-adapco.com> wrote:
>
>> I get the crash after calling Initialize/Finalize multiple times. Junchao
>> fixed the bug for serial but parallel still crashes.
>>
>> On Fri, Jun 26, 2020 at 1:28 PM Barry Smith <bsmith at petsc.dev> wrote:
>>
>>>
>>>   Ah, so you get the crash the second time you call PetscInitialize()?
>>> That is a problem because we do intend to support that capability (but you
>>> must call PetscFinalize() each time as well).
>>>
>>>   Barry
>>>
>>>
>>> On Jun 26, 2020, at 3:25 PM, Sam Guo <sam.guo at cd-adapco.com> wrote:
>>>
>>> Hi Barry,
>>>    Thanks for the quick response.
>>>    I will call PetscInitialize once and skip PetscFinalize for now
>>> to avoid the crash. The crash is actually in PetscInitialize, not
>>> PetscFinalize.
>>>
>>> Thanks,
>>> Sam
>>>
>>> On Fri, Jun 26, 2020 at 1:21 PM Barry Smith <bsmith at petsc.dev> wrote:
>>>
>>>>
>>>>   Sam,
>>>>
>>>>   You can skip PetscFinalize() so long as you only call
>>>> PetscInitialize() once. It is not desirable in general to skip the finalize,
>>>> because PETSc cannot free all its data structures and you cannot see the
>>>> PETSc logging information with -log_view, but in terms of the code running
>>>> correctly you do not need to call PetscFinalize().
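>>>>
>>>>    A minimal sketch of keeping to that single call, assuming the check is
>>>> done at the spot where your code would otherwise re-initialize
>>>> (PetscInitialized() just reports whether PetscInitialize() has been called):
>>>>
>>>>   PetscBool      initialized;
>>>>   PetscErrorCode ierr;
>>>>   ierr = PetscInitialized(&initialized);if (ierr) return ierr;
>>>>   if (!initialized) {  /* initialize only on the very first pass */
>>>>     ierr = PetscInitialize(&argc,&argv,NULL,NULL);if (ierr) return ierr;
>>>>   }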
>>>>
>>>>    If your code crashes in PetscFinalize() please send the full error
>>>> output and we can try to help you debug it.
>>>>
>>>>
>>>>    Barry
>>>>
>>>> On Jun 26, 2020, at 3:14 PM, Sam Guo <sam.guo at cd-adapco.com> wrote:
>>>>
>>>> To clarify, we have an MPI wrapper (so we can switch to a different MPI at
>>>> runtime). I compile petsc using our MPI wrapper.
>>>> If I just call PETSc initialize once without calling finalize, it is
>>>> ok. My question to you is: can I skip finalize?
>>>> Our program calls MPI_Finalize at the end anyway.
>>>>
>>>> On Fri, Jun 26, 2020 at 1:09 PM Sam Guo <sam.guo at cd-adapco.com> wrote:
>>>>
>>>>> Hi Junchao,
>>>>>    Attached please find the configure.log.
>>>>>    I also attach the pinit.c which contains your patch (I am currently
>>>>> using 3.11.3 and have applied your patch to it). Your patch fixes the
>>>>> serial version. The error now is in the parallel case.
>>>>>    Here is the error log:
>>>>>
>>>>> [1]PETSC ERROR: #1 PetscInitialize() line 969 in
>>>>> ../../../petsc/src/sys/objects/pinit.c
>>>>> [1]PETSC ERROR: #2 checkError() line 56 in
>>>>> ../../../physics/src/eigensolver/SLEPc.cpp
>>>>> [1]PETSC ERROR: #3 PetscInitialize() line 966 in
>>>>> ../../../petsc/src/sys/objects/pinit.c
>>>>> [1]PETSC ERROR: #4 SlepcInitialize() line 262 in
>>>>> ../../../slepc/src/sys/slepcinit.c
>>>>> [0]PETSC ERROR: #1 PetscInitialize() line 969 in
>>>>> ../../../petsc/src/sys/objects/pinit.c
>>>>> [0]PETSC ERROR: #2 checkError() line 56 in
>>>>> ../../../physics/src/eigensolver/SLEPc.cpp
>>>>> [0]PETSC ERROR: #3 PetscInitialize() line 966 in
>>>>> ../../../petsc/src/sys/objects/pinit.c
>>>>> [0]PETSC ERROR: #4 SlepcInitialize() line 262 in
>>>>> ../../../slepc/src/sys/slepcinit.c
>>>>> PETSC ERROR: Logging has not been enabled.
>>>>> You might have forgotten to call PetscInitialize().
>>>>> PETSC ERROR: Logging has not been enabled.
>>>>> You might have forgotten to call PetscInitialize().
>>>>>
>>>>> --------------------------------------------------------------------------
>>>>> MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
>>>>> with errorcode 56.
>>>>>
>>>>> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
>>>>> You may or may not see output from other processes, depending on
>>>>> exactly when Open MPI kills them.
>>>>>
>>>>> Thanks,
>>>>> Sam
>>>>>
>>>>> On Thu, Jun 25, 2020 at 7:37 PM Junchao Zhang <junchao.zhang at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Sam,
>>>>>>    The MPI_Comm_create_keyval() error was fixed in maint/master. From
>>>>>> the error message, it seems you need to configure with --with-log=1.
>>>>>>    Otherwise, please send your full error stack trace and
>>>>>> configure.log.
>>>>>>   Thanks.
>>>>>> --Junchao Zhang
>>>>>>
>>>>>>
>>>>>> On Thu, Jun 25, 2020 at 2:18 PM Sam Guo <sam.guo at cd-adapco.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Junchao,
>>>>>>>    I have now encountered the same error in parallel. I am wondering if
>>>>>>> there is a need for a parallel fix as well.
>>>>>>> [1]PETSC ERROR: #1 PetscInitialize() line 969 in
>>>>>>> ../../../petsc/src/sys/objects/pinit.c
>>>>>>> PETSC ERROR: Logging has not been enabled.
>>>>>>> You might have forgotten to call PetscInitialize().
>>>>>>> PETSC ERROR: Logging has not been enabled.
>>>>>>> You might have forgotten to call PetscInitialize().
>>>>>>>
>>>>>>> On Sat, Jun 20, 2020 at 7:35 PM Sam Guo <sam.guo at cd-adapco.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Junchao,
>>>>>>>>    Your patch works.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Sam
>>>>>>>>
>>>>>>>> On Sat, Jun 20, 2020 at 4:23 PM Junchao Zhang <
>>>>>>>> junchao.zhang at gmail.com> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Sat, Jun 20, 2020 at 12:24 PM Barry Smith <bsmith at petsc.dev>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>    Junchao,
>>>>>>>>>>
>>>>>>>>>>      This is a good bug fix. It solves the problem when PETSc
>>>>>>>>>> initialize is called many times.
>>>>>>>>>>
>>>>>>>>>>      There is another fix you can do to keep PETSc's mpiuni from
>>>>>>>>>> running out of attributes inside a single PETSc run:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> int MPI_Comm_create_keyval(MPI_Copy_function *copy_fn,MPI_Delete_function *delete_fn,int *keyval,void *extra_state)
>>>>>>>>>> {
>>>>>>>>>>   if (num_attr >= MAX_ATTR) {
>>>>>>>>>>     for (i=0; i<num_attr; i++) {
>>>>>>>>>>       if (!attr_keyval[i].extra_state) {
>>>>>>>>> attr_keyval[i].extra_state is provided by the user (it could be NULL). We
>>>>>>>>> cannot rely on it.
>>>>>>>>>
>>>>>>>>>>         /* reuse this slot */
>>>>>>>>>>         attr_keyval[i].extra_state = extra_state;
>>>>>>>>>>         attr_keyval[i].del         = delete_fn;
>>>>>>>>>>         *keyval = i;
>>>>>>>>>>         return MPI_SUCCESS;
>>>>>>>>>>       }
>>>>>>>>>>     }
>>>>>>>>>>     return MPIUni_Abort(MPI_COMM_WORLD,1);
>>>>>>>>>>   }
>>>>>>>>>>   attr_keyval[num_attr].extra_state = extra_state;
>>>>>>>>>>   attr_keyval[num_attr].del         = delete_fn;
>>>>>>>>>>   *keyval                           = num_attr++;
>>>>>>>>>>   return MPI_SUCCESS;
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>>   This will work if the user creates tons of attributes but is
>>>>>>>>>> constantly deleting some as they create new ones, so long as the number
>>>>>>>>>> outstanding at any one time is < MAX_ATTR.
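>>>>>>>>>>
>>>>>>>>>>   A sketch of the same reuse idea keyed off an explicit flag rather
>>>>>>>>>> than extra_state; the in_use field is hypothetical (it would have to be
>>>>>>>>>> added to mpiuni's keyval table and cleared when a keyval is freed):
>>>>>>>>>>
>>>>>>>>>> int MPI_Comm_create_keyval(MPI_Copy_function *copy_fn,MPI_Delete_function *delete_fn,int *keyval,void *extra_state)
>>>>>>>>>> {
>>>>>>>>>>   if (num_attr >= MAX_ATTR) {
>>>>>>>>>>     for (i=0; i<num_attr; i++) {
>>>>>>>>>>       if (!attr_keyval[i].in_use) {  /* slot released earlier */
>>>>>>>>>>         attr_keyval[i].in_use      = 1;
>>>>>>>>>>         attr_keyval[i].extra_state = extra_state;
>>>>>>>>>>         attr_keyval[i].del         = delete_fn;
>>>>>>>>>>         *keyval = i;
>>>>>>>>>>         return MPI_SUCCESS;
>>>>>>>>>>       }
>>>>>>>>>>     }
>>>>>>>>>>     return MPIUni_Abort(MPI_COMM_WORLD,1);
>>>>>>>>>>   }
>>>>>>>>>>   attr_keyval[num_attr].in_use      = 1;
>>>>>>>>>>   attr_keyval[num_attr].extra_state = extra_state;
>>>>>>>>>>   attr_keyval[num_attr].del         = delete_fn;
>>>>>>>>>>   *keyval                           = num_attr++;
>>>>>>>>>>   return MPI_SUCCESS;
>>>>>>>>>> }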
>>>>>>>>>>
>>>>>>>>>> Barry
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Jun 20, 2020, at 10:54 AM, Junchao Zhang <
>>>>>>>>>> junchao.zhang at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>> I don't understand what you mean by "session". Let's try this patch:
>>>>>>>>>>
>>>>>>>>>> diff --git a/src/sys/mpiuni/mpi.c b/src/sys/mpiuni/mpi.c
>>>>>>>>>> index d559a513..c058265d 100644
>>>>>>>>>> --- a/src/sys/mpiuni/mpi.c
>>>>>>>>>> +++ b/src/sys/mpiuni/mpi.c
>>>>>>>>>> @@ -283,6 +283,7 @@ int MPI_Finalize(void)
>>>>>>>>>>    MPI_Comm_free(&comm);
>>>>>>>>>>    comm = MPI_COMM_SELF;
>>>>>>>>>>    MPI_Comm_free(&comm);
>>>>>>>>>> +  num_attr = 1; /* reset the counter */
>>>>>>>>>>    MPI_was_finalized = 1;
>>>>>>>>>>    return MPI_SUCCESS;
>>>>>>>>>>  }
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --Junchao Zhang
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Sat, Jun 20, 2020 at 10:48 AM Sam Guo <sam.guo at cd-adapco.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Typo: I mean “Assuming initializer is only needed once for
>>>>>>>>>>> entire session”
>>>>>>>>>>>
>>>>>>>>>>> On Saturday, June 20, 2020, Sam Guo <sam.guo at cd-adapco.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Assuming the finalizer is only needed once for the entire session(?), I
>>>>>>>>>>>> can put the initializer into a static block to call it once, but where do I
>>>>>>>>>>>> call the finalizer?
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Saturday, June 20, 2020, Junchao Zhang <
>>>>>>>>>>>> junchao.zhang at gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> The counter num_attr should be recycled. But first try calling
>>>>>>>>>>>>> PETSc Initialize/Finalize only once to see if it fixes the error.
>>>>>>>>>>>>> --Junchao Zhang
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sat, Jun 20, 2020 at 12:48 AM Sam Guo <
>>>>>>>>>>>>> sam.guo at cd-adapco.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> To clarify, I call PETSc initialize and PETSc finalize
>>>>>>>>>>>>>> every time I call SLEPc:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>   PetscInitializeNoPointers(argc,args,nullptr,nullptr);
>>>>>>>>>>>>>>   SlepcInitialize(&argc,&args,static_cast<char*>(nullptr),help);
>>>>>>>>>>>>>>   // calling slepc
>>>>>>>>>>>>>>   SlepcFinalize();
>>>>>>>>>>>>>>   PetscFinalize();
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, Jun 19, 2020 at 10:32 PM Sam Guo <
>>>>>>>>>>>>>> sam.guo at cd-adapco.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Dear PETSc team,
>>>>>>>>>>>>>>>    When I called SLEPc multiple times, I eventually got the
>>>>>>>>>>>>>>> following error:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> MPI operation not supported by PETSc's sequential MPI
>>>>>>>>>>>>>>> wrappers
>>>>>>>>>>>>>>> [0]PETSC ERROR: #1 PetscInitialize() line 967 in
>>>>>>>>>>>>>>> ../../../petsc/src/sys/objects/pinit.c
>>>>>>>>>>>>>>> [0]PETSC ERROR: #2 SlepcInitialize() line 262 in
>>>>>>>>>>>>>>> ../../../slepc/src/sys/slepcinit.c
>>>>>>>>>>>>>>> [0]PETSC ERROR: #3 SlepcInitializeNoPointers() line 359 in
>>>>>>>>>>>>>>> ../../../slepc/src/sys/slepcinit.c
>>>>>>>>>>>>>>> PETSC ERROR: Logging has not been enabled.
>>>>>>>>>>>>>>> You might have forgotten to call PetscInitialize().
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>   I debugged it: it is because of the following check in
>>>>>>>>>>>>>>> petsc/src/sys/mpiuni/mpi.c
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> if (num_attr >= MAX_ATTR)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> in function int MPI_Comm_create_keyval(MPI_Copy_function
>>>>>>>>>>>>>>> *copy_fn,MPI_Delete_function *delete_fn,int *keyval,void *extra_state)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> num_attr is declared static and keeps increasing every
>>>>>>>>>>>>>>> time MPI_Comm_create_keyval is called.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I am using petsc 3.11.3 but found 3.13.2 has the same logic.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Is this a bug, or am I not using it correctly?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Sam
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>
>>>>
>>>