[petsc-users] PETSc initialization error

Junchao Zhang junchao.zhang at gmail.com
Mon Jun 29 23:10:42 CDT 2020


On Mon, Jun 29, 2020 at 1:00 PM Sam Guo <sam.guo at cd-adapco.com> wrote:

> Hi Junchao,
>    I'll test ex53. In the meantime, I am using the following workaround:
> my program calls MPI_Init once for the entire program
> PetscInitialize once for the entire program
> SlepcInitialize once for the entire program (I think I can skip the
> PetscInitialize above, since SlepcInitialize calls it)
> call SLEPc multiple times
> my program calls MPI_Finalize before ending the program
>
>    You can see I skip PetscFinalize/SlepcFinalize. I am uneasy about
> skipping them since I am not sure of the consequences. Can you comment
> on that?
>
It should be fine. Skipping the finalizers means PETSc never frees the
objects it created, but since you end your program right after
MPI_Finalize, the operating system reclaims that memory and there should
be no lasting consequences. In general, though, one should call
PetscFinalize/SlepcFinalize.
Try to get a minimal working example and then we can have a look.
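
For illustration, here is a minimal sketch of that workaround. The solver
loop is a placeholder, and skipping the finalizers is exactly the part
under discussion, so take this as the call pattern rather than a
recommendation:

  #include <slepceps.h>

  int main(int argc,char **argv)
  {
    PetscErrorCode ierr;

    MPI_Init(&argc,&argv);   /* once for the entire program */
    /* SlepcInitialize calls PetscInitialize internally, so one call suffices */
    ierr = SlepcInitialize(&argc,&argv,NULL,NULL);if (ierr) return ierr;
    for (int i=0; i<10; i++) {
      /* set up and run a SLEPc solve here */
    }
    /* SlepcFinalize()/PetscFinalize() intentionally skipped in this workaround */
    return MPI_Finalize();
  }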


>
> Thanks,
> Sam
>
>
>
> On Fri, Jun 26, 2020 at 6:58 PM Junchao Zhang <junchao.zhang at gmail.com>
> wrote:
>
>> Did the test included in that commit fail in your environment? You can
>> also change the test by adding calls to SlepcInitialize/SlepcFinalize
>> between PetscInitializeNoPointers/PetscFinalize as in my previous email.
>>
>> --Junchao Zhang
>>
>>
>> On Fri, Jun 26, 2020 at 5:54 PM Sam Guo <sam.guo at cd-adapco.com> wrote:
>>
>>> Hi Junchao,
>>>    If you are talking about this commit of yours
>>> https://gitlab.com/petsc/petsc/-/commit/f0463fa09df52ce43e7c5bf47a1c87df0c9e5cbb
>>>
>>> Recycle keyvals and fix bugs in MPI_Comm creation
>>>    I think I got it. It fixes the serial case, but the parallel one is
>>> still crashing.
>>>
>>> Thanks,
>>> Sam
>>>
>>> On Fri, Jun 26, 2020 at 3:43 PM Sam Guo <sam.guo at cd-adapco.com> wrote:
>>>
>>>> Hi Junchao,
>>>>    I am not ready to upgrade petsc yet (due to the lengthy technical and
>>>> legal approval process of our internal policy). Can you send me the diff
>>>> file so I can apply it to petsc 3.11.3?
>>>>
>>>> Thanks,
>>>> Sam
>>>>
>>>> On Fri, Jun 26, 2020 at 3:33 PM Junchao Zhang <junchao.zhang at gmail.com>
>>>> wrote:
>>>>
>>>>> Sam,
>>>>>   Please discard the original patch I sent you. A better fix is already
>>>>> in maint/master. A test is at src/sys/tests/ex53.c.
>>>>>   I modified the end of that test with:
>>>>>
>>>>>   for (i=0; i<500; i++) {
>>>>>     ierr = PetscInitializeNoPointers(argc,argv,NULL,help);if (ierr) return ierr;
>>>>>     ierr = SlepcInitialize(&argc,&argv,NULL,help);if (ierr) return ierr;
>>>>>     ierr = SlepcFinalize();if (ierr) return ierr;
>>>>>     ierr = PetscFinalize();if (ierr) return ierr;
>>>>>   }
>>>>>
>>>>>
>>>>>  Then I ran it with multiple MPI ranks and it ran correctly. So try
>>>>> your program with petsc master first. If that does not work, see if you
>>>>> can come up with a test example for us.
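>>>>>
>>>>>  For example, with an MPI build of petsc, something like the following
>>>>> (the rank count is arbitrary, and the make target is the usual one for
>>>>> in-tree tests but may vary with your setup):
>>>>>
>>>>>   cd $PETSC_DIR/src/sys/tests
>>>>>   make ex53
>>>>>   mpiexec -n 2 ./ex53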
>>>>>
>>>>>  Thanks.
>>>>> --Junchao Zhang
>>>>>
>>>>>
>>>>> On Fri, Jun 26, 2020 at 3:37 PM Sam Guo <sam.guo at cd-adapco.com> wrote:
>>>>>
>>>>>> One workaround for me is to call PetscInitialize once for my entire
>>>>>> program and skip PetscFinalize (since I don't have a good place to call
>>>>>> PetscFinalize before ending the program).
>>>>>>
>>>>>> On Fri, Jun 26, 2020 at 1:33 PM Sam Guo <sam.guo at cd-adapco.com>
>>>>>> wrote:
>>>>>>
>>>>>>> I get the crash after calling Initialize/Finalize multiple times.
>>>>>>> Junchao fixed the bug for the serial case, but parallel still crashes.
>>>>>>>
>>>>>>> On Fri, Jun 26, 2020 at 1:28 PM Barry Smith <bsmith at petsc.dev>
>>>>>>> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>   Ah, so you get the crash the second time you call
>>>>>>>> PetscInitialize()?  That is a problem because we do intend to support that
>>>>>>>> capability (but you must call PetscFinalize() each time also).
>>>>>>>>
>>>>>>>>   Barry
>>>>>>>>
>>>>>>>>
>>>>>>>> On Jun 26, 2020, at 3:25 PM, Sam Guo <sam.guo at cd-adapco.com> wrote:
>>>>>>>>
>>>>>>>> Hi Barry,
>>>>>>>>    Thanks for the quick response.
>>>>>>>>    I will call PetscInitialize once and skip the PetscFinalize for
>>>>>>>> now to avoid the crash. The crash is actually in PetscInitialize, not
>>>>>>>> PetscFinalize.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Sam
>>>>>>>>
>>>>>>>> On Fri, Jun 26, 2020 at 1:21 PM Barry Smith <bsmith at petsc.dev>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>   Sam,
>>>>>>>>>
>>>>>>>>>   You can skip PetscFinalize() so long as you only call
>>>>>>>>> PetscInitialize() once. It is not desirable in general to skip the finalize,
>>>>>>>>> because PETSc then cannot free all its data structures and you cannot see
>>>>>>>>> the PETSc logging information with -log_view, but in terms of the code
>>>>>>>>> running correctly you do not need to call PetscFinalize.
>>>>>>>>>
>>>>>>>>>    If your code crashes in PetscFinalize() please send the full
>>>>>>>>> error output and we can try to help you debug it.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>    Barry
>>>>>>>>>
>>>>>>>>> On Jun 26, 2020, at 3:14 PM, Sam Guo <sam.guo at cd-adapco.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> To clarify, we have an MPI wrapper (so we can switch to a different
>>>>>>>>> MPI at runtime). I compile petsc using our MPI wrapper.
>>>>>>>>> If I just call PETSc initialize once without calling finalize, it
>>>>>>>>> is OK. My question is: can I skip finalize?
>>>>>>>>> Our program calls MPI_Finalize at the end anyway.
>>>>>>>>>
>>>>>>>>> On Fri, Jun 26, 2020 at 1:09 PM Sam Guo <sam.guo at cd-adapco.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Junchao,
>>>>>>>>>>    Attached please find the configure.log.
>>>>>>>>>>    I also attach the pinit.c which contains your patch (I am
>>>>>>>>>> currently using 3.11.3 and have applied your patch to it). Your patch
>>>>>>>>>> fixes the serial version. The error now is in the parallel case.
>>>>>>>>>>    Here is the error log:
>>>>>>>>>>
>>>>>>>>>> [1]PETSC ERROR: #1 PetscInitialize() line 969 in
>>>>>>>>>> ../../../petsc/src/sys/objects/pinit.c
>>>>>>>>>> [1]PETSC ERROR: #2 checkError() line 56 in
>>>>>>>>>> ../../../physics/src/eigensolver/SLEPc.cpp
>>>>>>>>>> [1]PETSC ERROR: #3 PetscInitialize() line 966 in
>>>>>>>>>> ../../../petsc/src/sys/objects/pinit.c
>>>>>>>>>> [1]PETSC ERROR: #4 SlepcInitialize() line 262 in
>>>>>>>>>> ../../../slepc/src/sys/slepcinit.c
>>>>>>>>>> [0]PETSC ERROR: #1 PetscInitialize() line 969 in
>>>>>>>>>> ../../../petsc/src/sys/objects/pinit.c
>>>>>>>>>> [0]PETSC ERROR: #2 checkError() line 56 in
>>>>>>>>>> ../../../physics/src/eigensolver/SLEPc.cpp
>>>>>>>>>> [0]PETSC ERROR: #3 PetscInitialize() line 966 in
>>>>>>>>>> ../../../petsc/src/sys/objects/pinit.c
>>>>>>>>>> [0]PETSC ERROR: #4 SlepcInitialize() line 262 in
>>>>>>>>>> ../../../slepc/src/sys/slepcinit.c
>>>>>>>>>> PETSC ERROR: Logging has not been enabled.
>>>>>>>>>> You might have forgotten to call PetscInitialize().
>>>>>>>>>> PETSC ERROR: Logging has not been enabled.
>>>>>>>>>> You might have forgotten to call PetscInitialize().
>>>>>>>>>>
>>>>>>>>>> --------------------------------------------------------------------------
>>>>>>>>>> MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
>>>>>>>>>> with errorcode 56.
>>>>>>>>>>
>>>>>>>>>> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI
>>>>>>>>>> processes.
>>>>>>>>>> You may or may not see output from other processes, depending on
>>>>>>>>>> exactly when Open MPI kills them.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Sam
>>>>>>>>>>
>>>>>>>>>> On Thu, Jun 25, 2020 at 7:37 PM Junchao Zhang <
>>>>>>>>>> junchao.zhang at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Sam,
>>>>>>>>>>>    The MPI_Comm_create_keyval() error was fixed in maint/master.
>>>>>>>>>>> From the error message, it seems you need to configure with --with-log=1.
>>>>>>>>>>>    Otherwise, please send your full error stack trace and
>>>>>>>>>>> configure.log.
>>>>>>>>>>>   Thanks.
>>>>>>>>>>> --Junchao Zhang
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Jun 25, 2020 at 2:18 PM Sam Guo <sam.guo at cd-adapco.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Junchao,
>>>>>>>>>>>>    I have now encountered the same error in parallel. I am
>>>>>>>>>>>> wondering if a parallel fix is needed as well.
>>>>>>>>>>>> [1]PETSC ERROR: #1 PetscInitialize() line 969 in
>>>>>>>>>>>> ../../../petsc/src/sys/objects/pinit.c
>>>>>>>>>>>> PETSC ERROR: Logging has not been enabled.
>>>>>>>>>>>> You might have forgotten to call PetscInitialize().
>>>>>>>>>>>> PETSC ERROR: Logging has not been enabled.
>>>>>>>>>>>> You might have forgotten to call PetscInitialize().
>>>>>>>>>>>>
>>>>>>>>>>>> On Sat, Jun 20, 2020 at 7:35 PM Sam Guo <sam.guo at cd-adapco.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Junchao,
>>>>>>>>>>>>>    Your patch works.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Sam
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sat, Jun 20, 2020 at 4:23 PM Junchao Zhang <
>>>>>>>>>>>>> junchao.zhang at gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sat, Jun 20, 2020 at 12:24 PM Barry Smith <
>>>>>>>>>>>>>> bsmith at petsc.dev> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>    Junchao,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>      This is a good bug fix. It solves the problem when
>>>>>>>>>>>>>>> PetscInitialize is called many times.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>      There is another fix you can do to keep PETSc's mpiuni
>>>>>>>>>>>>>>> from running out of attributes within a single PETSc run:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> int MPI_Comm_create_keyval(MPI_Copy_function
>>>>>>>>>>>>>>> *copy_fn,MPI_Delete_function *delete_fn,int *keyval,void *extra_state)
>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>   if (num_attr >= MAX_ATTR) {
>>>>>>>>>>>>>>>     for (i=0; i<num_attr; i++) {
>>>>>>>>>>>>>>>       if (!attr_keyval[i].extra_state) {
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> attr_keyval[i].extra_state is provided by the user (it could be
>>>>>>>>>>>>>> NULL), so we cannot rely on it.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>         /* reuse this slot */
>>>>>>>>>>>>>>>         attr_keyval[i].extra_state = extra_state;
>>>>>>>>>>>>>>>         attr_keyval[i].del         = delete_fn;
>>>>>>>>>>>>>>>         *keyval = i;
>>>>>>>>>>>>>>>         return MPI_SUCCESS;
>>>>>>>>>>>>>>>       }
>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>     return MPIUni_Abort(MPI_COMM_WORLD,1);
>>>>>>>>>>>>>>>   }
>>>>>>>>>>>>>>>   attr_keyval[num_attr].extra_state = extra_state;
>>>>>>>>>>>>>>>   attr_keyval[num_attr].del         = delete_fn;
>>>>>>>>>>>>>>>   *keyval                           = num_attr++;
>>>>>>>>>>>>>>>   return MPI_SUCCESS;
>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>   This will work if the user creates tons of attributes but
>>>>>>>>>>>>>>> is constantly deleting some as they create new ones, so long as the
>>>>>>>>>>>>>>> number outstanding at any one time is < MAX_ATTR.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Barry
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Jun 20, 2020, at 10:54 AM, Junchao Zhang <
>>>>>>>>>>>>>>> junchao.zhang at gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I don't understand what you mean by "session". Let's try this
>>>>>>>>>>>>>>> patch:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> diff --git a/src/sys/mpiuni/mpi.c b/src/sys/mpiuni/mpi.c
>>>>>>>>>>>>>>> index d559a513..c058265d 100644
>>>>>>>>>>>>>>> --- a/src/sys/mpiuni/mpi.c
>>>>>>>>>>>>>>> +++ b/src/sys/mpiuni/mpi.c
>>>>>>>>>>>>>>> @@ -283,6 +283,7 @@ int MPI_Finalize(void)
>>>>>>>>>>>>>>>    MPI_Comm_free(&comm);
>>>>>>>>>>>>>>>    comm = MPI_COMM_SELF;
>>>>>>>>>>>>>>>    MPI_Comm_free(&comm);
>>>>>>>>>>>>>>> +  num_attr = 1; /* reset the counter */
>>>>>>>>>>>>>>>    MPI_was_finalized = 1;
>>>>>>>>>>>>>>>    return MPI_SUCCESS;
>>>>>>>>>>>>>>>  }
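>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> (If you save the patch to a file, it can be applied from the
>>>>>>>>>>>>>>> petsc source root with, e.g., git apply fix.diff; the file
>>>>>>>>>>>>>>> name here is arbitrary.)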
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> --Junchao Zhang
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Sat, Jun 20, 2020 at 10:48 AM Sam Guo <
>>>>>>>>>>>>>>> sam.guo at cd-adapco.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Typo: I meant “Assuming the initializer is only needed once
>>>>>>>>>>>>>>>> for the entire session”
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Saturday, June 20, 2020, Sam Guo <sam.guo at cd-adapco.com>
>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Assuming the finalizer is only needed once for the entire
>>>>>>>>>>>>>>>>> session(?), I can put the initializer into the static block to call it
>>>>>>>>>>>>>>>>> once, but where do I call the finalizer?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Saturday, June 20, 2020, Junchao Zhang <
>>>>>>>>>>>>>>>>> junchao.zhang at gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The counter num_attr should be recycled. But first try to
>>>>>>>>>>>>>>>>>> call PETSc Initialize/Finalize only once to see if it fixes the error.
>>>>>>>>>>>>>>>>>> --Junchao Zhang
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Sat, Jun 20, 2020 at 12:48 AM Sam Guo <
>>>>>>>>>>>>>>>>>> sam.guo at cd-adapco.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> To clarify, I call PETSc initialize and PETSc finalize
>>>>>>>>>>>>>>>>>>> every time I call SLEPc:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>   PetscInitializeNoPointers(argc,args,nullptr,nullptr);
>>>>>>>>>>>>>>>>>>>   SlepcInitialize(&argc,&args,static_cast<char*>(nullptr),help);
>>>>>>>>>>>>>>>>>>>   // call SLEPc
>>>>>>>>>>>>>>>>>>>   SlepcFinalize();
>>>>>>>>>>>>>>>>>>>   PetscFinalize();
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Fri, Jun 19, 2020 at 10:32 PM Sam Guo <
>>>>>>>>>>>>>>>>>>> sam.guo at cd-adapco.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Dear PETSc team,
>>>>>>>>>>>>>>>>>>>>    When I called SLEPc multiple times, I eventually got the
>>>>>>>>>>>>>>>>>>>> following error:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> MPI operation not supported by PETSc's sequential MPI
>>>>>>>>>>>>>>>>>>>> wrappers
>>>>>>>>>>>>>>>>>>>> [0]PETSC ERROR: #1 PetscInitialize() line 967 in
>>>>>>>>>>>>>>>>>>>> ../../../petsc/src/sys/objects/pinit.c
>>>>>>>>>>>>>>>>>>>> [0]PETSC ERROR: #2 SlepcInitialize() line 262 in
>>>>>>>>>>>>>>>>>>>> ../../../slepc/src/sys/slepcinit.c
>>>>>>>>>>>>>>>>>>>> [0]PETSC ERROR: #3 SlepcInitializeNoPointers() line 359
>>>>>>>>>>>>>>>>>>>> in ../../../slepc/src/sys/slepcinit.c
>>>>>>>>>>>>>>>>>>>> PETSC ERROR: Logging has not been enabled.
>>>>>>>>>>>>>>>>>>>> You might have forgotten to call PetscInitialize().
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>   I debugged it: the failure comes from the following check
>>>>>>>>>>>>>>>>>>>> in petsc/src/sys/mpiuni/mpi.c:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> if (num_attr >= MAX_ATTR)
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> in function int
>>>>>>>>>>>>>>>>>>>> MPI_Comm_create_keyval(MPI_Copy_function *copy_fn,MPI_Delete_function
>>>>>>>>>>>>>>>>>>>> *delete_fn,int *keyval,void *extra_state)
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> num_attr is declared static and keeps increasing every
>>>>>>>>>>>>>>>>>>>> time MPI_Comm_create_keyval is called.
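>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> A stripped-down sketch of that pattern (not the actual
>>>>>>>>>>>>>>>>>>>> mpiuni source; MAX_ATTR's real value and the struct layout
>>>>>>>>>>>>>>>>>>>> differ):
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> #define MAX_ATTR 128
>>>>>>>>>>>>>>>>>>>> /* static: survives MPI_Finalize, never reset */
>>>>>>>>>>>>>>>>>>>> static int num_attr = 1;
>>>>>>>>>>>>>>>>>>>> static struct { void *extra_state; MPI_Delete_function *del; } attr_keyval[MAX_ATTR];
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> int MPI_Comm_create_keyval(MPI_Copy_function *copy_fn,
>>>>>>>>>>>>>>>>>>>>                            MPI_Delete_function *delete_fn,
>>>>>>>>>>>>>>>>>>>>                            int *keyval,void *extra_state)
>>>>>>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>>>>>>   /* each Initialize/Finalize cycle consumes slots until this trips */
>>>>>>>>>>>>>>>>>>>>   if (num_attr >= MAX_ATTR) return MPIUni_Abort(MPI_COMM_WORLD,1);
>>>>>>>>>>>>>>>>>>>>   attr_keyval[num_attr].extra_state = extra_state;
>>>>>>>>>>>>>>>>>>>>   attr_keyval[num_attr].del         = delete_fn;
>>>>>>>>>>>>>>>>>>>>   *keyval = num_attr++;
>>>>>>>>>>>>>>>>>>>>   return MPI_SUCCESS;
>>>>>>>>>>>>>>>>>>>> }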
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I am using petsc 3.11.3 but found 3.13.2 has the
>>>>>>>>>>>>>>>>>>>> same logic.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Is this a bug, or am I not using it correctly?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>> Sam
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>