[petsc-users] MatDestroy problem with multiple matrices and SUPERLU_DIST

Barry Smith bsmith at petsc.dev
Fri Apr 23 12:09:15 CDT 2021


   Thanks for looking. Do these modules have any "automatic freeing" when variables go out of scope (like C++ classes do)? 

    Do you create specific new MPI communicators to use when creating the matrices? 
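
    For illustration only, here is a hypothetical sketch (the helper name is made up, not taken from your code) of what I mean by giving a matrix its own communicator:

        #include <petscmat.h>

        PetscErrorCode CreateMatOnOwnComm(Mat *A)
        {
          MPI_Comm       newcomm;
          PetscErrorCode ierr;

          PetscFunctionBeginUser;
          /* duplicate the world communicator so this matrix lives on its own copy */
          ierr = MPI_Comm_dup(PETSC_COMM_WORLD, &newcomm);CHKERRQ(ierr);
          ierr = MatCreate(newcomm, A);CHKERRQ(ierr);
          /* ... set sizes and type, assemble, use; later MatDestroy() and MPI_Comm_free() ... */
          PetscFunctionReturn(0);
        }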

    Have you tried MPICH or a different version of OpenMPI? 

    Maybe run the program with valgrind. The stack frames you sent look "funny"; that is, I would not normally expect them to appear in that order.

   Barry


> On Apr 23, 2021, at 8:31 AM, Deij-van Rijswijk, Menno <M.Deij at marin.nl> wrote:
> 
> 
> Hi Barry,
>  
> Thank you for looking into this. The code I'm referring to is part of a larger Fortran module, and I have tried to isolate the problem in a reproducible test case. Unfortunately, I have not been able to reproduce the problem there. It is probably a subtle bug or misuse on our part, but for now I can't pinpoint it.
>  
> Menno
>  
> 
> dr. ir. Menno A. Deij-van Rijswijk | Researcher | Research & Development
> MARIN | T +31 317 49 35 06 | M.Deij at marin.nl <mailto:M.Deij at marin.nl> | www.marin.nl <http://www.marin.nl/>
> 
> 
> 
> 
> From: Barry Smith <bsmith at petsc.dev <mailto:bsmith at petsc.dev>> 
> Sent: Friday, April 23, 2021 1:18 AM
> To: Deij-van Rijswijk, Menno <M.Deij at marin.nl <mailto:M.Deij at marin.nl>>
> Cc: petsc-users at mcs.anl.gov <mailto:petsc-users at mcs.anl.gov>
> Subject: Re: [petsc-users] MatDestroy problem with multiple matrices and SUPERLU_DIST
>  
>  
>    Please send a code that reproduces the problem. I cannot reproduce it.  It is absolutely expected to work, but there may be an untested corner case you have hit on.
>  
>    The comment 
>  
> /*  This allows reusing the Superlu_DIST communicator and grid when only a single SuperLU_DIST matrix is used at a time */
>  
> means that __IF__ only a single SuperLU_DIST matrix is used at a time, then the SuperLU_DIST communicator and grid ARE reused for the next one created after the first has been destroyed. If multiple SuperLU_DIST matrices are needed at the same time, then each gets its own SuperLU_DIST communicator and grid and there is no reuse. But this does not mean you cannot have as many matrices and solvers outstanding at the same time as you want.
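> 
> For reference, below is a minimal sketch of that pattern; it is not your code and not an official PETSc example, just an illustration: two toy 1D Laplacian matrices, each with its own KSP/PCLU solver backed by SuperLU_DIST, both alive at the same time and then destroyed in turn. The helper name BuildLaplacian and the sizes are made up.
> 
> #include <petscksp.h>
> 
> static PetscErrorCode BuildLaplacian(MPI_Comm comm, PetscInt n, Mat *A)
> {
>   PetscErrorCode ierr;
>   PetscInt       i, rstart, rend;
> 
>   PetscFunctionBeginUser;
>   ierr = MatCreateAIJ(comm, PETSC_DECIDE, PETSC_DECIDE, n, n, 3, NULL, 3, NULL, A);CHKERRQ(ierr);
>   ierr = MatGetOwnershipRange(*A, &rstart, &rend);CHKERRQ(ierr);
>   for (i = rstart; i < rend; i++) {
>     ierr = MatSetValue(*A, i, i, 2.0, INSERT_VALUES);CHKERRQ(ierr);
>     if (i > 0)     { ierr = MatSetValue(*A, i, i - 1, -1.0, INSERT_VALUES);CHKERRQ(ierr); }
>     if (i < n - 1) { ierr = MatSetValue(*A, i, i + 1, -1.0, INSERT_VALUES);CHKERRQ(ierr); }
>   }
>   ierr = MatAssemblyBegin(*A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
>   ierr = MatAssemblyEnd(*A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
>   PetscFunctionReturn(0);
> }
> 
> int main(int argc, char **argv)
> {
>   Mat            A, B;
>   KSP            kspA, kspB;
>   PC             pc;
>   Vec            x, rhs;
>   PetscErrorCode ierr;
> 
>   ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;
>   ierr = BuildLaplacian(PETSC_COMM_WORLD, 100, &A);CHKERRQ(ierr);
>   ierr = BuildLaplacian(PETSC_COMM_WORLD, 100, &B);CHKERRQ(ierr);
> 
>   /* first solver: LU factorization delegated to SuperLU_DIST */
>   ierr = KSPCreate(PETSC_COMM_WORLD, &kspA);CHKERRQ(ierr);
>   ierr = KSPSetOperators(kspA, A, A);CHKERRQ(ierr);
>   ierr = KSPGetPC(kspA, &pc);CHKERRQ(ierr);
>   ierr = PCSetType(pc, PCLU);CHKERRQ(ierr);
>   ierr = PCFactorSetMatSolverType(pc, MATSOLVERSUPERLU_DIST);CHKERRQ(ierr);
> 
>   /* second solver, also SuperLU_DIST, set up while the first is still in use */
>   ierr = KSPCreate(PETSC_COMM_WORLD, &kspB);CHKERRQ(ierr);
>   ierr = KSPSetOperators(kspB, B, B);CHKERRQ(ierr);
>   ierr = KSPGetPC(kspB, &pc);CHKERRQ(ierr);
>   ierr = PCSetType(pc, PCLU);CHKERRQ(ierr);
>   ierr = PCFactorSetMatSolverType(pc, MATSOLVERSUPERLU_DIST);CHKERRQ(ierr);
> 
>   ierr = MatCreateVecs(A, &x, &rhs);CHKERRQ(ierr);
>   ierr = VecSet(rhs, 1.0);CHKERRQ(ierr);
>   ierr = KSPSolve(kspA, rhs, x);CHKERRQ(ierr);   /* factors A with SuperLU_DIST */
>   ierr = KSPSolve(kspB, rhs, x);CHKERRQ(ierr);   /* factors B; both factorizations now exist */
> 
>   /* tear everything down in turn */
>   ierr = KSPDestroy(&kspA);CHKERRQ(ierr);
>   ierr = KSPDestroy(&kspB);CHKERRQ(ierr);
>   ierr = VecDestroy(&x);CHKERRQ(ierr);
>   ierr = VecDestroy(&rhs);CHKERRQ(ierr);
>   ierr = MatDestroy(&A);CHKERRQ(ierr);
>   ierr = MatDestroy(&B);CHKERRQ(ierr);
>   ierr = PetscFinalize();
>   return ierr;
> }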
>  
>   Barry
>  
> Yes, there could be a subtle bug in our logic, but we need a test case to find it.
>  
> 
> 
> On Apr 21, 2021, at 8:06 AM, Deij-van Rijswijk, Menno <M.Deij at marin.nl <mailto:M.Deij at marin.nl>> wrote:
>  
>  
> Good afternoon,
>  
> In our code we're using two matrices that are preconditioned using SUPERLU_DIST. Upon freeing the second matrix with MatDestroy, the program segfaults with the stack trace below. This happens with PETSc versions 3.14.5 and 3.15.0, whereas version 3.11.2 does not have this problem. In the source file src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c I see that Petsc_Superlu_dist_keyval_Delete_Fn has been added between 3.11.2 and 3.14.5, and the comment reads that it allows reusing the communicator when only a single matrix is used at a time. I'm wondering whether using multiple matrices with SUPERLU_DIST is problematic here?
>  
> Note: this happens both on a single MPI process and on multiple MPI processes.
>  
> Best regards,
>  
>  
> Menno Deij - van Rijswijk
>  
>  
>  
> #0  0x000015554ffc7db4 in ompi_comm_free () from /home/mdeij/install-gnu/extLibs/lib/libmpi.so.40
> #1  0x0000155550021536 in PMPI_Comm_free () from /home/mdeij/install-gnu/extLibs/lib/libmpi.so.40
> #2  0x00001555534262ba in superlu_gridexit (grid=0x3689da0)
>     at /home/mdeij/install-gnu/extLibs/Linux-x86_64-Intel/superlu_dist-6.3.0/SRC/superlu_grid.c:174
> #3  0x0000155553df094b in Petsc_Superlu_dist_keyval_Delete_Fn (comm=0x26a4e50, keyval=16, attr_val=0x3689d90,
>     extra_state=0x0)
>     at /home/mdeij/build-libs-gnu/superbuild/petsc/src/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c:97
> #4  0x000015554ffc145c in ompi_attr_delete_impl () from /home/mdeij/install-gnu/extLibs/lib/libmpi.so.40
> #5  0x000015554ffc3fdf in ompi_attr_delete_all () from /home/mdeij/install-gnu/extLibs/lib/libmpi.so.40
> #6  0x000015554ffc7ca7 in ompi_comm_free () from /home/mdeij/install-gnu/extLibs/lib/libmpi.so.40
> #7  0x0000155550021536 in PMPI_Comm_free () from /home/mdeij/install-gnu/extLibs/lib/libmpi.so.40
> #8  0x00001555538f31c5 in PetscCommDestroy (comm=0x29712e0)
>     at /home/mdeij/build-libs-gnu/superbuild/petsc/src/src/sys/objects/tagm.c:217
> #9  0x00001555538f55ab in PetscHeaderDestroy_Private (h=0x29712a0)
>     at /home/mdeij/build-libs-gnu/superbuild/petsc/src/src/sys/objects/inherit.c:121
> #10 0x0000155553b44626 in MatDestroy (A=0x26e1c98)
>     at /home/mdeij/build-libs-gnu/superbuild/petsc/src/src/mat/interface/matrix.c:1310
> #11 0x0000155553b87a86 in matdestroy_ (x=0x26e1c98, ierr=0x7fffffffc9cc)
>  
> 
> dr. ir. Menno A. Deij-van Rijswijk | Researcher | Research & Development
> MARIN | T +31 317 49 35 06 | M.Deij at marin.nl <mailto:M.Deij at marin.nl> | www.marin.nl <http://www.marin.nl/>
> 
>  
>  
> 
> 


