[petsc-users] Reaching limit number of communicator with Spectrum MPI

Junchao Zhang junchao.zhang at gmail.com
Wed Aug 18 15:53:22 CDT 2021


Hi, Feimi,
  I need to consult Jed (cc'ed).
  Jed, is this an example of
https://lists.mcs.anl.gov/mailman/htdig/petsc-dev/2018-April/thread.html#22663?
If Feimi really cannot free the matrices, then we just need to attach a
hypre comm to a PETSc inner comm, and pass that to hypre.
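
To make the idea concrete, here is a toy model in plain C. It makes no MPI,
PETSc, or hypre calls; every name in it (ToyComm, toy_comm_dup,
toy_get_inner_comm, and so on) is hypothetical and exists only for this
sketch. The one number taken from the thread is the 4095-ID limit in the
error message. It contrasts duplicating a communicator for every matrix
with duplicating once and caching the result on the parent.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: nothing here calls MPI, PETSc, or hypre. */
#define MAX_COMM_IDS 4095 /* limit quoted in the Spectrum MPI error message */

typedef struct {
    int id;       /* communicator ID taken from the pool (illustrative) */
    int inner_id; /* ID of the cached inner duplicate, -1 if none yet */
} ToyComm;

int ids_in_use = 0; /* IDs handed out and never given back */

/* Stand-in for a communicator duplication: consumes one ID from the pool.
   Returns 0 on success, -1 once all IDs are in use. */
int toy_comm_dup(ToyComm *dup)
{
    assert(dup != NULL);
    if (ids_in_use >= MAX_COMM_IDS) return -1;
    dup->id = ids_in_use++;
    dup->inner_id = -1;
    return 0;
}

/* Pattern A: every matrix duplicates a communicator and never frees it.
   Returns how many matrices were created before the pool ran out. */
int create_dup_per_matrix(int n_matrices)
{
    int created = 0;
    for (int i = 0; i < n_matrices; ++i) {
        ToyComm c;
        if (toy_comm_dup(&c) != 0) break;
        ++created;
    }
    return created;
}

/* Pattern B: duplicate once, remember the result on the parent, reuse it. */
int toy_get_inner_comm(ToyComm *parent, int *inner_id)
{
    assert(parent != NULL && inner_id != NULL);
    if (parent->inner_id < 0) { /* first request: pay for one duplicate */
        ToyComm inner;
        if (toy_comm_dup(&inner) != 0) return -1;
        parent->inner_id = inner.id;
    }
    *inner_id = parent->inner_id; /* later requests reuse the cached ID */
    return 0;
}

/* Creates n_matrices "matrices" that all share the cached inner comm.
   Returns how many were created. */
int create_with_cached_comm(ToyComm *parent, int n_matrices)
{
    int created = 0;
    for (int i = 0; i < n_matrices; ++i) {
        int inner;
        if (toy_get_inner_comm(parent, &inner) != 0) break;
        ++created;
    }
    return created;
}
```

In this model, pattern A stops after 4095 creations, while pattern B uses
one extra ID no matter how many matrices are created. In real MPI code the
duplication would be MPI_Comm_dup(), and the caching step would presumably
use MPI attribute caching (MPI_Comm_create_keyval(), MPI_Comm_set_attr(),
MPI_Comm_get_attr()); how PETSc would wire that up for hypre is not settled
in this thread.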

--Junchao Zhang


On Wed, Aug 18, 2021 at 3:38 PM Satish Balay <balay at mcs.anl.gov> wrote:

> Is the communicator used to create PETSc objects MPI_COMM_WORLD?
>
> If so - try changing it to PETSC_COMM_WORLD
>
> Satish
>
>  On Wed, 18 Aug 2021, Feimi Yu wrote:
>
> > Hi Junchao,
> >
> > Thank you for the suggestion! I'm using the deal.II wrapper
> > dealii::PETScWrappers::PreconditionBase to handle the PETSc
> > preconditioners, and the wrapper destroys the preconditioner when it
> > is reinitialized or goes out of scope. I just double-checked; this is
> > called to make sure the old matrices are destroyed:
> >
> >    void
> >    PreconditionBase::clear()
> >    {
> >      matrix = nullptr;
> >
> >      if (pc != nullptr)
> >        {
> >          PetscErrorCode ierr = PCDestroy(&pc);
> >          pc                  = nullptr;
> >          AssertThrow(ierr == 0, ExcPETScError(ierr));
> >        }
> >    }
> >
> > Thanks!
> >
> > Feimi
> >
> > On 8/18/21 4:23 PM, Junchao Zhang wrote:
> > >
> > >
> > >
> > > On Wed, Aug 18, 2021 at 12:52 PM Feimi Yu <yuf2 at rpi.edu> wrote:
> > >
> > >     Hi,
> > >
> > >     I was trying to run a simulation with a PETSc-wrapped Hypre
> > >     preconditioner, and encountered this problem:
> > >
> > >     [dcs122:133012] Out of resources: all 4095 communicator IDs have been used.
> > >     [19]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
> > >     [19]PETSC ERROR: General MPI error
> > >     [19]PETSC ERROR: MPI error 17 MPI_ERR_INTERN: internal error
> > >     [19]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> > >     [19]PETSC ERROR: Petsc Release Version 3.15.2, unknown
> > >     [19]PETSC ERROR: ./main on a arch-linux-c-opt named dcs122 by CFSIfmyu Wed Aug 11 19:51:47 2021
> > >     [19]PETSC ERROR: [dcs122:133010] Out of resources: all 4095 communicator IDs have been used.
> > >     [18]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
> > >     [18]PETSC ERROR: General MPI error
> > >     [18]PETSC ERROR: MPI error 17 MPI_ERR_INTERN: internal error
> > >     [18]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> > >     [18]PETSC ERROR: Petsc Release Version 3.15.2, unknown
> > >     [18]PETSC ERROR: ./main on a arch-linux-c-opt named dcs122 by CFSIfmyu Wed Aug 11 19:51:47 2021
> > >     [18]PETSC ERROR: Configure options --download-scalapack --download-mumps --download-hypre --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 --with-cudac=0 --with-debugging=0 --with-blaslapack-dir=/gpfs/u/home/CFSI/CFSIfmyu/barn-shared/dcs-rh8/lapack-build/
> > >     [18]PETSC ERROR: #1 MatCreate_HYPRE() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/impls/hypre/mhypre.c:2120
> > >     [18]PETSC ERROR: #2 MatSetType() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/interface/matreg.c:91
> > >     [18]PETSC ERROR: #3 MatConvert_AIJ_HYPRE() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/impls/hypre/mhypre.c:392
> > >     [18]PETSC ERROR: #4 MatConvert() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/interface/matrix.c:4439
> > >     [18]PETSC ERROR: #5 PCSetUp_HYPRE() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/ksp/pc/impls/hypre/hypre.c:240
> > >     [18]PETSC ERROR: #6 PCSetUp() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/ksp/pc/interface/precon.c:1015
> > >     Configure options --download-scalapack --download-mumps --download-hypre --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 --with-cudac=0 --with-debugging=0 --with-blaslapack-dir=/gpfs/u/home/CFSI/CFSIfmyu/barn-shared/dcs-rh8/lapack-build/
> > >     [19]PETSC ERROR: #1 MatCreate_HYPRE() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/impls/hypre/mhypre.c:2120
> > >     [19]PETSC ERROR: #2 MatSetType() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/interface/matreg.c:91
> > >     [19]PETSC ERROR: #3 MatConvert_AIJ_HYPRE() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/impls/hypre/mhypre.c:392
> > >     [19]PETSC ERROR: #4 MatConvert() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/interface/matrix.c:4439
> > >     [19]PETSC ERROR: #5 PCSetUp_HYPRE() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/ksp/pc/impls/hypre/hypre.c:240
> > >     [19]PETSC ERROR: #6 PCSetUp() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/ksp/pc/interface/precon.c:1015
> > >
> > >     It seems that MPI_Comm_dup() at
> > >     petsc/src/mat/impls/hypre/mhypre.c:2120 caused the problem. Since
> > >     mine is a time-dependent problem, MatCreate_HYPRE() is called
> > >     every time the new system matrix is assembled. The above error
> > >     message is reported after ~4095 calls of MatCreate_HYPRE(), which
> > >     is around 455 time steps in my code. Here is some basic compiler
> > >     information:
> > >
> > > Can you destroy old matrices to free MPI communicators? Otherwise, you
> > > run into a limitation we knew before.
> > >
> > >     IBM Spectrum MPI 10.4.0
> > >
> > >     GCC 8.4.1
> > >
> > >     I've never had this problem before with the OpenMPI or MPICH
> > >     implementations, so I was wondering whether this can be resolved
> > >     from my end, or whether it's an implementation-specific problem.
> > >
> > >     Thanks!
> > >
> > >     Feimi
> > >
> >
> >