<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p>Hi Satish and Junchao,</p>
<p>I just tried replacing all occurrences of MPI_COMM_WORLD with
PETSC_COMM_WORLD, but it didn't do the trick. One thing that
interests me is that I ran with 40 ranks but only 2 ranks reported
the communicator error, which suggests that at least the other 38
ranks freed their communicators properly.</p>
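<p>For reference, the change was essentially the following (a
simplified, illustrative sketch rather than my actual setup code;
"system_matrix" is just a placeholder name for the Mat being
created):</p>
<p>Mat system_matrix; /* placeholder name */<br>
// before: PETSc objects were created directly on the MPI world communicator<br>
// MatCreate(MPI_COMM_WORLD, &system_matrix);<br>
// after: use PETSc's own world communicator instead<br>
MatCreate(PETSC_COMM_WORLD, &system_matrix);<br>
</p>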
<p>Thanks!</p>
<p>Feimi<br>
</p>
<div class="moz-cite-prefix">On 8/18/21 4:53 PM, Junchao Zhang
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CA+MQGp_PsC5bNr461Ehc5t81cy-PxzVmy5MNhKGMGS7NnPFRew@mail.gmail.com">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<div dir="ltr">
<div>Hi, Feimi,</div>
<div> I need to consult Jed (cc'ed).</div>
Jed, is this an example of <a
href="https://lists.mcs.anl.gov/mailman/htdig/petsc-dev/2018-April/thread.html#22663"
moz-do-not-send="true">https://lists.mcs.anl.gov/mailman/htdig/petsc-dev/2018-April/thread.html#22663</a>?
If Feimi really cannot free the matrices, then we just need to
attach a hypre communicator to a PETSc inner communicator and pass
that to hypre.
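<div><br>
</div>
<div>Roughly, what I have in mind is the sketch below (untested; the
function name GetHypreComm, the keyval name, and the error checking
are only illustrative): cache one hypre-side duplicate of PETSc's
inner communicator as an MPI attribute, so repeated
MatCreate_HYPRE() calls reuse it instead of calling MPI_Comm_dup()
each time.</div>
<div><br>
</div>
<div>static PetscMPIInt Petsc_Hypre_keyval = MPI_KEYVAL_INVALID; /* illustrative name */<br>
<br>
static PetscErrorCode GetHypreComm(MPI_Comm user_comm, MPI_Comm *hcomm)<br>
{<br>
  MPI_Comm inner, *cached;<br>
  PetscMPIInt tag, flag;<br>
  PetscErrorCode ierr;<br>
<br>
  /* create the attribute key once */<br>
  if (Petsc_Hypre_keyval == MPI_KEYVAL_INVALID) {<br>
    ierr = MPI_Comm_create_keyval(MPI_COMM_NULL_COPY_FN, MPI_COMM_NULL_DELETE_FN, &Petsc_Hypre_keyval, NULL);CHKERRQ(ierr);<br>
  }<br>
  /* grab PETSc's cached inner communicator; tag is unused here, and the matching PetscCommDestroy() is omitted in this sketch */<br>
  ierr = PetscCommDuplicate(user_comm, &inner, &tag);CHKERRQ(ierr);<br>
  ierr = MPI_Comm_get_attr(inner, Petsc_Hypre_keyval, &cached, &flag);CHKERRQ(ierr);<br>
  if (!flag) {<br>
    /* first use on this communicator: dup once and stash it as an attribute */<br>
    ierr = PetscMalloc1(1, &cached);CHKERRQ(ierr);<br>
    ierr = MPI_Comm_dup(inner, cached);CHKERRQ(ierr);<br>
    ierr = MPI_Comm_set_attr(inner, Petsc_Hypre_keyval, cached);CHKERRQ(ierr);<br>
  }<br>
  *hcomm = *cached;<br>
  return 0;<br>
}<br>
</div>
<div><br>
</div>
<div>Then MatCreate_HYPRE() would ask for this cached communicator
instead of duplicating a new one every time, so the total number of
communicators stays bounded.</div>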
<div>
<div>
<div><br clear="all">
<div>
<div dir="ltr" class="gmail_signature"
data-smartmail="gmail_signature">
<div dir="ltr">--Junchao Zhang</div>
</div>
</div>
<br>
</div>
</div>
</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Wed, Aug 18, 2021 at 3:38
PM Satish Balay <<a href="mailto:balay@mcs.anl.gov"
moz-do-not-send="true">balay@mcs.anl.gov</a>> wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Is
the communicator used to create PETSc objects MPI_COMM_WORLD?<br>
<br>
If so - try changing it to PETSC_COMM_WORLD<br>
<br>
Satish<br>
<br>
On Wed, 18 Aug 2021, Feimi Yu wrote:<br>
<br>
> Hi Junchao,<br>
> <br>
> Thank you for the suggestion! I'm using the deal.II wrapper<br>
> dealii::PETScWrappers::PreconditionBase to handle the PETSc preconditioners,<br>
> and the wrapper does the destroy when the preconditioner is reinitialized or<br>
> goes out of scope. I just double-checked; this is called to make sure the old<br>
> matrices are destroyed:<br>
> <br>
> void<br>
> PreconditionBase::clear()<br>
> {<br>
> matrix = nullptr;<br>
> <br>
> if (pc != nullptr)<br>
> {<br>
> PetscErrorCode ierr = PCDestroy(&pc);<br>
> pc = nullptr;<br>
> AssertThrow(ierr == 0, ExcPETScError(ierr));<br>
> }<br>
> }<br>
> <br>
> Thanks!<br>
> <br>
> Feimi<br>
> <br>
> On 8/18/21 4:23 PM, Junchao Zhang wrote:<br>
> ><br>
> ><br>
> ><br>
> > On Wed, Aug 18, 2021 at 12:52 PM Feimi Yu <<a
href="mailto:yuf2@rpi.edu" target="_blank"
moz-do-not-send="true">yuf2@rpi.edu</a><br>
> > <mailto:<a href="mailto:yuf2@rpi.edu"
target="_blank" moz-do-not-send="true">yuf2@rpi.edu</a>>>
wrote:<br>
> ><br>
> > Hi,<br>
> ><br>
> > I was trying to run a simulation with a
PETSc-wrapped Hypre<br>
> > preconditioner, and encountered this problem:<br>
> ><br>
> > [dcs122:133012] Out of resources: all 4095
communicator IDs have<br>
> > been used.<br>
> > [19]PETSC ERROR: --------------------- Error
Message<br>
> >
--------------------------------------------------------------<br>
> > [19]PETSC ERROR: General MPI error<br>
> > [19]PETSC ERROR: MPI error 17 MPI_ERR_INTERN:
internal error<br>
> > [19]PETSC ERROR: See<br>
> > <a href="https://www.mcs.anl.gov/petsc/documentation/faq.html" rel="noreferrer" target="_blank" moz-do-not-send="true">https://www.mcs.anl.gov/petsc/documentation/faq.html</a> for trouble shooting.<br>
> > [19]PETSC ERROR: Petsc Release Version 3.15.2,
unknown<br>
> > [19]PETSC ERROR: ./main on a arch-linux-c-opt
named dcs122 by<br>
> > CFSIfmyu Wed Aug 11 19:51:47 2021<br>
> > [19]PETSC ERROR: [dcs122:133010] Out of
resources: all 4095<br>
> > communicator IDs have been used.<br>
> > [18]PETSC ERROR: --------------------- Error
Message<br>
> >
--------------------------------------------------------------<br>
> > [18]PETSC ERROR: General MPI error<br>
> > [18]PETSC ERROR: MPI error 17 MPI_ERR_INTERN:
internal error<br>
> > [18]PETSC ERROR: See<br>
> > <a href="https://www.mcs.anl.gov/petsc/documentation/faq.html" rel="noreferrer" target="_blank" moz-do-not-send="true">https://www.mcs.anl.gov/petsc/documentation/faq.html</a> for trouble shooting.<br>
> > [18]PETSC ERROR: Petsc Release Version 3.15.2,
unknown<br>
> > [18]PETSC ERROR: ./main on a arch-linux-c-opt
named dcs122 by<br>
> > CFSIfmyu Wed Aug 11 19:51:47 2021<br>
> > [18]PETSC ERROR: Configure options
--download-scalapack<br>
> > --download-mumps --download-hypre
--with-cc=mpicc<br>
> > --with-cxx=mpicxx --with-fc=mpif90
--with-cudac=0<br>
> > --with-debugging=0<br>
> >
--with-blaslapack-dir=/gpfs/u/home/CFSI/CFSIfmyu/barn-shared/dcs-rh8/lapack-build/<br>
> > [18]PETSC ERROR: #1 MatCreate_HYPRE() at<br>
> >
/gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/impls/hypre/mhypre.c:2120<br>
> > [18]PETSC ERROR: #2 MatSetType() at<br>
> >
/gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/interface/matreg.c:91<br>
> > [18]PETSC ERROR: #3 MatConvert_AIJ_HYPRE() at<br>
> >
/gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/impls/hypre/mhypre.c:392<br>
> > [18]PETSC ERROR: #4 MatConvert() at<br>
> >
/gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/interface/matrix.c:4439<br>
> > [18]PETSC ERROR: #5 PCSetUp_HYPRE() at<br>
> >
/gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/ksp/pc/impls/hypre/hypre.c:240<br>
> > [18]PETSC ERROR: #6 PCSetUp() at<br>
> >
/gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/ksp/pc/interface/precon.c:1015<br>
> > Configure options --download-scalapack
--download-mumps<br>
> > --download-hypre --with-cc=mpicc
--with-cxx=mpicxx<br>
> > --with-fc=mpif90 --with-cudac=0
--with-debugging=0<br>
> >
--with-blaslapack-dir=/gpfs/u/home/CFSI/CFSIfmyu/barn-shared/dcs-rh8/lapack-build/<br>
> > [19]PETSC ERROR: #1 MatCreate_HYPRE() at<br>
> >
/gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/impls/hypre/mhypre.c:2120<br>
> > [19]PETSC ERROR: #2 MatSetType() at<br>
> >
/gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/interface/matreg.c:91<br>
> > [19]PETSC ERROR: #3 MatConvert_AIJ_HYPRE() at<br>
> >
/gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/impls/hypre/mhypre.c:392<br>
> > [19]PETSC ERROR: #4 MatConvert() at<br>
> >
/gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/interface/matrix.c:4439<br>
> > [19]PETSC ERROR: #5 PCSetUp_HYPRE() at<br>
> >
/gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/ksp/pc/impls/hypre/hypre.c:240<br>
> > [19]PETSC ERROR: #6 PCSetUp() at<br>
> >
/gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/ksp/pc/interface/precon.c:1015<br>
> ><br>
> > It seems that the MPI_Comm_dup() at<br>
> > petsc/src/mat/impls/hypre/mhypre.c:2120 caused the problem. Since<br>
> > mine is a time-dependent problem, MatCreate_HYPRE() is called<br>
> > every time the new system matrix is assembled. The above error<br>
> > message is reported after ~4095 calls to MatCreate_HYPRE(), which<br>
> > is around 455 time steps in my code. Here is some basic compiler<br>
> > and MPI information:<br>
> ><br>
> > Can you destroy old matrices to free their MPI communicators? Otherwise, you run<br>
> > into a limitation we already know about.<br>
> ><br>
> > IBM Spectrum MPI 10.4.0<br>
> ><br>
> > GCC 8.4.1<br>
> ><br>
> > I've never had this problem before with the OpenMPI or MPICH<br>
> > implementations, so I was wondering whether this can be resolved from my<br>
> > end, or whether it's an implementation-specific problem.<br>
> ><br>
> > Thanks!<br>
> ><br>
> > Feimi<br>
> ><br>
> <br>
> </blockquote>
</div>
</blockquote>
</body>
</html>