[petsc-users] Reaching limit number of communicator with Spectrum MPI

Feimi Yu yuf2 at rpi.edu
Wed Aug 18 12:52:30 CDT 2021


Hi,

I was trying to run a simulation with a PETSc-wrapped Hypre 
preconditioner and encountered the following error:

[dcs122:133012] Out of resources: all 4095 communicator IDs have been used.
[19]PETSC ERROR: --------------------- Error Message 
--------------------------------------------------------------
[19]PETSC ERROR: General MPI error
[19]PETSC ERROR: MPI error 17 MPI_ERR_INTERN: internal error
[19]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
[19]PETSC ERROR: Petsc Release Version 3.15.2, unknown
[19]PETSC ERROR: ./main on a arch-linux-c-opt named dcs122 by CFSIfmyu 
Wed Aug 11 19:51:47 2021
[19]PETSC ERROR: [dcs122:133010] Out of resources: all 4095 communicator 
IDs have been used.
[18]PETSC ERROR: --------------------- Error Message 
--------------------------------------------------------------
[18]PETSC ERROR: General MPI error
[18]PETSC ERROR: MPI error 17 MPI_ERR_INTERN: internal error
[18]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
[18]PETSC ERROR: Petsc Release Version 3.15.2, unknown
[18]PETSC ERROR: ./main on a arch-linux-c-opt named dcs122 by CFSIfmyu 
Wed Aug 11 19:51:47 2021
[18]PETSC ERROR: Configure options --download-scalapack --download-mumps 
--download-hypre --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 
--with-cudac=0 --with-debugging=0 
--with-blaslapack-dir=/gpfs/u/home/CFSI/CFSIfmyu/barn-shared/dcs-rh8/lapack-build/
[18]PETSC ERROR: #1 MatCreate_HYPRE() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/impls/hypre/mhypre.c:2120
[18]PETSC ERROR: #2 MatSetType() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/interface/matreg.c:91
[18]PETSC ERROR: #3 MatConvert_AIJ_HYPRE() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/impls/hypre/mhypre.c:392
[18]PETSC ERROR: #4 MatConvert() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/interface/matrix.c:4439
[18]PETSC ERROR: #5 PCSetUp_HYPRE() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/ksp/pc/impls/hypre/hypre.c:240
[18]PETSC ERROR: #6 PCSetUp() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/ksp/pc/interface/precon.c:1015
Configure options --download-scalapack --download-mumps --download-hypre 
--with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 --with-cudac=0 
--with-debugging=0 
--with-blaslapack-dir=/gpfs/u/home/CFSI/CFSIfmyu/barn-shared/dcs-rh8/lapack-build/
[19]PETSC ERROR: #1 MatCreate_HYPRE() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/impls/hypre/mhypre.c:2120
[19]PETSC ERROR: #2 MatSetType() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/interface/matreg.c:91
[19]PETSC ERROR: #3 MatConvert_AIJ_HYPRE() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/impls/hypre/mhypre.c:392
[19]PETSC ERROR: #4 MatConvert() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/interface/matrix.c:4439
[19]PETSC ERROR: #5 PCSetUp_HYPRE() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/ksp/pc/impls/hypre/hypre.c:240
[19]PETSC ERROR: #6 PCSetUp() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/ksp/pc/interface/precon.c:1015

It seems that the MPI_Comm_dup() call at petsc/src/mat/impls/hypre/mhypre.c:2120 
is causing the problem. Since mine is a time-dependent problem, 
MatCreate_HYPRE() is called every time a new system matrix is assembled. 
The error above is reported after ~4095 calls of MatCreate_HYPRE(), which 
corresponds to roughly 455 time steps in my code. Here is some basic 
information about my toolchain:

IBM Spectrum MPI 10.4.0

GCC 8.4.1
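
For reference, here is a minimal sketch of the pattern I think triggers this. 
It is not my actual application code: the diagonal AIJ matrix, its size, and 
the step count are made up for illustration. The point is that a fresh system 
matrix is assembled every time step and solved with a Hypre-preconditioned 
KSP, so every PCSetUp() converts it to MATHYPRE via MatCreate_HYPRE():

#include <petscksp.h>

int main(int argc, char **argv)
{
  KSP            ksp;
  PC             pc;
  PetscInt       n = 100, nsteps = 500, step;  /* made-up size and step count */
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;

  /* one KSP/PC reused for the whole run */
  ierr = KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);
  ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
  ierr = PCSetType(pc, PCHYPRE);CHKERRQ(ierr);

  for (step = 0; step < nsteps; ++step) {
    Mat      A;
    Vec      x, b;
    PetscInt i, rstart, rend;

    /* assemble a fresh system matrix for this time step */
    ierr = MatCreateAIJ(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, n, n,
                        1, NULL, 0, NULL, &A);CHKERRQ(ierr);
    ierr = MatGetOwnershipRange(A, &rstart, &rend);CHKERRQ(ierr);
    for (i = rstart; i < rend; ++i) {
      PetscScalar v = 2.0 + step;  /* pretend the entries change in time */
      ierr = MatSetValues(A, 1, &i, 1, &i, &v, INSERT_VALUES);CHKERRQ(ierr);
    }
    ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatCreateVecs(A, &x, &b);CHKERRQ(ierr);
    ierr = VecSet(b, 1.0);CHKERRQ(ierr);

    /* PCSetUp_HYPRE converts A (AIJ) to MATHYPRE through MatCreate_HYPRE(),
       which calls MPI_Comm_dup(); with Spectrum MPI the run dies once all
       4095 communicator IDs have been used */
    ierr = KSPSetOperators(ksp, A, A);CHKERRQ(ierr);
    ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);

    ierr = MatDestroy(&A);CHKERRQ(ierr);
    ierr = VecDestroy(&x);CHKERRQ(ierr);
    ierr = VecDestroy(&b);CHKERRQ(ierr);
  }

  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}

The real code is of course much more involved, but the solve loop has the 
same create/assemble/solve structure every time step.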

I've never had this problem with the OpenMPI or MPICH implementations, 
so I was wondering whether this can be resolved on my end, or whether it 
is an implementation-specific problem.

Thanks!

Feimi



More information about the petsc-users mailing list