running out of MPI communicators with Prometheus

Stephan Kramer stephan.kramer at imperial.ac.uk
Thu Aug 14 12:17:54 CDT 2008


Hello

When running with Prometheus in parallel we get the following error 
after about 1000 solves:

 Fatal error in MPI_Comm_dup: Other MPI error, error stack:
 MPI_Comm_dup(176)..: MPI_Comm_dup(comm=0x84000002,
               new_comm=0x36901e8c) failed
 MPIR_Comm_copy(547): Too many communicators

So it seems like an MPI_Comm_dup that is not matched by an MPI_Comm_free. Our 
model does a number of solves per time step, and for each solve all 
PETSc objects (KSP, Mat, PC and Vec) are created and destroyed again. 
The error only occurs when we switch to Prometheus and run in parallel 
(there are no problems with Prometheus in serial). Does anyone have 
suggestions on how to track down the problem, or has anyone seen 
something similar? I'm fairly sure we're not missing any 
KSPDestroy/PCDestroy/VecDestroy or MatDestroy calls.
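For reference, a minimal sketch of the per-solve pattern described above 
(not the actual model code; assemble_system() and the matrix sizes are 
placeholders, the calls are written against a newer PETSc C API than the 
one in use here, and -pc_type prometheus stands in for however the 
preconditioner is actually selected):

 #include <petscksp.h>

 /* Hypothetical assembly routine standing in for the model's own code. */
 extern PetscErrorCode assemble_system(Mat A, Vec b);

 PetscErrorCode do_one_solve(MPI_Comm comm, PetscInt n, Vec x)
 {
   Mat            A;
   Vec            b;
   KSP            ksp;
   PetscErrorCode ierr;

   /* Every solve creates its own operator, right-hand side and solver... */
   ierr = MatCreateAIJ(comm, PETSC_DECIDE, PETSC_DECIDE, n, n,
                       30, NULL, 30, NULL, &A);CHKERRQ(ierr);
   ierr = VecCreateMPI(comm, PETSC_DECIDE, n, &b);CHKERRQ(ierr);
   ierr = assemble_system(A, b);CHKERRQ(ierr);

   ierr = KSPCreate(comm, &ksp);CHKERRQ(ierr);
   ierr = KSPSetOperators(ksp, A, A);CHKERRQ(ierr);
   ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);  /* e.g. -pc_type prometheus */
   ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);

   /* ...and destroys them all again afterwards; the PC owned by the KSP
      is destroyed along with it.  If a preconditioner calls MPI_Comm_dup()
      internally during setup, that communicator has to be freed at this
      point, otherwise every solve leaks one until MPI runs out. */
   ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
   ierr = MatDestroy(&A);CHKERRQ(ierr);
   ierr = VecDestroy(&b);CHKERRQ(ierr);
   return 0;
 }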

Cheers
Stephan Kramer