[mpich-discuss] Assertion failure from too many MPI_Gets between fences

Jeremiah Willcock jewillco at osl.iu.edu
Fri Jan 7 12:56:34 CST 2011


When I run a large number of MPI_Get operations (8 bytes each) between two 
MPI_Fences, I sometimes receive the error:

Assertion failed in file ch3_istartmsg.c at line 90: sreq != NULL
internal ABORT - process 0

on some or all ranks.  I am using the SVN head version currently, but the 
same error (and same line number) occurred with 1.3.1.  I am running two 
processes on one machine using "mpiexec -n 2 app"; the platform is x86-64 
Linux (RHEL 5.5, gcc 4.1.2).  The error seems to require about 260k MPI_Get 
operations; fewer appear to work fine, but the exact number needed to 
trigger it varies.  The kind of code I am using is:

MPI_Win_fence(MPI_MODE_NOPRECEDE, win);
size_t i;
for (i = 0; i < count; ++i) {
   MPI_Get(..., 1, MPI_INT64_T, owner, index, 1, MPI_INT64_T, win);
}
MPI_Win_fence(MPI_MODE_NOSUCCEED, win);

Is this a known issue?  Is there any other information I need to provide? 
Thank you for your help.

-- Jeremiah Willcock

