[mpich-discuss] Assertion failure from too many MPI_Gets between fences
Jeremiah Willcock
jewillco at osl.iu.edu
Fri Jan 7 12:56:34 CST 2011
When I run a large number of MPI_Get operations (8 bytes each) between two
MPI_Fences, I sometimes receive the error:
Assertion failed in file ch3_istartmsg.c at line 90: sreq != NULL
internal ABORT - process 0
on some or all ranks. I am using the SVN head version currently, but the
same error (and same line number) occurred with 1.3.1. I am running two
processes on one machine using "mpiexec -n 2 app"; the platform is x86-64
Linux (RHEL 5.5, gcc 4.1.2). The failure seems to need about 260k MPI_Get
operations; fewer appear to work fine, but the exact number at which the
error appears varies. The kind of code I am using is:
MPI_Win_fence(MPI_MODE_NOPRECEDE, win);
size_t i;
for (i = 0; i < count; ++i) {
    MPI_Get(..., 1, MPI_INT64_T, owner, index, 1, MPI_INT64_T, win);
}
MPI_Win_fence(MPI_MODE_NOSUCCEED, win);
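For anyone trying to reproduce this, a self-contained test along the following lines might help. It is only a sketch under assumptions not stated above: the window setup, the choice of owner as the next rank, the use of the loop index as the target displacement, the default of 300000 gets, and the optional batch argument are all hypothetical. The helper batch_end and the USE_MPI guard are likewise invented for illustration. With batch equal to count it issues every MPI_Get inside a single fence epoch, as in the fragment above; a smaller batch inserts intermediate fences, which may help show whether the failure depends on how many gets are outstanding per epoch.

```c
/* Hypothetical reproducer sketch, not the original application.
 * Build the MPI part with: mpicc -DUSE_MPI repro.c -o app */
#include <assert.h>
#include <stddef.h>

/* End (exclusive) of the batch that begins at `start`, clamped to `count`. */
static size_t batch_end(size_t start, size_t count, size_t batch) {
    assert(batch > 0 && start <= count);
    return (count - start < batch) ? count : start + batch;
}

#ifdef USE_MPI
#include <mpi.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    int rank, size, owner;
    size_t count, batch, i, start, end, bad = 0;
    int64_t *exposed, *local;
    MPI_Win win;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Assumed defaults: 300000 gets, all in one epoch (batch == count). */
    count = (argc > 1) ? strtoul(argv[1], NULL, 10) : 300000;
    batch = (argc > 2) ? strtoul(argv[2], NULL, 10) : count;
    if (count == 0 || batch == 0) MPI_Abort(MPI_COMM_WORLD, 1);

    exposed = malloc(count * sizeof *exposed);  /* memory other ranks read */
    local = malloc(count * sizeof *local);      /* destination of the gets */
    if (!exposed || !local) MPI_Abort(MPI_COMM_WORLD, 1);
    for (i = 0; i < count; ++i) exposed[i] = (int64_t)i + rank;

    /* Displacement unit is one int64_t, so element i sits at target_disp i. */
    MPI_Win_create(exposed, (MPI_Aint)(count * sizeof *exposed),
                   (int)sizeof *exposed, MPI_INFO_NULL, MPI_COMM_WORLD, &win);
    owner = (rank + 1) % size;  /* assumption: read from the next rank */

    start = 0;
    MPI_Win_fence(MPI_MODE_NOPRECEDE, win);
    while (start < count) {
        end = batch_end(start, count, batch);
        for (i = start; i < end; ++i) {
            MPI_Get(&local[i], 1, MPI_INT64_T, owner, (MPI_Aint)i,
                    1, MPI_INT64_T, win);
        }
        start = end;
        /* Only the final fence may assert that no RMA epoch follows. */
        MPI_Win_fence(start == count ? MPI_MODE_NOSUCCEED : 0, win);
    }

    /* Each fetched element should equal its index plus the owner's rank. */
    for (i = 0; i < count; ++i)
        if (local[i] != (int64_t)i + owner) ++bad;
    printf("rank %d: %lu gets, %lu mismatches\n",
           rank, (unsigned long)count, (unsigned long)bad);

    MPI_Win_free(&win);
    free(exposed);
    free(local);
    MPI_Finalize();
    return bad != 0;
}
#endif
```

Running "mpiexec -n 2 app 300000" would follow the single-epoch pattern, while "mpiexec -n 2 app 300000 100000" would split the same gets across three epochs; every rank must be given the same arguments because MPI_Win_fence is collective.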
Is this a known issue? Is there any other information I need to provide?
Thank you for your help.
-- Jeremiah Willcock