[MPICH] Simple MPI program crashes randomly

Rajeev Thakur thakur at mcs.anl.gov
Mon May 28 10:50:27 CDT 2007


If it crashes at the first send/recv, can you take out everything in the
code after the first send/recv and see whether it still crashes? That will
give you something smaller to debug.
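
While cutting the program down, it may also help to test whether the first
receive writes outside its buffer. The sketch below is plain C with no MPI
calls. The names guarded_alloc, guarded_intact, guarded_free, GUARD_BYTES
and GUARD_VALUE are hypothetical, made up for this illustration, and are not
part of MPICH. It pads a buffer with a known byte pattern on both sides so
that an overrun can be detected afterwards.

```c
#include <stdlib.h>
#include <string.h>

#define GUARD_BYTES 64     /* hypothetical guard size on each side */
#define GUARD_VALUE 0xA5   /* hypothetical fill pattern for the guards */

/* Allocate n usable bytes with GUARD_BYTES of a known pattern before and
   after them. Returns a pointer to the usable region, or NULL on failure. */
void *guarded_alloc(size_t n)
{
    unsigned char *raw = malloc(n + 2 * GUARD_BYTES);
    if (raw == NULL)
        return NULL;
    memset(raw, GUARD_VALUE, GUARD_BYTES);                   /* front guard */
    memset(raw + GUARD_BYTES + n, GUARD_VALUE, GUARD_BYTES); /* back guard */
    return raw + GUARD_BYTES;
}

/* Return 1 if both guard regions still hold the pattern, 0 if any guard
   byte was overwritten. n must be the size passed to guarded_alloc. */
int guarded_intact(const void *p, size_t n)
{
    const unsigned char *user = p;
    const unsigned char *front = user - GUARD_BYTES;
    const unsigned char *back = user + n;
    for (size_t i = 0; i < GUARD_BYTES; i++) {
        if (front[i] != GUARD_VALUE || back[i] != GUARD_VALUE)
            return 0;
    }
    return 1;
}

/* Release a buffer obtained from guarded_alloc. */
void guarded_free(void *p)
{
    if (p != NULL)
        free((unsigned char *)p - GUARD_BYTES);
}
```

Assuming the receive buffer is allocated with guarded_alloc, calling
guarded_intact right after the first MPI_Recv would return 0 if more data
arrived than the buffer holds. This only catches overruns that land inside
the guard regions, so a clean result does not prove the buffer sizes are
correct.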
 
Rajeev
 


  _____  

From: owner-mpich-discuss at mcs.anl.gov
[mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Christian Zemlin
Sent: Monday, May 28, 2007 5:24 AM
To: mpich-discuss at mcs.anl.gov
Subject: [MPICH] Simple MPI program crashes randomly


The attached program ran literally thousands of times on our old cluster,
which has MPICH1 libraries, without any problems.
It is simple in terms of MPI (just the minimum MPI setup + a few Send and
Recv commands), although it is fairly long because it implements a
complicated model of cardiac tissue.
 
On our new cluster (with MPICH2) the program crashes with segmentation
faults either on the first MPI_Send/Recv or at the end (MPI_Finalize).  If
you run it several times in a row, it seems to be random which of the two
occurs.  I have checked carefully that the space for the passed data has
been allocated.
 
I would greatly appreciate it if someone with more MPI experience could have
a look at the source code and give me his/her opinion.
 
Best,
 
Christian
