<html><head><style type="text/css"><!-- DIV {margin:0px;} --></style></head><body><div style="font-family:times new roman, new york, times, serif;font-size:12pt"><DIV>OS : RedHat Enterprise 4, 2.6.9-42.ELsmp</DIV>
<DIV>CPU: 4 dual-core Intel</DIV>
<DIV> </DIV>
<DIV>The package was built with:</DIV>
<DIV>setenv CFLAGS "-m32 -O2"<BR>setenv CC gcc<BR>./configure -prefix=/u/cgtan/my_release_dir --with-device=ch3:ssm --enable-fast |& tee configure.log<BR></DIV>
<DIV>-----</DIV>
<DIV>The test programs run 5 processes: one master and 4 slaves. The master always receives from the slaves and then sends to all of them. At random, an MPI_Send performed in the master completes, but the corresponding MPI_Recv in the targeted slave never completes, and the whole thing hangs.</DIV>
<DIV> </DIV>
<DIV>I have a debugging mechanism that attaches a sequence id to every packet sent. Each packet is dumped before and after each send and recv, and a message is also dumped for each pending recv. The sequence id traced OK all the way to the lost packet.</DIV>
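<DIV> </DIV>
<DIV>For illustration, here is a minimal sketch of this kind of sequence-id tracing, in plain C with no MPI calls. It is not the code from the actual test program: the names trace_hdr, trace_state, trace_stamp, trace_check and MAX_RANKS are hypothetical, and it assumes one counter per peer, starting at 1.</DIV>

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAX_RANKS 8 /* hypothetical upper bound on the number of peers */

/* Hypothetical header carried in front of every payload. */
typedef struct {
    int32_t  src; /* sending rank */
    int32_t  dst; /* receiving rank */
    uint64_t seq; /* per-(src,dst) sequence id, first message is 1 */
} trace_hdr;

typedef struct {
    uint64_t last_sent[MAX_RANKS]; /* last id stamped, per destination */
    uint64_t last_recv[MAX_RANKS]; /* last id accepted, per source */
} trace_state;

void trace_init(trace_state *st)
{
    memset(st, 0, sizeof *st);
}

/* Stamp an outgoing message and log it; call just before the send. */
trace_hdr trace_stamp(trace_state *st, int src, int dst)
{
    assert(dst >= 0 && dst < MAX_RANKS);
    trace_hdr h;
    h.src = src;
    h.dst = dst;
    h.seq = ++st->last_sent[dst];
    fprintf(stderr, "[%d] send -> %d seq=%llu\n",
            src, dst, (unsigned long long)h.seq);
    return h;
}

/* Check a received header and log it; call just after the recv.
 * Returns how many ids were skipped (0 when the message is in order).
 * An id older than expected returns 0 and leaves the state unchanged. */
uint64_t trace_check(trace_state *st, const trace_hdr *h)
{
    assert(h->src >= 0 && h->src < MAX_RANKS);
    uint64_t expected = st->last_recv[h->src] + 1;
    uint64_t gap = 0;
    if (h->seq >= expected) {
        gap = h->seq - expected;
        st->last_recv[h->src] = h->seq;
    }
    fprintf(stderr, "[%d] recv <- %d seq=%llu skipped=%llu\n",
            h->dst, h->src, (unsigned long long)h->seq,
            (unsigned long long)gap);
    return gap;
}
```

<DIV>In a hang like the one described, no later message arrives, so trace_check never reports a gap. The evidence is then the last id logged by trace_stamp on the sender with no matching recv line on the receiver.</DIV>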
<DIV> </DIV>
<DIV>The same code works fine with 2.1.04p1, where it has been tested on cases longer than 100 million send/recv sequences. Any suggestions?</DIV>
<DIV> </DIV>
<DIV>tan</DIV>
<DIV> </DIV></div><br>
</body></html>