[mpich-discuss] version 1.1 strange behavior : all processes become idle for extensive period

Rajeev Thakur thakur at mcs.anl.gov
Sat Jul 11 11:27:23 CDT 2009


The first issue has been fixed. If you try one of the nightly snapshots,
it should go away. The fix will be included in 1.1.1, due out next week.
 
Can you tell us more about the second issue? What are the processes
doing when they suddenly become idle? Have they already communicated
before? Are they all running on a single machine?
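
If it helps in answering these questions, one option is to time each
blocking MPI call and log any that stall. The following is only a rough
sketch, not something taken from MPICH2: now_sec() and report_if_slow()
are hypothetical helper names, and the 1-second threshold in the usage
comment is an arbitrary assumption. Inside an MPI program, MPI_Wtime()
could be used in place of now_sec().

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* for clock_gettime under strict C modes */
#endif
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Monotonic wall-clock time in seconds (POSIX clock_gettime). */
double now_sec(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Hypothetical helper: given the time at which a call started, log to
 * stderr and return 1 if more than `limit` seconds have passed since
 * then; otherwise return 0 and stay silent. */
int report_if_slow(const char *label, int rank, double start, double limit)
{
    assert(label != NULL);
    double elapsed = now_sec() - start;
    if (elapsed > limit) {
        fprintf(stderr, "rank %d: %s blocked for %.3f s\n",
                rank, label, elapsed);
        return 1;
    }
    return 0;
}

/* Intended use around each blocking MPI call, for example:
 *
 *     double t0 = now_sec();
 *     MPI_Barrier(MPI_COMM_WORLD);
 *     report_if_slow("MPI_Barrier", rank, t0, 1.0);
 */
```

Wrapping the Send, Recv and Barrier calls this way should show which
call each rank is sitting in during the idle period, and whether all
ranks are waiting in the same place.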
 
Rajeev
 



  _____  

From: mpich-discuss-bounces at mcs.anl.gov
[mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of chong tan
Sent: Friday, July 10, 2009 6:20 PM
To: mpich-discuss at mcs.anl.gov
Subject: [mpich-discuss] version 1.1 strange behavior : all processes
become idle for extensive period
  

I am seeing an odd situation that I did not see with 1.0.6 or 1.0.8.
Some background:
 
machine : INTEL 4Xcore 2
 
running mpiexec -n 4
 
machine has 32G of mem.  
 
When my application runs, almost all memory is used. However, there
is no swapping.
I have exclusive use of the machine, so contention is not an issue.
 
issue #1 : processes take much longer to initialize than with
1.0.6
issue #2 : during the run, at times all of them become idle at the
same time, for almost a minute. We never observed this with 1.0.6.
 
 
The code is the same, only linked with different versions of MPICH2.
 
MPICH2 was built with --enable-threads=multiple for 1.1, and without it
for 1.0.6 and 1.0.8.
 
MPI calls are all in the main application thread. I use only 4 MPI
functions:
init(), Send(), Recv() and Barrier().
 
 
 
Any suggestions?
 
thanks
tan

 


 

