[mpich-discuss] version 1.1 strange behavior : all processes become idle for extensive period

chong tan chong_guan_tan at yahoo.com
Mon Jul 13 12:36:07 CDT 2009



Thanks, Darius.

When I did the comparison (or benchmarking), I had 2 identical source trees.  Everything
was recompiled from the ground up and compiled/linked against the version of MPICH2
to be used.

I have many tests; this is the only one showing this behavior, and it is predictably repeatable.
Most of my tests show comparable performance, and many do better
with 1.1.

The weirdest thing is the ~1 minute span where there is no activity on the box at all: zero
activity except 'top', with the machine load at around 0.12.  I don't know how to explain this
behavior, and I am extremely curious whether anyone can.

I can't reproduce this on AMD boxes, as I don't have one that has only 32G of memory.  I can't
reproduce it on the Niagara box either, as thread-multiple won't build.

I will try to rebuild 1.1 without thread-multiple.  Will keep you posted.

Meanwhile, if anyone has any speculations on this, please bring them up.

thanks
tan


________________________________
From: Darius Buntinas <buntinas at mcs.anl.gov>
To: mpich-discuss at mcs.anl.gov
Sent: Monday, July 13, 2009 8:30:19 AM
Subject: Re: [mpich-discuss] version 1.1 strange behavior : all processes become idle for extensive period

Tan,

Did you just re-link the applications, or did you recompile them?
Version 1.1 is most likely not binary compatible with 1.0.6, so you
really need to recompile the application.

Next, don't use the --enable-threads=multiple flag when configuring
mpich2.  By default, mpich2 supports all thread levels and will select
the thread level at run time (depending on the parameters passed to
MPI_Init_thread).  By allowing the thread level to be selected
automatically at run time, you'll avoid the overhead of thread safety
when it's not needed, allowing your non-threaded applications to run faster.

Let us know if either of these fixes the problem, especially if just
removing the --enable-threads option fixes this.

Thanks,
-d

On 07/10/2009 06:19 PM, chong tan wrote:
> I am seeing this funny situation which I did not see on 1.0.6 and
> 1.0.8.  Some background:
>  
> machine : INTEL 4Xcore 2
>  
> running mpiexec -n 4
>  
> machine has 32G of mem. 
>  
> When my application runs, almost all memory is used.  However, there
> is no swapping.
> I have exclusive use of the machine, so contention is not an issue.
>  
> issue #1 : processes take extra long to be initialized, compared to 1.0.6
> issue #2 : during the run, at times all of them become idle at the
>            same time, for almost a minute.  We never observed this
>            with 1.0.6
>  
>  
> The codes are the same, only linked with different versions of MPICH2.
>  
> MPICH2 was built with --enable-threads=multiple for 1.1.  without for
> 1.0.6 or 1.0.8
>  
> MPI calls are all in the main application thread.  I used only 4 MPI
> functions :
> init(), Send(), Recv() and Barrier(). 
>  
>  
>  
> any suggestion ?
>  
> thanks
> tan



      