[MPICH] any way to ask nemesis to turn-off and turn of active polling ?
chong tan
chong_guan_tan at yahoo.com
Mon Dec 24 13:14:36 CST 2007
No, I have twice as many processors as processes. All the CPUs are multi-core.
The only difference is that one CPU runs one more process, the master, than the others.
The interesting part is that if I merge the master and the slave on that CPU into one
process, the extra 60+ minutes before the first successful MPI communication disappear. During those extra
60 minutes, the only thing active besides the slave is the nemesis polling, which
leads me to conclude that polling is one of the main contributors to the issue.
tan
----- Original Message ----
From: Darius Buntinas <buntinas at mcs.anl.gov>
To: chong tan <chong_guan_tan at yahoo.com>
Cc: mpich-discuss at mcs.anl.gov
Sent: Friday, December 21, 2007 1:27:24 PM
Subject: Re: [MPICH] any way to ask nemesis to turn-off and turn of active polling ?
If you have P processors, and you're running P slaves and 1 master, I'm
not sure how you could have P+1 processes running at 100%. Are you
running one slave per processor, then adding an additional master?
If you have a process waiting in a blocking receive, it will show that
it's using 100% of the CPU if it has its own CPU to run on, but if that
process has to share a CPU with another process that's doing some work,
only then will you see the CPU usage of the waiting process go down.
-d
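
Where a spinning blocking receive competes with useful work on the same CPU, one application-level workaround is to poll with a non-blocking call and sleep between attempts. The following is a minimal sketch of that pattern, not a nemesis feature: wait_with_backoff, probe_fn, and ready_after_n are hypothetical names, and in a real MPI program the probe callback would wrap a non-blocking call such as MPI_Iprobe or MPI_Test. The sleep trades message latency for CPU time.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical callback type: returns true once the awaited event has
 * happened. In an MPI program this could wrap MPI_Iprobe or MPI_Test. */
typedef bool (*probe_fn)(void *ctx);

/* Poll `probe` until it succeeds, sleeping between attempts so the waiter
 * does not occupy a core. The sleep starts at min_us microseconds and
 * doubles up to max_us. Returns the number of probe calls made. */
static long wait_with_backoff(probe_fn probe, void *ctx,
                              long min_us, long max_us)
{
    assert(probe != NULL && min_us > 0 && max_us >= min_us);
    long delay_us = min_us;
    long calls = 0;
    for (;;) {
        calls++;
        if (probe(ctx))
            return calls;
        struct timespec ts = { delay_us / 1000000L,
                               (delay_us % 1000000L) * 1000L };
        nanosleep(&ts, NULL);            /* give the CPU to other processes */
        if (delay_us < max_us) {
            delay_us *= 2;
            if (delay_us > max_us)
                delay_us = max_us;
        }
    }
}

/* Stand-in probe for demonstration only: succeeds on the Nth call,
 * where N is the initial value of the int that ctx points to. */
static bool ready_after_n(void *ctx)
{
    int *remaining = ctx;
    return --(*remaining) <= 0;
}
```

A caller would pass its own probe and context, for example wait_with_backoff(ready_after_n, &n, 10, 1000). A larger max_us frees more CPU for the working process but delays the moment the waiter notices the message.
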
On 12/17/2007 04:48 PM, chong tan wrote:
> I am running Red Hat Enterprise Linux 5. sysctl complains that
> sched_compat_yield is not a known kernel key.
>
> BTW, I ran the test, and both the master and the slaves use 100% of a CPU.
>
> Any suggestions?
>
> thanks
> tan
>
>
>
> ----- Original Message ----
> From: Darius Buntinas <buntinas at mcs.anl.gov>
> To: chong tan <chong_guan_tan at yahoo.com>
> Cc: mpich-discuss at mcs.anl.gov
> Sent: Monday, December 17, 2007 1:23:46 PM
> Subject: Re: [MPICH] any way to ask nemesis to turn-off and turn of
> active polling ?
>
>
>
> On 12/17/2007 01:00 PM, chong tan wrote:
> > Thanks,
> > I don't have root access to the box. I will see if I can ask the sysadmin
> > to do it. I am running
> > Linux snowwhite 2.6.18-8.el5 #1 SMP
> >
> > Do you know if the broken yield got into this version?
>
> I don't know, but you can try the master/slave programs from the
> discussion we had on sched_yield a few months ago:
>
> Master:
>
> #include <sched.h>
>
> int main() {
>     while (1) {
>         sched_yield();  /* give up the CPU on every iteration */
>     }
>     return 0;
> }
>
> Slave:
>
> int main() {
>     while (1);  /* spin without ever yielding */
>     return 0;
> }
>
> Start 4 slaves first, THEN one master, and check 'top'. If it shows
> that the master is taking more than 1% or so, you have a kernel with the
> 'broken' yield.
>
> > FYI : the 'yield' people said it is not 'broken', it is in fact the
> > 'right yield'.
>
> Maybe, but Linus is on my side :-)
>
> -d
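
The master side of the test above can also be made to report its own CPU share instead of reading it from 'top'. This is a rough sketch under assumptions: yield_cpu_fraction and now_seconds are hypothetical helpers, and the clock_gettime calls inside the loop add some overhead of their own, so the number is indicative only.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sched.h>
#include <time.h>

/* Current reading of the given clock, in seconds. */
static double now_seconds(clockid_t clock)
{
    struct timespec ts;
    clock_gettime(clock, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Call sched_yield() in a loop for roughly wall_seconds and return the
 * fraction of that wall-clock time this process spent on a CPU. */
static double yield_cpu_fraction(double wall_seconds)
{
    assert(wall_seconds > 0.0);
    double wall_start = now_seconds(CLOCK_MONOTONIC);
    double cpu_start = now_seconds(CLOCK_PROCESS_CPUTIME_ID);
    double wall_elapsed;
    do {
        sched_yield();
        wall_elapsed = now_seconds(CLOCK_MONOTONIC) - wall_start;
    } while (wall_elapsed < wall_seconds);
    double cpu_elapsed = now_seconds(CLOCK_PROCESS_CPUTIME_ID) - cpu_start;
    return cpu_elapsed / wall_elapsed;
}
```

With one spinning slave per core already running, a result well above 0.01 from something like yield_cpu_fraction(5.0) would match the "more than 1% or so" symptom described above. On an otherwise idle machine the fraction is expected to be close to 1, since nothing else wants the CPU.
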
>
> >
> > tan
> >
> >
> >
> > ----- Original Message ----
> > From: Darius Buntinas <buntinas at mcs.anl.gov>
> > To: chong tan <chong_guan_tan at yahoo.com>
> > Cc: mpich-discuss at mcs.anl.gov
> > Sent: Monday, December 17, 2007 10:50:12 AM
> > Subject: Re: [MPICH] any way to ask nemesis to turn-off and turn of
> > active polling ?
> >
> >
> > Try setting the processor affinity for the "average" processes (map each
> > one to its own processor). If you have a kernel with the "broken"
> > sched_yield implementation, that may not help.
> >
> > If you have a "broken" sched_yield implementation, you can try doing
> > this as root:
> > sysctl kernel.sched_compat_yield=1
> > or
> > echo "1">/proc/sys/kernel/sched_compat_yield
> >
> > -d
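
The affinity suggestion above can be applied from inside each process on Linux. A minimal sketch, assuming glibc's sched_setaffinity interface; pin_self_to_cpu and is_allowed_cpu are hypothetical helper names, and deciding which CPU number goes with which process (for example, deriving it from the MPI rank) is left to the caller.

```c
#define _GNU_SOURCE   /* needed for cpu_set_t and sched_setaffinity in glibc */
#include <assert.h>
#include <sched.h>

/* Pin the calling process to a single CPU. Returns 0 on success and -1 on
 * failure, with errno set by sched_setaffinity. */
static int pin_self_to_cpu(int cpu)
{
    assert(cpu >= 0 && cpu < CPU_SETSIZE);
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof(set), &set);   /* pid 0 = caller */
}

/* Return 1 if `cpu` is in the caller's current affinity mask, 0 if it is
 * not, and -1 if the mask could not be read. */
static int is_allowed_cpu(int cpu)
{
    assert(cpu >= 0 && cpu < CPU_SETSIZE);
    cpu_set_t set;
    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;
    return CPU_ISSET(cpu, &set) ? 1 : 0;
}
```

Each process would call pin_self_to_cpu once, early in its startup, with a CPU number that is distinct from the ones used by the other heavy processes. Checking is_allowed_cpu first avoids asking for a CPU that the batch system or a container has excluded.
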
> >
> >
> > On 12/17/2007 11:35 AM, chong tan wrote:
> > > Yes, in a very subtle way that has a major impact on performance. I will
> > > try to describe it a little here:
> > >
> > > The system has 32G of memory; the total image is 35G. The load is a little
> > > off balance mathematically: 4x dual-core CPUs, running 5 processes.
> > > 4 processes are the same size, and each runs on its own CPU. The last
> > > process is very small, about 10% of the others, and runs
> > > on a core of one of those CPUs. So 1 CPU runs 2 procs: an average one (P1)
> > > and a light one (P2).
> > >
> > > All procs do their first MPI communication at a fixed algorithmic point.
> > > The 'useful' image is about 29G at that point, and should
> > > fit into physical memory. P2 gets there in a heartbeat, then
> > > the others, followed by P1, which takes another 60+ minutes
> > > to get there. If I combine P1 and P2 into 1 process, then I don't
> > > see this extra delay.
> > >
> > > tan
> > >
> > >
> > >
> > > ----- Original Message ----
> > > From: Darius Buntinas <buntinas at mcs.anl.gov>
> > > To: chong tan <chong_guan_tan at yahoo.com>
> > > Cc: mpich-discuss at mcs.anl.gov
> > > Sent: Monday, December 17, 2007 8:02:17 AM
> > > Subject: Re: [MPICH] any way to ask nemesis to turn-off and turn of
> > > active polling ?
> > >
> > >
> > > No, there's no way to do that. Even MPI_Barrier will do active polling.
> > >
> > > Are you having issues where an MPI process that is waiting in a blocking
> > > call is taking CPU time away from other processes?
> > >
> > > -d
> > >
> > > On 12/14/2007 04:53 PM, chong tan wrote:
> > > > My issue is like this:
> > > >
> > > > Among all the processes, some will get to the first MPI
> > > > communication point faster
> > > > than others. Is there a way to tell nemesis to start without doing
> > > > active polling, and then turn
> > > > on active polling with some function?
> > > >
> > > > Or should I just use MPI_Barrier() for that?
> > > >
> > > > thanks
> > > > tan