[MPICH] An idle communication process uses the same CPU as computation process on multi-core chips
Yusong Wang
ywang25 at aps.anl.gov
Mon Sep 17 14:22:45 CDT 2007
I verified the expected result on the quad-core machine, but this is not the case on either the dual-core or the eight-core machines, which are the ones we are particularly interested in, as dual-core laptops and 8-core personal clusters become popular. The master uses a full CPU no matter how I set the CPU affinity on those machines.
Yusong
----- Original Message -----
From: Darius Buntinas <buntinas at mcs.anl.gov>
Date: Monday, September 17, 2007 12:39 pm
Subject: Re: [MPICH] An idle communication process uses the same CPU as computation process on multi-core chips
>
> I can verify that I saw the same problem Yusong did when starting the
> master first on a dual quad-core machine. But assigning each slave to
> its own core (using taskset) fixed that.
>
> Interestingly, when there are less than 8 slaves, top shows that the
> master has 100% usage (when top is in "irix mode", and 12.5% (1/8) when
> not in irix mode). When I have 8 slaves, the usage of the master
> process goes to 0.
>
> Yusong, I'm betting that if you set the CPU affinity for the slaves,
> you'll see no impact of the master on the slaves. Can you try that?
>
> e.g.,:
> ./master &
> for i in `seq 0 3` ; do taskset -c $i ./slave & done
>
> -d
>
> On 09/17/2007 02:31 AM, Sylvain Jeaugey wrote:
> > This seems to be the key to the problem. When the master is launched
> > before the others, it takes one CPU, and this won't change until, for
> > some scheduling reason, it comes to share its CPU with a slave. It
> > then falls to 0% and we're saved.
> >
> > So, to conduct your experiment, you definitely need to taskset your
> > slaves. Just launch them with
> >     taskset -c <cpu> ./slave     (1 process per CPU)
> > or use the -p option of taskset to do it after launch, and ensure that
> > each slave _will_ take one CPU. Thus, the master will be obliged to
> > share the CPU with the others and sched_yield() will be effective.
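> >
> > For example, assuming the slave binary is simply named "slave" and the
> > slaves are already running, something along these lines should pin
> > them one per CPU after launch (a rough sketch, not a tested recipe):
> >
> >     i=0
> >     for pid in `pidof slave` ; do taskset -p -c $i $pid ; i=$((i+1)) ; done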
> >
> > Sylvain
> >
> > On Sun, 16 Sep 2007, Yusong Wang wrote:
> >
> >> I did the experiments on four types of multi-core chips (2 dual-core,
> >> 1 quad-core and 1 eight-core). All of my tests show that the idle
> >> master process has a big impact on the other slave processes, except
> >> for the test on the quad-core, where I found the order does matter:
> >> when the master was launched after the slave processes, there was no
> >> effect, while if the master started first, two slave processes would
> >> end up on the same core, causing those two to slow down significantly
> >> compared to the others.
> >>
> >> Yusong
> >>
> >> ----- Original Message -----
> >> From: Darius Buntinas <buntinas at mcs.anl.gov>
> >> Date: Friday, September 14, 2007 12:55 pm
> >> Subject: Re: [MPICH] An idle communication process uses the same CPU
> >> as computation process on multi-core chips
> >>
> >>>
> >>> It's possible that different versions of the kernel/OS/top compute
> >>> %CPU differently. "CPU utilization" is really a nebulous term. What
> >>> you really want to know is whether the master is stealing significant
> >>> cycles from the slaves. A test of this would be to replace Sylvain's
> >>> slave code with this:
> >>>
> >>> #include <stdio.h>
> >>> #include <sys/time.h>
> >>>
> >>> int main() {
> >>>     while (1) {
> >>>         int i;
> >>>         struct timeval t0, t1;
> >>>         double usec;
> >>>
> >>>         gettimeofday(&t0, 0);
> >>>         for (i = 0; i < 100000000; ++i)
> >>>             ;
> >>>         gettimeofday(&t1, 0);
> >>>
> >>>         usec = (t1.tv_sec * 1e6 + t1.tv_usec) - (t0.tv_sec * 1e6 + t0.tv_usec);
> >>>         printf("%8.0f\n", usec);
> >>>     }
> >>>     return 0;
> >>> }
> >>>
> >>> This will repeatedly time the inner loop. On an N-core system, run N
> >>> of these and look at the times reported. Then start the master and
> >>> see if the timings change. If the master does steal significant
> >>> cycles from the slaves, then you'll see the timings reported by the
> >>> slaves increase. On my single-processor laptop (FC6, 2.6.20), running
> >>> one slave, I see no impact from the master.
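> >>>
> >>> For instance, on a 4-core machine (and assuming master and slave are
> >>> the binaries built from the code in this thread), the whole test
> >>> could look something like:
> >>>
> >>>     for i in `seq 0 3` ; do taskset -c $i ./slave & done
> >>>     # note the loop times printed by the slaves, then:
> >>>     ./master &
> >>>     # ...and watch whether the printed times increase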
> >>>
> >>> Please let me know what you find.
> >>>
> >>> As far as slave processes hopping around on processors, you can set
> >>> processor affinity on the slaves
> >>> (http://www.linuxjournal.com/article/6799 has a good description).
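> >>>
> >>> If you'd rather set it from inside the program than with taskset, a
> >>> minimal sketch of the affinity call described in that article would
> >>> be something like the following (the helper name is made up, and a
> >>> reasonably recent glibc is assumed):
> >>>
> >>> #define _GNU_SOURCE
> >>> #include <sched.h>
> >>>
> >>> /* Pin the calling process to the given core; returns 0 on success. */
> >>> int pin_to_core(int core)
> >>> {
> >>>     cpu_set_t mask;
> >>>     CPU_ZERO(&mask);
> >>>     CPU_SET(core, &mask);
> >>>     return sched_setaffinity(0, sizeof(mask), &mask);
> >>> }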
> >>>
> >>> -d
> >>>
> >>> On 09/14/2007 12:11 PM, Bob Soliday wrote:
> >>>> Sylvain Jeaugey wrote:
> >>>>> That's unfortunate.
> >>>>>
> >>>>> Still, I did two programs. A master:
> >>>>> ----------------------
> >>>>> #include <sched.h>
> >>>>>
> >>>>> int main() {
> >>>>>     while (1) {
> >>>>>         sched_yield();
> >>>>>     }
> >>>>>     return 0;
> >>>>> }
> >>>>> ----------------------
> >>>>> and a slave:
> >>>>> ----------------------
> >>>>> int main() {
> >>>>>     while (1);
> >>>>>     return 0;
> >>>>> }
> >>>>> ----------------------
> >>>>>
> >>>>> I launch 4 slaves and 1 master on a bi dual-core machine. Here is
> >>>>> the result in top:
> >>>>>
> >>>>>   PID USER     PR NI VIRT  RES SHR S %CPU %MEM   TIME+ COMMAND
> >>>>> 12361 sylvain  25  0 2376  244 188 R  100  0.0 0:18.26 slave
> >>>>> 12362 sylvain  25  0 2376  244 188 R  100  0.0 0:18.12 slave
> >>>>> 12360 sylvain  25  0 2376  244 188 R  100  0.0 0:18.23 slave
> >>>>> 12363 sylvain  25  0 2376  244 188 R  100  0.0 0:18.15 slave
> >>>>> 12364 sylvain  20  0 2376  248 192 R    0  0.0 0:00.00 master
> >>>>> 12365 sylvain  16  0 6280 1120 772 R    0  0.0 0:00.08 top
> >>>>>
> >>>>> If you are seeing 66% each, I guess that your master is not
> >>>>> sched_yield'ing as much as expected. Maybe you should look at
> >>>>> environment variables to force a yield when no message is
> >>>>> available; or maybe your master isn't so idle after all and has
> >>>>> messages to send continuously, and is thus not yield'ing.
> >>>>>
> >>>>
> >>>> On our FC5 nodes with 4 cores we get similar results. But on our
> >>>> FC7 nodes with 8 cores we don't. The kernel seems to think that all
> >>>> 9 jobs require 100% and they end up jumping from one core to
> >>>> another. Often the master job is left on its own core while two
> >>>> slaves run on another.
> >>>>
> >>>>   PID USER     PR NI  VIRT  RES  SHR S %CPU %MEM   TIME+ P COMMAND
> >>>> 20127 ywang25  20  0  106m  22m 4168 R   68  0.5 0:06.84 0 slave
> >>>> 20131 ywang25  20  0  106m  22m 4184 R   73  0.5 0:07.26 1 slave
> >>>> 20133 ywang25  20  0  106m  22m 4196 R   75  0.5 0:07.49 2 slave
> >>>> 20129 ywang25  20  0  106m  22m 4176 R   84  0.5 0:08.44 3 slave
> >>>> 20135 ywang25  20  0  106m  22m 4176 R   73  0.5 0:07.29 4 slave
> >>>> 20132 ywang25  20  0  106m  22m 4188 R   70  0.5 0:07.04 4 slave
> >>>> 20128 ywang25  20  0  106m  22m 4180 R   78  0.5 0:07.79 5 slave
> >>>> 20130 ywang25  20  0  106m  22m 4180 R   74  0.5 0:07.45 6 slave
> >>>> 20134 ywang25  20  0  106m  24m 6708 R   80  0.6 0:07.98 7 master
> >>>>
> >>>> 20135 ywang25  20  0  106m  22m 4176 R   75  0.5 0:14.75 0 slave
> >>>> 20132 ywang25  20  0  106m  22m 4188 R   79  0.5 0:14.96 1 slave
> >>>> 20130 ywang25  20  0  106m  22m 4180 R   99  0.5 0:17.32 2 slave
> >>>> 20129 ywang25  20  0  106m  22m 4176 R  100  0.5 0:18.44 3 slave
> >>>> 20127 ywang25  20  0  106m  22m 4168 R   75  0.5 0:14.36 4 slave
> >>>> 20133 ywang25  20  0  106m  22m 4196 R   96  0.5 0:17.09 5 slave
> >>>> 20131 ywang25  20  0  106m  22m 4184 R   78  0.5 0:15.02 6 slave
> >>>> 20128 ywang25  20  0  106m  22m 4180 R   99  0.5 0:17.70 6 slave
> >>>> 20134 ywang25  20  0  106m  24m 6708 R  100  0.6 0:17.97 7 master
> >>>>
> >>>> 20130 ywang25  20  0  106m  22m 4180 R   87  0.5 0:25.99 0 slave
> >>>> 20132 ywang25  20  0  106m  22m 4188 R   79  0.5 0:22.83 0 slave
> >>>> 20127 ywang25  20  0  106m  22m 4168 R   75  0.5 0:21.89 1 slave
> >>>> 20133 ywang25  20  0  106m  22m 4196 R   98  0.5 0:26.94 2 slave
> >>>> 20129 ywang25  20  0  106m  22m 4176 R  100  0.5 0:28.45 3 slave
> >>>> 20135 ywang25  20  0  106m  22m 4176 R   74  0.5 0:22.12 4 slave
> >>>> 20134 ywang25  20  0  106m  24m 6708 R   98  0.6 0:27.73 5 master
> >>>> 20128 ywang25  20  0  106m  22m 4180 R   90  0.5 0:26.72 6 slave
> >>>> 20131 ywang25  20  0  106m  22m 4184 R   99  0.5 0:24.96 7 slave
> >>>>
> >>>> 20133 ywang25  20  0 91440 5756 4852 R   87  0.1 0:44.20 0 slave
> >>>> 20132 ywang25  20  0 91436 5764 4860 R   80  0.1 0:39.32 0 slave
> >>>> 20134 ywang25  20  0  112m  36m  11m R   96  0.9 0:47.35 5 master
> >>>> 20129 ywang25  20  0 91440 5736 4832 R   91  0.1 0:46.84 1 slave
> >>>> 20130 ywang25  20  0 91440 5748 4844 R   83  0.1 0:43.07 3 slave
> >>>> 20131 ywang25  20  0 91432 5744 4840 R   84  0.1 0:41.20 4 slave
> >>>> 20128 ywang25  20  0 91432 5752 4844 R   93  0.1 0:45.36 5 slave
> >>>> 20127 ywang25  20  0 91440 5724 4824 R   94  0.1 0:40.56 6 slave
> >>>> 20135 ywang25  20  0 91440 5736 4832 R   92  0.1 0:39.75 7 slave
> >>>>
> >>>>
> >>>>
> >>>>
> >>>
> >>
> >
>