Try bringing up top while you run your application, then type 1 into top; that will give you detailed per-core CPU info. On some systems, this info is more precise than the default.

tan

Darius Buntinas <buntinas@mcs.anl.gov> wrote:

It's possible that different versions of the kernel/os/top compute %cpu
differently. "CPU utilization" is really a nebulous term. What you
really want to know is whether the master is stealing significant cycles
from the slaves. A test of this would be to replace Sylvain's slave
code with this:

#include <stdio.h>
#include <sys/time.h>

int main() {
    while (1) {
        int i;
        struct timeval t0, t1;
        double usec;

        gettimeofday(&t0, 0);
        for (i = 0; i < 100000000; ++i)   /* busy loop being timed */
            ;
        gettimeofday(&t1, 0);

        usec = (t1.tv_sec * 1e6 + t1.tv_usec) - (t0.tv_sec * 1e6 + t0.tv_usec);
        printf("%8.0f\n", usec);
    }
    return 0;
}

This will repeatedly time the inner loop. On an N-core system, run N of
these and look at the times reported. Then start the master and see if
the timings change. If the master does steal significant cycles from
the slaves, you'll see the timings reported by the slaves increase.
On my single-processor laptop (fc6, 2.6.20), running one slave, I see
no impact from the master.

Please let me know what you find.

As far as slave processes hopping around on processors, you can set
processor affinity (http://www.linuxjournal.com/article/6799 has a good
description) on the slaves.

-d

On 09/14/2007 12:11 PM, Bob Soliday wrote:
> Sylvain Jeaugey wrote:
>> That's unfortunate.
>>
>> Still, I did two programs. A master:
>> ----------------------
>> #include <sched.h>
>>
>> int main() {
>>     while (1) {
>>         sched_yield();
>>     }
>>     return 0;
>> }
>> ----------------------
>> and a slave:
>> ----------------------
>> int main() {
>>     while (1)
>>         ;
>>     return 0;
>> }
>> ----------------------
>>
>> I launch 4 slaves and 1 master on a machine with two dual-core CPUs
>> (four cores). Here is the result in top:
>>
>>   PID USER     PR NI VIRT  RES SHR S %CPU %MEM   TIME+ COMMAND
>> 12361 sylvain  25  0 2376  244 188 R  100  0.0 0:18.26 slave
>> 12362 sylvain  25  0 2376  244 188 R  100  0.0 0:18.12 slave
>> 12360 sylvain  25  0 2376  244 188 R  100  0.0 0:18.23 slave
>> 12363 sylvain  25  0 2376  244 188 R  100  0.0 0:18.15 slave
>> 12364 sylvain  20  0 2376  248 192 R    0  0.0 0:00.00 master
>> 12365 sylvain  16  0 6280 1120 772 R    0  0.0 0:00.08 top
>>
>> If you are seeing 66% each, I guess that your master is not
>> sched_yield'ing as much as expected. Maybe you should look at
>> environment variables to force a yield when no message is available,
>> and maybe your master isn't so idle after all and has messages to send
>> continuously, thus not yield'ing.
>>
>
> On our FC5 nodes with 4 cores we get similar results, but on our FC7
> nodes with 8 cores we don't. The kernel seems to think that all 9 jobs
> require 100% and they end up jumping from one core to another. Often the
> master job is left on its own core while two slaves run on another.
>
>   PID USER    PR NI  VIRT  RES  SHR S %CPU %MEM   TIME+ P COMMAND
> 20127 ywang25 20  0  106m  22m 4168 R   68  0.5 0:06.84 0 slave
> 20131 ywang25 20  0  106m  22m 4184 R   73  0.5 0:07.26 1 slave
> 20133 ywang25 20  0  106m  22m 4196 R   75  0.5 0:07.49 2 slave
> 20129 ywang25 20  0  106m  22m 4176 R   84  0.5 0:08.44 3 slave
> 20135 ywang25 20  0  106m  22m 4176 R   73  0.5 0:07.29 4 slave
> 20132 ywang25 20  0  106m  22m 4188 R   70  0.5 0:07.04 4 slave
> 20128 ywang25 20  0  106m  22m 4180 R   78  0.5 0:07.79 5 slave
> 20130 ywang25 20  0  106m  22m 4180 R   74  0.5 0:07.45 6 slave
> 20134 ywang25 20  0  106m  24m 6708 R   80  0.6 0:07.98 7 master
>
> 20135 ywang25 20  0  106m  22m 4176 R   75  0.5 0:14.75 0 slave
> 20132 ywang25 20  0  106m  22m 4188 R   79  0.5 0:14.96 1 slave
> 20130 ywang25 20  0  106m  22m 4180 R   99  0.5 0:17.32 2 slave
> 20129 ywang25 20  0  106m  22m 4176 R  100  0.5 0:18.44 3 slave
> 20127 ywang25 20  0  106m  22m 4168 R   75  0.5 0:14.36 4 slave
> 20133 ywang25 20  0  106m  22m 4196 R   96  0.5 0:17.09 5 slave
> 20131 ywang25 20  0  106m  22m 4184 R   78  0.5 0:15.02 6 slave
> 20128 ywang25 20  0  106m  22m 4180 R   99  0.5 0:17.70 6 slave
> 20134 ywang25 20  0  106m  24m 6708 R  100  0.6 0:17.97 7 master
>
> 20130 ywang25 20  0  106m  22m 4180 R   87  0.5 0:25.99 0 slave
> 20132 ywang25 20  0  106m  22m 4188 R   79  0.5 0:22.83 0 slave
> 20127 ywang25 20  0  106m  22m 4168 R   75  0.5 0:21.89 1 slave
> 20133 ywang25 20  0  106m  22m 4196 R   98  0.5 0:26.94 2 slave
> 20129 ywang25 20  0  106m  22m 4176 R  100  0.5 0:28.45 3 slave
> 20135 ywang25 20  0  106m  22m 4176 R   74  0.5 0:22.12 4 slave
> 20134 ywang25 20  0  106m  24m 6708 R   98  0.6 0:27.73 5 master
> 20128 ywang25 20  0  106m  22m 4180 R   90  0.5 0:26.72 6 slave
> 20131 ywang25 20  0  106m  22m 4184 R   99  0.5 0:24.96 7 slave
>
> 20133 ywang25 20  0 91440 5756 4852 R   87  0.1 0:44.20 0 slave
> 20132 ywang25 20  0 91436 5764 4860 R   80  0.1 0:39.32 0 slave
> 20129 ywang25 20  0 91440 5736 4832 R   91  0.1 0:46.84 1 slave
> 20130 ywang25 20  0 91440 5748 4844 R   83  0.1 0:43.07 3 slave
> 20131 ywang25 20  0 91432 5744 4840 R   84  0.1 0:41.20 4 slave
> 20134 ywang25 20  0  112m  36m  11m R   96  0.9 0:47.35 5 master
> 20128 ywang25 20  0 91432 5752 4844 R   93  0.1 0:45.36 5 slave
> 20127 ywang25 20  0 91440 5724 4824 R   94  0.1 0:40.56 6 slave
> 20135 ywang25 20  0 91440 5736 4832 R   92  0.1 0:39.75 7 slave
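As a starting point for the processor-affinity approach Darius mentions above, here is a minimal sketch (not part of the original test programs) that pins the calling process to one core with sched_setaffinity() before spinning, in the spirit of the Linux Journal article he links. The file name and the command-line CPU argument are just illustrative choices:

/* pinned_slave.c -- illustrative: pin this process to one CPU, then
 * spin like the slave so you can watch it in top's per-CPU view. */
#define _GNU_SOURCE           /* for cpu_set_t, CPU_SET, sched_setaffinity */
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
    int cpu = (argc > 1) ? atoi(argv[1]) : 0;   /* core to pin to */
    cpu_set_t mask;

    CPU_ZERO(&mask);
    CPU_SET(cpu, &mask);

    /* pid 0 means "the calling process" */
    if (sched_setaffinity(0, sizeof(mask), &mask) != 0) {
        perror("sched_setaffinity");
        return 1;
    }

    while (1)   /* same busy loop as the original slave */
        ;
    return 0;
}

Launching one copy per core (./pinned_slave 0, ./pinned_slave 1, ...) should keep each slave on its own core, so any cycles lost to the master show up clearly in top's per-CPU display.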