[MPICH] An idle communication process uses the same CPU as a computation process on multi-core chips

Sylvain Jeaugey sylvain.jeaugey at bull.net
Fri Sep 14 02:42:23 CDT 2007


Yusong,

I may be wrong about nemesis, but most shm-based MPI implementations rely on 
busy polling, which makes them appear to use 100% CPU. That may not be a 
problem, though, because they also frequently call sched_yield() when they 
have nothing to receive, which means that if another task is runnable on 
the same CPU, the "master" task will give up its CPU time to the 
other task.

So, it's not really a problem to have task 0 at 100% CPU. Just launch an 
additional task and see whether it takes the CPU cycles of the master. You 
might also use taskset (at least on Fedora) to bind tasks to CPUs.

Sylvain

On Thu, 13 Sep 2007, Yusong Wang wrote:

> Hi all,
>
> I have a program implemented with a master/slave model in which the
> master does very little computation. In my tests, the master spent
> most of its time waiting for the other processes to finish the MPI_Gather
> communication (confirmed with jumpshot/MPE). In several tests on
> different multi-core chips (dual-core, quad-core, 8-core), I found that the
> master uses the same amount of CPU as the slaves, which should do all the
> computation. There are only two exceptions where the master uses near 0%
> CPU (one on Windows, one on Linux), which is what I expect. The tests
> were done on both Fedora Linux and Windows with MPICH2 (shm/nemesis,
> mpd/smpd). I don't know if it is a software/system issue or caused by
> different hardware. I would think this is (at least) related to the
> hardware, since with the same operating system I got different CPU usage
> (near 0% or near 100%) for the master on different multi-core nodes of
> our clusters.
>
> Are there any documents I can check for this issue?
>
> Thanks,
>
> Yusong
>
>



