[mpich-discuss] Program uses 100% with MPICH2 (<10% with MPICH1)

Darius Buntinas buntinas at mcs.anl.gov
Tue Jul 24 16:01:50 CDT 2012

Most likely the issue you're seeing is caused by MPICH2 doing active polling while waiting for incoming messages.  This significantly improves performance unless the processor cores are oversubscribed.  If there is more than one MPICH process running on a core, then a process waiting for incoming messages is using cycles that could be used by another process to do useful work.

There are two options to address this:

1. Don't oversubscribe the processor cores.  If you have n cores on a node, make sure the number of processes and threads is no more than n.

2. Use the sock channel in MPICH2 rather than the default nemesis channel.  The sock channel does not use shared memory for intranode communication and does not do active polling.  You can do this by configuring MPICH2 with the --with-device=ch3:sock option.
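
As a rough sketch of both options, assuming a POSIX shell and an unpacked MPICH2 source tree as the current directory: the function names is_oversubscribed and build_mpich2_sock are hypothetical helpers made up for illustration, and the install prefix passed to the second one is whatever location you choose.

```shell
# Option 1: compare the planned process count with the cores on the node.
# Hypothetical helper: $1 = planned number of MPI processes (and threads),
# $2 = number of cores on the node. Succeeds (exit 0) when oversubscribed.
is_oversubscribed() {
    [ "$1" -gt "$2" ]
}

# Option 2: rebuild MPICH2 with the sock channel instead of nemesis.
# Hypothetical helper: $1 = install prefix of your choice.
# Must be run from the top of the MPICH2 source tree.
build_mpich2_sock() {
    ./configure --with-device=ch3:sock --prefix="$1" && make && make install
}
```

For example, on a Linux node with GNU coreutils, `is_oversubscribed 8 "$(nproc)" && echo "too many processes for this node"` would warn before launching 8 processes, and `build_mpich2_sock "$HOME/mpich2-sock"` would install a sock-channel build under an assumed prefix in your home directory.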

Hope this helps.


On Jul 24, 2012, at 3:43 PM, asco developer wrote:

> Hello,
> I have a compiled C program where communication between processes is
> implemented using MPI (more precisely MPICH1). In 2005 this seemed to
> be a good option and I have kept using this version until today.
> Recently a user of the program (free software hosted on
> sourceforge.net) tried MPICH2 and reported huge CPU usage. I tried
> it myself with both mpich2-1.4.1p1 and mpich2-1.5b2 and got the
> same results. The binary (compiled C code on Linux) uses 100% CPU and
> runs slowly. What I find interesting is that if I renice the
> process to the lowest possible priority, the performance with MPICH2
> becomes similar to that of the MPICH1 counterpart. This puzzles me, as
> with this result I do not know where to start looking.
> Before posting lots of information, possibly not what you would
> actually need, I would like to ask what type of information you
> need from my side. The master node sends about 100 MPI_DOUBLE values
> to 1 slave (100/n for n slaves) and little information is printed on
> stdout. Each of the slave processes then executes another external
> binary program via the C "system" function call. I have looked
> around and found this entry in the discussion list,
> http://lists.mcs.anl.gov/pipermail/mpich-discuss/2011-January/008787.html,
> but I'm not sure if it applies to me, since in my case Hydra is not
> using the CPU; my program is.
> Regards,
> João Ramos