[mpich-discuss] Program uses 100% with MPICH2 (<10% with MPICH1)

asco developer ascodev at gmail.com
Wed Jul 25 12:43:09 CDT 2012

Hello Darius,

Thanks for answering. I have tried both of your suggestions: 1) using
one slave process per core, and 2) recompiling with the configure
option '--with-device=ch3:sock'. In both cases the MPI program still
uses 100% CPU.

Based on the above, I have a set of questions, which I would
appreciate if you could answer. Is this the expected behavior of any
MPICH2 program? Do all MPICH2 programs work this way? Is there anything
that I should change from the programming point of view? Currently, I
use MPI_Scatterv as a way of distributing the array among the
processes. Do you need any information to better answer my questions?
If yes, let me know.
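
To make the question concrete, here is a minimal sketch (not the actual program code) of how the counts and displacements passed to MPI_Scatterv might be computed when an array of about 100 doubles is split among the processes. The helper name split_counts and the even-split policy are assumptions for illustration only, and the sketch treats every rank as a receiver.

```c
#include <assert.h>

/* Hypothetical helper (not from the real program): splits `total` elements
 * as evenly as possible over `nprocs` ranks. The first (total % nprocs)
 * ranks receive one extra element. Fills counts[] and displs[]. */
void split_counts(int total, int nprocs, int *counts, int *displs)
{
    assert(total >= 0 && nprocs > 0);
    int base = total / nprocs;   /* elements every rank gets */
    int extra = total % nprocs;  /* leftover elements to hand out */
    int offset = 0;
    for (int r = 0; r < nprocs; r++) {
        counts[r] = base + (r < extra ? 1 : 0);
        displs[r] = offset;      /* start index of rank r's chunk */
        offset += counts[r];
    }
}

/* With MPI, the arrays would then be used roughly like this
 * (shown as a comment so the sketch compiles without mpi.h):
 *
 *   MPI_Scatterv(data, counts, displs, MPI_DOUBLE,
 *                chunk, counts[rank], MPI_DOUBLE,
 *                0, MPI_COMM_WORLD);
 */
```

The scatter itself only decides how the data is distributed; it does not determine how a process waits for messages afterwards.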

I am tempted to think that, if there is nothing to be done about it,
then at least for the application I'm using, MPICH1 seems to be a better
fit than MPICH2. Do you agree with this last sentence?
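
One application-side workaround sometimes used when a waiting process should not spin is to check for incoming messages with a non-blocking probe and sleep between checks. The sketch below shows only the wait loop, under stated assumptions: wait_with_sleep and countdown_ready are hypothetical names, the predicate is a stand-in for a real MPI_Iprobe call, and whether this helps in this particular program is untested.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L  /* for nanosleep() under strict C modes */
#endif
#include <assert.h>
#include <time.h>

/* Hypothetical helper: calls ready(ctx) until it returns nonzero, sleeping
 * sleep_us microseconds between checks so the core is released instead of
 * being spun on. Returns the number of checks performed, or -1 if
 * max_checks checks passed without success. */
int wait_with_sleep(int (*ready)(void *), void *ctx,
                    long sleep_us, int max_checks)
{
    assert(ready != NULL && sleep_us >= 0 && max_checks > 0);
    struct timespec ts;
    ts.tv_sec = sleep_us / 1000000L;
    ts.tv_nsec = (sleep_us % 1000000L) * 1000L;
    for (int i = 1; i <= max_checks; i++) {
        if (ready(ctx))
            return i;
        nanosleep(&ts, NULL);  /* yield the CPU before the next check */
    }
    return -1;
}

/* Stand-in predicate for demonstration: becomes ready after *ctx calls.
 * In an MPI program the predicate would instead do something like:
 *
 *   int flag; MPI_Status st;
 *   MPI_Iprobe(MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &flag, &st);
 *   return flag;
 *
 * and the caller would post MPI_Recv once it returns nonzero. */
int countdown_ready(void *ctx)
{
    int *remaining = ctx;
    return --(*remaining) <= 0;
}
```

The trade-off is latency: a longer sleep frees more CPU for other processes but delays the reaction to an arriving message.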

João Ramos

On Tue, Jul 24, 2012 at 10:01 PM, Darius Buntinas <buntinas at mcs.anl.gov> wrote:
> Most likely the issue you're seeing is caused by MPICH2 doing active polling while waiting for incoming messages.  This significantly improves performance unless the processor cores are oversubscribed.  If there is more than one MPICH process running on a core, then a process waiting for incoming messages is using cycles that could be used by another process to do useful work.
> There are two options to address this:
> 1. Don't oversubscribe the processor cores.  If you have n cores on a node, make sure the number of processes and threads is no more than n.
> 2. Use the sock channel in MPICH2 rather than the default nemesis channel.  The sock channel does not use shared memory for intranode communication and does not do active polling.  You can do this by configuring MPICH2 with the --with-device=ch3:sock option.
> Hope this helps.
> -d
> On Jul 24, 2012, at 3:43 PM, asco developer wrote:
>> Hello,
>> I have a compiled C program where communication between processes is
>> implemented using MPI (more precisely MPICH1). In 2005 this seemed to
>> be a good option and I have kept using this version until today.
>> Recently a user of the program (free software hosted on
>> sourceforge.net) tried MPICH2 and reported huge CPU usage. I tried
>> myself with both versions, mpich2-1.4.1p1 and mpich2-1.5b2, and I got the
>> same results. The binary (compiled C code on Linux) uses 100% CPU and
>> runs slowly. What I find interesting is that if I renice the
>> process to the lowest possible priority, the performance with MPICH2
>> becomes similar to that of the MPICH1 counterpart. This puzzles me, as
>> with this result I do not know where to start looking.
>> Before I start posting lots of information, possibly not the kind you
>> would need, I ask for your suggestions on what type of information
>> you need from my side. The master node sends about 100 MPI_DOUBLE values to 1
>> slave (100/n for n slaves) and little information is printed on
>> stdout. Each one of the slave processes then executes another binary
>> external program via the C "system" function call. I have looked
>> around and I have found this entry in the discussion list,
>> http://lists.mcs.anl.gov/pipermail/mpich-discuss/2011-January/008787.html,
>> but I'm not sure if it applies to me, since Hydra is not using the CPU;
>> my program is.
>> Regards,
>> João Ramos
>> _______________________________________________
>> mpich-discuss mailing list     mpich-discuss at mcs.anl.gov
>> To manage subscription options or unsubscribe:
>> https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss
