[mpich-discuss] resources are equally shared when running processes in one machine ?
Dave Goodell
goodell at mcs.anl.gov
Fri Jul 17 08:29:25 CDT 2009
On Jul 16, 2009, at 4:11 PM, Gra zeus wrote:
> when I run my MPI code with "mpiexec -n 1 ./myprogram", everything
> works fine.
>
> however, when I run with "mpiexec -n 4 ./myprogram", the performance
> of my program drops significantly.
>
> I coded my program so that only process 0 does the work, like this:
> if (id == 0) { do_computation_task(); } else { /* do nothing */ }
>
> Does this mean physical resources are shared when I spawn more than
> one process on one physical machine (even though all of the work is
> done by process 0 and the other processes do nothing)?
What kind of system are you running this program on? How many
processors/cores does the machine have? Does either branch of your
code (do_computation_task or do_nothing) call into the MPI library?
How is your do_nothing implemented? Do you just fall through to the
code after the if/else (which I assume is communication code), do you
call sleep(), or do you busy-wait on something?
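For concreteness, here is roughly the kind of structure I have in mind
(just a sketch; I'm guessing at your program's layout, and the
do_computation_task stub, the MPI_Barrier, and the comments are mine,
not taken from your code):

    #include <mpi.h>
    #include <stdio.h>

    /* stand-in for whatever work process 0 really does */
    static void do_computation_task(void)
    {
        printf("rank 0 doing the computation\n");
    }

    int main(int argc, char **argv)
    {
        int id;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &id);

        if (id == 0) {
            do_computation_task();
        } else {
            /* "do nothing" could mean several things:
             * (a) fall straight through to the code below and block
             *     inside the MPI library,
             * (b) call sleep() so the process yields the CPU, or
             * (c) busy-wait in a loop of your own. */
        }

        /* whatever communication/synchronization follows, e.g.: */
        MPI_Barrier(MPI_COMM_WORLD);

        MPI_Finalize();
        return 0;
    }

Which of those three your "do nothing" branch looks like makes a real
difference for the CPU usage of the non-zero ranks.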
If you are oversubscribing the machine (more MPI processes than cores)
and the processes do a lot of MPI communication, then a performance
drop is expected with a default build of the MPICH2-1.1 series. The
default channel is nemesis, and it busy-polls by default, which takes
up CPU resources whenever a process is inside the MPI library. Note
that even a process that does no work of its own will spin on a CPU if
it is blocked inside an MPI call (for example, waiting in a barrier or
in MPI_Finalize). For this reason it is best not to oversubscribe when
using nemesis.
If it is a problem because of the oversubscription then you have a
couple of options:
1) You can reconfigure your MPICH2 build with "--with-device=ch3:sock"
(see the example configure lines after this list). The sock channel is
slower for intra-node communication, but it doesn't busy poll, so it
is more suitable for oversubscription.
2) Get a machine with more cores, or spread the work over multiple
machines such that you are no longer oversubscribing.
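For option 1, the rebuild would look roughly like this (the --prefix
path is just a placeholder; adjust it and any other configure options
to match your setup):

    ./configure --with-device=ch3:sock --prefix=/path/to/sock-install
    make
    make install

Then recompile your program with the mpicc from that installation and
rerun it with that installation's mpiexec.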
> If the answer is "yes", could you please tell me whether this is
> stated somewhere in the official documentation that I can reference?
Hmm... I thought we had something about this somewhere, but I can't
seem to find it. I'll add something to the FAQ later today.
-Dave