[mpich-discuss] on how processes are distributed to processors

Nicolas Rosner nrosner at gmail.com
Fri Jan 23 03:08:06 CST 2009


Hello all,

I'm testing an idea that would require running 2 processes --let's call
them A and B-- on each physical processor (or core). The test platform
is a cluster of quad-core machines running MPICH2 1.0.7, and the
machine file that is fed to mpd specifies ncpus=4 for each node. As an
example, when using a machine file with 5 quad-core nodes on it (i.e.
20 cores), I'd tell mpiexec to launch 40 processes.
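
For concreteness, the launch amounts to something like this (./myprog
stands in for the actual binary; the mpd ring is booted first):

    mpdboot -n 5 -f machinefile     # start one mpd per node
    mpiexec -n 40 ./myprog          # 40 processes over 20 cores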

Following the usual pattern, both roles are implemented in a single
program that decides whether to run as A or B depending on rank. The
if statement currently decides to run as A for ranks within the lower
half (0-19 in the example above), and as B for the upper half (20-39
in the example). As expected, this does yield 40 processes in total
(20xA + 20xB), with exactly 8 processes on each host -- so far, so
good.
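
A stripped-down sketch of that dispatch (run_A/run_B are just
placeholder names for the two roles) looks like:

    #include <stdio.h>
    #include <mpi.h>

    /* placeholder role implementations */
    static void run_A(int rank) { printf("rank %d acting as A\n", rank); }
    static void run_B(int rank) { printf("rank %d acting as B\n", rank); }

    int main(int argc, char **argv)
    {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        if (rank < size / 2)
            run_A(rank);   /* lower half of the ranks: role A (0-19 above) */
        else
            run_B(rank);   /* upper half of the ranks: role B (20-39 above) */

        MPI_Finalize();
        return 0;
    }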

However, understanding the mpiexec/mpd behavior that determines how
ranks (and hence roles) are mapped to hosts doesn't seem as easy. From
what I've read, launching more processes than there are processors
should cause mpd to "wrap around" (i.e. make two full "turns" around
the ring), so I was expecting 4 As to be launched on each host (first
turn), then 4 Bs on each host (second turn). In other words, I was
hoping that 4 of the 8 processes launched on each host would be A, and
the other 4 would be B.

Instead, to my surprise, the actual result is not nearly that
symmetrical. For some reason, the policy governing how ranks are
"dealt" to hosts seems to change partway through. For instance, in a
recent test with 3 quad-core nodes, I observed the following behavior:

- ranks 0, 1, 2, 3 were launched on host 1
- ranks 4, 5, 6, 7 were launched on host 2
- ranks 8, 9, 10, 11 were launched on host 3
- rank 12 was launched on host 1
- rank 13 was launched on host 2
- rank 14 was launched on host 3
- rank 15 was launched on host 1
- rank 16 was launched on host 2
     ... and so on.
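
For reference, a mapping like this can be observed by having every
rank print the host it landed on, e.g.:

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank, len;
        char host[MPI_MAX_PROCESSOR_NAME];
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Get_processor_name(host, &len);    /* host this rank runs on */
        printf("rank %d running on %s\n", rank, host);
        MPI_Finalize();
        return 0;
    }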

I must be missing the reason for this apparent inconsistency. The
policy during the first turn seemed reasonable (given ncpus=4) -- why
does it suddenly change on the second turn? Can the placement be made
uniform somehow? Where could I find more information about these
policies? I tried reading up on both mpiexec and mpd, but didn't find
anything that explains this in detail.

Thanks in advance!
N.


