[MPICH] Scalability problem in mpdboot
Jeff Squyres
jsquyres at cisco.com
Fri May 11 07:07:53 CDT 2007
Greetings. I was on a large cluster yesterday trying to run some
performance comparisons between Open MPI and MPICH2, but encountered
significant problems because of a scalability issue in mpdboot
(mpich2-1.0.5p4.tar.gz).
Specifically, I had a hostfile of 2048 nodes, and I had invoked:
shell$ mpdboot -n 2048 -f my_hostfile -v
mpdboot sat there *with no output* and 100% CPU activity for over 30
minutes. I finally poked into the mpdboot source code to see what
was happening and found the following loop in the hostfile analysis
section:
-----
if oneMPDPerHost and totalnumToStart > 1:
    oldHosts = hostsAndInfo[:]
    hostsAndInfo = []
    for x in oldHosts:
        keep = 1
        for y in hostsAndInfo:
            if mpd_same_ips(x['host'],y['host']):
                keep = 0
                break
        if keep:
            hostsAndInfo.append(x)
-----
This is an N^2 loop looking for duplicates in the hostfile, which, in
itself, seems fine. The problem is in mpd_same_ips() in mpdlib.py;
here's a snippet:
-----
def mpd_same_ips(host1,host2):    # hosts may be names or IPs
    try:
        ips1 = socket.gethostbyname_ex(host1)[2]    # may fail if invalid host
        ips2 = socket.gethostbyname_ex(host2)[2]    # may fail if invalid host
-----
Specifically, the innermost part of the N^2 loop fires off 2 DNS
queries. There does not appear to be any attempt to cache IP
addresses, so there are on the order of N^2 DNS lookups, the vast
majority of which are duplicates (there are only N unique hostnames).
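As a rough check of the scale, the sketch below replays the loop as quoted above with a stub in place of mpd_same_ips() that only counts how many lookups the real function would issue (two per call). The helper name count_lookups and the node%d hostnames are made up for illustration, and the stub assumes every hostname is distinct. Because each host is compared only against hosts already kept, the count for a duplicate-free hostfile works out to N*(N-1), the same order as the estimate above.

```python
def count_lookups(n):
    """Count the DNS lookups the quoted loop would issue for n distinct hosts.

    Hypothetical helper: the stub below stands in for mpd_same_ips() and
    performs no real DNS queries.
    """
    lookups = 0

    def same_ips_stub(host1, host2):
        nonlocal lookups
        lookups += 2  # the real function calls gethostbyname_ex() once per argument
        return host1 == host2  # assumption: distinct names never share an IP

    hosts = [{'host': 'node%d' % i} for i in range(n)]
    kept = []
    for x in hosts:
        keep = True
        for y in kept:
            if same_ips_stub(x['host'], y['host']):
                keep = False
                break
        if keep:
            kept.append(x)
    return lookups


if __name__ == '__main__':
    print(count_lookups(64))
```

For the 2048-node hostfile this comes to 2048 * 2047 = 4,192,256 lookups, which is painful on a slow DNS server even if each query takes only a fraction of a millisecond.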
Hence, mpdboot was sitting for 30+ minutes looping over DNS lookups
before launching anything (I ended up killing mpdboot; I don't know
how much longer it would have run). Yes, the DNS at this particular
site was rather slow.
Regardless of the speed of DNS, however, it seems like a better
solution would be to do a linear lookup of all N names in the
hostfile, cache the resulting lists of IP addresses, and then perform
the duplicate comparisons with the cache rather than forcing a large
number of explicit (and potentially very very slow) DNS lookups.
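A minimal sketch of that approach follows. It is not a patch against mpdboot or mpdlib: the names resolve_ips and dedupe_hosts are hypothetical, and since the snippet above is cut off before the comparison, the sketch assumes two hosts count as "the same" when their names match or their address lists share at least one IP. It also assumes a host that fails to resolve should be kept rather than dropped. Each name is resolved exactly once, and the duplicate comparisons then run entirely against the cache.

```python
import socket


def resolve_ips(host):
    """Return the set of IPv4 addresses for host, or an empty set on failure."""
    try:
        return set(socket.gethostbyname_ex(host)[2])
    except OSError:
        # Covers socket.gaierror and socket.herror; treat as "no known IPs".
        return set()


def dedupe_hosts(hosts_and_info, resolve=resolve_ips):
    """Drop entries whose host resolves to an IP already seen earlier.

    hosts_and_info is a list of dicts with a 'host' key, as in the quoted
    mpdboot loop. resolve is injectable so the logic can be tested offline.
    """
    # Pass 1: one lookup per unique name (N lookups at most).
    cache = {}
    for entry in hosts_and_info:
        host = entry['host']
        if host not in cache:
            cache[host] = resolve(host)

    # Pass 2: same pairwise comparison as before, but against the cache only.
    kept = []
    for entry in hosts_and_info:
        host = entry['host']
        ips = cache[host]
        duplicate = any(
            host == other['host'] or (ips & cache[other['host']])
            for other in kept
        )
        if not duplicate:
            kept.append(entry)
    return kept
```

The comparison pass is still quadratic, but it only touches in-memory sets, so its cost should be negligible next to the DNS traffic it replaces.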
I converted my hostfile to use IP addresses instead of names, and
then mpdboot went through the duplicate hostname comparison and
started launching mpd's within a minute or less (I didn't clock it;
it was significantly faster).
I'm not a Python programmer, so I don't know (and don't have time to
figure out) the right syntax for changing this in mpdboot / mpdlib, but
I thought I'd pass the information on.
Good luck!
--
Jeff Squyres
Cisco Systems