[MPICH] Scalability problem in mpdboot

Jeff Squyres jsquyres at cisco.com
Fri May 11 07:07:53 CDT 2007


Greetings.  I was on a large cluster yesterday trying to run some
performance comparisons between Open MPI and MPICH2, but encountered
significant problems because of a scalability issue in mpdboot
(mpich2-1.0.5p4.tar.gz).

Specifically, I had a hostfile of 2048 nodes and invoked:

   shell$ mpdboot -n 2048 -f my_hostfile -v

mpdboot sat there *with no output* and 100% CPU activity for over 30  
minutes.  I finally poked into the mpdboot source code to see what  
was happening and found the following loop in the hostfile analysis  
section:

-----
    if oneMPDPerHost and totalnumToStart > 1:
        oldHosts = hostsAndInfo[:]
        hostsAndInfo = []
        for x in oldHosts:
            keep = 1
            for y in hostsAndInfo:
                if mpd_same_ips(x['host'],y['host']):
                    keep = 0
                    break
            if keep:
                hostsAndInfo.append(x)
-----

This is an N^2 loop looking for duplicates in the hostfile, which, in
itself, seems fine.  The problem is in mpd_same_ips() in mpdlib.py;
here's a snippet:

-----
def mpd_same_ips(host1,host2):    # hosts may be names or IPs
    try:
        ips1 = socket.gethostbyname_ex(host1)[2]    # may fail if invalid host
        ips2 = socket.gethostbyname_ex(host2)[2]    # may fail if invalid host
-----

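Specifically, the innermost part of the N^2 loop fires off 2 DNS
queries per comparison.  There does not appear to be any attempt to
cache IP addresses, so the number of DNS lookups grows as N^2: up to
N*(N-1) when all N hostnames are unique (N*(N-1)/2 comparisons, 2
lookups each), or about 4.2 million for N = 2048.  The vast majority
of these are duplicates (there are only N unique hostnames).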
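
To make the growth concrete, here is a rough sketch that replays the
loop above against a fake resolver and counts the lookups.  Nothing
here is mpd code: count_lookups() and fake_resolve() are hypothetical
names, the resolver never touches the network, and I am assuming that
two hosts count as "the same" when their address lists overlap (the
rest of mpd_same_ips() is not shown above).  It is written for
Python 3.

```python
def count_lookups(hosts):
    """Replay the duplicate-removal loop and count resolver calls."""
    calls = 0

    def fake_resolve(host):
        # Stand-in for socket.gethostbyname_ex(host)[2]; no network access.
        nonlocal calls
        calls += 1
        return [host]

    kept = []
    for x in hosts:
        keep = True
        for y in kept:
            # Two lookups per comparison, as in the mpd_same_ips() snippet.
            # Assumption: hosts match when their address lists overlap.
            if set(fake_resolve(x)) & set(fake_resolve(y)):
                keep = False
                break
        if keep:
            kept.append(x)
    return calls


for n in (16, 64, 256):
    hosts = ["node%04d" % i for i in range(n)]
    print(n, count_lookups(hosts))
# prints:
# 16 240
# 64 4032
# 256 65280
```

With all names unique the count is exactly N*(N-1), so a 2048-entry
hostfile would need 2048*2047 = 4,192,256 lookups; quadrupling N
multiplies the lookups by roughly 16.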

Hence, mpdboot was sitting for 30+ minutes looping over DNS lookups  
before launching anything (I ended up killing mpdboot; I don't know  
how much longer it would have run).  Yes, the DNS at this particular  
site was rather slow.

Regardless of the speed of DNS, however, it seems like a better
solution would be to make a single linear pass that looks up each of
the N names in the hostfile once, cache the resulting lists of IP
addresses, and then perform the duplicate comparisons against the
cache rather than forcing a large number of explicit (and potentially
very slow) DNS lookups.
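
A minimal sketch of that idea follows, in Python 3.  It is not a patch
against mpdboot / mpdlib: dedupe_hosts() and its resolve parameter are
hypothetical names, it works on plain hostnames rather than the
hostsAndInfo dictionaries, and it assumes that two entries are
duplicates when they share at least one IP address and that an
unresolvable name should be treated as its own unique host.  It also
goes one step past caching by checking each host against a set of
already-seen addresses instead of comparing pairs, so both the lookups
and the comparisons are linear in N.

```python
import socket


def dedupe_hosts(hosts, resolve=None):
    """Keep the first host seen for each IP, resolving each name once."""
    if resolve is None:
        # Default resolver: the same call used in the mpd_same_ips() snippet.
        def resolve(host):
            return socket.gethostbyname_ex(host)[2]

    seen_ips = set()  # every IP that belongs to a host already kept
    kept = []
    for host in hosts:
        try:
            ips = resolve(host)  # exactly one lookup per hostfile entry
        except OSError:
            # Assumption: an unresolvable name is treated as a unique host.
            ips = [host]
        if seen_ips.isdisjoint(ips):
            kept.append(host)
            seen_ips.update(ips)
    return kept


# Usage with a made-up address table, so no DNS server is needed.
table = {
    "node1": ["10.0.0.1"],
    "node2": ["10.0.0.2"],
    "node1-alias": ["10.0.0.1"],
}
print(dedupe_hosts(["node1", "node2", "node1-alias"], table.__getitem__))
# prints: ['node1', 'node2']
```

Passing the resolver in as an argument is only there to make the
sketch testable without a network; real code could call
socket.gethostbyname_ex() directly.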

I converted my hostfile to use IP addresses instead of names, and
then mpdboot got through the duplicate hostname comparison and
started launching mpds in a minute or less (I didn't clock it, but
it was significantly faster).

I'm not a Python programmer, so I don't know (and don't have time to
figure out) the right syntax for changing this in mpdboot / mpdlib,
but I thought I'd pass the information on.

Good luck!

-- 
Jeff Squyres
Cisco Systems



