[MPICH] Scalability problem in mpdboot

Ralph Butler rbutler at mtsu.edu
Fri May 11 11:23:01 CDT 2007


I created a hostfile containing 6000 entries (many duplicates) and  
it took about 12 seconds to run through them all.
Then I made a change to mpdboot to cache the IPs and it took less than  
one second.  So, I am attaching a new copy
of mpdboot that has that change in it.  If it seems fine, then I will  
commit it to cvs later today.
--ralph


-------------- next part --------------
A non-text attachment was scrubbed...
Name: mpdboot.py
Type: text/x-python-script
Size: 16612 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/mpich-discuss/attachments/20070511/599dcf31/attachment.bin>
-------------- next part --------------


On Fri May 11, at 7:07AM, Jeff Squyres wrote:

> Greetings.  I was on a large cluster yesterday trying to run some  
> performance comparisons between Open MPI and MPICH2, but  
> encountered significant problems because of a scalability issue in  
> mpdboot (mpich2-1.0.5p4.tar.gz).
>
> Specifically, I had a hostfile of 2048 nodes, and I had invoked:
>
>   shell$ mpdboot -n 2048 -f my_hostfile -v
>
> mpdboot sat there *with no output* and 100% CPU activity for over  
> 30 minutes.  I finally poked into the mpdboot source code to see  
> what was happening and found the following loop in the hostfile  
> analysis section:
>
> -----
>     if oneMPDPerHost  and  totalnumToStart > 1:
>         oldHosts = hostsAndInfo[:]
>         hostsAndInfo = []
>         for x in oldHosts:
>            keep = 1
>            for y in hostsAndInfo:
>                if mpd_same_ips(x['host'],y['host']):
>                    keep = 0
>                    break
>            if keep:
>                hostsAndInfo.append(x)
> -----
>
> This is an N^2 loop looking for duplicates in the hostfile, which,  
> in itself, seems fine.  The problem is in mpd_same_ips() in  
> mpdlib.py -- here's a snippet:
>
> -----
> def mpd_same_ips(host1,host2):    # hosts may be names or IPs
>     try:
>         ips1 = socket.gethostbyname_ex(host1)[2]    # may fail if invalid host
>         ips2 = socket.gethostbyname_ex(host2)[2]    # may fail if invalid host
> -----
>
> Specifically, the innermost part of the N^2 loop fires off 2 DNS  
> queries.  There does not appear to be any attempt to cache IP  
> addresses, so there are roughly 2*N^2 DNS lookups, the vast  
> majority of which are duplicates (there are only N unique hostnames).
>
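
A rough check on that count, replaying the quoted loop with a counter instead of real DNS.  This is only a sketch: count_lookups and same_name are made-up names, and same_name merely stands in for mpd_same_ips().  Because the inner loop only walks the hosts kept so far, a hostfile of N distinct names costs N*(N-1) lookups (about 4.2 million for N = 2048), somewhat under 2*N^2 but still quadratic.

```python
def count_lookups(hosts, same_ips):
    """Replay the quoted duplicate-removal loop and count name lookups."""
    lookups = 0
    kept = []
    for x in hosts:
        keep = True
        for y in kept:
            # The quoted mpd_same_ips() resolves both arguments on every call.
            lookups += 2
            if same_ips(x, y):
                keep = False
                break
        if keep:
            kept.append(x)
    return lookups, kept


def same_name(a, b):
    # Hypothetical stand-in for mpd_same_ips(): compares names, no DNS.
    return a == b


n = 2048
hosts = ["node%04d" % i for i in range(n)]  # assumed: all N names distinct
lookups, kept = count_lookups(hosts, same_name)
print("lookups: %d  N*(N-1): %d" % (lookups, n * (n - 1)))
```
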
> Hence, mpdboot was sitting for 30+ minutes looping over DNS lookups  
> before launching anything (I ended up killing mpdboot; I don't know  
> how much longer it would have run).  Yes, the DNS at this  
> particular site was rather slow.
>
> Regardless of the speed of DNS, however, it seems like a better  
> solution would be to do a linear lookup of all N names in the  
> hostfile, cache the resulting lists of IP addresses, and then  
> perform the duplicate comparisons with the cache rather than  
> forcing a large number of explicit (and potentially very very slow)  
> DNS lookups.
>
> I converted my hostfile to use IP addresses instead of names, and  
> then mpdboot went through the duplicate hostname comparison and  
> started launching mpd's within a minute or less (I didn't clock it;  
> it was significantly faster).
>
> I'm not a python programmer, so I don't know / have time to figure  
> out the Right syntax for how to change this in mpdboot / mpdlib,  
> but I thought I'd pass the information on.
>
> Good luck!
>
> -- 
> Jeff Squyres
> Cisco Systems
>
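
For anyone reading this without the attachment, here is a minimal sketch of the "resolve once, then compare from the cache" idea described above.  It is not the code in the attached mpdboot.py.  The names resolve_all and dedupe_hosts are hypothetical, and two details are assumptions: hosts count as duplicates when their address lists share at least one IP, and a host that fails to resolve is kept rather than dropped.

```python
import socket


def resolve_all(hosts, resolver=socket.gethostbyname_ex):
    """Resolve each distinct host string once; return {host: frozenset of IPs}."""
    cache = {}
    for host in hosts:
        if host not in cache:
            try:
                # gethostbyname_ex returns (hostname, aliaslist, ipaddrlist)
                cache[host] = frozenset(resolver(host)[2])
            except socket.error:
                # Assumption: an unresolvable host matches nothing.
                cache[host] = frozenset()
    return cache


def dedupe_hosts(hosts, resolver=socket.gethostbyname_ex):
    """Keep the first host seen for each set of overlapping IP addresses."""
    cache = resolve_all(hosts, resolver)
    kept = []
    seen_ips = set()  # every IP belonging to a host already kept
    for host in hosts:
        if cache[host] & seen_ips:
            continue  # shares an address with an earlier host
        kept.append(host)
        seen_ips |= cache[host]
    return kept
```

With this shape the number of lookups is at most the number of distinct names in the hostfile, and the duplicate check itself never touches DNS.  Passing the resolver in as a parameter is only there so the logic can be exercised with a fake lookup table.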


