[mpich-discuss] mpdboot hangs, but ...

Dave Goodell goodell at mcs.anl.gov
Fri Dec 4 09:28:19 CST 2009


Hi Ben,

This looks very similar to ticket #963: https://trac.mcs.anl.gov/projects/mpich2/ticket/963

Please feel free to add yourself to the CC list if you would like to  
receive progress updates.  Thanks for letting us know that you are  
having trouble.

Thinking about it this morning, I just had an idea of what might be  
going on. I'll spend some time on it today and see if I can reproduce  
it and work up a fix.

In the meantime, you can either use the hydra process manager (built
by default as "mpiexec.hydra") or copy the mpd.py script from 1.1.1p1
as a workaround.
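
One general note on the traceback quoted below: it stops inside
fd.readlines(), and readlines() on a pipe does not return until every
process holding the write end has closed it. I am not claiming that is
the root cause here. The following is only a minimal sketch of that
blocking behavior and of a select()-based read with a timeout that
avoids it. It is written for Python 3 on a POSIX system, it is not
the mpdboot code, and the function names, the child command, and the
"port=12345" line are all made up for illustration.

```python
import os
import select
import subprocess
import sys


def read_lines_with_timeout(fd, timeout):
    """Return the lines read from fd, stopping at EOF or after
    `timeout` seconds pass with no new data (hypothetical helper)."""
    buf = b""
    while True:
        ready, _, _ = select.select([fd], [], [], timeout)
        if not ready:
            # The writer is silent but still holds the pipe open.
            # A plain readlines() would keep waiting here.
            break
        chunk = os.read(fd, 4096)
        if not chunk:
            # All writers closed the pipe: real EOF.
            break
        buf += chunk
    return buf.decode().splitlines()


def demo():
    """Start a child that prints one line and then stays alive with
    its stdout open, roughly like a daemon that inherited the pipe."""
    child_code = (
        "import sys, time\n"
        "print('port=12345')\n"   # made-up output line
        "sys.stdout.flush()\n"
        "time.sleep(30)\n"        # keep the pipe open
    )
    child = subprocess.Popen([sys.executable, "-c", child_code],
                             stdout=subprocess.PIPE)
    try:
        return read_lines_with_timeout(child.stdout.fileno(), 2.0)
    finally:
        child.kill()
        child.wait()
        child.stdout.close()
```

Calling demo() returns the child's one line after the timeout expires
instead of waiting for the child to exit. Replacing the helper with
child.stdout.readlines() would block for the full 30 seconds.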

-Dave

On Dec 4, 2009, at 5:38 AM, Benjamin Svetitsky wrote:

> Hello everybody,
>
> I just upgraded from mpich2-1.0.8 to mpich2-1.2.1.  When I run
>
> 	mpdboot -n 4 -f /root/mpd.hosts
>
> as root, the command just hangs until I hit ^C.  Nonetheless, it  
> starts the daemons successfully and I can run MPI jobs as usual (so  
> far).  A subsequent mpdallexit kills the daemons without complaint.
>
> Details:
>
> I am running four Intel quad-cores under CentOS:
> Linux version 2.6.18-164.6.1.el5.centos.plus
> The file /root/mpd.hosts contains:
> --
> nodeA
> nodeB
> nodeC
> nodeD
> --
> and I executed mpdboot on nodeC.
> I compiled the MPICH2 source without any config options.
> After mpdboot hangs for several minutes and I hit ^C, it responds:
> --
> Traceback (most recent call last):
>  File "/usr/local/bin/mpdboot", line 476, in ?
>    mpdboot()
>  File "/usr/local/bin/mpdboot", line 347, in mpdboot
>    handle_mpd_output(fd,fd2idx,hostsAndInfo)
>  File "/usr/local/bin/mpdboot", line 385, in handle_mpd_output
>    for line in fd.readlines():    # handle output from shells that echo stuff
> KeyboardInterrupt
> --
> which may be irrelevant.
>
> Thanks,
> 	Ben
> -- 
> Prof. Benjamin Svetitsky         Phone:            +972-3-640 8870
> School of Physics and Astronomy  Fax:              +972-3-640 7932
> Tel Aviv University              E-mail:      bqs at julian.tau.ac.il
> 69978 Tel Aviv, Israel           WWW: http://julian.tau.ac.il/~bqs
> _______________________________________________
> mpich-discuss mailing list
> mpich-discuss at mcs.anl.gov
> https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss


