[MPICH] MPDBOOT Problems

Ralph Butler rbutler at mtsu.edu
Wed Jan 31 16:00:35 CST 2007


A couple of days ago, Rusty responded to you, recommending that you follow
the guidelines in the install manual for these problems.  The manual
indicates that mpdboot problems are typically caused by host/net config
problems.  It says that if mpdboot is a problem, you should back off and
debug with mpdcheck, first running it alone on each box and then pairwise
on each pair of machines having problems.  Pairwise means run it as
server on one box and client on the other, then reverse the roles and try
it again.  If you are running on a single box, mpdcheck with no args
might be all the help necessary, and running both the server and client
versions may yield additional insights.  There are appendices that cover
all sorts of issues like firewalls, etc.  However, the manual is not a
sysadmin book on host/net config.  It merely tries to point you in the
right direction.
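
To make the pairwise idea concrete, here is a minimal Python sketch of the
same kind of test: one side listens, the other connects and waits for a
reply.  This is not mpdcheck and does not reproduce its checks; the
function names (start_server, serve_once, check_client) and the "ack:"
reply are made up for illustration.  As written it only exercises the
loopback interface on a single box.

```python
import socket
import threading


def start_server(bind_host="127.0.0.1"):
    """Listen on an OS-chosen port and return (socket, port)."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind((bind_host, 0))  # port 0: let the OS pick a free port
    srv.listen(1)
    return srv, srv.getsockname()[1]


def serve_once(srv, timeout=10.0):
    """Accept one client, reply to what it sends, then close the listener."""
    srv.settimeout(timeout)
    try:
        conn, _addr = srv.accept()
        with conn:
            data = conn.recv(1024)
            conn.sendall(b"ack:" + data)
    finally:
        srv.close()


def check_client(host, port, timeout=5.0):
    """Return True if host:port accepts a connection and answers."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as c:
            c.sendall(b"hello")
            return c.recv(1024) == b"ack:hello"
    except OSError as err:
        # Covers refused connections, timeouts, and name-resolution failures.
        print(f"cannot reach {host}:{port}: {err}")
        return False


if __name__ == "__main__":
    # Single-box demonstration: server and client in the same process.
    srv, port = start_server()
    t = threading.Thread(target=serve_once, args=(srv,))
    t.start()
    print("loopback check ok:", check_client("127.0.0.1", port))
    t.join()
```

To try it between two machines, one would bind the server to an address
the other box can reach (for example bind_host="" for all interfaces),
note the port it reports, run check_client from the other box with that
host name and port, and then swap the roles.  A failure in only one
direction usually points at name resolution or a firewall on one side.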

On Wed Jan 31, at 3:24PM, Luiz Mendes wrote:

> Hi all,
>
> I have installed MPICH2 in a shared folder outside my home dir,
> and I shared this folder with 6 other computers in a cluster.
>
> When I try to run mpdboot, it doesn't work for more than 2 PCs.
>
> When I try mpdboot -n 2 -f <hosts file> it works fine; however,
> when I try with 3 PCs or more it doesn't work and reports an error
> message like the following:
>
> mpdboot_computer1 (handle_mpd_output 374): failed to ping mpd on computer1; recvd output={}
>
> It is strange because with 2 PCs computer1 works, and now
> with 3 it doesn't work anymore.
>
> Furthermore, I touched and set permissions on the .mpd.conf files on
> every node of the cluster, and I set the secretword too.
>
> What is going on with MPDBOOT?
>
> Thanks in Advance,
>
> Luiz Mendes
> UFJF