[MPICH] MPDBOOT Problems
Ralph Butler
rbutler at mtsu.edu
Wed Jan 31 16:00:35 CST 2007
A couple of days ago, Rusty responded to you, recommending that you
follow the guidelines in the install manual for these problems. The
manual indicates that when there are mpdboot problems, they are
typically caused by host/net config problems. It says that you should
first debug with mpdcheck, using it alone on each box and then pairwise
on machines that are having problems. Pairwise means run it as server
on one box and client on the other, then reverse the roles and try it
again. In short: if mpdboot is a problem, back off and try mpdcheck
alone on all boxes, then in the pairwise fashion on each pair having
problems.
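
To make the server/client idea concrete, here is a rough Python sketch of
that kind of check. It is not mpdcheck and not how mpdcheck is implemented;
the function names (run_server, run_client, loopback_check) are made up for
illustration. As far as I recall, the guide's own form of the pairwise test
is `mpdcheck -s` on one host and `mpdcheck -c <host> <port>` on the other,
but check the manual for the exact invocation. The sketch only verifies that
a TCP connection can be opened and a short message echoed back, here on the
loopback interface of a single machine.

```python
import socket
import threading


def run_server(ready, result, host="127.0.0.1"):
    """Listen on an OS-chosen port, accept one client, echo one message back."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.bind((host, 0))            # port 0: let the OS pick a free port
        srv.listen(1)
        result["port"] = srv.getsockname()[1]
        ready.set()                    # the port is now known to the caller
        conn, _addr = srv.accept()
        with conn:
            data = conn.recv(1024)
            conn.sendall(data)


def run_client(host, port, timeout=5.0):
    """Connect, send a probe, and return True if the same bytes come back."""
    probe = b"hello"
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(probe)
        return sock.recv(1024) == probe


def loopback_check():
    """Run the server and the client on this one machine as a smoke test."""
    ready = threading.Event()
    result = {}
    t = threading.Thread(target=run_server, args=(ready, result), daemon=True)
    t.start()
    if not ready.wait(timeout=5.0):
        return False
    ok = run_client("127.0.0.1", result["port"])
    t.join(timeout=5.0)
    return ok


if __name__ == "__main__":
    print("loopback connectivity ok:", loopback_check())
```

For a real pairwise test the server side would bind to the machine's own
address on box A and the client would run on box B, and then the roles
would be swapped. A failure in only one direction usually points at name
resolution or a firewall on one of the two hosts.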

If you are running on a single box, mpdcheck with no args might be all
the help necessary. And running both the server and client versions may
yield additional insights.

There are appendices that cover all sorts of issues like firewalls,
etc. However, the manual is not a sysadmin book on host/net config. It
merely tries to point you in the right direction.

On Wed Jan 31, at 3:24PM, Luiz Mendes wrote:
> Hi all,
>
> I have installed MPICH2 in a shared folder outside the home dir.
>
> And I shared this folder with 6 other computers in a cluster.
>
> Well, when I try to run mpdboot it doesn't work for more than 2 PCs.
>
> When I try mpdboot -n 2 -f <hosts file> it works fine; however,
> when I try with 3 PCs or more it doesn't work and reports an error
> message like the following:
>
> mpdboot_computer1 (handle_mpd_output 374): failed to ping mpd on
> computer1; recvd output={}
>
> It is strange because with 2 PCs computer1 works, and now
> with 3 it doesn't work anymore.
>
> Furthermore, I touched and set permissions on the .mpd.conf files on
> every node of the cluster, and I set the secretword too.
>
> What is going on with MPDBOOT?
>
> Thanks in Advance,
>
> Luiz Mendes
> UFJF
>
>