[MPICH] Running on root's MPD as either root or another user
Ralph Butler
rbutler at mtsu.edu
Mon Oct 1 20:56:59 CDT 2007
OK. So I tried to reproduce the problem but could not. Here is the
sequence of steps I followed on 2 nodes of my cluster:
- build mpich2-1.0.6
- su to root
- install mpich2 in /tmp/mpich2i (make sure mpdroot is +s)
- create /etc/mpd.conf with secretword=foobar
- install in the same way on a second machine
- on 1st machine, start mpd by hand
- on 2nd machine, start mpd by hand using the -h and -p options to
join the first mpd
- (still as root) run mpdtrace and some mpiexec jobs to make sure all
works
- log out as root and log in to an unused student acct
- as student:
    setenv MPD_USE_ROOT_MPD 1
    /tmp/mpich2i/bin/mpiexec -n 2 hostname
I did not even create a .mpd.conf file for the student.
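
For anyone who wants to retrace the config part of this, here is a rough
sh sketch of the mpd.conf step and the student-side run. It is only a
sketch under some assumptions: CONF_DIR and write_mpd_conf are names made
up for illustration, the file is written to a scratch directory rather
than /etc so nothing real is touched, the 600 mode is my assumption about
what mpd expects, and the export line is simply the sh equivalent of the
csh setenv above. The mpiexec call only runs if the install from the
steps above is actually present.

```shell
#!/bin/sh
# Sketch only: writes mpd.conf into a scratch directory, not /etc.
# CONF_DIR and write_mpd_conf are hypothetical names for illustration.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"

write_mpd_conf() {
    # $1 = target directory, $2 = secret word
    # The subshell keeps the restrictive umask from leaking to the caller.
    ( umask 077; printf 'secretword=%s\n' "$2" > "$1/mpd.conf" )
    # Assumption: mpd wants the file readable by its owner only.
    chmod 600 "$1/mpd.conf"
}

write_mpd_conf "$CONF_DIR" foobar
cat "$CONF_DIR/mpd.conf"

# Student side: sh equivalent of "setenv MPD_USE_ROOT_MPD 1".
MPD_USE_ROOT_MPD=1
export MPD_USE_ROOT_MPD

# Run the test job only if the install from the steps above exists.
if [ -x /tmp/mpich2i/bin/mpiexec ]; then
    /tmp/mpich2i/bin/mpiexec -n 2 hostname
fi
```

To repeat the real experiment, the file would go in /etc/mpd.conf as
root on both machines before starting the mpds by hand.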
On Mon Oct 1, at 8:14 PM, Matt Chambers wrote:
> I don't understand how getting an MPD running from a user account
> will help me in debugging why I can't get my user account to use
> root's existing MPD ring. To clarify, root's MPD ring works
> perfectly fine and I can run MPI programs as root over any number
> of nodes. It's only when I try to use root's ring from a user
> account that the multi-node mpiexec calls hang (apparently while
> trying to get a socket, or at least that's the error it gives when
> breaking at the hang).
>
> -Matt
>
> Ralph Butler wrote:
>> Yes, if it all works inside one machine, then the problem is
>> almost certainly due to host/net config
>> issues. The manual actually suggests not trying things as root
>> until you have all those issues
>> addressed. Those problems are addressed via running mpdcheck.
>> Here is a blurb about that:
>>
>> Sometimes there are problems with mpd or mpdboot while following
>> the Quick Start portion of the mpich2 install guide. This typically
>> happens somewhere during Steps 10-13, but may occur during other
>> steps as well. The guide suggests that when mpd/mpdboot problems
>> arise, you follow the procedures in Appendix A (Troubleshooting
>> MPDs).
>>
>> Section A.1 (Getting Started with MPD) provides a 7-step procedure
>> to follow to get one or more mpds working, first by hand, and
>> then via mpdboot. However, some of the early steps begin with a
>> pre-MPD program called mpdcheck. That program is designed to help
>> determine in advance if there will be problems associated with host
>> or network configuration. The instructions in section A.1 suggest
>> first using mpdcheck on individual machines, and then pair-wise.
>> It is particularly important to try the pair-wise experiments where
>> one machine plays the role of the server and the other the client,
>> and then to reverse the roles.
>>
>> Sometimes the procedures in A.1 indicate that MPDs are not likely
>> to run on your systems due to problems with host and/or network
>> configuration. At those points, you are referred to subsequent
>> sections, e.g. A.2 Debugging host/network configuration problems,
>> or A.3 Firewalls, etc.
>>
>>
>
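
The pair-wise mpdcheck experiments described in the quoted blurb can be
laid out as a small dry run. This is a sketch, not a procedure taken
from the guide verbatim: print_pair and print_pairwise_plan are
hypothetical helper names, nodeA and nodeB are placeholder hostnames,
and the script only prints the commands to type on each machine. The
port is left as a placeholder because, as I understand it, the server
side reports the port it is listening on when it starts.

```shell
#!/bin/sh
# Dry run: prints the pair-wise mpdcheck commands, executes none of them.
# print_pair, print_pairwise_plan, nodeA and nodeB are hypothetical.

print_pair() {
    # $1 = machine playing the server, $2 = machine playing the client
    echo "on $1: mpdcheck -s"
    echo "on $2: mpdcheck -c $1 <port printed by $1>"
}

print_pairwise_plan() {
    # Try each direction, since the roles should be reversed as well.
    print_pair "$1" "$2"
    print_pair "$2" "$1"
}

PLAN="$(print_pairwise_plan nodeA nodeB)"
echo "$PLAN"
```

Reversing the roles matters because a host or firewall problem can
block connections in one direction only.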