[MPICH] Running on root's MPD as either root or another user

Ralph Butler rbutler at mtsu.edu
Mon Oct 1 20:56:59 CDT 2007


OK.  So I tried to reproduce the problem but could not.  Here is the  
sequence of steps I followed on 2 nodes of my cluster:

- build mpich2-1.0.6
- su to root
- install mpich2 in /tmp/mpich2i (make sure mpdroot is setuid, i.e. +s)
- create /etc/mpd.conf with secretword=foobar
- install in the same way on a second machine
- on 1st machine, start mpd by hand
- on 2nd machine, start mpd by hand using the -h and -p options to
  join the first mpd
- (still as root) run mpdtrace and some mpiexec jobs to make sure all
  works
- log out as root and log in to an unused student account
- as student:
       setenv MPD_USE_ROOT_MPD 1
       /tmp/mpich2i/bin/mpiexec -n 2 hostname

I did not even create a .mpd.conf file for the student.
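
For anyone using sh or bash rather than csh, the ring-join and student-account
steps above could be sketched roughly as follows.  This is only an
illustration under stated assumptions: PREFIX is the install location from my
test, and the helper names (run, is_setuid, join_ring, run_via_root_mpd) and
the DRY_RUN variable are hypothetical, not part of mpich2.

```shell
#!/bin/sh
# Sketch only. PREFIX is the install prefix from the steps above; every
# function name and DRY_RUN are illustrative, not part of mpich2.
PREFIX=${PREFIX:-/tmp/mpich2i}

# Echo the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Succeeds only if the file exists and has the setuid bit (+s) set.
is_setuid() {
    [ -u "$1" ]
}

# As root on the 2nd machine: join the mpd already running on the 1st.
# $1 = host of the first mpd, $2 = port it is listening on.
# Note: this runs mpd in the foreground; background it with & if desired.
join_ring() {
    run "$PREFIX/bin/mpd" -h "$1" -p "$2"
}

# As the ordinary user: run a job through root's mpd ring.
# $1 = number of processes, remaining arguments = program and its arguments.
run_via_root_mpd() {
    # sh equivalent of the csh line: setenv MPD_USE_ROOT_MPD 1
    MPD_USE_ROOT_MPD=1
    export MPD_USE_ROOT_MPD
    nprocs=$1
    shift
    run "$PREFIX/bin/mpiexec" -n "$nprocs" "$@"
}
```

Typical use would be to check "is_setuid $PREFIX/bin/mpdroot" once after the
install, then call "run_via_root_mpd 2 hostname" from the student account.
Setting DRY_RUN=1 first prints the command lines instead of running them.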


On Mon Oct 1, at 8:14PM, Matt Chambers wrote:

> I don't understand how getting an MPD running from a user account  
> will help me in debugging why I can't get my user account to use  
> root's existing MPD ring.  To clarify, root's MPD ring works  
> perfectly fine and I can run MPI programs as root over any number  
> of nodes.  It's only when I try to use root's ring from a user  
> account that the multi-node mpiexec calls hang (apparently while  
> trying to get a socket, or at least that's the error it gives when  
> breaking at the hang).
>
> -Matt
>
> Ralph Butler wrote:
>> Yes, if it all works inside one machine, then the problem is
>> almost certainly due to host/net config issues.  The manual
>> actually suggests not trying things as root until you have all
>> those issues addressed.  Those problems are diagnosed by running
>> mpdcheck.  Here is a blurb about that:
>>
>> Sometimes there are problems with mpd or mpdboot while following
>> the Quick Start portion of the mpich2 install guide.  This typically
>> happens somewhere during Steps 10-13, but may occur during other
>> steps as well.  The guide suggests that when mpd/mpdboot problems
>> arise, you follow the procedures in Appendix A (Troubleshooting  
>> MPDs).
>>
>> Section A.1 (Getting Started with MPD) provides a 7-step procedure
>> to follow to get one or more mpds working, first by hand, and
>> then via mpdboot.  However, some of the early steps begin with a
>> pre-MPD program called mpdcheck.  That program is designed to help
>> determine in advance if there will be problems associated with host
>> or network configuration.  The instructions in section A.1 suggest
>> first using mpdcheck on individual machines, and then pair-wise.
>> It is particularly important to try the pair-wise experiments where
>> one machine plays the role of the server and the other the client,
>> and then to reverse the roles.
>>
>> Sometimes the procedures in A.1 indicate that MPDs are not likely
>> to run on your systems due to problems with host and/or network
>> configuration.  At those points, you are referred to subsequent
>> sections, e.g. A.2 Debugging host/network configuration problems,
>> or A.3 Firewalls, etc.
>>
>>
>
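
The pair-wise mpdcheck experiment described in the blurb above can be sketched
roughly like this.  It assumes mpdcheck is on the PATH, and it uses the -s
(server) and -c (client) forms as they appear in Appendix A of the install
guide; the wrapper names (run, check_server, check_client) and the DRY_RUN
variable are hypothetical and exist only for this illustration.

```shell
#!/bin/sh
# Sketch only: wrapper names and DRY_RUN are illustrative, not part of mpich2.

# Echo the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# On the machine playing the server: mpdcheck reports the host and port
# it is listening on, then waits for a client to connect.
check_server() {
    run mpdcheck -s
}

# On the machine playing the client: connect to the host and port that
# the server side reported.  $1 = server host, $2 = server port.
check_client() {
    run mpdcheck -c "$1" "$2"
}
```

Run check_server on one machine, then check_client with the reported host and
port on the other, and afterwards repeat with the roles reversed.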