[MPICH] Running on root's MPD as either root or another user

Matthew Chambers matthew.chambers at vanderbilt.edu
Mon Oct 1 17:13:38 CDT 2007


Hmm, I tried that (killed the running mpd and restarted it on all 
machines) but no change.  The MPD is running as root:
root      9629     1  0 17:06 ?        00:00:00 python2.4 /frogstar/usr/ppc/bin/mpd --listenport=4050 --ifhn=172.20.0.1 --daemon
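
For anyone repeating this check across several machines, the "who owns the mpd" question can be scripted instead of eyeballed. What follows is only a minimal sketch in Python 3, assuming `ps -ef`-style lines like the one above. The function name `mpd_owners` is hypothetical and is not part of MPICH or MPD.

```python
from collections import Counter


def mpd_owners(ps_lines):
    """Count mpd daemons per owning user from `ps -ef`-style lines."""
    owners = Counter()
    for line in ps_lines:
        fields = line.split()
        # Assumed `ps -ef` columns: UID PID PPID C STIME TTY TIME CMD...
        if len(fields) < 8:
            continue
        command = fields[7:]
        # Count the line if any command token has the basename "mpd".
        if any(token.rsplit("/", 1)[-1] == "mpd" for token in command):
            owners[fields[0]] += 1
    return dict(owners)


if __name__ == "__main__":
    sample = [
        "root      9629     1  0 17:06 ?        00:00:00 python2.4 "
        "/frogstar/usr/ppc/bin/mpd --listenport=4050 --ifhn=172.20.0.1 --daemon",
    ]
    print(mpd_owners(sample))
```

If anything other than root shows up in the result, some user's mpiexec has presumably started its own mpd rather than using root's.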

Is the user's mpiexec call supposed to start its own mpd even though it 
should be using root's existing mpd?

Also, does this user need to exist on the other machines, or does the MPI 
program run as whichever user started the mpd?

-Matt


Ralph Butler wrote:
> I would suggest killing all mpds and related processes and starting 
> them from scratch.
> Make sure that the mpd is running as root if users plan to use its 
> services.  Then, the users
> need to make sure they use the mpiexec that is linked to the mpdroot 
> which is marked as +s.
>
> On Mon Oct 1, at 5:01PM, Matthew Chambers wrote:
>
>> Ah, the s bit has to be set in the owner's permissions; I hadn't tried 
>> that (and don't really understand why).  But now I'm back to mpiexec 
>> locking up when I try to run a job from the user's account, and when I 
>> interrupt the process, I get:
>> (mpiexec 413): mpiexec: failed to obtain sock from manager
>> And also there's a stranded mpd process in the process list each time 
>> I try:
>> rslebos  27003 10967  0 16:58 ?        00:00:00 python2.4 /frogstar/usr/ppc/bin/mpd --listenport=4050 --ifhn=172.20.0.1 --dae
>> rslebos  27008 10967  0 16:58 ?        00:00:00 python2.4 /frogstar/usr/ppc/bin/mpd --listenport=4050 --ifhn=172.20.0.1 --dae
>> rslebos  27028 10967  0 17:00 ?        00:00:00 python2.4 /frogstar/usr/ppc/bin/mpd --listenport=4050 --ifhn=172.20.0.1 --dae
>>
>> Confused...
>> -Matt
>>
>
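
On the "+s" point in the quoted messages: the setuid bit in the owner's permission set makes a program run with the effective UID of the file's owner, which would explain why an mpdroot owned by root needs it before it can talk to root's mpd on a user's behalf. One way to confirm the bit is in place is sketched below in Python 3. This is an illustration rather than anything shipped with MPICH. The helper name `setuid_status` is hypothetical, and the path to mpdroot is passed on the command line because it depends on the install.

```python
import os
import stat
import sys


def setuid_status(path):
    """Return (owner_uid, has_setuid) for the file at `path`."""
    st = os.stat(path)
    # S_ISUID is the "s" bit in the owner's permission set (chmod u+s).
    return st.st_uid, bool(st.st_mode & stat.S_ISUID)


if __name__ == "__main__":
    # Usage: python check_setuid.py /path/to/mpdroot
    for path in sys.argv[1:]:
        uid, has_setuid = setuid_status(path)
        flag = "yes" if has_setuid else "no"
        print(f"{path}: owner uid={uid}, setuid={flag}")
```

For the setup Ralph describes, the expected result for mpdroot would be owner uid 0 with setuid reported as yes.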



