[MPICH] Running on root's MPD as either root or another user
Matthew Chambers
matthew.chambers at vanderbilt.edu
Mon Oct 1 17:13:38 CDT 2007
Hmm, I tried that (killed the running mpd and restarted it on all
machines) but no change. The MPD is running as root:
root 9629 1 0 17:06 ? 00:00:00 python2.4
/frogstar/usr/ppc/bin/mpd --listenport=4050 --ifhn=172.20.0.1 --daemon
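For anyone checking the same thing, here is a minimal sketch that lists which account owns each mpd process, given `ps -ef`-style text like the listing above. The function names are made up for illustration, and the column layout (UID PID PPID C STIME TTY TIME CMD) is an assumption taken from the output in this thread; other ps formats would need different parsing.

```python
def mpd_owners(ps_output):
    """Return (user, pid) pairs for mpd processes found in `ps -ef`-style text.

    Assumes the columns are: UID PID PPID C STIME TTY TIME CMD.
    """
    owners = []
    for line in ps_output.splitlines():
        fields = line.split(None, 7)  # at most 8 fields; the last is the full command
        if len(fields) < 8:
            continue  # blank or malformed line
        user, pid, command = fields[0], fields[1], fields[7]
        # Match any argument whose basename is exactly "mpd",
        # e.g. "python2.4 /some/path/mpd --daemon".
        if any(arg.rsplit("/", 1)[-1] == "mpd" for arg in command.split()):
            owners.append((user, int(pid)))
    return owners


def non_root_mpds(ps_output):
    """Return only the mpd processes that are not owned by root."""
    return [(user, pid) for user, pid in mpd_owners(ps_output) if user != "root"]
```

To use it on a live machine, one option is to feed it the output of `subprocess.check_output(["ps", "-ef"]).decode()`. Any entries returned by `non_root_mpds` would be per-user mpds rather than root's.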
Is the user's mpiexec call supposed to start its own mpd even when it's
supposed to be using root's existing mpd?
Also, is this user supposed to exist on the other machines, or is the MPI
program run as whichever user started the mpd?
-Matt
Ralph Butler wrote:
> I would suggest killing all mpds and related processes and starting
> them from scratch.
> Make sure that the mpd is running as root if users plan to use its
> services. Then the users need to make sure they use the mpiexec that
> is linked to the mpdroot which is marked as +s.
>
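A rough way to verify the "+s" part of the advice above: the sketch below reports whether a file is owned by root and has the owner setuid bit set (what `chmod u+s` does). The function name is hypothetical, and the path in the usage note is a placeholder, since the thread does not say where mpdroot is installed. My understanding, which I have not confirmed against the MPICH docs, is that the bit must be in the owner's set because that is what lets the program run with the file owner's (root's) effective uid. Note that `stat.filemode` needs Python 3.3 or later, so this would not run under the python2.4 shown in the process listing.

```python
import os
import stat


def setuid_root_status(path):
    """Report whether `path` is owned by root and has the owner setuid bit set."""
    info = os.stat(path)
    return {
        "owned_by_root": info.st_uid == 0,            # uid 0 is root
        "setuid": bool(info.st_mode & stat.S_ISUID),  # owner "s" bit (chmod u+s)
        "mode": stat.filemode(info.st_mode),          # e.g. "-rwsr-xr-x"
    }
```

For example, `setuid_root_status("/path/to/mpdroot")` (placeholder path) should show both `owned_by_root` and `setuid` as true if the binary is set up the way Ralph describes.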
> On Mon Oct 1, at 5:01 PM, Matthew Chambers wrote:
>
>> Ah, the s bit has to be on the owner's set, I hadn't tried that (and
>> don't really understand why). But now I'm back to mpiexec locking up
>> when I try to run a job from the user's account, and when I break the
>> process, I get:
>> (mpiexec 413): mpiexec: failed to obtain sock from manager
>> And also there's a stranded mpd process in the process list each time
>> I try:
>> rslebos 27003 10967 0 16:58 ? 00:00:00 python2.4
>> /frogstar/usr/ppc/bin/mpd --listenport=4050 --ifhn=172.20.0.1 --dae
>> rslebos 27008 10967 0 16:58 ? 00:00:00 python2.4
>> /frogstar/usr/ppc/bin/mpd --listenport=4050 --ifhn=172.20.0.1 --dae
>> rslebos 27028 10967 0 17:00 ? 00:00:00 python2.4
>> /frogstar/usr/ppc/bin/mpd --listenport=4050 --ifhn=172.20.0.1 --dae
>>
>> Confused...
>> -Matt
>>
>