FW: [mpich-discuss] mpdtrace cannot connect (cygwin & mpd), previous smpd remnants?
Jayesh Krishna
jayesh at mcs.anl.gov
Wed Apr 2 10:34:40 CDT 2008
Hi,
This is the message from the mpd dev,
==========================================================================================
A little more info may be gained by changing this line in the installed copy of
mpdlib.py:
mpd_print(1111,"PTC2\n")
to this:
mpd_print(1111,"PTC2 %s\n",errmsg)
I should have done it this way but didn't think of it.
Even without that, we can make some guesses as to what is going on.
It appears that mpiexec is simply not able to connect to the UNIX socket (as
opposed to an INET socket) that mpd has placed in the file system.
If mpd is running, and you do an "ls -l" of /tmp, e.g.:
ls -l /tmp/mpd*
you should see a file whose type is socket. For example on my MacOSX box, I
see:
srwxr-xr-x 1 rbutler wheel 0 Apr 2 09:36 /tmp/mpd2.console_rbutler
This is mpd's socket that mpiexec tries by default to connect to.
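The same check can be scripted. What follows is only a minimal sketch, assuming Python 3 (mpd itself targets an older Python); classify_console and try_connect are hypothetical helper names and are not part of mpdlib.py. It reports whether a path is a UNIX socket and, if a connect attempt fails, returns the error message that the PTC2 trace line does not currently print.

```python
import os
import socket
import stat


def classify_console(path):
    """Return 'missing', 'socket', or 'not a socket' for the given path."""
    try:
        mode = os.lstat(path).st_mode
    except OSError:
        return "missing"
    return "socket" if stat.S_ISSOCK(mode) else "not a socket"


def try_connect(path):
    """Try to connect to a UNIX-domain socket at path.

    Returns None on success, otherwise the error message as a string.
    """
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.connect(path)
        return None
    except OSError as err:
        return str(err)
    finally:
        sock.close()
```

A path whose "ls -l" line begins with "s" should classify as "socket"; one that begins with "-" classifies as "not a socket", and try_connect on it should return an error string rather than None.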
The interesting thing that I note from this most recent email is:
mpdtrace: cannot connect to local mpd (/tmp/mpd2.console_ibojak);
possible causes:
In prior emails, the ibojak was listed as "username". This implies that
there is some discrepancy between the name of the user who started mpd and
the user trying to execute mpiexec.
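To make the user-name comparison concrete, here is a small sketch, again assuming Python 3. It relies only on the naming pattern visible in this thread (mpd2.console_ followed by the user name); console_users and user_matches are hypothetical names introduced for illustration.

```python
import glob
import os


def console_users(tmpdir="/tmp", prefix="mpd2.console_"):
    """Return the user-name suffixes of mpd console files found in tmpdir."""
    pattern = os.path.join(tmpdir, prefix + "*")
    return sorted(os.path.basename(p)[len(prefix):] for p in glob.glob(pattern))


def user_matches(current_user, tmpdir="/tmp"):
    """True if a console file named for current_user exists in tmpdir."""
    return current_user in console_users(tmpdir)
```

Calling user_matches(getpass.getuser()) from the shell that runs mpiexec (after "import getpass") returns False when the only console file in /tmp was created under a different user name.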
==========================================================================================
Did you intentionally mask the username (ibojak) in your previous emails
(see below)?
===========from prev email =============
ls -l /tmp
total 2048
-rw-r--r-- 1 username None 2111 Mar 28 19:01 XWin.log
-rw-r--r-- 1 username None 53 Mar 28 18:02 mpd2.console_username
==========================================
Please change the mpd_print() as mentioned above and re-run mpd &
mpdtrace. Also make sure that the user running mpd is the same user running
the mpiexec command.
Query the file permissions of the socket using the getfacl command, e.g.:
getfacl /tmp/mpd2.console_ibojak
Also send us the output of "ls -l /tmp/mpd*".
(PS: Use "whoami" to list the current user. Use "mkpasswd -l" to get a list
of users on the system.)
Regards,
Jayesh
>
>
> On Wed Apr 2, at 8:59 AM, Jayesh Krishna wrote:
>
> > FYI...
> >
> >> -----Original Message-----
> >> From: owner-mpich-discuss at mcs.anl.gov
> >> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Ingo Bojak
> >> Sent: Wednesday, April 02, 2008 3:52 AM
> >> To: mpich-discuss at mcs.anl.gov
> >> Subject: Re: [mpich-discuss] mpdtrace cannot connect (cygwin & mpd),
> >> previous smpd remnants?
> >>
> >> Jayesh Krishna wrote:
> >>> To obtain more information on your problem a version of mpdlib.py
> >>> which contains trace information is attached. Please replace the
> >>> existing version of mpdlib.py with the version attached and provide us
> >>> with the trace output (Try to compress/zip the output if it is long).
> >>>
> >>> Does your ".mpd.conf" contain anything apart from the "secretword"?
> >>>
> >> Here's the output with the changed library:
> >>
> >> ---
> >>> mpd &
> >> [1] 3748
> >>
> >>> mpdtrace
> >> mpdtrace (__init__ 1194): PTB
> >>
> >> mpdtrace (__init__ 1208): PTC
> >>
> >> mpdtrace (__init__ 1219): PTC2
> >>
> >> PTF
> >>
> >> mpdtrace: cannot connect to local mpd (/tmp/mpd2.console_ibojak);
> >> possible causes:
> >> 1. no mpd is running on this host
> >> 2. an mpd is running but was started without a "console"
> >> (-n option)
> >> In case 1, you can start an mpd on this host with:
> >> mpd &
> >> and you will be able to run jobs just on this host.
> >> For more details on starting mpds on a set of hosts, see the
> >> MPICH2 Installation Guide.
> >> ---
> >>
> >> For contrast, here is the mpdcheck routine started in two cygwin windows
> >> on the same machine:
> >>
> >> ---
> >> (In window 1:)
> >>> mpdcheck -s
> >> server listening at INADDR_ANY on: computername 4635
> >>
> >> (In window 2:)
> >>> mpdcheck -c computername 4635
> >> client successfully recvd ack from server: ack_from_server_to_client
> >>
> >> (In window 1:)
> >> server has conn on <socket._socketobject object at 0x7ff3ea04> from ('xx1.xx4.xx1.8', 4637)
> >> server successfully recvd msg from client: hello_from_client_to_server
> >> ---
> >>
> >> finally, here's the entire content of .mpd.conf at the moment:
> >>
> >> ---
> >>> cat .mpd.conf
> >> secretword=yadda
> >> ---
> >>
> >> Thanks in advance for your help,
> >> Ingo
> >>
> >>
> >
>
>