[mpich-discuss] mpiexec / Windows networking

Michael Colonno mcolonno at stanford.edu
Tue Jun 19 23:22:04 CDT 2012


            Update on this issue: it seems to be related to the paths of the remote processes. I have the MPICH2 bin directory in my PATH. Without any absolute paths, I get this: 

 

>mpiexec -n 4 cxxpi.exe

Error while connecting to host, No connection could be made because the target machine actively refused it. (10061)

Connect on sock (host=mike-studio17, port=8678) failed, exhausted all end points

Unable to connect to 'mike-studio17:8678',

sock error: Error = -1

 

            Adding the absolute path on mpiexec.exe fixes this issue:

 

>C:\Users\mcolonno\Desktop\Software\MPICH2\bin\mpiexec -n 4 cxxpi.exe

Process 2 of 4 is on mike-studio17

Process 3 of 4 is on mike-studio17

Process 0 of 4 is on mike-studio17

Process 1 of 4 is on mike-studio17

pi is approximately 3.14159 Error is 8.33331e-010

wall clock time = 0.000230972

 

            Absolute path on cxxpi.exe has no effect:

 

>mpiexec -n 4 C:\Users\mcolonno\Desktop\Software\MPICH2\bin\cxxpi.exe

Error while connecting to host, No connection could be made because the target machine actively refused it. (10061)

Connect on sock (host=mike-studio17, port=8678) failed, exhausted all end points

Unable to connect to 'mike-studio17:8678',

sock error: Error = -1

 

            Note that the status reported by smpd doesn't seem to affect whether jobs launch:

 

>smpd -status

no smpd running on mike-studio17.bsc.corp
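
A way to check, independently of smpd -status, whether anything is accepting connections on the port named in the errors above (8678) is a plain TCP probe. The following is only a sketch in Python, not part of the MPICH2 tooling; the function name port_is_open is hypothetical, and the port number is simply the one shown in the error output.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        # create_connection resolves the host name and attempts a connect;
        # a refused connection (Winsock error 10061) raises an OSError.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Assumption: 8678 is the port from the error output above.
    host = socket.gethostname()
    state = "open" if port_is_open(host, 8678) else "refused or unreachable"
    print(host, 8678, state)
```

If the probe reports the port as refused while the Windows service shows as started, that would suggest the service is listening on a different port or interface than the one the command-line mpiexec is trying.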

 

            Interestingly, running hostname works fine: 

 

>C:\Users\mcolonno\Desktop\Software\MPICH2\bin\mpiexec -n 4 hostname

mike-studio17

mike-studio17

mike-studio17

mike-studio17

 

            Other basic system commands throw an error:

 

>C:\Users\mcolonno\Desktop\Software\MPICH2\bin\mpiexec -n 4 chdir

launch failed: CreateProcess(chdir) on 'mike-studio17' failed, error 2 - The system cannot find the file specified.

 

launch failed: CreateProcess(chdir) on 'mike-studio17' failed, error 2 - The system cannot find the file specified.

 

launch failed: CreateProcess(chdir) on 'mike-studio17' failed, error 2 - The system cannot find the file specified.

 

launch failed: CreateProcess(chdir) on 'mike-studio17' failed, error 2 - The system cannot find the file specified.

 

Error posting writev, An established connection was aborted by the software in your host machine.(10053)

unable to post a write for the next command,

sock error: Error = 10053

 

unable to post a write of the close command to tear down the job tree as part of the abort process.

unable to post an abort command.

 

            I'm used to the Linux setup in which one's config file would take care of any path issues; here I'm not sure what to do beyond putting the binaries in my path. Any advice on Windows operation is appreciated. 
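
Since the bare mpiexec fails while the absolute path works, one thing worth ruling out is that the bare command resolves to a different mpiexec earlier in PATH. This is an assumption, not something confirmed in this thread. A minimal Python sketch (the helper name find_on_path is hypothetical) that lists every match in search order:

```python
import os


def find_on_path(name, path=None, pathext=None):
    """Return every file matching `name` on PATH, in search order."""
    path = os.environ.get("PATH", "") if path is None else path
    if pathext is None:
        # On Windows, PATHEXT lists the extensions tried for bare commands.
        pathext = os.environ.get("PATHEXT", ".EXE") if os.name == "nt" else ""
    exts = [""] + [e for e in pathext.split(os.pathsep) if e]
    hits = []
    for directory in path.split(os.pathsep):
        if not directory:
            continue
        for ext in exts:
            candidate = os.path.join(directory, name + ext)
            if os.path.isfile(candidate) and candidate not in hits:
                hits.append(candidate)
    return hits


if __name__ == "__main__":
    # The first entry printed is the one a bare "mpiexec" would run.
    for index, hit in enumerate(find_on_path("mpiexec")):
        print(index, hit)
```

If more than one entry is printed and the first is not the copy under the MPICH2 bin directory, reordering PATH or removing the other installation would be the next thing to try.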

 

            Thanks,

            ~Mike C. 

 

-----Original Message-----
From: Michael Colonno [mailto:mcolonno at stanford.edu] 
Sent: Tuesday, June 19, 2012 2:47 PM
To: 'mpich-discuss at mcs.anl.gov'
Cc: 'Jayesh Krishna'
Subject: RE: mpiexec / Windows networking

 

            Looks like I spoke too soon: I'm back in the mode where I can run anything through the wmpiexec GUI, but on the command line SMPD does not show as running and jobs can't be launched. Windows services shows the service as running normally. What are the differences between general command line usage and this GUI tool? Why would one recognize the running service and the other not? I should have known better than to see it work without knowing why the status changed... 

 

            Thanks,

            ~Mike C.

 

-----Original Message-----
From: Jayesh Krishna [mailto:jayesh at mcs.anl.gov]
Sent: Tuesday, June 19, 2012 8:19 AM
To: Michael Colonno
Cc: mpich-discuss at mcs.anl.gov
Subject: Re: mpiexec / Windows networking

 

Hi,

Great, let us know if you have any further issues.

 

(PS: Yeah, I meant ignore the message from wmpiconfig.)

Regards,

Jayesh

 

----- Original Message -----
From: "Michael Colonno" <mcolonno at stanford.edu>
To: "Jayesh Krishna" <jayesh at mcs.anl.gov>
Cc: mpich-discuss at mcs.anl.gov
Sent: Tuesday, June 19, 2012 9:54:11 AM
Subject: RE: mpiexec / Windows networking

 

            Hi Jayesh ~

 

            Sorry - perhaps I wasn't clear: everything was working through wmpiexec but not working on the command line (behaved like MPICH2 was installed but services were not running). (You may have meant ignore the error message from wmpiconfig below.) However, after closing my command prompts and reopening them (after the successful test through wmpiexec) they now seem to behave perfectly. I can't say I have a good explanation for this but I am glad everything is operational. I will chime in again if I have any more difficulty. 

 

            Thanks for all the help,

            ~Mike C. 

 

-----Original Message-----
From: Jayesh Krishna [mailto:jayesh at mcs.anl.gov]
Sent: Tuesday, June 19, 2012 7:41 AM
To: Michael Colonno
Cc: mpich-discuss at mcs.anl.gov
Subject: Re: mpiexec / Windows networking

 

Hi,

Ignore the error message from wmpiexec. There is a known bug that causes it to report inaccurate status (does not work as expected).

Can you run an MPI job from the command line?

 

Regards,

Jayesh

 

----- Original Message -----
From: "Michael Colonno" <mcolonno at stanford.edu>
To: mpich-discuss at mcs.anl.gov
Cc: "Jayesh Krishna" <jayesh at mcs.anl.gov>
Sent: Tuesday, June 19, 2012 9:33:26 AM
Subject: RE: mpiexec / Windows networking

 

            The instructions referenced below were followed to install MPICH2 on the Windows 7 system (uninstalled / reinstalled twice with the same results). Is there any reason the wmpiexec and command line behavior would be different? Perhaps some system-wide post-install setting?

 

            Thanks,

            ~Mike C. 

 

-----Original Message-----
From: Jayesh Krishna [mailto:jayesh at mcs.anl.gov]
Sent: Tuesday, June 19, 2012 7:22 AM
To: mpich-discuss at mcs.anl.gov
Cc: Michael Colonno
Subject: Re: mpiexec / Windows networking

 

Hi,

The earlier error message (smpd error message) indicates that MPICH2 was not installed correctly on your system. I would recommend the following,

 

# Uninstall MPICH2 from the system

# Follow instructions in Section 9.4 (NOT 9.1) of the MPICH2 installer's guide (available at http://www.mcs.anl.gov/research/projects/mpich2/documentation/index.php?s=docs) to install MPICH2.

 

Regards,

Jayesh

 

----- Original Message -----
From: "Michael Colonno" <mcolonno at stanford.edu>
To: mpich-discuss at mcs.anl.gov
Cc: "Jayesh Krishna" <jayesh at mcs.anl.gov>
Sent: Monday, June 18, 2012 6:15:07 PM
Subject: RE: mpiexec / Windows networking

Follow up: I can run the example successfully through the GUI wrapper (wmpiexec.exe) but not from the command line in a console, which seems odd. So the installation works and the relevant service must be running, but the command-line environment does not seem able to communicate with it. Besides setting the path, is there anything else I can do on this front?

Thanks,
~Mike C.

From: Michael Colonno [mailto:mcolonno at stanford.edu]
Sent: Monday, June 18, 2012 4:06 PM
To: 'mpich-discuss at mcs.anl.gov'
Cc: 'Jayesh Krishna'
Subject: mpiexec / Windows networking

I'm trying to run the cxxpi.exe example program on a Windows 7 x64 system and I'm hitting a roadblock (it seems others have hit this as well). I have followed the instructions summarized in: http://lists.mcs.anl.gov/pipermail/mpich-discuss/2011-April/009694.html . Checking services, the "MPICH2 Process Manager" is started (I restarted it to confirm). However, checking the status of SMPD:

>smpd -status
no smpd running on mike-studio17

Trying to run a test produces the error in the thread above:

>mpiexec -n 2 hostname
Error while connecting to host, No connection could be made because the target machine actively refused it. (10061)
Connect on sock (host=mike-studio17, port=8678) failed, exhausted all end points
Unable to connect to 'mike-studio17:8678',
sock error: Error = -1

In the menu of wmpiconfig.exe under "error" it says "mike-studio17: MPICH2 not installed or unable to query the host". I didn’t install to the default path, but other than that there is nothing extraordinary (bin directory added to path of course). If I scan hosts for versions, the wmpiconfig tool does detect the correct version on this host. Using “scan hosts”, I get “Error: No servers available for this domain”. It seems like the service simultaneously is and is not running. Anything I can do to debug?

Thanks,
~Mike C.
