[MPICH] setup and testing of smpd and mpiexec for MPICH2 under WinXP - a fresh install could work

Jayesh Krishna jayesh at mcs.anl.gov
Tue Jan 15 13:50:38 CST 2008


Hi,
 I am not sure what is wrong with your machine. Do you have any other
problems with the same machine? Do you have admin privileges on the
machine?
 Please try a fresh installation of MPICH2 (mpich2-1.0.3-1-win32-ia32.msi)
by following the steps below.
 
1) Uninstall MPICH2 using "Add/Remove Programs" in the Control Panel.
2) After uninstalling MPICH2, try the mpiexec/smpd commands to make sure that
you do not have another version of MPICH2 installed on your machine.
3) Check the Task Manager to make sure that no instance of smpd.exe is
running on your system.
4) Re-install MPICH2 using the MSI installer.
5) Reboot your machine.
6) In the Task Manager, check that smpd.exe is running (under the "SYSTEM"
account).
7) Try "smpd -status -d".
 
 Let us know the results.
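
 Step 7 can also be scripted so that a hanging smpd does not block the
console. The following is only a sketch in Python and is not part of MPICH2;
the helper name run_with_timeout and the time limit passed to it are
assumptions made for illustration.

```python
import subprocess


def run_with_timeout(cmd, seconds):
    """Run cmd and return (finished, output); finished is False on timeout."""
    try:
        done = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr into the captured output
            timeout=seconds,
        )
        return True, done.stdout.decode(errors="replace")
    except subprocess.TimeoutExpired as exc:
        # subprocess.run() kills the child before raising, so no CTRL-C is needed.
        partial = exc.stdout or b""
        return False, partial.decode(errors="replace")
```

 On the affected machine it could be called as
run_with_timeout(["smpd", "-status", "-d"], 15). If finished comes back
False, the last line of the captured output shows where the trace stopped.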
 
Regards,
Jayesh

  _____  

From: Kim Parnell [mailto:kim.parnell at mscsoftware.com] 
Sent: Tuesday, January 15, 2008 12:29 PM
To: Jayesh Krishna
Cc: mpich-discuss at mcs.anl.gov
Subject: RE: [MPICH] setup and testing of smpd and mpiexec for MPICH2 under
WinXP - Try specifying passphrase explicitly


This smpd command (smpd -phrase behappy -status -d) does not terminate and
has to be ended with CTRL-C.
It does not return a status, and the last line echoed is:
    [-1:5712]..\smpd_do_console
 
 
It did add "phrase" to the Registry:
 
Values in HKEY_LOCAL_MACHINE\SOFTWARE\MPICH\SMPD:
  1st Entry:   (Default)   REG_SZ   (value not set)
  2nd Entry:   binary      REG_SZ   C:\MSC.Software\Marc\2007r1\marc2007r1\mpich2\bin\smpd.exe
  3rd Entry:   phrase      REG_SZ   behappy
  4th Entry:   version     REG_SZ   1.0.3
 
smpd -phrase behappy -status -d
[-1:5712]..\smpd_get_opt_int
[-1:5712]../smpd_get_opt_int
[-1:5712]..\smpd_get_opt
[-1:5712]../smpd_get_opt
[-1:5712]..\smpd_get_opt_int
[-1:5712]../smpd_get_opt_int
[-1:5712]..\smpd_get_opt_int
[-1:5712]../smpd_get_opt_int
[-1:5712]..\smpd_get_opt
[-1:5712]../smpd_get_opt
[-1:5712]..\smpd_get_opt
[-1:5712]../smpd_get_opt
[-1:5712]..\smpd_get_opt
[-1:5712]../smpd_get_opt
[-1:5712]..\smpd_get_opt
[-1:5712]../smpd_get_opt
[-1:5712]..\smpd_get_opt
[-1:5712]../smpd_get_opt
[-1:5712]..\smpd_get_opt
[-1:5712]../smpd_get_opt
[-1:5712]..\smpd_get_opt
[-1:5712]../smpd_get_opt
[-1:5712]..\smpd_get_opt
[-1:5712]../smpd_get_opt
[-1:5712]..\smpd_get_opt
[-1:5712]../smpd_get_opt
[-1:5712]..\smpd_get_opt
[-1:5712]../smpd_get_opt
[-1:5712]..\smpd_get_opt
[-1:5712]../smpd_get_opt
[-1:5712]..\smpd_get_opt
[-1:5712]../smpd_get_opt
[-1:5712]..\smpd_get_opt
[-1:5712]../smpd_get_opt
[-1:5712]..\smpd_get_opt
[-1:5712]../smpd_get_opt
[-1:5712]..\smpd_get_opt
[-1:5712]../smpd_get_opt
[-1:5712]..\smpd_get_opt
[-1:5712]../smpd_get_opt
[-1:5712]..\smpd_get_opt
[-1:5712]../smpd_get_opt
[-1:5712]..\smpd_get_opt_string
[-1:5712]../smpd_get_opt_string
[-1:5712]..\smpd_get_opt
[-1:5712]../smpd_get_opt
[-1:5712]..\smpd_get_hostname
[-1:5712]../smpd_get_hostname
[-1:5712]..\smpd_get_opt_string
[-1:5712]../smpd_get_opt_string
[-1:5712]..\smpd_get_opt
[-1:5712]../smpd_get_opt
[-1:5712]..\smpd_get_opt
[-1:5712]../smpd_get_opt
[-1:5712]..\smpd_get_opt_string
[-1:5712]../smpd_get_opt_string
[-1:5712]..\smpd_get_opt
[-1:5712]../smpd_get_opt
[-1:5712]..\smpd_get_opt_string
[-1:5712]../smpd_get_opt_string
[-1:5712]..\smpd_get_opt
[-1:5712]../smpd_get_opt
[-1:5712]..\smpd_get_opt_string
[-1:5712]../smpd_get_opt_string
[-1:5712]..\smpd_get_opt
[-1:5712]../smpd_get_opt
[-1:5712]..\smpd_get_opt_string
[-1:5712]../smpd_get_opt_string
[-1:5712]..\smpd_get_opt
[-1:5712]../smpd_get_opt
[-1:5712]..\smpd_get_opt_string
[-1:5712]../smpd_get_opt_string
[-1:5712]..\smpd_get_opt_string
[-1:5712]../smpd_get_opt_string
[-1:5712]..\smpd_get_opt
[-1:5712]../smpd_get_opt
[-1:5712]..\smpd_do_console
^C
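
The trace above can be checked mechanically for the call that never
returned. This is only a sketch: it assumes that a line ending in
\name marks entry into a function and /name marks its exit, which the paired
lines in the trace suggest but which is not confirmed anywhere in this
thread. The function name unfinished_calls is made up for illustration.

```python
import re

# Optional "[rank:pid]" prefix, depth dots, then "\" (assumed entry) or "/" (assumed exit).
_TRACE = re.compile(r"^(?:\[[^\]]*\])?\.*([\\/])(\w+)\s*$")


def unfinished_calls(trace_lines):
    """Return names that were entered but never exited, outermost first."""
    stack = []
    for line in trace_lines:
        match = _TRACE.match(line.strip())
        if not match:
            continue  # plain message lines such as "^C" are skipped
        kind, name = match.groups()
        if kind == "\\":
            stack.append(name)
        elif stack and stack[-1] == name:
            stack.pop()
    return stack


sample = [
    r"[-1:5712]..\smpd_get_opt",
    r"[-1:5712]../smpd_get_opt",
    r"[-1:5712]..\smpd_do_console",
    "^C",
]
print(unfinished_calls(sample))  # ['smpd_do_console']
```

Under that assumption, smpd_do_console is the only call still open when the
command is interrupted.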
 
 

  _____  

From: Jayesh Krishna [mailto:jayesh at mcs.anl.gov] 
Sent: Tuesday, January 15, 2008 9:44 AM
To: Kim Parnell
Cc: mpich-discuss at mcs.anl.gov
Subject: RE: [MPICH] setup and testing of smpd and mpiexec for MPICH2 under
WinXP - Try specifying passphrase explicitly


Hi,
 Can you try specifying the passphrase explicitly and checking the status of
the process manager? ("smpd -phrase behappy -status -d")
 
Regards,
Jayesh

  _____  

From: owner-mpich-discuss at mcs.anl.gov
[mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Kim Parnell
Sent: Monday, January 14, 2008 2:47 PM
To: mpich-discuss at mcs.anl.gov
Subject: RE: [MPICH] setup and testing of smpd and mpiexec for MPICH2 under
WinXP - Is Reg entries ok ?


Some corrections on the Registry entries I sent earlier:
 


Values in HKEY_LOCAL_MACHINE\SOFTWARE\MPICH\SMPD:
  1st Entry:   (Default)   REG_SZ   (value not set)
  2nd Entry:   binary      REG_SZ   C:\MSC.Software\Marc\2007r1\marc2007r1\mpich2\bin\smpd.exe
  3rd Entry:   version     REG_SZ   1.0.3
 
 
I do not have:
    HKEY_CURRENT_USER\Software\MPICH\SMPD 
or HKEY_CURRENT_USER\Software\MPICH\SMPD\CACHE 
 
Values in HKEY_CURRENT_USER\Software\MPICH:
  1st Entry:   (Default)        REG_SZ       (value not set)
  2nd Entry:   smpdAccount      REG_SZ       kparnell
  3rd Entry:   smpdAccount1     REG_SZ       NA\kparnell
  4th Entry:   smpdPassword     REG_BINARY   (contains an encoded binary string)
  5th Entry:   smpdPassword1    REG_BINARY   (contains an encoded binary string)
 

  _____  

From: Kim Parnell 
Sent: Monday, January 14, 2008 11:24 AM
To: mpich-discuss at mcs.anl.gov
Subject: RE: [MPICH] setup and testing of smpd and mpiexec for MPICH2 under
WinXP - Is Reg entries ok ?


Values in HKEY_LOCAL_MACHINE\SOFTWARE\MPICH\SMPD:
  1st Entry:   (Default)   REG_SZ   (value not set)
  2nd Entry:   binary      REG_SZ   C:\MSC.Software\Marc\2007r1\marc2007r1\mpich2\bin\smpd.exe
  3rd Entry:   version     REG_SZ   1.0.3
 
Values in HKEY_CURRENT_USER\Software\MPICH\SMPD\CACHE:
  1st Entry:   (Default)        REG_SZ   (value not set)
  2nd Entry:   smpdAccount      REG_SZ   kparnell
  3rd Entry:   smpdAccount1     REG_SZ   NA\kparnell
  4th Entry:   smpdPassword     REG_SZ   (contains an encoded binary string)
  5th Entry:   smpdPassword1    REG_SZ   (contains an encoded binary string)
 


  _____  

From: Jayesh Krishna [mailto:jayesh at mcs.anl.gov] 
Sent: Monday, January 14, 2008 10:43 AM
To: Kim Parnell
Cc: mpich-discuss at mcs.anl.gov
Subject: RE: [MPICH] setup and testing of smpd and mpiexec for MPICH2 under
WinXP - Is Reg entries ok ?


Hi,
 Can you look at the registry entries for MPICH2 and check whether they are
*normal*?
 
1. Run "regedit.exe".
2. Check the values in HKEY_LOCAL_MACHINE\SOFTWARE\MPICH\SMPD. This should
contain binary (executable), phrase (passphrase) and version (version
number).
3. Check the values in HKEY_CURRENT_USER\Software\MPICH\SMPD\CACHE. This
should contain the cached username (smpda) and password (smpdp) for
smpd/mpiexec.
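
 The check in step 2 can also be done from a script instead of regedit. The
following is a rough sketch using Python's standard winreg module (Windows
only); the function names and the idea of reporting missing names as a list
are assumptions for illustration and are not something MPICH2 provides.

```python
REQUIRED_VALUES = ("binary", "phrase", "version")


def missing_smpd_values(values):
    """Return the required value names absent from a {name: data} mapping."""
    present = {name.lower() for name in values}
    return [name for name in REQUIRED_VALUES if name not in present]


def read_hklm_smpd_values():
    """Read the values under HKLM SOFTWARE\\MPICH\\SMPD into a dict."""
    import winreg  # standard library, but only available on Windows

    values = {}
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, r"SOFTWARE\MPICH\SMPD") as key:
        value_count = winreg.QueryInfoKey(key)[1]  # (subkeys, values, mtime)
        for index in range(value_count):
            name, data, _value_type = winreg.EnumValue(key, index)
            values[name] = data
    return values
```

 On a Windows machine, missing_smpd_values(read_hklm_smpd_values()) would
list whichever of the three names is not present under the key.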
 
Regards,
Jayesh

  _____  

From: owner-mpich-discuss at mcs.anl.gov
[mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Kim Parnell
Sent: Friday, January 11, 2008 5:12 PM
To: mpich-discuss at mcs.anl.gov
Subject: RE: [MPICH] setup and testing of smpd and mpiexec for MPICH2 under
WinXP


All of the results that I have reported thus far have been with MPICH2
uninstalled, just using smpd.exe and wmpiexec.exe.
It seems like the issue on this system is in the communication, since
  smpd -status
still does not return any info and has to be killed.
(I tested on a different but similar system with MPICH2 uninstalled, and
there it returns a status.)
 
On this system with the problem, the last lines returned by 
  smpd -status -d
 
....
[-1:5056]../smpd_get_opt
[-1:5056]..\smpd_do_console
 
(no message about the smpd process running, it just hangs here;  my other
system returns a status here and a few additional lines)
 
 
 

  _____  

From: Jayesh Krishna [mailto:jayesh at mcs.anl.gov] 
Sent: Friday, January 11, 2008 1:22 PM
To: Kim Parnell
Cc: mpich-discuss at mcs.anl.gov
Subject: RE: [MPICH] setup and testing of smpd and mpiexec for MPICH2 under
WinXP


Hi,
 Can you try a fresh installation of MPICH2 (mpich2-1.0.3-1-win32-ia32.msi),
that is, uninstall any existing version of MPICH2 on the machine and install
MPICH2 again, and let us know if it works?
 
Regards,
Jayesh

  _____  

From: owner-mpich-discuss at mcs.anl.gov
[mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Kim Parnell
Sent: Friday, January 11, 2008 2:54 PM
To: mpich-discuss at mcs.anl.gov
Subject: RE: [MPICH] setup and testing of smpd and mpiexec for MPICH2 under
WinXP


I can kill the smpd process.  Then I can start it with:
  smpd -start
and stop it with
  smpd -stop
 
  smpd -status
still does not return any info and has to be killed.
 

  _____  

From: Jayesh Krishna [mailto:jayesh at mcs.anl.gov] 
Sent: Friday, January 11, 2008 12:45 PM
To: Kim Parnell
Cc: mpich-discuss at mcs.anl.gov
Subject: RE: [MPICH] setup and testing of smpd and mpiexec for MPICH2 under
WinXP


Hi,
 To restart smpd you should use "smpd -start" (not "smpd -install").
 
Regards,
Jayesh

  _____  

From: owner-mpich-discuss at mcs.anl.gov
[mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Kim Parnell
Sent: Friday, January 11, 2008 1:20 PM
To: mpich-discuss at mcs.anl.gov
Subject: RE: [MPICH] setup and testing of smpd and mpiexec for MPICH2 under
WinXP


Let me provide the last items first:
 
  Can you provide us the output of "smpd -status -d" ? Can you try "mpiexec
-n 1 dir" ?
 
mpiexec -n 1 dir
    hangs at the command prompt and does not do a directory listing;   if I
include  -timeout 10  then it will timeout
 
smpd -status -d
[-1:5056]..\smpd_get_opt_int
[-1:5056]../smpd_get_opt_int
[-1:5056]..\smpd_get_opt
[-1:5056]../smpd_get_opt
[-1:5056]..\smpd_get_opt_int
[-1:5056]../smpd_get_opt_int
[-1:5056]..\smpd_get_opt_int
[-1:5056]../smpd_get_opt_int
[-1:5056]..\smpd_get_opt
[-1:5056]../smpd_get_opt
[-1:5056]..\smpd_get_opt
[-1:5056]../smpd_get_opt
[-1:5056]..\smpd_get_opt
[-1:5056]../smpd_get_opt
[-1:5056]..\smpd_get_opt
[-1:5056]../smpd_get_opt
[-1:5056]..\smpd_get_opt
[-1:5056]../smpd_get_opt
[-1:5056]..\smpd_get_opt
[-1:5056]../smpd_get_opt
[-1:5056]..\smpd_get_opt
[-1:5056]../smpd_get_opt
[-1:5056]..\smpd_get_opt
[-1:5056]../smpd_get_opt
[-1:5056]..\smpd_get_opt
[-1:5056]../smpd_get_opt
[-1:5056]..\smpd_get_opt
[-1:5056]../smpd_get_opt
[-1:5056]..\smpd_get_opt
[-1:5056]../smpd_get_opt
[-1:5056]..\smpd_get_opt
[-1:5056]../smpd_get_opt
[-1:5056]..\smpd_get_opt
[-1:5056]../smpd_get_opt
[-1:5056]..\smpd_get_opt
[-1:5056]../smpd_get_opt
[-1:5056]..\smpd_get_opt
[-1:5056]../smpd_get_opt
[-1:5056]..\smpd_get_opt
[-1:5056]../smpd_get_opt
[-1:5056]..\smpd_get_opt
[-1:5056]../smpd_get_opt
[-1:5056]..\smpd_get_opt_string
[-1:5056]../smpd_get_opt_string
[-1:5056]..\smpd_get_opt
[-1:5056]../smpd_get_opt
[-1:5056]..\smpd_get_hostname
[-1:5056]../smpd_get_hostname
[-1:5056]..\smpd_get_opt_string
[-1:5056]../smpd_get_opt_string
[-1:5056]..\smpd_get_opt
[-1:5056]../smpd_get_opt
[-1:5056]..\smpd_get_opt
[-1:5056]../smpd_get_opt
[-1:5056]..\smpd_get_opt_string
[-1:5056]../smpd_get_opt_string
[-1:5056]..\smpd_get_opt
[-1:5056]../smpd_get_opt
[-1:5056]..\smpd_get_opt_string
[-1:5056]../smpd_get_opt_string
[-1:5056]..\smpd_get_opt
[-1:5056]../smpd_get_opt
[-1:5056]..\smpd_get_opt_string
[-1:5056]../smpd_get_opt_string
[-1:5056]..\smpd_get_opt
[-1:5056]../smpd_get_opt
[-1:5056]..\smpd_get_opt_string
[-1:5056]../smpd_get_opt_string
[-1:5056]..\smpd_get_opt
[-1:5056]../smpd_get_opt
[-1:5056]..\smpd_get_opt_string
[-1:5056]../smpd_get_opt_string
[-1:5056]..\smpd_get_opt_string
[-1:5056]../smpd_get_opt_string
[-1:5056]..\smpd_get_opt
[-1:5056]../smpd_get_opt
[-1:5056]..\smpd_do_console
^C  to kill it    (does not return to command prompt)
 
I just used "smpd -install" to restart the process after I had killed it
from the Task Manager.
 
I am not using the latest version, just to avoid possibly introducing
another variable with the MPICH2 applications that I ultimately want to run.
The version used is:
          mpich2-1.0.3-1-win32-ia32.msi
 
There are not multiple versions, and I am running smpd and mpiexec from the
local directory.
 
Thanks!


  _____  

From: Jayesh Krishna [mailto:jayesh at mcs.anl.gov] 
Sent: Friday, January 11, 2008 11:01 AM
To: Kim Parnell
Cc: mpich-discuss at mcs.anl.gov
Subject: RE: [MPICH] setup and testing of smpd and mpiexec for MPICH2 under
WinXP


Hi,
  Are you using the latest version of MPICH2 (1.0.6p1)? I would recommend
that you try installing the latest version of MPICH2 on your machine. (Are
there multiple versions of smpd/mpiexec on your machine? Check the PATH and
make sure that you are running the right smpd/mpiexec.)
  Is there a reason you chose to run "smpd -install" (instead of letting the
MPICH2 installer install SMPD as a service, which gets started when you
log on the next time)?
  Can you provide us the output of "smpd -status -d"? Can you try "mpiexec
-n 1 dir"?

Regards,

Jayesh


  _____  

From: owner-mpich-discuss at mcs.anl.gov
[mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Kim Parnell
Sent: Friday, January 11, 2008 12:25 PM
To: mpich-discuss at mcs.anl.gov
Subject: [MPICH] setup and testing of smpd and mpiexec for MPICH2 under
WinXP


Windows XP SP2 machine with Dual Core CPU
 
I have an MPICH2 application that was working correctly and now does not get
submitted.
I am trying to go back and test the smpd and mpiexec installations and
hitting some problems that I do not understand.
 
smpd -install
  starts an smpd process under user SYSTEM that seems to be running
 
smpd -status 
   starts another smpd process under my username but never returns a status;
basically just hangs at the command prompt
 
smpd -stop
   Stopping MPICH2 Process Manager, Argonne National
Lab...................................
   MPICH2 Process Manager, Argonne National Lab failed to stop.
 
I can kill the smpd  process from the Windows Task Manager
I will put the Debug output from "smpd -d"  below
 
mpiexec -validate -user USERNAME
   hangs at the command prompt without any further output
 
mpiexec -register
prompts for username and password which I can register and is confirmed as
"Password encrypted into the Registry."
 
a simple test like:
mpiexec  -timeout 20   -verbose   -n 1 ping
 
will just terminate due to the timeout
I will put the verbose output below  under the smpd debug
 
What am I missing?   Thanks for any assistance.
Regards,
Kim
 
smpd -install -d
[-1:5076]..\smpd_get_opt_int
[-1:5076]../smpd_get_opt_int
[-1:5076]..\smpd_get_opt
[-1:5076]../smpd_get_opt
[-1:5076]..\smpd_get_opt_int
[-1:5076]../smpd_get_opt_int
[-1:5076]..\smpd_get_opt_int
[-1:5076]../smpd_get_opt_int
[-1:5076]..\smpd_get_opt
[-1:5076]../smpd_get_opt
[-1:5076]..\smpd_get_opt
[-1:5076]../smpd_get_opt
[-1:5076]..\smpd_get_opt
[-1:5076]../smpd_get_opt
[-1:5076]..\smpd_get_opt
[-1:5076]../smpd_get_opt
[-1:5076]..\smpd_get_opt
[-1:5076]../smpd_get_opt
[-1:5076]..\smpd_get_opt
[-1:5076]../smpd_get_opt
[-1:5076]..\smpd_get_opt
[-1:5076]../smpd_get_opt
[-1:5076]..\smpd_get_opt
[-1:5076]../smpd_get_opt
MPICH2 Process Manager, Argonne National Lab removed.
[-1:5076]..\smpd_get_opt_string
[-1:5076]../smpd_get_opt_string
[-1:5076]..\smpd_get_win_opt_string
[-1:5076]../smpd_get_win_opt_string
[-1:5076]..\smpd_get_opt
[-1:5076]../smpd_get_opt
[-1:5076]..\smpd_get_opt_string
[-1:5076]../smpd_get_opt_string
[-1:5076]..\smpd_get_opt
[-1:5076]../smpd_get_opt
MPICH2 Process Manager, Argonne National Lab installed.
[-1:5076]..\smpd_set_smpd_data
[-1:5076]../smpd_set_smpd_data

 
C:\MSC.Software\Marc\2007r1\marc2007r1\mpich2\bin>mpiexec  -timeout 20   -verbose   -n 1 ping
..\smpd_get_full_path_name
...fixing up exe name: 'ping' -> '(null)'
../smpd_get_full_path_name
..handling executable:
ping.exe
..\smpd_get_next_host
...\smpd_get_next_hostname
.../smpd_get_next_hostname
...\smpd_get_host_id
.../smpd_get_host_id
../smpd_get_next_host
..\smpd_create_cliques
...\next_launch_node
.../next_launch_node
...\next_launch_node
.../next_launch_node
../smpd_create_cliques
..\smpd_fix_up_host_tree
../smpd_fix_up_host_tree
./mp_parse_command_args
.host tree:
. host: l30544.na.mscsoftware.com, parent: 0, id: 1
.launch nodes:
. iproc: 0, id: 1, exe: ping.exe
.\smpd_create_context
..\smpd_init_context
...\smpd_init_command
.../smpd_init_command
../smpd_init_context
./smpd_create_context
.\smpd_make_socket_loop
..\smpd_get_hostname
../smpd_get_hostname
./smpd_make_socket_loop
.\smpd_create_context
..\smpd_init_context
...\smpd_init_command
.../smpd_init_command
../smpd_init_context
./smpd_create_context
.\smpd_enter_at_state
..sock_waiting for the next event.
..SOCK_OP_CONNECT
..\smpd_handle_op_connect
...connect succeeded, posting read of the challenge string
../smpd_handle_op_connect
..sock_waiting for the next event.
..
mpiexec terminated job due to 20 second timeout.
 
mpiexec  -timeout 100   -verbose   -n 1 ping
..\smpd_get_full_path_name
...fixing up exe name: 'ping' -> '(null)'
../smpd_get_full_path_name
..handling executable:
ping.exe
..\smpd_get_next_host
...\smpd_get_next_hostname
.../smpd_get_next_hostname
...\smpd_get_host_id
.../smpd_get_host_id
../smpd_get_next_host
..\smpd_create_cliques
...\next_launch_node
.../next_launch_node
...\next_launch_node
.../next_launch_node
../smpd_create_cliques
..\smpd_fix_up_host_tree
../smpd_fix_up_host_tree
./mp_parse_command_args
.host tree:
. host: l30544.na.xxx.com, parent: 0, id: 1
.launch nodes:
. iproc: 0, id: 1, exe: ping.exe
.\smpd_create_context
..\smpd_init_context
...\smpd_init_command
.../smpd_init_command
../smpd_init_context
./smpd_create_context
.\smpd_make_socket_loop
..\smpd_get_hostname
../smpd_get_hostname
./smpd_make_socket_loop
.\smpd_create_context
..\smpd_init_context
...\smpd_init_command
.../smpd_init_command
../smpd_init_context
./smpd_create_context
.\smpd_enter_at_state
..sock_waiting for the next event.
..SOCK_OP_CONNECT
..\smpd_handle_op_connect
...connect succeeded, posting read of the challenge string
../smpd_handle_op_connect
..sock_waiting for the next event.
..
mpiexec terminated job due to 100 second timeout.
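
Both traces stop right after "connect succeeded, posting read of the
challenge string", so the connection is made but nothing seems to come back.
A rough way to test that outside of mpiexec is sketched below in Python. It
does not speak the smpd protocol; it only connects and waits for the first
bytes. The function name probe_listener is made up for illustration, and the
port has to be supplied by the caller since it is not stated in this thread.

```python
import socket


def probe_listener(host, port, timeout=5.0):
    """Connect to host:port and wait for the first bytes the listener sends.

    Returns (status, data) where status is one of:
      "unreachable"  the TCP connection could not be made
      "silent"       connected, but nothing arrived within the timeout
      "closed"       connected, but the peer closed without sending
      "data"         connected and received some bytes
    """
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return "unreachable", b""
    with conn:
        try:
            data = conn.recv(256)
        except socket.timeout:  # must come before OSError, its parent class
            return "silent", b""
        except OSError:
            return "closed", b""
    if data:
        return "data", data
    return "closed", b""
```

A "silent" result against the smpd port would match what the traces show:
the connect succeeds and the read of the challenge string never completes.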
 