[mpich-discuss] problem with the port number when "the reconnect request"

Jayesh Krishna jayesh at mcs.anl.gov
Mon Sep 20 14:59:39 CDT 2010


Hi,
 So were you able to use the SMPD_PORT_RANGE environment variable to specify the port range and successfully launch your MPI processes?
 Do you need a command-line option to specify this range?

Regards,
Jayesh
----- Original Message -----
From: "Wei Lu" <weilu at microsoft.com>
To: "Jayesh Krishna" <jayesh at mcs.anl.gov>
Cc: mpich-discuss at mcs.anl.gov
Sent: Monday, September 20, 2010 2:45:04 PM GMT -06:00 US/Canada Central
Subject: RE: [mpich-discuss] problem with the port number when "the reconnect request"

SMPD_PORT_RANGE works!

Thanks a lot, 

Wei

-----Original Message-----
From: Jayesh Krishna [mailto:jayesh at mcs.anl.gov] 
Sent: Monday, September 20, 2010 12:08 PM
To: Wei Lu
Cc: mpich-discuss at mcs.anl.gov
Subject: Re: [mpich-discuss] problem with the port number when "the reconnect request"

Hi,
 Would limiting the TCP ports used by SMPD to a range (SMPD_PORT_RANGE or -port=a,b) work for you?
 If so, I can file a feature request for it.
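
For illustration only, restricting a listener to a port range amounts to something like the sketch below. It is not SMPD's implementation: the function name bind_in_range and the range 18000-18010 are hypothetical, and the syntax that SMPD_PORT_RANGE actually expects is not shown in this thread.

```python
import socket


def bind_in_range(low, high, host="127.0.0.1"):
    """Return a listening TCP socket bound to the first free port in [low, high].

    Hypothetical helper for illustration; it only mimics the idea of a
    listener that may pick ports from a restricted range.
    """
    for port in range(low, high + 1):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, port))
            sock.listen(1)
            return sock
        except OSError:
            # Port is busy or not permitted; release the socket and try the next.
            sock.close()
    raise OSError(f"no free port in {low}-{high}")


if __name__ == "__main__":
    listener = bind_in_range(18000, 18010)
    print("listening on port", listener.getsockname()[1])
    listener.close()
```

With a scheme like this, the firewall or router only needs to allow the chosen range rather than every dynamically assigned port.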

Regards,
Jayesh
----- Original Message -----
From: "Wei Lu" <weilu at microsoft.com>
To: "Jayesh Krishna" <jayesh at mcs.anl.gov>, mpich-discuss at mcs.anl.gov
Sent: Monday, September 20, 2010 1:31:36 PM GMT -06:00 US/Canada Central
Subject: RE: [mpich-discuss] problem with the port number when "the reconnect request"

Hi, Jayesh

  I understand the firewall issue described in the Guide. However, my machines are in a special environment where the router imposes its own restrictions on ports, so opening the local firewall alone doesn't work.

Thanks
Wei

-----Original Message-----
From: Jayesh Krishna [mailto:jayesh at mcs.anl.gov] 
Sent: Monday, September 20, 2010 11:27 AM
To: mpich-discuss at mcs.anl.gov
Cc: Wei Lu
Subject: Re: [mpich-discuss] problem with the port number when "the reconnect request"

Hi,
 Please follow the recommendations in Section 9.5 of the Windows Developer's Guide (available at http://www.mcs.anl.gov/research/projects/mpich2/documentation/files/mpich2-1.2.1-windevguide.pdf).
 Let us know if it works for you.

Regards,
Jayesh
----- Original Message -----
From: "Wei Lu" <weilu at microsoft.com>
To: mpich-discuss at mcs.anl.gov
Sent: Monday, September 20, 2010 1:21:07 PM GMT -06:00 US/Canada Central
Subject: [mpich-discuss] problem with the port number when "the reconnect request"





Hi, 

I am running mpich2 on a couple of Windows Server machines that have firewall restrictions: only a few ports are accessible on each machine.

So I explicitly started smpd listening on port 18000, which the firewall allows, with "smpd.exe -p 18000".

However, when I run mpiexec on another machine with "mpiexec.exe -hosts 1 10.115.130.153 1 -p 18000 hostname", mpiexec hangs until it times out.

From the smpd log (listed below), it seems that smpd issues a "reconnect request" and then spawns a new smpd manager process, which listens on a different port (e.g. 56412; 54614 in the log below) that unfortunately is not allowed by the firewall.

So my questions are:

Why does this happen? And if it is unavoidable, how can I limit the ports that are used?



Thank you very much 

Wei 







D:\Program Files\MPICH2\bin>smpd.exe -d 5 -p 18000 

created a set for the listener: 500 

smpd listening on port 18000 

Unable to get the data for the key 'no_dynamic_hosts' 

sock_waiting for the next event. 

SOCK_OP_ACCEPT event.error = 0, result = 0, context=listener 

authenticating new connection 

posting a write of the challenge string: 1.2.1p1 9116 

sock_waiting for the next event. 

SOCK_OP_WRITE event.error = 0, result = 0, context=undetermined 

wrote challenge string: '1.2.1p1 9116' 

sock_waiting for the next event. 

SOCK_OP_READ event.error = 0, result = 0, context=undetermined 

read challenge response: '0ddcd0a20b925177af7e47a03eb402dd' 

sock_waiting for the next event. 

SOCK_OP_WRITE event.error = 0, result = 0, context=undetermined 

wrote connect result: 'SUCCESS' 

sock_waiting for the next event. 

SOCK_OP_READ event.error = 0, result = 0, context=undetermined 

read session request: 'process' 

sock_waiting for the next event. 

SOCK_OP_WRITE event.error = 0, result = 0, context=undetermined 

wrote no cred request: 'nocredentials' 

starting command: "D:\Program Files\MPICH2\bin\smpd.exe" -p 18000 -d 5 -mgr -read 00000000000002C0 -write 00000000000002BC

CreateProcess 

smpd reading the port string from the manager 

manager creating listener and session sets. 

created set for manager listener, 764 

smpd manager listening on port 54614 

manager writing port back to smpd. 

smpd sending the account to the manager 

smpd sending the password to the manager 

smpd sending the smpd passphrase to the manager 

closing the pipe to the manager 

smpd writing reconnect request: port 54614 

sock_waiting for the next event. 

SOCK_OP_WRITE event.error = 0, result = 0, context=undetermined 

wrote reconnect request: '54614' 

sock_waiting for the next event. 

manager reading account and password from smpd. 

sock_waiting for the next event. 

SOCK_OP_CLOSE event.error = 0, result = 0, context=undetermined 

op_close received - SMPD_CLOSING state. 

Unaffiliated undetermined context closing. 

freeing undetermined context. 

sock_waiting for the next event. 
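
The log above shows two listeners: the main smpd on port 18000 and the spawned manager on port 54614, which is the port sent back in the reconnect request. As a rough diagnostic sketch (the helper names and the allowed-port set are hypothetical, and the line format is assumed to match the "listening on port" lines in this log), a short script can pull those ports out of "smpd -d" output and flag any that the firewall would block:

```python
import re

# Matches lines such as "smpd listening on port 18000" and
# "smpd manager listening on port 54614" from the debug output.
PORT_LINE = re.compile(r"^(smpd(?: manager)?) listening on port (\d+)")


def listener_ports(log_text):
    """Return (role, port) pairs for each 'listening on port' line."""
    found = []
    for line in log_text.splitlines():
        match = PORT_LINE.match(line.strip())
        if match:
            found.append((match.group(1), int(match.group(2))))
    return found


def blocked_ports(log_text, allowed):
    """Return the listeners whose port is not in the allowed set."""
    return [(role, port) for role, port in listener_ports(log_text)
            if port not in allowed]


# Three lines copied from the log in this message.
SAMPLE = """\
smpd listening on port 18000
smpd manager listening on port 54614
smpd writing reconnect request: port 54614
"""

if __name__ == "__main__":
    # Assumes only port 18000 is open, as described in this message.
    print(blocked_ports(SAMPLE, allowed={18000}))
    # prints [('smpd manager', 54614)]
```

Any port reported here is one that mpiexec will be asked to reconnect to, so it has to be reachable through both the local firewall and the router.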