[mpich-discuss] Error while connecting to host

jayesh at mcs.anl.gov jayesh at mcs.anl.gov
Fri Mar 12 09:35:52 CST 2010


Hi,
 With 1.3a1 (and 1.2.1p1) you should also be able to start MPI processes as described in the "Debugging jobs by starting them manually" section of the developer's guide. I verified on my local machine that this setup works with 1.2.1p1; it should work with the 1.3 series too. Please follow the steps below to start MPI processes manually on the local machine:

1) Open two command prompts, command_prompt_1 & command_prompt_2
2) On command_prompt_1 please set the PMI environment as follows,

   set PMI_ROOT_HOST=000520p80812.ad.hitachi-metals.co.jp
   set PMI_ROOT_PORT=9222
   set PMI_RANK=0
   set PMI_SIZE=2
   set PMI_KVS=mpich2
   set PMI_ROOT_LOCAL=1

3) On command_prompt_2 please set the PMI environment as follows,

   set PMI_ROOT_HOST=000520p80812.ad.hitachi-metals.co.jp
   set PMI_ROOT_PORT=9222
   set PMI_RANK=1
   set PMI_SIZE=2
   set PMI_KVS=mpich2
   set PMI_ROOT_LOCAL=1

4) Run your MPI program on command_prompt_1 by typing the executable name at the command prompt; do not use mpiexec to launch it. (Note: the process with PMI_RANK=0 should be started before the other ranks.)

5) Run your MPI process on command_prompt_2

 Let us know if it works.

(PS: The environment variable PMI_RANK determines the rank of the current MPI process, and the variable PMI_SIZE determines the total number of MPI processes in the job.)
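
As a rough illustration of the PS above, the sketch below checks that the PMI_RANK and PMI_SIZE strings are consistent before a manually started program calls MPI_Init. The helper names parse_nonneg_int and check_pmi_rank_size are hypothetical and are not part of MPICH2; only the two variable names come from this thread.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper (not an MPICH2 API): parse a non-negative decimal
 * integer. Returns 0 on success, -1 if s is NULL, empty, or malformed. */
int parse_nonneg_int(const char *s, int *out)
{
    char *end = NULL;
    long v;

    assert(out != NULL);
    if (s == NULL || *s == '\0')
        return -1;
    errno = 0;
    v = strtol(s, &end, 10);
    /* Reject overflow, trailing characters, and negative values. */
    if (errno != 0 || *end != '\0' || v < 0 || v > INT_MAX)
        return -1;
    *out = (int)v;
    return 0;
}

/* Hypothetical check: takes the raw PMI_RANK and PMI_SIZE strings (for
 * example from getenv) and returns 0 when 0 <= rank < size, -1 otherwise. */
int check_pmi_rank_size(const char *rank_str, const char *size_str)
{
    int rank, size;

    if (parse_nonneg_int(rank_str, &rank) != 0)
        return -1;
    if (parse_nonneg_int(size_str, &size) != 0)
        return -1;
    return (size > 0 && rank < size) ? 0 : -1;
}
```

A program could call check_pmi_rank_size(getenv("PMI_RANK"), getenv("PMI_SIZE")) before MPI_Init and print a hint when it returns -1, for instance when a variable was never set in that command prompt.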
Regards,
Jayesh


----- Original Message -----
From: "Takahiro Someji" <Takahiro_Someji at hitachi-metals.co.jp>
To: "Jayesh Krishna" <jayesh at mcs.anl.gov>
Cc: mpich-discuss at mcs.anl.gov
Sent: Friday, March 12, 2010 2:42:39 AM GMT -06:00 US/Canada Central
Subject: Re: [mpich-discuss] Error while connecting to host

Hi.
  With rev 1.3a1 installed, it worked well. (With rev 1.0.3, it also worked
normally when set up as described in the manual.)

  However, there is a problem with this method.
  The port number, domain name and kvs name change every time the PMI
server is started.
  It is troublesome to start a PMI server and then set these parameters
again every time.
  Is there a better method?
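
  One way to reduce the repeated typing (a sketch only, not an MPICH2 feature) is to copy the three values printed by "mpiexec -pmiserver" once and let a small helper generate the set commands for each rank. The function name format_pmi_env is hypothetical; the variable names are the ones used elsewhere in this thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of MPICH2): write the "set" commands for one
 * rank into buf, using the host, port and kvs values printed by
 * "mpiexec -pmiserver". Returns 0 on success, -1 if buf is too small. */
int format_pmi_env(char *buf, size_t buflen,
                   const char *host, const char *port, const char *kvs,
                   int rank, int size)
{
    int n;

    assert(buf != NULL && host != NULL && port != NULL && kvs != NULL);
    /* PMI_HOST mirrors PMI_ROOT_HOST, PMI_PORT mirrors PMI_ROOT_PORT and
     * PMI_DOMAIN mirrors PMI_KVS, as in the steps quoted in this thread. */
    n = snprintf(buf, buflen,
                 "set PMI_ROOT_HOST=%s\n"
                 "set PMI_HOST=%s\n"
                 "set PMI_ROOT_PORT=%s\n"
                 "set PMI_PORT=%s\n"
                 "set PMI_RANK=%d\n"
                 "set PMI_SIZE=%d\n"
                 "set PMI_KVS=%s\n"
                 "set PMI_DOMAIN=%s\n",
                 host, host, port, port, rank, size, kvs, kvs);
    return (n >= 0 && (size_t)n < buflen) ? 0 : -1;
}
```

  The generated text could be saved as a batch file per rank and run in each command prompt before starting the executable. The values still have to be refreshed whenever the PMI server is restarted.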

>   Let us know if it works. Meanwhile, why do you want to start your MPI processes manually ?
  I am developing an image-processing system using MPI.
  To obtain high performance, I adopted distributed processing across
multiple PCs with MPI.
  The program is being developed by many people, and its contents differ
for every PC (a different program runs on each PC).
  Therefore, everybody debugs individually. In the end, everything has to
be built into one application.
  At execution time, all processes are started at once from a host. However,
when debugging, I want to start them separately and still perform MPI
communication.
  If there is a better idea, please let me know.

Regards,
Someji

(2010/03/12 13:48), Jayesh Krishna wrote:
> Hi,
>   You are very close. When specifying the environment for MPI processes using SMPD as the process manager you should not specify the PMI_LOCAL environment variable.
>   Try the following steps to start MPI processes manually,
>
> 1) Run "mpiexec -pmiserver 2" on the command prompt. The command outputs the host, port and kvs information, in that order.
>
>   >D:\temp>mpiexec -pmiserver 2
>   >000520P80812.ad.hitachi-metals.co.jp (set PMI_HOST & PMI_ROOT_HOST to this hostname)
>   >2612 (set PMI_PORT & PMI_ROOT_PORT to this port number)
>   >65B8416A-4669-436b-A342-9FD0FF3357F9 (set PMI_KVS & PMI_DOMAIN to this kvs name)
>
> 2) Now open two command prompts - command_prompt_1 & command_prompt_2
>
> 3) On command_prompt_1 set the PMI environment as follows,
>   >set PMI_ROOT_HOST=000520p80812.ad.hitachi-metals.co.jp
>   >set PMI_HOST=000520p80812.ad.hitachi-metals.co.jp
>   >set PMI_ROOT_PORT=2612
>   >set PMI_PORT=2612
>   >set PMI_RANK=0
>   >set PMI_SIZE=2
>   >set PMI_KVS=65B8416A-4669-436b-A342-9FD0FF3357F9
>   >set PMI_DOMAIN=65B8416A-4669-436b-A342-9FD0FF3357F9
>
> 4) On command_prompt_2 set the PMI environment as follows,
>   >set PMI_ROOT_HOST=000520p80812.ad.hitachi-metals.co.jp
>   >set PMI_HOST=000520p80812.ad.hitachi-metals.co.jp
>   >set PMI_ROOT_PORT=2612
>   >set PMI_PORT=2612
>   >set PMI_RANK=1
>   >set PMI_SIZE=2
>   >set PMI_KVS=65B8416A-4669-436b-A342-9FD0FF3357F9
>   >set PMI_DOMAIN=65B8416A-4669-436b-A342-9FD0FF3357F9
>
> 5) Now run the MPI program (by typing the name of the executable at the command prompt) on command_prompt_1 and command_prompt_2
>
>   If the above steps don't work, please try the preview release of MPICH2 (1.3a1 - there have been several bug fixes added after the 1.2.1p1 release) and see if it works.
>   Let us know if it works. Meanwhile, why do you want to start your MPI processes manually?
>
> Regards,
> Jayesh
>
> ----- Original Message -----
> From: "Takahiro Someji"<Takahiro_Someji at hitachi-metals.co.jp>
> To: jayesh at mcs.anl.gov
> Cc: mpich-discuss at mcs.anl.gov
> Sent: Thursday, March 11, 2010 10:30:39 PM GMT -06:00 US/Canada Central
> Subject: Re: [mpich-discuss] Error while connecting to host
>
> Hello.
>    Sorry, I was not able to understand your explanation completely. Since
> I do not know PMI well, please explain in more detail.
>    If there is another way to launch processes manually without using
> PMI, please let me know.
>
>    Well, I tried it.
>
> I ran the following in one command window.
>   >D:\temp>mpiexec -pmiserver 2
>   >000520P80812.ad.hitachi-metals.co.jp
>   >2612
>   >65B8416A-4669-436b-A342-9FD0FF3357F9
>
> Next, in two other windows (Host & Sub windows), I set up the PMI
> environment as shown below.
>   >set PMI_ROOT_HOST=000520p80812.ad.hitachi-metals.co.jp
>   >set PMI_HOST=000520p80812.ad.hitachi-metals.co.jp
>   >set PMI_ROOT_PORT=2612
>   >set PMI_PORT=2612
>   >set PMI_LOCAL=1
>   >set PMI_RANK=%1
>   >set PMI_SIZE=2
>   >set PMI_KVS=65B8416A-4669-436b-A342-9FD0FF3357F9
>   >set PMI_DOMAIN=65B8416A-4669-436b-A342-9FD0FF3357F9
>
> Next, I started the program with the command "program.exe" in the Host window & Sub
> window.
> The program then terminated with a fatal error (access violation).
>     (Is this a fundamental problem?)
>
> Next, I set PMI_ROOT_PORT and PMI_PORT to 9222.
> The result was then "ERROR:Error while connecting to host......".
>
> What is wrong?
>
> Regards,
> Someji
>
>
> (2010/03/12 1:05), jayesh at mcs.anl.gov wrote:
>    
>> Hi,
>>    Try also setting the PMI_DOMAIN value to the PMI_KVS value.
>>    If you just want to launch your processes manually and still use SMPD as a PMI server, I would recommend using the "-pmiserver" option of mpiexec. Running "mpiexec -pmiserver 2" starts an instance of SMPD and prints out the port/kvs/domain values; you can then launch your MPI processes with the provided PMI environment. (I set the PMI_SIZE, PMI_RANK, PMI_KVS, PMI_DOMAIN, PMI_PORT, PMI_ROOT_PORT, PMI_HOST and PMI_ROOT_HOST environment variables.)
>>    Let us know if it works for you.
>> Regards,
>> Jayesh
>>
>> (PS: When connecting to the process manager the default values of the kvs won't work.)
>> ----- Original Message -----
>> From: "Takahiro Someji"<Takahiro_Someji at hitachi-metals.co.jp>
>> To:jayesh at mcs.anl.gov,mpich-discuss at mcs.anl.gov
>> Sent: Wednesday, March 10, 2010 6:53:22 PM GMT -06:00 US/Canada Central
>> Subject: Re: [mpich-discuss] Error while connecting to host
>>
>> Hello.
>>
>>     I tried your proposal. However, the result did not change.
>>
>>     I also tried setting PMI_ROOT_PORT=8676, the smpd default port number, in
>> the Host window & Sub window.
>>     The program then terminated with a fatal error.
>>     Is communication with smpd being blocked? Or is this a phenomenon unique to
>> the Japanese OS?
>>
>> Regards,
>> Someji
>>
>> (2010/03/11 0:08),jayesh at mcs.anl.gov  wrote:
>>
>>      
>>> Hi,
>>>     Try adding PMI_HOST (with value of PMI_ROOT_HOST) and PMI_PORT (with value of PMI_ROOT_PORT) into your PMI environment and see if it works. Make sure that you start the process with rank=0 before other processes.
>>>     Are you trying to debug your program? (Another way to debug your code would be to attach to the MPI process using a debugger.)
>>>
>>> Regards,
>>> Jayesh
>>> ----- Original Message -----
>>> From: "染次 孝博"<Takahiro_Someji at hitachi-metals.co.jp>
>>> To:mpich-discuss at mcs.anl.gov
>>> Sent: Tuesday, March 9, 2010 11:51:40 PM GMT -06:00 US/Canada Central
>>> Subject: [mpich-discuss] Error while connecting to host
>>>
>>>
>>> Hi.
>>>
>>> I am developing a program that is started manually. (Windows XP SP3, MPICH2-1.2.1p1, Visual Studio 2008 SP1, C++)
>>> The program is very simple, as shown below.
>>>
>>> ************************
>>> *#include <stdio.h>
>>> *#include <mpi.h>
>>> *
>>> *int main(int argc, char* argv[])
>>> *{
>>> *    int numprocs, myid, namelen, i;
>>> *    char processor_name[MPI_MAX_PROCESSOR_NAME];
>>> *
>>> *    /* Initialize MPI, then query the job size, this rank and the host name */
>>> *    MPI_Init(&argc, &argv);
>>> *    MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
>>> *    MPI_Comm_rank(MPI_COMM_WORLD, &myid);
>>> *    MPI_Get_processor_name(processor_name, &namelen);
>>> *
>>> *    /* Print the command-line arguments and the values reported by MPI */
>>> *    for (i = 0; i < argc; i++)
>>> *    {
>>> *        printf("arg%d:%s\n", i, argv[i]);
>>> *    }
>>> *    printf("numprocs:%d\n", numprocs);
>>> *    printf("myid:%d\n", myid);
>>> *    printf("processor name:%s\n", processor_name);
>>> *
>>> *    MPI_Finalize();
>>> *    return 0;
>>> *}
>>> ************************
>>>
>>> The output of the program when run with "mpiexec -n 2 program.exe" is shown below.
>>>
>>> *************************
>>> *arg0:D:\temp\MPITest1\Debug\mpitest1.exe
>>> *numprocs:2
>>> *myid:1
>>> *processor name:HOSTPC
>>> *arg0:D:\temp\MPITest1\Debug\mpitest1.exe
>>> *numprocs:2
>>> *myid:0
>>> *processor name:HOSTPC
>>> *************************
>>>
>>> Next, I tried to start the two programs manually on one PC.
>>> Following the manual section "Debugging jobs by starting them manually", I set up the environment as follows.
>>> In Host window:
>>> set PMI_ROOT_HOST=HOSTPC
>>> set PMI_RANK=0
>>> set PMI_ROOT_PORT=9222
>>> set PMI_ROOT_LOCAL=1
>>> set PMI_SIZE=2
>>> set PMI_KVS=mpich2
>>> In Sub window:
>>> set PMI_ROOT_HOST=HOSTPC
>>> set PMI_RANK=1
>>> :
>>> : same as Host
>>>
>>> Then, I started this program with the command "program.exe" in the Host window & Sub window.
>>> As a result, the following errors occurred and it did not work.
>>>
>>> [01:3156]..ERROR:Error while connecting to host,
>>> Can not connected because refused target computer (10061)
>>> [01:3156]..ERROR:Connect on sock (host=000520p80812.ad.hitachi-metals.co.jp, port=9222) failed, exhaused all end points
>>> SMPDU_Sock_post_connect failed.
>>> [0] PMI_ConnectToHost failed: unable to post a connect to 000520p80812.ad.hitachi-metals.co.jp:9222, error: Undefined dynamic error code
>>> uPMI_ConnectToHost returning PMI_FAIL
>>> [0] PMI_Init failed.
>>> Fatal error in MPI_Init: Other MPI error, error stack:
>>> MPIR_Init_thread(394): Initialization failed
>>> MPID_Init(103).......: channel initialization failed
>>> MPID_Init(374).......: PMI_Init returned -1
>>>
>>>
>>> Please let me know the solution.
>>>
>>> -- Thank you.
>>>
>>> -----------------------------------------------
>>> Takahiro Someji , Senior Engineer
>>>
>>> Hitachi Metals Ltd. Production System Lab.
>>> 6010, Mikajiri
>>> Kumagaya city,Saitama pref. JAPAN
>>> zip: 360-0843
>>>
>>> phone: +81-485-31-1720
>>> fax: +81-485-33-3398
>>> eMail:takahiro_someji at hitachi-metals.co.jp
>>> web:http://www.hitachi-metals.co.jp
>>> -----------------------------------------------
>>>        
>
>    



