[mpich-discuss] Error while connecting to host
Takahiro Someji
Takahiro_Someji at hitachi-metals.co.jp
Thu Mar 11 22:30:39 CST 2010
Hello.
Sorry, I was not able to understand your explanation completely. Since
I do not know much about PMI, could you please explain it in more detail?
If there is another way to launch processes manually without using
PMI, please let me know.
Well, I tried.
In one command window, I ran the following:
>D:\temp>mpiexec -pmiserver 2
>000520P80812.ad.hitachi-metals.co.jp
>2612
>65B8416A-4669-436b-A342-9FD0FF3357F9
Next, in two other windows (a Host window and a Sub window), I set up
the PMI environment as shown below, using the host name, port, and KVS
values printed above.
>set PMI_ROOT_HOST=000520p80812.ad.hitachi-metals.co.jp
>set PMI_HOST=000520p80812.ad.hitachi-metals.co.jp
>set PMI_ROOT_PORT=2612
>set PMI_PORT=2612
>set PMI_LOCAL=1
>set PMI_RANK=%1
>set PMI_SIZE=2
>set PMI_KVS=65B8416A-4669-436b-A342-9FD0FF3357F9
>set PMI_DOMAIN=65B8416A-4669-436b-A342-9FD0FF3357F9
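(Since "%1" is only expanded inside a batch file, I keep these set commands in a
small batch file and pass the rank as the first argument. Roughly, it looks like
the sketch below; setpmi.bat is just an example name, and the host, port, and
KVS/domain values are the ones printed by "mpiexec -pmiserver 2" above, so they
have to be copied in again for every new run.)

@echo off
rem setpmi.bat  (example name) - set the PMI environment for one manually started process.
rem Usage in each command window:  setpmi.bat <rank>      e.g.  setpmi.bat 0
rem The host, port and KVS/domain values below were printed by "mpiexec -pmiserver 2".
set PMI_ROOT_HOST=000520p80812.ad.hitachi-metals.co.jp
set PMI_HOST=000520p80812.ad.hitachi-metals.co.jp
set PMI_ROOT_PORT=2612
set PMI_PORT=2612
set PMI_LOCAL=1
set PMI_RANK=%1
set PMI_SIZE=2
set PMI_KVS=65B8416A-4669-436b-A342-9FD0FF3357F9
set PMI_DOMAIN=65B8416A-4669-436b-A342-9FD0FF3357F9

In my case the Host window would use "setpmi.bat 0" and the Sub window
"setpmi.bat 1" before program.exe is started, as described next.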
Next, I started the program with the command "program.exe" in both the
Host window and the Sub window.
Then the program terminated with a fatal error (access violation).
(Is this a fundamental problem?)
Next, I set PMI_ROOT_PORT and PMI_PORT to 9222.
Then the result was "ERROR:Error while connecting to host......".
What is wrong?
Regards,
Someji
(2010/03/12 1:05), jayesh at mcs.anl.gov wrote:
> Hi,
> Try also setting the PMI_DOMAIN value to the PMI_KVS value.
> If you just want to start your programs manually (launch processes manually) and still use SMPD as a PMI server, I would recommend using the "-pmiserver" option of mpiexec. Running "mpiexec -pmiserver 2" starts an instance of SMPD and prints out the port/kvs/domain values, and you can then launch your MPI processes with the provided PMI environment (I set the PMI_SIZE, PMI_RANK, PMI_KVS, PMI_DOMAIN, PMI_PORT, PMI_ROOT_PORT, PMI_HOST, PMI_ROOT_HOST env vars).
> Let us know if it works for you.
> Regards,
> Jayesh
>
> (PS: When connecting to the process manager the default values of the kvs won't work.)
> ----- Original Message -----
> From: "Takahiro Someji"<Takahiro_Someji at hitachi-metals.co.jp>
> To: jayesh at mcs.anl.gov, mpich-discuss at mcs.anl.gov
> Sent: Wednesday, March 10, 2010 6:53:22 PM GMT -06:00 US/Canada Central
> Subject: Re: [mpich-discuss] Error while connecting to host
>
> Hello.
>
> I tried your proposal. However, the result did not change.
>
> I tried setting PMI_ROOT_PORT=8676 (the smpd default port number) in the
> Host window & Sub window.
> Then the program terminated with the fatal error.
> Is communication with smpd blocked? Or is this a phenomenon unique to the
> Japanese OS?
>
> Regards,
> Someji
>
> (2010/03/11 0:08), jayesh at mcs.anl.gov wrote:
>
>> Hi,
>> Try adding PMI_HOST (with value of PMI_ROOT_HOST) and PMI_PORT (with value of PMI_ROOT_PORT) into your PMI environment and see if it works. Make sure that you start the process with rank=0 before other processes.
>> Are you trying to debug your program? (Another way to debug your code would be to attach to the MPI process using a debugger.)
>>
>> Regards,
>> Jayesh
>> ----- Original Message -----
>> From: "染次 孝博"<Takahiro_Someji at hitachi-metals.co.jp>
>> To: mpich-discuss at mcs.anl.gov
>> Sent: Tuesday, March 9, 2010 11:51:40 PM GMT -06:00 US/Canada Central
>> Subject: [mpich-discuss] Error while connecting to host
>>
>>
>> Hi.
>>
>> I am developing a program that I start manually. (Windows XP SP3, MPICH2-1.2.1p1, Visual Studio 2008 SP1, C++)
>> The program is very simple, as shown below.
>>
>> ************************
>> *#include <mpi.h>
>> *#include <stdio.h>
>> *
>> *int main(int argc, char* argv[])
>> *{
>> *    int numprocs, myid, namelen, i;
>> *    char processor_name[MPI_MAX_PROCESSOR_NAME];
>> *
>> *    MPI_Init(&argc, &argv);
>> *    MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
>> *    MPI_Comm_rank(MPI_COMM_WORLD, &myid);
>> *    MPI_Get_processor_name(processor_name, &namelen);
>> *
>> *    for (i = 0; i < argc; i++)
>> *    {
>> *        printf("arg%d:%s\n", i, argv[i]);
>> *    }
>> *    printf("numprocs:%d\n", numprocs);
>> *    printf("myid:%d\n", myid);
>> *    printf("processor name:%s\n", processor_name);
>> *
>> *    MPI_Finalize();
>> *    return 0;
>> *}
>> ************************
>>
>> The result of running the program with "mpiexec -n 2 program.exe" is below.
>>
>> *************************
>> *arg0:D:\temp\MPITest1\Debug\mpitest1.exe
>> *numprocs:2
>> *myid:1
>> *processor name:HOSTPC
>> *arg0:D:\temp\MPITest1\Debug\mpitest1.exe
>> *numprocs:2
>> *myid:0
>> *processor name:HOSTPC
>> *************************
>>
>> Next, I tried to start the two programs manually on one PC.
>> Following the manual section "Debugging jobs by starting them manually", I set things up as follows.
>> In Host window:
>> set PMI_ROOT_HOST=HOSTPC
>> set PMI_RANK=0
>> set PMI_ROOT_PORT=9222
>> set PMI_ROOT_LOCAL=1
>> set PMI_SIZE=2
>> set PMI_KVS=mpich2
>> In Sub window:
>> set PMI_ROOT_HOST=HOSTPC
>> set PMI_RANK=1
>> :
>> : same as Host
>>
>> Then I started this program with the command "program.exe" in both the Host window and the Sub window.
>> As a result, the following errors occurred and it did not work.
>>
>> [01:3156]..ERROR:Error while connecting to host,
>> Can not connected because refused target computer (10061)
>> [01:3156]..ERROR:Connect on sock (host=000520p80812.ad.hitachi-metals.co.jp, port=9222) failed, exhaused all end points
>> SMPDU_Sock_post_connect failed.
>> [0] PMI_ConnectToHost failed: unable to post a connect to 000520p80812.ad.hitachi-metals.co.jp:9222, error: Undefined dynamic error code
>> uPMI_ConnectToHost returning PMI_FAIL
>> [0] PMI_Init failed.
>> Fatal error in MPI_Init: Other MPI error, error stack:
>> MPIR_Init_thread(394): Initialization failed
>> MPID_Init(103).......: channel initialization failed
>> MPID_Init(374).......: PMI_Init returned -1
>>
>>
>> Please let me know the solution.
>>
>> -- Thank you.
>>
>> -----------------------------------------------
>> Takahiro Someji , Senior Engineer
>>
>> Hitachi Metals Ltd. Production System Lab.
>> 6010, Mikajiri
>> Kumagaya City, Saitama Pref., JAPAN
>> zip: 360-0843
>>
>> phone: +81-485-31-1720
>> fax: +81-485-33-3398
>> eMail: takahiro_someji at hitachi-metals.co.jp
>> web: http://www.hitachi-metals.co.jp
>> -----------------------------------------------