[mpich-discuss] networking problems with mpich2-1.3

Saurav Pathak saurav at sas.upenn.edu
Thu Nov 11 11:38:55 CST 2010


I located some firewall rules and disabled them.  Now it works.
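
For anyone hitting the same "check for firewalls!" message: in the logs quoted below, the Hydra proxy on comp2 tries to connect back to comp1 on a port that differed between the two runs (38203 and 56740), so a setup where ssh works can still fail here. The following is a minimal Python sketch, not part of MPICH, for testing one TCP port from the other machine. The function name `probe` and its return strings are made up for illustration. A "timeout" result is consistent with packets being silently dropped, as a firewall rule might do, while "refused" usually just means nothing is listening on that port.

```python
import socket


def probe(host, port, timeout=3.0):
    """Try one TCP connection and classify the outcome as a short string."""
    try:
        # create_connection resolves the host name and connects,
        # giving up after `timeout` seconds.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on this port.
        return "refused"
    except (socket.timeout, TimeoutError):
        # No answer at all within the timeout (packets possibly dropped).
        return "timeout"
    except OSError as exc:
        # Anything else, e.g. name resolution failure or no route to host.
        return "error: %s" % exc
```

To try it, start any listener on comp1 on a high port and call `probe("comp1", port)` from comp2, then repeat in the other direction. The host names and port are whatever applies to your own setup.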

Thanks!
Saurav
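
The /etc/hosts fix that Manhui mentions below is not spelled out in this thread. One common cause, assumed here and not confirmed by the thread, is a machine's own host name being mapped to a loopback address such as 127.0.1.1, which can lead other nodes to be handed an address that only works locally. A small Python sketch for spotting such entries follows. `loopback_names` and its `ignore` default are hypothetical names chosen for illustration.

```python
import ipaddress


def loopback_names(hosts_text,
                   ignore=("localhost", "ip6-localhost", "ip6-loopback")):
    """Return host names that /etc/hosts-style text maps to a loopback address."""
    found = []
    for line in hosts_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        try:
            addr = ipaddress.ip_address(fields[0])
        except ValueError:
            # Not a plain IP address in the first column; skip the line.
            continue
        if addr.is_loopback:
            # Report every name on the line except the expected local aliases.
            found.extend(name for name in fields[1:] if name not in ignore)
    return found
```

Calling it on the contents of /etc/hosts on each machine, for example `loopback_names(open("/etc/hosts").read())`, lists any names worth a second look. An empty list does not rule out other name-resolution problems.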

Saurav Pathak wrote:
> Manhui,
>
> Thank you for the solution, which partially solved my problem.  From 
> comp2, I could run cpi cleanly. But from comp1, I get this message:
>
> ------------
> [proxy:0:1@comp2] HYDU_sock_connect 
> (/home/saurav/local/src/mpich2-1.3/src/pm/hydra/utils/sock/sock.c:151): 
> connect error (Connection timed out)
> [proxy:0:1@comp2] main 
> (/home/saurav/local/src/mpich2-1.3/src/pm/hydra/pm/pmiserv/pmip.c:204): 
> unable to connect to server comp1 at port 56740 (check for firewalls!)
> -----------
>
> I have looked at iptables -L, and hosts.deny on comp1, but I can't 
> seem to find the problem.
> When on comp2 I try "nmap comp1" I get an error message:
> Note: Host seems down. If it is really up, but blocking our ping 
> probes, try -PN
>
> When I do use the -PN option, I get a result analogous to the one 
> I get running "nmap comp2" on comp1.  So I am guessing there is some 
> network configuration on comp1 that I haven't been able to put my 
> finger on.
>
> Saurav
>
>
> Manhui Wang wrote:
>> Saurav,
>>
>> I ran into a similar problem before, but it was resolved by changing
>> /etc/hosts. See the previous discussion:
>>
>> http://lists.mcs.anl.gov/pipermail/mpich-discuss/2010-July/007469.html
>>
>> Best wishes,
>> Manhui
>>
>> Saurav Pathak wrote:
>>  
>>> Hi,
>>>
>>> I am trying to set up two computers for running MPI.  I have compiled
>>> mpich2-1.3 from source, and have run the following without any incident
>>> on both machines (Ubuntu 9.04):
>>>
>>> mpiexec -n 2 ./cpi
>>>
>>> But when I try to run it on two computers (comp1 and comp2), then I run
>>> into the following problems.
>>>
>>> On comp1 when I run "mpiexec -f hosts -n 2 ./cpi", I get the following
>>> error.
>>> ----
>>> [proxy:0:1@comp2] HYDU_sock_connect
>>> (/home/saurav/local/src/mpich2-1.3/src/pm/hydra/utils/sock/sock.c:151):
>>> connect error (Connection timed out)
>>> [proxy:0:1@comp2] main
>>> (/home/saurav/local/src/mpich2-1.3/src/pm/hydra/pm/pmiserv/pmip.c:204):
>>> unable to connect to server comp1 at port 38203 (check for firewalls!)
>>> ----
>>>
>>> I have checked for firewalls via "sudo  /sbin/iptables -L" on both
>>> computers, and there are no firewalls.
>>>
>>> When I execute "mpiexec -f hosts -n 2 ./cpi" on comp2, I get the
>>> following output:
>>> ----
>>> Process 1 of 2 is on comp2
>>> Process 0 of 2 is on comp1
>>> Fatal error in PMPI_Bcast: Other MPI error, error stack:
>>> PMPI_Bcast(1306)..................: MPI_Bcast(buf=0x7ffff6b7465c,
>>> count=1, MPI_INT, root=0, MPI_COMM_WORLD) failed
>>> MPIR_Bcast_impl(1150).............:
>>> MPIR_Bcast_intra(1021)............:
>>> MPIR_Bcast_binomial(187)..........:
>>> MPIC_Send(66).....................:
>>> MPIC_Wait(528)....................:
>>> MPIDI_CH3I_Progress(333)..........:
>>> MPID_nem_mpich2_blocking_recv(906):
>>> MPID_nem_tcp_connpoll(1861).......: Communication error with rank 1:
>>> APPLICATION TERMINATED WITH THE EXIT STRING: Hangup (signal 1)
>>> ----
>>>
>>> It looks like a networking issue, but I can't figure out what it is.  I
>>> can ssh and rsh from one machine to the other without the need for a
>>> password and run commands (e.g., I can run "ssh comp2 hostname" on comp1
>>> and vice versa).
>>>
>>> I seem to be at a dead end.  Any help on this issue will be greatly
>>> appreciated.
>>>
>>> Saurav
>>>
>>>
>>> _______________________________________________
>>> mpich-discuss mailing list
>>> mpich-discuss at mcs.anl.gov
>>> https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss
>>>     
>>


