[Swift-devel] [ZeptoOS] hostname returns none in Surveyor

Jonathan Monette jonmon at mcs.anl.gov
Sun Mar 4 13:52:31 CST 2012


Ah, I thought the workers provided the service with the IP to communicate with. That makes sense.

On Mar 4, 2012, at 1:50 PM, Michael Wilde <wilde at mcs.anl.gov> wrote:

> Jon, the workers can connect to the coaster service IP, which they are passed as an argument. They should be able to reach the coaster service via NAT. Once the workers connect, the service and workers communicate on the bidirectional socket. The service doesn't need to know the workers' IPs.
> 
> - Mike
> 
> ----- Original Message -----
>> From: "Jonathan Monette" <jonmon at mcs.anl.gov>
>> To: "ZHAO ZHANG" <zhaozhang at uchicago.edu>
>> Cc: "Michael Wilde" <wilde at mcs.anl.gov>, swift-devel at ci.uchicago.edu
>> Sent: Sunday, March 4, 2012 1:30:53 PM
>> Subject: Re: [Swift-devel] [ZeptoOS] hostname returns none in Surveyor
>> This logic could be added to the worker-init.pl script. It shouldn't
>> be too difficult.
>> 
>> But one thought that has been nagging me: can the workers connect
>> back to the service and actually do work in the other kernel profile?
>> The coaster service needs to know which worker to send jobs to, doesn't
>> it? How does it send work to a worker if the worker doesn't know
>> its IP to provide to the service? So the worker logic may still need
>> to be changed a bit to work with this kernel profile.
>> 
>> On Mar 4, 2012, at 1:17 PM, ZHAO ZHANG <zhaozhang at uchicago.edu> wrote:
>> 
>>> Yes, each compute node needs to run this script to bring up the
>>> network interface.
>>> 
>>> zhao
>>> 
>>> On 3/4/2012 12:53 PM, Michael Wilde wrote:
>>>> Thanks, Zhao. Does this need to run on each node at startup?
>>>> 
>>>> If so should this logic be integrated into the worker startup
>>>> script, Jon, Justin, Emalayan?
>>>> 
>>>> I've not looked at the current scripts much; I would think that all
>>>> the BG/P specific logic of enabling the torus network and finding
>>>> each node's IP address on the torus should be done in the init
>>>> script rather than in the worker.
>>>> 
>>>> - Mike
>>>> 
>>>> ----- Original Message -----
>>>>> From: "ZHAO ZHANG"<zhaozhang at uchicago.edu>
>>>>> To: "Michael Wilde"<wilde at mcs.anl.gov>
>>>>> Cc: "Emalayan Vairavanathan"<svemalayan at yahoo.com>,
>>>>> swift-devel at ci.uchicago.edu
>>>>> Sent: Sunday, March 4, 2012 12:18:28 PM
>>>>> Subject: Re: [Swift-devel] [ZeptoOS] hostname returns none in
>>>>> Surveyor
>>>>> Hi, Mike
>>>>> 
>>>>> With 192.168.1.*, we can only access the tree network. In order to
>>>>> use the torus network, we need to use the 12.x.y.(z+1) IP address (x,
>>>>> y, z here are the coordinates of the compute node).
>>>>> The code below brings the torus IP address up.
>>>>> 
>>>>> IP=""
>>>>> # Bring up eth1 on the torus address 12.x.y.(z+1).
>>>>> set_torus_ip()
>>>>> {
>>>>> x=$1
>>>>> y=$2
>>>>> z=`expr $3 + 1`
>>>>> ifconfig eth1 12.$x.$y.$z netmask 255.0.0.0 mtu 8996 -arp
>>>>> IP=12.$x.$y.$z
>>>>> }
>>>>> # Read the coordinates (BG_PSETORG) from the personality file.
>>>>> BG_PSETORG=`grep BG_PSETORG /proc/personality.sh | cut -d '"' -f 2`
>>>>> echo ${BG_PSETORG} >> /dev/shm/localip
>>>>> # Left unquoted so the value splits into $1 $2 $3.
>>>>> set_torus_ip $BG_PSETORG
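
The addressing rule described above (torus address = 12.x.y.(z+1)) can be sketched as a small pure function that is easy to check away from the machine. This is only an illustration: torus_ip is a hypothetical helper name, not something in ZeptoOS or the Swift worker scripts, and it assumes, as the script above implies, that the coordinates arrive as three separate words.

```shell
#!/bin/sh
# Sketch only: torus_ip is a hypothetical helper, not part of ZeptoOS or Swift.
# It maps torus coordinates (x, y, z) to the address 12.x.y.(z+1).
torus_ip() {
    if [ "$#" -ne 3 ]; then
        echo "usage: torus_ip x y z" >&2
        return 1
    fi
    # POSIX arithmetic expansion takes the place of the expr call.
    echo "12.$1.$2.$(($3 + 1))"
}

# Possible use on a compute node (commented out: needs root and the BG/P
# environment; BG_PSETORG is left unquoted so it splits into three words):
#   ip=$(torus_ip $BG_PSETORG) || exit 1
#   ifconfig eth1 "$ip" netmask 255.0.0.0 mtu 8996 -arp
```

For example, `torus_ip 0 1 2` prints 12.0.1.3. Keeping the arithmetic apart from the ifconfig call means the address computation can be tested without touching a network interface.
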
>>>>> 
>>>>> best
>>>>> zhao
>>>>> 
>>>>> On 3/4/2012 10:24 AM, Michael Wilde wrote:
>>>>>> Zhao,
>>>>>> 
>>>>>> Can you tell us if the nodes on the torus network are accessed over
>>>>>> the 192.168 network? I just realized they can't all be on the
>>>>>> 192.168.1 subnet, so I hope I suggested the right network here.
>>>>>> 
>>>>>> Thanks,
>>>>>> 
>>>>>> - Mike
>>>>>> 
>>>>>> ----- Original Message -----
>>>>>>> From: "Emalayan Vairavanathan"<svemalayan at yahoo.com>
>>>>>>> To: swift-devel at ci.uchicago.edu
>>>>>>> Sent: Sunday, March 4, 2012 1:40:53 AM
>>>>>>> Subject: Re: [Swift-devel] [ZeptoOS] hostname returns none in
>>>>>>> Surveyor
>>>>>>> Thank you very much Mike. I agree with your suggestion. I can do that
>>>>>>> in worker.pl.
>>>>>>> 
>>>>>>> 
>>>>>>> Thank you
>>>>>>> Emalayan
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> From: Michael Wilde<wilde at mcs.anl.gov>
>>>>>>> To: emalayan at ece.ubc.ca
>>>>>>> Cc: swift-devel<swift-devel at ci.uchicago.edu>
>>>>>>> Sent: Saturday, 3 March 2012 7:39 PM
>>>>>>> Subject: Re: [Swift-devel] [ZeptoOS] hostname returns none in
>>>>>>> Surveyor
>>>>>>> 
>>>>>>> Emalayan,
>>>>>>> 
>>>>>>> I wasn't paying much attention to the actual IP address returned by
>>>>>>> hostname in the zeptoos profile.
>>>>>>> 
>>>>>>> Since these are the addresses that Mosa will communicate over, I think
>>>>>>> you *do* want them to be the 192.168.1.* addresses of the nodes on the
>>>>>>> torus network (in other words tun0).
>>>>>>> 
>>>>>>> So, since both profiles return 192.168.1.64 for the tun0 IP, I think
>>>>>>> that's what you should use. So try replacing `hostname` in worker.pl
>>>>>>> with something like:
>>>>>>> 
>>>>>>> `ifconfig | grep 192.168 | sed -e 's/^ *inet addr://' -e 's/ .*//'`
>>>>>>> 
>>>>>>> You may have to adapt this a bit to meet your needs. I'm assuming that
>>>>>>> the only code that will use these IPs is MosaStore.
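
The ifconfig pipeline suggested above could also be wrapped in a function that reads ifconfig-style text on stdin, which makes it possible to try it against saved console output. This is a rough sketch under assumptions: first_inet_addr is a made-up name, and it expects the older "inet addr:A.B.C.D" output format that appears in this thread.

```shell
#!/bin/sh
# Sketch only: first_inet_addr is a hypothetical helper name.
# Reads ifconfig-style text on stdin and prints the first IPv4 address
# that begins with the given prefix (default: 192.168.).
first_inet_addr() {
    prefix=${1:-192.168.}
    awk -v p="inet addr:$prefix" '
        index($0, p) {                 # literal match, so dots are not wildcards
            sub(/^.*inet addr:/, "")   # drop everything up to the address
            print $1                   # the address is now the first field
            exit
        }'
}

# Possible use in a worker startup script (commented out here):
#   addr=$(ifconfig | first_inet_addr 192.168.)
```

Fed the tun0 block pasted further down in this thread, it prints 192.168.1.64. It tolerates leading whitespace before "inet addr:", which real ifconfig output normally has.
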
>>>>>>> 
>>>>>>> - Mike
>>>>>>> 
>>>>>>> 
>>>>>>> ----- Original Message -----
>>>>>>>> From: "Kazutomo Yoshii"< kazutomo at mcs.anl.gov>
>>>>>>>> To: zeptoos at lists.mcs.anl.gov
>>>>>>>> Sent: Saturday, March 3, 2012 8:52:00 PM
>>>>>>>> Subject: Re: [ZeptoOS] hostname returns none in Surveyor
>>>>>>>> Hi Emalayan,
>>>>>>>> 
>>>>>>>> The zeptoos profile returns the IP address of associated I/O
>>>>>>>> node,
>>>>>>>> which is kind of wrong in my opinion (influence of IBM CNK).
>>>>>>>> ifconfig on compute nodes returns CN's IP address, which is
>>>>>>>> correct.
>>>>>>>> e.g. tun0 192.168.1.64
>>>>>>>> 
>>>>>>>> If you want to find associated ION's IP address from CNs,
>>>>>>>> do something like this.
>>>>>>>> 
>>>>>>>> $ grep BG_IP= /proc/personality.sh
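
If a script needs the value rather than the whole matching line, the grep above can be extended along these lines. This is a sketch under an assumption about the file layout: it expects entries of the form KEY="value", which the thread does not show directly, and personality_value is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: personality_value is a hypothetical helper name.
# Assumes the personality file holds entries like KEY="value";
# the real layout of /proc/personality.sh may differ.
personality_value() {
    key=$1
    file=${2:-/proc/personality.sh}
    # Take the first matching line and print the text between the
    # first pair of double quotes.
    grep "${key}=" "$file" | head -n 1 | cut -d '"' -f 2
}

# Possible use on a compute node (commented out here):
#   ion_ip=$(personality_value BG_IP)
```

The optional second argument lets the function be pointed at a saved copy of the file for testing.
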
>>>>>>>> 
>>>>>>>> - kaz
>>>>>>>> 
>>>>>>>> On 03/03/2012 08:25 PM, Emalayan Vairavanathan wrote:
>>>>>>>>> Hi All,
>>>>>>>>> 
>>>>>>>>> I am trying to run some experiments in Surveyor. The software I am
>>>>>>>>> using gets the IP address of compute nodes using the hostname command.
>>>>>>>>> 
>>>>>>>>> With the zepto-vn-eval/mosatest profile, the hostname command returns
>>>>>>>>> none. But with the zeptoos profile, hostname returns the correct IP
>>>>>>>>> address.
>>>>>>>>> 
>>>>>>>>> Is this due to some configuration issue in the zepto-vn-eval/mosatest
>>>>>>>>> profile? As a workaround I tried to use ifconfig with both profiles,
>>>>>>>>> but it seems ifconfig is not returning the correct IP address.
>>>>>>>>> 
>>>>>>>>> Is there any command or file which I can use to retrieve the hostname
>>>>>>>>> on compute nodes? I have pasted the console output with both profiles
>>>>>>>>> below. Please let me know if you need more details.
>>>>>>>>> 
>>>>>>>>> Thank you
>>>>>>>>> Emalayan
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> ======================= With zeptoos profile =======================
>>>>>>>>> 
>>>>>>>>> / # hostname
>>>>>>>>> 172.18.3.19
>>>>>>>>> / #
>>>>>>>>> / # cat /proc/sys/kernel/hostname
>>>>>>>>> 172.18.3.19
>>>>>>>>> / #
>>>>>>>>> / #
>>>>>>>>> / # ifconfig -a
>>>>>>>>> lo Link encap:Local Loopback
>>>>>>>>> inet addr:127.0.0.1 Mask:255.0.0.0
>>>>>>>>> UP LOOPBACK RUNNING MTU:16436 Metric:1
>>>>>>>>> RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>>>>>>>>> TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>>>>>>>>> collisions:0 txqueuelen:0
>>>>>>>>> RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
>>>>>>>>> 
>>>>>>>>> tun0 Link encap:UNSPEC HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
>>>>>>>>> inet addr:192.168.1.64 P-t-P:192.168.1.254 Mask:255.255.255.255
>>>>>>>>> UP POINTOPOINT RUNNING NOARP MULTICAST MTU:65535 Metric:1
>>>>>>>>> RX packets:2662 errors:0 dropped:0 overruns:0 frame:0
>>>>>>>>> TX packets:1772 errors:0 dropped:0 overruns:0 carrier:0
>>>>>>>>> collisions:0 txqueuelen:500
>>>>>>>>> RX bytes:140206 (136.9 KiB) TX bytes:125412 (122.4 KiB)
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> ======================= With zepto-vn-eval/mosatest profile =======================
>>>>>>>>> 
>>>>>>>>> /etc # hostname
>>>>>>>>> (none)
>>>>>>>>> /etc #
>>>>>>>>> /etc # cat /proc/sys/kernel/hostname
>>>>>>>>> (none)
>>>>>>>>> /etc #
>>>>>>>>> /etc # ifconfig -a
>>>>>>>>> eth0 Link encap:Ethernet HWaddr 00:80:46:00:00:00
>>>>>>>>> BROADCAST MULTICAST MTU:1500 Metric:1
>>>>>>>>> RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>>>>>>>>> TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>>>>>>>>> collisions:0 txqueuelen:1000
>>>>>>>>> RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
>>>>>>>>> 
>>>>>>>>> eth1 Link encap:Ethernet HWaddr 00:80:47:00:00:00
>>>>>>>>> BROADCAST MULTICAST MTU:1500 Metric:1
>>>>>>>>> RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>>>>>>>>> TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>>>>>>>>> collisions:0 txqueuelen:1000
>>>>>>>>> RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
>>>>>>>>> 
>>>>>>>>> lo Link encap:Local Loopback
>>>>>>>>> inet addr:127.0.0.1 Mask:255.0.0.0
>>>>>>>>> inet6 addr: ::1/128 Scope:Host
>>>>>>>>> UP LOOPBACK RUNNING MTU:16436 Metric:1
>>>>>>>>> RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>>>>>>>>> TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>>>>>>>>> collisions:0 txqueuelen:0
>>>>>>>>> RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
>>>>>>>>> 
>>>>>>>>> sit0 Link encap:IPv6-in-IPv4
>>>>>>>>> NOARP MTU:1480 Metric:1
>>>>>>>>> RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>>>>>>>>> TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>>>>>>>>> collisions:0 txqueuelen:0
>>>>>>>>> RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
>>>>>>>>> 
>>>>>>>>> tun0 Link encap:UNSPEC HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
>>>>>>>>> inet addr:192.168.1.64 P-t-P:192.168.1.254 Mask:255.255.255.255
>>>>>>>>> UP POINTOPOINT RUNNING NOARP MULTICAST MTU:65535 Metric:1
>>>>>>>>> RX packets:965 errors:0 dropped:0 overruns:0 frame:0
>>>>>>>>> TX packets:627 errors:0 dropped:0 overruns:0 carrier:0
>>>>>>>>> collisions:0 txqueuelen:500
>>>>>>>>> RX bytes:50984 (49.7 KiB) TX bytes:50530 (49.3 KiB)
>>>>>>>>> 
>>>>>>> --
>>>>>>> Michael Wilde
>>>>>>> Computation Institute, University of Chicago
>>>>>>> Mathematics and Computer Science Division
>>>>>>> Argonne National Laboratory
>>>>>>> 
>>>>>>> _______________________________________________
>>>>>>> Swift-devel mailing list
>>>>>>> Swift-devel at ci.uchicago.edu
>>>>>>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
>>>>>>> 
>>>>>>> 
>>>>>>> 
> 
> -- 
> Michael Wilde
> Computation Institute, University of Chicago
> Mathematics and Computer Science Division
> Argonne National Laboratory
> 


