[Swift-devel] [ZeptoOS] hostname returns none in Surveyor

Michael Wilde wilde at mcs.anl.gov
Sun Mar 4 19:04:08 CST 2012


Thanks, Zhao.

I'm guessing that BG_PSETORG is an (x, y, z) triplet in the space XSIZE x YSIZE x ZSIZE?

If that's true, then unlike the 10.128 driver, the low three octets do not form a consecutive integer when using Zhao's script, and you need to form the set of IP addresses from:

12.{0..XSIZE-1}.{0..YSIZE-1}.{1..ZSIZE}

And then you could designate one (e.g. 12.0.0.1) to be the master.
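
A minimal sketch of that enumeration, assuming the 12.x.y.(z+1) addressing from Zhao's script and that each dimension is small enough to fit in one octet. The helper name torus_ips is hypothetical and not part of ZeptoOS; the three sizes would come from BG_XSIZE, BG_YSIZE and BG_ZSIZE in personality.sh.

```sh
#!/bin/sh
# Sketch only: list every torus IP for an XSIZE x YSIZE x ZSIZE allocation,
# assuming the 12.x.y.(z+1) scheme. torus_ips is a hypothetical helper.
torus_ips() {
    xsize=$1
    ysize=$2
    zsize=$3
    x=0
    while [ "$x" -lt "$xsize" ]; do
        y=0
        while [ "$y" -lt "$ysize" ]; do
            z=1                      # low octet is z+1, so it runs 1..ZSIZE
            while [ "$z" -le "$zsize" ]; do
                echo "12.$x.$y.$z"
                z=$((z + 1))
            done
            y=$((y + 1))
        done
        x=$((x + 1))
    done
}
```

With the sizes from the 4096-node personality file this would be called as `torus_ips 8 16 32`, and the first line printed (12.0.0.1) is the candidate master address.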

I *suspect* that the selection of these IPs is arbitrary, so you *may* be able to use rank as the low order octets.
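
If that suspicion holds (it is untested), a rank-based mapping could look like the sketch below. The helper name rank_to_ip and the choice to offset the rank by one, so that rank 0 does not land on 12.0.0.0, are assumptions made for illustration only.

```sh
#!/bin/sh
# Sketch only: spread (rank + 1) over the three low-order octets of 12.0.0.0/8.
# rank_to_ip is a hypothetical helper; the +1 offset is an assumption that
# keeps rank 0 off the all-zero host address.
rank_to_ip() {
    n=$(( $1 + 1 ))
    echo "12.$(( (n >> 16) & 255 )).$(( (n >> 8) & 255 )).$(( n & 255 ))"
}
```

Each node would pass its own BG_RANK, for example `rank_to_ip 0` for the node that becomes 12.0.0.1.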

I think a simple first test is to dump out all the BG_PSETORG values for a few sample job sizes submitted by cqsub. Also do tests to verify that you can ifconfig each interface and ping the others.
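
A minimal sketch of the dump step, assuming /proc/personality.sh has the format shown in Zhao's attachment. The helper names bg_var and dump_psetorg are hypothetical; where the output is collected (for example a shared file system) is left open.

```sh
#!/bin/sh
# Sketch only: print one "rank / pset origin" line per node so the values
# can be compared across job sizes. bg_var and dump_psetorg are
# hypothetical helper names, not part of ZeptoOS.

# Print the value of variable $1 from personality file $2, quotes stripped.
bg_var() {
    sed -n "s/^$1=//p" "$2" | tr -d '"'
}

# Print "rank=<BG_RANK> psetorg=<x y z>" for the given personality file
# (defaults to /proc/personality.sh on a compute node).
dump_psetorg() {
    file=${1:-/proc/personality.sh}
    echo "rank=$(bg_var BG_RANK "$file") psetorg=$(bg_var BG_PSETORG "$file")"
}
```

Running `dump_psetorg` on every node of a few cqsub jobs would show whether BG_PSETORG differs per node or only per pset. Once an interface is configured, `ping -c 1` against another node's address is one way to check reachability.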

Also, in answer to your prior question, I *suspect* that you can name the interface anything (such as tor0) except for an interface name that's already assigned.

- Mike

----- Original Message -----
> From: "ZHAO ZHANG" <zhaozhang at uchicago.edu>
> To: "Emalayan Vairavanathan" <svemalayan at yahoo.com>
> Cc: "Michael Wilde" <wilde at mcs.anl.gov>, "Justin M Wozniak" <wozniak at mcs.anl.gov>, swift-devel at ci.uchicago.edu
> Sent: Sunday, March 4, 2012 6:28:08 PM
> Subject: Re: [Swift-devel] [ZeptoOS] hostname returns none in Surveyor
> Hi,
> 
> I am attaching one personality.sh file of a 4096 CN allocation. From
> that you can see there are a couple of ways to calculate the
> allocation size:
> 1) BG_BLOCKID="ANL-R10-R13-4096"
> 2) Multiplication of BG_XSIZE, BG_YSIZE, BG_ZSIZE
> 
> I am not quite sure how it will work on an allocation that is smaller
> than 64; you could simply check it.
> 
> best
> zhao
> 
> BG_UCI=68801700
> BG_LOCATION=R10-M0-N00-J23
> BG_MAC=00:00:00:00:00:00
> BG_IP=0.0.0.0
> BG_NETMASK=255.255.255.112
> BG_BROADCAST=0.0.0.0
> BG_GATEWAY=0.0.0.0
> BG_MTU=1536
> BG_FS=0.0.0.0
> BG_EXPORTDIR=""
> BG_SIMULATION=0
> BG_PSETNUM=0
> BG_NUMPSETS=64
> BG_NODESINPSET=64
> BG_XSIZE=8
> BG_YSIZE=16
> BG_ZSIZE=32
> BG_VERBOSE=0
> BG_PSETSIZE="4 4 4"
> BG_PSETORG="0 0 0"
> BG_CLOCKHZ=850
> BG_GLINTS=1
> BG_ISTORUS=""
> BG_BLOCKID="ANL-R10-R13-4096"
> BG_SN=0.0.0.0
> BG_IS_IO_NODE=0
> BG_RANK_IN_PSET=64
> BG_RANK=0
> BG_IP_OVER_COL=0
> BG_IP_OVER_TOR=0
> BG_IP_OVER_COL_VC=0
> BG_CIO_MODE=FULL
> BG_BGSYS_FS_TYPE=NFSv3
> BG_HTC_MODE=0
> 
> 
> On 3/4/2012 6:23 PM, Emalayan Vairavanathan wrote:
> 
> 
> 
> Zhao, Thank you very much for the answers.
> 
> 
> One more question: :)
> 
> 
> Do you know how I can calculate this? Does this method work
> regardless of the number of nodes allocated (even with any fraction of
> a pset)?
> 
> 
> Thank you
> Emalayan
> 
> 
> 
> 
> 
> From: ZHAO ZHANG <zhaozhang at uchicago.edu>
> To: Emalayan Vairavanathan <svemalayan at yahoo.com>
> Cc: Michael Wilde <wilde at mcs.anl.gov> ; Justin M Wozniak
> <wozniak at mcs.anl.gov> ; "swift-devel at ci.uchicago.edu"
> <swift-devel at ci.uchicago.edu>
> Sent: Sunday, 4 March 2012 4:06 PM
> Subject: Re: [Swift-devel] [ZeptoOS] hostname returns none in Surveyor
> 
> 
> 
> Hi, Emalayan
> 
> On 3/4/2012 6:00 PM, Emalayan Vairavanathan wrote:
> 
> 
> 
> Mike, that sounds like a good idea.
> 
> 
> Zhao , In addition to Mike's questions I have two more questions.
> 
> 
> 1) Is it possible to get/calculate the MAX_RANK / number of nodes in
> an allocation from personality.h or some other data structure?
>
> Yes, you could calculate the MAX_RANK from personality.sh.
> 
> 
> 
> 
> 
> 
> 2) Which interface should be configured to have Torus? (Does this
> matter at all?)
> In your scripts you are configuring eth1. But in
> http://wiki.mcs.anl.gov/zeptoos/index.php/Other_Packages tun1 is
> configured.
>
> To use the torus network, there are two ways. One is to
> use the 12.x.y.z+1 interface, which we have to configure ourselves.
> The other way is to use the "ipfwd.sh", aka the 10.128 interface. The
> drawback of the second interface is that it takes up one core
> for polling, and there is some scalability issue beyond 2K compute
> nodes as far as I remember. Mosa could use either of them.
> 
> best
> zhao
> 
> 
> 
> 
> 
> 
> Thank you
> Emalayan
> 
> 
> 
> 
> 
> 
> From: Michael Wilde <wilde at mcs.anl.gov>
> To: ZHAO ZHANG <zhaozhang at uchicago.edu> ; Justin M Wozniak
> <wozniak at mcs.anl.gov>
> Cc: Emalayan Vairavanathan <svemalayan at yahoo.com> ;
> swift-devel at ci.uchicago.edu
> Sent: Sunday, 4 March 2012 2:33 PM
> Subject: Re: [Swift-devel] [ZeptoOS] hostname returns none in Surveyor
> 
> Zhao, with this procedure do you get consecutive host IP addresses
> starting from 0.0 through 640*64 in the two low order octets?
> 
> In other words, does your script just do what this page describes under "IP
> over Torus":
> 
> http://wiki.mcs.anl.gov/zeptoos/index.php/Other_Packages
> 
> Is the "ipfwd.sh" script mentioned there still needed, or does that
> now happen automatically?
> 
> If so, perhaps we can greatly simplify the Mosa startup: we need only
> pass the max rank of the running job, and Mosa will know that it can
> use 12.128.0.0 for example. Then we don't need any manual intervention,
> nor complicated/brittle file-waiting logic.
> 
> Zhao, I don't understand why your example is using the 12.0.0.0 network
> vs the example on the page above which uses 10.128.0.0. Can you help
> me understand what is going on here? Is the "IP Over Torus" info on
> the ZeptoOS wiki outdated? Or does it describe a different technique?
> 
> Justin, have you also mastered similar techniques for JETS? Do we need
> help from the ZeptoOS team on this?
> 
> Thanks,
> 
> - Mike
> 
> 
> 
> ----- Original Message -----
> > From: "ZHAO ZHANG" < zhaozhang at uchicago.edu >
> > To: "Michael Wilde" < wilde at mcs.anl.gov >
> > Cc: "Emalayan Vairavanathan" < svemalayan at yahoo.com >,
> > swift-devel at ci.uchicago.edu
> > Sent: Sunday, March 4, 2012 1:17:18 PM
> > Subject: Re: [Swift-devel] [ZeptoOS] hostname returns none in
> > Surveyor
> > Yes, each compute node needs to run this script to bring up the
> > network
> > interface.
> >
> > zhao
> >
> > On 3/4/2012 12:53 PM, Michael Wilde wrote:
> > > Thanks, Zhao. Does this need to run on each node at startup?
> > >
> > > If so should this logic be integrated into the worker startup
> > > script, Jon, Justin, Emalayan?
> > >
> > > I've not looked at the current scripts much; I would think that all
> > > the BG/P specific logic of enabling the torus network and finding
> > > each node's IP address on the torus should be done in the init
> > > script rather than in the worker.
> > >
> > > - Mike
> > >
> > > ----- Original Message -----
> > >> From: "ZHAO ZHANG"< zhaozhang at uchicago.edu >
> > >> To: "Michael Wilde"< wilde at mcs.anl.gov >
> > >> Cc: "Emalayan Vairavanathan"< svemalayan at yahoo.com >,
> > >> swift-devel at ci.uchicago.edu
> > >> Sent: Sunday, March 4, 2012 12:18:28 PM
> > >> Subject: Re: [Swift-devel] [ZeptoOS] hostname returns none in
> > >> Surveyor
> > >> Hi, Mike
> > >>
> > >> With 192.168.1.*, we could only access the tree network. In order
> > >> to
> > >> use
> > >> the torus network, we need to use the 12.x.y.z+1 ip address. (x,
> > >> y,
> > >> z
> > >> here is the coordinates of the compute nodes).
> > >> The code below could bring the torus ip address up.
> > >>
> > >> IP=""
> > >> set_torus_ip()
> > >> {
> > >> x=$1
> > >> y=$2
> > >> z=$3
> > >> z=`expr $3 + 1`
> > >> ifconfig eth1 12.$x.$y.$z netmask 255.0.0.0 mtu 8996 -arp
> > >> IP=12.$x.$y.$z
> > >> }
> > >> BG_PSETORG=`cat /proc/personality.sh | grep BG_PSETORG | cut -d '"' -f 2`
> > >> echo ${BG_PSETORG}>> /dev/shm/localip
> > >> set_torus_ip $BG_PSETORG
> > >>
> > >> best
> > >> zhao
> > >>
> > >> On 3/4/2012 10:24 AM, Michael Wilde wrote:
> > >>> Zhao,
> > >>>
> > >>> Can you tell us if the nodes on the torus network are accessed
> > >>> over
> > >>> the 192.168 network? I just realized they can't all be on the
> > >>> 192.168.1 subnet, so I hope I suggested the right network here.
> > >>>
> > >>> Thanks,
> > >>>
> > >>> - Mike
> > >>>
> > >>> ----- Original Message -----
> > >>>> From: "Emalayan Vairavanathan"< svemalayan at yahoo.com >
> > >>>> To: swift-devel at ci.uchicago.edu
> > >>>> Sent: Sunday, March 4, 2012 1:40:53 AM
> > >>>> Subject: Re: [Swift-devel] [ZeptoOS] hostname returns none in
> > >>>> Surveyor
> > >>>> Thank you very much, Mike. I agree with your suggestion. I can do
> > >>>> that
> > >>>> in worker.pl.
> > >>>>
> > >>>>
> > >>>> Thank you
> > >>>> Emalayan
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>> From: Michael Wilde< wilde at mcs.anl.gov >
> > >>>> To: emalayan at ece.ubc.ca
> > >>>> Cc: swift-devel< swift-devel at ci.uchicago.edu >
> > >>>> Sent: Saturday, 3 March 2012 7:39 PM
> > >>>> Subject: Re: [Swift-devel] [ZeptoOS] hostname returns none in
> > >>>> Surveyor
> > >>>>
> > >>>> Emalayan,
> > >>>>
> > >>>> I wasn't paying much attention to the actual IP address returned
> > >>>> by
> > >>>> hostname in the zeptoos profile.
> > >>>>
> > >>>> Since these are the addresses that Mosa will communicate over,
> > >>>> I
> > >>>> think
> > >>>> you *do* want them to be the 192.168.1.* addresses of the nodes
> > >>>> on
> > >>>> the
> > >>>> torus network (in other words tun0).
> > >>>>
> > >>>> So, since both profiles return 192.168.1.64 for the tun0 IP, I
> > >>>> think
> > >>>> that's what you should use. So try replacing `hostname` in
> > >>>> worker.pl
> > >>>> with something like:
> > >>>>
> > >>>> `ifconfig | grep 192.168 | sed -e 's/^inet addr://' -e 's/ .*//'`
> > >>>>
> > >>>> You may have to adapt this a bit to meet your needs. I'm
> > >>>> assuming
> > >>>> that
> > >>>> the only code that uses these IPs is MosaStore.
> > >>>>
> > >>>> - Mike
> > >>>>
> > >>>>
> > >>>> ----- Original Message -----
> > >>>>> From: "Kazutomo Yoshii"< kazutomo at mcs.anl.gov >
> > >>>>> To: zeptoos at lists.mcs.anl.gov
> > >>>>> Sent: Saturday, March 3, 2012 8:52:00 PM
> > >>>>> Subject: Re: [ZeptoOS] hostname returns none in Surveyor
> > >>>>> Hi Emalayan,
> > >>>>>
> > >>>>> The zeptoos profile returns the IP address of associated I/O
> > >>>>> node,
> > >>>>> which is kind of wrong in my opinion (influence of IBM CNK).
> > >>>>> ifconfig on compute nodes returns CN's IP address, which is
> > >>>>> correct.
> > >>>>> e.g. tun0 192.168.1.64
> > >>>>>
> > >>>>> If you want to find associated ION's IP address from CNs,
> > >>>>> do something like this.
> > >>>>>
> > >>>>> $ grep BG_IP= /proc/personality.sh
> > >>>>>
> > >>>>> - kaz
> > >>>>>
> > >>>>> On 03/03/2012 08:25 PM, Emalayan Vairavanathan wrote:
> > >>>>>> Hi All,
> > >>>>>>
> > >>>>>> I am trying to run some experiments in Surveyor. The software
> > >>>>>> I
> > >>>>>> am
> > >>>>>> using
> > >>>>>> gets the IP-address of compute-nodes using hostname command.
> > >>>>>>
> > >>>>>> With zepto-vn-eval/mosatest profile hostname command returns
> > >>>>>> none.
> > >>>>>> But with zeptoos profile hostname returns the correct IP
> > >>>>>> address.
> > >>>>>>
> > >>>>>> Is this due to some configuration issues in
> > >>>>>> zepto-vn-eval/mosatest
> > >>>>>> profile? As a workaround I tried to use ifconfig with both
> > >>>>>> profiles,
> > >>>>>> but
> > >>>>>> it seems ifconfig is not returning the correct IP address.
> > >>>>>>
> > >>>>>> Is there any command / file which I can use to retrieve the
> > >>>>>> hostname
> > >>>>>> on compute nodes? I have pasted the console output with both
> > >>>>>> profiles
> > >>>>>> below. Please let me know if you need more details.
> > >>>>>>
> > >>>>>> Thank you
> > >>>>>> Emalayan
> > >>>>>>
> > >>>>>>
> > >>>>>> =======================With zeptoos profile
> > >>>>>> ===============================
> > >>>>>>
> > >>>>>> / # hostname
> > >>>>>> 172.18.3.19
> > >>>>>> / #
> > >>>>>> / # cat /proc/sys/kernel/hostname
> > >>>>>> 172.18.3.19
> > >>>>>> / #
> > >>>>>> / #
> > >>>>>> / # ifconfig -a
> > >>>>>> lo Link encap:Local Loopback
> > >>>>>> inet addr:127.0.0.1 Mask:255.0.0.0
> > >>>>>> UP LOOPBACK RUNNING MTU:16436 Metric:1
> > >>>>>> RX packets:0 errors:0 dropped:0 overruns:0 frame:0
> > >>>>>> TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
> > >>>>>> collisions:0 txqueuelen:0
> > >>>>>> RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
> > >>>>>>
> > >>>>>> tun0 Link encap:UNSPEC HWaddr
> > >>>>>> 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
> > >>>>>> inet addr:192.168.1.64 P-t-P:192.168.1.254
> > >>>>>> Mask:255.255.255.255
> > >>>>>> UP POINTOPOINT RUNNING NOARP MULTICAST MTU:65535 Metric:1
> > >>>>>> RX packets:2662 errors:0 dropped:0 overruns:0 frame:0
> > >>>>>> TX packets:1772 errors:0 dropped:0 overruns:0 carrier:0
> > >>>>>> collisions:0 txqueuelen:500
> > >>>>>> RX bytes:140206 (136.9 KiB) TX bytes:125412 (122.4 KiB)
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>> =======================With zepto-vn-eval/mosatest profile
> > >>>>>> ===============================
> > >>>>>>
> > >>>>>> /etc # hostname
> > >>>>>> (none)
> > >>>>>> /etc #
> > >>>>>> /etc # cat /proc/sys/kernel/hostname
> > >>>>>> (none)
> > >>>>>> /etc #
> > >>>>>> /etc # ifconfig -a
> > >>>>>> eth0 Link encap:Ethernet HWaddr 00:80:46:00:00:00
> > >>>>>> BROADCAST MULTICAST MTU:1500 Metric:1
> > >>>>>> RX packets:0 errors:0 dropped:0 overruns:0 frame:0
> > >>>>>> TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
> > >>>>>> collisions:0 txqueuelen:1000
> > >>>>>> RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
> > >>>>>>
> > >>>>>> eth1 Link encap:Ethernet HWaddr 00:80:47:00:00:00
> > >>>>>> BROADCAST MULTICAST MTU:1500 Metric:1
> > >>>>>> RX packets:0 errors:0 dropped:0 overruns:0 frame:0
> > >>>>>> TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
> > >>>>>> collisions:0 txqueuelen:1000
> > >>>>>> RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
> > >>>>>>
> > >>>>>> lo Link encap:Local Loopback
> > >>>>>> inet addr:127.0.0.1 Mask:255.0.0.0
> > >>>>>> inet6 addr: ::1/128 Scope:Host
> > >>>>>> UP LOOPBACK RUNNING MTU:16436 Metric:1
> > >>>>>> RX packets:0 errors:0 dropped:0 overruns:0 frame:0
> > >>>>>> TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
> > >>>>>> collisions:0 txqueuelen:0
> > >>>>>> RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
> > >>>>>>
> > >>>>>> sit0 Link encap:IPv6-in-IPv4
> > >>>>>> NOARP MTU:1480 Metric:1
> > >>>>>> RX packets:0 errors:0 dropped:0 overruns:0 frame:0
> > >>>>>> TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
> > >>>>>> collisions:0 txqueuelen:0
> > >>>>>> RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
> > >>>>>>
> > >>>>>> tun0 Link encap:UNSPEC HWaddr
> > >>>>>> 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
> > >>>>>> inet addr:192.168.1.64 P-t-P:192.168.1.254
> > >>>>>> Mask:255.255.255.255
> > >>>>>> UP POINTOPOINT RUNNING NOARP MULTICAST MTU:65535 Metric:1
> > >>>>>> RX packets:965 errors:0 dropped:0 overruns:0 frame:0
> > >>>>>> TX packets:627 errors:0 dropped:0 overruns:0 carrier:0
> > >>>>>> collisions:0 txqueuelen:500
> > >>>>>> RX bytes:50984 (49.7 KiB) TX bytes:50530 (49.3 KiB)
> > >>>>>>
> > >>>> --
> > >>>> Michael Wilde
> > >>>> Computation Institute, University of Chicago
> > >>>> Mathematics and Computer Science Division
> > >>>> Argonne National Laboratory
> > >>>>
> > >>>> _______________________________________________
> > >>>> Swift-devel mailing list
> > >>>> Swift-devel at ci.uchicago.edu
> > >>>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
> 
> --
> Michael Wilde
> Computation Institute, University of Chicago
> Mathematics and Computer Science Division
> Argonne National Laboratory

-- 
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory



