[Swift-devel] Cant get auto-coasters to run from midway to beagle
Michael Wilde
wilde at mcs.anl.gov
Sun Mar 10 16:37:49 CDT 2013
Mihael, it seems that the problem is still there under the current trunk - see below. This is in:
midway:/home/wilde/osgdemo/modis/svn/run035.
The "cog modified locally" is a hopefully inconsequential change in worker.pl where I open stdin to /dev/null rather than close it, before launching an app, to remedy an unrelated MPI problem.
- Mike
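
The worker.pl change itself is Perl and is not shown in this thread. As a rough illustration only, the same idea (leave stdin open but attached to /dev/null, rather than closed, when launching an app) could be sketched in Python as follows. The child command is a hypothetical stand-in for a real app.

```python
import subprocess
import sys

# Hypothetical stand-in for a launched app: it reads all of stdin and
# reports how many characters it saw.
CHILD = "import sys; data = sys.stdin.read(); print(len(data))"


def run_with_devnull_stdin():
    """Launch the child with stdin attached to the null device."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        stdin=subprocess.DEVNULL,  # fd 0 stays open; reads hit EOF at once
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(run_with_devnull_stdin())
```

The child sees a valid, empty stdin instead of a missing file descriptor, which is the property the local change relies on.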
Swift trunk swift-r6362 cog-r3637 (cog modified locally)
RunID: 20130310-2055-4lqjiftd
Progress: time: Sun, 10 Mar 2013 20:55:52 +0000
Progress: time: Sun, 10 Mar 2013 20:56:06 +0000 Selecting site:269 Submitting:47 Submitted:1
Progress: time: Sun, 10 Mar 2013 20:56:12 +0000 Selecting site:269 Stage in:1 Submitted:47
Progress: time: Sun, 10 Mar 2013 20:56:16 +0000 Selecting site:269 Stage in:25 Submitted:23
Progress: time: Sun, 10 Mar 2013 20:56:22 +0000 Selecting site:269 Stage in:48
Progress: time: Sun, 10 Mar 2013 20:56:52 +0000 Selecting site:269 Stage in:48
Progress: time: Sun, 10 Mar 2013 20:57:22 +0000 Selecting site:269 Stage in:48
Progress: time: Sun, 10 Mar 2013 20:57:52 +0000 Selecting site:269 Stage in:48
Progress: time: Sun, 10 Mar 2013 20:58:19 +0000 Selecting site:269 Stage in:47 Active:1
Progress: time: Sun, 10 Mar 2013 20:58:20 +0000 Selecting site:269 Stage in:26 Active:22
Progress: time: Sun, 10 Mar 2013 20:58:22 +0000 Selecting site:269 Stage in:24 Active:24
Progress: time: Sun, 10 Mar 2013 20:58:24 +0000 Selecting site:269 Stage in:23 Active:25
Progress: time: Sun, 10 Mar 2013 20:58:26 +0000 Selecting site:269 Active:47 Stage out:1
Progress: time: Sun, 10 Mar 2013 20:58:27 +0000 Selecting site:260 Stage in:7 Submitting:1 Submitted:1 Active:39 Finished successfully:9
Progress: time: Sun, 10 Mar 2013 20:58:28 +0000 Selecting site:258 Stage in:9 Submitting:1 Submitted:1 Active:24 Stage out:13 Finished successfully:11
Progress: time: Sun, 10 Mar 2013 20:58:29 +0000 Selecting site:245 Stage in:23 Submitted:1 Active:24 Finished successfully:24
Progress: time: Sun, 10 Mar 2013 20:58:31 +0000 Selecting site:245 Stage in:24 Active:23 Stage out:1 Finished successfully:24
Progress: time: Sun, 10 Mar 2013 20:58:32 +0000 Selecting site:245 Stage in:24 Active:23 Finished successfully:25
Progress: time: Sun, 10 Mar 2013 20:58:34 +0000 Selecting site:244 Stage in:24 Submitting:1 Stage out:22 Finished successfully:26
Progress: time: Sun, 10 Mar 2013 20:58:35 +0000 Selecting site:221 Stage in:25 Submitting:22 Submitted:1 Finished successfully:48
Progress: time: Sun, 10 Mar 2013 20:58:52 +0000 Selecting site:221 Stage in:48 Finished successfully:48
Progress: time: Sun, 10 Mar 2013 20:59:22 +0000 Selecting site:221 Stage in:48 Finished successfully:48
Progress: time: Sun, 10 Mar 2013 20:59:52 +0000 Selecting site:221 Stage in:48 Finished successfully:48
Progress: time: Sun, 10 Mar 2013 20:59:56 +0000 Selecting site:221 Stage in:47 Active:1 Finished successfully:48
Progress: time: Sun, 10 Mar 2013 21:00:02 +0000 Selecting site:221 Stage in:47 Stage out:1 Finished successfully:48
Progress: time: Sun, 10 Mar 2013 21:00:05 +0000 Selecting site:221 Stage in:47 Finished successfully:49
Channels: {null at https://192.5.86.107:50000=MetaChannel[https://192.5.86.107:50000] -> GSSCChannel-https://192.5.86.107:50000(2)[https://192.5.86.107:50000], null at id://u-3bec3eab-13d5616c4bd--8000-u4684d136-13d5616c4d0--8000S=MetaChannel[service-60734] -> BufferingChannel, /C=US/O=JavaCoG/OU=AutoCA/CN=User at https://192.5.86.107:50000=MetaChannel[service-60734] -> BufferingChannel, null at id://u4684d136-13d5616c4d0--7fff-u-3bec3eab-13d5616c4bd--7fffC=MetaChannel[https://192.5.86.107:50000] -> GSSCChannel-https://192.5.86.107:50000(2)[https://192.5.86.107:50000]}
Context: service-60018
Meta context: service-60734
Progress: time: Sun, 10 Mar 2013 21:00:07 +0000 Selecting site:220 Stage in:47 Submitted:1 Finished successfully:49
Channels: {null at https://192.5.86.107:50000=MetaChannel[https://192.5.86.107:50000] -> GSSCChannel-https://192.5.86.107:50000(2)[https://192.5.86.107:50000], null at id://u-3bec3eab-13d5616c4bd--8000-u4684d136-13d5616c4d0--8000S=MetaChannel[service-60734] -> BufferingChannel, /C=US/O=JavaCoG/OU=AutoCA/CN=User at https://192.5.86.107:50000=MetaChannel[service-60734] -> BufferingChannel, null at id://u4684d136-13d5616c4d0--7fff-u-3bec3eab-13d5616c4bd--7fffC=MetaChannel[https://192.5.86.107:50000] -> GSSCChannel-https://192.5.86.107:50000(2)[https://192.5.86.107:50000]}
Context: service-60263
Meta context: service-60734
Channels: {null at https://192.5.86.107:50000=MetaChannel[https://192.5.86.107:50000] -> GSSCChannel-https://192.5.86.107:50000(2)[https://192.5.86.107:50000], null at id://u-3bec3eab-13d5616c4bd--8000-u4684d136-13d5616c4d0--8000S=MetaChannel[service-60734] -> BufferingChannel, /C=US/O=JavaCoG/OU=AutoCA/CN=User at https://192.5.86.107:50000=MetaChannel[service-60734] -> BufferingChannel, null at id://u4684d136-13d5616c4d0--7fff-u-3bec3eab-13d5616c4bd--7fffC=MetaChannel[https://192.5.86.107:50000] -> GSSCChannel-https://192.5.86.107:50000(2)[https://192.5.86.107:50000]}
Context: service-60408
Meta context: service-60734
Progress: time: Sun, 10 Mar 2013 21:00:18 +0000 Selecting site:220 Stage in:47 Active:1 Finished successfully:49
Progress: time: Sun, 10 Mar 2013 21:00:19 +0000 Selecting site:220 Stage in:46 Active:2 Finished successfully:49
Execution failed:
Exception in getlanduse:
Arguments: [home/wilde/osgdemo/modis/svn/data/modis/2002/h12v09.rgb]
Host: beagle
Directory: modis02-20130310-2055-4lqjiftd/jobs/y/getlanduse-yht64f6l
Caused by:
Shutting down worker
getLandUse, modis02.swift, line 20
error null
real 4m29.509s
user 2m45.981s
sys 0m3.520s
----- Original Message -----
> From: "Michael Wilde" <wilde at mcs.anl.gov>
> To: "Mihael Hategan" <hategan at mcs.anl.gov>
> Cc: "Swift Devel" <swift-devel at ci.uchicago.edu>
> Sent: Sunday, March 10, 2013 3:20:53 PM
> Subject: Re: [Swift-devel] Cant get auto-coasters to run from midway to beagle
>
> Duh. Thank you. I didn't build a new release, was using same 0.94
> RC4 code.
>
> Sorry about that. Will retest.
>
> - Mike
>
>
> ----- Original Message -----
> > From: "Mihael Hategan" <hategan at mcs.anl.gov>
> > To: "Michael Wilde" <wilde at mcs.anl.gov>
> > Cc: "Swift Devel" <swift-devel at ci.uchicago.edu>
> > Sent: Sunday, March 10, 2013 3:06:25 PM
> > Subject: Re: [Swift-devel] Cant get auto-coasters to run from
> > midway to beagle
> >
> > ChannelContext Notifying commands and handlers about exception
> > org.globus.cog.karajan.workflow.service.TimeoutException: Channel timed out. lastTime=940817-071255.807, now=130310-164156.506, channel=GSSSChannel-1463847073(1)[service-60519]
> >
> > Are you sure you are running with the latest code? There was
> > previously a (mostly inconsequential) bug that set lastTime to
> > Long.MAX_VALUE before creating that exception. That was fixed. Your
> > message indicates that the code you are using does not have that fix
> > (year xx94 is what comes out of Long.MAX_VALUE).
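
The "year xx94" observation can be checked with a little arithmetic. The sketch below assumes the lastTime/now fields are printed as yyMMdd-HHmmss.SSS in UTC (inferred from now=130310-164156.506 lining up with the Progress timestamps; the thread does not state the format) and formats the largest Java long, Long.MAX_VALUE = 2^63 - 1, as epoch milliseconds. The function name is made up for this example.

```python
from datetime import date, timedelta

LONG_MAX = 2**63 - 1          # value of Java's Long.MAX_VALUE
MS_PER_DAY = 86_400_000
DAYS_PER_400_YEARS = 146_097  # the Gregorian calendar repeats every 400 years


def format_epoch_millis(ms):
    """Format epoch milliseconds (UTC) as yyMMdd-HHmmss.SSS.

    Python's date type stops at year 9999, so whole 400-year cycles are
    split off first and added back to the year afterwards.
    """
    days, ms_of_day = divmod(ms, MS_PER_DAY)
    cycles, days_in_cycle = divmod(days, DAYS_PER_400_YEARS)
    d = date(1970, 1, 1) + timedelta(days=days_in_cycle)
    year = d.year + 400 * cycles
    secs, millis = divmod(ms_of_day, 1000)
    mins, s = divmod(secs, 60)
    h, m = divmod(mins, 60)
    return (f"{year % 100:02d}{d.month:02d}{d.day:02d}-"
            f"{h:02d}{m:02d}{s:02d}.{millis:03d}")


if __name__ == "__main__":
    print(format_epoch_millis(LONG_MAX))  # 940817-071255.807
```

Under those assumptions the result matches the lastTime value in the exception exactly, which is consistent with the diagnosis that the run used code from before the fix.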
> >
> > I gotta go now, but I'll come back later and check some more. There is
> > something weird going on there besides that.
> >
> > Mihael
> >
> > On Sun, 2013-03-10 at 12:01 -0500, Michael Wilde wrote:
> > > Here's run034: seems to be a bit better, but still dies. This is
> > > with a throttle of 48 jobs on 48 cores (2 nodes), from swift.rcc to
> > > beagle, using 17MB files. It still seems to curiously die about 4
> > > minutes into the run, which suggests some kind of timeout is still
> > > lurking???
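
The "about 4 mins" figure can be read straight off the Progress lines. A minimal sketch, assuming each Progress entry is available as one unwrapped line in the format shown in these logs; the regular expressions and function name are this example's own, not part of Swift:

```python
import re
from datetime import datetime

# Timestamp, then an optional list of "State name:count" pairs.
PROGRESS_RE = re.compile(
    r"^Progress:\s+time:\s+"
    r"(\w{3}, \d{2} \w{3} \d{4} \d{2}:\d{2}:\d{2} [+-]\d{4})\s*(.*)$"
)
STATE_RE = re.compile(r"([A-Z][a-z]+(?: [a-z]+)*):(\d+)")


def parse_progress(line):
    """Return (timestamp, {state: count}) or None if not a Progress line."""
    m = PROGRESS_RE.match(line.strip())
    if m is None:
        return None
    when = datetime.strptime(m.group(1), "%a, %d %b %Y %H:%M:%S %z")
    counts = {name: int(n) for name, n in STATE_RE.findall(m.group(2))}
    return when, counts


# First and last Progress lines of run034, copied from the log.
first = parse_progress("Progress: time: Sun, 10 Mar 2013 16:39:45 +0000")
last = parse_progress(
    "Progress: time: Sun, 10 Mar 2013 16:44:09 +0000 Selecting site:219 "
    "Stage in:45 Submitting:1 Active:2 Finished successfully:50"
)
elapsed = (last[0] - first[0]).total_seconds()

if __name__ == "__main__":
    print(elapsed, last[1])
```

For run034 this gives 264 seconds between the first and last Progress lines, in line with the 4m27s wall-clock time reported by `time`.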
> > >
> > > - Mike
> > >
> > > Swift 0.94RC4 swift-r6284 cog-r3607 (cog modified locally)
> > >
> > > RunID: 20130310-1639-kyb8hca9
> > > Progress: time: Sun, 10 Mar 2013 16:39:45 +0000
> > > Progress: time: Sun, 10 Mar 2013 16:39:56 +0000 Selecting site:269 Submitting:47 Submitted:1
> > > Progress: time: Sun, 10 Mar 2013 16:40:01 +0000 Selecting site:269 Stage in:1 Submitted:47
> > > Progress: time: Sun, 10 Mar 2013 16:40:15 +0000 Selecting site:269 Stage in:48
> > > Progress: time: Sun, 10 Mar 2013 16:40:45 +0000 Selecting site:269 Stage in:48
> > > Progress: time: Sun, 10 Mar 2013 16:41:15 +0000 Selecting site:269 Stage in:48
> > > Progress: time: Sun, 10 Mar 2013 16:41:45 +0000 Selecting site:269 Stage in:48
> > > Progress: time: Sun, 10 Mar 2013 16:42:11 +0000 Selecting site:269 Stage in:47 Active:1
> > > Progress: time: Sun, 10 Mar 2013 16:42:12 +0000 Selecting site:269 Stage in:41 Active:7
> > > Progress: time: Sun, 10 Mar 2013 16:42:13 +0000 Selecting site:269 Stage in:23 Active:25
> > > Progress: time: Sun, 10 Mar 2013 16:42:15 +0000 Selecting site:269 Active:48
> > > Progress: time: Sun, 10 Mar 2013 16:42:17 +0000 Selecting site:269 Active:47 Stage out:1
> > > Progress: time: Sun, 10 Mar 2013 16:42:18 +0000 Selecting site:268 Stage in:1 Active:46 Stage out:1 Finished successfully:1
> > > Progress: time: Sun, 10 Mar 2013 16:42:19 +0000 Selecting site:265 Stage in:3 Submitted:1 Active:42 Stage out:2 Finished successfully:4
> > > Progress: time: Sun, 10 Mar 2013 16:42:20 +0000 Selecting site:258 Stage in:6 Submitting:5 Active:23 Stage out:13 Finished successfully:12
> > > Progress: time: Sun, 10 Mar 2013 16:42:21 +0000 Selecting site:244 Stage in:24 Submitting:1 Active:20 Stage out:3 Finished successfully:25
> > > Progress: time: Sun, 10 Mar 2013 16:42:23 +0000 Selecting site:241 Stage in:25 Submitting:3 Stage out:19 Finished successfully:29
> > > Progress: time: Sun, 10 Mar 2013 16:42:24 +0000 Selecting site:221 Stage in:28 Submitting:19 Submitted:1 Finished successfully:48
> > > Progress: time: Sun, 10 Mar 2013 16:42:45 +0000 Selecting site:221 Stage in:48 Finished successfully:48
> > > Progress: time: Sun, 10 Mar 2013 16:42:54 +0000 Selecting site:221 Stage in:47 Active:1 Finished successfully:48
> > > Progress: time: Sun, 10 Mar 2013 16:43:00 +0000 Selecting site:221 Stage in:47 Stage out:1 Finished successfully:48
> > > Progress: time: Sun, 10 Mar 2013 16:43:02 +0000 Selecting site:221 Stage in:47 Finished successfully:49
> > > Progress: time: Sun, 10 Mar 2013 16:43:05 +0000 Selecting site:220 Stage in:47 Submitted:1 Finished successfully:49
> > > Progress: time: Sun, 10 Mar 2013 16:43:15 +0000 Selecting site:220 Stage in:48 Finished successfully:49
> > > Progress: time: Sun, 10 Mar 2013 16:43:45 +0000 Selecting site:220 Stage in:48 Finished successfully:49
> > > Channels: {null at https://192.5.86.107:50000=MetaChannel[https://192.5.86.107:50000] -> GSSCChannel-https://192.5.86.107:50000(2)[https://192.5.86.107:50000], /C=US/O=JavaCoG/OU=AutoCA/CN=User at https://192.5.86.107:50000=MetaChannel[service-60519] -> BufferingChannel, null at id://u7b315f9a-13d552c3f68--7fff-u-362f30fc-13d552c3f50--7fffC=MetaChannel[https://192.5.86.107:50000] -> GSSCChannel-https://192.5.86.107:50000(2)[https://192.5.86.107:50000], null at id://u-362f30fc-13d552c3f50--8000-u7b315f9a-13d552c3f68--8000S=MetaChannel[service-60519] -> BufferingChannel}
> > > Context: service-60859
> > > Meta context: service-60519
> > > Progress: time: Sun, 10 Mar 2013 16:43:59 +0000 Selecting site:220 Stage in:47 Active:1 Finished successfully:49
> > > Channels: {null at https://192.5.86.107:50000=MetaChannel[https://192.5.86.107:50000] -> GSSCChannel-https://192.5.86.107:50000(2)[https://192.5.86.107:50000], /C=US/O=JavaCoG/OU=AutoCA/CN=User at https://192.5.86.107:50000=MetaChannel[service-60519] -> BufferingChannel, null at id://u7b315f9a-13d552c3f68--7fff-u-362f30fc-13d552c3f50--7fffC=MetaChannel[https://192.5.86.107:50000] -> GSSCChannel-https://192.5.86.107:50000(2)[https://192.5.86.107:50000], null at id://u-362f30fc-13d552c3f50--8000-u7b315f9a-13d552c3f68--8000S=MetaChannel[service-60519] -> BufferingChannel}
> > > Context: service-60663
> > > Meta context: service-60519
> > > Progress: time: Sun, 10 Mar 2013 16:44:05 +0000 Selecting site:220 Stage in:47 Stage out:1 Finished successfully:49
> > > Progress: time: Sun, 10 Mar 2013 16:44:07 +0000 Selecting site:220 Stage in:47 Finished successfully:50
> > > Channels: {null at https://192.5.86.107:50000=MetaChannel[https://192.5.86.107:50000] -> GSSCChannel-https://192.5.86.107:50000(2)[https://192.5.86.107:50000], /C=US/O=JavaCoG/OU=AutoCA/CN=User at https://192.5.86.107:50000=MetaChannel[service-60519] -> BufferingChannel, null at id://u7b315f9a-13d552c3f68--7fff-u-362f30fc-13d552c3f50--7fffC=MetaChannel[https://192.5.86.107:50000] -> GSSCChannel-https://192.5.86.107:50000(2)[https://192.5.86.107:50000], null at id://u-362f30fc-13d552c3f50--8000-u7b315f9a-13d552c3f68--8000S=MetaChannel[service-60519] -> BufferingChannel}
> > > Context: service-60081
> > > Meta context: service-60519
> > > Progress: time: Sun, 10 Mar 2013 16:44:09 +0000 Selecting site:219 Stage in:45 Submitting:1 Active:2 Finished successfully:50
> > > Execution failed:
> > > Exception in getlanduse:
> > > Arguments: [home/wilde/osgdemo/modis/svn/data/modis/2002/h02v11.rgb]
> > > Host: beagle
> > > Directory: modis02-20130310-1639-kyb8hca9/jobs/9/getlanduse-90fyse6l
> > >
> > > Caused by:
> > > Shutting down worker
> > > getLandUse, modis02.swift, line 20
> > > error null
> > >
> > > real 4m27.007s
> > > user 2m44.221s
> > > sys 0m3.448s
> > > + mv /home/wilde/.swift/runs/current/run034.1362933583 /home/wilde/.swift/runs/completed
> > > midway001$
> > >
> > >
> > > ----- Original Message -----
> > > > From: "Mihael Hategan" <hategan at mcs.anl.gov>
> > > > To: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > Cc: "Swift Devel" <swift-devel at ci.uchicago.edu>
> > > > Sent: Sunday, March 10, 2013 1:36:26 AM
> > > > Subject: Re: [Swift-devel] Cant get auto-coasters to run from
> > > > midway to beagle
> > > >
> > > > Please try now. I made some changes:
> > > >
> > > > 1. start the service with "-l" so that things in your .profile
> > > > (such as module load sun-java) would be picked up. However, this
> > > > also means that you should unset X509_* stuff or the sshcl proxy
> > > > forwarding will not work properly.
> > > >
> > > > 2. I fixed a bug that caused an extra connection to the coaster
> > > > service. Normally the service connects back to the client and
> > > > both use that connection. However, due to some changes in the way
> > > > credentials were set for jobs, and the fact that connections were
> > > > looked up based on both hostname and credential, the coaster
> > > > client would ignore the existing connection and create another
> > > > one. The initial one would then time out at some point, causing
> > > > the service to crash.
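
The lookup problem described in item 2 can be illustrated with a toy model. This is not the CoG/coaster code, and the thread does not show the actual fix; the class, host name, and credential strings below are all hypothetical. It only shows how keying a connection table on (hostname, credential) can miss an existing connection once the credential differs per job.

```python
class ChannelRegistry:
    """Toy table of open channels; the key function decides what counts
    as 'the same' connection."""

    def __init__(self, key_fn):
        self._key_fn = key_fn
        self._channels = {}
        self.opened = 0  # number of distinct connections created

    def get_or_open(self, host, credential):
        key = self._key_fn(host, credential)
        if key not in self._channels:
            self.opened += 1
            self._channels[key] = f"channel-{self.opened}"
        return self._channels[key]


def connections_opened(key_fn):
    """Register the service's callback connection, then look one up for a job."""
    registry = ChannelRegistry(key_fn)
    registry.get_or_open("service-host", "initial-credential")
    registry.get_or_open("service-host", "per-job-credential")
    return registry.opened


if __name__ == "__main__":
    print(connections_opened(lambda host, cred: (host, cred)))  # 2
    print(connections_opened(lambda host, cred: host))          # 1
```

With the composite key the second lookup opens a new connection and the first one sits idle, which is the situation in which an idle timeout on the original connection becomes possible.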
> > > >
> > > > Mihael
> > > >
> > > > On Sat, 2013-03-09 at 17:49 -0600, Michael Wilde wrote:
> > > > > An update on this provider-staging-related issue: reducing the
> > > > > file size from 17MB to 600KB runs well.
> > > > >
> > > > > So it seems like some kind of flow control or buffer management
> > > > > problem, possibly?
> > > > >
> > > > > May need to take that problem offline - it would be a perfect
> > > > > test case for Yadu to develop a new stress test for.
> > > > >
> > > > > - Mike
> > > > >
> > > > >
> > > > > ----- Forwarded Message -----
> > > > > From: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > > To: "David Kelly" <davidk at ci.uchicago.edu>
> > > > > Sent: Saturday, March 9, 2013 5:21:49 PM
> > > > > Subject: Re: runs for OSG talk
> > > > >
> > > > > > OK, much better: with 600K files (5x5 reduction or 25X smaller)
> > > > > > it works well, and fast (from midway to beagle!)
> > > > >
> > > > > Swift 0.94RC4 swift-r6284 cog-r3607 (cog modified locally)
> > > > >
> > > > > RunID: 20130309-2319-5zq0jrfg
> > > > > Progress: time: Sat, 09 Mar 2013 23:19:45 +0000
> > > > > > Progress: time: Sat, 09 Mar 2013 23:19:56 +0000 Selecting site:269 Submitting:47 Submitted:1
> > > > > > Progress: time: Sat, 09 Mar 2013 23:20:05 +0000 Selecting site:269 Stage in:1 Submitted:47
> > > > > > Progress: time: Sat, 09 Mar 2013 23:20:09 +0000 Selecting site:269 Stage in:47 Active:1
> > > > > > Progress: time: Sat, 09 Mar 2013 23:20:10 +0000 Selecting site:269 Stage in:46 Active:1 Stage out:1
> > > > > > Progress: time: Sat, 09 Mar 2013 23:20:11 +0000 Selecting site:250 Stage in:19 Active:28 Stage out:1 Finished successfully:19
> > > > > > Progress: time: Sat, 09 Mar 2013 23:20:12 +0000 Selecting site:229 Stage in:18 Submitting:21 Active:1 Stage out:7 Finished successfully:41
> > > > > > Progress: time: Sat, 09 Mar 2013 23:20:13 +0000 Selecting site:220 Stage in:41 Submitting:1 Active:5 Stage out:1 Finished successfully:49
> > > > > > Progress: time: Sat, 09 Mar 2013 23:20:14 +0000 Selecting site:220 Stage in:38 Active:1 Stage out:9 Finished successfully:49
> > > > > > Progress: time: Sat, 09 Mar 2013 23:20:15 +0000 Selecting site:212 Stage in:30 Submitting:8 Stage out:9 Finished successfully:58
> > > > > > Progress: time: Sat, 09 Mar 2013 23:20:16 +0000 Selecting site:203 Stage in:38 Submitting:8 Submitted:1 Finished successfully:67
> > > > > > Progress: time: Sat, 09 Mar 2013 23:20:18 +0000 Selecting site:202 Stage in:19 Stage out:28 Finished successfully:68
> > > > > > Progress: time: Sat, 09 Mar 2013 23:20:19 +0000 Selecting site:172 Stage in:33 Submitting:2 Submitted:6 Active:5 Stage out:2 Finished successfully:97
> > > > > > Progress: time: Sat, 09 Mar 2013 23:20:20 +0000 Selecting site:170 Stage in:31 Submitting:2 Stage out:14 Finished successfully:100
> > > > > > Progress: time: Sat, 09 Mar 2013 23:20:21 +0000 Selecting site:162 Stage in:30 Submitting:10 Stage out:6 Finished successfully:109
> > > > > > Progress: time: Sat, 09 Mar 2013 23:20:22 +0000 Selecting site:154 Stage in:39 Submitting:5 Submitted:3 Active:1 Finished successfully:115
> > > > > > Progress: time: Sat, 09 Mar 2013 23:20:23 +0000 Selecting site:154 Stage in:21 Active:10 Stage out:16 Finished successfully:116
> > > > > > Progress: time: Sat, 09 Mar 2013 23:20:24 +0000 Selecting site:126 Stage in:20 Submitting:25 Submitted:1 Stage out:2 Finished successfully:143
> > > > > > Progress: time: Sat, 09 Mar 2013 23:20:25 +0000 Selecting site:124 Stage in:31 Active:2 Stage out:15 Finished successfully:145
> > > > > > Progress: time: Sat, 09 Mar 2013 23:20:26 +0000 Selecting site:110 Stage in:30 Submitting:14 Stage out:3 Finished successfully:160
> > > > > > Progress: time: Sat, 09 Mar 2013 23:20:27 +0000 Selecting site:106 Stage in:43 Submitting:1 Submitted:1 Active:1 Stage out:2 Finished successfully:163
> > > > > > Progress: time: Sat, 09 Mar 2013 23:20:28 +0000 Selecting site:104 Stage in:20 Submitting:2 Active:7 Stage out:19 Finished successfully:165
> > > > > > Progress: time: Sat, 09 Mar 2013 23:20:29 +0000 Selecting site:78 Stage in:29 Submitting:16 Submitted:1 Stage out:2 Finished successfully:191
> > > > > > Progress: time: Sat, 09 Mar 2013 23:20:31 +0000 Selecting site:76 Stage in:30 Stage out:17 Finished successfully:194
> > > > > > Progress: time: Sat, 09 Mar 2013 23:20:32 +0000 Selecting site:58 Stage in:29 Submitting:18 Active:1 Finished successfully:211
> > > > > > Progress: time: Sat, 09 Mar 2013 23:20:33 +0000 Selecting site:58 Stage in:33 Active:3 Stage out:12 Finished successfully:211
> > > > > > Progress: time: Sat, 09 Mar 2013 23:20:34 +0000 Selecting site:46 Stage in:18 Submitting:11 Submitted:1 Active:2 Stage out:14 Finished successfully:225
> > > > > > Progress: time: Sat, 09 Mar 2013 23:20:35 +0000 Selecting site:30 Stage in:29 Active:14 Stage out:3 Finished successfully:241
> > > > > > Progress: time: Sat, 09 Mar 2013 23:20:36 +0000 Selecting site:28 Stage in:28 Submitting:2 Stage out:17 Finished successfully:242
> > > > > > Progress: time: Sat, 09 Mar 2013 23:20:37 +0000 Selecting site:10 Stage in:30 Submitting:17 Submitted:1 Finished successfully:259
> > > > > > Progress: time: Sat, 09 Mar 2013 23:20:38 +0000 Selecting site:10 Stage in:35 Stage out:13 Finished successfully:259
> > > > > > Progress: time: Sat, 09 Mar 2013 23:20:39 +0000 Stage in:21 Submitting:6 Submitted:3 Stage out:15 Finished successfully:272
> > > > > > Progress: time: Sat, 09 Mar 2013 23:20:40 +0000 Stage in:10 Active:5 Stage out:14 Finished successfully:288
> > > > > > Final status: Sat, 09 Mar 2013 23:20:41 +0000 Finished successfully:317
> > > > >
> > > > > real 0m58.953s
> > > > > user 0m32.573s
> > > > > sys 0m1.263s
> > > > > > + mv /home/wilde/.swift/runs/current/run029.1362871183 /home/wilde/.swift/runs/completed
> > > > > midway001$
> > > > >
> > > > >
> > > > >
> > > > > ----- Original Message -----
> > > > > > From: "David Kelly" <davidk at ci.uchicago.edu>
> > > > > > To: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > > > Sent: Saturday, March 9, 2013 5:12:59 PM
> > > > > > Subject: Re: runs for OSG talk
> > > > > >
> > > > > >
> > > > > > > Yep - I had a version where the input files were in a very
> > > > > > > similar format (PGM, 1 byte per pixel). I'll add that back,
> > > > > > > but without the small PGM header in the files.
> > > > > >
> > > > > > ----- Original Message -----
> > > > > >
> > > > > >
> > > > > > From: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > > > To: "David Kelly" <davidk at ci.uchicago.edu>
> > > > > > Sent: Saturday, March 9, 2013 5:04:43 PM
> > > > > > Subject: Re: runs for OSG talk
> > > > > >
> > > > > > > I think we need to cut down the size of these files for a demo
> > > > > > > (although they are great for a stress test).
> > > > > > >
> > > > > > > First, the RGB format by itself uses 3 bytes per pixel when it
> > > > > > > only needs one (for land use).
> > > > > > >
> > > > > > > Second, we should cut down by a factor of 9 (3x3) or 16 (4x4).
> > > > > > >
> > > > > > > I tried that using simple convert statements, but it always
> > > > > > > seems to yield a file exactly double what it should be.
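
The size reductions discussed here are simple arithmetic on a headerless raster. In the sketch below, the 2400-pixel tile side is an assumption (it is not stated in the thread; it is chosen only because 2400 x 2400 x 3 bytes comes to roughly the 17MB mentioned), and the function name is made up for this example.

```python
def raw_size_bytes(width, height, channels=3, bytes_per_sample=1, shrink=1):
    """Bytes in a headerless raster after shrinking each side by `shrink`."""
    return (width // shrink) * (height // shrink) * channels * bytes_per_sample


SIDE = 2400  # assumed tile side in pixels; not stated in the thread

full_rgb = raw_size_bytes(SIDE, SIDE)                        # 3 bytes per pixel
gray_3x3 = raw_size_bytes(SIDE, SIDE, channels=1, shrink=3)  # 1 byte per pixel, 9x fewer pixels
gray_4x4 = raw_size_bytes(SIDE, SIDE, channels=1, shrink=4)  # 1 byte per pixel, 16x fewer pixels
rgb_5x5 = raw_size_bytes(SIDE, SIDE, shrink=5)               # 25x fewer pixels, still RGB
gray_3x3_16bit = raw_size_bytes(SIDE, SIDE, channels=1,
                                bytes_per_sample=2, shrink=3)

if __name__ == "__main__":
    print(full_rgb, gray_3x3, gray_4x4, rgb_5x5, gray_3x3_16bit)
```

A file exactly double the expected size is what 2 bytes per sample would produce (the last line above); whether that is what happened with the convert runs is not established in this thread.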
> > > > > >
> > > > > > > More on this later; was hoping to get things working "as is"
> > > > > > > first.
> > > > > > >
> > > > > > > I assume you could get the perl code to work on
> > > > > > > one-byte-per-pixel instead of the default 3 for the convert
> > > > > > > rgb format?
> > > > > >
> > > > > > - Mike
> > > > > >
> > > > > > ----- Original Message -----
> > > > > > > From: "David Kelly" <davidk at ci.uchicago.edu>
> > > > > > > To: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > > > > Sent: Saturday, March 9, 2013 4:36:30 PM
> > > > > > > Subject: Re: runs for OSG talk
> > > > > > >
> > > > > > >
> > > > > > > > That would probably be a good idea for a new script, to show
> > > > > > > > how to stage apps like that. For now I updated the scripts on
> > > > > > > > lustre.. hopefully that helps.
> > > > > > >
> > > > > > > ----- Original Message -----
> > > > > > >
> > > > > > >
> > > > > > > From: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > > > > To: "David Kelly" <davidk at ci.uchicago.edu>
> > > > > > > Sent: Saturday, March 9, 2013 4:29:14 PM
> > > > > > > Subject: Re: runs for OSG talk
> > > > > > >
> > > > > > > > OK, I see that it's trying to run getlanduse.sh from your
> > > > > > > > /lustre dir on beagle, which is different from the one I've
> > > > > > > > got checked out. It seems to get an error in a stderr
> > > > > > > > redirect??? Let me see what I need to do to get the beagle
> > > > > > > > side in sync.
> > > > > > > >
> > > > > > > > Seems like since these are perl scripts, we should make the
> > > > > > > > app() /bin/sh and send the script as data, perhaps?
> > > > > > >
> > > > > > > - Mike
> > > > > > >
> > > > > > > ----- Original Message -----
> > > > > > > > From: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > > > > > To: "David Kelly" <davidk at ci.uchicago.edu>
> > > > > > > > Sent: Saturday, March 9, 2013 4:19:31 PM
> > > > > > > > Subject: Re: runs for OSG talk
> > > > > > > >
> > > > > > > > > OK, making progress. Now I dialed down the throttle and node
> > > > > > > > > counts to 48 jobs.
> > > > > > > >
> > > > > > > > Now I get further, for ./demo and site=4 script=2:
> > > > > > > >
> > > > > > > > RunID: 20130309-2214-1oi3rvea
> > > > > > > > Progress: time: Sat, 09 Mar 2013 22:14:06 +0000
> > > > > > > > > Progress: time: Sat, 09 Mar 2013 22:14:17 +0000 Selecting site:269 Submitting:47 Submitted:1
> > > > > > > > > Progress: time: Sat, 09 Mar 2013 22:14:22 +0000 Selecting site:269 Stage in:1 Submitted:47
> > > > > > > > > Progress: time: Sat, 09 Mar 2013 22:14:28 +0000 Selecting site:269 Stage in:25 Submitted:23
> > > > > > > > > Progress: time: Sat, 09 Mar 2013 22:14:36 +0000 Selecting site:269 Stage in:48
> > > > > > > > > Progress: time: Sat, 09 Mar 2013 22:15:06 +0000 Selecting site:269 Stage in:48
> > > > > > > > > Progress: time: Sat, 09 Mar 2013 22:15:36 +0000 Selecting site:269 Stage in:48
> > > > > > > > > Progress: time: Sat, 09 Mar 2013 22:16:06 +0000 Selecting site:269 Stage in:48
> > > > > > > > > Progress: time: Sat, 09 Mar 2013 22:16:26 +0000 Selecting site:269 Stage in:47 Active:1
> > > > > > > > > Progress: time: Sat, 09 Mar 2013 22:16:27 +0000 Selecting site:269 Stage in:36 Active:12
> > > > > > > > > Progress: time: Sat, 09 Mar 2013 22:16:29 +0000 Selecting site:269 Stage in:24 Active:24
> > > > > > > > > Progress: time: Sat, 09 Mar 2013 22:16:34 +0000 Selecting site:269 Stage in:24 Active:23 Stage out:1
> > > > > > > > > Progress: time: Sat, 09 Mar 2013 22:16:35 +0000 Selecting site:269 Stage in:14 Active:33 Stage out:1
> > > > > > > > Execution failed:
> > > > > > > > Exception in getlanduse:
> > > > > > > > > Arguments: [home/wilde/osgdemo/modis/svn/data/modis/2002/h08v04.rgb]
> > > > > > > > > Host: beagle
> > > > > > > > > Directory: modis02-20130309-2214-1oi3rvea/jobs/k/getlanduse-ko5qjd6l
> > > > > > > > >
> > > > > > > > > Caused by:
> > > > > > > > > Application /lustre/beagle/davidk/modis/bin/getlanduse.sh failed with an exit code of 1
> > > > > > > > getLandUse, modis02.swift, line 20
> > > > > > > >
> > > > > > > > real 2m31.463s
> > > > > > > > user 1m33.238s
> > > > > > > > sys 0m2.160s
> > > > > > > > > + mv /home/wilde/.swift/runs/current/run024.1362867244 /home/wilde/.swift/runs/completed
> > > > > > > > midway001$
> > > > > > > >
> > > > > > > >
> > > > > > > > ----- Original Message -----
> > > > > > > > > From: "David Kelly" <davidk at ci.uchicago.edu>
> > > > > > > > > To: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > > > > > > Sent: Saturday, March 9, 2013 3:55:30 PM
> > > > > > > > > Subject: Re: runs for OSG talk
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > ok, I'll take a look at that. The run dir I used was
> > > > > > > > > /scratch/midway/davidkelly999/modis/run011
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > ----- Original Message -----
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > From: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > > > > > > To: "David Kelly" <davidk at ci.uchicago.edu>
> > > > > > > > > Sent: Saturday, March 9, 2013 3:52:28 PM
> > > > > > > > > Subject: Re: runs for OSG talk
> > > > > > > > >
> > > > > > > > > > I just tried this, but it didn't work - same problem.
> > > > > > > > > >
> > > > > > > > > > But if it's working for you now, we must be close.
> > > > > > > > > >
> > > > > > > > > > Not yet sure what the difference is...
> > > > > > > > >
> > > > > > > > > My run dir is /home/wilde/osgdemo/modis/svn/run021
> > > > > > > > >
> > > > > > > > > - Mike
> > > > > > > > >
> > > > > > > > > ----- Original Message -----
> > > > > > > > > > From: "David Kelly" <davidk at ci.uchicago.edu>
> > > > > > > > > > To: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > > > > > > > Sent: Saturday, March 9, 2013 3:46:13 PM
> > > > > > > > > > Subject: Re: runs for OSG talk
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > Had to make sure I was using the IP address on eth4
> > > > > > > > > > > (128.135.112.71 for midway-login1), not a local address
> > > > > > > > > > > or an infiniband address.
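
Choosing which address to advertise via GLOBUS_HOSTNAME could be sketched as below. Only 128.135.112.71 comes from this thread; the other candidate addresses are hypothetical, and treating "local or infiniband" as "loopback, link-local, or private range" is an assumption that may not hold on every cluster.

```python
import ipaddress
import os


def pick_callback_address(candidates):
    """Return the first candidate that is not loopback, link-local, or private."""
    for text in candidates:
        addr = ipaddress.ip_address(text)
        if addr.is_loopback or addr.is_link_local or addr.is_private:
            continue
        return text
    return None


# The first two entries are hypothetical stand-ins for a loopback and an
# internal-fabric address; the last one is the eth4 address quoted above.
candidates = ["127.0.0.1", "10.50.181.1", "128.135.112.71"]
chosen = pick_callback_address(candidates)

# Environment for a child process (e.g. the swift command) with the
# variable set, leaving the current process environment untouched.
child_env = dict(os.environ, GLOBUS_HOSTNAME=chosen)

if __name__ == "__main__":
    print(child_env["GLOBUS_HOSTNAME"])
```

Passing an explicit environment to the launched process keeps the choice visible in one place instead of depending on whatever the login shell exported.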
> > > > > > > > > >
> > > > > > > > > > ----- Original Message -----
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > From: "David Kelly" <davidk at ci.uchicago.edu>
> > > > > > > > > > To: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > > > > > > > Sent: Saturday, March 9, 2013 3:43:51 PM
> > > > > > > > > > Subject: Re: runs for OSG talk
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > I just got it working. I had to adjust for the
> > > > > > > > > > > differences in my username on Beagle/Midway, then I had
> > > > > > > > > > > to set GLOBUS_HOSTNAME on Midway to the IP address,
> > > > > > > > > > > rather than the full hostname
> > > > > > > > > >
> > > > > > > > > > ----- Original Message -----
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > From: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > > > > > > > To: "David Kelly" <davidk at ci.uchicago.edu>
> > > > > > > > > > Sent: Saturday, March 9, 2013 3:40:03 PM
> > > > > > > > > > Subject: Re: runs for OSG talk
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > ----- Original Message -----
> > > > > > > > > > > From: "David Kelly" <davidk at ci.uchicago.edu>
> > > > > > > > > > > To: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > > > > > > > > Sent: Saturday, March 9, 2013 3:34:58 PM
> > > > > > > > > > > Subject: Re: runs for OSG talk
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Is your username the same on beagle and midway?
> > > > > > > > > >
> > > > > > > > > > > Yes. And I verified that I can ssh to login4 on beagle
> > > > > > > > > > > from my midway session (as indeed the scp's of the proxy
> > > > > > > > > > > files seem to be working)
> > > > > > > > > >
> > > > > > > > > > - Mike
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > ----- Original Message -----
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > From: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > > > > > > > > To: "David Kelly" <davidk at ci.uchicago.edu>
> > > > > > > > > > > Sent: Saturday, March 9, 2013 3:34:28 PM
> > > > > > > > > > > Subject: Re: runs for OSG talk
> > > > > > > > > > >
> > > > > > > > > > > OK.
> > > > > > > > > > >
> > > > > > > > > > > Ignore what I said about "problem finding java" - that's code in
> > > > > > > > > > > the very long escaped shell command that gets sent to the remote
> > > > > > > > > > > side. I don't *think* that's the problem.
> > > > > > > > > > >
> > > > > > > > > > > I also verified that beagle can connect to ports 50001 etc on
> > > > > > > > > > > swift.rcc, and that seems OK.
> > > > > > > > > > >
> > > > > > > > > > > I exported GLOBUS_HOSTNAME=midway001.rcc.uchicago.edu on the
> > > > > > > > > > > midway side. And the beagle side seems to be connecting there.
> > > > > > > > > > >
> > > > > > > > > > > I'm a bit confused about the timestamps I see for the proxy
> > > > > > > > > > > expiration time, but am not yet suspicious of that (although it
> > > > > > > > > > > seems less than 5 hours past GMT... not sure.)
> > > > > > > > > > >
> > > > > > > > > > > - Mike
> > > > > > > > > > >
> > > > > > > > > > > ----- Original Message -----
> > > > > > > > > > > > From: "David Kelly" <davidk at ci.uchicago.edu>
> > > > > > > > > > > > To: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > > > > > > > > > Sent: Saturday, March 9, 2013 3:26:32 PM
> > > > > > > > > > > > Subject: Re: runs for OSG talk
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > I'm seeing the same error now.. looking into it
> > > > > > > > > > > >
> > > > > > > > > > > > ----- Original Message -----
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > From: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > > > > > > > > > To: "David Kelly" <davidk at ci.uchicago.edu>
> > > > > > > > > > > > Sent: Saturday, March 9, 2013 3:21:30 PM
> > > > > > > > > > > > Subject: Re: runs for OSG talk
> > > > > > > > > > > >
> > > > > > > > > > > > Looking deeper I see that the logs show problems with finding
> > > > > > > > > > > > Java, I assume on beagle, and also service ending (presumably
> > > > > > > > > > > > coaster service on midway host).
> > > > > > > > > > > >
> > > > > > > > > > > > I'll dig into these two.
> > > > > > > > > > > >
> > > > > > > > > > > > I see that it scp's the proxies to beagle which I think answers
> > > > > > > > > > > > my question about security.
> > > > > > > > > > > >
> > > > > > > > > > > > - Mike
> > > > > > > > > > > >
> > > > > > > > > > > > ----- Original Message -----
> > > > > > > > > > > > > From: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > > > > > > > > > > To: "David Kelly" <davidk at ci.uchicago.edu>
> > > > > > > > > > > > > Sent: Saturday, March 9, 2013 3:15:01 PM
> > > > > > > > > > > > > Subject: Re: runs for OSG talk
> > > > > > > > > > > > >
> > > > > > > > > > > > > OK. Any thoughts about beagle?
> > > > > > > > > > > > >
> > > > > > > > > > > > > I've been experimenting but still can't get it to work, same
> > > > > > > > > > > > > error (can't connect to bootstrap port)
> > > > > > > > > > > > >
> > > > > > > > > > > > > When you tried ssh-cl to beagle with automatic coasters, what
> > > > > > > > > > > > > configuration (sites env etc) did you use?
> > > > > > > > > > > > >
> > > > > > > > > > > > > I verified that beagle can connect back to the midway hosts and
> > > > > > > > > > > > > ports.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Do we need to specify security or create a proxy etc?
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > >
> > > > > > > > > > > > > - Mike
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > ----- Original Message -----
> > > > > > > > > > > > > > From: "David Kelly" <davidk at ci.uchicago.edu>
> > > > > > > > > > > > > > To: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > > > > > > > > > > > Sent: Saturday, March 9, 2013 3:08:58 PM
> > > > > > > > > > > > > > Subject: Re: runs for OSG talk
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > One way you can override/customize the default templates is to
> > > > > > > > > > > > > > create them in $HOME/.swift/sites (I'm not sure if that's what
> > > > > > > > > > > > > > you mean by a local sites dir or not). But you are right about
> > > > > > > > > > > > > > Midway - I have noticed that when using modis it will sometimes
> > > > > > > > > > > > > > get stuck when it goes to a queue that is busy. Ideally swift
> > > > > > > > > > > > > > replication would be able to help better handle that, but I
> > > > > > > > > > > > > > haven't had much luck with that yet. Another way around this may
> > > > > > > > > > > > > > be to add this to the template:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > <profile namespace="globus" key="slurm.exclusive">false</profile>
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > The swift.log issue was never fixed. It went to swift-devel for
> > > > > > > > > > > > > > discussion but was never fixed. I think it is relatively simple
> > > > > > > > > > > > > > though.. probably worth fixing before release.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > ----- Original Message -----
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > From: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > > > > > > > > > > > To: "David Kelly" <davidk at ci.uchicago.edu>
> > > > > > > > > > > > > > Sent: Saturday, March 9, 2013 1:38:47 PM
> > > > > > > > > > > > > > Subject: Re: runs for OSG talk
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > OK, sounds good re the trip plan. Feel free to stay Tue night to
> > > > > > > > > > > > > > avoid a 4hr drive after a long day.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I'm trying the modis demo.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I tried to create a local sites/ dir so I can modify the sites
> > > > > > > > > > > > > > templates; that's not working for me either yet.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > For midway, need to force to westmere or sandyb (but not both)
> > > > > > > > > > > > > > and ensure 1-node jobs, because either queue can get filled and
> > > > > > > > > > > > > > not yield an idle node for a long time. Maybe need to fiddle
> > > > > > > > > > > > > > jobsPerNode to get at least 1 core when the system is busy and
> > > > > > > > > > > > > > *pretend* that it's a node.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > So to get response I tried beagle-ssh; that isn't working
> > > > > > > > > > > > > > because the template sites file is wrong in swift 0.94 rc4.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I also see that swift.log is still getting produced - I thought
> > > > > > > > > > > > > > we eliminated that. Did it come back due to a problem with that
> > > > > > > > > > > > > > fix?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I'll keep hacking; suggestions welcome.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > - Mike
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > ----- Original Message -----
> > > > > > > > > > > > > > > From: "David Kelly" <davidk at ci.uchicago.edu>
> > > > > > > > > > > > > > > To: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > > > > > > > > > > > > Sent: Saturday, March 9, 2013 12:20:00 PM
> > > > > > > > > > > > > > > Subject: Re: runs for OSG talk
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hi Mike,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Looking more closely at the agenda, I think the most
> > > > > > > > > > > > > > > interesting/useful talks will be on Tuesday. Monday I'll come
> > > > > > > > > > > > > > > to Argonne to work on any loose ends and put the finishing
> > > > > > > > > > > > > > > touches on any slides/runs/scripts, then drive to Indianapolis
> > > > > > > > > > > > > > > on Monday afternoon/evening. I have a hotel booked for Monday
> > > > > > > > > > > > > > > night.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I'll do some runs using the routes we talked about. I'm pretty
> > > > > > > > > > > > > > > sure I have working configurations for everything we talked
> > > > > > > > > > > > > > > about, so I think it's really just a matter of plugging in the
> > > > > > > > > > > > > > > apps.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > David
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > ----- Original Message -----
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > From: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > > > > > > > > > > > > To: "David Kelly" <davidk at ci.uchicago.edu>
> > > > > > > > > > > > > > > Sent: Saturday, March 9, 2013 11:03:15 AM
> > > > > > > > > > > > > > > Subject: runs for OSG talk
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hi David,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I just wanted to let you know that I'm looking into the run
> > > > > > > > > > > > > > > options now. I'm hoping to try a few... Will see how much help
> > > > > > > > > > > > > > > I need. Have you decided on a driving time and made hotel
> > > > > > > > > > > > > > > arrangements?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I would feel free to stay for whatever portion of the OSG
> > > > > > > > > > > > > > > meeting you feel is of value. The only thing I ask is that for
> > > > > > > > > > > > > > > Wed and Thu you stay available online for user-support or other
> > > > > > > > > > > > > > > assistance needs that come up here. And that you engage with
> > > > > > > > > > > > > > > people that can help us develop the Swift user community and
> > > > > > > > > > > > > > > reliable OSG usage. Rob, Marco, Lincoln, and Suchandra would be
> > > > > > > > > > > > > > > good to hang out with and they can introduce you to good
> > > > > > > > > > > > > > > contacts.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Of course we will cover your expenses via a UChicago travel
> > > > > > > > > > > > > > > expense report.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > We'll be starting a project with a tiny bit of additional
> > > > > > > > > > > > > > > ExTENCI funds to make Swift do smarter data management on OSG
> > > > > > > > > > > > > > > sites (and in general) so anything you learn about OSG storage
> > > > > > > > > > > > > > > elements/services/tools will be valuable for that (srmcp,
> > > > > > > > > > > > > > > lcgcp, etc).
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Between now and your talk, let's just focus on the talk, OK?
> > > > > > > > > > > > > > > I'm hoping we have slides frozen by Monday.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > While I fiddle, if you could do catsn or other hello-world-like
> > > > > > > > > > > > > > > tests to cover the "routes" we discussed, that would pave the
> > > > > > > > > > > > > > > way for plugging in the real app examples.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Sound good? Let me know of any concerns (other than the fact
> > > > > > > > > > > > > > > that this is a tad rushed ;)
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks and regards,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > - Mike
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > > Michael Wilde
> > > > > > > > > > > > > > > Computation Institute, University of Chicago
> > > > > > > > > > > > > > > Mathematics and Computer Science Division
> > > > > > > > > > > > > > > Argonne National Laboratory
> > > > > _______________________________________________
> > > > > Swift-devel mailing list
> > > > > Swift-devel at ci.uchicago.edu
> > > > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
>
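The checks described in this thread (confirming that the remote side can reach a port such as 50001 on the submit host, and giving GLOBUS_HOSTNAME an IP address instead of a hostname) can be sketched in a few lines of Python. This is only an illustration and not part of Swift or CoG: the function names are made up, and it uses nothing beyond the standard socket module.

```python
import socket


def resolve_ip(hostname):
    """Return the IPv4 address the local resolver reports for hostname."""
    return socket.gethostbyname(hostname)


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def export_line(ip):
    """Build the shell line that sets GLOBUS_HOSTNAME to the given address."""
    return "export GLOBUS_HOSTNAME={}".format(ip)


# Example use (host and port are placeholders; substitute your own):
#   ip = resolve_ip("midway001.rcc.uchicago.edu")
#   print(export_line(ip))
#   print(can_connect(ip, 50001))
```

Run the resolve step on the submit host and the connect step from the remote login node; a False result there points at a firewall or an address the remote side cannot route to.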