[Swift-devel] Cant get auto-coasters to run from midway to beagle

Michael Wilde wilde at mcs.anl.gov
Sun Mar 10 16:37:49 CDT 2013


Mihael, it seems that the problem is still there under the current trunk - see below.  This is in:

midway:/home/wilde/osgdemo/modis/svn/run035.

The "cog modified locally" is a hopefully inconsequential change in worker.pl where I open stdin to /dev/null rather than close it, before launching an app, to remedy an unrelated MPI problem.
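
A minimal sketch of the stdin idea, written in Python rather than the Perl of worker.pl, and not the actual patch: the child process is started with standard input attached to /dev/null, so a read sees end-of-file immediately instead of hitting a closed descriptor. The child command is a hypothetical stand-in for an app.

```python
import subprocess
import sys

# Hypothetical stand-in for an app: read all of stdin, report how many characters arrived.
CHILD_CODE = "import sys; print(len(sys.stdin.read()))"


def launch_with_devnull_stdin():
    """Start the child with stdin opened on /dev/null instead of closed."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        stdin=subprocess.DEVNULL,  # fd 0 stays valid; reads return EOF at once
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


print(launch_with_devnull_stdin())
```

The child reports zero characters read and exits cleanly, which is the behavior an app expects when it probes stdin at startup.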

- Mike

Swift trunk swift-r6362 cog-r3637 (cog modified locally)

RunID: 20130310-2055-4lqjiftd
Progress:  time: Sun, 10 Mar 2013 20:55:52 +0000
Progress:  time: Sun, 10 Mar 2013 20:56:06 +0000  Selecting site:269  Submitting:47  Submitted:1
Progress:  time: Sun, 10 Mar 2013 20:56:12 +0000  Selecting site:269  Stage in:1  Submitted:47
Progress:  time: Sun, 10 Mar 2013 20:56:16 +0000  Selecting site:269  Stage in:25  Submitted:23
Progress:  time: Sun, 10 Mar 2013 20:56:22 +0000  Selecting site:269  Stage in:48
Progress:  time: Sun, 10 Mar 2013 20:56:52 +0000  Selecting site:269  Stage in:48
Progress:  time: Sun, 10 Mar 2013 20:57:22 +0000  Selecting site:269  Stage in:48
Progress:  time: Sun, 10 Mar 2013 20:57:52 +0000  Selecting site:269  Stage in:48
Progress:  time: Sun, 10 Mar 2013 20:58:19 +0000  Selecting site:269  Stage in:47  Active:1
Progress:  time: Sun, 10 Mar 2013 20:58:20 +0000  Selecting site:269  Stage in:26  Active:22
Progress:  time: Sun, 10 Mar 2013 20:58:22 +0000  Selecting site:269  Stage in:24  Active:24
Progress:  time: Sun, 10 Mar 2013 20:58:24 +0000  Selecting site:269  Stage in:23  Active:25
Progress:  time: Sun, 10 Mar 2013 20:58:26 +0000  Selecting site:269  Active:47  Stage out:1
Progress:  time: Sun, 10 Mar 2013 20:58:27 +0000  Selecting site:260  Stage in:7  Submitting:1  Submitted:1  Active:39  Finished successfully:9
Progress:  time: Sun, 10 Mar 2013 20:58:28 +0000  Selecting site:258  Stage in:9  Submitting:1  Submitted:1  Active:24  Stage out:13  Finished successfully:11
Progress:  time: Sun, 10 Mar 2013 20:58:29 +0000  Selecting site:245  Stage in:23  Submitted:1  Active:24  Finished successfully:24
Progress:  time: Sun, 10 Mar 2013 20:58:31 +0000  Selecting site:245  Stage in:24  Active:23  Stage out:1  Finished successfully:24
Progress:  time: Sun, 10 Mar 2013 20:58:32 +0000  Selecting site:245  Stage in:24  Active:23  Finished successfully:25
Progress:  time: Sun, 10 Mar 2013 20:58:34 +0000  Selecting site:244  Stage in:24  Submitting:1  Stage out:22  Finished successfully:26
Progress:  time: Sun, 10 Mar 2013 20:58:35 +0000  Selecting site:221  Stage in:25  Submitting:22  Submitted:1  Finished successfully:48
Progress:  time: Sun, 10 Mar 2013 20:58:52 +0000  Selecting site:221  Stage in:48  Finished successfully:48
Progress:  time: Sun, 10 Mar 2013 20:59:22 +0000  Selecting site:221  Stage in:48  Finished successfully:48
Progress:  time: Sun, 10 Mar 2013 20:59:52 +0000  Selecting site:221  Stage in:48  Finished successfully:48
Progress:  time: Sun, 10 Mar 2013 20:59:56 +0000  Selecting site:221  Stage in:47  Active:1  Finished successfully:48
Progress:  time: Sun, 10 Mar 2013 21:00:02 +0000  Selecting site:221  Stage in:47  Stage out:1  Finished successfully:48
Progress:  time: Sun, 10 Mar 2013 21:00:05 +0000  Selecting site:221  Stage in:47  Finished successfully:49
Channels: {null at https://192.5.86.107:50000=MetaChannel[https://192.5.86.107:50000] -> GSSCChannel-https://192.5.86.107:50000(2)[https://192.5.86.107:50000], null at id://u-3bec3eab-13d5616c4bd--8000-u4684d136-13d5616c4d0--8000S=MetaChannel[service-60734] -> BufferingChannel, /C=US/O=JavaCoG/OU=AutoCA/CN=User at https://192.5.86.107:50000=MetaChannel[service-60734] -> BufferingChannel, null at id://u4684d136-13d5616c4d0--7fff-u-3bec3eab-13d5616c4bd--7fffC=MetaChannel[https://192.5.86.107:50000] -> GSSCChannel-https://192.5.86.107:50000(2)[https://192.5.86.107:50000]}
Context: service-60018
Meta context: service-60734
Progress:  time: Sun, 10 Mar 2013 21:00:07 +0000  Selecting site:220  Stage in:47  Submitted:1  Finished successfully:49
Channels: {null at https://192.5.86.107:50000=MetaChannel[https://192.5.86.107:50000] -> GSSCChannel-https://192.5.86.107:50000(2)[https://192.5.86.107:50000], null at id://u-3bec3eab-13d5616c4bd--8000-u4684d136-13d5616c4d0--8000S=MetaChannel[service-60734] -> BufferingChannel, /C=US/O=JavaCoG/OU=AutoCA/CN=User at https://192.5.86.107:50000=MetaChannel[service-60734] -> BufferingChannel, null at id://u4684d136-13d5616c4d0--7fff-u-3bec3eab-13d5616c4bd--7fffC=MetaChannel[https://192.5.86.107:50000] -> GSSCChannel-https://192.5.86.107:50000(2)[https://192.5.86.107:50000]}
Context: service-60263
Meta context: service-60734
Channels: {null at https://192.5.86.107:50000=MetaChannel[https://192.5.86.107:50000] -> GSSCChannel-https://192.5.86.107:50000(2)[https://192.5.86.107:50000], null at id://u-3bec3eab-13d5616c4bd--8000-u4684d136-13d5616c4d0--8000S=MetaChannel[service-60734] -> BufferingChannel, /C=US/O=JavaCoG/OU=AutoCA/CN=User at https://192.5.86.107:50000=MetaChannel[service-60734] -> BufferingChannel, null at id://u4684d136-13d5616c4d0--7fff-u-3bec3eab-13d5616c4bd--7fffC=MetaChannel[https://192.5.86.107:50000] -> GSSCChannel-https://192.5.86.107:50000(2)[https://192.5.86.107:50000]}
Context: service-60408
Meta context: service-60734
Progress:  time: Sun, 10 Mar 2013 21:00:18 +0000  Selecting site:220  Stage in:47  Active:1  Finished successfully:49
Progress:  time: Sun, 10 Mar 2013 21:00:19 +0000  Selecting site:220  Stage in:46  Active:2  Finished successfully:49
Execution failed:
	Exception in getlanduse:
    Arguments: [home/wilde/osgdemo/modis/svn/data/modis/2002/h12v09.rgb]
    Host: beagle
    Directory: modis02-20130310-2055-4lqjiftd/jobs/y/getlanduse-yht64f6l

Caused by:
	Shutting down worker
	getLandUse, modis02.swift, line 20
error null

real	4m29.509s
user	2m45.981s
sys	0m3.520s
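
To compare how far into a run the failure hits, the Progress timestamps can be subtracted directly. This is a rough sketch that assumes the "Progress:  time: <date>" layout shown in the log above; the function names are made up for illustration, and the sample lines are copied from this run.

```python
from email.utils import parsedate_to_datetime

SAMPLE = """\
Progress:  time: Sun, 10 Mar 2013 20:55:52 +0000
Progress:  time: Sun, 10 Mar 2013 20:58:35 +0000  Selecting site:221  Stage in:25  Submitting:22  Submitted:1  Finished successfully:48
Progress:  time: Sun, 10 Mar 2013 21:00:19 +0000  Selecting site:220  Stage in:46  Active:2  Finished successfully:49
"""


def progress_times(log_text):
    """Return the timestamp of every Progress line as an aware datetime."""
    times = []
    for line in log_text.splitlines():
        if line.startswith("Progress:") and "time:" in line:
            rest = line.split("time:", 1)[1]
            # The date is the first six whitespace-separated tokens,
            # e.g. "Sun, 10 Mar 2013 20:55:52 +0000".
            stamp = " ".join(rest.split()[:6])
            times.append(parsedate_to_datetime(stamp))
    return times


def elapsed_seconds(log_text):
    """Seconds between the first and last Progress line."""
    times = progress_times(log_text)
    return (times[-1] - times[0]).total_seconds()


print(elapsed_seconds(SAMPLE))
```

For this run the first and last Progress lines are 267 seconds apart, in line with the "real 4m29" wall-clock time.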


----- Original Message -----
> From: "Michael Wilde" <wilde at mcs.anl.gov>
> To: "Mihael Hategan" <hategan at mcs.anl.gov>
> Cc: "Swift Devel" <swift-devel at ci.uchicago.edu>
> Sent: Sunday, March 10, 2013 3:20:53 PM
> Subject: Re: [Swift-devel] Cant get auto-coasters to run	from	midway	to beagle
> 
> Duh. Thank you.  I didn't build a new release, was using same 0.94
> RC4 code.
> 
> Sorry about that.  Will retest.
> 
> - Mike
> 
> 
> ----- Original Message -----
> > From: "Mihael Hategan" <hategan at mcs.anl.gov>
> > To: "Michael Wilde" <wilde at mcs.anl.gov>
> > Cc: "Swift Devel" <swift-devel at ci.uchicago.edu>
> > Sent: Sunday, March 10, 2013 3:06:25 PM
> > Subject: Re: [Swift-devel] Cant get auto-coasters to run	from
> > 	midway	to beagle
> > 
> > ChannelContext Notifying commands and handlers about exception
> > org.globus.cog.karajan.workflow.service.TimeoutException: Channel timed out.
> > lastTime=940817-071255.807, now=130310-164156.506,
> > channel=GSSSChannel-1463847073(1)[service-60519]
> > 
> > Are you sure you are running with the latest code? There was a (mostly
> > inconsequential) bug that set lastTime to Long.MAX_VALUE before
> > creating that exception. That was fixed. Your message indicates that
> > the code you are using does not have that fix (year xx94 is what comes
> > out of Long.MAX_VALUE).
> > 
> > I gotta go now, but I'll come back later and check some more. There is
> > something weird going on there besides that.
> > 
> > Mihael
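
The xx94 observation can be checked by hand: take the largest signed 64-bit value (Long.MAX_VALUE in Java) as milliseconds since 1970-01-01 UTC and format it. Python's datetime cannot represent a year that large, so this sketch does the Gregorian calendar arithmetic itself. The yyMMdd-HHmmss.SSS layout is inferred from the log line above, not taken from the CoG source.

```python
LONG_MAX = 2**63 - 1  # largest signed 64-bit value, read as milliseconds since the epoch


def civil_from_days(days):
    """Convert days since 1970-01-01 to (year, month, day), proleptic Gregorian."""
    z = days + 719468                 # shift the origin to 0000-03-01
    era, doe = divmod(z, 146097)      # 400-year eras, day within the era
    yoe = (doe - doe // 1460 + doe // 36524 - doe // 146096) // 365
    doy = doe - (365 * yoe + yoe // 4 - yoe // 100)
    mp = (5 * doy + 2) // 153         # month index counted from March
    day = doy - (153 * mp + 2) // 5 + 1
    month = mp + 3 if mp < 10 else mp - 9
    year = yoe + era * 400 + (1 if month <= 2 else 0)
    return year, month, day


def format_millis(ms):
    """Format epoch milliseconds as yyMMdd-HHmmss.SSS in UTC (two-digit year)."""
    days, rem = divmod(ms, 86_400_000)
    year, month, day = civil_from_days(days)
    hh, rem = divmod(rem, 3_600_000)
    mm, rem = divmod(rem, 60_000)
    ss, msec = divmod(rem, 1000)
    return f"{year % 100:02d}{month:02d}{day:02d}-{hh:02d}{mm:02d}{ss:02d}.{msec:03d}"


print(civil_from_days(LONG_MAX // 86_400_000)[0])  # full year
print(format_millis(LONG_MAX))
```

The full year comes out as 292278994, and the formatted string is 940817-071255.807, the same lastTime value that appears in the exception.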
> > 
> > On Sun, 2013-03-10 at 12:01 -0500, Michael Wilde wrote:
> > > Here's run034: seems to be a bit better, but still dies.  This is
> > > with throttle of 48 jobs on 48 cores (2 nodes), from swift.rcc to
> > > beagle.  17MB files. Still seems to curiously die about 4 mins
> > > into the run, which suggests some kind of timeout is still
> > > lurking???
> > > 
> > > - Mike
> > > 
> > > Swift 0.94RC4 swift-r6284 cog-r3607 (cog modified locally)
> > > 
> > > RunID: 20130310-1639-kyb8hca9
> > > Progress:  time: Sun, 10 Mar 2013 16:39:45 +0000
> > > Progress:  time: Sun, 10 Mar 2013 16:39:56 +0000  Selecting site:269  Submitting:47  Submitted:1
> > > Progress:  time: Sun, 10 Mar 2013 16:40:01 +0000  Selecting site:269  Stage in:1  Submitted:47
> > > Progress:  time: Sun, 10 Mar 2013 16:40:15 +0000  Selecting site:269  Stage in:48
> > > Progress:  time: Sun, 10 Mar 2013 16:40:45 +0000  Selecting site:269  Stage in:48
> > > Progress:  time: Sun, 10 Mar 2013 16:41:15 +0000  Selecting site:269  Stage in:48
> > > Progress:  time: Sun, 10 Mar 2013 16:41:45 +0000  Selecting site:269  Stage in:48
> > > Progress:  time: Sun, 10 Mar 2013 16:42:11 +0000  Selecting site:269  Stage in:47  Active:1
> > > Progress:  time: Sun, 10 Mar 2013 16:42:12 +0000  Selecting site:269  Stage in:41  Active:7
> > > Progress:  time: Sun, 10 Mar 2013 16:42:13 +0000  Selecting site:269  Stage in:23  Active:25
> > > Progress:  time: Sun, 10 Mar 2013 16:42:15 +0000  Selecting site:269  Active:48
> > > Progress:  time: Sun, 10 Mar 2013 16:42:17 +0000  Selecting site:269  Active:47  Stage out:1
> > > Progress:  time: Sun, 10 Mar 2013 16:42:18 +0000  Selecting site:268  Stage in:1  Active:46  Stage out:1  Finished successfully:1
> > > Progress:  time: Sun, 10 Mar 2013 16:42:19 +0000  Selecting site:265  Stage in:3  Submitted:1  Active:42  Stage out:2  Finished successfully:4
> > > Progress:  time: Sun, 10 Mar 2013 16:42:20 +0000  Selecting site:258  Stage in:6  Submitting:5  Active:23  Stage out:13  Finished successfully:12
> > > Progress:  time: Sun, 10 Mar 2013 16:42:21 +0000  Selecting site:244  Stage in:24  Submitting:1  Active:20  Stage out:3  Finished successfully:25
> > > Progress:  time: Sun, 10 Mar 2013 16:42:23 +0000  Selecting site:241  Stage in:25  Submitting:3  Stage out:19  Finished successfully:29
> > > Progress:  time: Sun, 10 Mar 2013 16:42:24 +0000  Selecting site:221  Stage in:28  Submitting:19  Submitted:1  Finished successfully:48
> > > Progress:  time: Sun, 10 Mar 2013 16:42:45 +0000  Selecting site:221  Stage in:48  Finished successfully:48
> > > Progress:  time: Sun, 10 Mar 2013 16:42:54 +0000  Selecting site:221  Stage in:47  Active:1  Finished successfully:48
> > > Progress:  time: Sun, 10 Mar 2013 16:43:00 +0000  Selecting site:221  Stage in:47  Stage out:1  Finished successfully:48
> > > Progress:  time: Sun, 10 Mar 2013 16:43:02 +0000  Selecting site:221  Stage in:47  Finished successfully:49
> > > Progress:  time: Sun, 10 Mar 2013 16:43:05 +0000  Selecting site:220  Stage in:47  Submitted:1  Finished successfully:49
> > > Progress:  time: Sun, 10 Mar 2013 16:43:15 +0000  Selecting site:220  Stage in:48  Finished successfully:49
> > > Progress:  time: Sun, 10 Mar 2013 16:43:45 +0000  Selecting site:220  Stage in:48  Finished successfully:49
> > > Channels: {null at https://192.5.86.107:50000=MetaChannel[https://192.5.86.107:50000] -> GSSCChannel-https://192.5.86.107:50000(2)[https://192.5.86.107:50000], /C=US/O=JavaCoG/OU=AutoCA/CN=User at https://192.5.86.107:50000=MetaChannel[service-60519] -> BufferingChannel, null at id://u7b315f9a-13d552c3f68--7fff-u-362f30fc-13d552c3f50--7fffC=MetaChannel[https://192.5.86.107:50000] -> GSSCChannel-https://192.5.86.107:50000(2)[https://192.5.86.107:50000], null at id://u-362f30fc-13d552c3f50--8000-u7b315f9a-13d552c3f68--8000S=MetaChannel[service-60519] -> BufferingChannel}
> > > Context: service-60859
> > > Meta context: service-60519
> > > Progress:  time: Sun, 10 Mar 2013 16:43:59 +0000  Selecting site:220  Stage in:47  Active:1  Finished successfully:49
> > > Channels: {null at https://192.5.86.107:50000=MetaChannel[https://192.5.86.107:50000] -> GSSCChannel-https://192.5.86.107:50000(2)[https://192.5.86.107:50000], /C=US/O=JavaCoG/OU=AutoCA/CN=User at https://192.5.86.107:50000=MetaChannel[service-60519] -> BufferingChannel, null at id://u7b315f9a-13d552c3f68--7fff-u-362f30fc-13d552c3f50--7fffC=MetaChannel[https://192.5.86.107:50000] -> GSSCChannel-https://192.5.86.107:50000(2)[https://192.5.86.107:50000], null at id://u-362f30fc-13d552c3f50--8000-u7b315f9a-13d552c3f68--8000S=MetaChannel[service-60519] -> BufferingChannel}
> > > Context: service-60663
> > > Meta context: service-60519
> > > Progress:  time: Sun, 10 Mar 2013 16:44:05 +0000  Selecting site:220  Stage in:47  Stage out:1  Finished successfully:49
> > > Progress:  time: Sun, 10 Mar 2013 16:44:07 +0000  Selecting site:220  Stage in:47  Finished successfully:50
> > > Channels: {null at https://192.5.86.107:50000=MetaChannel[https://192.5.86.107:50000] -> GSSCChannel-https://192.5.86.107:50000(2)[https://192.5.86.107:50000], /C=US/O=JavaCoG/OU=AutoCA/CN=User at https://192.5.86.107:50000=MetaChannel[service-60519] -> BufferingChannel, null at id://u7b315f9a-13d552c3f68--7fff-u-362f30fc-13d552c3f50--7fffC=MetaChannel[https://192.5.86.107:50000] -> GSSCChannel-https://192.5.86.107:50000(2)[https://192.5.86.107:50000], null at id://u-362f30fc-13d552c3f50--8000-u7b315f9a-13d552c3f68--8000S=MetaChannel[service-60519] -> BufferingChannel}
> > > Context: service-60081
> > > Meta context: service-60519
> > > Progress:  time: Sun, 10 Mar 2013 16:44:09 +0000  Selecting site:219  Stage in:45  Submitting:1  Active:2  Finished successfully:50
> > > Execution failed:
> > > 	Exception in getlanduse:
> > >     Arguments: [home/wilde/osgdemo/modis/svn/data/modis/2002/h02v11.rgb]
> > >     Host: beagle
> > >     Directory: modis02-20130310-1639-kyb8hca9/jobs/9/getlanduse-90fyse6l
> > > 
> > > Caused by:
> > > 	Shutting down worker
> > > 	getLandUse, modis02.swift, line 20
> > > error null
> > > 
> > > real	4m27.007s
> > > user	2m44.221s
> > > sys	0m3.448s
> > > + mv /home/wilde/.swift/runs/current/run034.1362933583
> > > /home/wilde/.swift/runs/completed
> > > midway001$
> > > 
> > > 
> > > ----- Original Message -----
> > > > From: "Mihael Hategan" <hategan at mcs.anl.gov>
> > > > To: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > Cc: "Swift Devel" <swift-devel at ci.uchicago.edu>
> > > > Sent: Sunday, March 10, 2013 1:36:26 AM
> > > > Subject: Re: [Swift-devel] Cant get auto-coasters to run	from
> > > > 	midway	to	beagle
> > > > 
> > > > Please try now. I made some changes:
> > > > 
> > > > 1. start the service with "-l" so that things in your .profile (such
> > > > as module load sun-java) would be picked up. However, this also
> > > > means that you should unset X509_* stuff or the sshcl proxy
> > > > forwarding will not work properly.
> > > > 
> > > > 2. I fixed a bug that caused an extra connection to the coaster
> > > > service. Normally the service connects back to the client and both
> > > > use that connection. However, due to some changes in the way
> > > > credentials were set for jobs, and the fact that connections were
> > > > looked up based on both hostname and credential, the coaster client
> > > > would ignore the existing connection and create another one. The
> > > > initial one would then time out at some point, causing the service
> > > > to crash.
> > > > 
> > > > Mihael
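
The lookup problem described in item 2 can be pictured with a toy connection cache. This is only an illustration under assumed names (ConnectionCache, key_fn), not the CoG/coaster code, and keying by host alone is shown for contrast rather than as the actual fix.

```python
class ConnectionCache:
    """Toy cache that opens a new connection whenever the lookup key is unseen."""

    def __init__(self, key_fn):
        self._key_fn = key_fn        # builds the lookup key from (host, credential)
        self._connections = {}
        self.opened = 0              # how many connections were created

    def get(self, host, credential):
        key = self._key_fn(host, credential)
        if key not in self._connections:
            self.opened += 1
            self._connections[key] = f"connection-{self.opened} to {host}"
        return self._connections[key]


HOST = "https://192.5.86.107:50000"
USER_DN = "/C=US/O=JavaCoG/OU=AutoCA/CN=User"

# Keyed by host AND credential: the connection registered without a
# credential is not found again once a job supplies one.
keyed_by_both = ConnectionCache(lambda host, cred: (host, cred))
keyed_by_both.get(HOST, None)       # service connects back to the client
keyed_by_both.get(HOST, USER_DN)    # job lookup carries a credential

# Keyed by host only: the existing connection is reused.
keyed_by_host = ConnectionCache(lambda host, cred: host)
keyed_by_host.get(HOST, None)
keyed_by_host.get(HOST, USER_DN)

print(keyed_by_both.opened, keyed_by_host.opened)
```

With the two-part key, two connections end up open to the same service, and the unused first one is the kind of idle channel that can later time out.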
> > > > 
> > > > On Sat, 2013-03-09 at 17:49 -0600, Michael Wilde wrote:
> > > > > An update on this provider staging related issue: reducing filesize
> > > > > from 17MB to 600KB runs well.
> > > > > 
> > > > > So seems like some kind of flow control or buffer management
> > > > > problem, possibly?
> > > > > 
> > > > > May need to take that problem offline - would be a perfect test
> > > > > case for Yadu to develop a new stress test for.
> > > > > 
> > > > > - Mike
> > > > > 
> > > > > 
> > > > > ----- Forwarded Message -----
> > > > > From: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > > To: "David Kelly" <davidk at ci.uchicago.edu>
> > > > > Sent: Saturday, March 9, 2013 5:21:49 PM
> > > > > Subject: Re: runs for OSG talk
> > > > > 
> > > > > > OK, much better: with 600K files (5x5 reduction or 25X smaller) it
> > > > > > works well, and fast (from midway to beagle!)
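
Rough arithmetic behind the reduction: shrinking each image dimension by 5 cuts the pixel count, and so the raw file size, by 25. A sketch, taking 17 MB as exactly 17,000,000 bytes, which is an assumption; the real files may differ slightly.

```python
def reduced_size(size_bytes, factor_per_dimension):
    """Raw size after shrinking width and height by the same integer factor."""
    return size_bytes // (factor_per_dimension ** 2)


original = 17_000_000              # assumed byte count for a "17MB" file
print(reduced_size(original, 5))   # 5x5 reduction, 25 times fewer pixels
```

That gives 680,000 bytes, in the same range as the roughly 600K files mentioned above.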
> > > > > 
> > > > > Swift 0.94RC4 swift-r6284 cog-r3607 (cog modified locally)
> > > > > 
> > > > > RunID: 20130309-2319-5zq0jrfg
> > > > > Progress:  time: Sat, 09 Mar 2013 23:19:45 +0000
> > > > > > Progress:  time: Sat, 09 Mar 2013 23:19:56 +0000  Selecting site:269  Submitting:47  Submitted:1
> > > > > > Progress:  time: Sat, 09 Mar 2013 23:20:05 +0000  Selecting site:269  Stage in:1  Submitted:47
> > > > > > Progress:  time: Sat, 09 Mar 2013 23:20:09 +0000  Selecting site:269  Stage in:47  Active:1
> > > > > > Progress:  time: Sat, 09 Mar 2013 23:20:10 +0000  Selecting site:269  Stage in:46  Active:1  Stage out:1
> > > > > > Progress:  time: Sat, 09 Mar 2013 23:20:11 +0000  Selecting site:250  Stage in:19  Active:28  Stage out:1  Finished successfully:19
> > > > > > Progress:  time: Sat, 09 Mar 2013 23:20:12 +0000  Selecting site:229  Stage in:18  Submitting:21  Active:1  Stage out:7  Finished successfully:41
> > > > > > Progress:  time: Sat, 09 Mar 2013 23:20:13 +0000  Selecting site:220  Stage in:41  Submitting:1  Active:5  Stage out:1  Finished successfully:49
> > > > > > Progress:  time: Sat, 09 Mar 2013 23:20:14 +0000  Selecting site:220  Stage in:38  Active:1  Stage out:9  Finished successfully:49
> > > > > > Progress:  time: Sat, 09 Mar 2013 23:20:15 +0000  Selecting site:212  Stage in:30  Submitting:8  Stage out:9  Finished successfully:58
> > > > > > Progress:  time: Sat, 09 Mar 2013 23:20:16 +0000  Selecting site:203  Stage in:38  Submitting:8  Submitted:1  Finished successfully:67
> > > > > > Progress:  time: Sat, 09 Mar 2013 23:20:18 +0000  Selecting site:202  Stage in:19  Stage out:28  Finished successfully:68
> > > > > > Progress:  time: Sat, 09 Mar 2013 23:20:19 +0000  Selecting site:172  Stage in:33  Submitting:2  Submitted:6  Active:5  Stage out:2  Finished successfully:97
> > > > > > Progress:  time: Sat, 09 Mar 2013 23:20:20 +0000  Selecting site:170  Stage in:31  Submitting:2  Stage out:14  Finished successfully:100
> > > > > > Progress:  time: Sat, 09 Mar 2013 23:20:21 +0000  Selecting site:162  Stage in:30  Submitting:10  Stage out:6  Finished successfully:109
> > > > > > Progress:  time: Sat, 09 Mar 2013 23:20:22 +0000  Selecting site:154  Stage in:39  Submitting:5  Submitted:3  Active:1  Finished successfully:115
> > > > > > Progress:  time: Sat, 09 Mar 2013 23:20:23 +0000  Selecting site:154  Stage in:21  Active:10  Stage out:16  Finished successfully:116
> > > > > > Progress:  time: Sat, 09 Mar 2013 23:20:24 +0000  Selecting site:126  Stage in:20  Submitting:25  Submitted:1  Stage out:2  Finished successfully:143
> > > > > > Progress:  time: Sat, 09 Mar 2013 23:20:25 +0000  Selecting site:124  Stage in:31  Active:2  Stage out:15  Finished successfully:145
> > > > > > Progress:  time: Sat, 09 Mar 2013 23:20:26 +0000  Selecting site:110  Stage in:30  Submitting:14  Stage out:3  Finished successfully:160
> > > > > > Progress:  time: Sat, 09 Mar 2013 23:20:27 +0000  Selecting site:106  Stage in:43  Submitting:1  Submitted:1  Active:1  Stage out:2  Finished successfully:163
> > > > > > Progress:  time: Sat, 09 Mar 2013 23:20:28 +0000  Selecting site:104  Stage in:20  Submitting:2  Active:7  Stage out:19  Finished successfully:165
> > > > > > Progress:  time: Sat, 09 Mar 2013 23:20:29 +0000  Selecting site:78  Stage in:29  Submitting:16  Submitted:1  Stage out:2  Finished successfully:191
> > > > > > Progress:  time: Sat, 09 Mar 2013 23:20:31 +0000  Selecting site:76  Stage in:30  Stage out:17  Finished successfully:194
> > > > > > Progress:  time: Sat, 09 Mar 2013 23:20:32 +0000  Selecting site:58  Stage in:29  Submitting:18  Active:1  Finished successfully:211
> > > > > > Progress:  time: Sat, 09 Mar 2013 23:20:33 +0000  Selecting site:58  Stage in:33  Active:3  Stage out:12  Finished successfully:211
> > > > > > Progress:  time: Sat, 09 Mar 2013 23:20:34 +0000  Selecting site:46  Stage in:18  Submitting:11  Submitted:1  Active:2  Stage out:14  Finished successfully:225
> > > > > > Progress:  time: Sat, 09 Mar 2013 23:20:35 +0000  Selecting site:30  Stage in:29  Active:14  Stage out:3  Finished successfully:241
> > > > > > Progress:  time: Sat, 09 Mar 2013 23:20:36 +0000  Selecting site:28  Stage in:28  Submitting:2  Stage out:17  Finished successfully:242
> > > > > > Progress:  time: Sat, 09 Mar 2013 23:20:37 +0000  Selecting site:10  Stage in:30  Submitting:17  Submitted:1  Finished successfully:259
> > > > > > Progress:  time: Sat, 09 Mar 2013 23:20:38 +0000  Selecting site:10  Stage in:35  Stage out:13  Finished successfully:259
> > > > > > Progress:  time: Sat, 09 Mar 2013 23:20:39 +0000  Stage in:21  Submitting:6  Submitted:3  Stage out:15  Finished successfully:272
> > > > > > Progress:  time: Sat, 09 Mar 2013 23:20:40 +0000  Stage in:10  Active:5  Stage out:14  Finished successfully:288
> > > > > > Final status: Sat, 09 Mar 2013 23:20:41 +0000  Finished successfully:317
> > > > > 
> > > > > real	0m58.953s
> > > > > user	0m32.573s
> > > > > sys	0m1.263s
> > > > > + mv /home/wilde/.swift/runs/current/run029.1362871183
> > > > > /home/wilde/.swift/runs/completed
> > > > > midway001$
> > > > > 
> > > > > 
> > > > > 
> > > > > ----- Original Message -----
> > > > > > From: "David Kelly" <davidk at ci.uchicago.edu>
> > > > > > To: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > > > Sent: Saturday, March 9, 2013 5:12:59 PM
> > > > > > Subject: Re: runs for OSG talk
> > > > > > 
> > > > > > 
> > > > > > > Yep - I had a version where the input files were in a very similar
> > > > > > > format (PGM, 1 byte per pixel). I'll add that back, but without the
> > > > > > > small PGM header in the files.
> > > > > > 
> > > > > > ----- Original Message -----
> > > > > > 
> > > > > > 
> > > > > > From: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > > > To: "David Kelly" <davidk at ci.uchicago.edu>
> > > > > > Sent: Saturday, March 9, 2013 5:04:43 PM
> > > > > > Subject: Re: runs for OSG talk
> > > > > > 
> > > > > > > I think we need to cut down the size of these files for a demo
> > > > > > > (although they are great for a stress test).
> > > > > > > 
> > > > > > > First, the RGB format by itself uses 3 bytes per pixel when it only
> > > > > > > needs one (for land use)
> > > > > > > 
> > > > > > > Second, we should cut down by a factor of 9 (3x3) or 16 (4x4).
> > > > > > > 
> > > > > > > I tried that using simple convert statements, but it always seems to
> > > > > > > yield a file exactly double what it should be.
> > > > > > > 
> > > > > > > More on this later; was hoping to get things working "as is" first.
> > > > > > > 
> > > > > > > I assume you could get the perl code to work on one-byte-per-pixel
> > > > > > > instead of the default 3 for the convert rgb format?
> > > > > > 
> > > > > > - Mike
> > > > > > 
> > > > > > ----- Original Message -----
> > > > > > > From: "David Kelly" <davidk at ci.uchicago.edu>
> > > > > > > To: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > > > > Sent: Saturday, March 9, 2013 4:36:30 PM
> > > > > > > Subject: Re: runs for OSG talk
> > > > > > > 
> > > > > > > 
> > > > > > > > That would probably be a good idea for a new script, to show how to
> > > > > > > > stage apps like that. For now I updated the scripts on lustre..
> > > > > > > > hopefully that helps.
> > > > > > > 
> > > > > > > ----- Original Message -----
> > > > > > > 
> > > > > > > 
> > > > > > > From: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > > > > To: "David Kelly" <davidk at ci.uchicago.edu>
> > > > > > > Sent: Saturday, March 9, 2013 4:29:14 PM
> > > > > > > Subject: Re: runs for OSG talk
> > > > > > > 
> > > > > > > > OK, I see that it's trying to run getlanduse.sh from your /lustre dir
> > > > > > > > on beagle, which is different from the one I've got checked out. It
> > > > > > > > seems to get an error in a stderr redirect??? Let me see what I need
> > > > > > > > to do to get the beagle side in sync.
> > > > > > > > 
> > > > > > > > Seems like since these are perl scripts, we should make the app()
> > > > > > > > /bin/sh and send the script as data, perhaps?
> > > > > > > 
> > > > > > > - Mike
> > > > > > > 
> > > > > > > ----- Original Message -----
> > > > > > > > From: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > > > > > To: "David Kelly" <davidk at ci.uchicago.edu>
> > > > > > > > Sent: Saturday, March 9, 2013 4:19:31 PM
> > > > > > > > Subject: Re: runs for OSG talk
> > > > > > > > 
> > > > > > > > OK, making progress. Now I dialed down the throttle and
> > > > > > > > node
> > > > > > > > counts
> > > > > > > > to 48 jobs.
> > > > > > > > 
> > > > > > > > Now I get further, for ./demo and site=4 script=2:
> > > > > > > > 
> > > > > > > > RunID: 20130309-2214-1oi3rvea
> > > > > > > > Progress: time: Sat, 09 Mar 2013 22:14:06 +0000
> > > > > > > > > Progress: time: Sat, 09 Mar 2013 22:14:17 +0000 Selecting site:269 Submitting:47 Submitted:1
> > > > > > > > > Progress: time: Sat, 09 Mar 2013 22:14:22 +0000 Selecting site:269 Stage in:1 Submitted:47
> > > > > > > > > Progress: time: Sat, 09 Mar 2013 22:14:28 +0000 Selecting site:269 Stage in:25 Submitted:23
> > > > > > > > > Progress: time: Sat, 09 Mar 2013 22:14:36 +0000 Selecting site:269 Stage in:48
> > > > > > > > > Progress: time: Sat, 09 Mar 2013 22:15:06 +0000 Selecting site:269 Stage in:48
> > > > > > > > > Progress: time: Sat, 09 Mar 2013 22:15:36 +0000 Selecting site:269 Stage in:48
> > > > > > > > > Progress: time: Sat, 09 Mar 2013 22:16:06 +0000 Selecting site:269 Stage in:48
> > > > > > > > > Progress: time: Sat, 09 Mar 2013 22:16:26 +0000 Selecting site:269 Stage in:47 Active:1
> > > > > > > > > Progress: time: Sat, 09 Mar 2013 22:16:27 +0000 Selecting site:269 Stage in:36 Active:12
> > > > > > > > > Progress: time: Sat, 09 Mar 2013 22:16:29 +0000 Selecting site:269 Stage in:24 Active:24
> > > > > > > > > Progress: time: Sat, 09 Mar 2013 22:16:34 +0000 Selecting site:269 Stage in:24 Active:23 Stage out:1
> > > > > > > > > Progress: time: Sat, 09 Mar 2013 22:16:35 +0000 Selecting site:269 Stage in:14 Active:33 Stage out:1
> > > > > > > > > Execution failed:
> > > > > > > > > Exception in getlanduse:
> > > > > > > > > Arguments: [home/wilde/osgdemo/modis/svn/data/modis/2002/h08v04.rgb]
> > > > > > > > > Host: beagle
> > > > > > > > > Directory: modis02-20130309-2214-1oi3rvea/jobs/k/getlanduse-ko5qjd6l
> > > > > > > > > 
> > > > > > > > > Caused by:
> > > > > > > > > Application /lustre/beagle/davidk/modis/bin/getlanduse.sh failed with an exit code of 1
> > > > > > > > > getLandUse, modis02.swift, line 20
> > > > > > > > 
> > > > > > > > real 2m31.463s
> > > > > > > > user 1m33.238s
> > > > > > > > sys 0m2.160s
> > > > > > > > + mv /home/wilde/.swift/runs/current/run024.1362867244
> > > > > > > > /home/wilde/.swift/runs/completed
> > > > > > > > midway001$
> > > > > > > > 
> > > > > > > > 
> > > > > > > > ----- Original Message -----
> > > > > > > > > From: "David Kelly" <davidk at ci.uchicago.edu>
> > > > > > > > > To: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > > > > > > Sent: Saturday, March 9, 2013 3:55:30 PM
> > > > > > > > > Subject: Re: runs for OSG talk
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > ok, I'll take a look at that. The run dir I used was
> > > > > > > > > /scratch/midway/davidkelly999/modis/run011
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > ----- Original Message -----
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > From: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > > > > > > To: "David Kelly" <davidk at ci.uchicago.edu>
> > > > > > > > > Sent: Saturday, March 9, 2013 3:52:28 PM
> > > > > > > > > Subject: Re: runs for OSG talk
> > > > > > > > > 
> > > > > > > > > I just tried this, but didnt work - same prob.
> > > > > > > > > 
> > > > > > > > > But if its working for you now, we must be close.
> > > > > > > > > 
> > > > > > > > > Not yet sure what the diff is...
> > > > > > > > > 
> > > > > > > > > My run dir is /home/wilde/osgdemo/modis/svn/run021
> > > > > > > > > 
> > > > > > > > > - Mike
> > > > > > > > > 
> > > > > > > > > ----- Original Message -----
> > > > > > > > > > From: "David Kelly" <davidk at ci.uchicago.edu>
> > > > > > > > > > To: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > > > > > > > Sent: Saturday, March 9, 2013 3:46:13 PM
> > > > > > > > > > Subject: Re: runs for OSG talk
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > > Had to make sure I was using the IP address on eth4 (128.135.112.71
> > > > > > > > > > > for midway-login1), not a local address or an infiniband address.
> > > > > > > > > > 
> > > > > > > > > > ----- Original Message -----
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > From: "David Kelly" <davidk at ci.uchicago.edu>
> > > > > > > > > > To: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > > > > > > > Sent: Saturday, March 9, 2013 3:43:51 PM
> > > > > > > > > > Subject: Re: runs for OSG talk
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > > I just got it working. I had to adjust for the differences in my
> > > > > > > > > > > username on Beagle/Midway, then I had to set GLOBUS_HOSTNAME on
> > > > > > > > > > > Midway to the IP address, rather than the full hostname
> > > > > > > > > > 
> > > > > > > > > > ----- Original Message -----
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > From: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > > > > > > > To: "David Kelly" <davidk at ci.uchicago.edu>
> > > > > > > > > > Sent: Saturday, March 9, 2013 3:40:03 PM
> > > > > > > > > > Subject: Re: runs for OSG talk
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > ----- Original Message -----
> > > > > > > > > > > From: "David Kelly" <davidk at ci.uchicago.edu>
> > > > > > > > > > > To: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > > > > > > > > Sent: Saturday, March 9, 2013 3:34:58 PM
> > > > > > > > > > > Subject: Re: runs for OSG talk
> > > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > > Is your username the same on beagle and midway?
> > > > > > > > > > 
> > > > > > > > > > Yes. And I verified that I can ssh to login4 on beagle from my
> > > > > > > > > > midway session (as indeed the scp's of the proxy files seem to be
> > > > > > > > > > working).
> > > > > > > > > > 
> > > > > > > > > > - Mike
> > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > > ----- Original Message -----
> > > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > > From: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > > > > > > > > To: "David Kelly" <davidk at ci.uchicago.edu>
> > > > > > > > > > > Sent: Saturday, March 9, 2013 3:34:28 PM
> > > > > > > > > > > Subject: Re: runs for OSG talk
> > > > > > > > > > > 
> > > > > > > > > > > OK.
> > > > > > > > > > > 
> > > > > > > > > > > Ignore what I said about "problem finding java" - that's code in
> > > > > > > > > > > the very long escaped shell command that gets sent to the remote
> > > > > > > > > > > side. I don't *think* that's the problem.
> > > > > > > > > > > 
> > > > > > > > > > > I also verified that beagle can connect to ports 50001 etc. on
> > > > > > > > > > > swift.rcc, and that seems OK.
> > > > > > > > > > > 
> > > > > > > > > > > I exported GLOBUS_HOSTNAME=midway001.rcc.uchicago.edu on the
> > > > > > > > > > > midway side. And the beagle side seems to be connecting there.
> > > > > > > > > > > 
> > > > > > > > > > > I'm a bit confused about the timestamps I see for the proxy
> > > > > > > > > > > expiration time, but am not yet suspicious of that (although it
> > > > > > > > > > > seems less than 5 hours past GMT... not sure.)
> > > > > > > > > > > 
> > > > > > > > > > > - Mike
> > > > > > > > > > > 
> > > > > > > > > > > ----- Original Message -----
> > > > > > > > > > > > From: "David Kelly" <davidk at ci.uchicago.edu>
> > > > > > > > > > > > To: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > > > > > > > > > Sent: Saturday, March 9, 2013 3:26:32 PM
> > > > > > > > > > > > Subject: Re: runs for OSG talk
> > > > > > > > > > > > 
> > > > > > > > > > > > 
> > > > > > > > > > > > I'm seeing the same error now.. looking into it
> > > > > > > > > > > > 
> > > > > > > > > > > > ----- Original Message -----
> > > > > > > > > > > > 
> > > > > > > > > > > > 
> > > > > > > > > > > > From: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > > > > > > > > > To: "David Kelly" <davidk at ci.uchicago.edu>
> > > > > > > > > > > > Sent: Saturday, March 9, 2013 3:21:30 PM
> > > > > > > > > > > > Subject: Re: runs for OSG talk
> > > > > > > > > > > > 
> > > > > > > > > > > > Looking deeper I see that the logs show problems with finding
> > > > > > > > > > > > Java, I assume on beagle, and also service ending (presumably
> > > > > > > > > > > > coaster service on midway host).
> > > > > > > > > > > > 
> > > > > > > > > > > > I'll dig into these two.
> > > > > > > > > > > > 
> > > > > > > > > > > > I see that it scp's the proxies to beagle which I think answers
> > > > > > > > > > > > my question about security.
> > > > > > > > > > > > 
> > > > > > > > > > > > - Mike
> > > > > > > > > > > > 
> > > > > > > > > > > > ----- Original Message -----
> > > > > > > > > > > > > From: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > > > > > > > > > > To: "David Kelly" <davidk at ci.uchicago.edu>
> > > > > > > > > > > > > Sent: Saturday, March 9, 2013 3:15:01 PM
> > > > > > > > > > > > > Subject: Re: runs for OSG talk
> > > > > > > > > > > > > 
> > > > > > > > > > > > > OK. Any thoughts about beagle?
> > > > > > > > > > > > > 
> > > > > > > > > > > > > I've been experimenting but still can't get it to work, same
> > > > > > > > > > > > > error (can't connect to bootstrap port).
> > > > > > > > > > > > > 
> > > > > > > > > > > > > When you tried ssh-cl to beagle with automatic coasters, what
> > > > > > > > > > > > > configuration (sites, env, etc.) did you use?
> > > > > > > > > > > > > 
> > > > > > > > > > > > > I verified that beagle can connect back to the midway hosts and
> > > > > > > > > > > > > ports.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Do we need to specify security or create a proxy etc?
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > 
> > > > > > > > > > > > > - Mike
> > > > > > > > > > > > > 
> > > > > > > > > > > > > 
> > > > > > > > > > > > > ----- Original Message -----
> > > > > > > > > > > > > > From: "David Kelly" <davidk at ci.uchicago.edu>
> > > > > > > > > > > > > > To: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > > > > > > > > > > > Sent: Saturday, March 9, 2013 3:08:58 PM
> > > > > > > > > > > > > > Subject: Re: runs for OSG talk
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > One way you can override/customize the default templates is to
> > > > > > > > > > > > > > create them in $HOME/.swift/sites (I'm not sure if that's what
> > > > > > > > > > > > > > you mean by a local sites dir or not). But you are right about
> > > > > > > > > > > > > > Midway - I have noticed that when using modis it will sometimes
> > > > > > > > > > > > > > get stuck when it goes to a queue that is busy. Ideally swift
> > > > > > > > > > > > > > replication would be able to help better handle that, but I
> > > > > > > > > > > > > > haven't had much luck with that yet. Another way around this may
> > > > > > > > > > > > > > be to add this to the template:
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > <profile namespace="globus" key="slurm.exclusive">false</profile>
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > The swift.log issue was never fixed. It went to swift-devel for
> > > > > > > > > > > > > > discussion but was never fixed. I think it is relatively simple
> > > > > > > > > > > > > > though.. probably worth fixing before release.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > ----- Original Message -----
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > From: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > > > > > > > > > > > To: "David Kelly" <davidk at ci.uchicago.edu>
> > > > > > > > > > > > > > Sent: Saturday, March 9, 2013 1:38:47 PM
> > > > > > > > > > > > > > Subject: Re: runs for OSG talk
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > OK, sounds good re the trip plan. Feel free to stay Tue night to
> > > > > > > > > > > > > > avoid a 4hr drive after a long day.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > I'm trying the modis demo.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > I tried to create a local sites/ dir so I can modify the sites
> > > > > > > > > > > > > > templates; that's not working for me either yet.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > For midway, need to force to westmere or sandyb (but not both)
> > > > > > > > > > > > > > and ensure 1-node jobs, because either queue can get filled and
> > > > > > > > > > > > > > not yield an idle node for a long time. Maybe need to fiddle
> > > > > > > > > > > > > > jobsPerNode to get at least 1 core when the system is busy and
> > > > > > > > > > > > > > *pretend* that it's a node.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > So to get response I tried beagle-ssh; that isn't working because
> > > > > > > > > > > > > > the template sites file is wrong in swift 0.94 rc4.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > I also see that swift.log is still getting produced - I thought
> > > > > > > > > > > > > > we eliminated that. Did it come back due to a problem with that
> > > > > > > > > > > > > > fix?
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > I'll keep hacking; suggestions welcome.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > - Mike
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > ----- Original Message -----
> > > > > > > > > > > > > > > From: "David Kelly" <davidk at ci.uchicago.edu>
> > > > > > > > > > > > > > > To: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > > > > > > > > > > > > Sent: Saturday, March 9, 2013 12:20:00 PM
> > > > > > > > > > > > > > > Subject: Re: runs for OSG talk
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > Hi Mike,
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > Looking more closely at the agenda, I think the most
> > > > > > > > > > > > > > > interesting/useful talks will be on Tuesday. Monday I'll come to
> > > > > > > > > > > > > > > Argonne to work on any loose ends and put the finishing touches
> > > > > > > > > > > > > > > on any slides/runs/scripts, then drive to Indianapolis on Monday
> > > > > > > > > > > > > > > afternoon/evening. I have a hotel booked for Monday night.
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > I'll do some runs using the routes we talked about. I'm pretty
> > > > > > > > > > > > > > > sure I have working configurations for everything we talked
> > > > > > > > > > > > > > > about, so I think it's really just a matter of plugging in the
> > > > > > > > > > > > > > > apps.
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > David
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > ----- Original Message -----
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > From: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > > > > > > > > > > > > To: "David Kelly" <davidk at ci.uchicago.edu>
> > > > > > > > > > > > > > > Sent: Saturday, March 9, 2013 11:03:15 AM
> > > > > > > > > > > > > > > Subject: runs for OSG talk
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > Hi David,
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > I just wanted to let you know that I'm looking into the run
> > > > > > > > > > > > > > > options now. I'm hoping to try a few... Will see how much help I
> > > > > > > > > > > > > > > need. Have you decided on a driving time and made hotel
> > > > > > > > > > > > > > > arrangements?
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > I would feel free to stay for whatever portion of the OSG meeting
> > > > > > > > > > > > > > > you feel is of value. The only thing I ask is that for Wed and
> > > > > > > > > > > > > > > Thu you stay available online for user-support or other
> > > > > > > > > > > > > > > assistance needs that come up here. And that you engage with
> > > > > > > > > > > > > > > people that can help us develop the Swift user community and
> > > > > > > > > > > > > > > reliable OSG usage. Rob, Marco, Lincoln, and Suchandra would be
> > > > > > > > > > > > > > > good to hang out with and they can introduce you to good
> > > > > > > > > > > > > > > contacts.
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > Of course we will cover your expenses via a UChicago travel
> > > > > > > > > > > > > > > expense report.
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > We'll be starting a project with a tiny bit of additional ExTENCI
> > > > > > > > > > > > > > > funds to make Swift do smarter data management on OSG sites (and
> > > > > > > > > > > > > > > in general) so anything you learn about OSG storage
> > > > > > > > > > > > > > > elements/services/tools will be valuable for that (srmcp, lcgcp,
> > > > > > > > > > > > > > > etc).
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > Between now and your talk, let's just focus on the talk, OK? I'm
> > > > > > > > > > > > > > > hoping we have slides frozen by Monday.
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > While I fiddle, if you could do catsn or other hello-world-like
> > > > > > > > > > > > > > > tests to cover the "routes" we discussed, that would pave the way
> > > > > > > > > > > > > > > for plugging in the real app examples.
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > Sound good? Let me know of any concerns (other than the fact that
> > > > > > > > > > > > > > > this is a tad rushed ;)
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > Thanks and regards,
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > - Mike
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > > Michael Wilde
> > > > > > > > > > > > > > > Computation Institute, University of Chicago
> > > > > > > > > > > > > > > Mathematics and Computer Science Division
> > > > > > > > > > > > > > > Argonne National Laboratory
> > > > > > > > > > > > > > > 
> > > > > _______________________________________________
> > > > > Swift-devel mailing list
> > > > > Swift-devel at ci.uchicago.edu
> > > > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
> 
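
Editorial sketch, not part of the original thread: the quoted messages report two checks that mattered here. Beagle had to be able to reach the callback ports on the Midway side (port 50001 is the one mentioned), and GLOBUS_HOSTNAME on Midway had to be set to an externally reachable IP address (128.135.112.71 on eth4 for midway-login1) rather than a hostname. A minimal Python sketch of both checks follows. The helper names can_connect and env_with_globus_hostname are hypothetical, and nothing here is taken from Swift or CoG code.

```python
import ipaddress
import os
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def env_with_globus_hostname(ip, base_env=None):
    """Return a copy of the environment with GLOBUS_HOSTNAME set to an IP.

    Raises ValueError if `ip` is a hostname rather than a literal address,
    since the thread reports that a hostname did not work in this setup.
    """
    ipaddress.ip_address(ip)  # validates the literal; raises ValueError otherwise
    env = dict(os.environ if base_env is None else base_env)
    env["GLOBUS_HOSTNAME"] = ip
    return env
```

Run from the remote (Beagle) side, can_connect("128.135.112.71", 50001) would indicate whether the callback port is reachable; the dictionary returned by env_with_globus_hostname("128.135.112.71") could be passed as the env argument of subprocess.run when launching swift on Midway.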


