[Swift-devel] Standard Swift coaster behavior doesn't work well for sporadic jobs

wilde at mcs.anl.gov
Tue Oct 12 10:14:56 CDT 2010


For the first job, I see:

2010-10-11 14:54:35,010-0500 INFO  CoasterService Started coaster service: http://192.5.86.5:34445
2010-10-11 14:54:35,021-0500 INFO  Command Sending Command(2, SUBMITJOB) on null[244757954: {}]
2010-10-11 14:54:35,080-0500 INFO  BlockQueueProcessor allocsize = 0.0, queuedsize = 1.0660596665516473, qsz = 1
2010-10-11 14:54:35,081-0500 INFO  BlockQueueProcessor Requeued 1 non-fitting jobs
2010-10-11 14:54:35,082-0500 INFO  BlockQueueProcessor
Settings {
        slots = 32
        workersPerNode = 1
   

but for the second job, I see:

2010-10-11 14:59:55,200-0500 INFO  Command Sending Command(3, SUBMITJOB) on null[244757954: {}]
2010-10-11 14:59:55,224-0500 INFO  Cpu 1011-540235-000000:0 pull
2010-10-11 14:59:55,225-0500 INFO  Cpu 1011-540235-000000:0 submitting urn:1286826874044-1286826874046-1286826874047
2010-10-11 14:59:55,226-0500 INFO  Command Sending Command(3, SUBMITJOB) on SC-1011-540235-000000-000000
2010-10-11 14:59:55,226-0500 INFO  AbstractStreamKarajanChannel Sender 390276053 queue size: 0
2010-10-11 14:59:56,620-0500 INFO  BlockQueueProcessor allocsize = 0.0, queuedsize = 0.0, qsz = 0
2010-10-11 14:59:56,620-0500 INFO  BlockQueueProcessor Plan time: 0
2010-10-11 14:59:58,822-0500 INFO  BlockQueueProcessor allocsize = 0.0, queuedsize = 0.0, qsz = 0
2010-10-11 14:59:58,822-0500 INFO  BlockQueueProcessor Plan time: 0
2010-10-11 14:59:59,935-0500 INFO  AbstractStreamKarajanChannel$Multiplexer Avg stream buf: 0
2010-10-11 15:00:00,790-0500 INFO  Cpu runTime: 2, sleepTime: 10036
2010-10-11 15:00:01,024-0500 INFO  BlockQueueProcessor allocsize = 0.0, queuedsize = 0.0, qsz = 0
2010-10-11 15:00:01,024-0500 INFO  BlockQueueProcessor Plan time: 0
2010-10-11 15:00:03,226-0500 INFO  BlockQueueProcessor allocsize = 0.0, queuedsize = 0.0, qsz = 0
2010-10-11 15:00:03,226-0500 INFO  BlockQueueProcessor Plan time: 0
2010-10-11 15:00:05,112-0500 INFO  CoasterService Idle time: 0
2010-10-11 15:00:05,122-0500 INFO  TaskNotifier Congestion queue size: 0


...which suggests that the coaster service doesn't really see the job in the queue?
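
One way to check this across a whole log, rather than by eye, is to group the BlockQueueProcessor "qsz" values by the SUBMITJOB command that precedes them. The following is a minimal sketch, not part of Swift: the function name qsz_after_submits and both regular expressions are hypothetical, inferred only from the log lines quoted above, and it assumes that "qsz" reports the number of jobs the block scheduler currently has queued.

```python
import re

# Patterns inferred from the log excerpts in this message (assumption:
# other Swift versions may format these lines differently).
SUBMIT_RE = re.compile(r"Command Sending Command\((\d+), SUBMITJOB\)")
QSZ_RE = re.compile(
    r"BlockQueueProcessor allocsize = [\d.]+, queuedsize = [\d.]+, qsz = (\d+)"
)


def qsz_after_submits(lines):
    """Map each SUBMITJOB command id to the qsz values logged after it.

    A qsz line is attributed to the most recent SUBMITJOB seen before it.
    """
    result = {}
    current = None
    for line in lines:
        submit = SUBMIT_RE.search(line)
        if submit:
            current = int(submit.group(1))
            result.setdefault(current, [])
            continue
        qsz = QSZ_RE.search(line)
        if qsz and current is not None:
            result[current].append(int(qsz.group(1)))
    return result


# A few lines copied from the excerpts above.
SAMPLE = [
    "2010-10-11 14:54:35,021-0500 INFO  Command Sending Command(2, SUBMITJOB) on null[244757954: {}]",
    "2010-10-11 14:54:35,080-0500 INFO  BlockQueueProcessor allocsize = 0.0, queuedsize = 1.0660596665516473, qsz = 1",
    "2010-10-11 14:59:55,200-0500 INFO  Command Sending Command(3, SUBMITJOB) on null[244757954: {}]",
    "2010-10-11 14:59:55,226-0500 INFO  Command Sending Command(3, SUBMITJOB) on SC-1011-540235-000000-000000",
    "2010-10-11 14:59:56,620-0500 INFO  BlockQueueProcessor allocsize = 0.0, queuedsize = 0.0, qsz = 0",
    "2010-10-11 14:59:58,822-0500 INFO  BlockQueueProcessor allocsize = 0.0, queuedsize = 0.0, qsz = 0",
]

if __name__ == "__main__":
    for command_id, sizes in sorted(qsz_after_submits(SAMPLE).items()):
        print(command_id, sizes)
```

On the sample lines this reports qsz = 1 after command 2 and only qsz = 0 after command 3, which is the pattern described above. It only summarizes what was logged; it does not by itself show why the second job is not counted.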

The one modification I made that may be causing this was raising the service-side timeout value for the coaster provider. That was needed to keep manually configured passive-persistent configurations alive while the Swift client was idle (e.g., for the multiple-ssh-server configuration of the R server).

- Mike


----- wilde at mcs.anl.gov wrote:

> Mihael, Justin, does the following sound like a likely coaster issue:
> 
> When using the standard Swift coaster code (not passive or
> persistent), if I have a job that runs at the start, and then there is
> a long delay before the next job, such that the coaster worker times
> out, then the coaster scheduler doesn't think that there is a valid
> block into which the job can fit, and Swift just hangs, with the job
> in submitted state but never getting assigned to a block.
> 
> The behavior seems similar to what you see when you try to run a job
> that doesn't fit into any block that you have defined using the
> coasters sites.xml parameters: in that case, too, Swift just hangs.
> 
> Both of these situations (whether or not they are indeed due to the
> same algorithmic issue) seem to be problems that we need to address.
> In the first case (which is my immediate and more important problem)
> you can see a log for the problem in ~wilde/swift-rserver-hangs, along
> with the swift script, tc, sites, and properties file (cf).
> 
> Thanks,
> 
> Mike
> 
> -- 
> Michael Wilde
> Computation Institute, University of Chicago
> Mathematics and Computer Science Division
> Argonne National Laboratory

-- 
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory



