[Swift-devel] Problem with coaster workers shutting down early

Michael Wilde wilde at mcs.anl.gov
Wed Mar 3 09:43:55 CST 2010


This is fixed in CoG rev 2725.

The problem was causing all multi-node coaster blocks to fail to start.

- Mike

----- wilde at mcs.anl.gov wrote:

> Mihael, I don't yet have all the evidence for this issue collected
> cleanly, but I want to send you what I have so you can start looking
> at this.
> 
> I've been trying to recreate a problem that Zhao is encountering where
> he's trying to run >15,000 short (~1-second) jobs on PADS under
> coasters.
> 
> Basically, the worker jobs seem to be exiting for no reason that I can
> discern.
> 
> I've re-created something that looks similar using this:
> 
> cd ~wilde/swift/lab
> swift -tc.file tc -sites.file pbscoast.xml cats.swift
> 
> Log is /home/wilde/swift/lab/cats-20100302-1751-8qy7m21c.log
> 
> Coaster worker logs are in ~wilde/globus.coasters
> 
> It seems to work OK when I request 1-node blocks.
> With 2-node blocks, the workers seem to shut down for no apparent
> reason, after about 2 seconds.
> 
> ...more details later when I get a chance.
> 
> - Mike