[Swift-devel] Problem with coaster workers shutting down early
Michael Wilde
wilde at mcs.anl.gov
Wed Mar 3 09:43:55 CST 2010
This is fixed in CoG rev 2725.
The problem was causing all multi-node coaster blocks to fail to start.
- Mike
----- wilde at mcs.anl.gov wrote:
> Mihael, I don't yet have all the evidence for this issue collected nice
> and clean, but I want to send you what I have so you can start looking
> at this.
>
> I've been trying to recreate a problem that Zhao is encountering where
> he's trying to run >15,000 short (~1-second) jobs on PADS under
> coasters.
>
> Basically, the worker jobs seem to be exiting for no reason that I can
> discern.
>
> I've re-created something that looks similar using this:
>
> cd ~wilde/swift/lab
> swift -tc.file tc -sites.file pbscoast.xml cats.swift
>
> Log is /home/wilde/swift/lab/cats-20100302-1751-8qy7m21c.log
>
> Coaster worker logs are in ~wilde/globus.coasters
>
> Seems to work OK when I request 1-node blocks.
> With 2-node blocks, the workers seem to shut down for no apparent
> reason, after about 2 seconds.
>
> ...more details later when I get a chance.
>
> - Mike
>
> _______________________________________________
> Swift-devel mailing list
> Swift-devel at ci.uchicago.edu
> http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel