[Swift-devel] deadlock on workflow:
Michael Wilde
wilde at mcs.anl.gov
Thu Feb 17 17:11:04 CST 2011
Allan,
> 2011/2/17 Allan Espinosa <aespinosa at cs.uchicago.edu>:
> > Hi Mike,
> >
> > I haven't tested it yet. I will need my sites.xml definitions to not
> > use persistent + passive coasters before being able to test it.
Why is that? 0.92 supports persistent, passive coasters, doesn't it?
> >
> > What's the difference in 'explicit' resume? In my setup I have a
> > "resumefile" that I've been using for the past few months.
By "explicit resume" I meant not using the Swift resume feature, but instead having your input mapper check the output dataset and leave out any input dataset members that it knows have already been processed successfully.
Both styles of resume have their pros and cons. The advantage of this "explicit" resume approach is that the definition of "done" for a dataset member can be application-specific, and that it doesn't depend on the automated Swift feature, which likely needs more testing and hardening. The disadvantage is that you have to program it yourself, explicitly.
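To make the idea concrete, here is a rough sketch of the filtering step such a mapper could perform. Everything specific in it is an assumption for illustration only: the directory layout, the ".in"/".out" suffixes, the function names, the "non-empty output file" test for "done", and the "[index] path" line format (which may need adjusting to whatever your mapper is expected to emit).

```python
import os


def is_done(output_path):
    """Application-specific test for 'done' (hypothetical).

    Here a member counts as done when its output file exists and is
    non-empty; a real application might validate the contents instead.
    """
    return os.path.isfile(output_path) and os.path.getsize(output_path) > 0


def pending_inputs(input_dir, output_dir, in_suffix=".in", out_suffix=".out"):
    """Return the input files whose matching output is not yet done."""
    pending = []
    for name in sorted(os.listdir(input_dir)):
        if not name.endswith(in_suffix):
            continue  # ignore files that are not dataset members
        stem = name[: -len(in_suffix)]
        output_path = os.path.join(output_dir, stem + out_suffix)
        if not is_done(output_path):
            pending.append(os.path.join(input_dir, name))
    return pending


def mapper_lines(paths):
    """Format the pending members one per line as '[index] path' (assumed format)."""
    return ["[%d] %s" % (index, path) for index, path in enumerate(paths)]
```

A wrapper script would print mapper_lines(pending_inputs(input_dir, output_dir)) to standard output, so that a rerun of the workflow only sees the members still left to process. Changing is_done() is all it takes to change what "done" means for the application.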
- Mike
> >
> > -Allan
> >
> > 2011/2/17 Michael Wilde <wilde at mcs.anl.gov>:
> >> Allan,
> >>
> >> If you already stated this I missed it: are you able to run on
> >> 0.92? And does the deadlock occur there? Is resume working in 0.92?
> >>
> >> And, have you considered using explicit resume based on having your
> >> input mapper only return members of the dataset that are not yet
> >> completed? (I think Glen Hocky used that technique with good results
> >> in his latest Glass runs).
> >>
> >> - Mike
> >>
> >>
> >> ----- Original Message -----
> >>> Yeah. I'll take a look at that.
> >>>
> >>> But the other question is whether the deadlocking version is
> >>> something that is worth fixing (i.e. a current stable branch or
> >>> trunk).
> >>>
> >>>
> >>> On Thu, 2011-02-17 at 15:45 -0600, Allan Espinosa wrote:
> >>> > The latest trunk breaks for another case (see my post on 'broken
> >>> > resume files'). So I can't reproduce this there (yet).
> >>> >
> >>> > 2011/2/17 Mihael Hategan <hategan at mcs.anl.gov>:
> >>> > > Ok. Your deadlock is genuine, but your version of swift seems
> >>> > > old. Are you sure it wasn't fixed in the meantime?
> >>> > >
> >>> > > On Thu, 2011-02-17 at 13:49 -0600, Allan Espinosa wrote:
> >>> > >> Version
> >>> > >>
> >>> > >> swift-r3835 cog-r2988
> >>> > >>
> >>> > >> see attached:
> >>> > >>
>
>
> --
> Allan M. Espinosa <http://amespinosa.wordpress.com>
> PhD student, Computer Science
> University of Chicago <http://people.cs.uchicago.edu/~aespinosa>
--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory